Introduction to Neural Networks for Beginners: Understanding the Brains Behind AI

📗 Chapter 3: Training Neural Networks

From Raw Data to Smart Predictions — How Neural Networks Learn


🧠 Introduction

You’ve designed a neural network, set up the input/output layers, defined hidden layers, and chosen activation functions. But how does the network learn?

Training a neural network means teaching it to adjust its internal settings (weights and biases) so it makes accurate predictions on new data.

In this chapter, we’ll explore:

  • The key steps in training a neural network
  • How loss functions measure performance
  • How gradients are used to optimize the model
  • What backpropagation is and how it works
  • Real-world training examples using Python (Keras)
  • Techniques and settings that improve training, such as dropout, batch size, and learning rate

📘 Section 1: What Does “Training” Mean?

Training a neural network involves:

  1. Feeding input data
  2. Making predictions (forward pass)
  3. Calculating the error (loss function)
  4. Adjusting weights and biases (backpropagation)
  5. Repeating until the model performs well
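These five steps form a loop. Here is a rough sketch of that loop, using a toy one-weight model in plain NumPy (not a real network) with made-up data:

```python
import numpy as np

# 1. Input data: y is roughly 2 * x
X = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([2.0, 4.0, 6.0, 8.0])

w = 0.0    # the single weight to learn
lr = 0.05  # learning rate

for step in range(100):
    y_pred = w * X                         # 2. forward pass
    loss = np.mean((y - y_pred) ** 2)      # 3. loss (MSE)
    grad = np.mean(-2 * X * (y - y_pred))  # 4. gradient of the loss w.r.t. w
    w = w - lr * grad                      # 4. weight update
                                           # 5. repeat

print("Learned weight:", w)  # approaches 2.0
```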

📘 Section 2: Forward Propagation (Quick Review)

In the forward pass:

  • Each input passes through layers
  • The output is computed at the final layer

Example:

```python
output = activation(weight * input + bias)
```
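To make that concrete, here is a minimal runnable version for a single neuron, assuming a sigmoid activation and arbitrary example values:

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

inputs = np.array([0.5, -1.2, 3.0])   # one sample with 3 features
weights = np.array([0.4, 0.1, -0.6])  # one weight per input
bias = 0.2

# Weighted sum of inputs, then activation
output = sigmoid(np.dot(weights, inputs) + bias)
print(output)
```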


📘 Section 3: Loss Functions — Measuring How Wrong the Model Is

The loss function quantifies the difference between the predicted output and the actual label.

🔧 Common Loss Functions:

| Loss Function | Use Case |
| --- | --- |
| Mean Squared Error (MSE) | Regression problems |
| Binary Crossentropy | Binary classification |
| Categorical Crossentropy | Multi-class classification |


💻 Example: MSE in Python

```python
import numpy as np

y_true = np.array([2.5])
y_pred = np.array([3.0])

loss = np.mean((y_true - y_pred) ** 2)
print("MSE:", loss)
```


📘 Section 4: Backpropagation — Learning from Mistakes

Backpropagation is the process of updating the weights in a neural network by:

  • Calculating the error at the output
  • Propagating that error backward
  • Adjusting weights based on the gradient of the loss

This is where gradient descent comes in.


🧠 Gradient Descent

Gradient descent helps the model "descend" the slope of the loss function by adjusting weights in the direction that reduces the loss.

🧮 Weight Update Rule:

```plaintext
new_weight = old_weight - learning_rate * gradient
```
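Putting backpropagation and the update rule together for a single sigmoid neuron, here is a minimal sketch. The values are arbitrary, and the chain rule supplies the gradient of the loss with respect to the weight:

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

x, y_true = 1.5, 1.0  # one input and its label
w, b = 0.3, 0.0       # initial weight and bias
lr = 0.1              # learning rate

# Forward pass
z = w * x + b
y_pred = sigmoid(z)
loss = (y_true - y_pred) ** 2

# Backward pass (chain rule):
# dloss/dw = dloss/dy_pred * dy_pred/dz * dz/dw
dloss_dy = -2 * (y_true - y_pred)
dy_dz = y_pred * (1 - y_pred)  # derivative of sigmoid
dz_dw = x
gradient = dloss_dy * dy_dz * dz_dw

# Weight update
w = w - lr * gradient
print("Updated weight:", w)
```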


📘 Section 5: Learning Rate

The learning rate determines how big a step the optimizer takes during weight updates.

| Learning Rate | Effect |
| --- | --- |
| Too high | May overshoot the minimum |
| Too low | Slow training; may get stuck |

A good starting point is 0.001 or 0.01.
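Both failure modes are easy to see with gradient descent on a simple parabola, where the loss is w² and the minimum sits at w = 0:

```python
def descend(lr, steps=10, w=1.0):
    """Run `steps` gradient-descent updates on loss(w) = w**2."""
    for _ in range(steps):
        w = w - lr * 2 * w  # the gradient of w**2 is 2w
    return w

print(descend(lr=0.1))    # converges toward 0
print(descend(lr=1.1))    # too high: overshoots and diverges
print(descend(lr=0.001))  # too low: barely moves in 10 steps
```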


📘 Section 6: Optimizers

Optimizers control how weights are updated.

🔧 Common Optimizers:

| Optimizer | Description |
| --- | --- |
| SGD | Basic stochastic gradient descent |
| Adam | Adaptive learning rates (most used) |
| RMSprop | Good for non-stationary objectives |
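In Keras you can pass an optimizer by name (as the training example in Section 8 does) or as an object when you want to set its learning rate explicitly. A minimal sketch; note that on older Keras versions the argument is `lr` rather than `learning_rate`:

```python
from keras.models import Sequential
from keras.layers import Dense
from keras.optimizers import Adam

model = Sequential()
model.add(Dense(1, input_dim=4, activation='sigmoid'))

# Passing the optimizer as an object lets you set its learning rate
model.compile(optimizer=Adam(learning_rate=0.001),
              loss='binary_crossentropy',
              metrics=['accuracy'])
```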


📘 Section 7: Epochs, Batches, and Iterations

  • Epoch: One full pass through the entire training dataset
  • Batch: Subset of data passed through at once
  • Iteration: One update step (per batch)

Example:

If you have 1000 samples, a batch size of 100, and 5 epochs, you’ll perform:

```plaintext
10 iterations per epoch × 5 epochs = 50 iterations
```
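The same arithmetic in Python:

```python
samples, batch_size, epochs = 1000, 100, 5

iterations_per_epoch = samples // batch_size      # 10
total_iterations = iterations_per_epoch * epochs  # 50
print(total_iterations)
```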


📘 Section 8: Training a Neural Network with Keras

```python
import numpy as np

from keras.models import Sequential
from keras.layers import Dense

# Create model
model = Sequential()
model.add(Dense(32, input_dim=4, activation='relu'))
model.add(Dense(1, activation='sigmoid'))

# Compile model
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])

# Dummy data
X = np.random.rand(100, 4)
y = np.random.randint(2, size=100)

# Train
model.fit(X, y, epochs=10, batch_size=8)
```
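model.fit() also returns a History object whose .history dictionary records the loss and metrics per epoch, which is handy for spotting when learning stalls. Continuing the example above (on older Keras versions the accuracy key may be 'acc' rather than 'accuracy'):

```python
history = model.fit(X, y, epochs=10, batch_size=8)

print(history.history['loss'])      # loss after each epoch
print(history.history['accuracy'])  # accuracy after each epoch
```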


📘 Section 9: Evaluating Model Performance

After training, evaluate the model, ideally on data held out from training rather than the training set itself:

```python
loss, accuracy = model.evaluate(X, y)
print("Loss:", loss, "Accuracy:", accuracy)
```

Also use:

  • Confusion matrix
  • ROC-AUC for binary classifiers
  • Mean Absolute Error (MAE) for regression
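As a sketch using scikit-learn, and assuming the sigmoid model from Section 8 (whose predict() returns probabilities):

```python
from sklearn.metrics import confusion_matrix, roc_auc_score

probs = model.predict(X).ravel()   # predicted probabilities in [0, 1]
preds = (probs > 0.5).astype(int)  # threshold to hard 0/1 labels

print(confusion_matrix(y, preds))
print("ROC-AUC:", roc_auc_score(y, probs))
```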

📘 Section 10: Improving Training Performance

🔧 Tips:

| Technique | Description |
| --- | --- |
| Dropout | Prevents overfitting by randomly disabling neurons during training |
| Early Stopping | Stops training when no improvement is seen |
| Learning Rate Scheduler | Adjusts the learning rate during training |
| Data Normalization | Helps the model converge faster and more accurately |
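Early stopping, for example, is a one-line callback in Keras. A sketch, continuing the Section 8 example and assuming a validation split so there is a val_loss to monitor:

```python
from keras.callbacks import EarlyStopping

# Stop if val_loss hasn't improved for 3 epochs; keep the best weights
early_stop = EarlyStopping(monitor='val_loss', patience=3,
                           restore_best_weights=True)

model.fit(X, y, epochs=100, batch_size=8,
          validation_split=0.2, callbacks=[early_stop])
```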


💡 Adding Dropout Example:

```python
from keras.models import Sequential
from keras.layers import Dense, Dropout

model = Sequential()
model.add(Dense(64, input_dim=4, activation='relu'))
model.add(Dropout(0.3))  # randomly turns off 30% of neurons each training pass
model.add(Dense(1, activation='sigmoid'))
```


Chapter Summary Table


| Term | Meaning |
| --- | --- |
| Forward Propagation | Passing inputs through layers to generate output |
| Loss Function | Measures prediction error |
| Backpropagation | Updates weights to reduce loss |
| Optimizer | Determines how weights are adjusted |
| Epoch/Batch | Defines how often and how much data is used per update |
| Dropout | Technique to avoid overfitting |


FAQs


1. What is a neural network in simple terms?

Answer: A neural network is a computer system designed to recognize patterns, inspired by how the human brain works. It learns from examples and improves its accuracy over time, making it useful for tasks like image recognition, language translation, and predictions.

2. What are the basic components of a neural network?

  • Input layer (receives data)
  • Hidden layers (process the data)
  • Output layer (returns the result)
  • Weights and biases (learned during training)
  • Activation functions (introduce non-linearity)

3. How does a neural network learn?

Answer: It learns through a process called training, which involves:

  • Making a prediction (forward pass)
  • Comparing it to the correct output (loss function)
  • Adjusting weights using backpropagation and optimization
  • Repeating this until the predictions are accurate

4. Do I need a strong math background to understand neural networks?

Answer: Basic understanding of algebra and statistics helps, but you don’t need advanced math to get started. Many tools like Keras or PyTorch simplify the process so you can learn through experimentation and visualization.

5. What are some real-life applications of neural networks?

  • Facial recognition systems
  • Voice assistants like Siri or Alexa
  • Email spam filters
  • Medical image diagnostics
  • Stock market prediction
  • Chatbots and translation apps

6. What’s the difference between a neural network and deep learning?

Answer: Neural networks are the building blocks of deep learning. When we stack multiple hidden layers together, we get a deep neural network — the foundation of deep learning models.

7. What is an activation function, and why is it important?

Answer: An activation function decides whether a neuron should be activated or not. It introduces non-linearity into the model, allowing it to solve complex problems. Common ones include ReLU, Sigmoid, and Tanh.
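For intuition, all three can be written in a few lines of NumPy:

```python
import numpy as np

x = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])

print(np.maximum(0, x))      # ReLU: clips negatives to 0
print(1 / (1 + np.exp(-x)))  # Sigmoid: squashes to (0, 1)
print(np.tanh(x))            # Tanh: squashes to (-1, 1)
```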

8. What’s the difference between supervised learning and neural networks?

Answer: Supervised learning is a type of machine learning where models learn from labeled data. Neural networks can be used within supervised learning as powerful tools to handle complex data like images, audio, and text.

9. Are neural networks always better than traditional machine learning?

Answer: Not always. Neural networks require large datasets and computing power. For small datasets or structured data, simpler models like decision trees or SVMs may perform just as well or better.

10. How can I start building my first neural network?

Answer: Start with:

  • Python
  • Libraries like Keras or PyTorch
  • Simple datasets like Iris, MNIST, or Titanic

Follow tutorials, practice coding, and visualize how data flows through the network.