What is Backpropagation?
Backpropagation (backward propagation of errors) is the algorithm at the heart of neural network training. It computes the gradient of the loss function with respect to each weight by propagating error signals backward through the network.
Training proceeds in four steps: (1) a forward pass to produce predictions, (2) a loss calculation comparing predictions to targets, (3) a backward pass to compute gradients, and (4) a weight update driven by those gradients. Repeating this loop is how the network learns from its mistakes; a worked numeric sketch follows the walkthrough below.
The Learning Process
Backpropagation enables neural networks to learn by adjusting weights:
print("Training a Neural Network with Backpropagation:")
print("=" * 50")
print("\nStep 1: Forward Pass")
print(" - Input data flows through network")
print(" - Each layer computes: output = activation(weights × inputs + bias)")
print(" - Final output is prediction")
print("\nStep 2: Calculate Loss")
print(" - Compare prediction with actual target")
print(" - Loss function measures error (e.g., Mean Squared Error)")
print(" - Example: loss = (predicted - actual)²")
print("\nStep 3: Backward Pass (Backpropagation)")
print(" - Calculate gradient of loss with respect to each weight")
print(" - Start from output layer, work backward")
print(" - Use chain rule of calculus")
print(" - Gradient tells us: which direction to adjust weight")
print("\nStep 4: Update Weights")
print(" - Adjust weights using gradients")
print(" - weight_new = weight_old - learning_rate × gradient")
print(" - Learning rate controls step size")
print("\nRepeat Steps 1-4 for many iterations (epochs)!")
Gradient Calculation
Backpropagation uses the chain rule to calculate gradients:
print("Gradient Calculation in Backpropagation:")
print("=" * 50")
print("\nGoal: Calculate d(loss)/d(weight)")
print(" - How much does loss change when weight changes?")
print("\nUsing Chain Rule:")
print(" d(loss)/d(weight) = d(loss)/d(output) × d(output)/d(weight)")
print(" - Break complex derivative into simpler parts")
print("\nExample Calculation:")
print(" Forward: output = sigmoid(weight × input + bias)")
print(" Loss: loss = (output - target)²")
print(" "))
print(" d(loss)/d(output) = 2 × (output - target)")
print(" d(output)/d(weight) = sigmoid'(weight×input+bias) × input")
print(" "))
print(" Gradient = 2 × (output - target) × sigmoid'(...) × input")
print("\nKey Insight:")
print(" - Gradients flow backward from output to input")
print(" - Each layer's gradient depends on next layer's gradient")
print(" - Start from output, work backward through layers")
Weight Updates
Weights are updated using gradient descent:
print("Weight Update Process:")
print("=" * 50")
current_weight = 0.5
gradient = 0.3
learning_rate = 0.1
print(f"Current weight: {current_weight}")
print(f"Gradient: {gradient}")
print(f"Learning rate: {learning_rate}")
new_weight = current_weight - learning_rate * gradient
print(f"\nUpdated weight: {new_weight}")
print(" Formula: weight_new = weight_old - α × gradient")
print("\nLearning Rate Impact:")
print(" - Too small: Slow learning")
print(" - Too large: Overshoot, unstable training")
print(" - Just right: Efficient convergence")
print("\nMultiple Updates:")
print(" - Process repeats for each training example")
print(" - Or batch updates (average gradients over batch)")
print(" - Weights gradually improve over many iterations")
Exercise: Understand Backpropagation
Complete the exercise on the right side:
- Task 1: Simulate forward pass and calculate prediction
- Task 2: Calculate loss (difference between prediction and target)
- Task 3: Calculate gradient (derivative of loss with respect to weight)
- Task 4: Update weight using gradient descent rule
Write your code to simulate the backpropagation learning process!
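If you want a starting point, here is a hedged skeleton for the four tasks. It assumes a simple linear neuron (prediction = weight × input) and uses placeholder numbers; swap in the exercise's actual values:
# Placeholder values; replace with the exercise's own numbers
x, target = 1.0, 0.5
weight, learning_rate = 0.2, 0.1

# Task 1: forward pass
prediction = weight * x

# Task 2: loss
loss = (prediction - target) ** 2

# Task 3: gradient of loss with respect to the weight (chain rule)
gradient = 2 * (prediction - target) * x

# Task 4: gradient descent update
weight = weight - learning_rate * gradient
print(f"loss = {loss:.4f}, updated weight = {weight:.4f}")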
💡 Learning Tip
Practice is essential. Try modifying the code examples, experiment with different parameters, and see how changes affect the results. Hands-on experience is the best teacher!