Backpropagation Algorithm
Discover how neural networks learn by propagating errors backward and updating weights through gradient descent.
Backpropagation Algorithm
1. Forward Pass
Calculate activations from input to output:
a = σ(Wx + b), where σ is the sigmoid activation
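A minimal NumPy sketch of this step (the layer sizes, values, and variable names are illustrative, not taken from the visualization):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Illustrative 2-input, 3-neuron hidden layer
x = np.array([0.5, 0.9])            # input vector
W = np.random.randn(3, 2) * 0.5     # weight matrix (hidden x input)
b = np.zeros(3)                     # bias vector

a = sigmoid(W @ x + b)              # a = sigma(Wx + b)
```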
2. Calculate Error
Compute the difference between prediction and target:
Error = Target - Output
Loss = ½(Target - Output)²
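Continuing the sketch, the error and loss for a single training example (all values made up):

```python
import numpy as np

target = np.array([1.0])
output = np.array([0.73])

error = target - output             # Error = Target - Output
loss = 0.5 * np.sum(error ** 2)     # Loss = 1/2 (Target - Output)^2
```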
3. Backward Pass
Calculate deltas (error signals) from output to input:
δ_output = Error × σ'(output)
δ_hidden = (δ_next · W^T) × σ'(activation)
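The same deltas in NumPy, using the sigmoid identity σ'(a) = a(1 - a) so the derivative is computed from activations saved during the forward pass (the weights and activations below are made-up numbers):

```python
import numpy as np

a_hidden = np.array([0.6, 0.4, 0.8])    # hidden activations from the forward pass
W_out = np.array([[0.2, -0.5, 0.7]])    # output-layer weights (1 x 3), illustrative
output = np.array([0.73])
error = np.array([1.0]) - output        # target assumed to be 1.0

# sigma'(a) = a * (1 - a) for the sigmoid
delta_output = error * output * (1.0 - output)
delta_hidden = (W_out.T @ delta_output) * a_hidden * (1.0 - a_hidden)
```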
4. Update Weights
Adjust weights using gradient descent:
W_new = W_old + η × δ × a_prev
b_new = b_old + η × δ
where η is the learning rate. The plus sign is correct here because δ already carries the sign of Error = Target - Output; written in terms of the loss gradient, the same update uses a minus sign.
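And the corresponding update step, as a sketch with made-up values carried over from above:

```python
import numpy as np

eta = 0.5                                # learning rate (illustrative)
a_hidden = np.array([0.6, 0.4, 0.8])
delta_output = np.array([0.05])          # from the backward pass
W_out = np.array([[0.2, -0.5, 0.7]])
b_out = np.zeros(1)

# "+" rather than "-" because delta already carries the sign of Error = Target - Output
W_out += eta * np.outer(delta_output, a_hidden)   # W_new = W_old + eta * delta * a_prev
b_out += eta * delta_output                       # b_new = b_old + eta * delta
```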
Key Concepts
- Sigmoid Function: σ(x) = 1/(1 + e^(-x)); squashes values to (0,1) (see the sketch after this list)
- Delta (δ): Measures how much a neuron contributes to the error
- Gradient: Direction and magnitude of steepest increase in the loss; weights move in the opposite direction
- Learning Rate: Controls step size during weight updates
- Weight Color: Green = positive (excitatory) connection, Red = negative (inhibitory)
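As promised above, a quick check of the sigmoid and its convenient derivative σ'(x) = σ(x)(1 - σ(x)) against a finite difference (the probe point 0.3 is arbitrary):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

x = 0.3
analytic = sigmoid(x) * (1.0 - sigmoid(x))                  # sigma'(x) = sigma(x)(1 - sigma(x))
numeric = (sigmoid(x + 1e-6) - sigmoid(x - 1e-6)) / 2e-6    # central difference
print(analytic, numeric)                                    # the two values agree to ~1e-10
```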
Learning Objectives
- ✓ Understand the forward and backward passes in neural networks
- ✓ Observe how errors propagate from output to input layers
- ✓ See how weights are updated using gradient descent
- ✓ Learn the role of the learning rate in network training
- ✓ Visualize loss minimization over training epochs
🔄 Algorithm Steps
1. Forward Pass
Compute activations from the input to the output layer. Each neuron applies its weights, adds its bias, and passes the result through the activation function (sigmoid).
2. Calculate Error
Compare the network output with the target value. Compute the loss (Mean Squared Error) to measure how far off the prediction is.
3. Backward Pass
Calculate deltas (δ) for each neuron from output to input. A delta represents how much that neuron contributed to the error.
4. Update Weights
Adjust all weights and biases using gradient descent: move in the direction that reduces the error, scaled by the learning rate.
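Putting the four steps together, a self-contained training loop for a tiny 2-3-1 network (the architecture, data, and hyperparameters are illustrative choices, not the visualization's defaults):

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Tiny 2 -> 3 -> 1 network and a single illustrative training example
x, target = np.array([0.5, 0.9]), np.array([1.0])
W1, b1 = rng.normal(0, 0.5, (3, 2)), np.zeros(3)
W2, b2 = rng.normal(0, 0.5, (1, 3)), np.zeros(1)
eta = 0.5

for epoch in range(1000):
    a1 = sigmoid(W1 @ x + b1)            # 1. forward pass
    a2 = sigmoid(W2 @ a1 + b2)

    error = target - a2                  # 2. error and loss
    loss = 0.5 * np.sum(error ** 2)

    d2 = error * a2 * (1 - a2)           # 3. backward pass, sigma'(a) = a(1 - a)
    d1 = (W2.T @ d2) * a1 * (1 - a1)

    W2 += eta * np.outer(d2, a1)         # 4. gradient-descent update
    b2 += eta * d2
    W1 += eta * np.outer(d1, x)
    b1 += eta * d1

print(f"final loss: {loss:.6f}")         # shrinks toward 0 as the output approaches the target
```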
💡 Experimentation Tips
- Learning Rate: Try high values (1.5+) to see oscillation, low values (0.1) for slow convergence; see the sketch after these tips
- Watch Deltas: Enable the gradient display to see error signals propagating backward
- Weight Colors: Green = excitatory (positive), Red = inhibitory (negative)
- Training Speed: Adjust the animation speed to observe individual updates or fast convergence
- Different Targets: Change the target output to see the network adapt to new goals
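The learning-rate trade-off from the first tip shows up even on a one-dimensional toy problem; here is a sketch minimizing f(w) = ½(w - 2)², whose gradient is (w - 2) (all values illustrative):

```python
# Plain gradient descent on f(w) = 0.5 * (w - 2)^2
for eta in (0.1, 1.0, 1.9, 2.1):
    w = 0.0
    for _ in range(20):
        w -= eta * (w - 2.0)             # w_new = w_old - eta * f'(w)
    print(f"eta={eta}: w={w:.4f}")
# eta=0.1 crawls toward 2.0, eta=1.0 lands on it immediately,
# eta=1.9 oscillates but converges, eta=2.1 diverges
```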
🌍 Real-World Applications
Image Recognition
Deep neural networks use backprop to learn features from millions of images for classification tasks.
Natural Language Processing
Language models like GPT use backpropagation to learn word patterns and generate human-like text.
Speech Recognition
Voice assistants train neural networks with backprop to convert speech to text accurately.
Autonomous Vehicles
Self-driving cars use neural networks trained with backprop to detect objects and make driving decisions.
Medical Diagnosis
AI systems learn to detect diseases from medical images using backpropagation-trained networks.
Game AI
AlphaGo and other game-playing AIs use backprop to learn winning strategies from experience.
📚 Key Concepts Explained
Activation Function (Sigmoid)
Squashes neuron outputs to the range (0,1). Formula: σ(x) = 1/(1 + e^(-x)). Its derivative, σ'(x) = σ(x)(1 - σ(x)), is what the backward pass evaluates at each activation.
Delta (δ)
Error signal for each neuron. Shows how much that neuron contributed to the total error.
Gradient
Direction and magnitude of the steepest increase in the loss. Weights are updated in the opposite direction, toward lower error.
Learning Rate (η)
Step size for weight updates. Too high causes oscillation, too low slows convergence.
Loss Function (MSE)
Mean Squared Error: ½(target - output)². Measures how far the prediction is from the target; the ½ factor cancels when differentiating.