Difficulty: Intermediate

Backpropagation Algorithm

Discover how neural networks learn by propagating errors backward and updating weights through gradient descent.

Gradient Descent · Weight Updates · Error Propagation · Neural Network Training
[Interactive visualization: a 2-3-1 network with inputs x1 = 0.50 and x2 = 0.30, a three-neuron hidden layer, and a single output neuron with target 0.80. Live readouts track the current epoch, loss (MSE), and output; controls adjust the inputs and training parameters. Weight colors: green = positive, red = negative.]

Backpropagation Algorithm

1. Forward Pass

Calculate activations from input to output:

a = σ(Wx + b), where σ is the sigmoid activation function
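
For concreteness, here is a minimal NumPy sketch of this forward pass for a 2-3-1 network like the one in the visualization (the weights are random placeholders, not the demo's actual values):

import numpy as np

def sigmoid(x):
    # σ(x) = 1 / (1 + e^(-x)): squashes any real value into (0, 1)
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)
x = np.array([0.5, 0.3])                           # inputs x1, x2 from the demo
W1, b1 = rng.uniform(-1, 1, (3, 2)), np.zeros(3)   # hidden layer: 3 neurons
W2, b2 = rng.uniform(-1, 1, (1, 3)), np.zeros(1)   # output layer: 1 neuron

# a = σ(Wx + b), applied layer by layer
hidden = sigmoid(W1 @ x + b1)
output = sigmoid(W2 @ hidden + b2)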

2. Calculate Error

Compute the difference between prediction and target:

Error = Target - Output
Loss = ½(Target - Output)²
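
With the demo's target of 0.80 and an illustrative output of 0.55, the two quantities work out as:

target, output = 0.8, 0.55            # 0.55 is an illustrative prediction
error = target - output               # 0.25
loss = 0.5 * (target - output) ** 2   # 0.5 × 0.0625 = 0.03125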

3. Backward Pass

Calculate deltas (error signals) from output to input:

δ_output = Error × σ'(output)
δ_hidden = (W^T · δ_next) × σ'(activation)

where σ' is the sigmoid's derivative; for a stored activation a = σ(z), it is simply a × (1 - a)
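
A sketch of the backward pass for the same 2-3-1 shape, using the a × (1 - a) identity so the derivative comes straight from the stored activations (all numbers are illustrative):

import numpy as np

def sigmoid_prime(a):
    # For a stored activation a = σ(z), the derivative is a * (1 - a)
    return a * (1.0 - a)

hidden = np.array([0.6, 0.4, 0.7])   # illustrative hidden activations
output = np.array([0.55])            # illustrative network output
W2 = np.array([[0.3, -0.2, 0.5]])    # illustrative output-layer weights (1 x 3)
error = 0.8 - output                 # Target - Output

# Deltas flow from the output layer back to the hidden layer
delta_output = error * sigmoid_prime(output)
delta_hidden = (W2.T @ delta_output) * sigmoid_prime(hidden)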

4. Update Weights

Adjust weights using gradient descent:

W_new = W_old + η × δ × a_prev
b_new = b_old + η × δ

where η is the learning rate and a_prev is the previous layer's activation. The plus sign is correct here because δ is built from (Target - Output), so the step already points in the direction that reduces the loss.
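
Continuing the backward-pass sketch, the output layer's update looks like this (η and all values are illustrative):

import numpy as np

eta = 0.5                            # learning rate η
hidden = np.array([0.6, 0.4, 0.7])   # a_prev for the output layer
delta_output = np.array([0.0619])    # δ from the backward-pass sketch
W2 = np.array([[0.3, -0.2, 0.5]])
b2 = np.zeros(1)

# W_new = W_old + η × δ × a_prev ('+' works because δ carries Target - Output)
W2 += eta * np.outer(delta_output, hidden)
b2 += eta * delta_output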

Key Concepts

  • Sigmoid Function: σ(x) = 1/(1 + e^(-x)), which squashes values into (0,1)
  • Delta (δ): Measures how much a neuron contributes to the error
  • Gradient: Direction and magnitude of weight update
  • Learning Rate: Controls step size during weight updates
  • Weight Color: Green = positive connection, Red = negative (inhibitory)

Learning Objectives

  • Understand the forward and backward passes in neural networks
  • Observe how errors propagate from output to input layers
  • See how weights are updated using gradient descent
  • Learn the role of learning rate in network training
  • Visualize loss minimization over training epochs

🔄 Algorithm Steps

1. Forward Pass

Compute activations from the input layer to the output layer. Each neuron applies its weights, adds a bias, and passes the result through the activation function (sigmoid).

2. Calculate Error

Compare the network's output with the target value and compute the loss (Mean Squared Error) to measure how far off the prediction is.

3. Backward Pass

Calculate deltas (δ) for each neuron, from output to input. A neuron's delta represents how much it contributed to the error.

4. Update Weights

Adjust all weights and biases using gradient descent. Each update moves in the direction that reduces the error, scaled by the learning rate.
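
Putting the four steps together, a minimal end-to-end training loop for the demo's 2-3-1 setup (initialization, epoch count, and η are illustrative choices):

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)
x, target, eta = np.array([0.5, 0.3]), 0.8, 0.5   # demo inputs, target, and η

W1, b1 = rng.uniform(-1, 1, (3, 2)), np.zeros(3)  # hidden layer (3 neurons)
W2, b2 = rng.uniform(-1, 1, (1, 3)), np.zeros(1)  # output layer (1 neuron)

for epoch in range(1000):
    # 1. Forward pass
    h = sigmoid(W1 @ x + b1)
    y = sigmoid(W2 @ h + b2)
    # 2. Error and loss
    error = target - y
    loss = 0.5 * error ** 2
    # 3. Backward pass (σ' computed from activations as a * (1 - a))
    d_out = error * y * (1 - y)
    d_hid = (W2.T @ d_out) * h * (1 - h)
    # 4. Gradient-descent updates
    W2 += eta * np.outer(d_out, h)
    b2 += eta * d_out
    W1 += eta * np.outer(d_hid, x)
    b1 += eta * d_hid

print(y.item(), loss.item())   # output approaches 0.80, loss approaches 0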

💡 Experimentation Tips

  • Learning Rate: Try high values (1.5+) to see oscillation and low values (e.g., 0.1) to see slow convergence (see the sketch after this list)
  • Watch Deltas: Enable gradient display to see error signals propagating backward
  • Weight Colors: Green = excitatory (positive), Red = inhibitory (negative)
  • Training Speed: Adjust animation speed to observe individual updates or fast convergence
  • Different Targets: Change target output to see network adapt to new goals
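
To see the learning-rate trade-off outside the demo, a single-neuron experiment is enough; a small η leaves the loss high after a fixed step budget, while a very large η can overshoot the target before settling (all values illustrative):

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def loss_after(eta, steps=50):
    # One sigmoid neuron, one weight, one training example
    w, x, target = 0.0, 1.0, 0.8
    for _ in range(steps):
        y = sigmoid(w * x)
        w += eta * (target - y) * y * (1 - y) * x   # delta rule
    return 0.5 * (target - sigmoid(w * x)) ** 2

for eta in (0.1, 1.0, 50.0):
    print(f"eta = {eta}: loss after 50 steps = {loss_after(eta):.6f}")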

🌍 Real-World Applications

Image Recognition

Deep neural networks use backprop to learn features from millions of images for classification tasks.

Natural Language Processing

Language models like GPT use backpropagation to learn word patterns and generate human-like text.

Speech Recognition

Voice assistants train neural networks with backprop to convert speech to text accurately.

Autonomous Vehicles

Self-driving cars use neural networks trained with backprop to detect objects and make driving decisions.

Medical Diagnosis

AI systems learn to detect diseases from medical images using backpropagation-trained networks.

Game AI

AlphaGo and other game-playing AIs use backprop to learn winning strategies from experience.

📚 Key Concepts Explained

Activation Function (Sigmoid)

Squashes neuron outputs to range (0,1). Formula: σ(x) = 1/(1 + e^(-x))

Delta (δ)

Error signal for each neuron. Shows how much it contributed to total error.

Gradient

Direction and magnitude of steepest descent. Used to update weights toward lower error.

Learning Rate (η)

Step size for weight updates. Too high causes oscillation, too low slows convergence.

Loss Function (MSE)

Mean Squared Error: ½(target - output)² for a single output. Measures how far the prediction is from the target.
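
To make these definitions concrete, here is one output-neuron update with illustrative numbers (target = 0.8, output = 0.55, previous activation a_prev = 0.6, η = 0.5):

Error = 0.8 - 0.55 = 0.25
Loss = ½ × 0.25² ≈ 0.0313
δ_output = 0.25 × 0.55 × (1 - 0.55) ≈ 0.0619
ΔW = η × δ_output × a_prev = 0.5 × 0.0619 × 0.6 ≈ 0.0186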