Visualizing Neural Network Forward and Backpropagation in Three.js
January 4, 2026
I wanted to create a simple visual representation of how neural networks actually work: not the static diagrams you see in textbooks, but something that shows signals flowing forward and errors propagating backward. The result is the 3D animation on my homepage.
The challenge was representing the math visually while keeping it performant in the browser. Here's how I approached it.
The Animation State Machine
The visualization cycles through six phases that mirror actual neural network training:
phase: "idle" | "input" | "propagate" | "output" | "backprop" | "weight_update";
Each active phase has its own duration, and the full cycle runs for 12 seconds:
export const ANIMATION_CONFIG = {
  cycleDuration: 12000,
  inputDuration: 800,
  propagationDuration: 1000,
  outputDuration: 600,
  backpropDuration: 1000,
  weightUpdateDuration: 600,
};
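To make the timing concrete, here is a minimal sketch of how an elapsed time could be mapped to a phase. It assumes each active phase runs once per cycle in the order listed and that "idle" fills whatever time remains; `phaseAt` and `PHASE_SCHEDULE` are hypothetical names, not taken from the actual code.

```typescript
type Phase =
  | "idle"
  | "input"
  | "propagate"
  | "output"
  | "backprop"
  | "weight_update";

const ANIMATION_CONFIG = {
  cycleDuration: 12000,
  inputDuration: 800,
  propagationDuration: 1000,
  outputDuration: 600,
  backpropDuration: 1000,
  weightUpdateDuration: 600,
};

// Assumption: active phases run back to back in this order, once per cycle.
const PHASE_SCHEDULE: ReadonlyArray<readonly [Phase, number]> = [
  ["input", ANIMATION_CONFIG.inputDuration],
  ["propagate", ANIMATION_CONFIG.propagationDuration],
  ["output", ANIMATION_CONFIG.outputDuration],
  ["backprop", ANIMATION_CONFIG.backpropDuration],
  ["weight_update", ANIMATION_CONFIG.weightUpdateDuration],
];

// Returns the phase at `elapsedMs` and how far through that phase we are (0..1).
function phaseAt(elapsedMs: number): { phase: Phase; progress: number } {
  const cycle = ANIMATION_CONFIG.cycleDuration;
  const t = ((elapsedMs % cycle) + cycle) % cycle; // wrap into [0, cycle)
  let start = 0;
  for (const [phase, duration] of PHASE_SCHEDULE) {
    if (t < start + duration) {
      return { phase, progress: (t - start) / duration };
    }
    start += duration;
  }
  // Assumption: whatever is left of the cycle is idle time.
  return { phase: "idle", progress: (t - start) / (cycle - start) };
}
```

Under these assumptions the five active phases take 4 seconds in total, which would leave the remaining 8 seconds of each cycle idle.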
Forward Propagation: The Math
During forward propagation, each neuron computes a weighted sum of its inputs and then applies an activation function. Hidden layers use ReLU, which returns the input if it is positive and zero otherwise:
function relu(x: number): number {
  return Math.max(0, x);
}
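The output layer uses softmax to convert raw scores into probabilities. Each value x_i becomes e^(x_i) / Σ_j e^(x_j), where the sum runs over all values in the layer. Subtracting the maximum before exponentiating leaves the result unchanged but avoids overflow: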
function softmax(values: number[]): number[] {
  const max = Math.max(...values);
  const exps = values.map((v) => Math.exp(v - max)); // numerical stability
  const sum = exps.reduce((a, b) => a + b, 0);
  return exps.map((e) => e / sum);
}
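As a quick illustration of why the max subtraction matters, the sketch below compares the stable version with a naive one on deliberately large logits. `naiveSoftmax` and the sample values are made up for this example.

```typescript
// Stable softmax, as in the post: shift by the max before exponentiating.
function softmax(values: number[]): number[] {
  const max = Math.max(...values);
  const exps = values.map((v) => Math.exp(v - max));
  const sum = exps.reduce((a, b) => a + b, 0);
  return exps.map((e) => e / sum);
}

// Hypothetical naive variant without the shift, for comparison only.
function naiveSoftmax(values: number[]): number[] {
  const exps = values.map((v) => Math.exp(v));
  const sum = exps.reduce((a, b) => a + b, 0);
  return exps.map((e) => e / sum);
}

const largeLogits = [1000, 1001, 1002];

// Math.exp(1000) overflows to Infinity, so the naive version divides
// Infinity by Infinity and every entry becomes NaN.
const naive = naiveSoftmax(largeLogits);

// The stable version effectively computes softmax([-2, -1, 0]).
const stable = softmax(largeLogits);
const total = stable.reduce((a, b) => a + b, 0);

console.log({ naive, stable, total });
```

Softmax only depends on the differences between logits, which is why shifting all of them by the same constant is safe.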
One important implementation detail: in the actual code I don’t repeatedly scan the full connections array while animating. Instead, I build a lookup map once so I can go from (fromNeuron, toNeuron) → weight in O(1). That keeps the animation smooth.
const connectionWeightByKey = new Map<string, number>();
for (const c of connections) {
  connectionWeightByKey.set(`${c.from}->${c.to}`, c.weight);
}
Each layer's activation is computed by summing weighted inputs from the previous layer:
function propagateLayer(layerIndex: number): void {
  if (layerIndex === 0 || layerIndex >= layerCount) return;
  const prevLayer = neuronsByLayer[layerIndex - 1];
  const currentLayer = neuronsByLayer[layerIndex];
  state.activations[layerIndex] = currentLayer.map((neuron) => {
    let sum = 0;
    prevLayer.forEach((prevNeuron, j) => {
      const w = connectionWeightByKey.get(
        `${prevNeuron.index}->${neuron.index}`,
      );
      if (w !== undefined) {
        const prevActivation = state.activations[layerIndex - 1][j];
        sum += prevActivation * w;
      }
    });
    // Hidden layers: ReLU
    // Output layer: raw logits (softmax applied after)
    return layerIndex === layerCount - 1 ? sum : relu(sum + 0.5);
  });
  // Softmax on output layer
  if (layerIndex === layerCount - 1) {
    state.activations[layerIndex] = softmax(state.activations[layerIndex]);
  }
}
A quick note on the + 0.5: in a real neural network you’d typically have a learnable bias term per neuron. Here it’s just a small constant offset so more hidden neurons light up (and the visualization looks better).
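To show what the offset changes, here is a small sketch contrasting the constant offset with a per-neuron bias. The helper names (`weightedSum`, `activateWithOffset`, `activateWithBias`) and the sample numbers are illustrative assumptions, not part of the actual code.

```typescript
function relu(x: number): number {
  return Math.max(0, x);
}

// Weighted sum of one neuron's inputs.
function weightedSum(inputs: number[], weights: number[]): number {
  return inputs.reduce((acc, x, i) => acc + x * (weights[i] ?? 0), 0);
}

// What the visualization does: the same constant offset for every hidden neuron.
const VISUAL_OFFSET = 0.5;
function activateWithOffset(inputs: number[], weights: number[]): number {
  return relu(weightedSum(inputs, weights) + VISUAL_OFFSET);
}

// What a trained network would typically do: a separate learnable bias per neuron.
function activateWithBias(
  inputs: number[],
  weights: number[],
  bias: number,
): number {
  return relu(weightedSum(inputs, weights) + bias);
}

// Hypothetical neuron whose weighted sum is -0.25.
const inputs = [1, 0.5];
const weights = [-0.5, 0.5];

const withoutOffset = activateWithBias(inputs, weights, 0); // ReLU clips the negative sum
const withOffset = activateWithOffset(inputs, weights); // the offset pushes it above zero
console.log({ withoutOffset, withOffset });
```

In this example the neuron would stay dark without the offset and lights up with it, which is the effect the constant is there to produce.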
Backpropagation: Computing Gradients
The backpropagation phase is where the network "learns." In the visualization I treat the output error as:
gradient = activation - target
This mirrors the common softmax + cross-entropy gradient (the familiar p - y form), which holds when the target is a one-hot label or a probability distribution. In my case the target is random because the goal is to show the flow, not to train on a dataset, so it’s a visualization-friendly error signal rather than a loss gradient computed from real labels.
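For reference, here is a short sketch of where the p - y form comes from. It assumes a cross-entropy loss L over softmax outputs p with logits z and a target y whose entries sum to 1; the symbols are notation for this note only and do not appear in the code.

```latex
\begin{aligned}
p_k &= \frac{e^{z_k}}{\sum_j e^{z_j}}, \qquad
L = -\sum_k y_k \log p_k, \\
\frac{\partial p_k}{\partial z_i} &= p_k\,(\delta_{ki} - p_i), \\
\frac{\partial L}{\partial z_i}
  &= -\sum_k \frac{y_k}{p_k}\, p_k\,(\delta_{ki} - p_i)
   = -y_i + p_i \sum_k y_k
   = p_i - y_i .
\end{aligned}
```

The last step uses the fact that the target entries sum to 1, which is why the result holds for any probability distribution and not only for one-hot labels.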
For hidden layers, the error is propagated backward through the weights, multiplied by the derivative of ReLU (which is 1 if the activation was positive, 0 otherwise).
Also important: the input layer has no incoming weights or activation function here, so I skip computing gradients for it and leave them at zero.
function calculateGradients(): void {
  // Random "target" distribution for visualization
  const rawTarget = neuronsByLayer[layerCount - 1].map(() => Math.random());
  const targetSum = rawTarget.reduce((acc, value) => acc + value, 0);
  state.target =
    targetSum > 0 ? rawTarget.map((value) => value / targetSum) : rawTarget;
  // Output layer error
  state.gradients[layerCount - 1] = state.activations[layerCount - 1].map(
    (a, i) => a - state.target[i],
  );
  // Backpropagate through hidden layers (skip input layer l = 0)
  for (let l = layerCount - 2; l >= 1; l--) {
    state.gradients[l] = neuronsByLayer[l].map((neuron, j) => {
      let error = 0;
      neuronsByLayer[l + 1].forEach((nextNeuron, k) => {
        const w = connectionWeightByKey.get(
          `${neuron.index}->${nextNeuron.index}`,
        );
        if (w !== undefined) {
          error += state.gradients[l + 1][k] * w;
        }
      });
      // ReLU derivative
      const reluDerivative = state.activations[l][j] > 0 ? 1 : 0;
      return error * reluDerivative;
    });
  }
  // Keep input gradients at 0
  state.gradients[0] = neuronsByLayer[0].map(() => 0);
}
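One way to sanity-check the recurrence above is a finite-difference comparison on a tiny network. The sketch below is independent of the visualization code: the 2-hidden, 2-output layout, the weights, the target, and all function names are made up for illustration, and it assumes a cross-entropy loss against the target distribution.

```typescript
function softmax(values: number[]): number[] {
  const max = Math.max(...values);
  const exps = values.map((v) => Math.exp(v - max));
  const sum = exps.reduce((a, b) => a + b, 0);
  return exps.map((e) => e / sum);
}

// Hypothetical weights: hiddenToOutput[k][j] connects hidden j to output k.
const hiddenToOutput = [
  [0.8, -0.4],
  [-0.3, 0.6],
];
const target = [0.25, 0.75]; // any distribution that sums to 1

// Forward pass from hidden pre-activations to output probabilities.
function forward(z: number[]): { h: number[]; p: number[] } {
  const h = z.map((v) => Math.max(0, v)); // ReLU
  const logits = hiddenToOutput.map((row) =>
    row.reduce((acc, w, j) => acc + w * (h[j] ?? 0), 0),
  );
  return { h, p: softmax(logits) };
}

// Cross-entropy loss as a function of the hidden pre-activations.
function loss(z: number[]): number {
  const { p } = forward(z);
  return -target.reduce((acc, y, k) => acc + y * Math.log(p[k] ?? 1), 0);
}

// Same recurrence as in the post: output error p - y, pushed back
// through the weights and gated by the ReLU derivative.
function backpropHiddenGradients(z: number[]): number[] {
  const { h, p } = forward(z);
  const outputGradients = p.map((pk, k) => pk - (target[k] ?? 0));
  return h.map((hj, j) => {
    const error = outputGradients.reduce(
      (acc, g, k) => acc + g * (hiddenToOutput[k]?.[j] ?? 0),
      0,
    );
    return error * (hj > 0 ? 1 : 0);
  });
}

// Central finite difference of the loss with respect to z[j].
function numericalGradient(z: number[], j: number, eps = 1e-5): number {
  const plus = z.map((v, i) => (i === j ? v + eps : v));
  const minus = z.map((v, i) => (i === j ? v - eps : v));
  return (loss(plus) - loss(minus)) / (2 * eps);
}

const preActivations = [0.7, -0.2]; // the second hidden neuron is inactive
const analytic = backpropHiddenGradients(preActivations);
const numeric = preActivations.map((_, j) =>
  numericalGradient(preActivations, j),
);
console.log({ analytic, numeric });
```

The two gradient vectors should agree closely, and the inactive neuron receives a zero gradient because its ReLU derivative is 0.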
Weight Updates (Visualized)
I also visualize a "weight update" signal without actually mutating weights each cycle. For each connection, I compute a weight-gradient-like value:
dW ≈ error_at_to * activation_at_from
This is enough to color/flash the connections in a way that looks like learning.
function getWeightGradient(connectionIndex: number): number {
  const conn = connections[connectionIndex];
  const toNeuron = neurons[conn.to];
  const fromNeuron = neurons[conn.from];
  const toLayer = toNeuron.layer;
  const fromLayer = fromNeuron.layer;
  const indexInToLayer = neuronsByLayer[toLayer].findIndex(
    (n) => n.index === toNeuron.index,
  );
  const indexInFromLayer = neuronsByLayer[fromLayer].findIndex(
    (n) => n.index === fromNeuron.index,
  );
  const error = state.gradients[toLayer]?.[indexInToLayer] ?? 0;
  const activation = state.activations[fromLayer]?.[indexInFromLayer] ?? 0;
  return error * activation;
}
If I ever want to turn this into a tiny “real trainer”, the missing piece is just applying weight -= lr * dW.
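A minimal sketch of what that update step could look like, assuming connections are plain `{ from, to, weight }` objects as the earlier snippets suggest. `applyWeightUpdates`, `buildWeightLookup`, and the sample values are hypothetical.

```typescript
// Assumed shape of a connection, inferred from the snippets in this post.
interface Connection {
  from: number;
  to: number;
  weight: number;
}

// One plain gradient-descent step: weight -= lr * dW for every connection.
// Returns new objects so the originals stay untouched.
function applyWeightUpdates(
  connections: Connection[],
  weightGradients: number[],
  learningRate: number,
): Connection[] {
  return connections.map((c, i) => ({
    ...c,
    weight: c.weight - learningRate * (weightGradients[i] ?? 0),
  }));
}

// The (from -> to) lookup caches weights, so it has to be rebuilt after an update.
function buildWeightLookup(connections: Connection[]): Map<string, number> {
  const lookup = new Map<string, number>();
  for (const c of connections) {
    lookup.set(`${c.from}->${c.to}`, c.weight);
  }
  return lookup;
}

// Made-up connections and gradients for illustration.
const connections: Connection[] = [
  { from: 0, to: 2, weight: 0.5 },
  { from: 1, to: 2, weight: -0.2 },
];
const weightGradients = [0.1, -0.4];

const updated = applyWeightUpdates(connections, weightGradients, 0.5);
const lookup = buildWeightLookup(updated);
console.log(updated, lookup);
```

The detail that is easy to miss is the cached weight lookup: once weights change, the map built from the old weights is stale and needs to be refreshed before the next forward pass.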
Visual Representation with Sparks
The sparks traveling along connections represent signal flow. Their color and size encode the weight: cyan for negative, yellow for near zero, and orange for positive, with brightness boosted by the activation:
export function getSparkColor(
  weight: number,
  activation: number = 1,
): THREE.Color {
  const absWeight = Math.abs(weight);
  const negativeColor = new THREE.Color(0x0891b2); // Cyan - negative weights
  const neutralColor = new THREE.Color(0xeab308); // Yellow - neutral
  const positiveColor = new THREE.Color(0xea580c); // Orange - positive weights
  // Color.lerp does not clamp its factor, so cap it at 1 to avoid overshooting the end color
  const blend = Math.min(1, absWeight * 1.2);
  let baseColor: THREE.Color;
  if (weight >= 0) {
    baseColor = neutralColor.clone().lerp(positiveColor, blend);
  } else {
    baseColor = neutralColor.clone().lerp(negativeColor, blend);
  }
  // Values above 1 are intentional: they feed the bloom pass
  const brightBoost = 2.5 + activation * 2.0;
  baseColor.multiplyScalar(brightBoost);
  return baseColor;
}
During backpropagation, sparks travel in reverse with a different color palette (teal → pink → orange) based on gradient magnitude:
export function getBackpropSparkColor(
  gradient: number,
  intensity: number = 1,
): THREE.Color {
  // Cap at 1 so large hidden-layer gradients do not overshoot the last color stop
  const absGradient = Math.min(1, Math.abs(gradient));
  const weakError = new THREE.Color(0x14b8a6); // Teal - weak gradient
  const midError = new THREE.Color(0xec4899); // Pink - medium gradient
  const strongError = new THREE.Color(0xf97316); // Orange - strong gradient
  let baseColor: THREE.Color;
  if (absGradient < 0.5) {
    baseColor = weakError.clone().lerp(midError, absGradient * 2);
  } else {
    baseColor = midError.clone().lerp(strongError, (absGradient - 0.5) * 2);
  }
  return baseColor;
}
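To check the color stops without pulling in Three.js, here is a dependency-free sketch of the same two-segment ramp. It blends raw 0-255 channel values and ignores any color-space conversion Three.js may apply, so treat it as an approximation; `hexToRgb`, `lerpRgb`, and `backpropRamp` are hypothetical helpers.

```typescript
type RGB = [number, number, number];

// Split a 0xRRGGBB number into 0-255 channels.
function hexToRgb(hex: number): RGB {
  return [(hex >> 16) & 0xff, (hex >> 8) & 0xff, hex & 0xff];
}

// Linear blend between two colors, with the factor clamped to [0, 1].
function lerpRgb(a: RGB, b: RGB, t: number): RGB {
  const f = Math.min(1, Math.max(0, t));
  return [
    a[0] + (b[0] - a[0]) * f,
    a[1] + (b[1] - a[1]) * f,
    a[2] + (b[2] - a[2]) * f,
  ];
}

const WEAK = hexToRgb(0x14b8a6); // teal
const MID = hexToRgb(0xec4899); // pink
const STRONG = hexToRgb(0xf97316); // orange

// Teal -> pink over |gradient| in [0, 0.5), pink -> orange over [0.5, 1].
function backpropRamp(gradient: number): RGB {
  const g = Math.min(1, Math.abs(gradient));
  return g < 0.5
    ? lerpRgb(WEAK, MID, g * 2)
    : lerpRgb(MID, STRONG, (g - 0.5) * 2);
}

console.log(backpropRamp(0), backpropRamp(0.5), backpropRamp(1));
```

Because the ramp uses the absolute value, positive and negative gradients of the same size map to the same color.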
Performance Considerations
The visualization uses Three.js InstancedMesh for neurons and a particle system for sparks. This lets the GPU handle thousands of objects efficiently. Post-processing with UnrealBloomPass adds the glow effect, with different settings for light and dark modes.
The biggest CPU-side optimization is avoiding repeated connection scans: forward pass and backprop both use a precomputed (from → to) weight lookup, so as the network grows, the animation stays smooth.
The entire animation runs at 60fps on most devices, making the math behind neural networks tangible and interactive.