Implement the XOR Problem using a Multi-Layer Perceptron

XOR Problem:

Below is a clear, step-by-step implementation of the XOR problem using a Multi-Layer Perceptron (MLP). I will first explain the theory, then show the mathematical steps, and finally provide a simple implementation.

Step 1: Understand the XOR Problem

The XOR (Exclusive OR) function outputs:

x₁   x₂   XOR
0    0    0
0    1    1
1    0    1
1    1    0

Key observation:
XOR is not linearly separable, so it cannot be solved by a single-layer perceptron.
Hence, we need a Multi-Layer Perceptron with at least one hidden layer.
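
As a quick check (not part of the original derivation), the truth table above can be reproduced with Python's built-in bitwise XOR operator:

# Reproduce the XOR truth table with Python's bitwise XOR operator
for x1 in (0, 1):
    for x2 in (0, 1):
        print(x1, x2, x1 ^ x2)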

Step 2: Network Architecture

We choose a 2–2–1 MLP architecture:

  • Input layer: 2 neurons (x₁, x₂)
  • Hidden layer: 2 neurons
  • Output layer: 1 neuron

Activation function:

  • Hidden layer → Sigmoid
  • Output layer → Sigmoid

Step 3: Initialize Parameters

Let:

  • Weights from input to hidden layer → W₁ (shape 2×2)
  • Bias for hidden layer → b₁ (shape 1×2)
  • Weights from hidden to output layer → W₂ (shape 2×1)
  • Bias for output layer → b₂ (shape 1×1)

Initialize weights and biases with small random values.
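
A minimal sketch of this initialization; the same names and shapes (W1, b1, W2, b2) are used in the complete program at the end of this post:

import numpy as np

np.random.seed(0)              # fixed seed so runs are reproducible
W1 = np.random.randn(2, 2)     # input -> hidden weights, shape (2, 2)
b1 = np.zeros((1, 2))          # hidden-layer bias, shape (1, 2)
W2 = np.random.randn(2, 1)     # hidden -> output weights, shape (2, 1)
b2 = np.zeros((1, 1))          # output-layer bias, shape (1, 1)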

Step 4: Forward Propagation

4.1 Hidden Layer Computation

z₁ = X·W₁ + b₁
a₁ = σ(z₁)

4.2 Output Layer Computation

z₂ = a₁·W₂ + b₂
ŷ = σ(z₂)

Where the sigmoid function is:

σ(x) = 1 / (1 + e⁻ˣ)
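
Continuing from the parameters above, a short NumPy sketch of this forward pass (the same lines appear inside the training loop of the full program):

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])   # the four XOR inputs

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

z1 = np.dot(X, W1) + b1    # hidden pre-activation, shape (4, 2)
a1 = sigmoid(z1)           # hidden activation
z2 = np.dot(a1, W2) + b2   # output pre-activation, shape (4, 1)
y_pred = sigmoid(z2)       # network output, one value per input row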


Step 5: Loss Function

Use Mean Squared Error (MSE):

L = (1/N) Σᵢ (yᵢ − ŷᵢ)²

where N is the number of training samples (N = 4 for the XOR dataset).
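
Continuing the sketch, with y holding the XOR targets, the loss is a one-liner in NumPy:

y = np.array([[0], [1], [1], [0]])    # XOR targets
loss = np.mean((y - y_pred) ** 2)     # mean squared error over the 4 samples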


Step 6: Backpropagation

6.1 Output Layer Error

δ_output = (ŷ − y) ⊙ σ′(z₂)

6.2 Hidden Layer Error

δ_hidden = (δ_output · W₂ᵀ) ⊙ σ′(z₁)

Where σ′ is the derivative of the sigmoid function:

σ′(x) = σ(x) · (1 − σ(x))

and ⊙ denotes element-wise multiplication.
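
The same two error terms in NumPy. Note that sigmoid_derivative is written to take the already-activated values (a1, y_pred), which is why it is simply s * (1 - s):

def sigmoid_derivative(s):
    # s is assumed to be a sigmoid output, i.e. s = sigmoid(x)
    return s * (1 - s)

delta_output = (y_pred - y) * sigmoid_derivative(y_pred)            # error at the output layer
delta_hidden = np.dot(delta_output, W2.T) * sigmoid_derivative(a1)  # error pushed back to the hidden layer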


Step 7: Update Weights and Biases

Using Gradient Descent, every weight and bias is moved against its gradient:

W₂ ← W₂ − η · a₁ᵀ δ_output
b₂ ← b₂ − η · Σ δ_output

W₁ ← W₁ − η · Xᵀ δ_hidden
b₁ ← b₁ − η · Σ δ_hidden

Where η is the learning rate.
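
In NumPy, the corresponding updates (identical to the lines inside the training loop below), with learning_rate standing in for η:

learning_rate = 0.1

W2 -= learning_rate * np.dot(a1.T, delta_output)
b2 -= learning_rate * np.sum(delta_output, axis=0, keepdims=True)
W1 -= learning_rate * np.dot(X.T, delta_hidden)
b1 -= learning_rate * np.sum(delta_hidden, axis=0, keepdims=True)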

Step 8: Algorithm (Step-by-Step)

  1. Initialize weights and biases
  2. Perform forward propagation
  3. Compute loss
  4. Perform backpropagation
  5. Update weights and biases
  6. Repeat for multiple epochs

Step 9: Complete Python Implementation

The full NumPy program below puts all of the above steps together, trains the network, and plots the training loss:

import numpy as np
import matplotlib.pyplot as plt

# -------------------------------
# Step 1: XOR Dataset
# -------------------------------
X = np.array([[0, 0],
              [0, 1],
              [1, 0],
              [1, 1]])

y = np.array([[0],
              [1],
              [1],
              [0]])

# -------------------------------
# Step 2: Initialize Parameters
# -------------------------------
np.random.seed(0)

W1 = np.random.randn(2, 2)
b1 = np.zeros((1, 2))

W2 = np.random.randn(2, 1)
b2 = np.zeros((1, 1))

# -------------------------------
# Step 3: Activation Functions
# -------------------------------
def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def sigmoid_derivative(s):
    # s is expected to already be a sigmoid output (a1 or y_pred),
    # so s * (1 - s) equals the derivative of the sigmoid at the pre-activation
    return s * (1 - s)

# -------------------------------
# Step 4: Training Parameters
# -------------------------------
learning_rate = 0.1
epochs = 10000
losses = []

# -------------------------------
# Step 5: Training Loop
# -------------------------------
for epoch in range(epochs):

    # Forward Propagation
    z1 = np.dot(X, W1) + b1
    a1 = sigmoid(z1)

    z2 = np.dot(a1, W2) + b2
    y_pred = sigmoid(z2)

    # Mean Squared Error Loss
    loss = np.mean((y - y_pred) ** 2)
    losses.append(loss)

    # Backpropagation
    delta_output = (y_pred - y) * sigmoid_derivative(y_pred)
    delta_hidden = np.dot(delta_output, W2.T) * sigmoid_derivative(a1)

    # Update Weights and Biases
    W2 -= learning_rate * np.dot(a1.T, delta_output)
    b2 -= learning_rate * np.sum(delta_output, axis=0, keepdims=True)

    W1 -= learning_rate * np.dot(X.T, delta_hidden)
    b1 -= learning_rate * np.sum(delta_hidden, axis=0, keepdims=True)

# -------------------------------
# Step 6: Plot Training Loss Graph
# -------------------------------
plt.figure()
plt.plot(losses)
plt.xlabel("Epochs")
plt.ylabel("Mean Squared Error")
plt.title("Training Loss Curve for XOR using MLP")
plt.show()

# -------------------------------
# Step 7: Final Prediction
# -------------------------------
print("Final Predicted Output:")
print(np.round(y_pred))
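
To query the trained network on individual inputs, a small helper can be appended to the script. The predict function below is a hypothetical addition for illustration, not part of the original program:

def predict(x):
    # One forward pass through the trained 2-2-1 network (helper added for illustration)
    a1 = sigmoid(np.dot(x, W1) + b1)
    return sigmoid(np.dot(a1, W2) + b2)

for sample in X:
    p = predict(sample.reshape(1, -1))
    print(sample, "->", int(np.round(p[0, 0])))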



