Keras
Understanding Keras Models: Your Neural Network Blueprint
Think of a Keras model as the blueprint for a neural network's architecture. Keras is designed to be simple and lets you build models quickly, while a powerful engine such as TensorFlow does the heavy lifting in the background.
The Three Ways to Build a Keras Model
1. Sequential API: The Straightforward Stack
What it is: The easiest way to build a model, like stacking Lego blocks in a single, straight line.
Best for: Standard networks where data flows directly from one layer to the next with no fancy detours.
Limitation: It's too rigid for complex designs that need multiple inputs, outputs, or layers that share connections.
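A minimal sketch of the Sequential style (the 20-feature input and layer sizes are placeholder choices, not tied to any dataset):
from tensorflow import keras
# Layers run in order; each one feeds the next.
model = keras.Sequential([
    keras.layers.Dense(64, activation='relu', input_shape=(20,)),  # 20 input features (placeholder)
    keras.layers.Dense(10, activation='softmax')                   # 10 output classes (placeholder)
])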
2. Functional API: The Flexible Designer
What it is: A more powerful approach that lets you create complex, branch-like architectures.
Best for: Advanced designs with multiple inputs/outputs, skip connections (like shortcuts in the network), or layers that are shared.
Example: Building sophisticated models like ResNet, whose skip connections can't be expressed as a single straight stack.
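A minimal sketch of a skip connection in the Functional style (the 32-unit sizes are illustrative; this shows the pattern, not a full ResNet):
from tensorflow import keras
inputs = keras.Input(shape=(32,))
x = keras.layers.Dense(32, activation='relu')(inputs)
x = keras.layers.Add()([x, inputs])  # skip connection: add the block's input back to its output
outputs = keras.layers.Dense(10, activation='softmax')(x)
model = keras.Model(inputs=inputs, outputs=outputs)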
3. Model Subclassing: The Custom Workshop
What it is: The ultimate level of control. You build the model from scratch by writing your own code.
Best for: Researchers and experts who need to create entirely new, custom network designs that don't fit standard patterns.
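A minimal sketch of the subclassing pattern (MyModel and its layer sizes are our own illustrative choices):
from tensorflow import keras
class MyModel(keras.Model):
    def __init__(self):
        super().__init__()
        self.dense1 = keras.layers.Dense(64, activation='relu')
        self.dense2 = keras.layers.Dense(10, activation='softmax')
    def call(self, inputs, training=False):
        # Arbitrary Python can run here: loops, conditionals, custom math.
        x = self.dense1(inputs)
        return self.dense2(x)
model = MyModel()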
The Trade-Offs: What to Keep in Mind
1. Sequential Models Are Simple, But Limited
The easy-to-use Sequential API can't handle complex designs. If your network needs branches or multiple outputs, you must use the Functional API or Model Subclassing.
2. Debugging Can Be Tricky
While Keras is user-friendly, error messages can sometimes be confusing because they come from the complex engine (TensorFlow) running underneath.
3. You're One Step Removed from the Engine
Keras is a high-level tool, which means it hides the complex, low-level details. This is great for simplicity, but it means you have less fine-grained control compared to coding directly in TensorFlow.
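When you do need that control, a Keras model still plugs into TensorFlow's low-level API. Here is a sketch of one hand-written training step (train_step is our own name; this assumes tf.keras and a model built elsewhere):
import tensorflow as tf
optimizer = tf.keras.optimizers.Adam()
loss_fn = tf.keras.losses.SparseCategoricalCrossentropy()
def train_step(model, x_batch, y_batch):
    with tf.GradientTape() as tape:              # record operations for autodiff
        preds = model(x_batch, training=True)
        loss = loss_fn(y_batch, preds)
    grads = tape.gradient(loss, model.trainable_variables)
    optimizer.apply_gradients(zip(grads, model.trainable_variables))
    return loss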
4. A Note on Backend Changes
Historically, Keras ran on several interchangeable engines (Theano, TensorFlow, CNTK), and switching backends could break older projects. It was later tied to TensorFlow as tf.keras, and Keras 3 is multi-backend again (TensorFlow, JAX, PyTorch), so it's still worth knowing which engine your code assumes.
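With the standalone Keras 3 package, the engine is chosen via an environment variable before the import (a sketch, assuming Keras 3 and the chosen backend are installed):
import os
os.environ["KERAS_BACKEND"] = "jax"  # or "tensorflow" / "torch"; must be set before importing keras
import keras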
In a nutshell: Keras gives you three tools—from the simple "Lego stack" (Sequential) to the "custom workshop" (Subclassing). Your choice depends on how complex your network needs to be, balancing ease of use with flexibility and control.
------------------------------------------------------------------------------------------------------------------
1. Sample Code:
# Keras Demo with CIFAR-10 Dataset (Color images, 10 classes)
# Perfect step-up from MNIST/Fashion-MNIST
import tensorflow as tf
from tensorflow import keras
import matplotlib.pyplot as plt
import numpy as np
print(f"TensorFlow: {tf.__version__}")
print(f"Keras: {keras.__version__}\n")
# -----------------------------
# 1. Load CIFAR-10 dataset
# -----------------------------
(x_train, y_train), (x_test, y_test) = keras.datasets.cifar10.load_data()
# Class names
class_names = ['airplane', 'automobile', 'bird', 'cat', 'deer',
               'dog', 'frog', 'horse', 'ship', 'truck']
# -----------------------------
# 2. Preprocess data
# -----------------------------
x_train = x_train.astype("float32") / 255.0
x_test = x_test.astype("float32") / 255.0
print(f"Training samples: {x_train.shape[0]} → {x_train.shape}")
print(f"Test samples: {x_test.shape[0]}\n")
# -----------------------------
# 3. Build a stronger CNN for color images
# -----------------------------
model = keras.Sequential([
    # Block 1
    keras.layers.Conv2D(32, (3,3), padding='same', activation='relu', input_shape=(32,32,3)),
    keras.layers.BatchNormalization(),
    keras.layers.Conv2D(32, (3,3), activation='relu'),
    keras.layers.MaxPooling2D((2,2)),
    keras.layers.Dropout(0.25),
    # Block 2
    keras.layers.Conv2D(64, (3,3), padding='same', activation='relu'),
    keras.layers.BatchNormalization(),
    keras.layers.Conv2D(64, (3,3), activation='relu'),
    keras.layers.MaxPooling2D((2,2)),
    keras.layers.Dropout(0.25),
    # Block 3
    keras.layers.Conv2D(128, (3,3), padding='same', activation='relu'),
    keras.layers.BatchNormalization(),
    keras.layers.Conv2D(128, (3,3), activation='relu'),
    keras.layers.GlobalAveragePooling2D(),  # Replaces Flatten + Dense
    keras.layers.Dropout(0.5),
    keras.layers.Dense(10, activation='softmax')
])
model.summary()
# -----------------------------
# 4. Compile
# -----------------------------
model.compile(
    optimizer=keras.optimizers.Adam(learning_rate=0.001),
    loss='sparse_categorical_crossentropy',
    metrics=['accuracy']
)
# -----------------------------
# 5. Callbacks
# -----------------------------
callbacks = [
    keras.callbacks.EarlyStopping(patience=8, restore_best_weights=True),
    keras.callbacks.ReduceLROnPlateau(factor=0.5, patience=4, min_lr=1e-7)
]
# -----------------------------
# 6. Train
# -----------------------------
print("\nTraining on CIFAR-10...\n")
history = model.fit(
    x_train, y_train,
    epochs=100,  # EarlyStopping will stop earlier
    batch_size=128,
    validation_data=(x_test, y_test),  # demo shortcut; a held-out validation split is better practice
    callbacks=callbacks,
    verbose=1
)
# -----------------------------
# 7. Final evaluation
# -----------------------------
test_loss, test_acc = model.evaluate(x_test, y_test, verbose=0)
print(f"\nFinal Test Accuracy: {test_acc:.4f} ({test_acc*100:.2f}%)")
# -----------------------------
# 8. Visualize predictions
# -----------------------------
predictions = model.predict(x_test)
pred_labels = np.argmax(predictions, axis=1)
plt.figure(figsize=(15, 8))
for i in range(12):
    plt.subplot(3, 4, i+1)
    plt.imshow(x_test[i])
    color = 'green' if pred_labels[i] == y_test[i][0] else 'red'  # CIFAR-10 labels have shape (n, 1)
    plt.title(f"Pred: {class_names[pred_labels[i]]}\nTrue: {class_names[y_test[i][0]]}",
              color=color, fontsize=10)
    plt.axis('off')
plt.suptitle("CIFAR-10 Sample Predictions", fontsize=16)
plt.tight_layout()
plt.show()
# -----------------------------
# 9. Training curves
# -----------------------------
plt.figure(figsize=(12,4))
plt.subplot(1,2,1)
plt.plot(history.history['accuracy'], label='Train Acc')
plt.plot(history.history['val_accuracy'], label='Val Acc')
plt.title('Accuracy')
plt.legend()
plt.subplot(1,2,2)
plt.plot(history.history['loss'], label='Train Loss')
plt.plot(history.history['val_loss'], label='Val Loss')
plt.title('Loss')
plt.legend()
plt.tight_layout()
plt.show()
# -----------------------------
# 10. Save model
# -----------------------------
model.save('cifar10_keras_demo.keras')
print("\nModel saved as 'cifar10_keras_demo.keras'")
---------------------------------------------------------------------------------------------------------------
2. Sample Code:
# Keras Model Demo: Fashion MNIST Classification
# Using the high-level Keras API (TensorFlow 2.x+)
import tensorflow as tf
from tensorflow import keras
import matplotlib.pyplot as plt
import numpy as np
print(f"TensorFlow version: {tf.__version__}")
print(f"Keras version: {keras.__version__}")
# 1. Load the Fashion MNIST dataset (10 classes of clothing)
fashion_mnist = keras.datasets.fashion_mnist
(x_train, y_train), (x_test, y_test) = fashion_mnist.load_data()
# Class names for visualization
class_names = ['T-shirt/top', 'Trouser', 'Pullover', 'Dress', 'Coat',
               'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle boot']
# 2. Preprocess the data
x_train = x_train / 255.0
x_test = x_test / 255.0
# Add a channel dimension: Conv2D expects inputs shaped (height, width, channels)
x_train = x_train.reshape(-1, 28, 28, 1)
x_test = x_test.reshape(-1, 28, 28, 1)
print(f"Training samples: {x_train.shape[0]}")
print(f"Test samples: {x_test.shape[0]}")
# 3. Build a Convolutional Neural Network using Keras Sequential API
model = keras.Sequential([
    # Feature extraction layers
    keras.layers.Conv2D(32, (3,3), activation='relu', input_shape=(28, 28, 1)),
    keras.layers.MaxPooling2D((2,2)),
    keras.layers.Conv2D(64, (3,3), activation='relu'),
    keras.layers.MaxPooling2D((2,2)),
    keras.layers.Conv2D(64, (3,3), activation='relu'),
    # Classification layers
    keras.layers.Flatten(),
    keras.layers.Dense(128, activation='relu'),
    keras.layers.Dropout(0.5),  # Prevent overfitting
    keras.layers.Dense(10, activation='softmax')  # 10 classes
])
# Alternative: Functional API style (more flexible)
# inputs = keras.Input(shape=(28, 28, 1))
# x = keras.layers.Conv2D(32, 3, activation='relu')(inputs)
# ... build layers ...
# outputs = keras.layers.Dense(10, activation='softmax')(x)
# model = keras.Model(inputs=inputs, outputs=outputs)
# 4. Compile the model
model.compile(
    optimizer='adam',
    loss='sparse_categorical_crossentropy',
    metrics=['accuracy']
)
# Show model architecture
model.summary()
# 5. Train the model with callbacks
callbacks = [
    keras.callbacks.EarlyStopping(patience=3, restore_best_weights=True),
    keras.callbacks.ReduceLROnPlateau(factor=0.5, patience=2)
]
print("\nTraining the model...")
history = model.fit(
    x_train, y_train,
    epochs=30,
    batch_size=64,
    validation_split=0.2,  # Use 20% of training data for validation
    callbacks=callbacks,
    verbose=1
)
# 6. Evaluate on test set
test_loss, test_accuracy = model.evaluate(x_test, y_test, verbose=0)
print(f"\nTest Accuracy: {test_accuracy:.4f} ({test_accuracy*100:.2f}%)")
# 7. Make predictions
predictions = model.predict(x_test)
predicted_labels = np.argmax(predictions, axis=1)
# 8. Visualize some predictions
plt.figure(figsize=(15, 6))
for i in range(10):
    plt.subplot(2, 5, i+1)
    plt.imshow(x_test[i].reshape(28, 28), cmap='gray')
    plt.title(f"Pred: {class_names[predicted_labels[i]]}\nTrue: {class_names[y_test[i]]}",
              color='green' if predicted_labels[i] == y_test[i] else 'red')
    plt.axis('off')
plt.suptitle("Sample Predictions (Green = Correct, Red = Wrong)")
plt.tight_layout()
plt.show()
# 9. Plot training history
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(12, 4))
ax1.plot(history.history['accuracy'], label='Training Accuracy')
ax1.plot(history.history['val_accuracy'], label='Validation Accuracy')
ax1.set_title('Model Accuracy')
ax1.set_xlabel('Epoch')
ax1.set_ylabel('Accuracy')
ax1.legend()
ax2.plot(history.history['loss'], label='Training Loss')
ax2.plot(history.history['val_loss'], label='Validation Loss')
ax2.set_title('Model Loss')
ax2.set_xlabel('Epoch')
ax2.set_ylabel('Loss')
ax2.legend()
plt.tight_layout()
plt.show()
# 10. Save the entire model (architecture + weights + optimizer state)
model.save('fashion_mnist_keras_model.keras')
print("\nModel saved as 'fashion_mnist_keras_model.keras'")
# Load and test the saved model
loaded_model = keras.models.load_model('fashion_mnist_keras_model.keras')
loaded_prediction = loaded_model.predict(x_test[:1])
print(f"Loaded model prediction for first image: {class_names[np.argmax(loaded_prediction)]}")
---------------------------------------------------------------
3. Sample Code:
import tensorflow as tf
from tensorflow import keras
import numpy as np
import matplotlib.pyplot as plt
# Load the Fashion MNIST dataset
fashion_mnist = keras.datasets.fashion_mnist
(train_images, train_labels), (test_images, test_labels) = fashion_mnist.load_data()
# Class names for the 10 fashion categories
class_names = ['T-shirt/top', 'Trouser', 'Pullover', 'Dress', 'Coat',
               'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle boot']
print("Dataset shape:")
print(f"Training images: {train_images.shape}")
print(f"Training labels: {train_labels.shape}")
print(f"Test images: {test_images.shape}")
print(f"Test labels: {test_labels.shape}")
# Preprocess the data (normalize pixel values to 0-1)
train_images = train_images / 255.0
test_images = test_images / 255.0
# Build the neural network model
model = keras.Sequential([
    keras.layers.Flatten(input_shape=(28, 28)),   # Input layer: flatten 28x28 images
    keras.layers.Dense(128, activation='relu'),   # Hidden layer: 128 neurons with ReLU
    keras.layers.Dense(10, activation='softmax')  # Output layer: 10 classes with softmax
])
# Compile the model
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
# Display model architecture
print("\nModel architecture:")
model.summary()
# Train the model
print("\nTraining the model...")
history = model.fit(train_images, train_labels,
                    epochs=5,
                    validation_data=(test_images, test_labels))
# Evaluate the model
test_loss, test_acc = model.evaluate(test_images, test_labels, verbose=0)
print(f"\nTest accuracy: {test_acc:.4f}")
# Make a prediction on a single test image
predictions = model.predict(test_images)
# Display the first test image and its prediction
def plot_prediction(i):
    plt.figure(figsize=(6,3))
    # Plot the image
    plt.subplot(1,2,1)
    plt.imshow(test_images[i], cmap=plt.cm.binary)
    plt.title(f"True: {class_names[test_labels[i]]}")
    plt.axis('off')
    # Plot the prediction probabilities
    plt.subplot(1,2,2)
    prediction = predictions[i]
    plt.barh(range(10), prediction)
    plt.yticks(range(10), class_names)
    plt.title(f"Predicted: {class_names[np.argmax(prediction)]}")
    plt.tight_layout()
    plt.show()
# Show prediction for the first test image
plot_prediction(0)
# Plot training history
def plot_training_history(history):
    plt.figure(figsize=(12,4))
    plt.subplot(1,2,1)
    plt.plot(history.history['accuracy'], label='Training Accuracy')
    plt.plot(history.history['val_accuracy'], label='Validation Accuracy')
    plt.title('Model Accuracy')
    plt.ylabel('Accuracy')
    plt.xlabel('Epoch')
    plt.legend()
    plt.subplot(1,2,2)
    plt.plot(history.history['loss'], label='Training Loss')
    plt.plot(history.history['val_loss'], label='Validation Loss')
    plt.title('Model Loss')
    plt.ylabel('Loss')
    plt.xlabel('Epoch')
    plt.legend()
    plt.tight_layout()
    plt.show()
plot_training_history(history)