Keras
Understanding Keras Models: Your Neural Network Blueprint
Think of a Keras model as the blueprint for a neural network's architecture. Keras is designed to be simple and lets you build models quickly, while a powerful engine such as TensorFlow does the heavy lifting in the background.
The Three Ways to Build a Keras Model
1. Sequential API: The Straightforward Stack
What it is: The easiest way to build a model, like stacking Lego blocks in a single, straight line.
Best for: Standard networks where data flows directly from one layer to the next with no fancy detours.
Limitation: It's too rigid for complex designs that need multiple inputs, outputs, or layers that share connections.
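A minimal sketch of the Sequential style (the 20-feature input and layer sizes are placeholder choices, not tied to any dataset):
from tensorflow import keras
# Layers run in order; each one feeds the next.
model = keras.Sequential([
    keras.layers.Dense(64, activation='relu', input_shape=(20,)),  # 20 input features (placeholder)
    keras.layers.Dense(10, activation='softmax')                   # 10 output classes (placeholder)
])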
2. Functional API: The Flexible Designer
What it is: A more powerful approach that lets you create complex, branch-like architectures.
Best for: Advanced designs with multiple inputs/outputs, skip connections (like shortcuts in the network), or layers that are shared.
Example: Building sophisticated models like ResNet, whose skip connections can't be expressed as a single straight stack.
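A minimal sketch of a skip connection in the Functional style (the 32-unit sizes are illustrative; this shows the pattern, not a full ResNet):
from tensorflow import keras
inputs = keras.Input(shape=(32,))
x = keras.layers.Dense(32, activation='relu')(inputs)
x = keras.layers.Add()([x, inputs])  # skip connection: add the block's input back to its output
outputs = keras.layers.Dense(10, activation='softmax')(x)
model = keras.Model(inputs=inputs, outputs=outputs)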
3. Model Subclassing: The Custom Workshop
What it is: The ultimate level of control. You build the model from scratch by writing your own code.
Best for: Researchers and experts who need to create entirely new, custom network designs that don't fit standard patterns.
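A minimal sketch of the subclassing pattern (MyModel and its layer sizes are our own illustrative choices):
from tensorflow import keras
class MyModel(keras.Model):
    def __init__(self):
        super().__init__()
        self.dense1 = keras.layers.Dense(64, activation='relu')
        self.dense2 = keras.layers.Dense(10, activation='softmax')
    def call(self, inputs, training=False):
        # Arbitrary Python can run here: loops, conditionals, custom math.
        x = self.dense1(inputs)
        return self.dense2(x)
model = MyModel()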
The Trade-Offs: What to Keep in Mind
1. Sequential Models Are Simple, But Limited
The easy-to-use Sequential API can't handle complex designs. If your network needs branches or multiple outputs, you must use the Functional API or Model Subclassing.
2. Debugging Can Be Tricky
While Keras is user-friendly, error messages can sometimes be confusing because they come from the complex engine (TensorFlow) running underneath.
3. You're One Step Removed from the Engine
Keras is a high-level tool, which means it hides the complex, low-level details. This is great for simplicity, but it means you have less fine-grained control compared to coding directly in TensorFlow.
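When you do need that control, a Keras model still plugs into TensorFlow's low-level API. Here is a sketch of one hand-written training step (train_step is our own name; this assumes tf.keras and a model built elsewhere):
import tensorflow as tf
optimizer = tf.keras.optimizers.Adam()
loss_fn = tf.keras.losses.SparseCategoricalCrossentropy()
def train_step(model, x_batch, y_batch):
    with tf.GradientTape() as tape:              # record operations for autodiff
        preds = model(x_batch, training=True)
        loss = loss_fn(y_batch, preds)
    grads = tape.gradient(loss, model.trainable_variables)
    optimizer.apply_gradients(zip(grads, model.trainable_variables))
    return loss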
4. A Note on Backend Changes
Historically, Keras ran on several interchangeable engines (Theano, TensorFlow, CNTK), and switching backends could break older projects. It was later tied to TensorFlow as tf.keras, and Keras 3 is multi-backend again (TensorFlow, JAX, PyTorch), so it's still worth knowing which engine your code assumes.
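With the standalone Keras 3 package, the engine is chosen via an environment variable before the import (a sketch, assuming Keras 3 and the chosen backend are installed):
import os
os.environ["KERAS_BACKEND"] = "jax"  # or "tensorflow" / "torch"; must be set before importing keras
import keras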
In a nutshell: Keras gives you three tools—from the simple "Lego stack" (Sequential) to the "custom workshop" (Subclassing). Your choice depends on how complex your network needs to be, balancing ease of use with flexibility and control.
------------------------------------------------------------------------------------------------------------------
1. Sample Code:
# Keras Demo with CIFAR-10 Dataset (Color images, 10 classes)
# Perfect step-up from MNIST/Fashion-MNIST
import tensorflow as tf
from tensorflow import keras
import matplotlib.pyplot as plt
import numpy as np
print(f"TensorFlow: {tf.__version__}")
print(f"Keras: {keras.__version__}\n")
# -----------------------------
# 1. Load CIFAR-10 dataset
# -----------------------------
(x_train, y_train), (x_test, y_test) = keras.datasets.cifar10.load_data()
# Class names
class_names = ['airplane', 'automobile', 'bird', 'cat', 'deer',
               'dog', 'frog', 'horse', 'ship', 'truck']
# -----------------------------
# 2. Preprocess data
# -----------------------------
x_train = x_train.astype("float32") / 255.0
x_test = x_test.astype("float32") / 255.0
print(f"Training samples: {x_train.shape[0]} → {x_train.shape}")
print(f"Test samples: {x_test.shape[0]}\n")
# -----------------------------
# 3. Build a stronger CNN for color images
# -----------------------------
model = keras.Sequential([
    # Block 1
    keras.layers.Conv2D(32, (3,3), padding='same', activation='relu', input_shape=(32,32,3)),
    keras.layers.BatchNormalization(),
    keras.layers.Conv2D(32, (3,3), activation='relu'),
    keras.layers.MaxPooling2D((2,2)),
    keras.layers.Dropout(0.25),
    # Block 2
    keras.layers.Conv2D(64, (3,3), padding='same', activation='relu'),
    keras.layers.BatchNormalization(),
    keras.layers.Conv2D(64, (3,3), activation='relu'),
    keras.layers.MaxPooling2D((2,2)),
    keras.layers.Dropout(0.25),
    # Block 3
    keras.layers.Conv2D(128, (3,3), padding='same', activation='relu'),
    keras.layers.BatchNormalization(),
    keras.layers.Conv2D(128, (3,3), activation='relu'),
    keras.layers.GlobalAveragePooling2D(),  # Replaces Flatten + Dense
    keras.layers.Dropout(0.5),
    keras.layers.Dense(10, activation='softmax')
])
model.summary()
# -----------------------------
# 4. Compile
# -----------------------------
model.compile(
    optimizer=keras.optimizers.Adam(learning_rate=0.001),
    loss='sparse_categorical_crossentropy',
    metrics=['accuracy']
)
# -----------------------------
# 5. Callbacks
# -----------------------------
callbacks = [
    keras.callbacks.EarlyStopping(patience=8, restore_best_weights=True),
    keras.callbacks.ReduceLROnPlateau(factor=0.5, patience=4, min_lr=1e-7)
]
# -----------------------------
# 6. Train
# -----------------------------
print("\nTraining on CIFAR-10...\n")
history = model.fit(
    x_train, y_train,
    epochs=100,  # EarlyStopping will stop earlier
    batch_size=128,
    validation_data=(x_test, y_test),  # demo shortcut; a held-out validation split is better practice
    callbacks=callbacks,
    verbose=1
)
# -----------------------------
# 7. Final evaluation
# -----------------------------
test_loss, test_acc = model.evaluate(x_test, y_test, verbose=0)
print(f"\nFinal Test Accuracy: {test_acc:.4f} ({test_acc*100:.2f}%)")
# -----------------------------
# 8. Visualize predictions
# -----------------------------
predictions = model.predict(x_test)
pred_labels = np.argmax(predictions, axis=1)
plt.figure(figsize=(15, 8))
for i in range(12):
    plt.subplot(3, 4, i+1)
    plt.imshow(x_test[i])
    color = 'green' if pred_labels[i] == y_test[i][0] else 'red'  # CIFAR-10 labels have shape (n, 1)
    plt.title(f"Pred: {class_names[pred_labels[i]]}\nTrue: {class_names[y_test[i][0]]}",
              color=color, fontsize=10)
    plt.axis('off')
plt.suptitle("CIFAR-10 Sample Predictions", fontsize=16)
plt.tight_layout()
plt.show()
# -----------------------------
# 9. Training curves
# -----------------------------
plt.figure(figsize=(12,4))
plt.subplot(1,2,1)
plt.plot(history.history['accuracy'], label='Train Acc')
plt.plot(history.history['val_accuracy'], label='Val Acc')
plt.title('Accuracy')
plt.legend()
plt.subplot(1,2,2)
plt.plot(history.history['loss'], label='Train Loss')
plt.plot(history.history['val_loss'], label='Val Loss')
plt.title('Loss')
plt.legend()
plt.tight_layout()
plt.show()
# -----------------------------
# 10. Save model
# -----------------------------
model.save('cifar10_keras_demo.keras')
print("\nModel saved as 'cifar10_keras_demo.keras'")
---------------------------------------------------------------------------------------------------------------
2. Sample Code:
# Keras Model Demo: Fashion MNIST Classification
# Using the high-level Keras API (TensorFlow 2.x+)
import tensorflow as tf
from tensorflow import keras
import matplotlib.pyplot as plt
import numpy as np
print(f"TensorFlow version: {tf.__version__}")
print(f"Keras version: {keras.__version__}")
# 1. Load the Fashion MNIST dataset (10 classes of clothing)
fashion_mnist = keras.datasets.fashion_mnist
(x_train, y_train), (x_test, y_test) = fashion_mnist.load_data()
# Class names for visualization
class_names = ['T-shirt/top', 'Trouser', 'Pullover', 'Dress', 'Coat',
               'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle boot']
# 2. Preprocess the data
x_train = x_train / 255.0
x_test = x_test / 255.0
# Add a channel dimension: Conv2D expects inputs shaped (height, width, channels)
x_train = x_train.reshape(-1, 28, 28, 1)
x_test = x_test.reshape(-1, 28, 28, 1)
print(f"Training samples: {x_train.shape[0]}")
print(f"Test samples: {x_test.shape[0]}")
# 3. Build a Convolutional Neural Network using Keras Sequential API
model = keras.Sequential([
    # Feature extraction layers
    keras.layers.Conv2D(32, (3,3), activation='relu', input_shape=(28, 28, 1)),
    keras.layers.MaxPooling2D((2,2)),
    keras.layers.Conv2D(64, (3,3), activation='relu'),
    keras.layers.MaxPooling2D((2,2)),
    keras.layers.Conv2D(64, (3,3), activation='relu'),
    # Classification layers
    keras.layers.Flatten(),
    keras.layers.Dense(128, activation='relu'),
    keras.layers.Dropout(0.5),  # Prevent overfitting
    keras.layers.Dense(10, activation='softmax')  # 10 classes
])
# Alternative: Functional API style (more flexible)
# inputs = keras.Input(shape=(28, 28, 1))
# x = keras.layers.Conv2D(32, 3, activation='relu')(inputs)
# ... build layers ...
# outputs = keras.layers.Dense(10, activation='softmax')(x)
# model = keras.Model(inputs=inputs, outputs=outputs)
# 4. Compile the model
model.compile(
    optimizer='adam',
    loss='sparse_categorical_crossentropy',
    metrics=['accuracy']
)
# Show model architecture
model.summary()
# 5. Train the model with callbacks
callbacks = [
    keras.callbacks.EarlyStopping(patience=3, restore_best_weights=True),
    keras.callbacks.ReduceLROnPlateau(factor=0.5, patience=2)
]
print("\nTraining the model...")
history = model.fit(
    x_train, y_train,
    epochs=30,
    batch_size=64,
    validation_split=0.2,  # Use 20% of training data for validation
    callbacks=callbacks,
    verbose=1
)
# 6. Evaluate on test set
test_loss, test_accuracy = model.evaluate(x_test, y_test, verbose=0)
print(f"\nTest Accuracy: {test_accuracy:.4f} ({test_accuracy*100:.2f}%)")
# 7. Make predictions
predictions = model.predict(x_test)
predicted_labels = np.argmax(predictions, axis=1)
# 8. Visualize some predictions
plt.figure(figsize=(15, 6))
for i in range(10):
    plt.subplot(2, 5, i+1)
    plt.imshow(x_test[i].reshape(28, 28), cmap='gray')
    plt.title(f"Pred: {class_names[predicted_labels[i]]}\nTrue: {class_names[y_test[i]]}",
              color='green' if predicted_labels[i] == y_test[i] else 'red')
    plt.axis('off')
plt.suptitle("Sample Predictions (Green = Correct, Red = Wrong)")
plt.tight_layout()
plt.show()
# 9. Plot training history
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(12, 4))
ax1.plot(history.history['accuracy'], label='Training Accuracy')
ax1.plot(history.history['val_accuracy'], label='Validation Accuracy')
ax1.set_title('Model Accuracy')
ax1.set_xlabel('Epoch')
ax1.set_ylabel('Accuracy')
ax1.legend()
ax2.plot(history.history['loss'], label='Training Loss')
ax2.plot(history.history['val_loss'], label='Validation Loss')
ax2.set_title('Model Loss')
ax2.set_xlabel('Epoch')
ax2.set_ylabel('Loss')
ax2.legend()
plt.tight_layout()
plt.show()
# 10. Save the entire model (architecture + weights + optimizer state)
model.save('fashion_mnist_keras_model.keras')
print("\nModel saved as 'fashion_mnist_keras_model.keras'")
# Load and test the saved model
loaded_model = keras.models.load_model('fashion_mnist_keras_model.keras')
loaded_prediction = loaded_model.predict(x_test[:1])
print(f"Loaded model prediction for first image: {class_names[np.argmax(loaded_prediction)]}")
---------------------------------------------------------------
3. Sample Code:
import tensorflow as tf
from tensorflow import keras
import numpy as np
import matplotlib.pyplot as plt
# Load the Fashion MNIST dataset
fashion_mnist = keras.datasets.fashion_mnist
(train_images, train_labels), (test_images, test_labels) = fashion_mnist.load_data()
# Class names for the 10 fashion categories
class_names = ['T-shirt/top', 'Trouser', 'Pullover', 'Dress', 'Coat',
               'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle boot']
print("Dataset shape:")
print(f"Training images: {train_images.shape}")
print(f"Training labels: {train_labels.shape}")
print(f"Test images: {test_images.shape}")
print(f"Test labels: {test_labels.shape}")
# Preprocess the data (normalize pixel values to 0-1)
train_images = train_images / 255.0
test_images = test_images / 255.0
# Build the neural network model
model = keras.Sequential([
    keras.layers.Flatten(input_shape=(28, 28)),   # Input layer: flatten 28x28 images
    keras.layers.Dense(128, activation='relu'),   # Hidden layer: 128 neurons with ReLU
    keras.layers.Dense(10, activation='softmax')  # Output layer: 10 classes with softmax
])
# Compile the model
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
# Display model architecture
print("\nModel architecture:")
model.summary()
# Train the model
print("\nTraining the model...")
history = model.fit(train_images, train_labels,
                    epochs=5,
                    validation_data=(test_images, test_labels))
# Evaluate the model
test_loss, test_acc = model.evaluate(test_images, test_labels, verbose=0)
print(f"\nTest accuracy: {test_acc:.4f}")
# Make a prediction on a single test image
predictions = model.predict(test_images)
# Display the first test image and its prediction
def plot_prediction(i):
    plt.figure(figsize=(6,3))
    # Plot the image
    plt.subplot(1,2,1)
    plt.imshow(test_images[i], cmap=plt.cm.binary)
    plt.title(f"True: {class_names[test_labels[i]]}")
    plt.axis('off')
    # Plot the prediction probabilities
    plt.subplot(1,2,2)
    prediction = predictions[i]
    plt.barh(range(10), prediction)
    plt.yticks(range(10), class_names)
    plt.title(f"Predicted: {class_names[np.argmax(prediction)]}")
    plt.tight_layout()
    plt.show()
# Show prediction for the first test image
plot_prediction(0)
# Plot training history
def plot_training_history(history):
    plt.figure(figsize=(12,4))
    plt.subplot(1,2,1)
    plt.plot(history.history['accuracy'], label='Training Accuracy')
    plt.plot(history.history['val_accuracy'], label='Validation Accuracy')
    plt.title('Model Accuracy')
    plt.ylabel('Accuracy')
    plt.xlabel('Epoch')
    plt.legend()
    plt.subplot(1,2,2)
    plt.plot(history.history['loss'], label='Training Loss')
    plt.plot(history.history['val_loss'], label='Validation Loss')
    plt.title('Model Loss')
    plt.ylabel('Loss')
    plt.xlabel('Epoch')
    plt.legend()
    plt.tight_layout()
    plt.show()
plot_training_history(history)