Keras: Compiling, Training, and Evaluating Models
Once you've defined your neural network architecture using Keras's Sequential or Functional API, the next steps involve configuring the learning process (compile), training the model on data (fit), and assessing its performance (evaluate or predict).
1. Compiling the Model (model.compile())
The compile() method configures the model for training. You specify the optimizer, loss function, and metrics to monitor.
Parameters:
- optimizer: The algorithm used to update the model's weights during training.
  - String identifier: 'adam', 'sgd', 'rmsprop', 'adagrad', etc.
  - Optimizer object: tf.keras.optimizers.Adam(learning_rate=0.001). This gives more control over hyperparameters.
- loss: The function used to quantify the difference between the model's predictions and the true labels. The model will try to minimize this value during training.
  - String identifier: 'binary_crossentropy', 'sparse_categorical_crossentropy', 'mean_squared_error', etc.
  - Loss function object: tf.keras.losses.BinaryCrossentropy().
- metrics: A list of metrics to be evaluated by the model during training and testing. These are usually for human readability and don't directly influence the training process (unless used for early stopping, etc.).
  - String identifier: 'accuracy', 'mse', 'mae', 'AUC', etc.
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
import numpy as np
# Simple binary classification model
model = keras.Sequential([
layers.InputLayer(input_shape=(10,)),
layers.Dense(32, activation='relu'),
layers.Dense(1, activation='sigmoid') # Output for binary classification
])
# Compile the model
model.compile(optimizer='adam',
loss='binary_crossentropy', # Appropriate for binary classification with sigmoid output
metrics=['accuracy'])
model.summary()
print("\nModel compiled with Adam optimizer, binary crossentropy loss, and accuracy metric.")
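The same configuration can be written with objects instead of string identifiers, which exposes hyperparameters such as the learning rate. The following is a minimal sketch; the name model_obj and the choice of metrics are illustrative assumptions, not part of the example above.

```python
from tensorflow import keras
from tensorflow.keras import layers

# Same architecture as above; model_obj is a hypothetical name for this sketch
model_obj = keras.Sequential([
    keras.Input(shape=(10,)),
    layers.Dense(32, activation='relu'),
    layers.Dense(1, activation='sigmoid'),
])

# Pass objects rather than strings to control hyperparameters explicitly
model_obj.compile(
    optimizer=keras.optimizers.Adam(learning_rate=0.001),
    loss=keras.losses.BinaryCrossentropy(),
    metrics=[
        keras.metrics.BinaryAccuracy(name='accuracy'),
        keras.metrics.AUC(name='auc'),
    ],
)

print("Optimizer in use:", type(model_obj.optimizer).__name__)
```

Strings are convenient for defaults; objects are the better fit once you need to tune a learning rate or name a metric.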
Multiple Outputs (Functional API)
If your model has multiple outputs, you can specify different loss functions and weights for each output.
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
import numpy as np
# Multi-output model (from keras_models_api.md)
input_tensor = keras.Input(shape=(64,), name='input_layer')
x = layers.Dense(32, activation='relu')(input_tensor)
output_binary = layers.Dense(1, activation='sigmoid', name='binary_output')(x)
output_multi = layers.Dense(5, activation='softmax', name='multi_class_output')(x)
model_multi = keras.Model(inputs=input_tensor, outputs=[output_binary, output_multi])
model_multi.compile(
optimizer='adam',
loss={'binary_output': 'binary_crossentropy', 'multi_class_output': 'sparse_categorical_crossentropy'},
loss_weights={'binary_output': 1.0, 'multi_class_output': 0.5}, # Can assign different weights to losses
metrics={'binary_output': 'accuracy', 'multi_class_output': 'accuracy'}
)
print("\nMulti-output Model compiled with custom losses and weights.")
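When training such a model, the targets can be passed as a dictionary keyed by output layer name, mirroring the loss dictionary. The sketch below rebuilds the same two-headed model on random data; the variable names (model_multi_demo, x_multi, y_binary, y_multi) and the data sizes are assumptions made for illustration.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

# Rebuild the two-output model so this sketch is self-contained
inputs = keras.Input(shape=(64,), name='input_layer')
hidden = layers.Dense(32, activation='relu')(inputs)
out_bin = layers.Dense(1, activation='sigmoid', name='binary_output')(hidden)
out_multi = layers.Dense(5, activation='softmax', name='multi_class_output')(hidden)
model_multi_demo = keras.Model(inputs=inputs, outputs=[out_bin, out_multi])

model_multi_demo.compile(
    optimizer='adam',
    loss={'binary_output': 'binary_crossentropy',
          'multi_class_output': 'sparse_categorical_crossentropy'},
    loss_weights={'binary_output': 1.0, 'multi_class_output': 0.5},
    metrics={'binary_output': 'accuracy', 'multi_class_output': 'accuracy'},
)

# Dummy data: one target array per output
x_multi = np.random.rand(256, 64).astype('float32')
y_binary = np.random.randint(0, 2, size=(256, 1)).astype('float32')
y_multi = np.random.randint(0, 5, size=(256,)).astype('int32')  # integer class ids

# Targets are keyed by output layer name, matching the loss dictionary
history_multi = model_multi_demo.fit(
    x_multi,
    {'binary_output': y_binary, 'multi_class_output': y_multi},
    batch_size=32,
    epochs=2,
    verbose=0,
)
print(sorted(history_multi.history.keys()))
```

The reported total loss is the weighted sum of the per-output losses, using the values in loss_weights. The exact names of the per-output history keys can differ between Keras versions.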
2. Training the Model (model.fit())
The fit() method trains the model for a fixed number of epochs (iterations over the dataset).
Parameters:
- x: Input data (NumPy array, tf.Tensor, tf.data.Dataset, etc.).
- y: Target data (labels).
- batch_size: Number of samples per gradient update.
- epochs: Number of times to iterate over the entire x and y data.
- verbose: How much output to show during training (0=silent, 1=progress bar, 2=one line per epoch).
- validation_data: Tuple (x_val, y_val) to evaluate the loss and metrics on at the end of each epoch.
- validation_split: Fraction of the training data to be used as validation data.
- callbacks: List of keras.callbacks.Callback instances (e.g., EarlyStopping, ModelCheckpoint).
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
import numpy as np
# Prepare dummy data for binary classification
num_samples = 1000
x_data = np.random.rand(num_samples, 10).astype('float32')
y_data = np.random.randint(0, 2, size=(num_samples, 1)).astype('float32')
# Define and compile the model
model_fit = keras.Sequential([
layers.InputLayer(input_shape=(10,)),
layers.Dense(32, activation='relu'),
layers.Dense(1, activation='sigmoid')
])
model_fit.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
# Train the model
print("\nTraining Model...")
history = model_fit.fit(
x_data, y_data,
batch_size=32,
epochs=10,
validation_split=0.2, # Use 20% of training data for validation
verbose=1
)
# Access training history
print("\nTraining History Keys:", history.history.keys())
# Example: print last epoch's accuracy
print(f"Last epoch training accuracy: {history.history['accuracy'][-1]:.4f}")
print(f"Last epoch validation accuracy: {history.history['val_accuracy'][-1]:.4f}")
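As an alternative to validation_split, fit() also accepts an explicit validation set and tf.data.Dataset inputs. The sketch below assumes a hand-made 800/200 split of random data; the names model_ds, train_ds, and val_ds are illustrative.

```python
import numpy as np
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

# Dummy data with an explicit train/validation split
x_train = np.random.rand(800, 10).astype('float32')
y_train = np.random.randint(0, 2, size=(800, 1)).astype('float32')
x_val = np.random.rand(200, 10).astype('float32')
y_val = np.random.randint(0, 2, size=(200, 1)).astype('float32')

# Batching happens on the dataset, so batch_size is not passed to fit()
train_ds = (tf.data.Dataset.from_tensor_slices((x_train, y_train))
            .shuffle(buffer_size=800)
            .batch(32))
val_ds = tf.data.Dataset.from_tensor_slices((x_val, y_val)).batch(32)

model_ds = keras.Sequential([
    keras.Input(shape=(10,)),
    layers.Dense(32, activation='relu'),
    layers.Dense(1, activation='sigmoid'),
])
model_ds.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])

# validation_split is not supported for Dataset inputs; use validation_data instead
history_ds = model_ds.fit(train_ds, validation_data=val_ds, epochs=2, verbose=0)
print(f"Final validation loss: {history_ds.history['val_loss'][-1]:.4f}")
```

An explicit validation set is useful when the split must be controlled, for example to keep related samples on the same side of the split.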
3. Evaluating the Model (model.evaluate())
The evaluate() method calculates the loss and metrics for the model in "test mode". It will not train the model.
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
import numpy as np
# Use the trained model_fit from above
# Prepare dummy test data
x_test_data = np.random.rand(200, 10).astype('float32')
y_test_data = np.random.randint(0, 2, size=(200, 1)).astype('float32')
# Evaluate the model on test data
print("\nEvaluating Model on Test Data...")
loss, accuracy = model_fit.evaluate(x_test_data, y_test_data, verbose=0)
print(f"Test Loss: {loss:.4f}")
print(f"Test Accuracy: {accuracy:.4f}")
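Unpacking the return value by position becomes fragile once more metrics are added. Passing return_dict=True makes evaluate() return a dictionary keyed by metric name instead. The following is a small self-contained sketch on random data; model_eval and the data shapes are assumptions.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

# Dummy train and test data
x_tr = np.random.rand(400, 10).astype('float32')
y_tr = np.random.randint(0, 2, size=(400, 1)).astype('float32')
x_te = np.random.rand(200, 10).astype('float32')
y_te = np.random.randint(0, 2, size=(200, 1)).astype('float32')

model_eval = keras.Sequential([
    keras.Input(shape=(10,)),
    layers.Dense(32, activation='relu'),
    layers.Dense(1, activation='sigmoid'),
])
model_eval.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
model_eval.fit(x_tr, y_tr, batch_size=32, epochs=1, verbose=0)

# return_dict=True yields {'loss': ..., 'accuracy': ...} instead of a list
results = model_eval.evaluate(x_te, y_te, batch_size=64, verbose=0, return_dict=True)
for name, value in results.items():
    print(f"{name}: {value:.4f}")
```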
4. Making Predictions (model.predict())
The predict() method generates predictions (e.g., class probabilities for classification, continuous values for regression) for new input samples.
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
import numpy as np
# Use the trained model_fit from above
# Generate some new samples for prediction
new_samples = np.random.rand(5, 10).astype('float32')
# Get raw predictions (probabilities for sigmoid output)
raw_predictions = model_fit.predict(new_samples)
print("\nRaw predictions (probabilities):\n", raw_predictions)
# Convert probabilities to class labels (for binary classification)
predicted_classes = (raw_predictions > 0.5).astype(int)
print("\nPredicted classes:\n", predicted_classes)
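For a multi-class model with a softmax output, the usual counterpart to thresholding is taking the argmax over the class axis. The sketch below uses an untrained three-class model (an assumption made purely to show the shapes involved), so the predicted labels themselves carry no meaning.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

# Hypothetical 3-class classifier; untrained, used only to illustrate shapes
model_mc = keras.Sequential([
    keras.Input(shape=(10,)),
    layers.Dense(16, activation='relu'),
    layers.Dense(3, activation='softmax'),
])

samples = np.random.rand(5, 10).astype('float32')

# Each row holds one probability per class and sums to 1
probabilities = model_mc.predict(samples, verbose=0)
print("Probabilities shape:", probabilities.shape)

# Pick the most likely class for each sample
predicted_labels = np.argmax(probabilities, axis=1)
print("Predicted labels:", predicted_labels)
```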
5. Callbacks
Callbacks are objects that can perform actions at various stages of training (e.g., at the start/end of an epoch, before/after a batch).
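To make these hooks concrete, here is a minimal sketch of a custom callback that subclasses keras.callbacks.Callback and overrides on_train_begin and on_epoch_end. The class name EpochLogger and its epoch_losses attribute are hypothetical, chosen only for this illustration.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

class EpochLogger(keras.callbacks.Callback):
    """Hypothetical callback that records the training loss after each epoch."""

    def __init__(self):
        super().__init__()
        self.epoch_losses = []

    def on_train_begin(self, logs=None):
        # Reset state at the start of every fit() call
        self.epoch_losses = []

    def on_epoch_end(self, epoch, logs=None):
        # logs holds the metric values computed for this epoch
        logs = logs or {}
        self.epoch_losses.append(logs.get('loss'))
        print(f"Epoch {epoch + 1} ended with loss {logs.get('loss'):.4f}")

x_cb = np.random.rand(200, 10).astype('float32')
y_cb = np.random.randint(0, 2, size=(200, 1)).astype('float32')

model_cb = keras.Sequential([
    keras.Input(shape=(10,)),
    layers.Dense(8, activation='relu'),
    layers.Dense(1, activation='sigmoid'),
])
model_cb.compile(optimizer='adam', loss='binary_crossentropy')

logger = EpochLogger()
model_cb.fit(x_cb, y_cb, epochs=3, batch_size=32, verbose=0, callbacks=[logger])
```

The built-in callbacks are implemented through the same hooks, so in most cases they can be used directly without writing a subclass.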
Common Callbacks:
- ModelCheckpoint: Saves the model (or only its weights) during training, by default at the end of every epoch. With save_best_only=True, it saves only when the monitored metric improves.
- EarlyStopping: Stops training when a monitored metric has stopped improving.
- ReduceLROnPlateau: Reduces the learning rate when a metric has stopped improving.
- TensorBoard: For logging metrics and visualizing training graphs.
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
from tensorflow.keras.callbacks import ModelCheckpoint, EarlyStopping
import numpy as np
# Prepare dummy data
num_samples = 1000
x_data = np.random.rand(num_samples, 10).astype('float32')
y_data = np.random.randint(0, 2, size=(num_samples, 1)).astype('float32')
# Model definition (same as before)
model_callbacks = keras.Sequential([
layers.InputLayer(input_shape=(10,)),
layers.Dense(32, activation='relu'),
layers.Dense(1, activation='sigmoid')
])
model_callbacks.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
# Define callbacks
checkpoint = ModelCheckpoint(filepath='best_model.h5', monitor='val_loss', save_best_only=True, verbose=1)
early_stopping = EarlyStopping(monitor='val_loss', patience=3, restore_best_weights=True, verbose=1)
# Train with callbacks
print("\nTraining Model with Callbacks...")
history_callbacks = model_callbacks.fit(
x_data, y_data,
batch_size=32,
epochs=50, # Set a high number of epochs, EarlyStopping will stop it
validation_split=0.2,
callbacks=[checkpoint, early_stopping],
verbose=0 # Silence the progress bar so the callbacks' own messages (verbose=1) stand out
)
print("\nTraining finished, best model weights restored if EarlyStopping triggered.")
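ReduceLROnPlateau, listed among the common callbacks, is passed to fit() in the same way. The sketch below is illustrative only: the factor, patience, and min_lr values are arbitrary assumptions, and on random data the validation loss may or may not plateau within five epochs.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers
from tensorflow.keras.callbacks import ReduceLROnPlateau

x_lr = np.random.rand(500, 10).astype('float32')
y_lr = np.random.randint(0, 2, size=(500, 1)).astype('float32')

model_lr = keras.Sequential([
    keras.Input(shape=(10,)),
    layers.Dense(32, activation='relu'),
    layers.Dense(1, activation='sigmoid'),
])
model_lr.compile(optimizer=keras.optimizers.Adam(learning_rate=0.01),
                 loss='binary_crossentropy', metrics=['accuracy'])

# Halve the learning rate after 2 epochs without val_loss improvement,
# but never go below min_lr
reduce_lr = ReduceLROnPlateau(monitor='val_loss', factor=0.5, patience=2,
                              min_lr=1e-5, verbose=1)

history_lr = model_lr.fit(
    x_lr, y_lr,
    batch_size=32,
    epochs=5,
    validation_split=0.2,
    callbacks=[reduce_lr],
    verbose=0,
)
print("Epochs run:", len(history_lr.history['loss']))
```

It is often combined with EarlyStopping, using a shorter patience for the learning rate reduction so the rate drops before training is stopped.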
Further Topics:
- Custom Loss Functions and Metrics
- Custom Callbacks
- Learning Rate Schedules
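- Data Generators and fit_generator() (for large datasets). Note that fit_generator() is deprecated in TensorFlow 2.x; fit() accepts generators and tf.data.Dataset objects directly.
- Distributed Training (tf.distribute)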
Mastering the compilation, training, and evaluation process is key to effectively developing and deploying deep learning models with Keras.