Transfer Learning
What is Transfer Learning?
Transfer learning is a powerful machine learning technique where a model trained on one task is reused or adapted for a related task. Instead of training a model from scratch, you leverage knowledge gained from a pre-trained model, saving time and computational resources while often achieving better performance.
Think of it like learning to drive: if you already know how to ride a bicycle, you can transfer skills like balance and coordination to driving a car, rather than starting completely from scratch.
How Transfer Learning Works
Transfer learning typically involves three main approaches:
- Feature Extraction: Use the pre-trained model as a fixed feature extractor and train only a new classifier on top (see the sketch after this list)
- Fine-tuning: Unfreeze some layers of the pre-trained model and retrain them, together with the new classifier, on your data
- Full Transfer: Replace the final layers and retrain the entire network, using the pre-trained weights only as initialization (less common)
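To make the first approach concrete, here is a minimal feature-extraction sketch. It assumes TensorFlow/Keras and an ImageNet-trained MobileNetV2 backbone; the input size and `num_classes` are illustrative placeholders, not values from this lesson.

```python
import tensorflow as tf

num_classes = 10  # hypothetical number of target classes

# Load a pre-trained backbone without its original ImageNet classifier head.
base = tf.keras.applications.MobileNetV2(
    input_shape=(224, 224, 3), include_top=False, weights="imagenet"
)
base.trainable = False  # freeze: the base acts as a fixed feature extractor

# Stack a small new classifier on top of the frozen features.
model = tf.keras.Sequential([
    base,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(num_classes, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```

Only the new Dense layer's weights are updated during training, which is why this approach works well with small datasets.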
When to Use Transfer Learning
Transfer learning is particularly effective when:
- Limited Data: You have a small dataset; pre-trained models provide rich features
- Similar Domains: Your task is related to the pre-trained model's original task
- Computational Constraints: Training from scratch is too expensive or time-consuming
- Quick Prototyping: You need to test ideas rapidly before investing in full training
Popular Pre-trained Models
Many pre-trained models are available for transfer learning:
- Image Classification: VGG16/19, ResNet50, InceptionV3, MobileNet (trained on ImageNet)
- Natural Language Processing: BERT, GPT, Word2Vec, GloVe (trained on large text corpora)
- Object Detection: YOLO, Faster R-CNN (trained on COCO dataset)
- Medical Imaging: Domain-specific models pre-trained on large collections of medical images, such as X-rays or retinal scans
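Most of these models are one function call away. A short sketch, assuming TensorFlow/Keras for the vision backbones; the Hugging Face `transformers` library, shown in comments, is a common route to the NLP checkpoints:

```python
import tensorflow as tf

# ImageNet-trained vision backbones ship with keras.applications.
resnet = tf.keras.applications.ResNet50(weights="imagenet")
vgg = tf.keras.applications.VGG16(weights="imagenet", include_top=False)

# NLP checkpoints such as BERT are commonly loaded via Hugging Face:
# from transformers import AutoModel
# bert = AutoModel.from_pretrained("bert-base-uncased")
```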
💡 Why Transfer Learning Works
Deep neural networks learn hierarchical features: early layers detect basic patterns (edges, textures), while deeper layers recognize complex concepts (objects, faces). These early layers are often generalizable across tasks, making them perfect candidates for transfer learning!
Practical Applications
Transfer learning has revolutionized many fields:
- Computer Vision: Medical diagnosis, autonomous vehicles, quality control in manufacturing
- NLP: Sentiment analysis, chatbots, language translation, content moderation
- Audio Processing: Speech recognition, music classification, sound event detection
- Industry-Specific: Retinal disease detection, agricultural crop monitoring, satellite image analysis
Common Challenges
While powerful, transfer learning comes with some caveats:
- Domain Mismatch: A pre-trained model may transfer poorly if your data differs substantially from what it was trained on
- Overfitting: With small datasets, fine-tuning can lead to overfitting; use regularization
- Choosing What to Freeze: Deciding which layers to freeze vs. fine-tune usually requires experimentation
- Learning Rate: Use lower learning rates when fine-tuning pre-trained layers, or you risk overwriting the learned features (see the sketch after this list)
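The fine-tuning sketch below continues the feature-extraction example above, reusing its `base` and `model`; the number of unfrozen layers and the learning rate are illustrative choices, not prescribed values.

```python
# Continues the feature-extraction sketch: unfreeze only the top of the base.
base.trainable = True
for layer in base.layers[:-20]:   # keep all but the last ~20 layers frozen
    layer.trainable = False       # early layers hold generic features

# Recompile with a much lower learning rate so the pre-trained
# weights are nudged rather than overwritten.
model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=1e-5),
    loss="sparse_categorical_crossentropy",
    metrics=["accuracy"],
)
```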
💡 Learning Tip
Start with feature extraction (frozen base model), then try fine-tuning if needed. Use data augmentation to effectively enlarge your dataset, and monitor validation performance to catch overfitting early (a sketch of both follows below).
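A quick sketch of both ideas, assuming TensorFlow 2.x with Keras preprocessing layers; `train_ds` and `val_ds` are hypothetical tf.data datasets:

```python
import tensorflow as tf

# Random transforms applied on the fly to effectively enlarge the dataset.
augment = tf.keras.Sequential([
    tf.keras.layers.RandomFlip("horizontal"),
    tf.keras.layers.RandomRotation(0.1),
    tf.keras.layers.RandomZoom(0.1),
])

# Stop training once validation loss stops improving, to curb overfitting.
early_stop = tf.keras.callbacks.EarlyStopping(
    monitor="val_loss", patience=3, restore_best_weights=True
)

# model.fit(train_ds.map(lambda x, y: (augment(x, training=True), y)),
#           validation_data=val_ds, epochs=20, callbacks=[early_stop])
```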
Exercise: Implement Transfer Learning
In the exercise on the right, you'll implement transfer learning by loading a pre-trained model, freezing its layers, adding custom classification layers, and fine-tuning for your specific task.
This hands-on exercise will help you understand how to leverage pre-trained models effectively and adapt them for new problems.