Introduction
In this chapter, we will dive deeper into advanced neural
network architectures, building upon the fundamental concepts covered in
earlier chapters. We'll explore models that go beyond basic feedforward
networks, specifically focusing on Convolutional Neural Networks (CNNs),
Recurrent Neural Networks (RNNs), and Generative Adversarial Networks
(GANs). Each of these architectures serves a distinct purpose in the machine
learning ecosystem, enabling models to handle complex tasks such as image
classification, sequence prediction, and image generation.
We'll start with Convolutional Neural Networks, commonly
used for image processing tasks, and then move on to Recurrent Neural Networks
for sequential data. Finally, we'll explore GANs for generating new data,
offering insights into deep generative models.
By the end of this chapter, you'll be equipped with the
knowledge to implement these advanced neural network models using PyTorch and
understand their real-world applications.
5.1 Convolutional Neural Networks (CNNs)
What are CNNs?
Convolutional Neural Networks (CNNs) are a specialized type
of neural network designed for processing grid-like data, such as images. CNNs
are particularly powerful in tasks like image recognition, classification, and
object detection. They operate by convolving filters (kernels) over the input
image to extract local features, followed by pooling layers to reduce
dimensionality and a final fully connected layer to make predictions.
CNNs consist of several key layers:
- Convolutional layers, which slide learnable filters (kernels) over the input to produce feature maps of local patterns.
- Activation functions (typically ReLU), which introduce non-linearity after each convolution.
- Pooling layers, which downsample the feature maps to reduce spatial dimensionality and computation.
- Fully connected layers, which combine the extracted features to produce the final predictions.
Building a Simple CNN in PyTorch
We will build a simple CNN to classify images from the CIFAR-10
dataset, which contains 60,000 32x32 color images across 10 classes.
Code Sample:
import torch
import torch.nn as nn
import torch.optim as optim
from torchvision import datasets, transforms

# Define data transformation (normalize images)
transform = transforms.Compose([transforms.ToTensor(),
                                transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))])

# Download and load CIFAR-10 dataset
train_dataset = datasets.CIFAR10('.', train=True, download=True, transform=transform)
test_dataset = datasets.CIFAR10('.', train=False, download=True, transform=transform)

train_loader = torch.utils.data.DataLoader(train_dataset, batch_size=64, shuffle=True)
test_loader = torch.utils.data.DataLoader(test_dataset, batch_size=64, shuffle=False)

# Define the CNN architecture
class CNN(nn.Module):
    def __init__(self):
        super(CNN, self).__init__()
        # Convolutional layers
        self.conv1 = nn.Conv2d(3, 32, kernel_size=3, padding=1)
        self.conv2 = nn.Conv2d(32, 64, kernel_size=3, padding=1)
        # Pooling layer
        self.pool = nn.MaxPool2d(2, 2)
        # Fully connected layers
        self.fc1 = nn.Linear(64 * 8 * 8, 512)
        self.fc2 = nn.Linear(512, 10)   # 10 output units for 10 classes (CIFAR-10)

    def forward(self, x):
        x = self.pool(torch.relu(self.conv1(x)))   # Apply first conv and pool
        x = self.pool(torch.relu(self.conv2(x)))   # Apply second conv and pool
        x = x.view(-1, 64 * 8 * 8)                 # Flatten the tensor
        x = torch.relu(self.fc1(x))                # Apply fully connected layer with ReLU
        x = self.fc2(x)                            # Output layer
        return x

# Instantiate the model
model = CNN()

# Define loss function and optimizer
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=0.001)
Explanation:
- The transform converts each image to a tensor and normalizes every RGB channel to the range [-1, 1].
- conv1 maps the 3 input channels to 32 feature maps and conv2 maps those to 64; both use 3x3 kernels with padding=1, so the convolutions preserve the spatial size.
- Each convolution is followed by ReLU and 2x2 max pooling, which halves the spatial resolution (32 → 16 → 8).
- The flattened 64 * 8 * 8 feature vector passes through two fully connected layers, ending in 10 outputs, one per CIFAR-10 class.
- CrossEntropyLoss handles the multi-class classification objective, and Adam updates the weights with a learning rate of 0.001.
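To see why fc1 expects 64 * 8 * 8 input features, it helps to trace the tensor shapes through the layers. The following standalone sketch (not part of the chapter's example) pushes a dummy CIFAR-10-sized batch through the same layer settings and prints the shapes:

import torch
import torch.nn as nn

# Dummy batch: one 3-channel, 32x32 image
x = torch.randn(1, 3, 32, 32)

conv1 = nn.Conv2d(3, 32, kernel_size=3, padding=1)
conv2 = nn.Conv2d(32, 64, kernel_size=3, padding=1)
pool = nn.MaxPool2d(2, 2)

x = pool(torch.relu(conv1(x)))       # padding=1 keeps 32x32; pooling halves it
print(x.shape)                       # torch.Size([1, 32, 16, 16])
x = pool(torch.relu(conv2(x)))
print(x.shape)                       # torch.Size([1, 64, 8, 8])
print(x.view(-1, 64 * 8 * 8).shape)  # torch.Size([1, 4096]) -- matches fc1's input size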
Training the CNN
To train the CNN, we follow the same process as with simpler models: feed data
through the network, compute the loss, and update the weights using
backpropagation.
Code Sample:
# Training the CNN
num_epochs = 5

for epoch in range(num_epochs):
    model.train()
    running_loss = 0.0
    for data, target in train_loader:
        optimizer.zero_grad()
        output = model(data)
        loss = criterion(output, target)
        loss.backward()
        optimizer.step()
        running_loss += loss.item()
    print(f"Epoch {epoch+1}/{num_epochs}, Loss: {running_loss/len(train_loader)}")
Explanation:
- model.train() puts the network into training mode.
- For each batch, the stored gradients are cleared, a forward pass produces predictions, the loss is computed against the true labels, loss.backward() backpropagates the gradients, and optimizer.step() updates the weights.
- The average loss per batch is printed at the end of each epoch so you can monitor training progress.
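Note that test_loader is prepared above but never used. A minimal evaluation sketch, assuming the model and loaders defined earlier in this section, might look like this:

# Evaluate accuracy on the CIFAR-10 test set
model.eval()                                   # switch to evaluation mode
correct, total = 0, 0
with torch.no_grad():                          # gradients are not needed for evaluation
    for data, target in test_loader:
        output = model(data)
        predicted = output.argmax(dim=1)       # class with the highest score
        correct += (predicted == target).sum().item()
        total += target.size(0)
print(f"Test accuracy: {100 * correct / total:.2f}%")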
5.2 Recurrent Neural Networks (RNNs)
What are RNNs?
Recurrent Neural Networks (RNNs) are designed for processing
sequential data. Unlike CNNs, which handle spatial data like images, RNNs are
used for tasks that involve sequences, such as time series prediction, natural
language processing (NLP), and speech recognition.
RNNs have an internal state (or memory) that gets updated as
new data arrives. This state helps the network maintain context over time,
which is important for sequence-based tasks. However, traditional RNNs suffer
from issues like vanishing gradients when trying to model long-term
dependencies.
To address this, we use Long Short-Term Memory (LSTM)
networks or Gated Recurrent Units (GRUs), which help retain long-term
dependencies by controlling the flow of information using gates.
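At the code level, switching to a gated architecture is nearly a drop-in change. The standalone sketch below (not part of the sentiment model built next) compares nn.RNN with nn.LSTM and nn.GRU on a dummy sequence batch; note that the LSTM also returns a cell state:

import torch
import torch.nn as nn

x = torch.randn(20, 8, 100)           # (seq_len=20, batch=8, input_size=100)

rnn = nn.RNN(input_size=100, hidden_size=128)
lstm = nn.LSTM(input_size=100, hidden_size=128)
gru = nn.GRU(input_size=100, hidden_size=128)

out_rnn, h_rnn = rnn(x)               # hidden state only
out_lstm, (h_lstm, c_lstm) = lstm(x)  # hidden state and cell state
out_gru, h_gru = gru(x)               # hidden state only

print(out_rnn.shape, out_lstm.shape, out_gru.shape)  # all torch.Size([20, 8, 128])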
Building an RNN in PyTorch
We will now build a simple RNN for sentiment analysis on
text data. For simplicity, we'll use the IMDB dataset, a binary
classification task (positive or negative sentiment).
Code Sample:
from torchtext.datasets import IMDB
from torch.utils.data import DataLoader
import torch.nn as nn

# Load the IMDB dataset
train_data, test_data = IMDB(split='train'), IMDB(split='test')

# Define the RNN architecture
class RNN(nn.Module):
    def __init__(self, vocab_size, embedding_dim, hidden_dim, output_dim):
        super(RNN, self).__init__()
        self.embedding = nn.Embedding(vocab_size, embedding_dim)
        self.rnn = nn.RNN(embedding_dim, hidden_dim)
        self.fc = nn.Linear(hidden_dim, output_dim)

    def forward(self, x):
        # x: (seq_len, batch) tensor of token indices
        embedded = self.embedding(x)           # (seq_len, batch, embedding_dim)
        rnn_out, hidden = self.rnn(embedded)   # rnn_out: (seq_len, batch, hidden_dim)
        out = self.fc(rnn_out[-1])             # classify from the last time step
        return out

# Define the model
model = RNN(vocab_size=5000, embedding_dim=100, hidden_dim=128, output_dim=1)
Explanation:
- The embedding layer maps each token index to a dense 100-dimensional vector.
- nn.RNN processes the embedded sequence step by step, updating its hidden state; the output at the last time step summarizes the whole review.
- The final linear layer maps that last hidden state to a single logit; during training it is interpreted as the probability of positive sentiment via the sigmoid inside BCEWithLogitsLoss.
Training the RNN
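The chapter does not show how the raw IMDB reviews become the integer batches that train_loader must yield to the model above. The following is a minimal sketch of one way to do it, assuming a torchtext version that provides get_tokenizer and build_vocab_from_iterator (the exact API and label format vary across torchtext releases); the helpers yield_tokens and collate_batch are introduced here purely for illustration:

import torch
from torch.utils.data import DataLoader
from torchtext.datasets import IMDB
from torchtext.data.utils import get_tokenizer
from torchtext.vocab import build_vocab_from_iterator

tokenizer = get_tokenizer('basic_english')

def yield_tokens(data_iter):
    for label, text in data_iter:
        yield tokenizer(text)

# Build a vocabulary capped at 5000 tokens to match vocab_size above
vocab = build_vocab_from_iterator(yield_tokens(IMDB(split='train')),
                                  max_tokens=5000, specials=['<unk>', '<pad>'])
vocab.set_default_index(vocab['<unk>'])

def collate_batch(batch):
    texts, labels = [], []
    for label, text in batch:
        texts.append(torch.tensor(vocab(tokenizer(text)), dtype=torch.long))
        # Label conventions vary across torchtext versions (2/'pos' = positive)
        labels.append(1.0 if label in (2, 'pos') else 0.0)
    # Pad to equal length; default gives (seq_len, batch), as the model expects
    padded = torch.nn.utils.rnn.pad_sequence(texts, padding_value=vocab['<pad>'])
    return padded, torch.tensor(labels)

train_loader = DataLoader(list(IMDB(split='train')), batch_size=64,
                          shuffle=True, collate_fn=collate_batch)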
# Define loss function and optimizer
criterion = nn.BCEWithLogitsLoss()   # binary cross-entropy on raw logits
optimizer = optim.Adam(model.parameters(), lr=0.001)

# Training loop for RNN
num_epochs = 5

for epoch in range(num_epochs):
    model.train()
    running_loss = 0.0
    for data, target in train_loader:
        optimizer.zero_grad()
        output = model(data)
        # Squeeze the (batch, 1) logits and compare against float targets
        loss = criterion(output.squeeze(1), target.float())
        loss.backward()
        optimizer.step()
        running_loss += loss.item()
    print(f"Epoch {epoch+1}/{num_epochs}, Loss: {running_loss/len(train_loader)}")
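Because the model outputs raw logits, predictions at evaluation time are obtained by applying a sigmoid and thresholding at 0.5. A minimal sketch (not in the original code; here batch stands for a padded (seq_len, batch_size) tensor of token indices):

model.eval()
with torch.no_grad():
    logits = model(batch)                      # raw scores, shape (batch_size, 1)
    probs = torch.sigmoid(logits.squeeze(1))   # convert logits to probabilities
    preds = (probs > 0.5).long()               # 1 = positive, 0 = negative sentiment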
5.3 Generative Adversarial Networks (GANs)
What are GANs?
Generative Adversarial Networks (GANs) are a class of neural
networks designed to generate new data samples. They consist of two networks:
- A generator, which takes random noise as input and produces synthetic data samples.
- A discriminator, which receives real and generated samples and predicts whether each one is real or fake.
The two networks are trained together in a competitive
manner, with the generator trying to create realistic data and the
discriminator trying to correctly classify real vs. fake data.
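For reference, this competition is commonly written as the minimax objective (the chapter itself works only at the code level, so this formula is supplementary):

\min_G \max_D \; \mathbb{E}_{x \sim p_{\text{data}}}[\log D(x)] + \mathbb{E}_{z \sim p_z}[\log(1 - D(G(z)))]

In practice, and in the training code below, the generator is trained with the non-saturating variant: rather than minimizing \log(1 - D(G(z))), it is given "real" labels for its fake samples, i.e. it maximizes \log D(G(z)).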
Building a Simple GAN
In this example, we will create a simple GAN that generates
images similar to the MNIST dataset.
Code Sample:
class Generator(nn.Module):
    def __init__(self):
        super(Generator, self).__init__()
        self.fc1 = nn.Linear(100, 128)
        self.fc2 = nn.Linear(128, 784)   # 28x28 image flattened

    def forward(self, x):
        x = torch.relu(self.fc1(x))
        x = torch.tanh(self.fc2(x))      # Output values between -1 and 1
        return x.view(-1, 1, 28, 28)

class Discriminator(nn.Module):
    def __init__(self):
        super(Discriminator, self).__init__()
        self.fc1 = nn.Linear(784, 128)
        self.fc2 = nn.Linear(128, 1)     # Output is probability (real or fake)

    def forward(self, x):
        x = torch.relu(self.fc1(x))
        x = torch.sigmoid(self.fc2(x))   # Output between 0 and 1
        return x
Explanation:
- The generator maps a 100-dimensional noise vector to 784 values, reshaped into a 1x28x28 image; the tanh activation keeps pixel values in [-1, 1].
- The discriminator flattens a 28x28 image into 784 values and outputs a single sigmoid probability that the image is real.
Training the GAN
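The loop below expects a generator, a discriminator, and a train_loader over MNIST, none of which are defined in the chapter's code. A minimal setup sketch (the normalization to [-1, 1] is chosen to match the generator's tanh output):

import torch
import torch.nn as nn
import torch.optim as optim
from torchvision import datasets, transforms

# Normalize MNIST images to [-1, 1] so real data matches the generator's output range
transform = transforms.Compose([transforms.ToTensor(),
                                transforms.Normalize((0.5,), (0.5,))])
mnist_train = datasets.MNIST('.', train=True, download=True, transform=transform)
train_loader = torch.utils.data.DataLoader(mnist_train, batch_size=64, shuffle=True)

# Instantiate the two networks
generator = Generator()
discriminator = Discriminator()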
# Define the loss function and optimizers
criterion = nn.BCELoss()   # binary cross-entropy loss for real vs. fake classification
optimizer_g = optim.Adam(generator.parameters(), lr=0.0002)
optimizer_d = optim.Adam(discriminator.parameters(), lr=0.0002)

# Training loop for GAN
for epoch in range(50):
    for data, _ in train_loader:
        # Train Discriminator
        optimizer_d.zero_grad()
        real_data = data.view(-1, 784)                   # flatten real images
        output_real = discriminator(real_data)
        loss_real = criterion(output_real, torch.ones_like(output_real))    # real label is 1

        noise = torch.randn(data.size(0), 100)           # one noise vector per image in the batch
        fake_data = generator(noise)
        # Detach so generator gradients are not computed during the discriminator update
        output_fake = discriminator(fake_data.detach().view(-1, 784))
        loss_fake = criterion(output_fake, torch.zeros_like(output_fake))   # fake label is 0

        loss_d = loss_real + loss_fake
        loss_d.backward()
        optimizer_d.step()

        # Train Generator
        optimizer_g.zero_grad()
        output_fake = discriminator(fake_data.view(-1, 784))
        loss_g = criterion(output_fake, torch.ones_like(output_fake))       # generator wants fakes classified as real
        loss_g.backward()
        optimizer_g.step()

    print(f"Epoch {epoch+1}, Loss D: {loss_d.item()}, Loss G: {loss_g.item()}")
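After training, new digit-like images can be produced by feeding fresh noise through the generator. A short illustrative sketch (the matplotlib visualization is an addition, not part of the chapter's code):

import matplotlib.pyplot as plt

generator.eval()
with torch.no_grad():
    noise = torch.randn(16, 100)        # 16 random latent vectors
    samples = generator(noise)          # shape (16, 1, 28, 28), values in [-1, 1]
    samples = (samples + 1) / 2         # rescale to [0, 1] for display

fig, axes = plt.subplots(4, 4, figsize=(4, 4))
for img, ax in zip(samples, axes.flatten()):
    ax.imshow(img.squeeze(0), cmap='gray')
    ax.axis('off')
plt.show()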
5.4 Summary of Advanced Neural Network Models
Model | Best For | Key Features
CNN | Image classification, object detection | Convolutional layers for extracting features from images
RNN | Sequence modeling, NLP, time series | Recurrent layers for handling sequential data
GAN | Data generation, image synthesis | Generator vs. discriminator for creating new data
Conclusion
In this chapter, we explored advanced neural network
architectures like Convolutional Neural Networks (CNNs), Recurrent
Neural Networks (RNNs), and Generative Adversarial Networks (GANs).
Each of these models has its unique strengths and is suited for specific tasks.
CNNs are ideal for image processing, RNNs are used for sequence data, and GANs
are powerful tools for generating new data.
Frequently Asked Questions
Q: What is PyTorch?
A: PyTorch is an open-source deep learning framework developed by Facebook’s AI Research lab (FAIR), known for its dynamic computation graph and flexibility.
Q: How does PyTorch differ from TensorFlow?
A: PyTorch uses dynamic computation graphs, making it more flexible and easier to debug, while TensorFlow traditionally used static computation graphs, although TensorFlow 2.0 now supports dynamic graphs.
Q: How do you install PyTorch?
A: You can install PyTorch via pip with pip install torch torchvision torchaudio or through conda with conda install pytorch torchvision torchaudio cpuonly -c pytorch.
Q: What is a tensor?
A: A tensor is a multi-dimensional array similar to a NumPy array but optimized for GPU acceleration, making it the core data structure in PyTorch.
Q: What is autograd?
A: autograd is PyTorch’s automatic differentiation system that computes gradients for backpropagation during training.
Q: How do you define a neural network in PyTorch?
A: You define a neural network by subclassing torch.nn.Module and implementing the architecture in the __init__ and forward methods.
Q: What is transfer learning in PyTorch?
A: Transfer learning involves using a model pre-trained on a large dataset and fine-tuning it for a specific task. In PyTorch, you can use pre-trained models from torchvision.models and modify the final layer.
Q: How do you evaluate a model?
A: You evaluate a model by switching it to model.eval() mode and running it on test data to compute metrics such as accuracy or loss.
Q: How do you save and load a model?
A: Models are saved with torch.save(model.state_dict(), 'model.pth') and loaded with model.load_state_dict(torch.load('model.pth')).
Q: Can PyTorch models be deployed to production?
A: Yes. PyTorch models can be deployed using tools like TorchServe for server-side deployment, or converted to ONNX or TensorFlow Lite for mobile and embedded applications.