Deep Learning Fundamentals: Neural Networks Explained

Deep learning underpins much of the recent progress in AI, from image recognition to language models. Let's break down the fundamental concepts that power these systems.

What is Deep Learning?

Deep learning uses artificial neural networks with multiple layers to learn hierarchical representations of data: early layers pick up simple patterns, and later layers combine them into more abstract features.

Neural Network Basics

The Neuron

python

import numpy as np

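def neuron(inputs, weights, bias):
    """Compute the output of a single neuron with a ReLU activation."""
    # Weighted sum of the inputs plus the bias term
    z = np.dot(inputs, weights) + bias
    # ReLU activation; np.maximum also works when z is an array
    return np.maximum(0, z)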

Multi-Layer Network

python

import torch.nn as nn

class SimpleNN(nn.Module):
    def __init__(self, input_size, hidden_size, output_size):
        super().__init__()
        self.layer1 = nn.Linear(input_size, hidden_size)
        self.relu = nn.ReLU()
        self.layer2 = nn.Linear(hidden_size, output_size)
    
    def forward(self, x):
        x = self.layer1(x)
        x = self.relu(x)
        x = self.layer2(x)
        return x
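
To make the forward pass less of a black box, here is a minimal NumPy sketch of the same computation: a linear layer, a ReLU, and a second linear layer. The helper names (`linear`, `relu`, `two_layer_forward`), the layer sizes, and the random weights are illustrative assumptions, not part of the PyTorch model above.

```python
import numpy as np

def linear(x, weight, bias):
    # Same layout as nn.Linear: weight has shape (out_features, in_features)
    return x @ weight.T + bias

def relu(x):
    return np.maximum(0.0, x)

def two_layer_forward(x, w1, b1, w2, b2):
    hidden = relu(linear(x, w1, b1))   # layer1 followed by ReLU
    return linear(hidden, w2, b2)      # layer2 returns raw scores

rng = np.random.default_rng(0)
input_size, hidden_size, output_size = 4, 8, 3   # hypothetical sizes

w1 = rng.normal(size=(hidden_size, input_size))
b1 = np.zeros(hidden_size)
w2 = rng.normal(size=(output_size, hidden_size))
b2 = np.zeros(output_size)

batch = rng.normal(size=(5, input_size))         # 5 samples, 4 features each
scores = two_layer_forward(batch, w1, b1, w2, b2)
print(scores.shape)  # (5, 3)
```

Each row of `scores` holds one sample's outputs, so the batch dimension passes through unchanged while the feature dimension goes from 4 to 8 to 3.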

Key Concepts

1. Activation Functions

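ReLU - Most common default for hidden layers, helps mitigate vanishing gradients
Sigmoid - Outputs values between 0 and 1, useful for binary probabilities
Softmax - Turns scores into a probability distribution for multi-class classification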
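
The three functions above are short enough to write out directly. The following is a sketch in NumPy under the usual textbook definitions; the input vector `z` is an arbitrary example.

```python
import numpy as np

def relu(z):
    # Keeps positive values, zeroes out negatives
    return np.maximum(0.0, z)

def sigmoid(z):
    # Squashes any real number into the range (0, 1)
    return 1.0 / (1.0 + np.exp(-z))

def softmax(z):
    # Subtracting the max does not change the result but avoids overflow
    shifted = z - np.max(z, axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / np.sum(exp, axis=-1, keepdims=True)

z = np.array([-2.0, 0.0, 3.0])
print(relu(z))     # [0. 0. 3.]
print(sigmoid(z))  # each value lies strictly between 0 and 1
print(softmax(z))  # values sum to 1, largest weight on the last entry
```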

2. Loss Functions

MSE (mean squared error) - Regression tasks
Cross-Entropy - Classification tasks
Custom losses - Objectives tailored to a specific task
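
As a rough illustration, the two standard losses can be sketched in NumPy as follows. The function names and the `eps` guard against `log(0)` are assumptions made for this example, and the inputs are made-up numbers.

```python
import numpy as np

def mse(predictions, targets):
    # Average of the squared differences
    return np.mean((predictions - targets) ** 2)

def cross_entropy(probabilities, labels, eps=1e-12):
    # probabilities: shape (n_samples, n_classes), each row sums to 1
    # labels: integer class index for each sample
    picked = probabilities[np.arange(len(labels)), labels]
    return -np.mean(np.log(picked + eps))

print(mse(np.array([2.0, 4.0]), np.array([1.0, 2.0])))  # 2.5

probs = np.array([[0.5, 0.5],
                  [0.5, 0.5]])
labels = np.array([0, 1])
print(cross_entropy(probs, labels))  # about 0.693, i.e. log(2)
```

Note that PyTorch's `nn.CrossEntropyLoss` expects raw scores (logits) rather than probabilities, because it applies the log-softmax internally.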

3. Backpropagation

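Backpropagation computes the gradient of the loss with respect to every weight by applying the chain rule backward through the network. Each training step then consists of:

Forward pass - compute predictions
Loss calculation - measure the error against the targets
Backward pass - compute gradients via backpropagation
Weight update - adjust the weights using the gradients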

python

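import torch

# Training loop
# Assumes model, criterion, inputs, targets, and num_epochs are already defined
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)

for epoch in range(num_epochs):
    # Forward pass: compute predictions and the loss
    outputs = model(inputs)
    loss = criterion(outputs, targets)

    # Backward pass: clear old gradients, backpropagate, then update weights
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()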
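
To see what `loss.backward()` and `optimizer.step()` are doing, the same four steps can be written by hand for a model small enough to differentiate on paper. This sketch fits a single weight and bias with plain gradient descent (not Adam); the toy data, learning rate, and step count are assumptions chosen for the example.

```python
import numpy as np

# Hypothetical toy data generated from y = 2x + 1
x = np.array([0.0, 1.0, 2.0, 3.0])
y = 2.0 * x + 1.0

w, b = 0.0, 0.0   # parameters to learn
lr = 0.1          # learning rate

for step in range(500):
    # 1. Forward pass
    pred = w * x + b
    # 2. Loss (mean squared error)
    error = pred - y
    loss = np.mean(error ** 2)
    # 3. Backward pass: gradients of the loss via the chain rule
    grad_w = np.mean(2.0 * error * x)
    grad_b = np.mean(2.0 * error)
    # 4. Update weights
    w -= lr * grad_w
    b -= lr * grad_b

print(w, b)  # approaches 2.0 and 1.0
```

In a deep network the gradient formulas are no longer this simple, which is why frameworks derive them automatically.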

Common Architectures

Convolutional Neural Networks (CNNs)

Well suited to image data:

python

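import torch.nn as nn
import torch.nn.functional as F

class SimpleCNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(3, 32, 3, padding=1)   # 3 input channels -> 32 feature maps
        self.pool = nn.MaxPool2d(2, 2)                # halves height and width
        self.conv2 = nn.Conv2d(32, 64, 3, padding=1)  # 32 -> 64 feature maps
        # 64 * 8 * 8 matches 32x32 inputs after two 2x2 poolings; 10 output classes
        self.fc = nn.Linear(64 * 8 * 8, 10)

    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))
        x = self.pool(F.relu(self.conv2(x)))
        x = x.view(x.size(0), -1)  # flatten everything except the batch dimension
        return self.fc(x)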
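
The `64 * 8 * 8` in the model above only works for one input resolution. The sketch below traces the spatial size through the layers using the standard output-size formula (dilation 1). The 32x32 input is an assumption inferred from that constant, since the post does not state the image size, and the helper names are illustrative.

```python
def conv2d_output_size(size, kernel_size, stride=1, padding=0):
    # Standard formula for one spatial dimension, assuming dilation of 1
    return (size + 2 * padding - kernel_size) // stride + 1

def pool_output_size(size, kernel_size, stride):
    # Max pooling without padding
    return (size - kernel_size) // stride + 1

size = 32                                           # assumed input height/width
size = conv2d_output_size(size, 3, padding=1)       # conv1 keeps 32
size = pool_output_size(size, 2, 2)                 # pool halves to 16
size = conv2d_output_size(size, 3, padding=1)       # conv2 keeps 16
size = pool_output_size(size, 2, 2)                 # pool halves to 8

flattened = 64 * size * size                        # 64 channels from conv2
print(flattened)  # 4096
```

Feeding a different resolution would change `flattened`, so the input size of the final linear layer would need to change with it.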

Transformers

The dominant architecture for NLP, built from these components:

Self-attention mechanism
Positional encoding
Multi-head attention
Feed-forward layers
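
Of these components, self-attention is the one that most benefits from a concrete example. Below is a minimal single-head sketch of scaled dot-product attention in NumPy, with no masking, no positional encoding, and random projection matrices. All names and sizes are assumptions for illustration.

```python
import numpy as np

def softmax(z):
    shifted = z - np.max(z, axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / np.sum(exp, axis=-1, keepdims=True)

def self_attention(x, w_q, w_k, w_v):
    # Project each token into query, key, and value vectors
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    # Similarity of every query with every key, scaled by sqrt(d_k)
    scores = q @ k.T / np.sqrt(k.shape[-1])
    # Each row becomes a set of weights over all tokens
    weights = softmax(scores)
    # Output for each token is a weighted mix of the value vectors
    return weights @ v, weights

rng = np.random.default_rng(0)
seq_len, d_model, d_k = 4, 8, 8          # hypothetical sizes
x = rng.normal(size=(seq_len, d_model))  # 4 token embeddings
w_q, w_k, w_v = (rng.normal(size=(d_model, d_k)) for _ in range(3))

output, weights = self_attention(x, w_q, w_k, w_v)
print(output.shape)   # (4, 8)
print(weights.shape)  # (4, 4)
```

Multi-head attention runs several of these in parallel with separate projections and concatenates the results.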

Training Best Practices

Data Preprocessing

- Normalization
- Augmentation
- Train/val/test splits
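
A common way to combine splitting and normalization is sketched below. The dataset is synthetic, the 70/15/15 ratio is an arbitrary assumption, and the normalization statistics come from the training split only so that no information leaks from validation or test data.

```python
import numpy as np

rng = np.random.default_rng(0)
features = rng.normal(loc=5.0, scale=3.0, size=(100, 4))  # hypothetical dataset

# Shuffle the row indices, then cut them into 70 / 15 / 15
indices = rng.permutation(len(features))
train_idx, val_idx, test_idx = np.split(indices, [70, 85])
train, val, test = features[train_idx], features[val_idx], features[test_idx]

# Compute statistics on the training split only
mean = train.mean(axis=0)
std = train.std(axis=0) + 1e-8   # small constant avoids division by zero

# Apply the same transform to every split
train_norm = (train - mean) / std
val_norm = (val - mean) / std
test_norm = (test - mean) / std

print(train_norm.shape, val_norm.shape, test_norm.shape)  # (70, 4) (15, 4) (15, 4)
```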

Regularization

- Dropout
- L1/L2 regularization
- Batch normalization (mainly stabilizes training, with a mild regularizing effect)
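
Two of these techniques are simple enough to sketch directly. The following shows inverted dropout and an L2 penalty term in NumPy. The function names and parameters are assumptions for illustration rather than any library's API.

```python
import numpy as np

def dropout(activations, drop_prob, rng, training=True):
    # At inference time dropout is a no-op
    if not training or drop_prob == 0.0:
        return activations
    keep_prob = 1.0 - drop_prob
    # Randomly keep each unit with probability keep_prob
    mask = rng.random(activations.shape) < keep_prob
    # Rescale so the expected activation stays the same ("inverted" dropout)
    return activations * mask / keep_prob

def l2_penalty(weights, strength):
    # Added to the loss to discourage large weights
    return strength * np.sum(weights ** 2)

rng = np.random.default_rng(0)
activations = np.ones((1000, 100))
dropped = dropout(activations, drop_prob=0.5, rng=rng)

print(np.unique(dropped))  # [0. 2.]
print(dropped.mean())      # close to 1.0
```

Because surviving units are scaled up during training, nothing needs to be rescaled when dropout is switched off for evaluation.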

Optimization

- Adam optimizer (good default)
- Learning rate scheduling
- Gradient clipping
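
Scheduling and clipping can both be expressed in a few lines. This sketch shows a step-decay schedule and clipping by global gradient norm in NumPy. The function names and default values are assumptions; in PyTorch the corresponding tools are `torch.optim.lr_scheduler.StepLR` and `torch.nn.utils.clip_grad_norm_`.

```python
import numpy as np

def step_decay(initial_lr, epoch, drop_every=10, factor=0.5):
    # Multiply the learning rate by `factor` once every `drop_every` epochs
    return initial_lr * factor ** (epoch // drop_every)

def clip_by_global_norm(grads, max_norm):
    # Norm of all gradients treated as one long vector
    total_norm = np.sqrt(sum(np.sum(g ** 2) for g in grads))
    # Scale down only if the norm exceeds the limit
    scale = min(1.0, max_norm / (total_norm + 1e-12))
    return [g * scale for g in grads], total_norm

grads = [np.array([3.0, 0.0]), np.array([0.0, 4.0])]  # made-up gradients
clipped, norm = clip_by_global_norm(grads, max_norm=1.0)

print(norm)                   # 5.0
print(step_decay(0.001, 0))   # 0.001
print(step_decay(0.001, 25))  # 0.00025
```

Clipping preserves the direction of the gradient and only limits its length, which helps when occasional large gradients destabilize training.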

Monitoring

- Training vs validation loss
- Early stopping
- TensorBoard visualization
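
Early stopping boils down to tracking the best validation loss and giving up after a run of epochs without improvement. A minimal sketch follows. The `EarlyStopping` class, its `patience` and `min_delta` parameters, and the loss values are all hypothetical.

```python
class EarlyStopping:
    def __init__(self, patience=3, min_delta=0.0):
        self.patience = patience      # epochs to wait without improvement
        self.min_delta = min_delta    # minimum change that counts as improvement
        self.best_loss = float("inf")
        self.bad_epochs = 0

    def should_stop(self, val_loss):
        if val_loss < self.best_loss - self.min_delta:
            # New best: remember it and reset the counter
            self.best_loss = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience

val_losses = [0.90, 0.70, 0.60, 0.61, 0.62, 0.63, 0.50]  # made-up values
stopper = EarlyStopping(patience=3)
stopped_at = None

for epoch, val_loss in enumerate(val_losses):
    if stopper.should_stop(val_loss):
        stopped_at = epoch
        break

print(stopped_at)  # 5
```

Training stops at epoch 5, after three epochs in a row without beating 0.60, so the final 0.50 is never reached. A larger `patience` trades extra compute for a lower chance of stopping too soon.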

Common Pitfalls

Overfitting - Model memorizes training data instead of generalizing; counter with regularization and early stopping
Vanishing gradients - Use ReLU, batch norm
Poor initialization - Use Xavier/He initialization
Learning rate - Too high → divergence, too low → slow training
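
For the initialization pitfall, both schemes pick the weight scale from the layer's size. The sketch below uses the commonly cited formulas: Xavier (Glorot) uniform with limit `sqrt(6 / (fan_in + fan_out))` and He normal with standard deviation `sqrt(2 / fan_in)`. The function names and layer sizes are assumptions for illustration.

```python
import numpy as np

def xavier_uniform(fan_in, fan_out, rng):
    # Often paired with tanh or sigmoid activations
    limit = np.sqrt(6.0 / (fan_in + fan_out))
    return rng.uniform(-limit, limit, size=(fan_out, fan_in))

def he_normal(fan_in, fan_out, rng):
    # Often paired with ReLU activations
    std = np.sqrt(2.0 / fan_in)
    return rng.normal(0.0, std, size=(fan_out, fan_in))

rng = np.random.default_rng(0)
w_xavier = xavier_uniform(256, 128, rng)  # hypothetical layer: 256 inputs, 128 outputs
w_he = he_normal(256, 128, rng)

print(w_xavier.shape)  # (128, 256)
print(w_he.std())      # close to sqrt(2 / 256), about 0.088
```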

Conclusion

Deep learning is powerful, but it requires a solid understanding of the fundamentals. Start with simple architectures, understand the math, and gradually tackle more complex problems.
