Artificial Neural Networks (ANNs)

Artificial Neural Networks are the foundation of Deep Learning; the classic fully connected form is the Multi-Layer Perceptron (MLP). While classical machine learning models (like Random Forests) typically rely on humans to engineer features explicitly, Deep Learning models can learn useful features directly from raw data.

1. Core Architecture

Input Layer: Receives the raw data (e.g., pixel values, word embeddings).
Hidden Layers: Intermediate layers that transform the inputs into progressively more abstract representations. "Deep" Learning refers to networks with many hidden layers.
Output Layer: Produces the final prediction (e.g., a single node for binary classification, or 10 nodes for predicting digits 0-9).
Weights & Biases: The learnable parameters attached to the connections between layers. Training the network is the process of adjusting these weights and biases to minimize prediction error.
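
To make the layer structure concrete, the sketch below counts the trainable parameters of a fully connected network. The layer sizes (784 inputs, hidden layers of 128 and 64 nodes) are assumptions chosen for illustration; only the 10-node output matches the digit example above.

```python
def count_parameters(layer_sizes):
    """Count the weights and biases of a fully connected network.

    layer_sizes lists the number of nodes per layer, input first.
    """
    total = 0
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        total += n_in * n_out  # one weight per connection between the two layers
        total += n_out         # one bias per node in the receiving layer
    return total


# Hypothetical digit classifier: 784 pixel inputs, two hidden layers, 10 outputs.
layer_sizes = [784, 128, 64, 10]
print(count_parameters(layer_sizes))  # 109386
```

Almost all of these parameters sit between the input layer and the first hidden layer, which is typical when the raw input is wide.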

2. The Forward Pass & Activation Functions

Data moves forward through the network via matrix multiplication: Output = (Inputs * Weights) + Bias. On its own this is just a linear transformation, and stacking linear layers only yields another linear transformation. To learn complex real-world patterns, we must introduce non-linearity. We do this by passing the output of each neuron through an Activation Function.

- ReLU (Rectified Linear Unit): The most common default for hidden layers. $f(x) = \max(0, x)$. It outputs zero for negative inputs and passes positive inputs through unchanged. It is cheap to compute and usually trains quickly.
- Sigmoid: Squashes output into the range between 0 and 1. Typically used on the final Output Layer for Binary Classification.
- Softmax: Used on the final Output Layer for Multi-Class Classification. It converts raw network outputs into a probability distribution that sums to 1.0 (e.g., [10% Cat, 85% Dog, 5% Car]).
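
The forward pass and the three activation functions can be sketched in plain Python. The `dense` helper and all weights, biases, and inputs below are made-up values for illustration, not taken from the example scripts.

```python
import math


def relu(x):
    return max(0.0, x)


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def softmax(values):
    # Subtracting the maximum avoids overflow and does not change the result.
    largest = max(values)
    exps = [math.exp(v - largest) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def dense(inputs, weights, biases, activation=None):
    """One fully connected layer; weights[j] holds the weights of output neuron j."""
    outputs = []
    for neuron_weights, bias in zip(weights, biases):
        z = sum(x * w for x, w in zip(inputs, neuron_weights)) + bias
        outputs.append(activation(z) if activation else z)
    return outputs


x = [1.0, 2.0]
# Hidden layer with ReLU: the first neuron's negative sum is clipped to zero.
hidden = dense(x, [[0.5, -1.0], [1.0, 0.5]], [0.0, -0.5], activation=relu)
# Output layer produces raw scores; softmax turns them into probabilities.
logits = dense(hidden, [[1.0, 1.0], [0.0, 2.0], [1.0, -1.0]], [0.0, 0.0, 0.0])
probs = softmax(logits)

print(hidden)  # [0.0, 1.5]
print(logits)  # [1.5, 3.0, -1.5]
print(probs)
```

Frameworks perform the same computation as a batched matrix multiplication; the explicit loops here only keep each step visible.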

3. Backpropagation & Optimization

When the network makes a prediction during training, a Loss Function measures how far that prediction is from the true value. Backpropagation applies the chain rule to compute the derivative (the gradient) of this loss with respect to every weight and bias in the network, working backward from the output layer to the input. An Optimizer (like Stochastic Gradient Descent or Adam) then nudges each parameter a small step in the direction that reduces the loss on the next pass.
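
As a minimal sketch of this loop, the code below trains a deliberately tiny network (one input, one ReLU hidden neuron, one linear output) on a single made-up training example, using squared error and plain gradient descent. The initial parameters, learning rate, and data point are all assumptions for illustration; real training uses many examples and lets the framework compute the gradients.

```python
def relu(z):
    return max(0.0, z)


def train_step(params, x, y, lr):
    w1, b1, w2, b2 = params

    # Forward pass
    z1 = w1 * x + b1
    h = relu(z1)
    y_hat = w2 * h + b2
    loss = (y_hat - y) ** 2

    # Backward pass: chain rule, from the output back toward the input
    d_y_hat = 2.0 * (y_hat - y)
    d_w2 = d_y_hat * h
    d_b2 = d_y_hat
    d_h = d_y_hat * w2
    d_z1 = d_h * (1.0 if z1 > 0 else 0.0)  # ReLU passes gradient only where z1 > 0
    d_w1 = d_z1 * x
    d_b1 = d_z1

    # Gradient descent update: step against the gradient
    new_params = (
        w1 - lr * d_w1,
        b1 - lr * d_b1,
        w2 - lr * d_w2,
        b2 - lr * d_b2,
    )
    return new_params, loss


params = (0.5, 0.1, 0.5, 0.0)  # arbitrary starting values
x, y = 1.5, 3.0                # single illustrative training example
losses = []
for _ in range(200):
    params, loss = train_step(params, x, y, lr=0.02)
    losses.append(loss)

print(f"first loss: {losses[0]:.4f}, last loss: {losses[-1]:.6f}")
```

Each `d_` variable holds the gradient of the loss with respect to the named quantity, so reading the backward pass from top to bottom follows the error from the output toward the input.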

How to execute the examples:

Go to the Examples/ folder and run the scripts:

    python ANN_TensorFlow.py
    python ANN_PyTorch.py