AI/ML Definitions Summary

A consolidated glossary of key terms and concepts in Artificial Intelligence and Machine Learning.

General AI/ML

  • Artificial Intelligence (AI): The simulation of human intelligence processes by machines, especially computer systems.
  • Machine Learning (ML): A subset of AI that enables systems to learn from data and improve from experience without being explicitly programmed.
  • Deep Learning (DL): A subset of ML utilizing neural networks with three or more layers to learn complex patterns from large amounts of data.
  • Data Science: The interdisciplinary field of extracting knowledge and insights from data.
  • Dataset: A collection of data used for training/testing. (Features = Columns, Samples = Rows).
  • Feature: An individual measurable property or variable (input).
  • Label/Target: The outcome variable we want to predict (output).
  • Training/Validation/Test Split: Dividing data to Train (learn), Validate (tune hyperparameters), and Test (evaluate final performance).
  • Overfitting: Model performs well on training data but poorly on test data (high variance).
  • Underfitting: Model performs poorly on both training and test data (high bias).
  • Bias-Variance Tradeoff: The tension between minimizing bias (error from overly simple or wrong assumptions) and variance (error from sensitivity to small fluctuations in the training data). Reducing one typically increases the other.
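
The Training/Validation/Test split described above can be sketched in a few lines. This is a minimal illustration using only the standard library; the function name `train_val_test_split` and the 70/15/15 proportions are assumptions chosen for the example, not a prescribed standard.

```python
import random

def train_val_test_split(samples, val_frac=0.15, test_frac=0.15, seed=0):
    """Shuffle samples and split them into train, validation, and test lists."""
    if not 0 <= val_frac + test_frac < 1:
        raise ValueError("val_frac + test_frac must be in [0, 1)")
    shuffled = list(samples)
    # A seeded generator keeps the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_frac)
    n_val = round(len(shuffled) * val_frac)
    test = shuffled[:n_test]
    val = shuffled[n_test:n_test + n_val]
    train = shuffled[n_test + n_val:]
    return train, val, test

# Hypothetical dataset: 100 sample IDs.
train, val, test = train_val_test_split(range(100))
print(len(train), len(val), len(test))
```

Shuffling before splitting matters when the data is ordered (for example by date or by class); for time-series data a chronological split is usually preferred instead.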

Supervised Learning

The machine learns from labeled data (Input + Correct Output).

1. Regression

Goal: Predict a continuous numerical value.

  • Linear Regression: Fits a straight line that minimizes the sum of squared errors. Best for simple, roughly linear relationships.
  • Polynomial Regression: Models non-linear relationships by adding powers of the features (e.g., $x^2$, $x^3$) as inputs.
  • Ridge (L2) & Lasso (L1) Regression: Linear regression with a regularization penalty on the coefficients to reduce overfitting. L1 can shrink coefficients to exactly zero (feature selection); L2 shrinks them toward zero without eliminating them.
  • Support Vector Regression (SVR): Fits a function that tolerates errors within a margin (epsilon) around the prediction and penalizes only deviations outside it.
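
As an illustration of least-squares fitting and L2 shrinkage, here is a minimal sketch for a single feature. The function name `fit_line` and the `l2` parameter are assumptions made for this example; the ridge variant shown penalizes only the slope, not the intercept.

```python
def fit_line(xs, ys, l2=0.0):
    """Fit y = slope * x + intercept by least squares.

    With l2 > 0 this becomes single-feature ridge regression:
    the penalty l2 * slope**2 shrinks the slope toward zero.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squares of x around its mean, and cross-products with y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / (sxx + l2)
    intercept = mean_y - slope * mean_x
    return slope, intercept

xs = [0, 1, 2, 3]
ys = [1, 3, 5, 7]                      # exactly y = 2x + 1
plain = fit_line(xs, ys)               # ordinary least squares
shrunk = fit_line(xs, ys, l2=5.0)      # ridge: smaller slope
print(plain, shrunk)
```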

2. Classification

Goal: Predict a categorical (discrete) class label.

  • Binary Classification: Two classes (e.g., Spam vs. Not Spam).
  • Multi-Class Classification: More than two classes (e.g., Handwritten digits 0-9).
  • Logistic Regression: Applies a Sigmoid function to a linear combination of the features, mapping the output to a value between 0 and 1 (a probability). Used for Binary Classification.
  • Decision Trees: Split data into branches, like a flowchart, based on feature values. Prone to overfitting.
  • Random Forest: An "Ensemble" of many Decision Trees, each trained on a random sample of the data (Bagging). Reduces overfitting by averaging or voting over the trees' predictions.
  • Support Vector Machines (SVM): Finds the "Maximum Margin Hyperplane" that best separates the classes. Effective in high dimensions.
  • K-Nearest Neighbors (KNN): "Lazy learner" that classifies a point based on the majority class of its 'K' nearest neighbors.
  • Naive Bayes: Probabilistic classifier based on Bayes' Theorem. Assumes the features are independent of each other given the class. Good for text classification.
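
The K-Nearest Neighbors idea is simple enough to sketch directly. The following is a minimal illustration, assuming Euclidean distance and a plain majority vote; the function name `knn_predict` and the toy points are hypothetical.

```python
import math
from collections import Counter

def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # "Lazy" learning: no training step, all work happens at prediction time.
    by_distance = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    nearest_labels = [label for _, label in by_distance[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]

points = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
labels = ["a", "a", "a", "b", "b", "b"]
print(knn_predict(points, labels, (0.5, 0.5)))
print(knn_predict(points, labels, (5.5, 5.5)))
```

Because distances drive the vote, features on very different scales are usually normalized first.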

Unsupervised Learning

The machine learns from unlabeled data (Input only, No Output).

  • Clustering: Grouping similar data points together.
    • K-Means: Partitions data into K distinct clusters.
    • Hierarchical Clustering: Builds a tree of clusters.
  • Dimensionality Reduction: Reducing the number of input variables while retaining important information.
    • PCA (Principal Component Analysis): Projects data onto a smaller set of orthogonal axes (Principal Components) chosen to retain as much of the variance as possible.
    • t-SNE: Non-linear technique mainly for data visualization.
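
To make the K-Means entry concrete, here is a minimal sketch of the usual iterative procedure (often called Lloyd's algorithm): assign each point to its nearest centroid, then move each centroid to the mean of its assigned points. The function name `kmeans`, the fixed iteration count, and the caller-supplied starting centroids are assumptions for the example.

```python
import math

def kmeans(points, initial_centroids, iterations=10):
    """Alternate assignment and update steps for a fixed number of iterations."""
    if iterations < 1:
        raise ValueError("iterations must be at least 1")
    centroids = [tuple(c) for c in initial_centroids]
    for _ in range(iterations):
        # Assignment step: put each point in the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for point in points:
            nearest = min(
                range(len(centroids)),
                key=lambda i: math.dist(point, centroids[i]),
            )
            clusters[nearest].append(point)
        # Update step: move each centroid to the mean of its cluster.
        # An empty cluster keeps its previous centroid.
        centroids = [
            tuple(sum(axis) / len(cluster) for axis in zip(*cluster)) if cluster else old
            for cluster, old in zip(clusters, centroids)
        ]
    return centroids, clusters

data = [(0, 0), (0, 2), (10, 10), (10, 12)]
centroids, clusters = kmeans(data, initial_centroids=[(0, 0), (10, 10)])
print(centroids)
```

The result depends on the starting centroids, which is why practical implementations typically try several initializations.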

Deep Learning

  • Neural Network (ANN): Computing system inspired by the biological neural networks of animal brains.
  • Neuron/Perceptron: The basic unit, applying weights, bias, and an activation function to inputs.
  • Activation Functions:
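    • Sigmoid: $1/(1+e^{-x})$. Output lies between 0 and 1. Prone to the vanishing gradient problem because its gradient approaches zero for large positive or negative inputs.
    • ReLU (Rectified Linear Unit): $\max(0, x)$. Standard for hidden layers.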
    • Softmax: Converts a vector of numbers into a probability distribution (sum = 1). Used for output layer in multi-class classification.
  • CNN (Convolutional Neural Network): Specialized for grid data (images). Uses Convolution (filters) and Pooling (downsampling) layers to detect spatial hierarchies.
  • RNN (Recurrent Neural Network): Specialized for sequential data (time-series, text). Has "memory" of previous inputs. Prone to vanishing/exploding gradients.
  • LSTM/GRU: Advanced RNNs with "Gates" to handle long-term dependencies.
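
The neuron and activation functions listed above can be written out in a few lines. This is a minimal sketch using only the standard library; the `neuron` helper and its example weights are hypothetical, and real frameworks operate on whole tensors rather than Python lists.

```python
import math

def sigmoid(x):
    """1 / (1 + e^-x), computed in a form that avoids overflow for very negative x."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)

def relu(x):
    """max(0, x): passes positive inputs through, zeroes out negative ones."""
    return max(0.0, x)

def softmax(values):
    """Turn a list of scores into probabilities that sum to 1."""
    shift = max(values)  # subtracting the max keeps exp() from overflowing
    exps = [math.exp(v - shift) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def neuron(inputs, weights, bias, activation=sigmoid):
    """Weighted sum of inputs plus bias, passed through an activation function."""
    return activation(sum(w * x for w, x in zip(weights, inputs)) + bias)

print(sigmoid(0.0), relu(-3.0), softmax([1.0, 2.0, 3.0]))
print(neuron([1.0, 2.0], weights=[0.5, -0.25], bias=0.0))
```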

Generative AI & LLMs

  • Generative AI: Creates new content (text, image, audio) rather than just classifying existing data.
  • Transformer: Architecture introduced in 2017 relying entirely on "Attention" mechanisms. Parallelizable and scalable.
    • Self-Attention: Weighs the importance of words in a sentence relative to each other.
  • LLM (Large Language Model): Probabilistic model trained on massive text corpora to predict the next token.
  • Token: The basic unit of text processed by LLMs, typically a word or a piece of a word (roughly 0.75 words per token, as a rule of thumb).
  • Embedding: Vector representation of a token capturing semantic meaning.
  • Context Window: Limit on the amount of text the model can consider at one time.
  • Hallucination: Confident but factually incorrect generation.
  • RAG (Retrieval-Augmented Generation): Fetching external data to include in the prompt context to improve accuracy.
  • Fine-Tuning: Additional training of a pre-trained model on a smaller, task-specific dataset to adapt it to specific tasks.
  • RLHF: Reinforcement Learning from Human Feedback. Tuning models to be helpful and safe using human preferences.
  • Prompt Engineering: The art of crafting inputs (prompts) to get the best output from an LLM.
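
The Self-Attention entry above can be illustrated with scaled dot-product attention, the core operation in a Transformer. This is a simplified sketch: it assumes the query, key, and value vectors are given directly, whereas a real Transformer computes them from token embeddings with learned projection matrices and runs several attention heads in parallel. The function names are chosen for this example.

```python
import math

def softmax(values):
    """Turn a list of scores into weights that sum to 1."""
    shift = max(values)
    exps = [math.exp(v - shift) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def attention(queries, keys, values):
    """For each query, return a weighted average of the value vectors.

    Weights come from the softmax of query-key dot products,
    scaled by the square root of the key dimension.
    """
    d_k = len(keys[0])
    outputs = []
    for q in queries:
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
            for k in keys
        ]
        weights = softmax(scores)
        outputs.append([
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ])
    return outputs

# Toy input: both keys are identical, so the query attends to them equally
# and the output is the plain average of the two value vectors.
out = attention(
    queries=[[1.0, 0.0]],
    keys=[[1.0, 1.0], [1.0, 1.0]],
    values=[[0.0, 2.0], [4.0, 6.0]],
)
print(out)
```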

Model Evaluation Metrics

  • Accuracy: (Correct Predictions) / (Total Predictions).
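  • Precision: (True Positives) / (True Positives + False Positives). Of everything predicted positive, the fraction that is actually positive ("Quality").
  • Recall (Sensitivity): (True Positives) / (True Positives + False Negatives). Of everything actually positive, the fraction the model found ("Quantity").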
  • F1-Score: Harmonic mean of Precision and Recall.
  • Confusion Matrix: Table of predicted versus actual classes, showing the counts of true positives, false positives, false negatives, and true negatives.
  • MSE (Mean Squared Error): Average squared difference between the predicted values and the actual values (Regression).
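
The formulas above translate directly into code. The sketch below computes them for a binary classifier and for a small regression example; the function names and the toy labels are assumptions for illustration, and a ratio with a zero denominator is reported as 0.0 by convention here.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (TP, FP, FN, TN) counts for a binary classification task."""
    tp = fp = fn = tn = 0
    for actual, predicted in zip(y_true, y_pred):
        if predicted == positive:
            if actual == positive:
                tp += 1
            else:
                fp += 1
        else:
            if actual == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, fn, tn

def classification_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall, and F1 from the confusion counts."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred, positive)
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}

def mean_squared_error(y_true, y_pred):
    """Average of the squared differences between actual and predicted values."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]
print(confusion_counts(y_true, y_pred))
print(classification_metrics(y_true, y_pred))
print(mean_squared_error([1, 2, 3], [1, 2, 5]))
```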