Complete Machine Learning Vocabulary

Algorithm

An algorithm is a step-by-step computational procedure used to solve a problem or perform a task. In machine learning, algorithms refer to the methods used to identify patterns in data and make predictions or decisions based on it. Examples include linear regression, decision trees, and neural networks.

Artificial Intelligence (AI)

Artificial Intelligence (AI) is the broader field encompassing the creation of intelligent systems capable of performing tasks that typically require human intelligence. Machine learning is a subset of AI focused on enabling machines to learn from data.

Training Data

Training data is the dataset used to train a machine learning model. It contains input-output pairs (features and labels) that the model uses to learn patterns and relationships.

Test Data

Test data is a separate dataset used to evaluate a trained machine learning model. It helps to assess how well the model generalizes to unseen data.

Features

Features are individual measurable properties or characteristics of the data used as input for a machine learning model. For example, in a dataset of houses, features might include square footage, number of bedrooms, and location.

Labels

Labels are the outputs or target variables in a dataset used for supervised learning. For example, in a house price prediction task, the label would be the price of the house.

Model

A model is the mathematical representation that a machine learning algorithm learns from training data. It is used to make predictions or decisions based on new input data.

Supervised Learning

Supervised Learning is a type of machine learning where the model is trained on labeled data. The goal is to learn a mapping from inputs (features) to outputs (labels). Examples include regression and classification tasks.
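
As a quick illustration, here is a minimal supervised-learning sketch using scikit-learn (assumed available); the toy square-footage features and price labels are invented for the example.

```python
from sklearn.linear_model import LinearRegression

# Toy labeled data: features (square footage) and labels (price).
X = [[1000], [1500], [2000], [2500]]
y = [200_000, 280_000, 360_000, 440_000]

model = LinearRegression()
model.fit(X, y)                  # learn the mapping from features to labels
print(model.predict([[1800]]))   # predict the label for an unseen input
```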

Unsupervised Learning

Unsupervised Learning is a type of machine learning where the model is trained on unlabeled data. The goal is to identify patterns or structures in the data, such as clustering or dimensionality reduction.

Reinforcement Learning

Reinforcement Learning is a type of machine learning where an agent learns to make decisions by interacting with an environment and receiving rewards or penalties based on its actions.

Overfitting

Overfitting occurs when a machine learning model learns the noise or random fluctuations in the training data instead of the underlying patterns. This leads to poor performance on unseen data.

Underfitting

Underfitting occurs when a machine learning model is too simple to capture the underlying patterns in the data, resulting in poor performance on both training and test datasets.

Hyperparameters

Hyperparameters are configuration settings used to control the training process of a machine learning model. Examples include learning rate, batch size, and the number of layers in a neural network.

Cross-Validation

Cross-Validation is a technique used to evaluate the performance of a machine learning model by dividing the dataset into multiple subsets. The model is trained on some subsets and tested on others, ensuring a robust evaluation.
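
A minimal sketch of 5-fold cross-validation with scikit-learn, using its bundled Iris dataset as stand-in data:

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = load_iris(return_X_y=True)

# 5-fold CV: train on four folds, test on the fifth, and rotate.
scores = cross_val_score(LogisticRegression(max_iter=1000), X, y, cv=5)
print(scores.mean(), scores.std())
```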

Gradient Descent

Gradient Descent is an optimization algorithm used to minimize the loss function by iteratively adjusting the model’s parameters in the direction of the negative gradient.
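
The NumPy sketch below fits a one-parameter linear model by gradient descent on a mean squared error loss; the data, step count, and learning rate are arbitrary choices for illustration.

```python
import numpy as np

# Fit y = w * x by gradient descent on the mean squared error.
x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([2.0, 4.0, 6.0, 8.0])   # true relationship: y = 2x

w = 0.0      # initial parameter guess
lr = 0.01    # learning rate (a hyperparameter)
for _ in range(500):
    grad = 2 * np.mean((w * x - y) * x)  # gradient of the MSE with respect to w
    w -= lr * grad                       # step in the negative gradient direction
print(w)  # converges toward 2.0
```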

Loss Function

A loss function quantifies the difference between the predicted outputs of a model and the actual target values. It provides feedback to optimize the model’s parameters during training. Examples include mean squared error (MSE) and cross-entropy loss.
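
A small NumPy sketch of both example losses, evaluated on made-up predictions:

```python
import numpy as np

y_true = np.array([1.0, 0.0, 1.0])
y_pred = np.array([0.9, 0.2, 0.7])

# Mean squared error: average squared difference between prediction and target.
mse = np.mean((y_pred - y_true) ** 2)

# Binary cross-entropy: heavily penalizes confident wrong predictions.
bce = -np.mean(y_true * np.log(y_pred) + (1 - y_true) * np.log(1 - y_pred))
print(mse, bce)
```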

Epoch

An epoch is one complete pass through the entire training dataset during the training process of a machine learning model.

Batch Size

Batch size refers to the number of training examples used in one iteration of the training process. Smaller batch sizes lead to more frequent updates, while larger batch sizes provide more stable updates.
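
The schematic loop below shows how epochs and batch size relate; the data is a placeholder and the actual parameter update is elided.

```python
import numpy as np

data = np.arange(100)   # placeholder for 100 training examples
batch_size = 16
num_epochs = 3

for epoch in range(num_epochs):          # one epoch = one full pass over the data
    np.random.shuffle(data)              # reshuffle between epochs
    for i in range(0, len(data), batch_size):
        batch = data[i:i + batch_size]   # one update step uses batch_size examples
        # ... compute the loss on `batch` and update parameters here ...
```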

Activation Function

An activation function determines the output of a neural network node based on its input. Common activation functions include sigmoid, ReLU (Rectified Linear Unit), and tanh.

Feature Engineering

Feature Engineering involves creating, selecting, and transforming features to improve the performance of a machine learning model. Examples include normalization, one-hot encoding, and feature scaling.

Feature Scaling

Feature Scaling is a technique used to normalize the range of features in the dataset to ensure that they contribute equally to the model’s predictions. Methods include Min-Max Scaling and Standardization.

One-Hot Encoding

One-Hot Encoding is a method for converting categorical variables into binary vectors, where each category is represented by a unique vector with a single 1 and the rest 0s.
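
A minimal sketch using pandas, one convenient way to one-hot encode among several:

```python
import pandas as pd

colors = pd.Series(["red", "green", "blue", "green"])

# Each category becomes its own binary column, with a single 1 per row.
print(pd.get_dummies(colors))
```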

Normalization

Normalization scales the input features to a range of [0, 1] or [-1, 1], ensuring that all features have the same scale. It is commonly used in gradient-based algorithms.

Standardization

Standardization transforms features to have a mean of 0 and a standard deviation of 1 (z-scores). This puts features on a common scale, though it does not by itself make their distribution normal.
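
A short scikit-learn sketch contrasting the two scaling methods above on a toy column of values:

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler, StandardScaler

X = np.array([[1.0], [2.0], [3.0], [4.0]])

# Min-Max scaling (normalization): squeezes values into [0, 1].
print(MinMaxScaler().fit_transform(X).ravel())

# Standardization: mean 0, standard deviation 1.
print(StandardScaler().fit_transform(X).ravel())
```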

Confusion Matrix

A confusion matrix is a table used to evaluate the performance of a classification model. It summarizes true positives, true negatives, false positives, and false negatives.

Precision

Precision is the ratio of true positive predictions to the total number of positive predictions made by the model. It measures the accuracy of positive predictions.

Recall

Recall, also known as sensitivity or true positive rate, is the ratio of true positive predictions to the total number of actual positives in the dataset.

F1 Score

The F1 Score is the harmonic mean of precision and recall. It is used as a single metric to evaluate the balance between precision and recall.
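
A scikit-learn sketch tying the last few entries together, using a small invented set of true and predicted labels:

```python
from sklearn.metrics import confusion_matrix, f1_score, precision_score, recall_score

y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 1, 0, 0, 1, 1, 0]

# Rows are actual classes, columns are predicted classes:
# [[TN, FP],
#  [FN, TP]]
print(confusion_matrix(y_true, y_pred))
print(precision_score(y_true, y_pred))  # TP / (TP + FP)
print(recall_score(y_true, y_pred))     # TP / (TP + FN)
print(f1_score(y_true, y_pred))         # harmonic mean of the two
```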

ROC Curve

An ROC Curve (Receiver Operating Characteristic Curve) is a graphical representation of a classification model’s performance across different threshold values. It plots the true positive rate against the false positive rate.

AUC

AUC (Area Under the Curve) measures the area under the ROC Curve, providing a single metric to evaluate a classification model’s ability to distinguish between classes.
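
A minimal scikit-learn sketch with invented scores; an AUC of 1.0 is a perfect ranking, and 0.5 is chance level:

```python
from sklearn.metrics import roc_auc_score

y_true = [0, 0, 1, 1]
y_scores = [0.1, 0.4, 0.35, 0.8]   # predicted probabilities for the positive class

print(roc_auc_score(y_true, y_scores))  # 0.75 for these invented scores
```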

Ensemble Learning

Ensemble Learning combines the predictions of multiple models to improve overall performance. Techniques include bagging (e.g., Random Forest) and boosting (e.g., Gradient Boosting).

Bagging

Bagging (Bootstrap Aggregating) is an ensemble technique that trains multiple models on different subsets of the data and averages their predictions to reduce variance.

Boosting

Boosting is an ensemble technique that sequentially trains models, with each new model focusing on the errors of the previous ones. Examples include AdaBoost and XGBoost.
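
A sketch comparing one bagging model and one boosting model from scikit-learn on the Iris dataset; the model choices here are illustrative, not a benchmark.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.model_selection import cross_val_score

X, y = load_iris(return_X_y=True)

# Bagging: many trees trained on bootstrap samples, predictions combined.
print(cross_val_score(RandomForestClassifier(random_state=0), X, y, cv=5).mean())

# Boosting: trees trained in sequence, each correcting its predecessors.
print(cross_val_score(GradientBoostingClassifier(random_state=0), X, y, cv=5).mean())
```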

Dimensionality Reduction

Dimensionality Reduction reduces the number of features in a dataset while retaining its essential information. Techniques include Principal Component Analysis (PCA) and t-SNE.

Principal Component Analysis (PCA)

PCA is a technique for dimensionality reduction that transforms features into a new set of orthogonal components, ranked by the amount of variance they capture in the data.
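
A minimal PCA sketch with scikit-learn, reducing the 4-feature Iris data to 2 components:

```python
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA

X, _ = load_iris(return_X_y=True)

# Project the 4 original features onto 2 orthogonal components.
pca = PCA(n_components=2)
X_reduced = pca.fit_transform(X)
print(X_reduced.shape)                 # (150, 2)
print(pca.explained_variance_ratio_)   # variance captured per component
```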

Clustering

Clustering is an unsupervised learning technique that groups data points into clusters based on similarity. Examples include K-Means, DBSCAN, and hierarchical clustering.

K-Means

K-Means is a clustering algorithm that partitions data into clusters by minimizing the sum of squared distances between data points and their cluster centroids.

Silhouette Score

The Silhouette Score measures the quality of clustering by evaluating how similar a data point is to its own cluster compared to other clusters. Higher scores indicate better-defined clusters.
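
The sketch below combines the previous two entries: K-Means on synthetic blob data, evaluated with the silhouette score.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score

# Synthetic data drawn around 3 well-separated centers.
X, _ = make_blobs(n_samples=300, centers=3, random_state=42)

labels = KMeans(n_clusters=3, n_init=10, random_state=42).fit_predict(X)

# Scores near 1 mean tight, well-separated clusters; near 0, overlapping ones.
print(silhouette_score(X, labels))
```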

Neural Network

A neural network is a machine learning model inspired by the structure of the human brain. It consists of layers of interconnected nodes (neurons) that process data and learn patterns.

Deep Learning

Deep Learning is a subset of machine learning that uses neural networks with many layers (deep networks) to learn complex patterns in data. It is widely used in image recognition, natural language processing, and speech recognition.

Backpropagation

Backpropagation is the algorithm used to train neural networks by propagating the error backward through the network and updating the weights to minimize the loss function.

Dropout

Dropout is a regularization technique for neural networks where randomly selected neurons are ignored during training to prevent overfitting.
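
A framework-agnostic NumPy sketch of the common "inverted" dropout variant; deep learning libraries provide this as a built-in layer.

```python
import numpy as np

def dropout(activations, p=0.5, training=True):
    """Inverted dropout: zero each neuron with probability p during training."""
    if not training:
        return activations               # no dropout at inference time
    mask = np.random.rand(*activations.shape) >= p
    return activations * mask / (1 - p)  # rescale to keep expected values unchanged

print(dropout(np.ones((2, 4)), p=0.5))
```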

ReLU

ReLU (Rectified Linear Unit) is a commonly used activation function in neural networks, defined as f(x) = max(0, x). It introduces non-linearity and helps the model learn complex patterns.
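
A one-line NumPy version for illustration:

```python
import numpy as np

def relu(x):
    # Negatives become 0; positive values pass through unchanged.
    return np.maximum(0, x)

print(relu(np.array([-2.0, -0.5, 0.0, 1.5, 3.0])))  # [0.  0.  0.  1.5 3. ]
```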

Transfer Learning

Transfer Learning is a technique where a pre-trained model is fine-tuned on a new task. It is commonly used in deep learning to leverage models trained on large datasets for tasks with limited data.

Hyperparameter Tuning

Hyperparameter Tuning is the process of selecting the best hyperparameters for a machine learning model to optimize its performance. Methods include grid search and random search.

Grid Search

Grid Search is a method for hyperparameter tuning where all possible combinations of hyperparameters are systematically tested to find the best configuration.

Random Search

Random Search is a method for hyperparameter tuning where random combinations of hyperparameters are tested to find a good configuration with less computational effort compared to grid search.
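
A grid-search sketch with scikit-learn's GridSearchCV; swapping in RandomizedSearchCV with an n_iter budget gives the random-search variant.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

# Grid search tries every combination: 3 values of C x 2 kernels = 6 candidates.
param_grid = {"C": [0.1, 1, 10], "kernel": ["linear", "rbf"]}
search = GridSearchCV(SVC(), param_grid, cv=5)
search.fit(X, y)
print(search.best_params_, search.best_score_)
```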

Regularization

Regularization is a technique used to prevent overfitting by adding a penalty to the loss function. Common methods include L1 regularization (Lasso) and L2 regularization (Ridge).

L1 Regularization (Lasso)

L1 Regularization adds the absolute values of the coefficients as a penalty term to the loss function, encouraging sparsity in the model.

L2 Regularization (Ridge)

L2 Regularization adds the squared values of the coefficients as a penalty term to the loss function, reducing the magnitude of the coefficients without making them zero.
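
A sketch contrasting the two penalties with scikit-learn's Lasso and Ridge on synthetic data where only the first two features matter; the coefficients and alpha values are arbitrary choices for illustration.

```python
import numpy as np
from sklearn.linear_model import Lasso, Ridge

rng = np.random.RandomState(0)
X = rng.randn(100, 5)
y = 3 * X[:, 0] + 0.5 * X[:, 1] + 0.1 * rng.randn(100)  # only 2 features matter

# L1 (Lasso): irrelevant coefficients are driven exactly to zero.
print(Lasso(alpha=0.1).fit(X, y).coef_)

# L2 (Ridge): coefficients shrink toward zero but stay nonzero.
print(Ridge(alpha=1.0).fit(X, y).coef_)
```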