
ML - Part 4 - Gradient-Based Models (Precursor to Deep Learning)

Gradient descent is an iterative optimization algorithm used in machine learning to minimize a cost or loss function. It is a key technique in training models to find the optimal values of their parameters. Here’s an overview of gradient descent in machine learning:

  1. Optimization Objective: In machine learning, models are trained to minimize a cost or loss function that measures the difference between predicted values and the actual values in the training data. The goal of gradient descent is to find the set of model parameters that minimize this loss function.
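For instance, a common loss function for regression is mean squared error (MSE). A minimal sketch in plain Python (the name `mse_loss` is illustrative):

```python
def mse_loss(y_true, y_pred):
    """Average squared difference between targets and predictions."""
    n = len(y_true)
    return sum((yt - yp) ** 2 for yt, yp in zip(y_true, y_pred)) / n

# Two of the three predictions are off by 0.5, so the loss is
# (0 + 0.25 + 0.25) / 3, roughly 0.167.
loss = mse_loss([1.0, 2.0, 3.0], [1.0, 2.5, 2.5])
```

Gradient descent searches for the parameter values that drive this quantity as low as possible.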

  2. Gradient Descent Algorithm: Gradient descent starts with an initial set of parameter values and iteratively updates them to minimize the loss function. In each iteration, the algorithm calculates the gradient of the loss function with respect to each parameter; the gradient points in the direction of steepest ascent, so the algorithm moves the parameters in the opposite direction.
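As a concrete illustration, for the one-parameter model y_hat = w * x under an MSE loss, the gradient with respect to w has a closed form (the function name below is illustrative):

```python
def mse_gradient(w, xs, ys):
    """Analytic gradient of the MSE loss for the model y_hat = w * x."""
    n = len(xs)
    return (2.0 / n) * sum((w * x - y) * x for x, y in zip(xs, ys))

# With data generated by y = 2x, the gradient at w = 0 is -10.0:
# it is negative, so the loss decreases if we *increase* w.
g = mse_gradient(0.0, [1.0, 2.0], [2.0, 4.0])
```

At the best-fit value w = 2 the gradient is exactly zero, which is what makes it a minimum.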

  3. Update Rule: The parameter updates in gradient descent are determined by the gradients. In each iteration, the parameters are adjusted by subtracting the gradient multiplied by the learning rate. The learning rate controls the step size of the updates and therefore influences both the speed and the stability of convergence.
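The update rule is a single line of arithmetic. A minimal sketch on a toy problem (minimizing f(w) = (w - 3)², whose gradient is 2(w - 3); names are illustrative):

```python
def gd_step(param, grad, learning_rate):
    """One gradient descent update: step against the gradient."""
    return param - learning_rate * grad

# Repeatedly applying the rule pulls w toward the minimizer w = 3.
w = 0.0
for _ in range(100):
    w = gd_step(w, 2 * (w - 3.0), learning_rate=0.1)
# w is now very close to 3.0
```

With learning rate 0.1, each step shrinks the distance to the minimum by a factor of 0.8; a learning rate above 1.0 on this problem would instead diverge, which is why the step size matters.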

  4. Batch Gradient Descent: In batch gradient descent, the algorithm computes the gradients and updates the parameters using the entire training dataset in each iteration. It can be computationally expensive for large datasets, but its updates are smooth, and for convex loss functions it converges to the global minimum.
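A sketch of batch gradient descent fitting a simple line y = w*x + b by MSE, where every update sums over the whole dataset (the function name and hyperparameter values are illustrative):

```python
def batch_gradient_descent(xs, ys, lr=0.05, epochs=500):
    """Fit y = w*x + b by full-batch gradient descent on MSE."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Gradients averaged over the ENTIRE dataset each iteration
        gw = (2 / n) * sum((w * x + b - y) * x for x, y in zip(xs, ys))
        gb = (2 / n) * sum((w * x + b - y) for x, y in zip(xs, ys))
        w -= lr * gw
        b -= lr * gb
    return w, b

# Data generated by y = 2x + 1; the fit recovers w ≈ 2, b ≈ 1.
w, b = batch_gradient_descent([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0])
```

Note that each of the 500 iterations touches all n examples, which is exactly the cost that motivates the stochastic variants below.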

  5. Stochastic Gradient Descent (SGD): In stochastic gradient descent, the algorithm updates the parameters using one randomly selected training instance at a time. SGD is computationally cheap per update and often makes faster initial progress, but the updates are noisy, so the loss fluctuates rather than decreasing monotonically.
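The same line-fitting task as above can be sketched with SGD, where each update uses the gradient of a single randomly chosen example (names and hyperparameters are illustrative; the seed is fixed only to make the sketch reproducible):

```python
import random

def sgd(xs, ys, lr=0.01, epochs=200, seed=0):
    """Fit y = w*x by updating on one randomly chosen example per step."""
    rng = random.Random(seed)
    w = 0.0
    for _ in range(epochs):
        for _ in range(len(xs)):
            i = rng.randrange(len(xs))
            # Gradient of the squared error of a SINGLE example
            grad = 2 * (w * xs[i] - ys[i]) * xs[i]
            w -= lr * grad
    return w

# Data generated by y = 2x; the noisy updates still settle near w = 2.
w = sgd([1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0])
```

Each update costs O(1) instead of O(n), which is why SGD scales to datasets too large for batch gradient descent.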

  6. Mini-batch Gradient Descent: Mini-batch gradient descent strikes a balance between batch gradient descent and stochastic gradient descent. It updates the parameters using a small subset (mini-batch) of the training data in each iteration. This approach combines the efficiency of SGD with the stability of batch gradient descent.
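A mini-batch sketch of the same fit, shuffling the data each epoch and averaging the gradient over small batches (the function name, batch size, and other hyperparameters are illustrative):

```python
import random

def minibatch_gd(xs, ys, lr=0.05, epochs=300, batch_size=2, seed=0):
    """Fit y = w*x using gradients averaged over small random mini-batches."""
    rng = random.Random(seed)
    idx = list(range(len(xs)))
    w = 0.0
    for _ in range(epochs):
        rng.shuffle(idx)  # visit the data in a fresh random order each epoch
        for start in range(0, len(idx), batch_size):
            batch = idx[start:start + batch_size]
            # Gradient averaged over just this mini-batch
            grad = (2 / len(batch)) * sum(
                (w * xs[i] - ys[i]) * xs[i] for i in batch
            )
            w -= lr * grad
    return w

# Data generated by y = 3x; the mini-batch updates settle near w = 3.
w = minibatch_gd([1.0, 2.0, 3.0, 4.0], [3.0, 6.0, 9.0, 12.0])
```

Averaging over a batch reduces the variance of each update relative to SGD, while still avoiding a full pass over the data per step.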

  7. Convergence and Termination: Gradient descent iterations continue until a stopping criterion is met, such as reaching a predefined number of iterations or the change in the loss (or in the parameters) falling below a threshold. At convergence, the algorithm has found parameter values that minimize the loss function, at least locally.
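Both stopping criteria can be combined in one loop: a tolerance on the change between iterations plus a cap on the iteration count. A minimal sketch (names and the toy objective are illustrative):

```python
def gd_with_tolerance(grad_fn, w0, lr=0.1, tol=1e-8, max_iters=10000):
    """Run 1-D gradient descent until the parameter change drops below tol,
    or until max_iters iterations have been used."""
    w = w0
    for i in range(max_iters):
        w_new = w - lr * grad_fn(w)
        if abs(w_new - w) < tol:  # change small enough: declare convergence
            return w_new, i + 1
        w = w_new
    return w, max_iters

# Toy objective f(w) = (w - 5)^2, gradient 2 * (w - 5):
# the loop stops well before the iteration cap, with w very near 5.
w, iters = gd_with_tolerance(lambda w: 2 * (w - 5.0), w0=0.0)
```

In practice the tolerance is often applied to the loss value rather than the parameters, and validation-based early stopping is common as well; the structure of the loop is the same.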
