Basic ML Models
K-Nearest Neighbors (KNN):
- KNN is a supervised learning algorithm used for classification and regression tasks.
- It finds the K training points closest to a new data point (typically by Euclidean distance) and predicts from them: the majority label among the neighbors for classification, or the average of their target values for regression.
- KNN is a non-parametric algorithm: it does not assume a particular form for the underlying data distribution, and it keeps the training data itself rather than fitting a fixed set of parameters.
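
The classification case described above can be sketched in a few lines of plain Python. This is a minimal illustration, not a production implementation: the function name `knn_predict`, the toy points, and the choice of Euclidean distance are all assumptions made for the example.

```python
import math
from collections import Counter

def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Euclidean distance from the query to every training point
    distances = [
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    ]
    # Keep the k closest neighbors
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    # Majority vote over the neighbors' labels
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical toy data: two well-separated groups labeled "a" and "b"
train_points = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.5), (6.0, 6.0), (6.5, 7.0), (7.0, 6.5)]
train_labels = ["a", "a", "a", "b", "b", "b"]

print(knn_predict(train_points, train_labels, (1.2, 1.4), k=3))
```

For regression, the vote would be replaced by the mean of the neighbors' target values. Note that every prediction scans the whole training set, which is the cost of keeping no fitted parameters.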
Linear Regression:
- Linear regression is a supervised learning algorithm used for predicting a continuous target variable based on one or more input features.
- It assumes a linear relationship between the input features and the target variable and finds the best-fit line (a hyperplane when there are several features) that minimizes the sum of squared errors between predicted and observed values.
- Linear regression can provide insights into the strength and direction of relationships between variables and is widely used for prediction and inference tasks.
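
For a single input feature, the least-squares line has a closed-form solution, sketched below. The function name `fit_line` and the sample data are assumptions for illustration; with several features one would normally rely on a linear algebra library instead.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations of x and y, and sum of squared deviations of x
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical data generated from y = 2x + 1 with no noise
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

slope, intercept = fit_line(xs, ys)
print(slope, intercept)
```

The sign of the slope gives the direction of the relationship, and its magnitude gives the expected change in the target per unit change in the feature.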
Polynomial Regression:
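- Polynomial regression is an extension of linear regression that models the relationship between input features and the target variable using polynomial functions. The model is still linear in its coefficients, so it can be fitted with the same least-squares approach.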
- It allows for capturing non-linear relationships between the variables by including higher-degree polynomial terms.
- Polynomial regression can provide a better fit to the data when the relationship is not strictly linear, but it can also be prone to overfitting if the degree of the polynomial is too high.
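
One way to see the link to linear regression is to build the powers of x as extra features and solve the resulting least-squares problem. The sketch below does this with the normal equations and a small Gauss-Jordan solver in plain Python. The function name `fit_polynomial` and the data are assumptions for illustration, and the normal equations become numerically unstable for high degrees, so a library routine would be preferable in practice.

```python
def fit_polynomial(xs, ys, degree):
    """Least-squares polynomial fit; returns coefficients [c0, c1, ..., c_degree]."""
    size = degree + 1
    # Design matrix: each row is [1, x, x^2, ..., x^degree]
    rows = [[x ** p for p in range(size)] for x in xs]
    # Normal equations (A^T A) c = A^T y, stored as an augmented matrix
    aug = [
        [sum(row[i] * row[j] for row in rows) for j in range(size)]
        + [sum(row[i] * y for row, y in zip(rows, ys))]
        for i in range(size)
    ]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(size):
        pivot = max(range(col, size), key=lambda idx: abs(aug[idx][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(size):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [aug[i][size] / aug[i][i] for i in range(size)]

# Hypothetical data generated from y = x^2 - 2x + 3 with no noise
xs = [-2.0, -1.0, 0.0, 1.0, 2.0]
ys = [x ** 2 - 2 * x + 3 for x in xs]

coeffs = fit_polynomial(xs, ys, degree=2)
print(coeffs)
```

Raising `degree` toward the number of data points lets the curve pass through every point, which is the overfitting risk mentioned above.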
K-means Clustering:
- K-means is an unsupervised learning algorithm used for clustering analysis.
- It aims to partition the data into K clusters by minimizing the within-cluster sum of squared distances.
- K-means starts by randomly initializing K centroids, then alternates two steps until the assignments stop changing: assign each data point to its nearest centroid, and recompute each centroid as the mean of the points assigned to it.
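
The assign-and-update loop can be sketched as follows. This is a minimal version under stated assumptions: the function name `k_means`, the seeded initialization from randomly chosen data points, the rule of keeping an empty cluster's old centroid, and the toy data are all choices made for the example.

```python
import math
import random

def k_means(points, k, max_iters=100, seed=0):
    """Return (centroids, clusters) after iterating until the centroids stop moving."""
    rng = random.Random(seed)
    # Initialize centroids as k distinct data points chosen at random
    centroids = rng.sample(points, k)
    for _ in range(max_iters):
        # Assignment step: each point joins the cluster of its nearest centroid
        clusters = [[] for _ in range(k)]
        for point in points:
            nearest = min(range(k), key=lambda i: math.dist(point, centroids[i]))
            clusters[nearest].append(point)
        # Update step: move each centroid to the mean of its assigned points
        # (an empty cluster keeps its previous centroid)
        new_centroids = [
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster))
            if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids, clusters

# Hypothetical toy data: two well-separated groups
points = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.5), (8.0, 8.0), (8.5, 9.0), (9.0, 8.5)]

centroids, clusters = k_means(points, k=2)
print(sorted(centroids))
```

On less cleanly separated data the result can depend on the initial centroids, so it is common to run the algorithm several times with different seeds and keep the solution with the lowest within-cluster sum of squared distances.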