Course 4: ML – Primer (End-to-End Process Flow)
Course Curriculum
Machine Learning – Key concepts

Machine Learning: Concepts, Models, and Platform Insights
01:12:43 
ML – intuitive understanding
47:18 
Distance and similarity measures – Primer
35:59
Supervised models – Classification – K nearest neighbors

Concepts on Nearest neighbors
15:39 
Choosing K
10:17 
Break the ties (prediction)
10:17 
Importance of scaling the data
05:44 
Handling Categorical Data
10:58 
Model evaluation methods (metrics, PR Curves, CV)
34:33 
Tuning for performance
16:43 
KNN as regressor
08:37 
Saving/loading the model
08:06 
Limitations of KNN, speed-up options
20:08 
MCQs: Topic End: KNN and related concepts
Some practical DS applications with KNN

Recommender systems
00:00 
Anomaly detection
00:00 
Geographical Data Analysis
00:00 
Customer Segmentation
00:00
Basics of linear regression

basic intuition (using python code)

statistical way (python code)

Matrix method (slides + code)

sklearn implementation on advertising dataset

Model evaluations (learning curve and cross validations)

Test of assumptions (adv dataset)

MSE plot (Code)

Save/load model

effect of one-hot encoding (OHE) (code)

effect of multicollinearity (code)

with non linear data (code)
Assumptions of Linear Regression
Linear regression is based on several assumptions, including linearity (the relationship between the predictors and the target is linear), independence of errors, constant variance of errors (homoscedasticity), absence of multicollinearity, and normal distribution of errors.
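A few of these assumptions can be checked directly on a fitted model. The sketch below is only an illustration on synthetic data, not the course's notebook code: the helper names `pearson` and `fit_line`, the sample size, and the noise levels are all assumptions made for the example. It covers three of the assumptions (zero-mean residuals, roughly constant residual variance, and multicollinearity via the variance inflation factor) and uses only the Python standard library so it runs without extra packages.

```python
import random
import statistics as st


def pearson(a, b):
    """Pearson correlation coefficient between two equal-length lists."""
    ma, mb = st.fmean(a), st.fmean(b)
    cov = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    var_a = sum((u - ma) ** 2 for u in a)
    var_b = sum((v - mb) ** 2 for v in b)
    return cov / (var_a * var_b) ** 0.5


def fit_line(x, y):
    """Least-squares slope and intercept for a single predictor."""
    mx, my = st.fmean(x), st.fmean(y)
    sxy = sum((u - mx) * (v - my) for u, v in zip(x, y))
    sxx = sum((u - mx) ** 2 for u in x)
    slope = sxy / sxx
    return slope, my - slope * mx


# Synthetic data (assumed for illustration): a linear signal with
# constant-variance Gaussian noise.
rng = random.Random(0)
n = 400
x1 = [rng.gauss(0, 1) for _ in range(n)]
x2 = [rng.gauss(0, 1) for _ in range(n)]        # unrelated to x1
x3 = [a + rng.gauss(0, 0.05) for a in x1]       # near-copy of x1
y = [3.0 + 2.0 * a + rng.gauss(0, 0.5) for a in x1]

slope, intercept = fit_line(x1, y)
residuals = [v - (intercept + slope * u) for u, v in zip(x1, y)]

# Homoscedasticity (rough check): compare the residual variance for the
# lower half of x1 with the upper half. A ratio near 1 is consistent
# with constant variance.
order = sorted(range(n), key=lambda i: x1[i])
low = [residuals[i] for i in order[: n // 2]]
high = [residuals[i] for i in order[n // 2:]]
variance_ratio = st.pvariance(high) / st.pvariance(low)

# Multicollinearity: with exactly two predictors, the variance inflation
# factor reduces to 1 / (1 - r^2), where r is their correlation.
vif_independent = 1.0 / (1.0 - pearson(x1, x2) ** 2)
vif_collinear = 1.0 / (1.0 - pearson(x1, x3) ** 2)

print(f"slope={slope:.2f} intercept={intercept:.2f}")
print(f"mean residual={st.fmean(residuals):.2e}")
print(f"variance ratio (high/low)={variance_ratio:.2f}")
print(f"VIF x1 vs x2={vif_independent:.2f}, VIF x1 vs x3={vif_collinear:.1f}")
```

A VIF close to 1 suggests the predictors carry independent information, while a very large VIF (as for the near-duplicate `x3`) signals that coefficient estimates would be unstable if both predictors were included. Independence and normality of errors need separate checks, such as residual plots, that are not shown here.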

Linearity
00:00 
Independence of Errors
00:00 
Homoscedasticity
00:00 
Normality of Errors
00:00 
No Multicollinearity
00:00 
No Endogeneity
00:00
Unsupervised models: high-level understanding
Unsupervised machine learning is a branch of machine learning where the model is trained on unlabeled data without any explicit target variable.

Example in brief: Clustering
00:00 
Example in brief: Dimensionality Reduction
00:00 
Example in brief: Anomaly Detection
00:00 
Example in brief: Generative Models
00:00 
Example in brief: Association Rule Learning
00:00 
Example in brief: Feature Learning
00:00
Unsupervised models: Clustering
Clustering is a technique in machine learning that aims to group similar data points together based on their intrinsic characteristics or similarities. It is a form of unsupervised learning, which means that it does not require labeled data or prior knowledge about the groups.
The goal of clustering is to discover inherent patterns or structures in the data without any predefined classes or categories. It is often used as an exploratory technique to gain insights into the data, identify natural groupings, or find hidden patterns that may not be apparent from a simple visual inspection.
Clustering algorithms assign data points to clusters based on certain criteria, such as proximity or similarity measures. The similarity between data points is determined by comparing their features or attributes. Many algorithms, K-means among them, refine these assignments iteratively, aiming to maximize the similarity within each cluster and the dissimilarity between different clusters.
The result of clustering is a partitioning of the data into distinct clusters, where data points within the same cluster are more similar to each other than to those in other clusters. The number of clusters can be predefined or determined automatically by the algorithm based on certain criteria.
Clustering has various applications in different domains, such as customer segmentation in marketing, image segmentation in computer vision, document clustering in text mining, and anomaly detection in cybersecurity. It can provide valuable insights, facilitate data exploration, and support decision-making processes.
It is important to note that clustering is different from classification, which is a form of supervised learning where data points are assigned to predefined classes based on labeled training data. Clustering, on the other hand, does not rely on labeled data and seeks to discover patterns or groupings on its own.
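The iterative assign-and-update idea described above can be sketched in a few lines. This is a minimal, dependency-free illustration of K-means (Lloyd's algorithm), not the sklearn implementation used in the lessons. The function names `farthest_first` and `kmeans`, the toy points, and the deterministic farthest-first initialization (a simple stand-in for the random or k-means++ initialization used in practice) are all assumptions made for the example.

```python
from math import dist


def farthest_first(points, k):
    """Pick k starting centroids: the first point, then repeatedly the
    point farthest from every centroid chosen so far (deterministic)."""
    centroids = [points[0]]
    while len(centroids) < k:
        nxt = max(points, key=lambda p: min(dist(p, c) for c in centroids))
        centroids.append(nxt)
    return centroids


def kmeans(points, k, max_iter=100):
    """Plain K-means: alternate assignment and centroid update until
    the centroids stop moving or max_iter is reached."""
    centroids = farthest_first(points, k)
    labels = [0] * len(points)
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(k), key=lambda c: dist(p, centroids[c])) for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append(
                    tuple(sum(coord) / len(members) for coord in zip(*members))
                )
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return labels, centroids


# Toy data (assumed): three visually obvious groups, no labels supplied.
points = [
    (1.0, 1.0), (1.2, 0.8), (0.8, 1.1),
    (5.0, 5.0), (5.2, 4.9), (4.8, 5.1),
    (9.0, 1.0), (9.1, 0.8), (8.9, 1.2),
]
labels, centroids = kmeans(points, k=3)
print(labels)
print(centroids)
```

The cluster ids themselves carry no meaning; only the grouping does. This is the practical difference from classification, where the label values are defined in advance by the training data.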

K-means Clustering – overview
00:00 
K-means – sklearn implementation
00:00 
K-means – clean data vs noisy data
00:00 
How to determine the value of K: Elbow Method
00:00 
Evaluation Metrics
00:00 
Limitations of K-means
00:00 
Scalability issue (K-means)
00:00 
Variants and Extensions of K-means
00:00
Level: Beginner

Duration: 2 hours

Last Updated: March 20, 2024
Material Includes
 Self learning videos (lesson wise)
 Python Notebook files
 Datasets
 Quizzes
 Assignments
Introduction
Machine Learning (ML) stands at the forefront of technological advancement, transforming the way we process information, make predictions, and automate decision-making.
This primer serves as an introductory guide to the complex yet fascinating world of machine learning, describing its fundamental concepts, applications, and the underlying principles that drive its evolution.
At the heart of machine learning lies the notion of learning from data, allowing systems to improve their performance over time without explicit programming.
The journey begins by understanding the core definition of machine learning—a field where algorithms ingest data, learn patterns, and use this acquired knowledge to make predictions or decisions.
This iterative learning process forms the bedrock of intelligent computing.
Data, the Lifeblood of Machine Learning: An exploration of machine learning commences with the pivotal role played by data.
Unlike traditional programming, where rules are explicitly defined, machine learning models thrive on diverse datasets. The narrative delves into the significance of data samples for ML, emphasizing the quality, diversity, and representativeness of data as essential ingredients for model success.
What is a Model: A model is the central concept in machine learning: a mathematical representation of patterns inferred from data. Models are designed to generalize what they have learned to new, unseen data, allowing for predictions beyond the training set.
The distinction between traditional programming and machine learning highlights the paradigm shift from rule-based systems to dynamic models that adapt to changing scenarios.
The Evolution of Machine Learning: As the primer unfolds, it traces the evolution of machine learning, showcasing how it has progressed from rule-based systems to more sophisticated models. This evolution mirrors the shift from manually crafted features to the automated extraction of relevant features from data, a transition that has significantly enhanced the adaptability and complexity of machine learning models.
Stages of Machine Learning: To demystify the machine learning journey, the stages of ML are illustrated, providing a visual roadmap. From data collection and preprocessing to model training, evaluation, and deployment, each stage contributes to the development of a robust and effective machine learning system.
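The stages named above can be traced end to end in a small sketch. The example below is illustrative only and rests on several assumptions: the data is synthetic, the model is a hand-written K-nearest-neighbors classifier stored as a plain dictionary, and "deployment" is simulated by a pickle round trip in memory. The names `scale` and `predict` are hypothetical helpers, not part of any library.

```python
import pickle
import random
import statistics as st
from collections import Counter
from math import dist

rng = random.Random(42)

# 1. Data collection: synthetic two-class data whose features sit on
#    very different scales (assumed for illustration).
rows = [((rng.gauss(0, 1), rng.gauss(0, 100)), 0) for _ in range(100)]
rows += [((rng.gauss(6, 1), rng.gauss(600, 100)), 1) for _ in range(100)]
rng.shuffle(rows)

# 2. Train/test split: hold out 20% to estimate performance on unseen data.
split = int(0.8 * len(rows))
train, test = rows[:split], rows[split:]

# 3. Preprocessing: standardize each feature using training statistics only.
columns = list(zip(*(x for x, _ in train)))
means = [st.fmean(col) for col in columns]
stds = [st.pstdev(col) for col in columns]


def scale(x, means, stds):
    """Standardize one sample with the given per-feature mean and std."""
    return tuple((v - m) / s for v, m, s in zip(x, means, stds))


# 4. Training: for KNN this amounts to storing the scaled training points.
model = {
    "k": 5,
    "means": means,
    "stds": stds,
    "points": [(scale(x, means, stds), label) for x, label in train],
}


def predict(model, x):
    """Majority vote among the k nearest stored training points."""
    z = scale(x, model["means"], model["stds"])
    nearest = sorted(model["points"], key=lambda p: dist(z, p[0]))[: model["k"]]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


# 5. Evaluation: accuracy on the held-out test set.
accuracy = sum(predict(model, x) == label for x, label in test) / len(test)
print(f"test accuracy: {accuracy:.2f}")

# 6. Deployment (simulated): serialize the model and load it back.
restored = pickle.loads(pickle.dumps(model))
print(predict(restored, (0.0, 0.0)), predict(restored, (6.0, 600.0)))
```

Storing the scaling statistics inside the model matters: a restored model must apply the same preprocessing to new inputs that was applied during training, otherwise its distance computations are no longer comparable.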
Diverse Landscape: The primer delves into the diverse landscape of machine learning, categorizing it into types such as supervised and unsupervised learning. It introduces reinforcement learning as a paradigm where models learn through interaction with an environment, mimicking the way humans learn by trial and error.
Practical Applications and Popular Platforms: The narrative extends to the practical realm, exploring typical tasks that machine learning excels at. It highlights the impact of machine learning in real-world scenarios, from image recognition and language processing to recommendation systems.
An overview of popular machine learning platforms, such as Microsoft Azure Machine Learning and Amazon SageMaker, provides insights into the tools that power machine learning endeavors. In the Python ecosystem, essential components for classical machine learning are introduced, showcasing the key role Python plays in developing and implementing machine learning solutions.