# Machine Learning

This course provides a broad introduction to machine learning and statistical pattern recognition. Topics include supervised learning (generative/discriminative learning, parametric/non-parametric learning, neural networks, support vector machines) and unsupervised learning (clustering, dimensionality reduction, kernel methods).

#### Course Details

Duration: 30 hours
Effort: 5 hours per week

Price With GST: ₹23600/-

Subject: Data Science
Level: Beginner
##### Prerequisites
Students are expected to have the following background:

• Knowledge of basic computer science principles and skills, at a level sufficient to write a reasonably non-trivial computer program.
• Familiarity with probability theory.
• Familiarity with linear algebra.

### What you'll learn

• Introduction to the course
• What is machine learning?
• Supervised learning - introduction
• Unsupervised learning - introduction
• Linear Regression
• Linear regression - implementation (cost function)
• A deeper insight into the cost function - simplified cost function
• So no need to change alpha over time
• Linear regression with gradient descent
• Matrices - overview
• Vectors - overview
• Matrix manipulation
• Implementation/use
• Matrix multiplication properties
• Inverse and transpose operations
• Linear regression with multiple features
• Gradient descent for multiple variables
• Gradient Descent in practice 1: Feature Scaling
• Learning Rate α
• Features and polynomial regression
• Normal equation
• The problem of overfitting
• Cost function optimization for regularization
• Regularized linear regression
• Regularization with the normal equation
• Advanced optimization of regularized linear regression
• Neural networks - Overview and summary
• Model representation I
• Model representation II
• Neural network example - computing a complex, nonlinear function of the input
• Multiclass classification
• Neural network cost function
• Summary of what's about to go down
• Back propagation algorithm
• Back propagation intuition
• Implementation notes - unrolling parameters (matrices)
• Random initialization
• Putting it all together
• Deciding what to try next
• Evaluating a hypothesis
• Model selection and training validation test sets
• Diagnosis - bias vs. variance
• Regularization and bias/variance
• Learning curves
• Machine learning systems design
• Prioritizing what to work on - spam classification example
• Error metrics for skewed classes
• Trading off precision and recall
• Data for machine learning
• Support Vector Machine (SVM) - Optimization objective
• Large margin intuition
• Large margin classification mathematics (optional)
• Kernels I: Adapting SVM to non-linear classifiers
• Kernels II
• Unsupervised learning - introduction
• K-means algorithm
• K-means optimization objective
• How do we choose the number of clusters?
• Motivation 1: Data compression
• Motivation 2: Visualization
• Principal Component Analysis (PCA): Problem Formulation
• PCA Algorithm
• Reconstruction from Compressed Representation
• Choosing the number of Principal Components
• Anomaly detection - problem motivation
• The Gaussian distribution (optional)
• Anomaly detection algorithm
• Developing and evaluating an anomaly detection system
• Anomaly detection vs. supervised learning
• Choosing features to use
• Multivariate Gaussian distribution
• Applying multivariate Gaussian distribution to anomaly detection
• Recommender systems - introduction
• Content based recommendation
• Collaborative filtering - overview
• Vectorization: Low rank matrix factorization
• Implementation detail: Mean Normalization
• Learning with large datasets
• Online learning
• Map reduce and data parallelism
• Problem description and pipeline
• Sliding window image analysis
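Several of the early topics above (cost function, gradient descent, feature scaling, the learning rate α) come together in linear regression. The following is a minimal NumPy sketch of that workflow, not part of the course materials; the data, variable names, and hyperparameter values are illustrative choices only:

```python
import numpy as np

# Toy data: y ≈ 2x + 1 with a little noise (illustrative, not course data)
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=50)
y = 2 * X + 1 + rng.normal(0, 0.5, size=50)

# Feature scaling (mean normalization), one of the topics listed above
X_scaled = (X - X.mean()) / X.std()

theta0, theta1 = 0.0, 0.0  # model parameters
alpha = 0.1                # learning rate α

for _ in range(1000):
    pred = theta0 + theta1 * X_scaled
    # Gradients of the squared-error cost J = (1/2m) * sum((pred - y)^2)
    grad0 = (pred - y).mean()
    grad1 = ((pred - y) * X_scaled).mean()
    # Simultaneous gradient-descent update
    theta0 -= alpha * grad0
    theta1 -= alpha * grad1

cost = 0.5 * ((theta0 + theta1 * X_scaled - y) ** 2).mean()
```

With the feature scaled, a fixed α converges quickly; the final `cost` is close to the variance floor set by the noise.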
##### Course Video
• 005-Linear-Regression Preview
• 005-Linear-regression-implementation (cost function)
• 006-A-deeper-insight-into-the-cost-function-simplified-cost-function
• 008-So-no-need-to-change-alpha-over-time
• 010-Matrices-overview Preview
• 011-Vectors-overview
• 012-Matrix-manipulation
• 013-Implementation/use
• 014-Matrix-multiplication-properties
• 015-Inverse-and-transpose-operations
• 022-The-problem-of-overfitting Preview
• 023-Cost-function-optimization-for-regularization
• 024-Regularized-linear-regression
• 025-Regularization-with-the-normal-equation
• 032-Neural-network-cost-function Preview
• 034-Back-propagation-algorithm
• 035-Back-propagation-intuition
• 036-Implementation-notes-unrolling-parameters (matrices)
• 038-Random-initialization
• 039-Putting-it-all-together
• 040-Deciding what to try next Preview
• 041-Evaluating-a-hypothesis
• 042-Model-selection-and-training-validation-test-sets
• 042-Diagnosis-bias vs. variance
• 043-Regularization-and-bias/variance
• 043-Learning-curves
• 058-Motivation-1: Data-compression Preview
• 059-Motivation-2: Visualization
• 060-Principle-Component-Analysis-(PCA): Problem-Formulation
• 061-PCA-Algorithm
• 062-Reconstruction-from-Compressed-Representation
• 063-Choosing-the-number-of-Principle-Components
• 065-Anomaly-detection-problem-motivation Preview
• 066-The-Gaussian-distribution(optional)
• 067-Anomaly-detection-algorithm
• 068-Developing-and-evaluating-and-anomaly-detection-system
• 069-Anomaly-detection vs. supervised-learning
• 070-Choosing-features-to-use
• 071-Multivariate-Gaussian-distribution
• 072-Applying-multivariate-Gaussian-distribution-to-anomaly-detection
##### 1. What is the difference between supervised and unsupervised machine learning?

Supervised learning requires training on labeled data. For example, classification is a supervised learning task: before you can train a model to classify data into groups, you first need to label the examples it will learn from. Unsupervised learning, in contrast, does not require explicitly labeled data.
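The contrast can be made concrete with a toy sketch (illustrative only, not from the course): the supervised setting gets labels and can fit a decision rule from them, while the unsupervised setting sees the same points with no labels at all.

```python
import numpy as np

rng = np.random.default_rng(1)

# Supervised: every training example comes with a label.
X_labeled = np.concatenate([rng.normal(0, 1, 20), rng.normal(5, 1, 20)])
labels = np.array([0] * 20 + [1] * 20)

# A trivial classifier learned FROM the labels:
# the midpoint between the two class means.
threshold = (X_labeled[labels == 0].mean() + X_labeled[labels == 1].mean()) / 2

def predict(x):
    return int(x > threshold)

# Unsupervised: the same points, but no labels are provided;
# any structure (here, two clusters) must be discovered from X alone.
X_unlabeled = X_labeled.copy()
```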

##### 2. How is KNN different from k-means clustering?

The crucial difference is that K-Nearest Neighbors (KNN) is a supervised classification algorithm, whereas k-means is an unsupervised clustering algorithm. While the two may seem similar at first, for KNN to work you need labeled data into which an unlabeled point can be classified. K-means, by contrast, requires only a set of unlabeled points and the number of clusters k. The algorithm takes the unlabeled data and learns to cluster it into groups by repeatedly assigning each point to its nearest cluster centroid and recomputing each centroid as the mean of its assigned points.
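The difference described above can be sketched in a few lines of NumPy (illustrative 1-D implementations, not from the course; the data and function names are invented for this example):

```python
import numpy as np

def knn_predict(X_train, y_train, x, k=3):
    """Supervised: classify x by majority vote among its k labeled neighbours."""
    dists = np.abs(X_train - x)
    nearest_labels = y_train[np.argsort(dists)[:k]]
    return np.bincount(nearest_labels).argmax()

def kmeans(X, k=2, iters=20, seed=0):
    """Unsupervised: partition X into k clusters; no labels are used."""
    rng = np.random.default_rng(seed)
    centroids = rng.choice(X, size=k, replace=False)
    for _ in range(iters):
        # Assign each point to its nearest centroid...
        assign = np.abs(X[:, None] - centroids[None, :]).argmin(axis=1)
        # ...then move each centroid to the mean of its assigned points.
        centroids = np.array([X[assign == j].mean() for j in range(k)])
    return centroids, assign

X = np.array([0.9, 1.1, 1.0, 5.0, 5.2, 4.8])
y = np.array([0, 0, 0, 1, 1, 1])

label = knn_predict(X, y, 1.05)   # uses the labels y
centroids, assign = kmeans(X)     # ignores the labels entirely
```

Note that `knn_predict` never builds a model at all: it just consults the labeled training set at prediction time, whereas `kmeans` discovers the two groups from the unlabeled points alone.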