Data Science - Machine Learning
Perhaps the most popular data science methodologies come from machine learning.
What distinguishes machine learning from other computer-guided decision processes is
that it builds prediction algorithms using data. Some of the most popular products that
use machine learning include the handwriting readers implemented by the postal
service, speech recognition, movie recommendation systems, and spam detectors.
In this course, you will learn popular machine learning algorithms, principal component
analysis, and regularization by building a movie recommendation system. You will learn
about training data, a set of data used to discover potentially predictive relationships,
and how these data consist of the outcome we want to predict and the features
that we will use to predict it. As you build the movie recommendation
system, you will learn how to train algorithms using training data so you can predict the
outcome for future datasets. You will also learn about overtraining and techniques to
avoid it such as cross-validation. All of these skills are fundamental to machine learning.
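To make overtraining and cross-validation concrete, here is a minimal sketch in plain Python. It is an illustration only, not the course's own code, which works with the caret package. The synthetic data, the function names, and the candidate values of k are all assumptions made up for this example. It fits a nearest-neighbor predictor, shows that the error measured on the training data itself can be misleadingly low, and uses held-out folds to get a more honest estimate.

```python
import random


def make_data(n, seed=0):
    """Synthetic data (assumed for illustration): one feature x, outcome y = 2x + noise."""
    rng = random.Random(seed)
    xs = [rng.uniform(0, 10) for _ in range(n)]
    ys = [2 * x + rng.gauss(0, 1) for x in xs]
    return xs, ys


def knn_predict(train_x, train_y, x, k):
    """Predict the outcome at x as the mean outcome of the k nearest training points."""
    nearest = sorted(range(len(train_x)), key=lambda i: abs(train_x[i] - x))[:k]
    return sum(train_y[i] for i in nearest) / len(nearest)


def mse(train_x, train_y, test_x, test_y, k):
    """Mean squared error of k-nearest-neighbor predictions on (test_x, test_y)."""
    errors = [(knn_predict(train_x, train_y, x, k) - y) ** 2
              for x, y in zip(test_x, test_y)]
    return sum(errors) / len(errors)


def cross_validated_mse(xs, ys, k, n_folds=5):
    """Average held-out error: each fold is predicted by a model trained on the others."""
    n = len(xs)
    fold_errors = []
    for fold in range(n_folds):
        held_out = set(range(fold, n, n_folds))
        train_x = [xs[i] for i in range(n) if i not in held_out]
        train_y = [ys[i] for i in range(n) if i not in held_out]
        test_x = [xs[i] for i in sorted(held_out)]
        test_y = [ys[i] for i in sorted(held_out)]
        fold_errors.append(mse(train_x, train_y, test_x, test_y, k))
    return sum(fold_errors) / n_folds


xs, ys = make_data(100)

# With k = 1 every training point is its own nearest neighbor, so the
# training error is zero even though the data are noisy: overtraining.
training_error = mse(xs, ys, xs, ys, k=1)

# Cross-validation never scores a point with a model that has seen it.
cv_error = cross_validated_mse(xs, ys, k=1)

# Pick the number of neighbors with the lowest cross-validated error.
candidates = [1, 3, 5, 11, 21]
best_k = min(candidates, key=lambda k: cross_validated_mse(xs, ys, k))

print(training_error, cv_error, best_k)
```

The gap between `training_error` and `cv_error` is the symptom of overtraining. Choosing `best_k` by cross-validated error rather than by training error is one common way to avoid it.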
The class notes for this course series can be found in Professor Irizarry's freely
available Introduction to Data Science book.
Course overview
There are six major sections in this course: introduction to machine learning; machine
learning basics; linear regression for prediction, smoothing, and working with matrices;
distance, kNN, cross-validation, and generative models; classification with more than two
classes and the caret package; and model fitting and recommendation systems.
01. Welcome to Data Science - Machine Learning
Section 3: Linear Regression for Prediction, Smoothing, and Working with Matrices
In this section, you'll learn why linear regression is a useful baseline approach but is
often insufficiently flexible for more complex analyses, how to smooth noisy data, and
how to use matrices for machine learning.
Section 5: Classification with More than Two Classes and the Caret Package
In this section, you'll learn how to overcome the curse of dimensionality using methods
that adapt to higher dimensions and how to use the caret package to implement many
different machine learning algorithms.
Links:
HarvardX Professional Certificate in Data Science: https://www.edx.org/professional-certificate/harvardx-data-science