Machine Learning in Matlab - Module 2
Introduction
https://vimeo.com/204795912
bballPlayers.txt bballStats.txt
Use unsupervised learning techniques to group observations based on a set of explanatory variables
and discover natural patterns in a data set.
https://vimeo.com/204796757
similarity?
These players have played a different number of games. A better similarity measure will use
the statistics averaged over the number of games played.
Each statistic has different units and scales. In a distance measure, statistics with
larger numeric ranges contribute more to the distance and are therefore given more importance.
In this lesson, you will try to correct these two shortcomings of the distance measurement by:
calculating the statistics per game, and
normalizing the data so that each variable has zero mean and unit standard deviation.
Given a matrix of statistics, stats, and a vector containing the number of games played, GP, how
can you calculate the player statistics per game?
You will have to divide each row of the stats matrix by the corresponding element of the GP vector.
However, in MATLAB releases before R2016b, using the element-wise division operator ./ will generate
an error because the dimensions of stats and GP are not consistent.
bsxfun
In such cases, you can use the function bsxfun. This function replicates the inputs so that they
have the same size and then performs the specified operation.
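Applied to the per-game question above, a minimal sketch might look like the following. The values in stats and GP are made up for illustration, and GP is assumed to be a column vector with one entry per player.

```matlab
% Hypothetical totals for three players (rows) and two statistics (columns).
stats = [100 40; 90 30; 60 12];
% Hypothetical games played, one entry per player (column vector).
GP = [20; 30; 12];

% bsxfun expands GP across the columns of stats, then divides element-wise,
% so every row of stats is divided by that player's number of games.
perGame = bsxfun(@rdivide, stats, GP);
% perGame is [5 2; 3 1; 5 1]
```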
Consider a small example in which you need to compare a vector with a matrix.
You can use bsxfun as shown below. Note that @gt refers to the built-in ‘greater than’ function.
bsxfun works in the following way:
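As an illustration of such a comparison, here is a small sketch with made-up values. The names v, M, isGreater, and isGreaterExplicit are hypothetical and not taken from the lesson.

```matlab
% Hypothetical data: a 3-by-1 column vector and a 3-by-4 matrix.
v = [2; 5; 8];
M = [1 3 5 7; 2 4 6 8; 9 9 9 9];

% Conceptually, bsxfun replicates v into a 3-by-4 matrix and then applies
% @gt element by element, so isGreater(i,j) is true when M(i,j) > v(i).
isGreater = bsxfun(@gt, M, v);

% Replicating v explicitly with repmat gives the same logical matrix.
isGreaterExplicit = M > repmat(v, 1, 4);
```

Both results are 3-by-4 logical matrices. bsxfun avoids building the replicated copy of v by hand.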
Task 2
Subtract x from each column of y. Assign the result to a variable named z.
Use the bsxfun function with the function handle @minus to subtract the values.
>> results = bsxfun(@minus,A,B);
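A sketch of this pattern with made-up values follows. It assumes x is a column vector with as many rows as y, since the actual x and y from the exercise are not shown here.

```matlab
% Hypothetical inputs: x has one entry per row of y.
x = [1; 2; 3];
y = [10 20; 10 20; 10 20];

% Subtract x from every column of y.
z = bsxfun(@minus, y, x);
% z is [9 19; 8 18; 7 17]
```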
Normalizing Data
A common way to normalize raw data is to subtract the average value of a variable from each
element of the variable, then to scale it with a measure of spread, such as standard deviation.
If you just want to do the common normalization to zero mean and unit standard deviation, you
can use the zscore function.
>> Z = zscore(X)
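To see what zscore computes, the sketch below compares it with the manual subtract-and-scale approach on made-up data. It assumes the Statistics and Machine Learning Toolbox is installed (zscore belongs to that toolbox); the variable names are illustrative only.

```matlab
% Hypothetical data: three observations (rows) of two variables (columns).
X = [1 10; 2 20; 3 30];

mu = mean(X);      % row vector of column means
sigma = std(X);    % row vector of column standard deviations (N-1 normalization)

% Manual normalization: subtract each column's mean, then divide by its spread.
Zmanual = bsxfun(@rdivide, bsxfun(@minus, X, mu), sigma);

% zscore performs the same column-wise normalization in one call.
Z = zscore(X);
% Both Z and Zmanual are [-1 -1; 0 0; 1 1]
```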
Task 2
Use the zscore function to normalize the values in yDiv to zero mean and unit standard deviation. Name
the result yNorm.
Tasks
Task 1
The data table contains the information and statistics for several basketball players.
Particularly, the sixth variable, GP, contains the number of games played. Columns seven
through the end contain player statistics across all games.
Replace the statistics of each player (variable numbers seven onwards) by the statistics per game.
To extract the values in the table data to a numeric matrix, use curly braces to index into
columns 7:end.
Y = table{:,colVals};
Then, use the bsxfun function with the function handle @rdivide to divide each column in
the extracted data by the values in data.GP.
Y = bsxfun(@rdivide,Y,table.divisorVar);
Finally, index into data using curly braces to replace the old values.
table{:,colVals} = Y;
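The three steps can be tried on a small stand-in table, as sketched below. The table contents and the variable names Name, PTS, and AST are hypothetical; in the course data the statistics start at column 7 rather than column 3.

```matlab
% Hypothetical stand-in for the course table: a name, games played, two totals.
data = table({'A'; 'B'; 'C'}, [20; 30; 12], [100; 90; 60], [40; 30; 12], ...
    'VariableNames', {'Name', 'GP', 'PTS', 'AST'});

statCols = 3:width(data);           % 7:end in the course data

Y = data{:, statCols};              % extract the statistics as a numeric matrix
Y = bsxfun(@rdivide, Y, data.GP);   % divide each row by that player's games played
data{:, statCols} = Y;              % write the per-game values back into the table
```

Curly braces return the underlying numeric values, whereas parentheses would return a smaller table, which bsxfun cannot operate on.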
Task 2
Shift and scale columns seven through the end of data so that the values in each column are normalized
to zero mean and unit standard deviation.
You will overwrite the values in the table data, so use curly braces to index into columns 7:end.
Then, use the zscore function on that same data.
Y = table{:,colVals};
Y = zscore(Y);
table{:,colVals} = Y;
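One way to check the result is to confirm that every normalized column has a mean of approximately zero and a standard deviation of approximately one. The sketch below uses a small made-up table (hypothetical variable names, statistics starting at column 2 instead of 7) and assumes the Statistics and Machine Learning Toolbox is available for zscore.

```matlab
% Hypothetical per-game table: games played plus two statistics.
data = table([20; 30; 12], [5; 3; 5], [2; 1; 1], ...
    'VariableNames', {'GP', 'PTS', 'AST'});

statCols = 2:width(data);           % 7:end in the course data

% Normalize the statistic columns in place.
data{:, statCols} = zscore(data{:, statCols});

% Sanity check: column means near 0 and column standard deviations near 1.
colMeans = mean(data{:, statCols});
colStds = std(data{:, statCols});
```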