C2_W1_Assignment
July 5, 2024
In this exercise, you will use a neural network to recognize the hand-written digits zero and one.
2 Outline
• 1 - Packages
• 2 - Neural Networks
  – 2.1 Problem Statement
  – 2.2 Dataset
  – 2.3 Model representation
  – 2.4 Tensorflow Model Implementation
    ∗ Exercise 1
  – 2.5 NumPy Model Implementation (Forward Prop in NumPy)
    ∗ Exercise 2
  – 2.6 Vectorized NumPy Model Implementation (Optional)
    ∗ Exercise 3
  – 2.7 Congratulations!
  – 2.8 NumPy Broadcasting Tutorial (Optional)
NOTE: To prevent errors from the autograder, you are not allowed to edit or delete non-graded
cells in this notebook. Please also refrain from adding any new cells. Once you have passed
this assignment and want to experiment with any of the non-graded code, you may follow the
instructions at the bottom of this notebook.
## 1 - Packages
First, let’s run the cell below to import all the packages that you will need during this assignment.
• numpy is the fundamental package for scientific computing with Python.
• matplotlib is a popular library to plot graphs in Python.
• tensorflow is a popular platform for machine learning.
[2]: import numpy as np
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
import matplotlib.pyplot as plt
from autils import *
%matplotlib inline
import logging
logging.getLogger("tensorflow").setLevel(logging.ERROR)
tf.autograph.set_verbosity(0)
• The first part of the training set is a 1000 x 400 dimensional matrix X, where each row holds the 400 pixel values of one 20 × 20 image:

$$X = \begin{pmatrix} --- \left(x^{(1)}\right) --- \\ --- \left(x^{(2)}\right) --- \\ \vdots \\ --- \left(x^{(m)}\right) --- \end{pmatrix}$$
• The second part of the training set is a 1000 x 1 dimensional vector y that contains labels for
the training set
– y = 0 if the image is of the digit 0, y = 1 if the image is of the digit 1.
This is a subset of the MNIST handwritten digit dataset (http://yann.lecun.com/exdb/mnist/).
#### 2.2.1 View the variables
Let's get more familiar with your dataset.
• A good place to start is to print out each variable and see what it contains.
The code below prints elements of the variables X and y.
[4]: print ('The first element of X is: ', X[0])
The first element of X is:  [ … 400 floating-point pixel values, mostly 0.00000000e+00 with values up to about 1.0 along the digit's strokes; full printout omitted … ]
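It is also worth confirming the overall array shapes match the dataset description above; a quick illustrative check (not part of the graded code):

print('The shape of X is: ' + str(X.shape))   # expect (1000, 400)
print('The shape of y is: ' + str(y.shape))   # expect (1000, 1)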
m, n = X.shape
fig, axes = plt.subplots(8, 8, figsize=(8, 8))   # grid size is a display choice
for i, ax in enumerate(axes.flat):
    # Select random indices
    random_index = np.random.randint(m)
    # Reshape the 400-element row back into a 20x20 image and display it with its label
    X_random_reshaped = X[random_index].reshape((20, 20)).T
    ax.imshow(X_random_reshaped, cmap='gray')
    ax.set_title(y[random_index, 0])
    ax.set_axis_off()
### 2.3 Model representation
The neural network you will use in this assignment is shown in the figure below.
• This has three dense layers with sigmoid activations.
• Recall that our inputs are pixel values of digit images.
• Since the images are of size 20 × 20, this gives us 400 inputs.
• The parameters have dimensions that are sized for a neural network with 25 units in layer 1, 15 units in layer 2 and 1 output unit in layer 3.
  – Recall that the dimensions of these parameters are determined as follows:
    ∗ If a network has s_in units in a layer and s_out units in the next layer, then
      · W will be of dimension s_in × s_out.
      · b will be a vector with s_out elements.
  – Therefore, the shapes of W and b are:
    ∗ layer1: The shape of W1 is (400, 25) and the shape of b1 is (25,)
    ∗ layer2: The shape of W2 is (25, 15) and the shape of b2 is (15,)
    ∗ layer3: The shape of W3 is (15, 1) and the shape of b3 is (1,)

> Note: The bias vector b could be represented as a 1-D (n,) or 2-D (1,n) array. Tensorflow utilizes a 1-D representation and this lab will maintain that convention.
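As a quick arithmetic check on those shapes (an illustrative snippet, not part of the graded code), each layer contributes s_in × s_out weights plus s_out biases:

layer_shapes = [(400, 25), (25, 15), (15, 1)]
params = [s_in * s_out + s_out for s_in, s_out in layer_shapes]
print(params)        # [10025, 390, 16]
print(sum(params))   # 10431 parameters in total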
### 2.4 Tensorflow Model Implementation
Tensorflow models are built layer by layer. A layer's input dimensions (s_in above) are calculated for you. You specify a layer's output dimensions, and this determines the next layer's input dimension. The input dimension of the first layer is derived from the size of the input data specified in the model.fit statement below.

> Note: It is also possible to add an input layer that specifies the input dimension of the first layer. For example:
tf.keras.Input(shape=(400,)), #specify input shape
We will include that here to illuminate some model sizing.
### Exercise 1
Below, use the Keras Sequential model and Dense layers with sigmoid activations to construct the network described above.
[13]: # UNQ_C1
# GRADED CELL: Sequential model
model = Sequential(
[
tf.keras.Input(shape=(400,)), #specify input size
### START CODE HERE ###
Dense(units=25, activation='sigmoid'), # layer 1
Dense(units=15, activation='sigmoid'), # layer 2
Dense(units=1, activation='sigmoid') # layer 3
### END CODE HERE ###
], name = "my_model"
)
[14]: model.summary()
Model: "my_model"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
dense_4 (Dense) (None, 25) 10025
_________________________________________________________________
dense_5 (Dense) (None, 15) 390
_________________________________________________________________
dense_6 (Dense) (None, 1) 16
=================================================================
Total params: 10,431
Trainable params: 10,431
Non-trainable params: 0
_________________________________________________________________
Expected Output (Click to Expand)
The model.summary() function displays a useful summary of the model. Because we have specified an input layer size, the shapes of the weight and bias arrays are determined and the total number of parameters per layer can be shown. Note, the names of the layers may vary as they are auto-generated.
Model: "my_model"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
dense (Dense) (None, 25) 10025
_________________________________________________________________
dense_1 (Dense) (None, 15) 390
_________________________________________________________________
dense_2 (Dense) (None, 1) 16
=================================================================
Total params: 10,431
Trainable params: 10,431
Non-trainable params: 0
_________________________________________________________________
Click for hints
As described in the lecture:
model = Sequential(
[
tf.keras.Input(shape=(400,)), # specify input size (optional)
Dense(25, activation='sigmoid'),
Dense(15, activation='sigmoid'),
Dense(1, activation='sigmoid')
], name = "my_model"
)
[15]: # UNIT TESTS
from public_tests import *
test_c1(model)
We can examine details of the model by first extracting the layers with model.layers and then
extracting the weights with layerx.get_weights() as shown below.
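A minimal sketch of that inspection (assuming the three Dense layers defined above):

[layer1, layer2, layer3] = model.layers
W1, b1 = layer1.get_weights()
W2, b2 = layer2.get_weights()
W3, b3 = layer3.get_weights()
print(f"W1 shape = {W1.shape}, b1 shape = {b1.shape}")   # (400, 25), (25,)
print(f"W2 shape = {W2.shape}, b2 shape = {b2.shape}")   # (25, 15), (15,)
print(f"W3 shape = {W3.shape}, b3 shape = {b3.shape}")   # (15, 1), (1,)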
[20]: model.compile(
loss=tf.keras.losses.BinaryCrossentropy(),
optimizer=tf.keras.optimizers.Adam(0.001),
)
model.fit(
X,y,
epochs=20
)
Epoch 1/20
32/32 [==============================] - 0s 1ms/step - loss: 0.7338
Epoch 2/20
32/32 [==============================] - 0s 2ms/step - loss: 0.5769
Epoch 3/20
32/32 [==============================] - 0s 1ms/step - loss: 0.4521
Epoch 4/20
32/32 [==============================] - 0s 2ms/step - loss: 0.3432
Epoch 5/20
32/32 [==============================] - 0s 1ms/step - loss: 0.2593
Epoch 6/20
32/32 [==============================] - 0s 2ms/step - loss: 0.2001
Epoch 7/20
32/32 [==============================] - 0s 1ms/step - loss: 0.1591
Epoch 8/20
32/32 [==============================] - 0s 2ms/step - loss: 0.1299
Epoch 9/20
32/32 [==============================] - 0s 1ms/step - loss: 0.1084
Epoch 10/20
32/32 [==============================] - 0s 1ms/step - loss: 0.0920
Epoch 11/20
32/32 [==============================] - 0s 2ms/step - loss: 0.0793
Epoch 12/20
32/32 [==============================] - 0s 1ms/step - loss: 0.0692
Epoch 13/20
32/32 [==============================] - 0s 2ms/step - loss: 0.0612
Epoch 14/20
32/32 [==============================] - 0s 1ms/step - loss: 0.0547
Epoch 15/20
32/32 [==============================] - 0s 2ms/step - loss: 0.0494
Epoch 16/20
32/32 [==============================] - 0s 1ms/step - loss: 0.0449
Epoch 17/20
32/32 [==============================] - 0s 2ms/step - loss: 0.0411
Epoch 18/20
32/32 [==============================] - 0s 1ms/step - loss: 0.0379
Epoch 19/20
32/32 [==============================] - 0s 1ms/step - loss: 0.0351
Epoch 20/20
32/32 [==============================] - 0s 2ms/step - loss: 0.0327
To run the model on an example and make a prediction, use Keras predict. The input to predict is an array, so the single example is reshaped to be two-dimensional.
[21]: prediction = model.predict(X[0].reshape(1,400)) # a zero
print(f" predicting a zero: {prediction}")
prediction = model.predict(X[500].reshape(1,400)) # a one
print(f" predicting a one: {prediction}")
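The model outputs a sigmoid value between 0 and 1; a simple way to turn it into a 0/1 label (a small sketch, mirroring the thresholding used in the NumPy section below) is to apply a 0.5 threshold:

yhat = 1 if prediction[0, 0] >= 0.5 else 0   # prediction is a (1,1) array
print(f"prediction after threshold: {yhat}")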
m, n = X.shape
X_random_reshaped = X[random_index].reshape((20,20)).T
### 2.5 NumPy Model Implementation (Forward Prop in NumPy)
As described in lecture, it is possible to build your own dense layer using NumPy. This can then be utilized to build a multi-layer neural network.

### Exercise 2
Below, build a dense layer subroutine. The example in lecture utilized a for loop to visit each unit (j) in the layer, perform the dot product of the weights for that unit (W[:,j]) with the input, and add the bias for the unit (b[j]) to form z. An activation function g(z) is then applied to that result. This section will not utilize some of the matrix operations described in the optional lectures. These will be explored in a later section.
[24]: # UNQ_C2
# GRADED FUNCTION: my_dense
def my_dense(a_in, W, b, g):
    """
    Computes dense layer
    Args:
      a_in (ndarray (n, )) : Data, 1 example
      W    (ndarray (n,j)) : Weight matrix, n features per unit, j units
      b    (ndarray (j, )) : bias vector, j units
      g    activation function (e.g. sigmoid, relu..)
    Returns
      a_out (ndarray (j,)) : j units
    """
    units = W.shape[1]
    a_out = np.zeros(units)
    ### START CODE HERE ###
    for j in range(units):
        z = np.dot(a_in, W[:, j]) + b[j]
        a_out[j] = g(z)
    ### END CODE HERE ###
    return(a_out)
Click for hints

    for j in range(units):
        w =           # Select weights for unit j. These are in column j of W
        z =           # dot product of w and a_in + b[j]
        a_out[j] =    # apply activation to z
    return(a_out)
Click for more hints
def my_dense(a_in, W, b, g):
    """
    Computes dense layer
    Args:
      a_in (ndarray (n, )) : Data, 1 example
      W    (ndarray (n,j)) : Weight matrix, n features per unit, j units
      b    (ndarray (j, )) : bias vector, j units
      g    activation function (e.g. sigmoid, relu..)
    Returns
      a_out (ndarray (j,)) : j units
    """
    units = W.shape[1]
    a_out = np.zeros(units)
    for j in range(units):
        w = W[:,j]
        z = np.dot(w, a_in) + b[j]
        a_out[j] = g(z)
    return(a_out)
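A quick sanity check of my_dense on a tiny 2-feature, 3-unit layer (illustrative values, separate from the unit tests below):

x_tst = 0.1*np.arange(1,3,1).reshape(2,)    # 1 example, 2 features
W_tst = 0.1*np.arange(1,7,1).reshape(2,3)   # 2 input features, 3 units
b_tst = 0.1*np.arange(1,4,1).reshape(3,)    # 3 units
print(my_dense(x_tst, W_tst, b_tst, sigmoid))   # should return 3 activations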
[26]: # UNIT TESTS
test_c2(my_dense)
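The cells below use a my_sequential helper that chains three my_dense calls, one per layer. A minimal sketch of such a helper, assuming the W1_tmp, b1_tmp, … arguments are copies of the trained Tensorflow weights and sigmoid is the activation from autils:

def my_sequential(x, W1, b1, W2, b2, W3, b3):
    # Forward propagate one example through the three-layer network
    a1 = my_dense(x,  W1, b1, sigmoid)
    a2 = my_dense(a1, W2, b2, sigmoid)
    a3 = my_dense(a2, W3, b3, sigmoid)
    return(a3)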
if prediction >= 0.5:
    yhat = 1
else:
    yhat = 0
print( "yhat = ", yhat, " label= ", y[0,0])
prediction = my_sequential(X[500], W1_tmp, b1_tmp, W2_tmp, b2_tmp, W3_tmp, b3_tmp)
yhat = 0 label= 0
yhat = 1 label= 1
Run the following cell to see predictions from both the Numpy model and the Tensorflow model.
This takes a moment to run.
[30]: import warnings
warnings.simplefilter(action='ignore', category=FutureWarning)
# You do not need to modify anything in this cell
m, n = X.shape
# Display the label above the image
ax.set_title(f"{y[random_index,0]},{tf_yhat},{my_yhat}")
ax.set_axis_off()
fig.suptitle("Label, yhat Tensorflow, yhat Numpy", fontsize=16)
plt.show()
### 2.6 Vectorized NumPy Model Implementation (Optional)
The optional lectures described vector and matrix operations that can be used to speed up the calculations. Below describes a layer operation that computes the output for all units in a layer on a given input example:

We can demonstrate this using the first example from X and the W1, b1 parameters above. We use np.matmul to perform the matrix multiply. Note, the dimensions of x and W must be compatible as shown in the diagram above.
[31]: x = X[0].reshape(-1,1) # column vector (400,1)
z1 = np.matmul(x.T,W1) + b1 # (1,400)(400,25) = (1,25)
a1 = sigmoid(z1)
print(a1.shape)
(1, 25)
You can take this a step further and compute all the units for all examples in one Matrix-Matrix
operation.
The full operation is Z = XW + b. This will utilize NumPy broadcasting to expand b to m rows.
If this is unfamiliar, a short tutorial is provided at the end of the notebook.
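For instance, assuming W1 and b1 are the layer 1 parameters extracted earlier from the Tensorflow model, all of the examples can be pushed through layer 1 at once (an illustrative sketch):

Z1 = np.matmul(X, W1) + b1   # (1000,400)(400,25) + (25,) -> (1000,25)
A1 = sigmoid(Z1)
print(A1.shape)              # (1000, 25)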
### Exercise 3
Below, compose a new my_dense_v subroutine that performs the layer calculations for a matrix of
examples. This will utilize np.matmul().
Note: This function is not graded because it is discussed in the optional lectures on vectorization.
If you didn’t go through them, feel free to click the hints below the expected code to see the code.
You can also submit the notebook even with a blank answer here.
[32]: # UNQ_C3
# UNGRADED FUNCTION: my_dense_v
[[0.54735762 0.57932425 0.61063923]
[0.57199613 0.61301418 0.65248946]
[0.5962827 0.64565631 0.6921095 ]
[0.62010643 0.67699586 0.72908792]]
Expected Output
[[0.54735762 0.57932425 0.61063923]
[0.57199613 0.61301418 0.65248946]
[0.5962827 0.64565631 0.6921095 ]
[0.62010643 0.67699586 0.72908792]]
Click for hints
In matrix form, this can be written in one or two lines.
Z = np.matmul of A_in and W plus b
A_out is g(Z)
Click for code
def my_dense_v(A_in, W, b, g):
    """
    Computes dense layer
    Args:
      A_in (ndarray (m,n)) : Data, m examples, n features each
      W    (ndarray (n,j)) : Weight matrix, n features per unit, j units
      b    (ndarray (j,1)) : bias vector, j units
      g    activation function (e.g. sigmoid, relu..)
    Returns
      A_out (ndarray (m,j)) : m examples, j units
    """
    Z = np.matmul(A_in,W) + b
    A_out = g(Z)
    return(A_out)
[34]: # UNIT TESTS
test_c3(my_dense_v)
Let’s make a prediction with the new model. This will make a prediction on all of the examples at
once. Note the shape of the output.
[37]: Prediction = my_sequential_v(X, W1_tmp, b1_tmp, W2_tmp, b2_tmp, W3_tmp, b3_tmp )
Prediction.shape
[37]: (1000, 1)
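The Yhat array used below can be obtained by thresholding these probabilities at 0.5; a minimal sketch:

Yhat = (Prediction >= 0.5).astype(int)
print("predict a zero: ", Yhat[0], " predict a one: ", Yhat[500])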
m, n = X.shape
for i, ax in enumerate(axes.flat):
# Select random indices
random_index = np.random.randint(m)
You can see how one of the misclassified images looks.
[40]: fig = plt.figure(figsize=(1, 1))
errors = np.where(y != Yhat)
random_index = errors[0][0]
X_random_reshaped = X[random_index].reshape((20, 20)).T
plt.imshow(X_random_reshaped, cmap='gray')
plt.title(f"{y[random_index,0]}, {Yhat[random_index, 0]}")
plt.axis('off')
plt.show()
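It can also be useful to count how many of the 1000 examples were misclassified; a short sketch using the errors array above:

print(f"{len(errors[0])} errors out of {len(X)} images")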
### 2.7 Congratulations!
You have successfully built and utilized a neural network.
### 2.8 NumPy Broadcasting Tutorial (Optional)
In the last example, Z = XW + b utilized NumPy broadcasting to expand the vector b. If you are
not familiar with NumPy Broadcasting, this short tutorial is provided.
XW is a matrix-matrix operation with dimensions (m, j1)(j1, j2), which results in a matrix with dimension (m, j2). To that, we add a vector b with dimension (1, j2). b must be expanded to be an (m, j2) matrix for this element-wise operation to make sense. This expansion is accomplished for you by NumPy broadcasting.
Broadcasting applies to element-wise operations.
Its basic operation is to ‘stretch’ a smaller dimension by replicating elements to match a larger
dimension.
More specifically: When operating on two arrays, NumPy compares their shapes element-wise. It starts with the trailing (i.e. rightmost) dimensions and works its way left. Two dimensions are compatible when
• they are equal, or
• one of them is 1.
If these conditions are not met, a ValueError: operands could not be broadcast together exception
is thrown, indicating that the arrays have incompatible shapes. The size of the resulting array is
the size that is not 1 along each axis of the inputs.
Here are some examples:
The graphic below describes expanding dimensions:
Broadcast notionally expands arguments to match for element-wise operations.
The graphic shows NumPy expanding the arguments to match before the final operation. Note that this is a notional description; the actual mechanics of the NumPy operation choose the most efficient implementation.
For each of the following examples, try to guess the size of the result before running the example.
[41]: a = np.array([1,2,3]).reshape(-1,1) #(3,1)
b = 5
print(f"(a + b).shape: {(a + b).shape}, \na + b = \n{a + b}")
(a + b).shape: (3, 1),
a + b =
[[6]
[7]
[8]]
Note that this applies to all element-wise operations:
[42]: a = np.array([1,2,3]).reshape(-1,1) #(3,1)
b = 5
print(f"(a * b).shape: {(a * b).shape}, \na * b = \n{a * b}")
(a * b).shape: (3, 1),
a * b =
[[ 5]
[10]
[15]]
The next example adds a (1,3) row vector to a (4,1) column vector; both are broadcast to (4,3):
a = np.array([1,2,3,4]).reshape(-1,1) #(4,1)
b = np.array([1,2,3]).reshape(1,-1) #(1,3)
print(a)
print(b)
print(f"(a + b).shape: {(a + b).shape}, \na + b = \n{a + b}")
[[1]
[2]
[3]
[4]]
[[1 2 3]]
(a + b).shape: (4, 3),
a + b =
[[2 3 4]
[3 4 5]
[4 5 6]
[5 6 7]]
This is the scenario in the dense layer you built above: adding a 1-D vector b to an (m,j) matrix.
Matrix + 1-D Vector
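A tiny sketch of that case, with made-up shapes:

Z = np.zeros((4, 3))          # stands in for an (m, j) matrix of pre-activations
b = np.array([1., 2., 3.])    # 1-D bias vector with j elements
print((Z + b).shape)          # (4, 3): b is broadcast across the m rows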
Please click here if you want to experiment with any of the non-graded code.
Important Note: Please only do this when you’ve already passed the assignment to avoid problems with the autograder.
1. On the notebook’s menu, click “View” > “Cell Toolbar” > “Edit Metadata”
2. Hit the “Edit Metadata” button next to the code cell which you want to lock/unlock
3. Set the attribute value for “editable” to:
   • “true” if you want to unlock it
   • “false” if you want to lock it
4. On the notebook’s menu, click “View” > “Cell Toolbar” > “None”