Deep Learning and TensorFlow
This experiment helps trainees understand the basic syntax of TensorFlow 2.x by introducing a series of tensor
operations of TensorFlow 2.x, including tensor creation, slicing and indexing, dimension modification,
arithmetic operations, and sorting.
1.1.2 Objectives
In TensorFlow, tensors are classified into constant tensors and variable tensors.
A defined constant tensor has an unchangeable value and dimension, and a defined variable tensor has
a changeable value and an unchangeable dimension.
In neural networks, variable tensors are generally used as matrices for storing weights and other
information, and are a type of trainable data. Constant tensors can be used for storing hyperparameters
or other structured data.
Step 1 tf.constant()
In [11]:
import tensorflow as tf
# Create a 2x2 matrix with values 1, 2, 3, and 4.
const_a = tf.constant([[1, 2, 3, 4]],shape=[2,2], dtype=tf.float32)
const_a
Out[11]:
<tf.Tensor: shape=(2, 2), dtype=float32, numpy=
array([[1., 2.],
[3., 4.]], dtype=float32)>
In [12]:
#View common attributes.
print("value of the constant const_a:", const_a.numpy())
print("data type of the constant const_a:", const_a.dtype)
print("shape of the constant const_a:", const_a.shape)
print("name of the device that is to generate the constant const_a:",
const_a.device)
value of the constant const_a: [[1. 2.]
[3. 4.]]
data type of the constant const_a: <dtype: 'float32'>
shape of the constant const_a: (2, 2)
name of the device that is to generate the constant const_a:
/job:localhost/replica:0/task:0/device:CPU:0
Step 2 tf.zeros(), tf.zeros_like(), tf.ones(), and tf.ones_like()
Usages of tf.ones() and tf.ones_like() are similar to those of tf.zeros() and tf.zeros_like().
Therefore, the following describes only the usages of tf.zeros() and tf.zeros_like().
In [13]:
# Create a 2x3 matrix with all values being 0.
zeros_b = tf.zeros(shape=[2, 3], dtype=tf.int32)
tf.zeros_like() creates a tensor whose values are all 0 based on the input tensor, with the same shape as the
input tensor.
dtype: A type for the returned Tensor. Must be float16, float32, float64, int8, uint8, int16, uint16, int32,
int64, complex64, complex128, bool or string (optional).
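As a minimal sketch of the calls described in this step (the variable names are illustrative and do not come from the original lab), tf.zeros_like() and the tf.ones() family might be used as follows:

```python
import tensorflow as tf

# Illustrative input tensor; only its shape and dtype matter here.
source = tf.constant([[1, 2, 3], [4, 5, 6]], dtype=tf.float32)

zeros_like_src = tf.zeros_like(source)             # shape (2, 3), float32, all 0
ones_2x3 = tf.ones(shape=[2, 3], dtype=tf.int32)   # 2x3 matrix of 1s
ones_like_src = tf.ones_like(source)               # shape (2, 3), float32, all 1

print(zeros_like_src.shape, zeros_like_src.dtype)
print(ones_2x3.numpy())
print(ones_like_src.numpy())
```

The *_like variants take the shape and data type from an existing tensor, which avoids repeating them by hand.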
dims: A 1-D sequence of non-negative numbers. Represents the shape of the output tf.Tensor. Entries
should be of type: int32, int64.
In [15]:
fill_d = tf.fill([3,3], 8) # Create a 3x3 matrix with all values being 8.
#View data.
fill_d.numpy()
Out[15]:
array([[8, 8, 8],
[8, 8, 8],
[8, 8, 8]], dtype=int32)
Step 4 tf.random
This module is used to generate a tensor with a specific distribution. Common methods in this module include
tf.random.uniform(), tf.random.normal(), and tf.random.shuffle(). The following describes how to use
tf.random.normal().
shape: A 1-D integer Tensor or Python array. The shape of the output tensor.
mean: A Tensor or Python value of type dtype, broadcastable with stddev. The mean of the normal
distribution.
stddev: A Tensor or Python value of type dtype, broadcastable with mean. The standard deviation of
the normal distribution.
seed: A Python integer. Used to create a random seed for the distribution. See tf.random.set_seed for
behavior.
In [16]:
random_e = tf.random.normal([5,5],mean=0,stddev=1.0, seed = 1)
#View the created data.
random_e.numpy()
Out[16]:
array([[-0.8113182 , 1.4845988 , 0.06532937, -2.4427042 , 0.0992484 ],
[ 0.5912243 , 0.59282297, -2.1229296 , -0.72289723, -0.05627038],
[ 0.6435448 , -0.26432407, 1.8566332 , 0.5678417 , -0.3828359 ],
[-1.4853433 , 1.2617711 , -0.02530608, -0.2646297 , 1.5328138 ],
[-1.7429771 , -0.43789294, -0.56601 , 0.32066926, 1.132831 ]],
dtype=float32)
Step 5 Create a Python list, and then convert the list into a tensor by using
tf.convert_to_tensor.
This method can convert a given value into a tensor. tf.convert_to_tensor can be used to convert a Python data
type into a tensor data type available to TensorFlow.
tf.convert_to_tensor(value,dtype=None,dtype_hint=None,name=None) :
dtype_hint: Optional element type for the returned tensor, used when dtype is None. In some cases, a
caller may not have a dtype in mind when converting to a tensor, so dtype_hint can be used as a soft
preference. If the conversion to dtype_hint is not possible, this argument has no effect.
In [17]:
#Create a list.
list_f = [1,2,3,4,5,6]
#View the data type.
type(list_f)
Out[17]:
list
In [18]:
tensor_f = tf.convert_to_tensor(list_f, dtype=tf.float32)
tensor_f
Out[18]:
<tf.Tensor: shape=(6,), dtype=float32, numpy=array([1., 2., 3., 4., 5., 6.],
dtype=float32)>
In TensorFlow, variables are created and manipulated through the tf.Variable class. A tf.Variable represents a
tensor whose value can be read and changed, for example by calling assign() or assign_add() on the
variable.
In [19]:
#Create a variable. Only the initial value needs to be provided.
var_1 = tf.Variable(tf.ones([2,3]))
var_1
Out[19]:
<tf.Variable 'Variable:0' shape=(2, 3) dtype=float32, numpy=
array([[1., 1., 1.],
[1., 1., 1.]], dtype=float32)>
In [20]:
#Read the variable value.
print("Value of the variable var_1:",var_1.read_value())
#Assign a variable value.
var_value_1=[[1,2,3],[4,5,6]]
var_1.assign(var_value_1)
print("Value of the variable var_1 after the assignment:",var_1.read_value())
Value of the variable var_1: tf.Tensor(
[[1. 1. 1.]
[1. 1. 1.]], shape=(2, 3), dtype=float32)
Value of the variable var_1 after the assignment: tf.Tensor(
[[1. 2. 3.]
[4. 5. 6.]], shape=(2, 3), dtype=float32)
In [21]:
#Variable addition
var_1.assign_add(tf.ones([2,3]))
var_1
Out[21]:
<tf.Variable 'Variable:0' shape=(2, 3) dtype=float32, numpy=
array([[2., 3., 4.],
[5., 6., 7.]], dtype=float32)>
1.2.1.2.1 Slicing
[start: end]: extracts a data slice from the start position to the end position of the tensor.
[start:end:step] or [::step]: extracts a data slice at an interval of step from the start position to the end
position of the tensor.
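The slicing syntax above can be tried on a small batch of random images. The following is a sketch only; the tensor name images and the chosen ranges are assumptions made for illustration:

```python
import tensorflow as tf

# Illustrative stand-in for a batch of four 100 x 100 three-channel images.
images = tf.random.normal([4, 100, 100, 3])

first_image = images[0, :, :, :]      # index 0 on the batch axis
every_other = images[::2, ...]        # step of 2 on the batch axis (images 0 and 2)
top_left = images[:, 0:10, 0:10, :]   # rows and columns 0..9 of every image
last_channel = images[..., -1]        # last channel of every image

for name, t in [("first_image", first_image), ("every_other", every_other),
                ("top_left", top_left), ("last_channel", last_channel)]:
    print(name, t.shape)
```

An ellipsis (...) stands for all dimensions that are not listed explicitly, which keeps slices of high-dimensional tensors short.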
In [22]:
#Create a 4-dimensional tensor. The tensor contains four images. The size of
#each image is 100 x 100 x 3.
tensor_h = tf.random.normal([4,100,100,3])
tensor_h
Out[22]:
<tf.Tensor: shape=(4, 100, 100, 3), dtype=float32, numpy=
array([[[[-0.42564863, 0.452668 , -0.29914168],
[-1.2919582 , 1.8550993 , -0.05315711],
[-1.1958379 , -1.214941 , 0.36617145],
...,
[ 0.58726525, 0.70759076, -0.46401304],
[-0.5949442 , 0.60752153, -0.95319635],
[-0.2574743 , 0.24064586, 0.34597343]],
...,
1.2.1.2.2 Indexing
In [27]:
#Extract the first, second, and fourth images from tensor_h ([4,100,100,3]).
indices = [0,1,3]
tf.gather(tensor_h,axis=0,indices=indices)
Out[27]:
<tf.Tensor: shape=(3, 100, 100, 3), dtype=float32, numpy=
array([[[[-0.42564863, 0.452668 , -0.29914168],
[-1.2919582 , 1.8550993 , -0.05315711],
[-1.1958379 , -1.214941 , 0.36617145],
...,
[ 0.58726525, 0.70759076, -0.46401304],
[-0.5949442 , 0.60752153, -0.95319635],
[-0.2574743 , 0.24064586, 0.34597343]],
...,
...,
In [28]:
# Extract the pixel at position [1,1] in the first channel of the first image
# and the pixel at position [2,2] in the first channel of the second image in
# tensor_h ([4,100,100,3]).
indices = [[0,1,1,0],[1,2,2,0]]
tf.gather_nd(tensor_h,indices=indices)
Out[28]:
<tf.Tensor: shape=(2,), dtype=float32, numpy=array([ 1.0574174 , -0.89853895], dtype=float32)>
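Because tensor_h holds random values, the effect of the index arguments is hard to read from the output. The following sketch repeats both calls on a small tensor with known values; the tensor grid is a hypothetical example, not part of the original lab:

```python
import tensorflow as tf

# Small 3x4 tensor with known values: row r holds 10*r + column index.
grid = tf.constant([[0, 1, 2, 3],
                    [10, 11, 12, 13],
                    [20, 21, 22, 23]])

rows = tf.gather(grid, indices=[0, 2], axis=0)         # whole rows 0 and 2
cols = tf.gather(grid, indices=[1, 3], axis=1)         # whole columns 1 and 3
cells = tf.gather_nd(grid, indices=[[0, 1], [2, 3]])   # elements (0,1) and (2,3)

print(rows.numpy())   # [[ 0  1  2  3] [20 21 22 23]]
print(cols.numpy())   # [[ 1  3] [11 13] [21 23]]
print(cells.numpy())  # [ 1 23]
```

tf.gather selects whole slices along one axis, whereas each entry of indices in tf.gather_nd is a full coordinate into the tensor.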
In [29]:
const_d_1 = tf.constant([[1, 2, 3, 4]],shape=[2,2], dtype=tf.float32)
#Three common methods for displaying a dimension:
print(const_d_1.shape)
print(const_d_1.get_shape())
#The output is a tensor. The value of the tensor indicates the size of the
#tensor dimension to be displayed.
print(tf.shape(const_d_1))
(2, 2)
(2, 2)
tf.Tensor([2 2], shape=(2,), dtype=int32)
As described above, .shape and .get_shape() return TensorShape objects, and tf.shape(x) returns Tensor
objects.
tf.reshape(tensor,shape,name=None):
tensor: input tensor
shape: dimension of the reshaped tensor
In [30]:
reshape_1 = tf.constant([[1,2,3],[4,5,6]])
print(reshape_1)
tf.reshape(reshape_1, (3,2))
tf.Tensor(
[[1 2 3]
[4 5 6]], shape=(2, 3), dtype=int32)
Out[30]:
<tf.Tensor: shape=(3, 2), dtype=int32, numpy=
array([[1, 2],
[3, 4],
[5, 6]], dtype=int32)>
tf.expand_dims(input,axis,name=None):
In [31]:
#Generate a 100 x 100 x 3 tensor to represent a 100 x 100 three-channel color
#image.
expand_sample_1 = tf.random.normal([100,100,3], seed=1)
print("size of the original data:",expand_sample_1.shape)
print("add a dimension before the first dimension (axis = 0):",
      tf.expand_dims(expand_sample_1, axis=0).shape)
print("add a dimension before the second dimension (axis = 1):",
      tf.expand_dims(expand_sample_1, axis=1).shape)
print("add a dimension after the last dimension (axis = -1):",
      tf.expand_dims(expand_sample_1, axis=-1).shape)
size of the original data: (100, 100, 3)
add a dimension before the first dimension (axis = 0): (1, 100, 100, 3)
add a dimension before the second dimension (axis = 1): (100, 1, 100, 3)
add a dimension after the last dimension (axis = -1): (100, 100, 3, 1)
tf.squeeze(input,axis=None,name=None):
In [32]:
#Generate a 1 x 100 x 100 x 3 tensor to represent a batch that contains one
#100 x 100 three-channel color image.
squeeze_sample_1 = tf.random.normal([1,100,100,3])
print("size of the original data:",squeeze_sample_1.shape)
squeezed_sample_1 = tf.squeeze(squeeze_sample_1)
print("data size after dimension squeezing:",squeezed_sample_1.shape)
size of the original data: (1, 100, 100, 3)
data size after dimension squeezing: (100, 100, 3)
1.2.1.3.5 Transpose
tf.transpose(a,perm=None,conjugate=False,name='transpose'):
a: input tensor
perm: permutation of the dimensions of a, generally used to transpose high-dimensional arrays
conjugate: if set to True for a complex-valued tensor, the result is the conjugate transpose.
name: name of the operation
In [33]:
#Input the tensor to be transposed, and call tf.transpose.
trans_sample_1 = tf.constant([1,2,3,4,5,6],shape=[2,3])
print("size of the original data:",trans_sample_1.shape)
transposed_sample_1 = tf.transpose(trans_sample_1)
print("size of transposed data:",transposed_sample_1.shape)
size of the original data: (2, 3)
size of transposed data: (3, 2)
perm is required for transposing high-dimensional data and specifies the new order of the dimensions of the
input tensor. The original dimension order of a three-dimensional tensor is [0, 1, 2] (perm), indicating the
length, width, and height of the high-dimensional data, respectively.
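A short sketch of perm on a four-dimensional tensor follows. The tensor batch and its sizes are assumptions chosen so that every axis has a different length and the reordering is visible in the shape:

```python
import tensorflow as tf

# Illustrative tensor: 4 images, 100 rows, 200 columns, 3 channels.
batch = tf.random.normal([4, 100, 200, 3])

# perm lists, for each output axis, which input axis it is taken from.
swapped_hw = tf.transpose(batch, perm=[0, 2, 1, 3])       # swap rows and columns
channels_first = tf.transpose(batch, perm=[0, 3, 1, 2])   # move channels to axis 1

print(swapped_hw.shape)       # (4, 200, 100, 3)
print(channels_first.shape)   # (4, 3, 100, 200)
```

When perm is omitted, tf.transpose reverses the order of all dimensions, which is the ordinary matrix transpose for a two-dimensional tensor.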
tf.broadcast_to(input,shape,name=None):
In [35]:
broadcast_sample_1 = tf.constant([1,2,3,4,5,6])
print("original data:",broadcast_sample_1.numpy())
broadcasted_sample_1 = tf.broadcast_to(broadcast_sample_1,shape=[4,6])
print("broadcasted data:",broadcasted_sample_1.numpy())
original data: [1 2 3 4 5 6]
broadcasted data: [[1 2 3 4 5 6]
[1 2 3 4 5 6]
[1 2 3 4 5 6]
[1 2 3 4 5 6]]
In [36]:
#During the operation, if two arrays have different shapes, TensorFlow
#automatically triggers the broadcast mechanism as NumPy does.
a = tf.constant([[ 0, 0, 0],
[10,10,10],
[20,20,20],
[30,30,30]])
b = tf.constant([1,2,3])
print(a + b)
tf.Tensor(
[[ 1 2 3]
[11 12 13]
[21 22 23]
[31 32 33]], shape=(4, 3), dtype=int32)
Main arithmetic operations include addition (tf.add), subtraction (tf.subtract), multiplication (tf.multiply),
division (tf.divide), logarithm (tf.math.log), and powers (tf.pow). The following shows only an addition
example.
In [37]:
a = tf.constant([[3, 5], [4, 8]])
b = tf.constant([[1, 6], [2, 9]])
print(tf.add(a, b))
tf.Tensor(
[[ 4 11]
[ 6 17]], shape=(2, 2), dtype=int32)
tf.argmax(input,axis):
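The example for tf.argmax is not shown here. As a minimal sketch, assuming a small hand-written matrix named scores, the call returns the index of the largest value along the given axis:

```python
import tensorflow as tf

scores = tf.constant([[1, 3, 2],
                      [2, 5, 8],
                      [7, 4, 9]])

col_argmax = tf.argmax(scores, axis=0)   # row index of the maximum in each column
row_argmax = tf.argmax(scores, axis=1)   # column index of the maximum in each row

print(col_argmax.numpy())   # [2 1 2]
print(row_argmax.numpy())   # [1 2 2]
```

With axis=1 on a batch of class probabilities, this is the usual way to turn model outputs into predicted class labels.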
In TensorFlow, a series of operations of tf.reduce_* reduce tensor dimensions. The series of operations can be
performed on dimensional elements of a tensor, for example, calculating the mean value by row and
calculating a product of all elements in the tensor.
The methods for using these operations are similar. The following describes how to use tf.reduce_sum.
In [40]:
reduce_sample_1 = tf.constant([1,2,3,4,5,6],shape=[2,3])
print("original data",reduce_sample_1.numpy())
print("calculate the sum of all elements in the tensor (axis = None):",
      tf.reduce_sum(reduce_sample_1,axis=None).numpy())
print("calculate the sum of elements in each column (axis = 0):",
      tf.reduce_sum(reduce_sample_1,axis=0).numpy())
print("calculate the sum of elements in each row (axis = 1):",
      tf.reduce_sum(reduce_sample_1,axis=1).numpy())
original data [[1 2 3]
[4 5 6]]
calculate the sum of all elements in the tensor (axis = None): 21
calculate the sum of elements in each column (axis = 0): [5 7 9]
calculate the sum of elements in each row (axis = 1): [ 6 15]
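The other tf.reduce_* operations mentioned above follow the same pattern. The sketch below uses a hypothetical tensor named data to show the mean by row, the product of all elements, and the keepdims option:

```python
import tensorflow as tf

data = tf.constant([[1., 2., 3.],
                    [4., 5., 6.]])

row_means = tf.reduce_mean(data, axis=1)             # mean of each row: [2. 5.]
total_product = tf.reduce_prod(data)                 # product of all elements: 720.0
col_max = tf.reduce_max(data, axis=0)                # maximum of each column: [4. 5. 6.]
kept = tf.reduce_sum(data, axis=1, keepdims=True)    # shape (2, 1) instead of (2,)

print(row_means.numpy(), total_product.numpy(), col_max.numpy(), kept.shape)
```

Setting keepdims=True keeps the reduced axis with length 1, which is convenient when the result is to be broadcast against the original tensor.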
In [41]:
concat_sample_1 = tf.random.normal([4,100,100,3])
concat_sample_2 = tf.random.normal([40,100,100,3])
print("sizes of the original data:",
      concat_sample_1.shape,concat_sample_2.shape)
concated_sample_1 = tf.concat([concat_sample_1,concat_sample_2],axis=0)
print("size of the concatenated data:",concated_sample_1.shape)
sizes of the original data: (4, 100, 100, 3) (40, 100, 100, 3)
size of the concatenated data: (44, 100, 100, 3)
A dimension can be added to an original matrix in the same way. axis determines the position of the
dimension.
values: A list of Tensor objects with the same shape and type.
axis: An int. The axis to stack along. Defaults to the first dimension. Negative values wrap around, so
the valid range is [-(R+1), R+1).
name: A name for this operation (optional).
In [42]:
stack_sample_1 = tf.random.normal([100,100,3])
stack_sample_2 = tf.random.normal([100,100,3])
print("sizes of the original data: ",stack_sample_1.shape,
stack_sample_2.shape)
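The listing above stops before the stacking call itself. A minimal sketch of how the two tensors might be stacked is given below; the name stacked_sample_1 is taken from the tf.unstack cell that follows, and everything else is an assumption:

```python
import tensorflow as tf

stack_sample_1 = tf.random.normal([100, 100, 3])
stack_sample_2 = tf.random.normal([100, 100, 3])

# Stack along a new leading axis: two (100, 100, 3) tensors become (2, 100, 100, 3).
stacked_sample_1 = tf.stack([stack_sample_1, stack_sample_2], axis=0)
print("size of the stacked data:", stacked_sample_1.shape)

# With axis=-1 the new axis is appended at the end instead: (100, 100, 3, 2).
stacked_last = tf.stack([stack_sample_1, stack_sample_2], axis=-1)
print("stacked along the last axis:", stacked_last.shape)
```

Unlike tf.concat, which joins tensors along an existing axis, tf.stack always creates a new axis, so all inputs must have exactly the same shape.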
In [43]:
#Split data based on the first dimension and output the split data in a list.
tf.unstack(stacked_sample_1,axis=0)
Out[43]:
[<tf.Tensor: shape=(100, 100, 3), dtype=float32, numpy=
array([[[-3.97027731e-01, 6.74600065e-01, -9.82945263e-01],
[ 9.37624693e-01, -9.55090046e-01, -5.13697684e-01],
[-9.67928052e-01, 1.53015286e-01, 3.64312351e-01],
...,
[-1.03265333e+00, 1.74107659e+00, -1.64016807e+00],
[-1.75055861e+00, 3.47322911e-01, 9.39206481e-01],
[-1.96195650e-03, 1.39943630e-01, 1.55450010e+00]],
...,
...,
In [44]:
import numpy as np
split_sample_1 = tf.random.normal([10,100,100,3])
print("size of the original data:",split_sample_1.shape)
splited_sample_1 = tf.split(split_sample_1, num_or_size_splits=5,axis=0)
print("size of the split data when num_or_size_splits is set to 5:",
      np.shape(splited_sample_1))
splited_sample_2 = tf.split(split_sample_1,
                            num_or_size_splits=[3,5,2],axis=0)
print("sizes of the split data when num_or_size_splits is set to [3,5,2]:",
      np.shape(splited_sample_2[0]),
      np.shape(splited_sample_2[1]),
      np.shape(splited_sample_2[2]))
size of the original data: (10, 100, 100, 3)
size of the split data when num_or_size_splits is set to 5: (5, 2, 100, 100, 3)
sizes of the split data when num_or_size_splits is set to [3,5,2]: (3, 100, 100, 3) (5, 100, 100, 3) (2, 100, 100, 3)
tf.sort(): sorts a tensor in ascending or descending order and returns the sorted tensor.
tf.argsort(): sorts a tensor in ascending or descending order and returns the indexes of the sorted elements.
tf.nn.top_k(): returns the k largest values and their indexes.
tf.sort/argsort(input, direction, axis):
input: input tensor
direction: sorting order, which can be set to DESCENDING (descending order) or ASCENDING
(ascending order). The default value is ASCENDING.
axis: dimension along which to sort. The default value of axis is -1, indicating the last
dimension.
In [45]:
sort_sample_1 = tf.random.shuffle(tf.range(10))
print("input tensor:",sort_sample_1.numpy())
sorted_sample_1 = tf.sort(sort_sample_1, direction="ASCENDING")
print("tensor sorted in ascending order:",sorted_sample_1.numpy())
sorted_sample_2 = tf.argsort(sort_sample_1,direction="ASCENDING")
print("indexes of elements in ascending order:",sorted_sample_2.numpy())
input tensor: [9 2 8 3 6 1 4 5 0 7]
tensor sorted in ascending order: [0 1 2 3 4 5 6 7 8 9]
indexes of elements in ascending order: [8 5 1 3 6 7 4 9 2 0]
tf.nn.top_k(input,k=1,sorted=True,name=None):
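As a minimal sketch of this call, assuming the same ten values as the sorting example above, tf.nn.top_k returns both the k largest values and their positions in the input:

```python
import tensorflow as tf

values = tf.constant([9, 2, 8, 3, 6, 1, 4, 5, 0, 7])

# The result has two fields: values (the k largest elements, largest first
# because sorted=True) and indices (their positions in the input).
top = tf.nn.top_k(values, k=3, sorted=True)

print("top 3 values:", top.values.numpy())     # [9 8 7]
print("their indexes:", top.indices.numpy())   # [0 2 9]
```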
The eager execution mode of TensorFlow is a type of imperative programming, which is the same as native
Python. When you perform a particular operation, the system immediately returns a result.
Graph mode:
TensorFlow 1.x adopts the graph mode: you first build a computational graph, then start a session, and finally
feed actual data to obtain a result.
In eager execution mode, code debugging is easier, but the code execution efficiency is lower.
The following implements simple multiplication by using TensorFlow to compare the differences between the
eager execution mode and the graph mode.
In [47]:
x = tf.ones((2, 2), dtype=tf.dtypes.float32)
y = tf.constant([[1, 2],
[3, 4]], dtype=tf.dtypes.float32)
z = tf.matmul(x, y)
print(z)
tf.Tensor(
[[4. 6.]
[4. 6.]], shape=(2, 2), dtype=float32)
In [48]:
#Use the syntax of TensorFlow 1.x in TensorFlow 2.x.
# You can install the v1 compatibility package in TensorFlow 2.0
# to inherit the TensorFlow 1.x code and disable the eager execution mode.
import tensorflow.compat.v1 as tf
tf.disable_eager_execution()
A function decorated with tf.function can still be called like any other function. tf.function compiles the
function into a graph, so that it can run more efficiently on a GPU or TPU. In this case, the function becomes
an operation in TensorFlow and can be called directly to output a return value. However, the function is
executed in graph mode, so intermediate variable values cannot be viewed directly.
In [3]:
@tf.function
def simple_nn_layer(w,x,b):
    print(b)
    return tf.nn.relu(tf.matmul(w, x)+b)
w = tf.random.uniform((3, 3))
x = tf.random.uniform((3, 3))
b = tf.constant(0.5, dtype='float32')
simple_nn_layer(w,x,b)
Tensor("b:0", shape=(), dtype=float32)
Out[3]:
<tf.Tensor: shape=(3, 3), dtype=float32, numpy=
array([[1.1031461 , 1.1353902 , 0.8399954 ],
[1.1968153 , 1.2687542 , 0.96577394],
[1.1476098 , 1.2371151 , 0.9468075 ]], dtype=float32)>
According to the output result, the value of b in the function cannot be viewed directly, but the return value
can be viewed using .numpy().
The following compares the performance of the graph mode and eager execution mode by performing the
same operation.
In [6]:
import timeit
x = tf.random.uniform(shape=[20, 20], minval=-1, maxval=2,
dtype=tf.dtypes.int32)
@tf.function
def power_graph(x, y):
    return power(x, y)
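The listing above is cut off before power is defined and before any timing is done. The following self-contained sketch shows one way such a comparison might look. The body of power (repeated matrix multiplication), the exponent 50, and the repetition count are assumptions, not the original lab code:

```python
import timeit
import tensorflow as tf

x = tf.random.uniform(shape=[20, 20], minval=-1, maxval=2,
                      dtype=tf.dtypes.int32)

def power(x, y):
    # Hypothetical body: multiply by x a total of y times, starting from the identity.
    result = tf.eye(20, dtype=tf.dtypes.int32)
    for _ in range(y):
        result = tf.matmul(x, result)
    return result

@tf.function
def power_graph(x, y):
    return power(x, y)

# Call the graph version once so that tracing is not included in the timing.
power_graph(x, 50)

eager_time = timeit.timeit(lambda: power(x, 50), number=10)
graph_time = timeit.timeit(lambda: power_graph(x, 50), number=10)
print("eager execution:", eager_time)
print("graph execution:", graph_time)
```

The measured times depend on the hardware and on the size of the computation, so no fixed ratio should be expected; for very small functions the graph version may not be faster at all.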
2.2 Objectives
Upon completion of this task, you will be able to master the common deep learning modeling interfaces in
tf.keras.
The most common way to build a model is to stack layers by using tf.ke
In [7]:
import tensorflow.keras.layers as layers
model = tf.keras.Sequential()
model.add(layers.Dense(32, activation='relu'))
model.add(layers.Dense(32, activation='relu'))
model.add(layers.Dense(10, activation='softmax'))
Functional models are mainly built by using tf.keras.Input and tf.keras.Model. They are more complex than
tf.keras.Sequential but more flexible: inputs can be fed in at the same time or at different stages, and
outputs can be produced at different stages. Functional models are preferred if more than one model output is
needed.
The tf.keras.Sequential model is a simple stack of layers that cannot represent arbitrary models. You can use
the Keras functional API to build complex model topologies such as:
Multi-input models
Multi-output models
Models with shared layers
Models with non-sequential data flows (for example, residual connections)
In [8]:
#Use the output of the previous layer as the input of the next layer.
x = tf.keras.Input(shape=(32,))
h1 = layers.Dense(32, activation='relu')(x)
h2 = layers.Dense(32, activation='relu')(h1)
y = layers.Dense(10, activation='softmax')(h2)
model_sample_2 = tf.keras.models.Model(x, y)
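The model above still has a single input and a single output. As a sketch of the more complex topologies listed earlier, the following hypothetical model combines two inputs, a residual connection, and two outputs. All layer sizes and names are assumptions made for illustration:

```python
import tensorflow as tf
from tensorflow.keras import layers

# Two hypothetical inputs: a 32-feature vector and an 8-feature vector.
input_a = tf.keras.Input(shape=(32,), name="input_a")
input_b = tf.keras.Input(shape=(8,), name="input_b")

h = layers.Dense(32, activation='relu')(input_a)
h = layers.Add()([h, input_a])                  # residual connection around the Dense layer
merged = layers.Concatenate()([h, input_b])     # join both branches: 32 + 8 = 40 features

class_out = layers.Dense(10, activation='softmax', name="class_out")(merged)
score_out = layers.Dense(1, name="score_out")(merged)

multi_model = tf.keras.Model(inputs=[input_a, input_b],
                             outputs=[class_out, score_out])

# Run the model on a dummy batch of 4 samples to check the output shapes.
class_pred, score_pred = multi_model([tf.zeros([4, 32]), tf.zeros([4, 8])])
print(class_pred.shape, score_pred.shape)   # (4, 10) (4, 1)
```

None of these connections can be expressed with tf.keras.Sequential, because the data no longer flows through a single chain of layers.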
The tf.keras.layers module is used to configure neural network layers. Common classes include:
activation: sets the activation function for the layer. By default, the system applies no activation
function.
kernel_initializer and bias_initializer: initialization schemes that create the layer's weights (kernel and
bias). This defaults to the Glorot uniform initializer.
kernel_regularizer and bias_regularizer: regularization schemes that apply to the layer's weights
(kernel and bias), such as L1 or L2 regularization. By default, the system applies no regularization
function.
2.3.1.3.1 tf.keras.layers.Dense
In [9]:
#Create a fully connected layer that contains 32 neurons. Set the activation
#function to sigmoid.
#The activation parameter can be set to a function name string (for example,
#'sigmoid') or a function object (for example, tf.sigmoid).
layers.Dense(32, activation='sigmoid')
layers.Dense(32, activation=tf.sigmoid)
#Set kernel_initializer.
layers.Dense(32, kernel_initializer=tf.keras.initializers.he_normal)
#Set kernel_regularizer to L2 regularization.
layers.Dense(32, kernel_regularizer=tf.keras.regularizers.l2(0.01))
Out[9]:
<tensorflow.python.keras.layers.core.Dense at 0x7fa3246e8e90>
2.3.1.3.2 tf.keras.layers.Conv2D
pool_size: size of the pooling window. For example, if (2, 2) is used with the default strides, the output is
half the size of the input in both spatial dimensions. If this parameter is set to a single integer, the
same window length is used for all dimensions.
strides: Integer, tuple of 2 integers, or None. Strides values. Specifies how far the pooling window
moves for each pooling step. If None, it will default to pool_size.
padding: One of "valid" or "same" (case-insensitive). "valid" adds no zero padding. "same" adds
padding such that if the stride is 1, the output shape is the same as input shape.
data_format: A string, one of channels_last (default) or channels_first. The ordering of the dimensions
in the inputs. channels_last corresponds to inputs with shape (batch, height, width, channels) while
channels_first corresponds to inputs with shape (batch, channels, height, width). It defaults to the
image_data_format value found in your Keras config file at ~/.keras/keras.json. If you never set it,
then it will be "channels_last".
In [11]:
layers.MaxPool2D(pool_size=(2,2),strides=(2,1))
Out[11]:
<tensorflow.python.keras.layers.pooling.MaxPooling2D at 0x7fa324533510>
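To see what pool_size and strides do, the layer can be applied to a tiny input with known values. The 4 x 4 image below is a hypothetical example, not part of the original lab:

```python
import tensorflow as tf
from tensorflow.keras import layers

# One 4 x 4 single-channel image with values 0..15 (channels_last layout).
image = tf.reshape(tf.range(16, dtype=tf.float32), [1, 4, 4, 1])

# A 2 x 2 window that moves 2 rows down and 1 column right per step.
pooled = layers.MaxPool2D(pool_size=(2, 2), strides=(2, 1))(image)

print(pooled.shape)                  # (1, 2, 3, 1)
print(pooled[0, :, :, 0].numpy())    # [[ 5.  6.  7.] [13. 14. 15.]]
```

With the default "valid" padding, each output dimension is floor((input - pool) / stride) + 1, which gives 2 rows and 3 columns here.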
2.3.1.3.4 tf.keras.layers.LSTM/tf.keras.layers.LSTMCell
In [12]:
import numpy as np
In [13]:
#LSTM
tf.keras.layers.LSTM(16, return_sequences=True)
#LSTMCell
x = tf.keras.Input((None, 3))
y = layers.RNN(layers.LSTMCell(16))(x)
model_lstm_3= tf.keras.Model(x, y)
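The difference between the variants above is easiest to see from the output shapes. The following sketch applies them to a hypothetical random batch; the batch size, sequence length, and feature count are assumptions:

```python
import tensorflow as tf
from tensorflow.keras import layers

# Illustrative batch: 2 sequences, 5 time steps, 3 features per step.
sequences = tf.random.normal([2, 5, 3])

all_steps = layers.LSTM(16, return_sequences=True)(sequences)   # one output per time step
last_step = layers.LSTM(16)(sequences)                          # only the final output
cell_based = layers.RNN(layers.LSTMCell(16))(sequences)         # LSTMCell wrapped in RNN

print(all_steps.shape)    # (2, 5, 16)
print(last_step.shape)    # (2, 16)
print(cell_based.shape)   # (2, 16)
```

LSTMCell processes a single time step, so it has to be wrapped in layers.RNN to run over a whole sequence, whereas layers.LSTM handles the sequence directly.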
After a model is built, you can call compile to configure the learning process of the model:
compile(optimizer='rmsprop', loss=None, metrics=None, loss_weights=None, weighted_metrics=None,
run_eagerly=None, **kwargs):
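The compile call itself is not shown here. A minimal sketch, assuming a small three-layer classifier and the categorical accuracy metric that appears in the training logs of this section, might look as follows:

```python
import tensorflow as tf
from tensorflow.keras import layers

# Assumed model: the layer sizes are illustrative.
model = tf.keras.Sequential([
    layers.Dense(32, activation='relu'),
    layers.Dense(32, activation='relu'),
    layers.Dense(10, activation='softmax'),
])

# The optimizer, loss, and metric choices are assumptions, not the original cell.
model.compile(optimizer=tf.keras.optimizers.Adam(0.001),
              loss=tf.keras.losses.CategoricalCrossentropy(),
              metrics=[tf.keras.metrics.CategoricalAccuracy()])
```

compile only records how the model is to be trained; no weights are updated until fit is called.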
In [15]:
import numpy as np
train_x = np.random.random((1000, 36))
train_y = np.random.random((1000, 10))
val_x = np.random.random((200, 36))
val_y = np.random.random((200, 10))
model.fit(train_x, train_y, epochs=10, batch_size=100,
validation_data=(val_x, val_y))
Epoch 1/10
10/10 [==============================] - 1s 49ms/step - loss: 12.3802 -
categorical_accuracy: 0.0844 - val_loss: 12.3513 - val_categorical_accuracy:
0.0950
Epoch 2/10
10/10 [==============================] - 0s 15ms/step - loss: 12.4663 -
categorical_accuracy: 0.0871 - val_loss: 12.3515 - val_categorical_accuracy:
0.0950
Epoch 3/10
10/10 [==============================] - 0s 13ms/step - loss: 12.3072 -
categorical_accuracy: 0.1013 - val_loss: 12.3511 - val_categorical_accuracy:
0.0950
Epoch 4/10
10/10 [==============================] - 0s 13ms/step - loss: 12.2677 -
categorical_accuracy: 0.0932 - val_loss: 12.3513 - val_categorical_accuracy:
0.0950
Epoch 5/10
10/10 [==============================] - 0s 12ms/step - loss: 12.3376 -
categorical_accuracy: 0.1059 - val_loss: 12.3519 - val_categorical_accuracy:
0.0950
Epoch 6/10
10/10 [==============================] - 0s 14ms/step - loss: 12.3756 -
categorical_accuracy: 0.1003 - val_loss: 12.3512 - val_categorical_accuracy:
0.0950
Epoch 7/10
10/10 [==============================] - 0s 13ms/step - loss: 12.3867 -
categorical_accuracy: 0.0877 - val_loss: 12.3505 - val_categorical_accuracy:
0.0950
Epoch 8/10
10/10 [==============================] - 0s 32ms/step - loss: 12.3140 -
categorical_accuracy: 0.0971 - val_loss: 12.3501 - val_categorical_accuracy:
0.0950
Epoch 9/10
10/10 [==============================] - 0s 9ms/step - loss: 12.3372 -
categorical_accuracy: 0.1055 - val_loss: 12.3502 - val_categorical_accuracy:
0.0950
Epoch 10/10
10/10 [==============================] - 0s 12ms/step - loss: 12.3934 -
categorical_accuracy: 0.0943 - val_loss: 12.3496 - val_categorical_accuracy:
0.0950
Out[15]:
<tensorflow.python.keras.callbacks.History at 0x7fa31c7605d0>
You can use tf.data to build training input pipelines for large datasets.
In [16]:
dataset = tf.data.Dataset.from_tensor_slices((train_x, train_y))
dataset = dataset.batch(32)
dataset = dataset.repeat()
val_dataset = tf.data.Dataset.from_tensor_slices((val_x, val_y))
val_dataset = val_dataset.batch(32)
val_dataset = val_dataset.repeat()
model.fit(dataset, epochs=10, steps_per_epoch=30,
validation_data=val_dataset, validation_steps=3)
Epoch 1/10
30/30 [==============================] - 1s 12ms/step - loss: 12.3317 -
categorical_accuracy: 0.0979 - val_loss: 12.2354 - val_categorical_accuracy:
0.1042
Epoch 2/10
30/30 [==============================] - 0s 4ms/step - loss: 12.3113 -
categorical_accuracy: 0.1004 - val_loss: 12.2317 - val_categorical_accuracy:
0.1042
Epoch 3/10
30/30 [==============================] - 0s 4ms/step - loss: 12.3155 -
categorical_accuracy: 0.0962 - val_loss: 12.2275 - val_categorical_accuracy:
0.1042
Epoch 4/10
30/30 [==============================] - 0s 3ms/step - loss: 12.3378 -
categorical_accuracy: 0.0929 - val_loss: 12.2230 - val_categorical_accuracy:
0.1042
Epoch 5/10
30/30 [==============================] - 0s 4ms/step - loss: 12.2941 -
categorical_accuracy: 0.0994 - val_loss: 12.2191 - val_categorical_accuracy:
0.1042
Epoch 6/10
30/30 [==============================] - 0s 3ms/step - loss: 12.2916 -
categorical_accuracy: 0.0929 - val_loss: 12.2161 - val_categorical_accuracy:
0.1042
Epoch 7/10
30/30 [==============================] - 0s 4ms/step - loss: 12.3166 -
categorical_accuracy: 0.0972 - val_loss: 12.2136 - val_categorical_accuracy:
0.1042
Epoch 8/10
30/30 [==============================] - 0s 3ms/step - loss: 12.3050 -
categorical_accuracy: 0.0897 - val_loss: 12.2115 - val_categorical_accuracy:
0.1042
Epoch 9/10
30/30 [==============================] - 0s 4ms/step - loss: 12.2984 -
categorical_accuracy: 0.0929 - val_loss: 12.2101 - val_categorical_accuracy:
0.1042
Epoch 10/10
30/30 [==============================] - 0s 4ms/step - loss: 12.3012 -
categorical_accuracy: 0.0983 - val_loss: 12.2088 - val_categorical_accuracy:
0.1042
Out[16]:
<tensorflow.python.keras.callbacks.History at 0x7fa31c77f650>
A callback is an object passed to the model to customize and extend the model's behavior during
training. You can write custom callbacks or use the built-in callbacks in tf.keras.callbacks. Common
built-in callbacks include:
In [17]:
#Set hyperparameters.
Epochs = 10
print(lr)
return lr
callbacks = [
    #Early stopping:
    tf.keras.callbacks.EarlyStopping(
        #Metric for determining whether the model performance has no further
        #improvement
        monitor='val_loss',
        #Threshold for determining whether the model performance has no
        #further improvement
        min_delta=1e-2,
        #Number of epochs in which the model performance has no further
        #improvement
        patience=2),
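The listing above is cut off, so the following is only a sketch of how such a callback list might be completed and passed to fit. The schedule function lr_schedule, the random data, and the small model are assumptions made for illustration:

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers

# Hypothetical schedule: keep the rate for 2 epochs, then halve it every epoch.
def lr_schedule(epoch, lr):
    return lr if epoch < 2 else lr * 0.5

callbacks = [
    tf.keras.callbacks.EarlyStopping(monitor='val_loss', min_delta=1e-2, patience=2),
    tf.keras.callbacks.LearningRateScheduler(lr_schedule),
]

# Random data, used only to make the sketch runnable.
train_x = np.random.random((200, 36)).astype('float32')
train_y = tf.keras.utils.to_categorical(np.random.randint(10, size=200), 10)
val_x = np.random.random((50, 36)).astype('float32')
val_y = tf.keras.utils.to_categorical(np.random.randint(10, size=50), 10)

model = tf.keras.Sequential([
    layers.Dense(32, activation='relu'),
    layers.Dense(10, activation='softmax'),
])
model.compile(optimizer='adam', loss='categorical_crossentropy')

history = model.fit(train_x, train_y, epochs=10, batch_size=50,
                    validation_data=(val_x, val_y),
                    callbacks=callbacks, verbose=0)
print("epochs actually run:", len(history.history['loss']))
```

Because the data is random, val_loss rarely improves by more than min_delta, so early stopping usually ends training well before the tenth epoch.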
In [20]:
import numpy as np
import os
# Create the folder.
if not os.path.exists('./model/'):
    os.mkdir('./model/')
#Save models.
model.save('./model/the_save_model.h5')
#Import models.
new_model = tf.keras.models.load_model('./model/the_save_model.h5')
new_prediction = new_model.predict(test_x)
#np.testing.assert_allclose: checks whether the difference between two
#objects exceeds the specified tolerance.
#If yes, the system raises an exception.
#atol: specified tolerance
# Prediction results are the same.
np.testing.assert_allclose(result, new_prediction, atol=1e-6)
After a model is saved, you can find the corresponding weight file in the corresponding folder.
If the weight file name is suffixed with .h5 or .keras, the weights are saved as an HDF5 file; otherwise, they
are saved as a TensorFlow checkpoint file by default.
In [21]:
model.save_weights('./model/model_weights')
model.save_weights('./model/model_weights.h5')
#Load the weights.
model.load_weights('./model/model_weights')
model.load_weights('./model/model_weights.h5')
3 Handwritten Digit Recognition with TensorFlow
3.1 Introduction
Handwritten digit recognition is a common image recognition task where computers recognize text in
handwriting images. Different from printed fonts, handwriting of different people has different sizes and
styles, making it difficult for computers to recognize handwriting. This chapter describes the basic process of
TensorFlow computing and basic elements for building a network.
3.2 Objectives
Upon completion of this task, you will be able to:
- Master the basic process of TensorFlow computing.
- Be familiar with the basic elements of network building, including dataset, network model building, model
training, and model validation.
3.3.1.1 Description
This project applies deep learning and TensorFlow tools to train and build models based on the MNIST
handwriting dataset.
The MNIST dataset is provided by the National Institute of Standards and Technology (NIST). The dataset
consists of handwritten digits from 250 individuals, of which 50% are high school students and 50% are staff
from the Bureau of the Census. You can download the dataset from http://yann.lecun.com/exdb/mnist/. It
consists of the following parts:
- Training set labels: train-labels-idx1-ubyte.gz (29 KB, 60 KB after decompression, including 60,000
labels)
- Test set images: t10k-images-idx3-ubyte.gz (1.6 MB, 7.8 MB after decompression, including 10,000
samples)
- Test set labels: t10k-labels-idx1-ubyte.gz (5 KB, 10 KB after decompression, including 10,000 labels)
The MNIST dataset is an entry-level computer vision dataset that contains images of various handwritten
digits.
It also contains one label for each image, to clarify the correct digit. For example, the labels for the preceding
four images are 5, 0, 4, and 1.
Download the MNIST dataset directly from the official TensorFlow website and decompress it.
In [22]:
import os
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers, optimizers, datasets
from matplotlib import pyplot as plt
import numpy as np
print(x_train_raw.shape, y_train_raw.shape)
print(x_test_raw.shape, y_test_raw.shape)
print(y_train_raw[0])
print(y_train[0])
(60000, 28, 28) (60000,)
(10000, 28, 28) (10000,)
5
[0. 0. 0. 0. 0. 1. 0. 0. 0. 0.]
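The cell that loads the data and builds y_train is not shown above. The following sketch illustrates only the one-hot step, under the assumption that the labels are converted with keras.utils.to_categorical. A few hand-written labels stand in for y_train_raw so that the sketch runs without downloading the dataset:

```python
import numpy as np
from tensorflow import keras

num_classes = 10

# Assumption: in the lab, the raw arrays would come from
#   (x_train_raw, y_train_raw), (x_test_raw, y_test_raw) = keras.datasets.mnist.load_data()
# Here four sample labels replace y_train_raw so that the sketch runs offline.
y_train_raw = np.array([5, 0, 4, 1], dtype=np.uint8)

# Convert each scalar label into a one-hot vector of length num_classes.
y_train = keras.utils.to_categorical(y_train_raw, num_classes)

print(y_train_raw[0])   # 5
print(y_train[0])       # [0. 0. 0. 0. 0. 1. 0. 0. 0. 0.]
print(y_train.shape)    # (4, 10)
```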
In the MNIST dataset, the training images form a tensor in the shape of [60000, 28, 28]. The first dimension
indexes the images, and the second and third dimensions index the pixels in each image. Each element in
this tensor indicates the intensity of a pixel in an image. The value ranges from 0 to 255. Label data is
converted from scalars to one-hot vectors. In a one-hot vector, one element is 1, and all other elements are
0. For example, label 1 is represented as [0,1,0,0,0,0,0,0,0,0]. Therefore, the training labels form a
matrix in the shape of [60000, 10].
The input of a fully connected network must be a vector rather than a matrix such as the current images.
Therefore, you need to flatten each image into a vector.
In [24]:
#Convert a 28 x 28 image into a 784 x 1 vector.
x_train = x_train_raw.reshape(60000, 784)
x_test = x_test_raw.reshape(10000, 784)
Currently, the dynamic range of pixels is 0 to 255. Image pixels are usually normalized to the range of 0 to 1
during processing of image pixel values.
In [25]:
#Normalize image pixel values.
x_train = x_train.astype('float32')/255
x_test = x_test.astype('float32')/255
In [26]:
#Create a deep neural network (DNN) model that consists of four fully
#connected layers: three hidden layers with ReLU activation functions and a
#softmax output layer.
model = keras.Sequential([
    layers.Dense(512, activation='relu', input_dim = 784),
    layers.Dense(256, activation='relu'),
    layers.Dense(124, activation='relu'),
    layers.Dense(num_classes, activation='softmax')])
model.summary()
Model: "sequential_2"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
dense_11 (Dense) (None, 512) 401920
_________________________________________________________________
dense_12 (Dense) (None, 256) 131328
_________________________________________________________________
dense_13 (Dense) (None, 124) 31868
_________________________________________________________________
dense_14 (Dense) (None, 10) 1250
=================================================================
Total params: 566,366
Trainable params: 566,366
Non-trainable params: 0
_________________________________________________________________
layers.Dense() creates a fully connected layer, and its activation parameter specifies the activation function.
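The Param # column in the summary above can be checked by hand: a fully connected layer with n_in inputs and
n_out units has n_in x n_out weights plus n_out biases. A short sketch of that arithmetic follows. The helper
dense_params is hypothetical and uses no TensorFlow.

```python
def dense_params(n_in, n_out):
    """Weights (n_in * n_out) plus one bias per output unit."""
    return n_in * n_out + n_out

# Layer widths of the DNN above: 784 inputs, then 512, 256, 124, and 10 units.
widths = [784, 512, 256, 124, 10]
per_layer = [dense_params(a, b) for a, b in zip(widths[:-1], widths[1:])]
print(per_layer)        # [401920, 131328, 31868, 1250]
print(sum(per_layer))   # 566366
```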
In [27]:
Optimizer = optimizers.Adam(0.001)
model.compile(loss=keras.losses.categorical_crossentropy,
optimizer=Optimizer,
metrics=['accuracy'])
In the preceding example, the loss function of the model is categorical cross entropy, and the optimizer is
Adam, an adaptive variant of gradient descent, with a learning rate of 0.001.
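For a one-hot label y and a softmax output p, categorical cross entropy is -sum(y * log(p)), averaged over
the batch. The following NumPy sketch illustrates the formula on made-up probabilities. It is not the Keras
implementation, and the clipping constant eps is an assumption used only to avoid log(0).

```python
import numpy as np

def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    """Mean over the batch of -sum(y_true * log(y_pred))."""
    y_pred = np.clip(y_pred, eps, 1.0)   # guard against log(0)
    return float(np.mean(-np.sum(y_true * np.log(y_pred), axis=1)))

# Two samples, three classes (illustrative values only).
y_true = np.array([[0.0, 1.0, 0.0],
                   [1.0, 0.0, 0.0]])
y_pred = np.array([[0.1, 0.8, 0.1],    # correct class gets 0.8: small loss
                   [0.3, 0.4, 0.3]])   # correct class gets only 0.3: larger loss
loss = categorical_crossentropy(y_true, y_pred)
print(round(loss, 4))   # 0.7136
```

Only the probability assigned to the correct class contributes to the loss, so the loss approaches 0 as that
probability approaches 1.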
In [28]:
#Fit the training data to the model by using the fit method.
model.fit(x_train, y_train,
batch_size=128,
epochs=10,
verbose=1)
Epoch 1/10
469/469 [==============================] - 11s 22ms/step - loss: 0.4302 -
accuracy: 0.8708
Epoch 2/10
469/469 [==============================] - 11s 24ms/step - loss: 0.0864 -
accuracy: 0.9735
Epoch 3/10
469/469 [==============================] - 11s 23ms/step - loss: 0.0510 -
accuracy: 0.9842
Epoch 4/10
469/469 [==============================] - 10s 22ms/step - loss: 0.0388 -
accuracy: 0.9879
Epoch 5/10
469/469 [==============================] - 10s 22ms/step - loss: 0.0262 -
accuracy: 0.9914
Epoch 6/10
469/469 [==============================] - 10s 21ms/step - loss: 0.0256 -
accuracy: 0.9913
Epoch 7/10
469/469 [==============================] - 11s 24ms/step - loss: 0.0251 -
accuracy: 0.9916
Epoch 8/10
469/469 [==============================] - 11s 24ms/step - loss: 0.0153 -
accuracy: 0.9952
Epoch 9/10
469/469 [==============================] - 11s 23ms/step - loss: 0.0154 -
accuracy: 0.9944
Epoch 10/10
469/469 [==============================] - 12s 25ms/step - loss: 0.0145 -
accuracy: 0.9951
Out[28]:
<tensorflow.python.keras.callbacks.History at 0x7fa3244831d0>
An epoch is one complete pass over the training data. In the preceding example, the full training set is iterated over 10 times.
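The 469/469 counter in the training log is the number of batches per epoch: 60,000 training samples divided
by a batch size of 128, rounded up because the last batch is smaller than the others. A quick check of that
arithmetic (the helper name is hypothetical):

```python
import math

def steps_per_epoch(num_samples, batch_size):
    """Number of batches needed to see every sample once."""
    return math.ceil(num_samples / batch_size)

train_steps = steps_per_epoch(60000, 128)
# Size of the final, partially filled batch.
last_batch = 60000 - (train_steps - 1) * 128
print(train_steps)   # 469
print(last_batch)    # 96
```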
In [30]:
score = model.evaluate(x_test, y_test, verbose=0)
print('Test loss:', score[0])
print('Test accuracy:', score[1])
Test loss: 0.09210538119077682
Test accuracy: 0.9800000190734863
The evaluation shows that the model reaches an accuracy of 0.98 on the test set after 10 training epochs.
In [32]:
import tensorflow as tf
from tensorflow import keras
import numpy as np
In [35]:
#Expand data dimensions to adapt to the CNN model.
X_train=x_train.reshape(60000,28,28,1)
X_test=x_test.reshape(10000,28,28,1)
#model must refer to the CNN (see the layer list printed by new_model.summary() below), not the DNN built earlier.
model.compile(optimizer="adam",loss="categorical_crossentropy",metrics=['accuracy'])
model.fit(x=X_train,y=y_train,epochs=5,batch_size=128)
Epoch 1/5
469/469 [==============================] - 125s 265ms/step - loss: 0.0977 -
accuracy: 0.9693
Epoch 2/5
469/469 [==============================] - 132s 281ms/step - loss: 0.0664 -
accuracy: 0.9804
Epoch 3/5
469/469 [==============================] - 128s 273ms/step - loss: 0.0550 -
accuracy: 0.9831
Epoch 4/5
469/469 [==============================] - 126s 268ms/step - loss: 0.0442 -
accuracy: 0.9867
Epoch 5/5
469/469 [==============================] - 134s 287ms/step - loss: 0.0412 -
accuracy: 0.9875
Out[35]:
<tensorflow.python.keras.callbacks.History at 0x7fa3082a8610>
During training, the training data is iterated over only five times. You can increase the number of epochs
to check the effect.
In [36]:
test_loss,test_acc=model.evaluate(x=X_test,y=y_test)
print("Test Accuracy %.2f"%test_acc)
313/313 [==============================] - 9s 27ms/step - loss: 0.0202 -
accuracy: 0.9926
Test Accuracy 0.99
The evaluation shows that the accuracy of the CNN model on the test set reaches 99%.
In [37]:
model.save('./final_CNN_model.h5')
In [38]:
from tensorflow.keras.models import load_model
new_model = load_model('./final_CNN_model.h5')
new_model.summary()
Model: "sequential_3"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
conv2d_1 (Conv2D) (None, 28, 28, 32) 832
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 14, 14, 32) 0
_________________________________________________________________
conv2d_2 (Conv2D) (None, 14, 14, 64) 18496
_________________________________________________________________
max_pooling2d_2 (MaxPooling2 (None, 7, 7, 64) 0
_________________________________________________________________
dropout (Dropout) (None, 7, 7, 64) 0
_________________________________________________________________
flatten (Flatten) (None, 3136) 0
_________________________________________________________________
dense_15 (Dense) (None, 128) 401536
_________________________________________________________________
dropout_1 (Dropout) (None, 128) 0
_________________________________________________________________
dense_16 (Dense) (None, 10) 1290
=================================================================
Total params: 422,154
Trainable params: 422,154
Non-trainable params: 0
_________________________________________________________________
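The summary does not list kernel sizes, but they can be inferred from the Param # column: a Conv2D layer with
a k x k kernel, c_in input channels, and c_out filters has (k*k*c_in + 1) * c_out parameters. The sketch below
solves for k, assuming square kernels. The helper names are hypothetical.

```python
import math

def conv2d_params(k, c_in, c_out):
    """k*k*c_in weights per filter, plus one bias per filter."""
    return (k * k * c_in + 1) * c_out

def infer_kernel_size(params, c_in, c_out):
    """Invert conv2d_params for a square kernel; return None if no integer k fits."""
    k_squared = (params // c_out - 1) // c_in
    k = math.isqrt(k_squared)
    return k if conv2d_params(k, c_in, c_out) == params else None

print(infer_kernel_size(832, c_in=1, c_out=32))      # 5
print(infer_kernel_size(18496, c_in=32, c_out=64))   # 3
```

The counts are therefore consistent with 5 x 5 kernels in conv2d_1 (one input channel) and 3 x 3 kernels in
conv2d_2 (32 input channels).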
Visualize prediction results.
In [39]:
#Visualize test set output results.
import matplotlib.pyplot as plt
%matplotlib inline
def res_Visual(n):
    #Perform predictions on the first n test images by using the model.
    final_opt_a = np.argmax(new_model.predict(X_test[0:n]), axis=-1)
    fig, ax = plt.subplots(nrows=int(n/5), ncols=5)
    ax = ax.flatten()
    print('prediction results of the first {} images:'.format(n))
    for i in range(n):
        print(final_opt_a[i], end=',')
        if int((i+1)%5) == 0:
            print('\t')
        #Reshape each sample into a 28 x 28 ndarray and display it.
        img = X_test[i].reshape((28,28))
        ax[i].imshow(img, cmap='Greys', interpolation='nearest')
        ax[i].axis("off")
    print('first {} images in the test set:'.format(n))
res_Visual(20)
prediction results of the first 20 images:
7,2,1,0,4,
1,4,9,5,9,
0,6,9,0,1,
5,9,7,3,4,
first 20 images in the test set:
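In the function above, new_model.predict returns one row of class probabilities per image, and
np.argmax(..., axis=-1) picks the index of the largest probability in each row, which is the predicted digit.
A toy illustration with made-up probabilities for three images and four classes:

```python
import numpy as np

# Each row sums to 1 and stands in for the softmax output of one image.
probs = np.array([[0.05, 0.10, 0.80, 0.05],
                  [0.70, 0.10, 0.10, 0.10],
                  [0.20, 0.25, 0.15, 0.40]])
# Index of the most probable class in each row.
predicted = np.argmax(probs, axis=-1)
print(predicted)   # [2 0 3]
```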
4 Image Classification
4.1 Introduction
This experiment covers a basic task in computer vision: image recognition. It requires NumPy and TensorFlow.
NumPy arrays are used as the image objects. TensorFlow is mainly used to create deep learning algorithms and
build a convolutional neural network (CNN). This experiment recognizes image categories based on the
CIFAR-10 dataset.
4.1.2 Objectives
In [40]:
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers, optimizers, datasets, Sequential
from tensorflow.keras.layers import Conv2D,Activation,MaxPooling2D,Dropout,Flatten,Dense
import os
import numpy as np
import matplotlib.pyplot as plt
In [41]:
#download Cifar-10 dataset
(x_train,y_train), (x_test, y_test) = datasets.cifar10.load_data()
#print the size of the dataset
print(x_train.shape, y_train.shape, x_test.shape, y_test.shape)
print(y_train[0])
(50000, 32, 32, 3) (50000, 1) (10000, 32, 32, 3) (10000, 1)
[6]
In [42]:
#Convert the category label into onehot encoding
num_classes = 10
y_train_onehot = keras.utils.to_categorical(y_train, num_classes)
y_test_onehot = keras.utils.to_categorical(y_test, num_classes)
y_train_onehot[0]
Out[42]:
array([0., 0., 0., 0., 0., 0., 1., 0., 0., 0.], dtype=float32)
Show the first 9 images
In [43]:
#Create an image label list
category_dict = {0:'airplane',1:'automobile',2:'bird',3:'cat',4:'deer',5:'dog',
                 6:'frog',7:'horse',8:'ship',9:'truck'}
#Show the first 9 images and their labels
plt.figure()
for i in range(9):
    #select one of the 9 subplots
    plt.subplot(3,3,i+1)
    #show an image
    plt.imshow(x_train[i])
    #show the label
    plt.ylabel(category_dict[y_train[i][0]])
plt.show()
In [44]:
#Pixel normalization
x_train = x_train.astype('float32')/255
x_test = x_test.astype('float32')/255
In [45]:
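def CNN_classification_model(input_size = x_train.shape[1:]):
    model = Sequential()
    #The first block: 2 convolutional layers and 1 max pooling layer.
    #Conv1 with 32 3*3 kernels. padding="same" zero-pads the input so that,
    #for stride 1, the output has the same height and width as the input.
    #Output: 32*32*32
    model.add(Conv2D(32, (3, 3), padding='same', input_shape=input_size))
    #relu activation function
    model.add(Activation('relu'))
    #Conv2
    model.add(Conv2D(32, (3, 3)))
    model.add(Activation('relu'))
    #maxpooling
    model.add(MaxPooling2D(pool_size=(2, 2),strides =1))
    #The second block and the classifier, matching the shapes and parameter counts
    #in the model summary. The summary does not show the activation types or the
    #dropout rate, so 'relu', 'softmax', and 0.25 are assumptions.
    model.add(Conv2D(64, (3, 3), padding='same'))
    model.add(Activation('relu'))
    model.add(Conv2D(64, (3, 3)))
    model.add(Activation('relu'))
    model.add(MaxPooling2D(pool_size=(2, 2)))
    model.add(Flatten())
    model.add(Dense(128))
    model.add(Activation('relu'))
    model.add(Dropout(0.25))  #assumed rate
    model.add(Dense(num_classes))
    model.add(Activation('softmax'))
    opt = keras.optimizers.Adam(learning_rate=0.0001)
    model.compile(loss='sparse_categorical_crossentropy', optimizer=opt,
                  metrics=['accuracy'])
    return model
model=CNN_classification_model()
model.summary()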
Model: "sequential_4"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
conv2d_3 (Conv2D) (None, 32, 32, 32) 896
_________________________________________________________________
activation (Activation) (None, 32, 32, 32) 0
_________________________________________________________________
conv2d_4 (Conv2D) (None, 30, 30, 32) 9248
_________________________________________________________________
activation_1 (Activation) (None, 30, 30, 32) 0
_________________________________________________________________
max_pooling2d_3 (MaxPooling2 (None, 29, 29, 32) 0
_________________________________________________________________
conv2d_5 (Conv2D) (None, 29, 29, 64) 18496
_________________________________________________________________
activation_2 (Activation) (None, 29, 29, 64) 0
_________________________________________________________________
conv2d_6 (Conv2D) (None, 27, 27, 64) 36928
_________________________________________________________________
activation_3 (Activation) (None, 27, 27, 64) 0
_________________________________________________________________
max_pooling2d_4 (MaxPooling2 (None, 13, 13, 64) 0
_________________________________________________________________
flatten_1 (Flatten) (None, 10816) 0
_________________________________________________________________
dense_17 (Dense) (None, 128) 1384576
_________________________________________________________________
activation_4 (Activation) (None, 128) 0
_________________________________________________________________
dropout_2 (Dropout) (None, 128) 0
_________________________________________________________________
dense_18 (Dense) (None, 10) 1290
_________________________________________________________________
activation_5 (Activation) (None, 10) 0
=================================================================
Total params: 1,451,434
Trainable params: 1,451,434
Non-trainable params: 0
_________________________________________________________________
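The Output Shape column follows from the standard size formula: with padding='same' and stride 1 the height
and width are unchanged, and without padding (the Keras default, 'valid') the output size is
floor((n - k) / s) + 1 for input size n, window size k, and stride s. The sketch below traces the feature-map
size through the layers in the summary. Kernel and pooling sizes are taken from the code where it states them
and are otherwise assumptions chosen to match the summary. The helper name is hypothetical.

```python
def valid_output_size(n, k, stride=1):
    """Output height/width of a convolution or pooling window without padding."""
    return (n - k) // stride + 1

size = 32                  # CIFAR-10 images are 32 x 32
sizes = [size]             # conv2d_3: padding='same', size unchanged
size = valid_output_size(size, 3)            # conv2d_4: 3x3, no padding
sizes.append(size)
size = valid_output_size(size, 2, stride=1)  # max_pooling2d_3: 2x2, stride 1
sizes.append(size)
sizes.append(size)         # conv2d_5: assumed padding='same', size unchanged
size = valid_output_size(size, 3)            # conv2d_6: assumed 3x3, no padding
sizes.append(size)
size = valid_output_size(size, 2, stride=2)  # max_pooling2d_4: assumed 2x2, stride 2
sizes.append(size)
print(sizes)               # [32, 30, 29, 29, 27, 13]
print(size * size * 64)    # 10816, the length of the flattened vector
```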
In [47]:
from tensorflow.keras.callbacks import ModelCheckpoint
model_name = "final_cifar10.h5"
model_checkpoint = ModelCheckpoint(model_name, monitor='loss',verbose=1,
save_best_only=True)
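With monitor='loss' and save_best_only=True, the callback writes the model file only at the end of an epoch
whose training loss is lower than that of every earlier epoch. The pure-Python sketch below mimics that
decision rule. It is not the Keras implementation, the function name is hypothetical, and the loss values are
reused from the MNIST training log earlier in this document purely for illustration.

```python
def epochs_that_save(losses):
    """Return the 1-based epochs at which a save-best-only rule would write a file."""
    best = float("inf")
    saved = []
    for epoch, loss in enumerate(losses, start=1):
        if loss < best:          # improvement: keep this checkpoint
            best = loss
            saved.append(epoch)
    return saved

losses = [0.4302, 0.0864, 0.0510, 0.0388, 0.0262,
          0.0256, 0.0251, 0.0153, 0.0154, 0.0145]
print(epochs_that_save(losses))   # [1, 2, 3, 4, 5, 6, 7, 8, 10]
```

Epoch 9 is skipped because its loss (0.0154) is not lower than the best value so far (0.0153). For the
callback to take effect, it has to be passed to training through the callbacks argument of model.fit.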
In [48]:
new_model = CNN_classification_model()
new_model.load_weights('final_cifar10.h5')
#Predict the classes of the first 10 test images once, before plotting.
# pred = new_model.predict_classes(x_test[0:10])
pred = np.argmax(new_model.predict(x_test[0:10]), axis=-1)
plt.figure()
for i in range(0,10):
    plt.subplot(5,2,i+1)
    #plot
    plt.imshow(x_test[i])
    #Display actual and predicted labels of images
    plt.title("pred:"+category_dict[pred[i]]+" actual:"+
              category_dict[y_test[i][0]])
    plt.axis('off')
plt.show()
4.3 Summary
This chapter describes how to build an image classification model based on TensorFlow 2 and Python. It
gives trainees a basic understanding of how deep learning models are built.