
CHAPTER 1

Basics of TensorFlow
This chapter covers the basics of TensorFlow, the deep learning
framework. Deep learning does a wonderful job in pattern recognition,
especially in the context of images, sound, speech, language, and time-
series data. With the help of deep learning, you can classify, predict,
cluster, and extract features. Fortunately, in November 2015, Google
released TensorFlow, which has been used in most of Google’s products
such as Google Search, spam detection, speech recognition, Google
Assistant, Google Now, and Google Photos. Explaining the basic
components of TensorFlow is the aim of this chapter.
TensorFlow has a unique ability to perform partial subgraph computation, which enables distributed training by partitioning a neural network across multiple devices. In other words, TensorFlow supports both model parallelism and data parallelism. TensorFlow provides multiple APIs. The lowest-level API, TensorFlow Core, provides you with complete programming control.
Note the following important points regarding TensorFlow:

• Its graph is a description of computations.

• Its graph has nodes that are operations.

• It executes computations in a given context of a session.

• A graph must be launched in a session for any computation.


• A session places the graph operations onto devices such as the CPU and GPU.

• A session provides methods to execute the graph operations.

For installation, please go to https://www.tensorflow.org/install/.


I will discuss the following topics:

• Tensors
• Computational graph and session
• Constants, placeholders, and variables
• Creating tensors
• Working on matrices
• Activation functions
• Loss functions
• Optimizers
• Metrics

Tensors
Before you jump into the TensorFlow library, let’s get comfortable with
the basic unit of data in TensorFlow. A tensor is a mathematical object
and a generalization of scalars, vectors, and matrices. A tensor can be
represented as a multidimensional array. A tensor of zero rank (order) is
nothing but a scalar. A vector/array is a tensor of rank 1, whereas a matrix is a tensor of rank 2. In short, a tensor can be considered an n-dimensional array.
Here are some examples of tensors:

• 5: This is a rank 0 tensor; this is a scalar with shape [ ].

• [2., 5., 3.]: This is a rank 1 tensor; this is a vector with shape [3].

• [[1., 2., 7.], [3., 5., 4.]]: This is a rank 2 tensor; it is a matrix with shape [2, 3].

• [[[1., 2., 3.]], [[7., 8., 9.]]]: This is a rank 3 tensor with shape [2, 1, 3].

Computational Graph and Session


TensorFlow is popular for its TensorFlow Core programs, which involve two main actions:

• Building the computational graph in the construction phase

• Running the computational graph in the execution phase
Let’s understand how TensorFlow works.

• Its programs are usually structured into a construction phase and an execution phase.

• The construction phase assembles a graph that has nodes (ops/operations) and edges (tensors).

• The execution phase uses a session to execute ops (operations) in the graph.

• The simplest operation is a constant that takes no inputs but passes outputs to other operations that do computation.

• An example of an operation is multiplication (or addition or subtraction), which takes two matrices as input and passes a matrix as output.

• The TensorFlow library has a default graph to which op constructors add nodes.

So, the structure of TensorFlow programs has two phases: the construction phase and the execution phase.

A computational graph is a series of TensorFlow operations arranged into a graph of nodes.
Let’s look at TensorFlow versus NumPy. In NumPy, if you plan to multiply two matrices, you create the matrices and multiply them directly. But in TensorFlow, you set up a graph (the default graph, unless you create another one). Next, you create variables, placeholders, and constant values, and then you create the session and initialize the variables. Finally, you feed data to the placeholders to invoke the computation.
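Here is a minimal sketch of that workflow, using constants for simplicity (the matrix values are illustrative):

import tensorflow as tf

# Construction phase: define the matrices and the multiplication op
a = tf.constant([[1., 2.], [3., 4.]])
b = tf.constant([[5., 6.], [7., 8.]])
product = tf.matmul(a, b)

# Execution phase: run the graph inside a session
sess = tf.Session()
print(sess.run(product))   # [[19. 22.], [43. 50.]]
sess.close()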


To actually evaluate the nodes, you must run the computational graph
within a session.
A session encapsulates the control and state of the TensorFlow runtime.
The following code creates a Session object:

sess = tf.Session()

It then invokes its run method to run enough of the computational graph to evaluate node1 and node2.
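The program being described can be sketched as follows (the constant values 3.0 and 4.0 are illustrative assumptions):

import tensorflow as tf

# Two constant ops added to the default graph
node1 = tf.constant(3.0, dtype=tf.float32)
node2 = tf.constant(4.0, dtype=tf.float32)

sess = tf.Session()
print(sess.run([node1, node2]))   # [3.0, 4.0]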
The computation graph defines the computation. It neither computes
anything nor holds any value. It is meant to define the operations
mentioned in the code. A default graph is created. So, you don’t need to
create it unless you want to create graphs for multiple purposes.
A session allows you to execute graphs or parts of graphs. It allocates
resources (on one or more CPUs or GPUs) for the execution. It holds the
actual values of intermediate results and variables.
The value of a variable, created in TensorFlow, is valid only within
one session. If you try to query the value afterward in a second session,
TensorFlow will raise an error because the variable is not initialized there.
To run any operation, you need to create a session for that graph. The session will also allocate memory to store the current value of the variable.


Here is the code to demonstrate:
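As a minimal sketch (the variable name and values are assumptions), the following shows that a variable’s value lives within the session that initialized it:

import tensorflow as tf

x = tf.Variable(10, name='x')
init = tf.global_variables_initializer()

sess1 = tf.Session()
sess1.run(init)
print(sess1.run(x))   # 10

sess2 = tf.Session()
# sess2.run(x) here would raise FailedPreconditionError because
# x has not been initialized in this second session
sess2.run(init)
print(sess2.run(x))   # 10 again, after initializing in sess2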

Constants, Placeholders, and Variables


TensorFlow programs use a tensor data structure to represent all data—
only tensors are passed between operations in the computation graph. You
can think of a TensorFlow tensor as an n-dimensional array or list. A tensor
has a static type, a rank, and a shape. Here the graph produces a constant
result. Variables maintain state across executions of the graph.


Generally, you have to deal with many images in deep learning, so you have to place pixel values for each image and keep iterating over all the images.
To train the model, you need to be able to modify the graph to tune parameters such as the weights and biases. In short, variables enable you to add trainable parameters to a graph. They are constructed with a type and an initial value.
Let’s create a constant in TensorFlow and print it.
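A minimal version of that program, reconstructed as a sketch from the step-by-step explanation that follows:

import tensorflow as tf

x = tf.constant(12, name='x')   # a constant value

sess = tf.Session()             # a session for computing the values
print(sess.run(x))              # 12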

Here is the explanation of the previous code in simple terms:

1. Import the tensorflow module and call it tf.

2. Create a constant value (x) and assign it the numerical value 12.

3. Create a session for computing the values.

4. Run just the constant x and print out its current value.

The first two steps belong to the construction phase, and the last two
steps belong to the execution phase. I will discuss the construction and
execution phases of TensorFlow now.
You can rewrite the previous code in another way, as shown here:
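One such alternative is to use the session as a context manager so that it closes automatically; this sketch assumes the same constant as before:

import tensorflow as tf

x = tf.constant(12, name='x')

with tf.Session() as sess:
    print(sess.run(x))   # 12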


Now you will explore how you create a variable and initialize it. Here is
the code that does it:
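Reconstructed as a sketch from the explanation below:

import tensorflow as tf

x = tf.constant(12, name='x')        # step 2
y = tf.Variable(x + 11, name='y')    # step 3: y is defined as 12 + 11

model = tf.global_variables_initializer()   # step 4

sess = tf.Session()                  # step 5
sess.run(model)                      # step 6
print(sess.run(y))                   # step 7: prints 23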

Here is the explanation of the previous code:

1. Import the tensorflow module and call it tf.

2. Create a constant value called x and give it the numerical value 12.

3. Create a variable called y and define it as the equation 12 + 11.

4. Initialize the variables with tf.global_variables_initializer().

5. Create a session for computing the values.

6. Run the model created in step 4.

7. Run just the variable y and print out its current value.

Here is some more code for your perusal:
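As one more sketch in the same spirit (the update loop and its values are illustrative assumptions, not the original listing):

import tensorflow as tf

x = tf.Variable(0, name='x')
update = tf.assign(x, x + 1)   # an op that increments x by 1

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for _ in range(5):
        print(sess.run(update))   # prints 1, 2, 3, 4, 5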


Placeholders
A placeholder is a variable that you can feed something to at a later time. It
is meant to accept external inputs. Placeholders can have one or multiple
dimensions, meant for storing n-dimensional arrays.
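Reconstructed as a sketch from the explanation below (the fed values are illustrative):

import tensorflow as tf

x = tf.placeholder(tf.float32)   # no initial value is defined
y = x * 10 + 500

sess = tf.Session()
print(sess.run(y, feed_dict={x: [1, 2, 3, 4]}))
# [510. 520. 530. 540.]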

Here is the explanation of the previous code:

1. Import the tensorflow module and call it tf.

2. Create a placeholder called x, mentioning the float type.

3. Create a tensor called y that is the operation of multiplying x by 10 and adding 500 to it. Note that no initial values for x are defined.

4. Create a session for computing the values.

5. Define the values of x in feed_dict so as to run y.

6. Print out its value.

In the following example, you create a 2×4 matrix (a 2D array) for storing some numbers in it. You then use the same kind of operation as before to do element-wise multiplication by 10 and addition of 1. The first dimension of the placeholder is None, which means any number of rows is allowed.


You can also consider a 2D array in place of the 1D array. Here is the
code:
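A sketch, assuming the operation y = x * 10 + 1 described earlier and illustrative matrix values:

import tensorflow as tf

x = tf.placeholder(tf.float32, [None, 4])   # any number of rows, 4 columns
y = x * 10 + 1

data = [[1, 2, 3, 4],
        [5, 6, 7, 8]]   # a 2x4 matrix

with tf.Session() as sess:
    print(sess.run(y, feed_dict={x: data}))
# [[11. 21. 31. 41.]
#  [51. 61. 71. 81.]]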

This is a 2×4 matrix. So, if you replace None with 2, you can see the
same output.

But if you create a placeholder of [3, 4] shape (note that you will feed
a 2×4 matrix at a later time), there is an error, as shown here:
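A sketch of that mismatch (the error message is abbreviated):

import tensorflow as tf

x = tf.placeholder(tf.float32, [3, 4])   # expects exactly 3 rows
y = x * 10 + 1

data = [[1, 2, 3, 4],
        [5, 6, 7, 8]]   # a 2x4 matrix

with tf.Session() as sess:
    sess.run(y, feed_dict={x: data})
# ValueError: Cannot feed value of shape (2, 4) for Tensor 'Placeholder:0',
# which has shape '(3, 4)'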


################# What happens in a linear model ##########


# Weight and Bias as Variables as they are to be tuned
W = tf.Variable([2], dtype=tf.float32)
b = tf.Variable([3], dtype=tf.float32)
# Training dataset that will be fed while training as Placeholders
x = tf.placeholder(tf.float32)
# Linear Model
y = W * x + b

Constants are initialized when you call tf.constant, and their values
can never change. By contrast, variables are not initialized when you call
tf.Variable. To initialize all the variables in a TensorFlow program, you
must explicitly call a special operation as follows.
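init = tf.global_variables_initializer()

sess = tf.Session()
sess.run(init)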

It is important to realize that init is a handle to the TensorFlow subgraph that initializes all the global variables. Until you call sess.run(init), the variables are uninitialized.


Creating Tensors
An image is a third-order tensor whose dimensions are the height, the width, and the number of channels (red, green, and blue).
Here you can see how an image is converted into a tensor:
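A sketch of reading an image file into such a tensor (the file name image.jpg is an assumption):

import tensorflow as tf

# Read an image file and decode it into a height x width x 3 tensor
image_raw = tf.read_file('image.jpg')
image = tf.image.decode_jpeg(image_raw, channels=3)

with tf.Session() as sess:
    img = sess.run(image)
    print(img.shape)   # (height, width, 3)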


You can generate tensors of various types, such as fixed tensors, random tensors, and sequence tensors.

Fixed Tensors
Here are some fixed tensors:

tf.fill creates a tensor of a given shape (2×3 here) filled with a single repeated number.

tf.diag creates a diagonal matrix having the specified diagonal elements.

tf.constant creates a constant tensor.
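A sketch of these three functions (the values are illustrative):

import tensorflow as tf

filled = tf.fill([2, 3], 7)       # 2x3 tensor, every entry is 7
diag   = tf.diag([1., 2., 3.])    # 3x3 matrix with 1, 2, 3 on the diagonal
const  = tf.constant([1, 2, 3])   # constant tensor from a Python list

with tf.Session() as sess:
    print(sess.run(filled))
    print(sess.run(diag))
    print(sess.run(const))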

Sequence Tensors
tf.range creates a sequence of numbers starting from the specified value
and having a specified increment.

tf.linspace creates a sequence of evenly spaced values.
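A sketch of both (the ranges are illustrative):

import tensorflow as tf

seq = tf.range(start=3, limit=15, delta=3)   # 3, 6, 9, 12
lin = tf.linspace(0.0, 1.0, 5)               # 5 evenly spaced values from 0 to 1

with tf.Session() as sess:
    print(sess.run(seq))   # [ 3  6  9 12]
    print(sess.run(lin))   # [0.   0.25 0.5  0.75 1.  ]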


Random Tensors
tf.random_uniform generates random values from a uniform distribution within a range.

tf.random_normal generates random values from a normal distribution having the specified mean and standard deviation.
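A sketch of both (the shapes and parameters are illustrative):

import tensorflow as tf

uni  = tf.random_uniform([2, 3], minval=0, maxval=4)    # uniform in [0, 4)
norm = tf.random_normal([2, 3], mean=0.0, stddev=1.0)   # mean 0, std dev 1

with tf.Session() as sess:
    print(sess.run(uni))
    print(sess.run(norm))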


Can you guess the result?

If you are not able to find the result, please revise the previous portion where I discuss the creation of tensors.

Working on Matrices
Once you are comfortable creating tensors, you can enjoy working on
matrices (2D tensors).
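A sketch of a few common matrix operations (the matrix values are illustrative):

import tensorflow as tf

a = tf.constant([[2., 1.], [1., 3.]])
b = tf.constant([[5., 6.], [7., 8.]])

with tf.Session() as sess:
    print(sess.run(tf.matmul(a, b)))            # matrix multiplication
    print(sess.run(tf.transpose(a)))            # transpose
    print(sess.run(tf.matrix_determinant(a)))   # determinant: 5.0
    print(sess.run(tf.matrix_inverse(a)))       # inverse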


Activation Functions
The idea of an activation function comes from the analysis of how a
neuron works in the human brain (see Figure 1-1). The neuron becomes
active beyond a certain threshold, better known as the activation potential.
It also attempts to put the output into a small range in most cases.
Sigmoid, hyperbolic tangent (tanh), ReLU, and ELU are the most popular activation functions. Let’s look at each of them.


Tangent Hyperbolic and Sigmoid


Figure 1-2 shows the tangent hyperbolic and sigmoid activation functions.

Figure 1-1. An activation function

Figure 1-2. Two popular activation functions


Here is the demo code:
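A sketch (the input values are illustrative):

import tensorflow as tf

x = tf.constant([-1.0, 0.0, 1.0])

with tf.Session() as sess:
    print(sess.run(tf.nn.tanh(x)))      # [-0.7615942  0.         0.7615942]
    print(sess.run(tf.nn.sigmoid(x)))   # [0.26894143 0.5        0.7310586 ]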

ReLU and ELU


Figure 1-3 shows the ReLU and ELU functions.

Figure 1-3. The ReLU and ELU functions


Here is the code to produce these functions:
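A sketch (the input values are illustrative):

import tensorflow as tf

x = tf.constant([-2.0, -1.0, 0.0, 1.0, 2.0])

with tf.Session() as sess:
    print(sess.run(tf.nn.relu(x)))   # [0. 0. 0. 1. 2.]
    print(sess.run(tf.nn.elu(x)))    # [-0.86466473 -0.63212055  0.  1.  2.]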

ReLU6
ReLU6 is similar to ReLU except that the output can never exceed the value 6.
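A sketch showing the cap at 6 (the input values are illustrative):

import tensorflow as tf

x = tf.constant([-3.0, 0.0, 3.0, 9.0])

with tf.Session() as sess:
    print(sess.run(tf.nn.relu6(x)))   # [0. 0. 3. 6.]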

Note that tanh is a rescaled logistic sigmoid function.


Loss Functions
The loss function (cost function) is to be minimized so as to get the best values for each parameter of the model. For example, you need to get the best value of the weight (slope) and bias (y-intercept) so as to explain the target (y) in terms of the predictor (X). The way to achieve the best values of the slope and the y-intercept is to minimize the cost function (the loss function, or the sum of squares). For any model, there are numerous parameters, and the model structure in prediction or classification is expressed in terms of the values of the parameters.
You need to evaluate your model, and for that you need to define the
cost function (loss function). The minimization of the loss function can
be the driving force for finding the optimum value of each parameter. For regression/numeric prediction, L1 or L2 can be a useful loss function. For classification, cross entropy can be a useful loss function. Softmax and sigmoid cross entropy are quite popular loss functions.

Loss Function Examples


Here is the code to demonstrate:
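A sketch of the L1 and L2 losses computed by hand (the predicted and actual values are illustrative):

import tensorflow as tf

predictions = tf.constant([1.0, 2.0, 3.0])
labels      = tf.constant([0.0, 2.5, 3.0])

l1_loss = tf.reduce_sum(tf.abs(predictions - labels))      # sum of absolute errors
l2_loss = tf.reduce_sum(tf.square(predictions - labels))   # sum of squared errors

with tf.Session() as sess:
    print(sess.run(l1_loss))   # 1.5
    print(sess.run(l2_loss))   # 1.25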

Common Loss Functions


The following is a list of the most common loss functions:

tf.contrib.losses.absolute_difference
tf.contrib.losses.add_loss
tf.contrib.losses.hinge_loss
tf.contrib.losses.compute_weighted_loss
tf.contrib.losses.cosine_distance
tf.contrib.losses.get_losses
tf.contrib.losses.get_regularization_losses
tf.contrib.losses.get_total_loss
tf.contrib.losses.log_loss
tf.contrib.losses.mean_pairwise_squared_error
tf.contrib.losses.mean_squared_error
tf.contrib.losses.sigmoid_cross_entropy
tf.contrib.losses.softmax_cross_entropy
tf.contrib.losses.sparse_softmax_cross_entropy

For example, a weighted log loss can be called as tf.contrib.losses.log_loss(predictions, labels, weight=2.0).


Optimizers
Now you should be convinced that you need to use a loss function to
get the best value of each parameter of the model. How can you get the
best value?
Initially you assume initial values of the weights and biases for the model (linear regression, etc.). Now you need a way to reach the best values of the parameters. The optimizer is that way. In each iteration, each parameter value changes in a direction
suggested by the optimizer. Suppose you have 16 weight values (w1, w2,
w3, …, w16) and 4 biases (b1, b2, b3, b4). Initially you can assume every
weight and bias to be zero (or one or any number). The optimizer suggests
whether w1 (and other parameters) should increase or decrease in the
next iteration while keeping the goal of minimization in mind. After many
iterations, w1 (and other parameters) would stabilize to the best value
(or values) of parameters.
In other words, TensorFlow, and every other deep learning framework,
provides optimizers that slowly change each parameter in order to
minimize the loss function. The purpose of the optimizers is to give
direction to the weight and bias for the change in the next iteration.
Assume that you have 64 weights and 16 biases; you try to change the
weight and bias values in each iteration (during backpropagation) so that
you get the correct values of weights and biases after many iterations while
trying to minimize the loss function.
Selecting the best optimizer for the model to converge fast and to learn
weights and biases properly is a tricky task.
Adaptive techniques (adadelta, adagrad, etc.) are good optimizers
for converging faster for complex neural networks. Adam is supposedly
the best optimizer for most cases. It also outperforms other adaptive
techniques (adadelta, adagrad, etc.), but it is computationally costly. For
sparse data sets, methods such as SGD, NAG, and momentum are not the
best options; the adaptive learning rate methods are. An additional benefit is that you won’t need to adjust the learning rate but can likely achieve the best results with the default value.

Optimizer Examples

Here is the code to demonstrate:
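A sketch using gradient descent to fit the linear model shown earlier (the training data and learning rate are illustrative assumptions):

import tensorflow as tf

# Model parameters and linear model, as before
W = tf.Variable([2.], dtype=tf.float32)
b = tf.Variable([3.], dtype=tf.float32)
x = tf.placeholder(tf.float32)
y = W * x + b

# Loss function and optimizer
y_true = tf.placeholder(tf.float32)
loss = tf.reduce_sum(tf.square(y - y_true))   # sum of squared errors
optimizer = tf.train.GradientDescentOptimizer(0.01)
train = optimizer.minimize(loss)

# Training data generated from y = 2x + 1
x_train = [1, 2, 3, 4]
y_train = [3, 5, 7, 9]

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for _ in range(1000):
        sess.run(train, feed_dict={x: x_train, y_true: y_train})
    print(sess.run([W, b]))   # W close to [2.], b close to [1.]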


Common Optimizers
The following is a list of common optimizers:
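Among the optimizers available in the tf.train module are the following:

tf.train.GradientDescentOptimizer
tf.train.AdadeltaOptimizer
tf.train.AdagradOptimizer
tf.train.MomentumOptimizer
tf.train.AdamOptimizer
tf.train.FtrlOptimizer
tf.train.ProximalGradientDescentOptimizer
tf.train.ProximalAdagradOptimizer
tf.train.RMSPropOptimizer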


Metrics
Having learned some ways to build a model, it is time to evaluate the
model. So, you need to evaluate the regressor or classifier.
There are many evaluation metrics, among which classification
accuracy, logarithmic loss, and area under ROC curve are the most popular
ones.
Classification accuracy is the ratio of the number of correct predictions to the number of all predictions. When the observations for each class are not badly skewed, accuracy can be considered a good metric.

tf.contrib.metrics.accuracy(actual_labels, predictions)

There are other evaluation metrics as well.

Metrics Examples
This section shows the code to demonstrate. Here you create actual values (calling them x) and predicted values (calling them y). Then you check the accuracy. Accuracy represents the ratio of the number of times the actual values equal the predicted values to the total number of instances.
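Reconstructed as a sketch from that description (the label values are illustrative):

import tensorflow as tf

x = tf.constant([1, 0, 1, 1, 0])   # actual values
y = tf.constant([1, 1, 1, 1, 0])   # predicted values

accuracy = tf.contrib.metrics.accuracy(x, y)

with tf.Session() as sess:
    print(sess.run(accuracy))   # 0.8, since 4 of the 5 predictions match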


Common Metrics
The following is a list of common metrics:
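Among the evaluation metrics available in the tf.metrics module are the following:

tf.metrics.accuracy
tf.metrics.auc
tf.metrics.mean_absolute_error
tf.metrics.mean_squared_error
tf.metrics.precision
tf.metrics.recall
tf.metrics.root_mean_squared_error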

