Commit c071459 (tensorflower-gardener, martinwicke): Adds an overview of the tf.learn linear model tools. Change: 125636045

# Large-scale Linear Models with TensorFlow

The tf.learn API provides (among other things) a rich set of tools for working
with linear models in TensorFlow. This document provides an overview of those
tools. It explains:

* what a linear model is.
* why you might want to use a linear model.
* how tf.learn makes it easy to build linear models in TensorFlow.
* how you can use tf.learn to combine linear models with deep learning to get
  the advantages of both.

Read this overview to decide whether the tf.learn linear model tools might be
useful to you. Then do the [Linear Models tutorial](wide/) to give it a try.
This overview uses code samples from the tutorial, but the tutorial walks
through the code in greater detail.

To understand this overview, it will help to have some familiarity with basic
machine learning concepts, and also with [tf.learn](../tflearn/).

[TOC]

## What is a linear model?

A *linear model* uses a single weighted sum of features to make a prediction.
For example, if you have
[data](https://archive.ics.uci.edu/ml/machine-learning-databases/adult/adult.names)
on age, years of education, and weekly hours of work for a population, you can
learn weights for each of those numbers so that their weighted sum estimates a
person's salary. You can also use linear models for classification.

Some linear models transform the weighted sum into a more convenient form. For
example, *logistic regression* plugs the weighted sum into the logistic
function to turn the output into a value between 0 and 1. But you still just
have one weight for each input feature.

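The weighted-sum idea fits in a few lines of plain Python. In this sketch the weights and bias are made-up numbers standing in for learned values, not output of any real training run:

```python
import math

# Hypothetical learned weights for [age, years_of_education, weekly_hours].
weights = [0.03, 0.2, 0.05]
bias = -3.0
features = [35, 16, 40]

# The weighted sum is the linear model's raw prediction.
score = bias + sum(w * x for w, x in zip(weights, features))

# Logistic regression squashes the score into a value between 0 and 1,
# which can be read as a class probability.
probability = 1.0 / (1.0 + math.exp(-score))
```

Training a linear model amounts to choosing the weights and bias that make these predictions match the observed labels.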
## Why would you want to use a linear model?

Why would you want to use so simple a model when recent research has
demonstrated the power of more complex neural networks with many layers?

Linear models:

* train quickly, compared to deep neural nets.
* can work well on very large feature sets.
* can be trained with algorithms that don't require a lot of fiddling with
  learning rates, etc.
* can be interpreted and debugged more easily than neural nets. You can
  examine the weights assigned to each feature to figure out what's having
  the biggest impact on a prediction.
* provide an excellent starting point for learning about machine learning.
* are widely used in industry.

## How does tf.learn help you build linear models?

You can build a linear model from scratch in TensorFlow without the help of a
special API. But tf.learn provides some tools that make it easier to build
effective large-scale linear models.

### Feature columns and transformations

Much of the work of designing a linear model consists of transforming raw data
into suitable input features. tf.learn uses the `FeatureColumn` abstraction to
enable these transformations.

A `FeatureColumn` represents a single feature in your data. A `FeatureColumn`
may represent a quantity like 'height', or it may represent a category like
'eye_color' whose value is drawn from a set of discrete possibilities such as
{'blue', 'brown', 'green'}.

For both *continuous features* like 'height' and *categorical features* like
'eye_color', a single value in the data might get transformed into a sequence
of numbers before it is input into the model. The `FeatureColumn` abstraction
lets you manipulate the feature as a single semantic unit in spite of this.
You can specify transformations and select features to include without dealing
with specific indices in the tensors you feed into the model.

#### Sparse columns

Categorical features in linear models are typically translated into a sparse
vector in which each possible value has a corresponding index or id. For
example, if there are only three possible eye colors, you can represent
'eye_color' as a length-3 vector: 'brown' would become [1, 0, 0], 'blue' would
become [0, 1, 0], and 'green' would become [0, 0, 1]. These vectors are called
"sparse" because they may be very long, with many zeros, when the set of
possible values is very large (such as all English words).

While you don't need to use sparse columns to use tf.learn linear models, one
of the strengths of linear models is their ability to deal with large sparse
vectors. Sparse features are a primary use case for the tf.learn linear model
tools.

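The encoding described above is easy to sketch in plain Python (this is an illustration of the idea, not how tf.learn implements it internally):

```python
# The set of possible values for the 'eye_color' example above.
keys = ["brown", "blue", "green"]

def one_hot(value):
    # Each possible value gets its own index; the vector is 1 at that
    # index and 0 everywhere else.
    return [1 if value == key else 0 for key in keys]
```

With a large vocabulary, the same scheme produces very long vectors that are almost entirely zeros, which is why a sparse representation (storing only the nonzero index) is used in practice.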
##### Encoding sparse columns

`FeatureColumn` handles the conversion of categorical values into vectors
automatically, with code like this:

```python
eye_color = tf.contrib.layers.sparse_column_with_keys(
    column_name="eye_color", keys=["blue", "brown", "green"])
```

where `eye_color` is the name of a column in your source data.

You can also generate `FeatureColumn`s for categorical features for which you
don't know all possible values. In this case you would use
`sparse_column_with_hash_bucket()`, which uses a hash function to assign
indices to feature values:

```python
education = tf.contrib.layers.sparse_column_with_hash_bucket(
    "education", hash_bucket_size=1000)
```

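The hash-bucket idea can be sketched in plain Python. Here the built-in `hash()` stands in for whatever hash function the library actually uses; the point is only that any string maps deterministically to one of a fixed number of indices:

```python
NUM_BUCKETS = 1000

def hash_bucket(value):
    # Any string maps to an index in [0, NUM_BUCKETS); distinct values may
    # occasionally collide into the same bucket, which is the trade-off for
    # not having to know the vocabulary in advance.
    return hash(value) % NUM_BUCKETS
```

A larger `hash_bucket_size` makes collisions rarer at the cost of a larger weight vector.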
##### Feature Crosses

Because linear models assign independent weights to separate features, they
can't learn the relative importance of specific combinations of feature
values. If you have a feature 'favorite_sport' and a feature 'home_city' and
you're trying to predict whether a person likes to wear red, your linear model
won't be able to learn that baseball fans from St. Louis especially like to
wear red.

You can get around this limitation by creating a new feature
'favorite_sport_x_home_city'. The value of this feature for a given person is
just the concatenation of the values of the two source features:
'baseball_x_stlouis', for example. This sort of combination feature is called
a *feature cross*.

The `crossed_column()` method makes it easy to set up feature crosses:

```python
sport = tf.contrib.layers.sparse_column_with_hash_bucket(
    "sport", hash_bucket_size=1000)
city = tf.contrib.layers.sparse_column_with_hash_bucket(
    "city", hash_bucket_size=1000)
sport_x_city = tf.contrib.layers.crossed_column(
    [sport, city], hash_bucket_size=int(1e4))
```

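At its core, a crossed feature's value is just the pairing of the source values, which then gets its own weight. A plain-Python sketch of that idea (not the tf.learn hashing implementation):

```python
def cross(sport, city):
    # The crossed value combines both source values, so the model can learn
    # a separate weight for each (sport, city) pair rather than only one
    # weight per sport and one per city.
    return "%s_x_%s" % (sport, city)
```

In `crossed_column()`, these combined values are then hashed into `hash_bucket_size` indices, just like an ordinary hash-bucket column.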
#### Continuous columns

You can specify a continuous feature like so:

```python
age = tf.contrib.layers.real_valued_column("age")
```

Although, as a single real number, a continuous feature can often be input
directly into the model, tf.learn offers useful transformations for this sort
of column as well.

##### Bucketization

*Bucketization* turns a continuous column into a categorical column. This
transformation lets you use continuous features in feature crosses, or learn
cases where specific value ranges have particular importance.

Bucketization divides the range of possible values into subranges called
buckets:

```python
age_buckets = tf.contrib.layers.bucketized_column(
    age, boundaries=[18, 25, 30, 35, 40, 45, 50, 55, 60, 65])
```

The bucket into which a value falls becomes the categorical label for that
value.

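The mapping from a value to its bucket can be sketched in plain Python, assuming left-inclusive boundaries (values below the first boundary go to bucket 0, values at or above the last boundary go to the final bucket); this is an illustration, not the tf.learn implementation:

```python
import bisect

boundaries = [18, 25, 30, 35, 40, 45, 50, 55, 60, 65]

def bucket_index(value):
    # bisect_right counts how many boundaries are <= value, which is
    # exactly the index of the bucket the value falls into.
    return bisect.bisect_right(boundaries, value)
```

The bucket index then plays the role of a categorical value, so it can be one-hot encoded or crossed with other columns.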
#### Input function

`FeatureColumn`s provide a specification for the input data for your model,
indicating how to represent and transform the data. But they do not provide
the data itself. You provide the data through an input function.

The input function must return a dictionary of tensors. Each key corresponds
to the name of a `FeatureColumn`. Each key's value is a tensor containing the
values of that feature for all data instances. See `input_fn` in the
[linear models tutorial code](https://www.tensorflow.org/code/tensorflow/examples/learn/wide_n_deep_tutorial.py?l=160)
for an example of an input function.

The input function is passed to the `fit()` and `evaluate()` calls that
initiate training and testing, as described in the next section.

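The shape of what an input function returns can be sketched without TensorFlow. Here plain Python lists stand in for tensors, the column names come from the examples above, and the `(features, labels)` pairing follows the tutorial's `input_fn`:

```python
def input_fn():
    # One entry per FeatureColumn name; each list stands in for a tensor
    # holding that feature's value for every example in the batch.
    features = {
        "age": [35, 42, 28],
        "eye_color": ["brown", "blue", "green"],
    }
    # One label per example.
    labels = [1, 0, 1]
    return features, labels
```

In a real model the lists would be `tf.Tensor` or sparse tensor objects built from your dataset.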
### Linear estimators

tf.learn's estimator classes provide a unified training and evaluation harness
for regression and classification models. They take care of the details of the
training and evaluation loops and allow the user to focus on model inputs and
architecture.

To build a linear estimator, you can use either the
`tf.contrib.learn.LinearClassifier` estimator or the
`tf.contrib.learn.LinearRegressor` estimator, for classification and
regression respectively.

As with all tf.learn estimators, to run the estimator you just:

1. Instantiate the estimator class. For the two linear estimator classes,
   you pass a list of `FeatureColumn`s to the constructor.
2. Call the estimator's `fit()` method to train it.
3. Call the estimator's `evaluate()` method to see how it does.

For example:

```python
e = tf.contrib.learn.LinearClassifier(
    feature_columns=[
        native_country, education, occupation, workclass, marital_status,
        race, age_buckets, education_x_occupation,
        age_buckets_x_race_x_occupation],
    model_dir=YOUR_MODEL_DIRECTORY)
e.fit(input_fn=input_fn_train, steps=200)

# Evaluate for one step (one pass through the test data).
results = e.evaluate(input_fn=input_fn_test, steps=1)

# Print the stats for the evaluation.
for key in sorted(results):
    print("%s: %s" % (key, results[key]))
```

### Wide and deep learning

The tf.learn API also provides an estimator class that lets you jointly train
a linear model and a deep neural network. This novel approach combines the
ability of linear models to "memorize" key features with the generalization
ability of neural nets. Use `tf.contrib.learn.DNNLinearCombinedClassifier` to
create this sort of "wide and deep" model:

```python
e = tf.contrib.learn.DNNLinearCombinedClassifier(
    model_dir=YOUR_MODEL_DIR,
    linear_feature_columns=wide_columns,
    dnn_feature_columns=deep_columns,
    dnn_hidden_units=[100, 50])
```

For more information, see the [Wide and Deep Learning tutorial](../wide_n_deep/).
