
Commit e79232f

Merge pull request tensorflow#4334 from gariel-google/master
Added the MorphNet library
2 parents 81d7766 + 7968028 commit e79232f

29 files changed (+3441, -0 lines)

CODEOWNERS

Lines changed: 1 addition & 0 deletions
```diff
@@ -24,6 +24,7 @@
 /research/lm_1b/ @oriolvinyals @panyx0718
 /research/marco/ @vincentvanhoucke
 /research/maskgan/ @a-dai
+/research/morph_net/ @gariel-google
 /research/namignizer/ @knathanieltucker
 /research/neural_gpu/ @lukaszkaiser
 /research/neural_programmer/ @arvind2505
```

research/morph_net/README.md

Lines changed: 84 additions & 0 deletions
# MorphNet: Fast & Simple Resource-Constrained Structure Learning of Deep Networks

[TOC]

## What is MorphNet?

MorphNet is a method for learning deep network structure during training. The
key principle is continuous relaxation of the network-structure learning
problem. Specifically, we use regularizers that induce sparsity in the space of
activations of the network. The regularizers can be tailored to target the
consumption of specific resources by the network, such as FLOPs or model size.
When such a regularizer is added to the training loss and their sum is
minimized via stochastic gradient descent or a similar optimizer, the learning
problem also becomes a constrained optimization of the network's structure,
under the constraint represented by the regularizer. The method is described in
detail in [this paper](https://arxiv.org/abs/1711.06798), to appear in
[CVPR 2018](http://cvpr2018.thecvf.com/).
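
Schematically, with λ standing for the regularizer strength (the
`regularizer_strength` of the code below, a notation chosen here for
illustration), training minimizes the combined objective

$$\mathcal{L}_{\text{total}} = \mathcal{L}_{\text{training}} + \lambda \cdot \mathcal{R}_{\text{resource}},$$

so λ trades off task performance against consumption of the targeted resource.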
## Adding a MorphNet regularizer to your training code

Your interaction with the MorphNet codebase will most likely be through
subclasses of `NetworkRegularizer`. Each subclass represents a resource that we
wish to target/constrain when optimizing the network. The MorphNet package
provides several `NetworkRegularizer`s in the `network_regularizers` directory,
as well as a framework for writing your own. The framework is described in
detail [here](g3doc/regularizers_framework.md). The interface of
`NetworkRegularizer` is given
[here](g3doc/regularizers_framework.md?#network-regularizers).

To apply a `NetworkRegularizer` to your network, your code would look similar
to the example below. The example uses a specific type of `NetworkRegularizer`
that targets FLOPs; to keep the discussion simple we restrict it to this case,
but the generalization to an arbitrary constrained resource, and to an
arbitrary regularization method that targets it, is straightforward.

```python
my_gamma_threshold = 1e-3
regularizer_strength = 1e-9
network_reg = network_regularizers.GammaFlopsRegularizer(
    [my_network_output.op], my_gamma_threshold)
my_training_loss += regularizer_strength * network_reg.get_regularization_term()
tf.summary.scalar('FLOPs', network_reg.get_cost())
```

Once you start your training, your TensorBoard will display the effective FLOP
count of the model. "Effective" means that as activations are zeroed out by the
regularizer, their impact on the FLOP count is discounted.

![TensorBoardDisplayOfFlops](g3doc/tensorboard.png "Example of the TensorBoard
display of the resource regularized by MorphNet.")

The larger the `regularizer_strength`, the smaller the effective FLOP count to
which the network will converge. If `regularizer_strength` is large enough, the
FLOP count will collapse to zero, whereas if it is small enough, the FLOP count
will remain at its initial value and the network structure will not change.
`regularizer_strength` is your knob for controlling where you want to be on the
price-performance curve. The `my_gamma_threshold` parameter determines when an
activation is considered alive. It is described in more detail
[here](framework/README.md?#the-opregularizer-interface), including an
explanation of how to tune it.
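
Purely as an illustration of how one might explore that curve, the sketch below
sweeps `regularizer_strength` over a few orders of magnitude; `train_and_eval`
is a hypothetical placeholder for your own training/evaluation routine, not
part of the MorphNet API.

```python
# Hypothetical sweep of regularizer_strength to trace the price-performance
# curve; train_and_eval is a stand-in for your own training/eval loop.
for strength in (1e-10, 1e-9, 1e-8):
  flops, accuracy = train_and_eval(regularizer_strength=strength)
  print('strength=%.0e  FLOPs=%.3e  accuracy=%.4f' % (strength, flops, accuracy))
```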
## Extracting the architecture learned by MorphNet

One way to extract the structure is to query the `network_reg` object created
above. To find out which activations in a given op were kept alive (as opposed
to removed) by MorphNet, your code would look similar to

```python
alive = sess.run(network_reg.opreg_manager.get_regularizer(op).alive_vector)
```

where `op` is the TensorFlow op in question and `sess` is a `tf.Session`
object. The result is a vector of booleans designating which activations were
kept alive (more details can be found
[here](framework/README.md?#the-opregularizer-interface)). Typically one is
interested in the number of alive activations, which can be obtained by
counting the `True` values in `alive`. Looping over all convolutions and/or
fully connected layers (as `op`) is typically sufficient to extract the full
structure learned by MorphNet; a sketch of such a loop follows below.
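
A minimal sketch of that loop, assuming the convolutions and fully connected
layers of interest appear in the graph as `Conv2D` and `MatMul` ops, and that
`get_regularizer` returns `None` for ops it does not track; the op-type filter
and the result dictionary are illustrative choices, not part of the MorphNet
API.

```python
import tensorflow as tf

learned_structure = {}
for op in tf.get_default_graph().get_operations():
  if op.type not in ('Conv2D', 'MatMul'):
    continue  # only convolutions and fully connected layers are of interest
  regularizer = network_reg.opreg_manager.get_regularizer(op)
  if regularizer is None:
    continue  # assumed behavior for ops the regularizer does not track
  alive = sess.run(regularizer.alive_vector)
  learned_structure[op.name] = int(alive.sum())  # number of alive activations
```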
80+
81+
## Maintainers
82+
83+
* Elad Eban
84+
* Ariel Gordon, github: [gariel-google](https://github.com/gariel-google).

research/morph_net/__init__.py

Whitespace-only changes.
