[mobile] Mobile Perf Recipe #1031

Merged

jlin27 merged 2 commits into pytorch:release/1.6 from IvanKobzarev:recipe_mobile_perf on Jun 24, 2020

Conversation

IvanKobzarev
Contributor

No description provided.

@netlify

netlify bot commented Jun 16, 2020

Deploy preview for pytorch-tutorials-preview ready!

Built with commit 7b182fe

https://deploy-preview-1031--pytorch-tutorials-preview.netlify.app

@IvanKobzarev force-pushed the recipe_mobile_perf branch 2 times, most recently from a3c5fd7 to bac6bf6 on June 17, 2020 22:01
::

from torch.utils.mobile_optimizer import optimize_for_mobile
traced_model = torch.jit.load("input_model_path")

Total nitpick: not all TorchScript models are traced; see torch.jit.script() and method decoration. I'd name this variable torchscript_model or similar for clarity and accuracy.
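A minimal sketch of the suggested rename, which also covers models produced by torch.jit.script() rather than tracing ("input_model_path" is a placeholder from the snippet above):

import torch
from torch.utils.mobile_optimizer import optimize_for_mobile

# The loaded module may have come from torch.jit.trace() or torch.jit.script(),
# so a neutral name avoids implying it was traced.
torchscript_model = torch.jit.load("input_model_path")
optimized_model = optimize_for_mobile(torchscript_model)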


from torch.utils.mobile_optimizer import optimize_for_mobile
traced_model = torch.jit.load("input_model_path")
optimized_model = optimize_for_mobile(traced_model)

There are valid TorchScript models (googlenet & inception_v3 in torchvision) that segfault on this line. Other than that, the instructions work and there's a modest (<=2.2%) improvement in file size for these models:

mobilenet_v2
resnet18
alexnet
squeezenet1_0
vgg16
densenet161
shufflenet_v2_x1_0
mnasnet1_0


2. Fuse operators using ``torch.quantization.fuse_modules``
Do not be confused that fuse_modules is in the quantization package.
It works for all types of torch script modules.

Two things:

  1. PMM is going to come back and say that we should write it as TorchScript.
  2. The code below does not pass a TorchScript module to fuse_modules() - that MobileNet v2 from TorchVision is not a subclass of ScriptModule.

Passing in the original torchvision module itself or a version of it processed by torch.jit.script() works, with the latter giving ~2% file size improvement.
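For illustration, a minimal sketch of fuse_modules on a toy eager-mode module (the class and layer names are hypothetical, not from the recipe):

import torch

class ConvBnReLU(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.conv = torch.nn.Conv2d(3, 8, 3)
        self.bn = torch.nn.BatchNorm2d(8)
        self.relu = torch.nn.ReLU()

    def forward(self, x):
        return self.relu(self.bn(self.conv(x)))

m = ConvBnReLU().eval()  # conv+bn fusion requires eval mode
# fuse_modules takes the module plus lists of submodule names to fold together
fused = torch.quantization.fuse_modules(m, [["conv", "bn", "relu"]])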

m = torchvision.models.mobilenet_v2(pretrained=True)
m.eval()
fuse_model(m)
torch.jit.trace(m, torch.rand(1, 3, 224, 224)).save("mobilenetV2-bnfused.pt")

Should we be guiding people to torch.jit.trace()? torch.jit.script() preserves control flow; trace() does not. trace() is still there for cases where script() hits an unsupported op.
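A small sketch of the difference, using a hypothetical module with data-dependent control flow:

import torch

class Gate(torch.nn.Module):
    def forward(self, x):
        if x.sum() > 0:   # data-dependent branch
            return x * 2
        return x - 1

m = Gate()
scripted = torch.jit.script(m)               # keeps both branches
traced = torch.jit.trace(m, torch.ones(3))   # records only the branch taken for this input (with a TracerWarning)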

model.eval()
script_model = torch.jit.script(model)
x = torch.rand(1, 3, 224, 224)
y = script_model(x)

y is never used, which means we could do without x as well.


supported_qengines = torch.backends.quantized.supported_engines
print(supported_qengines)
model = torchvision.models.quantization.__dict__['mobilenet_v2'](pretrained=True, quantize=True)

@fbbradheintz commented Jun 17, 2020

Two things:

  1. Is this the quantization workflow we want to show? This looks like it's just pulling down a pre-quantized version of the model, in which case this recipe is only useful if you want to quantize the torchvision models. In the general case, I'd think people would want to be able to quantize their own trained models for mobile deployment.
  2. On MacOS, this line throws a warning:
/Users/bradheintz/anaconda2/envs/pyto16pre/lib/python3.8/site-packages/torch/nn/quantized/modules/utils.py:8: UserWarning: 0quantize_tensor_per_tensor_affine current rounding mode is not set to round-to-nearest-ties-to-even (FE_TONEAREST). This will cause accuracy issues in quantized models. (Triggered internally at  ../aten/src/ATen/native/quantized/affine_quantizer.cpp:25.)
  qweight = torch.quantize_per_tensor(
/Users/bradheintz/anaconda2/envs/pyto16pre/lib/python3.8/site-packages/torch/quantization/observer.py:134: UserWarning: must run observer before calling calculate_qparams.                                    Returning default scale and zero point
  warnings.warn(

(And yes, it really looks like that in my terminal.) The warning came up in this env:

# torch 1.6.0a0+55bcb5d built from master with USE_CUDA=0
# torchvision 0.7.0a0+148bac2 built from master with USE_CUDA=0
# python 3.8.0
# MacOS 10.15.4

@raghuramank100 commented Jun 18, 2020

We should show a workflow where we start with a floating point model and then do the quantization. The steps are:

# Start with a fully trained floating point model.
# The model code is modified to enable eager mode quantization; for more details,
# please see the quantization tutorial at https://pytorch.org/tutorials/advanced/static_quantization_tutorial.html

model = torchvision.models.quantization.__dict__['resnet18'](pretrained=True, quantize=False)
torch.backends.quantized.engine = 'qnnpack'
# Convert the float model with the appropriate qconfig
model.eval()
model.qconfig = torch.quantization.get_default_qconfig('qnnpack')
model = torch.quantization.prepare(model)   # prepare() is not in-place by default
# Run the model with representative data for calibration (see the sketch below)
# model(calibration_data)
model = torch.quantization.convert(model)   # convert() is not in-place by default
script_model = torch.jit.script(model)

# Export to mobile
script_model._save_for_lite_interpreter("model.bc")
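
To make the calibration comment above concrete, a minimal sketch assuming a hypothetical calibration_loader that yields representative input batches:

# Feed representative inputs through the prepared model so the observers
# can record activation statistics; no labels or backward pass are needed.
with torch.no_grad():
    for calibration_data in calibration_loader:   # hypothetical DataLoader
        model(calibration_data)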

model = torchvision.models.quantization.__dict__['mobilenet_v2'](pretrained=True, quantize=True)
torch.backends.quantized.engine='qnnpack'
model.eval()
script_model = torch.jit.script(model)

I don't actually know the answer to this: Is it preferable to do quantization before or after TorchScript conversion, or does it matter at all?

Currently, PyTorch only supports quantization prior to scripting. We are working on adding support for quantization after scripting, but it is not part of the 1.6 release yet.

@IvanKobzarev force-pushed the recipe_mobile_perf branch 3 times, most recently from 3ababf9 to f11a36c on June 18, 2020 06:12
@IvanKobzarev changed the title from [WIP][mobile] Mobile Perf Recipe to [mobile] Mobile Perf Recipe on Jun 18, 2020

@fbbradheintz left a comment

This looks great; I'm still tweaking my Android custom build env for the last couple of recipes.

import torch
from torch.utils.mobile_optimizer import optimize_for_mobile

class AnnotatedConvBnReLUModel(torch.nn.Module):

Including a sample model here is a great idea - this will help users generalize to their own use case.

@IvanKobzarev force-pushed the recipe_mobile_perf branch 2 times, most recently from e9d31e3 to 3839516 on June 18, 2020 19:05

@fbbradheintz left a comment

Added notes for one bug in the quantization step

::

model.qconfig = torch.quantization.get_default_qconfig('qnnpack')
torch.quantization.prepare(model)

prepare() has inplace=False by default, and the same goes for convert(), so this whole method is a no-op except for setting model.qconfig.

We either need to do:

model = torch.quantization.prepare(model)
# calibration
return torch.quantization.convert(model)

or:

torch.quantization.prepare(model, inplace=True)
# calibration
torch.quantization.convert(model, inplace=True)

@jlin27 self-requested a review on June 23, 2020 21:24
@jlin27 changed the base branch from master to release/1.6 on June 24, 2020 02:13
@jlin27 merged commit 6285b8f into pytorch:release/1.6 on Jun 24, 2020