
BatchNorm is not fused to Conv layer in TFLite conversion from customized QAT model? #98324

@aidevmin

Description

Issue type

Performance

Have you reproduced the bug with TensorFlow Nightly?

No

Source

source

TensorFlow version

2.12

Custom code

No

OS platform and distribution

No response

Mobile device

No response

Python version

No response

Bazel version

No response

GCC/compiler version

No response

CUDA/cuDNN version

No response

GPU model and memory

No response

Current behavior?

I have a problem with a customized QAT model. I am using TF 2.12.

Step 1: First, I have an fp32 model and create a QAT model from it:
quant_aware_model = tfmot.quantization.keras.quantize_model(base_model)
After that I convert quant_aware_model to a TFLite model and inspect the TFLite model with Netron. I see that BatchNorm is fused with the Conv layer.
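For reference, this is roughly what step 1 looks like end to end (a minimal sketch; the toy Conv/BatchNorm/ReLU model, the converter flags, and the output file name are placeholders, not the actual model):

```python
import tensorflow as tf
import tensorflow_model_optimization as tfmot

# Toy fp32 model standing in for the real base_model: a Conv + BatchNorm + ReLU
# block, i.e. the pattern that should end up fused after conversion.
base_model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(8, 3, input_shape=(32, 32, 3)),
    tf.keras.layers.BatchNormalization(),
    tf.keras.layers.ReLU(),
])

# Step 1: whole-model QAT.
quant_aware_model = tfmot.quantization.keras.quantize_model(base_model)
# ... fine-tune quant_aware_model here ...

# Convert the quantization-aware model to TFLite.
converter = tf.lite.TFLiteConverter.from_keras_model(quant_aware_model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
tflite_model = converter.convert()

# Write the .tflite file and open it in Netron to check Conv/BatchNorm fusion.
with open("qat_step1.tflite", "wb") as f:
    f.write(tflite_model)
```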

Step 2: Then I added a new Multiply layer to the above fp32 model. Because this layer is not supported by QAT by default, I followed the approach described in https://www.tensorflow.org/model_optimization/guide/quantization/training_comprehensive_guide#quantize_some_layers and used tfmot.quantization.keras.quantize_apply. After that I converted the new QAT model to a TFLite model and checked the TFLite model with Netron. I see that BatchNorm is not fused into the Conv layer ==> inference time increases a lot for the TFLite model.
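For illustration, this is roughly the step 2 flow, following the pattern from the linked guide (a minimal sketch; the toy model, the annotate helper, and the choice to leave Multiply unannotated are assumptions for the example, not the actual model):

```python
import tensorflow as tf
import tensorflow_model_optimization as tfmot

quantize_annotate_layer = tfmot.quantization.keras.quantize_annotate_layer

# Toy fp32 model with a Multiply layer, standing in for the real model.
inputs = tf.keras.Input(shape=(32, 32, 3))
x = tf.keras.layers.Conv2D(8, 3, padding="same")(inputs)
x = tf.keras.layers.BatchNormalization()(x)
x = tf.keras.layers.ReLU()(x)
scale = tf.keras.layers.Conv2D(8, 1, padding="same")(x)
x = tf.keras.layers.Multiply()([x, scale])
base_model = tf.keras.Model(inputs, x)

# Annotate everything except the input and the unsupported Multiply layer.
def annotate(layer):
    if isinstance(layer, (tf.keras.layers.InputLayer, tf.keras.layers.Multiply)):
        return layer  # left unannotated (or wrapped with a custom QuantizeConfig)
    return quantize_annotate_layer(layer)

annotated_model = tf.keras.models.clone_model(base_model, clone_function=annotate)
quant_aware_model = tfmot.quantization.keras.quantize_apply(annotated_model)

# TFLite conversion is the same as in step 1.
converter = tf.lite.TFLiteConverter.from_keras_model(quant_aware_model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
tflite_model = converter.convert()
```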

As I understand it, BatchNorm can be fused into the Conv layer only when using tfmot.quantization.keras.quantize_model. Is that right?

With step 2, what do I need to do so that BatchNorm is fused into the Conv layer and inference time is reduced?

Thank you.

Standalone code to reproduce the issue

No

Relevant log output

Metadata

Labels

TF 2.12 (issues related to TensorFlow 2.12), comp:lite (TF Lite related issues), type:performance (performance issue)
