diff --git a/_posts/2023-04-18-flatbuffer-mobile-models.md b/_posts/2023-04-18-flatbuffer-mobile-models.md
new file mode 100644
index 000000000000..cb73d815b1d7
--- /dev/null
+++ b/_posts/2023-04-18-flatbuffer-mobile-models.md
@@ -0,0 +1,201 @@
+Blazingly Fast PyTorch Mobile Model Loading with Flatbuffer
+TL;DR
+To reduce the overall model loading time experienced in mobile applications, we added another serialization format based on flatbuffers.
+We observed that a model stored as a single flatbuffer file can be loaded 10x+ faster than one stored in the pickle-based format, a step-function improvement needed to meet mobile use cases.
+Why?
+Time to first inference is critical in many edge applications, which aim at more interactive user experiences and thus track the critical metric of “user perceived latency”.
+Interactions like a voice action (e.g. “take a picture”) would be degraded meaningfully by a delayed response. Thus, it’s common to set a high bar (e.g. 200ms) for the maximum permissible response time. This becomes even more important on wearable devices, where resource constraints mean applications are commonly loaded and unloaded regularly.
+What do we cover in this blog?
+At a high level, we have split model loading into 3 steps: read from storage, deserialization, and runtime initialization. This potentially allows amortizing their cost across different points in an application's lifecycle, and provides a clean separation of concerns between model representation/deserialization and runtime initialization.
+Read from storage simply means reading the file from disk into memory as raw bytes. Deserialization means creating manipulable in-memory structures from the raw bytes. Finally, runtime initialization means creating the in-memory torch::jit::mobile::Module, ready for inference.
+Benchmarks
+In our benchmarks, we load a set of models on different mobile devices, using both the pickle and flatbuffer formats, and measure the load latency.
+The models are loaded in C++ using the torch::jit::mobile::_load_for_mobile(path) API.
+Models used for the test:
+Resnet50, as available with model = torch.hub.load('pytorch/vision:v0.10.0', 'deeplabv3_resnet50', pretrained=True). This model has a file size of 161MB.
+Resnet101, as available with model = torch.hub.load('pytorch/vision:v0.10.0', 'deeplabv3_resnet101', pretrained=True). This model has a file size of 234MB.
+Silero speech-to-text, as available with model, _, _ = torch.hub.load(repo_or_dir='snakers4/silero-models', model='silero_stt', language='en'). This model has a file size of 112MB.
+The devices used for the test are iPhone X, iPhone 13 Pro, Pixel 4 and Pixel 6 Pro.
+Results:
+Resnet 50:
+
+| device | flatbuffer mean (ms) | flatbuffer p90 (ms) | pickle mean (ms) | pickle p90 (ms) | mean ratio | p90 ratio |
+| --- | --- | --- | --- | --- | --- | --- |
+| iPhone X | 4.23 | 4.26 | 74.24 | 75.05 | 17.55 | 17.62 |
+| iPhone 13 Pro | 2.60 | 2.64 | 44.72 | 44.80 | 17.19 | 17.00 |
+| Pixel 4 | 4.23 | 4.44 | 99.02 | 101.13 | 23.39 | 22.78 |
+| Pixel 6 Pro | 3.13 | 3.18 | 77.92 | 79.98 | 24.86 | 25.14 |
+
+Resnet 101:
+
+| device | flatbuffer mean (ms) | flatbuffer p90 (ms) | pickle mean (ms) | pickle p90 (ms) | mean ratio | p90 ratio |
+| --- | --- | --- | --- | --- | --- | --- |
+| iPhone X | 7.31 | 7.45 | 106.72 | 107.53 | 14.58 | 14.43 |
+| iPhone 13 Pro | 4.17 | 4.22 | 64.60 | 64.81 | 15.49 | 15.36 |
+| Pixel 4 | 6.07 | 6.21 | 142.35 | 146.51 | 23.43 | 23.59 |
+| Pixel 6 Pro | 4.42 | 4.48 | 101.42 | 104.41 | 22.97 | 23.33 |
+
+Silero STT:
+
+| device | flatbuffer mean (ms) | flatbuffer p90 (ms) | pickle mean (ms) | pickle p90 (ms) | mean ratio | p90 ratio |
+| --- | --- | --- | --- | --- | --- | --- |
+| iPhone X | 5.37 | 5.46 | 56.79 | 57.14 | 10.57 | 10.47 |
+| iPhone 13 Pro | 3.16 | 3.20 | 33.54 | 33.59 | 10.62 | 10.48 |
+| Pixel 4 | 4.35 | 4.47 | 78.48 | 79.86 | 18.03 | 17.87 |
+| Pixel 6 Pro | 3.39 | 3.43 | 57.02 | 58.69 | 16.80 | 17.11 |
+
+Why is it so fast?
+
+Deserialization for flatbuffer is almost free.
+Recall the 3 stages of model loading: read from disk, deserialization and initialization. The read-from-disk stage costs about the same for both the flatbuffer and pickle formats, because the file sizes of the two formats are very similar.
However, deserialization for pickle means running the Unpickler, a stack machine that interprets pickle bytecode. Unpickling produces a tuple of torch::jit::IValues.
+
+Flatbuffer files have the property that the on-disk layout matches the in-memory layout. Therefore the deserialization step, which is provided by the flatbuffer library, is just pointer arithmetic followed by a cast:
+(https://github.com/google/flatbuffers/blob/3fda20d7c7fe1f8006210bddae8cb55bc7a74c3b/include/flatbuffers/buffer.h#L132)
+
+template<typename T> T *GetMutableRoot(void *buf) {
+  EndianCheck();
+  return reinterpret_cast<T *>(
+      reinterpret_cast<uint8_t *>(buf) +
+      EndianScalar(*reinterpret_cast<uoffset_t *>(buf)));
+}
+
+Of course, accessing individual fields of the cast flatbuffer object also costs some CPU work, such as additional pointer arithmetic and endianness checks, but these operations are much faster than running the unpickler's stack machine.
+
+Flatbuffer’s initialization step is also more efficient:
+
+A PyTorch model consists of two parts:
+The tensor weights of the model.
+The model’s code, serialized as stack machine bytecode that TorchScript’s lite interpreter executes when the model runs on device, along with non-tensor constants (such as class-name strings, precomputed error messages, etc.).
+
+Loading the weights is just reading bytes from disk into memory, and should cost the same for both the pickle and flatbuffer formats.
+
+However, when loading instructions, the flatbuffer format has a major advantage.
+
+In the flatbuffer format, each instruction is stored in the Instruction flatbuffer struct, whose generated code looks like the following:
+FLATBUFFERS_MANUALLY_ALIGNED_STRUCT(4) Instruction FLATBUFFERS_FINAL_CLASS {
+private:
+  int8_t op_;
+  int8_t padding0__;
+  uint16_t n_;
+  int32_t x_;
+.......
+}
+This happens to be the same layout as the struct Instruction defined in PyTorch, so parsing an instruction is just a memcpy.
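A rough illustration of the fixed-layout idea, written in Python rather than the C++ the runtime uses. It assumes the 8-byte, little-endian field order shown in the struct above (int8 op, int8 padding, uint16 n, int32 x); the opcode table and the function names are hypothetical and exist only for this sketch. Decoding from the fixed layout is a single read at a known offset, whereas a pickled (opcode name, n, x) tuple has to go through the unpickler and a name lookup first.

```python
import pickle
import struct

# Assumed layout mirroring the fields listed in the post:
# int8 op, int8 padding, uint16 n, int32 x (8 bytes, little-endian, no implicit padding).
INSTRUCTION_LAYOUT = struct.Struct("<bbHi")

# Hypothetical opcode table; the real runtime defines its own opcodes.
OPCODES = {"OP": 0, "LOAD": 1, "STORE": 2}


def decode_fixed_layout(buf: bytes, index: int) -> tuple:
    """Read the index-th instruction directly from its offset in the buffer."""
    offset = index * INSTRUCTION_LAYOUT.size
    op, _padding, n, x = INSTRUCTION_LAYOUT.unpack_from(buf, offset)
    return op, n, x


def decode_pickled(blob: bytes) -> tuple:
    """Run the unpickler to get (opcode name, n, x), then map the name to its code."""
    name, n, x = pickle.loads(blob)
    return OPCODES[name], n, x


# Two instructions stored back to back in the fixed layout.
fixed = INSTRUCTION_LAYOUT.pack(1, 0, 3, -7) + INSTRUCTION_LAYOUT.pack(2, 0, 5, 9)
# The same second instruction, stored as a pickled tuple.
pickled = pickle.dumps(("STORE", 5, 9))

second_fixed = decode_fixed_layout(fixed, 1)
second_pickled = decode_pickled(pickled)
```

Both paths yield the same instruction; the difference is how much work happens before the fields are usable.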
+With pickle, however, every instruction is serialized as a tuple of (string opcode, int n, int x), and this tuple is then pickled. To deserialize, we first unpickle to get the tuple back, then construct an Instruction struct by converting the string opcode to an int8 opcode.
+
+How to create a flatbuffer model file:
+When saving a model for mobile, we can pass the optional _use_flatbuffer argument to produce the flatbuffer format:
+
+Example (Python):
+
+>>> m = torch.jit.load(...) # m is a ScriptModule
+>>> m._save_for_lite_interpreter('/tmp/hello.ff', _use_flatbuffer=True)
+
+Or, in C++:
+
+torch::jit::Module m = ...;
+bool _use_flatbuffer = true;
+m._save_for_mobile(
+    filename, _extra_files, _save_mobile_debug_info, _use_flatbuffer);
+
+The loading side is unchanged:
+_load_for_lite_interpreter (Python) or _load_for_mobile (C++, and its corresponding bindings) will just work, regardless of which format the file uses.
+Conclusion:
+We present a new flatbuffer-based file format that significantly speeds up model loading on mobile devices.
+ diff --git a/assets/hub/flatbuffer_benchmarks.csv b/assets/hub/flatbuffer_benchmarks.csv new file mode 100644 index 000000000000..00b8b58996ba --- /dev/null +++ b/assets/hub/flatbuffer_benchmarks.csv @@ -0,0 +1,121 @@ +,mean,p0,p10,p50,p90,p100,stdev,MAD,cv,model_name,model_type,device,type +0,74.24220000000001,54.859,54.8671,54.941500000000005,75.04919999999993,246.573,57.44458544336447,0.07800000000000296,0.7737457327956938,resnet50,pickle,iPhone X,LOAD latency +1,3.1715000000000004,3.056,3.0884,3.166,3.2620999999999998,3.371,0.08595260321828536,0.052999999999999936,0.02710156179041001,resnet50,pickle,iPhone X,UNLOAD latency +2,172068044.8,171655168.0,172068044.8,172113920.0,172113920.0,172113920.0,137625.6,0.0,0.0007998324160652054,resnet50,pickle,iPhone X,LOAD_MEM memory from start +3,169192652.8,168869888.0,168869888.0,168869888.0,169546547.2,171655168.0,831366.5235402253,0.0,0.004913727102103959,resnet50,pickle,iPhone X,LOAD_MEM_DELTA memory increase this iteration +4,210598297.6,210141184.0,210598297.60000002,210649088.0,210649088.0,210649088.0,152371.20000000004,0.0,0.0007235158201012924,resnet50,pickle,iPhone X,LOAD_MEM_PEAK peak memory after this iteration +5,44.7228,31.563,31.5774,31.6445,44.80239999999995,162.463,39.246762352581385,0.04349999999999987,0.8775560195824363,resnet50,pickle,iPhone 13 Pro,LOAD latency +6,1.4985,1.466,1.4741,1.487,1.5508000000000002,1.558,0.030519665791092793,0.012499999999999956,0.02036681067139993,resnet50,pickle,iPhone 13 Pro,UNLOAD latency +7,171999232.0,171556864.0,171999232.0,172048384.0,172048384.0,172048384.0,147456.0,0.0,0.0008573061535530577,resnet50,pickle,iPhone 13 Pro,LOAD_MEM memory from start +8,169186099.2,168869888.0,168869888.0,168869888.0,169566208.0,171556864.0,802851.1078037821,0.0,0.004745372767621455,resnet50,pickle,iPhone 13 Pro,LOAD_MEM_DELTA memory increase this iteration 
+9,213942272.0,213499904.0,213942272.0,213991424.0,213991424.0,213991424.0,147456.0,0.0,0.0006892326543115331,resnet50,pickle,iPhone 13 Pro,LOAD_MEM_PEAK peak memory after this iteration +10,99.0239,97.865,97.874,98.34,101.12559999999999,103.246,1.6364970791296876,0.3895000000000053,0.016526283847936585,resnet50,pickle,Pixel 4,LOAD latency +11,10.5495,10.107,10.336500000000001,10.525,10.9506,10.956,0.24685430925953056,0.13949999999999996,0.02339962171283289,resnet50,pickle,Pixel 4,UNLOAD latency +12,171880448.0,171814912.0,171870208.0,171886592.0,171893555.2,171900928.0,22805.562830151768,6144.0,0.00013268270530777165,resnet50,pickle,Pixel 4,LOAD_MEM memory from start +13,168769126.4,168402944.0,168410316.8,168421376.0,168839987.2,171814912.0,1015648.1112876841,12288.0,0.006017973387386593,resnet50,pickle,Pixel 4,LOAD_MEM_DELTA memory increase this iteration +14,220130099.2,220045312.0,220130099.2,220139520.0,220139520.0,220139520.0,28262.399999999998,0.0,0.00012838953011292696,resnet50,pickle,Pixel 4,LOAD_MEM_PEAK peak memory after this iteration +15,77.9151,76.87,77.0113,77.1745,79.9805,82.631,1.7533273196981793,0.10549999999999216,0.022503049084172125,resnet50,pickle,Pixel 6 Pro,LOAD latency +16,10.8776,10.764,10.7775,10.868500000000001,10.953999999999999,11.125,0.0977406773047948,0.05350000000000055,0.008985500230270905,resnet50,pickle,Pixel 6 Pro,UNLOAD latency +17,173321420.8,173297664.0,173319782.4,173322240.0,173326336.0,173326336.0,8150.937084777429,2048.0,4.7027869072126996e-05,resnet50,pickle,Pixel 6 Pro,LOAD_MEM memory from start +18,168483635.2,167931904.0,167931904.0,167931904.0,168590131.20000002,173297664.0,1605171.186118465,0.0,0.009527163775953862,resnet50,pickle,Pixel 6 Pro,LOAD_MEM_DELTA memory increase this iteration +19,221031628.8,221007872.0,221029990.4,221032448.0,221036544.0,221036544.0,8150.937084777429,2048.0,3.687679057078653e-05,resnet50,pickle,Pixel 6 Pro,LOAD_MEM_PEAK peak memory after this iteration 
+20,106.71340000000001,78.18,79.008,79.64,107.5322999999999,351.921,81.73777905986924,0.3885000000000005,0.7659560941725148,resnet101,pickle,iPhone X,LOAD latency +21,4.6302,4.403,4.4219,4.6005,4.8582,5.139,0.21375443855040768,0.15450000000000008,0.04616527116548047,resnet101,pickle,iPhone X,UNLOAD latency +22,250206617.6,249380864.0,250206617.60000002,250298368.0,250298368.0,250298368.0,275251.20000000007,0.0,0.001100095603546499,resnet101,pickle,iPhone X,LOAD_MEM memory from start +23,245991014.4,245514240.0,245514240.0,245514240.0,246711910.4,249380864.0,1161449.9613337803,0.0,0.004721513768162177,resnet101,pickle,iPhone X,LOAD_MEM_DELTA memory increase this iteration +24,289428275.2,288604160.0,289415168.0,289521664.0,289521664.0,289521664.0,274748.4933202729,0.0,0.0009492800699255001,resnet101,pickle,iPhone X,LOAD_MEM_PEAK peak memory after this iteration +25,64.6097,45.814,45.8689,46.009,64.81369999999993,232.058,55.81620623985474,0.07000000000000028,0.863898241902605,resnet101,pickle,iPhone 13 Pro,LOAD latency +26,2.2884,2.184,2.1912000000000003,2.2039999999999997,2.3306999999999998,3.021,0.24491639389799935,0.011499999999999622,0.10702516775825875,resnet101,pickle,iPhone 13 Pro,UNLOAD latency +27,249931366.4,249430016.0,249931366.4,249987072.0,249987072.0,249987072.0,167116.80000000002,0.0,0.0006686507676373029,resnet101,pickle,iPhone 13 Pro,LOAD_MEM memory from start +28,245959884.8,245514240.0,245514240.0,245514240.0,246392422.4,249430016.0,1167888.4799538697,0.0,0.004748288449165307,resnet101,pickle,iPhone 13 Pro,LOAD_MEM_DELTA memory increase this iteration +29,291907174.4,291405824.0,291907174.4,291962880.0,291962880.0,291962880.0,167116.80000000005,0.0,0.0005724998035539892,resnet101,pickle,iPhone 13 Pro,LOAD_MEM_PEAK peak memory after this iteration +30,142.3496,140.543,140.8499,141.4485,146.5147,146.818,2.1852459907296504,0.4865000000000066,0.015351261898380117,resnet101,pickle,Pixel 4,LOAD latency 
+31,16.569899999999997,16.022,16.1885,16.493499999999997,16.8838,17.809,0.47850923711042426,0.25099999999999767,0.028878221178789514,resnet101,pickle,Pixel 4,UNLOAD latency +32,249063014.4,248852480.0,249036800.0,249092096.0,249106841.60000002,249110528.0,72238.39190236726,16384.0,0.00029004062315872805,resnet101,pickle,Pixel 4,LOAD_MEM memory from start +33,245679718.4,245264384.0,245275443.20000002,245311488.0,245877555.2,248852480.0,1060337.557101058,24576.0,0.00431593443694316,resnet101,pickle,Pixel 4,LOAD_MEM_DELTA memory increase this iteration +34,297114828.8,296902656.0,297086976.0,297142272.0,297157017.6,297160704.0,72622.90011945271,16384.0,0.0002444270466498261,resnet101,pickle,Pixel 4,LOAD_MEM_PEAK peak memory after this iteration +35,101.4223,100.27,100.3015,100.6,104.41010000000001,105.653,1.8033136748774479,0.15850000000000364,0.017780248277523263,resnet101,pickle,Pixel 6 Pro,LOAD latency +36,9.489799999999999,9.344,9.3926,9.4865,9.586200000000002,9.624,0.08471457961885918,0.06850000000000023,0.008926908851488882,resnet101,pickle,Pixel 6 Pro,UNLOAD latency +37,250524057.6,250470400.0,250492518.4,250533888.0,250535936.0,250535936.0,21475.69947265979,2048.0,8.572310251716038e-05,resnet101,pickle,Pixel 6 Pro,LOAD_MEM memory from start +38,244663500.8,243945472.0,244000768.0,244006912.0,244769996.79999998,250494976.0,1944323.3969265914,0.0,0.0079469287023567,resnet101,pickle,Pixel 6 Pro,LOAD_MEM_DELTA memory increase this iteration +39,298711859.2,298676224.0,298709401.59999996,298717184.0,298717184.0,298717184.0,12018.811495318498,0.0,4.023546814480976e-05,resnet101,pickle,Pixel 6 Pro,LOAD_MEM_PEAK peak memory after this iteration +40,56.7934,42.758,42.762499999999996,43.092,57.135199999999955,180.383,41.197211112404204,0.26950000000000074,0.7253873005033016,silero,pickle,iPhone X,LOAD latency +41,2.5906,2.465,2.4893,2.5774999999999997,2.6684,2.807,0.09204803094037374,0.05699999999999994,0.03553154903897697,silero,pickle,iPhone X,UNLOAD latency 
+42,121595494.4,121257984.0,121582387.19999999,121634816.0,121634816.0,121634816.0,112609.46204222801,0.0,0.0009260989693564501,silero,pickle,iPhone X,LOAD_MEM memory from start +43,118285926.4,117915648.0,117915648.0,117915648.0,118559539.2,121257984.0,995931.5977205663,0.0,0.008419696476423473,silero,pickle,iPhone X,LOAD_MEM_DELTA memory increase this iteration +44,160636928.0,160301056.0,160610713.6,160677888.0,160677888.0,160677888.0,112382.7753919612,0.0,0.0006996073492638082,silero,pickle,iPhone X,LOAD_MEM_PEAK peak memory after this iteration +45,33.5372,24.46,24.469900000000003,24.506,33.58789999999996,114.857,27.10661262791793,0.019499999999998963,0.8082550907027996,silero,pickle,iPhone 13 Pro,LOAD latency +46,1.1623999999999999,1.097,1.1006,1.125,1.1827999999999999,1.523,0.12111168399456755,0.014499999999999957,0.10419105643028868,silero,pickle,iPhone 13 Pro,UNLOAD latency +47,121644646.4,121290752.0,121644646.4,121683968.0,121683968.0,121683968.0,117964.79999999999,0.0,0.000969749212078765,silero,pickle,iPhone 13 Pro,LOAD_MEM memory from start +48,118290841.6,117915648.0,117915648.0,117915648.0,118592307.2,121290752.0,1006261.7448870048,0.0,0.008506674999317993,silero,pickle,iPhone 13 Pro,LOAD_MEM_DELTA memory increase this iteration +49,163749888.0,163381248.0,163749888.0,163790848.0,163790848.0,163790848.0,122880.0,0.0,0.00075041272699985,silero,pickle,iPhone 13 Pro,LOAD_MEM_PEAK peak memory after this iteration +50,78.4763,77.524,77.57889999999999,77.9125,79.86090000000002,82.938,1.5811824088320754,0.19249999999999545,0.020148534128546777,silero,pickle,Pixel 4,LOAD latency +51,13.9001,11.655,11.7783,13.6085,16.3782,16.533,1.586521821469847,0.976,0.11413743940474148,silero,pickle,Pixel 4,UNLOAD latency +52,121267814.4,121204736.0,121248972.8,121276416.0,121287065.6,121290752.0,23401.03267464921,8192.0,0.0001929698559377121,silero,pickle,Pixel 4,LOAD_MEM memory from start 
+53,118000025.6,117600256.0,117615001.6,117633024.0,118111846.4,121204736.0,1069125.362960509,12288.0,0.009060382466226423,silero,pickle,Pixel 4,LOAD_MEM_DELTA memory increase this iteration +54,169610854.4,169508864.0,169578905.60000002,169623552.0,169651404.79999998,169680896.0,42937.72544325096,26624.0,0.00025315434908422323,silero,pickle,Pixel 4,LOAD_MEM_PEAK peak memory after this iteration +55,57.0194,56.09,56.1422,56.3615,58.6874,61.382,1.5850049968375484,0.08099999999999952,0.027797644255070177,silero,pickle,Pixel 6 Pro,LOAD latency +56,5.394299999999999,5.303,5.321,5.3735,5.526000000000001,5.571,0.0836696480212507,0.04349999999999987,0.015510751723346997,silero,pickle,Pixel 6 Pro,UNLOAD latency +57,122861158.4,122830848.0,122860339.2,122863616.0,122867712.0,122867712.0,10231.804720575934,0.0,8.327940948810014e-05,silero,pickle,Pixel 6 Pro,LOAD_MEM memory from start +58,117569126.4,116973568.0,116973568.0,116973568.0,117629337.6,122830848.0,1754057.3738604558,0.0,0.014919370650868898,silero,pickle,Pixel 6 Pro,LOAD_MEM_DELTA memory increase this iteration +59,171293081.6,171233280.0,171292262.4,171298816.0,171302912.0,171302912.0,19999.2207208181,0.0,0.00011675439856654491,silero,pickle,Pixel 6 Pro,LOAD_MEM_PEAK peak memory after this iteration +60,4.2292000000000005,1.251,1.2546,1.2679999999999998,4.259599999999989,30.878,8.882944239383697,0.010500000000000065,2.1003840535760183,resnet50,flatbuffer,iPhone X,LOAD latency +61,0.21470000000000003,0.206,0.21049999999999996,0.214,0.22,0.229,0.00576281181368957,0.0025000000000000022,0.026841228754958403,resnet50,flatbuffer,iPhone X,UNLOAD latency +62,2670592.0,2670592.0,2670592.0,2670592.0,2670592.0,2670592.0,0.0,0.0,0.0,resnet50,flatbuffer,iPhone X,LOAD_MEM memory from start +63,1800601.6,1703936.0,1703936.0,1703936.0,1800601.5999999996,2670592.0,289996.79999999993,0.0,0.16105550500454954,resnet50,flatbuffer,iPhone X,LOAD_MEM_DELTA memory increase this iteration 
+64,41746432.0,41746432.0,41746432.0,41746432.0,41746432.0,41746432.0,0.0,0.0,0.0,resnet50,flatbuffer,iPhone X,LOAD_MEM_PEAK peak memory after this iteration +65,2.6010999999999997,0.645,0.6477,0.6505000000000001,2.635399999999993,20.108,5.835649123276691,0.0040000000000000036,2.2435312457332253,resnet50,flatbuffer,iPhone 13 Pro,LOAD latency +66,0.1182,0.114,0.114,0.115,0.1259,0.134,0.0061773780845922,0.0010000000000000009,0.05226208193394417,resnet50,flatbuffer,iPhone 13 Pro,UNLOAD latency +67,2670592.0,2670592.0,2670592.0,2670592.0,2670592.0,2670592.0,0.0,0.0,0.0,resnet50,flatbuffer,iPhone 13 Pro,LOAD_MEM memory from start +68,1800601.6,1703936.0,1703936.0,1703936.0,1800601.5999999996,2670592.0,289996.79999999993,0.0,0.16105550500454954,resnet50,flatbuffer,iPhone 13 Pro,LOAD_MEM_DELTA memory increase this iteration +69,44122112.0,44122112.0,44122112.0,44122112.0,44122112.0,44122112.0,0.0,0.0,0.0,resnet50,flatbuffer,iPhone 13 Pro,LOAD_MEM_PEAK peak memory after this iteration +70,4.2338000000000005,3.568,3.5716,3.6630000000000003,4.439199999999998,9.292,1.6886820778346645,0.07299999999999995,0.39885730970633104,resnet50,flatbuffer,Pixel 4,LOAD latency +71,0.3925,0.338,0.35059999999999997,0.3795,0.4337,0.53,0.051178608812667045,0.020000000000000018,0.13039136003227272,resnet50,flatbuffer,Pixel 4,UNLOAD latency +72,6046515.2,4280320.0,4759552.0,6307840.0,6826803.2,6926336.0,869430.9734327159,489472.0,0.14379042219768434,resnet50,flatbuffer,Pixel 4,LOAD_MEM memory from start +73,4110336.0,2093056.0,2579660.8000000003,4464640.0,4996710.399999999,5914624.0,1076907.2617008393,466944.0,0.26199981259460037,resnet50,flatbuffer,Pixel 4,LOAD_MEM_DELTA memory increase this iteration +74,55039180.8,54128640.0,55039180.800000004,55140352.0,55140352.0,55140352.0,303513.5999999999,0.0,0.005514500680940366,resnet50,flatbuffer,Pixel 4,LOAD_MEM_PEAK peak memory after this iteration 
+75,3.1338999999999997,2.611,2.6353,2.6715,3.1814999999999984,7.326,1.3977947953830705,0.03200000000000003,0.44602405800538325,resnet50,flatbuffer,Pixel 6 Pro,LOAD latency +76,0.2702,0.253,0.2548,0.2605,0.28969999999999996,0.35,0.02780215818960822,0.004500000000000004,0.1028947379334131,resnet50,flatbuffer,Pixel 6 Pro,UNLOAD latency +77,7093043.2,6586368.0,6988185.6,7163904.0,7163904.0,7163904.0,173353.7339412105,0.0,0.024439965900843588,resnet50,flatbuffer,Pixel 6 Pro,LOAD_MEM memory from start +78,4458086.4,4112384.0,4222976.0,4235264.0,4470374.399999999,6586368.0,710372.5285931599,0.0,0.15934471987648327,resnet50,flatbuffer,Pixel 6 Pro,LOAD_MEM_DELTA memory increase this iteration +79,54815539.2,54308864.0,54710681.6,54886400.0,54886400.0,54886400.0,173353.73394121052,0.0,0.0031624925426476606,resnet50,flatbuffer,Pixel 6 Pro,LOAD_MEM_PEAK peak memory after this iteration +80,7.318500000000002,1.903,1.93,2.0505,7.450999999999982,54.854,15.845466129148742,0.11199999999999999,2.1651248383068578,resnet101,flatbuffer,iPhone X,LOAD latency +81,0.3252,0.303,0.3039,0.3315,0.3408,0.357,0.01835102176991789,0.016000000000000014,0.05642995624205993,resnet101,flatbuffer,iPhone X,UNLOAD latency +82,4266393.6,4177920.0,4266393.6,4276224.0,4276224.0,4276224.0,29491.200000000004,0.0,0.006912442396313365,resnet101,flatbuffer,iPhone X,LOAD_MEM memory from start +83,3214540.8,3096576.0,3096576.0,3096576.0,3293183.9999999995,4177920.0,322461.0956939767,0.0,0.10031326891043868,resnet101,flatbuffer,iPhone X,LOAD_MEM_DELTA memory increase this iteration +84,43604377.6,43515904.0,43604377.6,43614208.0,43614208.0,43614208.0,29491.199999999997,0.0,0.0006763357631321859,resnet101,flatbuffer,iPhone X,LOAD_MEM_PEAK peak memory after this iteration +85,4.169500000000001,0.976,0.9778,0.993,4.219799999999989,32.739,9.523187074188975,0.004500000000000004,2.284011769801888,resnet101,flatbuffer,iPhone 13 Pro,LOAD latency 
+86,0.16840000000000002,0.163,0.163,0.1655,0.17689999999999997,0.185,0.006636264009214819,0.0015000000000000013,0.03940774352265332,resnet101,flatbuffer,iPhone 13 Pro,UNLOAD latency +87,4251648.0,4177920.0,4251648.0,4259840.0,4259840.0,4259840.0,24576.0,0.0,0.005780346820809248,resnet101,flatbuffer,iPhone 13 Pro,LOAD_MEM memory from start +88,3212902.4,3096576.0,3096576.0,3096576.0,3278438.4,4177920.0,322598.42209260724,0.0,0.10040716521379774,resnet101,flatbuffer,iPhone 13 Pro,LOAD_MEM_DELTA memory increase this iteration +89,45981696.0,45907968.0,45981696.0,45989888.0,45989888.0,45989888.0,24576.0,0.0,0.0005344735435595938,resnet101,flatbuffer,iPhone 13 Pro,LOAD_MEM_PEAK peak memory after this iteration +90,6.0741000000000005,5.38,5.4331000000000005,5.515000000000001,6.209699999999998,11.13,1.6868779120019328,0.0535000000000001,0.2777165196493197,resnet101,flatbuffer,Pixel 4,LOAD latency +91,0.6784,0.617,0.6224,0.643,0.7671000000000001,0.786,0.06141042256816021,0.02300000000000002,0.09052243892712296,resnet101,flatbuffer,Pixel 4,UNLOAD latency +92,10659020.8,9084928.0,9844326.399999999,10868736.0,11123507.2,11149312.0,620470.4568961845,157696.0,0.058210830857576,resnet101,flatbuffer,Pixel 4,LOAD_MEM memory from start +93,8756838.4,7012352.0,8457420.8,8847360.0,9150873.6,9928704.0,679134.7104996475,149504.0,0.07755478398455401,resnet101,flatbuffer,Pixel 4,LOAD_MEM_DELTA memory increase this iteration +94,58776371.2,57778176.0,58508083.2,58896384.0,58998784.0,58998784.0,351502.9121964141,71680.0,0.00598034388683754,resnet101,flatbuffer,Pixel 4,LOAD_MEM_PEAK peak memory after this iteration +95,4.4153,3.877,3.8877999999999995,3.9205,4.476099999999999,8.887,1.490890475521257,0.026000000000000245,0.3376645925579818,resnet101,flatbuffer,Pixel 6 Pro,LOAD latency +96,0.43279999999999996,0.41,0.41179999999999994,0.42,0.4546,0.514,0.03019536388255655,0.008000000000000007,0.06976747662328224,resnet101,flatbuffer,Pixel 6 Pro,UNLOAD latency 
+97,10017996.8,9687040.0,9922969.6,9949184.0,10044211.2,10899456.0,304036.8951758322,0.0,0.030349070901662913,resnet101,flatbuffer,Pixel 6 Pro,LOAD_MEM memory from start +98,7332249.6,6717440.0,6938624.0,6963200.0,7356825.599999999,10899456.0,1191324.1877228215,0.0,0.16247730951805317,resnet101,flatbuffer,Pixel 6 Pro,LOAD_MEM_DELTA memory increase this iteration +99,58830848.0,58830848.0,58830848.0,58830848.0,58830848.0,58830848.0,0.0,0.0,0.0,resnet101,flatbuffer,Pixel 6 Pro,LOAD_MEM_PEAK peak memory after this iteration +100,5.3713999999999995,1.394,1.3967,1.4224999999999999,5.459299999999986,40.796,11.808278047200615,0.026999999999999913,2.198361329858252,silero,flatbuffer,iPhone X,LOAD latency +101,0.2378,0.227,0.227,0.229,0.25249999999999995,0.284,0.01692808317559905,0.0020000000000000018,0.07118622025062678,silero,flatbuffer,iPhone X,UNLOAD latency +102,3091660.8,3047424.0,3091660.8,3096576.0,3096576.0,3096576.0,14745.600000000002,0.0,0.004769475357710653,silero,flatbuffer,iPhone X,LOAD_MEM memory from start +103,2241331.2,2146304.0,2146304.0,2146304.0,2280652.8,3047424.0,269096.9130453935,0.0,0.12006119981080594,silero,flatbuffer,iPhone X,LOAD_MEM_DELTA memory increase this iteration +104,42314956.8,42270720.0,42314956.800000004,42319872.0,42319872.0,42319872.0,14745.600000000002,0.0,0.0003484725287489837,silero,flatbuffer,iPhone X,LOAD_MEM_PEAK peak memory after this iteration +105,3.1575999999999995,0.764,0.7667,0.774,3.2045999999999912,24.558,7.13349035465809,0.007500000000000007,2.2591494662585796,silero,flatbuffer,iPhone 13 Pro,LOAD latency +106,0.1183,0.113,0.1139,0.1155,0.12309999999999999,0.142,0.008222530024268681,0.0015000000000000013,0.06950574830320103,silero,flatbuffer,iPhone 13 Pro,UNLOAD latency +107,3080192.0,3080192.0,3080192.0,3080192.0,3080192.0,3080192.0,0.0,0.0,0.0,silero,flatbuffer,iPhone 13 Pro,LOAD_MEM memory from start 
+108,2239692.8,2146304.0,2146304.0,2146304.0,2239692.8,3080192.0,280166.39999999997,0.0,0.1250914411119239,silero,flatbuffer,iPhone 13 Pro,LOAD_MEM_DELTA memory increase this iteration +109,44580864.0,44580864.0,44580864.0,44580864.0,44580864.0,44580864.0,0.0,0.0,0.0,silero,flatbuffer,iPhone 13 Pro,LOAD_MEM_PEAK peak memory after this iteration +110,4.3515999999999995,3.776,3.7805,3.8075,4.469699999999999,9.075,1.5754530903838424,0.022499999999999964,0.36203996010291445,silero,flatbuffer,Pixel 4,LOAD latency +111,0.45059999999999995,0.406,0.4114,0.4215,0.5458,0.607,0.06372001255492657,0.009000000000000008,0.1414114792608224,silero,flatbuffer,Pixel 4,UNLOAD latency +112,7109836.8,5484544.0,5820006.4,7467008.0,7916748.8,8093696.0,862829.5701577225,507904.0,0.12135715550569634,silero,flatbuffer,Pixel 4,LOAD_MEM memory from start +113,5251891.2,3383296.0,3733504.0,5433344.0,6341836.8,7495680.0,1159873.802366861,708608.0,0.22084878726483537,silero,flatbuffer,Pixel 4,LOAD_MEM_DELTA memory increase this iteration +114,56051302.4,55513088.0,56051302.4,56111104.0,56111104.0,56111104.0,179404.8,0.0,0.0032007249130396654,silero,flatbuffer,Pixel 4,LOAD_MEM_PEAK peak memory after this iteration +115,3.3924,2.886,2.886,2.903,3.429499999999998,7.754,1.454020715120661,0.016999999999999904,0.428611223653066,silero,flatbuffer,Pixel 6 Pro,LOAD latency +116,0.3215,0.298,0.2989,0.31,0.33459999999999995,0.439,0.03981770962775232,0.006500000000000006,0.12384979666485947,silero,flatbuffer,Pixel 6 Pro,UNLOAD latency +117,5611110.4,5083136.0,5437030.399999999,5476352.0,5650431.999999999,7217152.0,548033.3200591365,0.0,0.09766931694288825,silero,flatbuffer,Pixel 6 Pro,LOAD_MEM memory from start +118,2981478.4,2183168.0,2514944.0,2551808.0,3018342.3999999985,7217152.0,1416162.54581084,0.0,0.474986686407267,silero,flatbuffer,Pixel 6 Pro,LOAD_MEM_DELTA memory increase this iteration +119,55390208.0,55390208.0,55390208.0,55390208.0,55390208.0,55390208.0,0.0,0.0,0.0,silero,flatbuffer,Pixel 6 
Pro,LOAD_MEM_PEAK peak memory after this iteration diff --git a/assets/images/Screenshot 2023-04-17 at 19.21.29.png b/assets/images/Screenshot 2023-04-17 at 19.21.29.png new file mode 100644 index 000000000000..8ac189f2ee21 Binary files /dev/null and b/assets/images/Screenshot 2023-04-17 at 19.21.29.png differ diff --git a/assets/images/Screenshot 2023-04-18 at 16.41.48.png b/assets/images/Screenshot 2023-04-18 at 16.41.48.png new file mode 100644 index 000000000000..dc6e29f89e64 Binary files /dev/null and b/assets/images/Screenshot 2023-04-18 at 16.41.48.png differ diff --git a/assets/images/Screenshot 2023-04-18 at 16.41.58.png b/assets/images/Screenshot 2023-04-18 at 16.41.58.png new file mode 100644 index 000000000000..818b6df9c2e3 Binary files /dev/null and b/assets/images/Screenshot 2023-04-18 at 16.41.58.png differ diff --git a/assets/images/Screenshot 2023-04-18 at 16.42.10.png b/assets/images/Screenshot 2023-04-18 at 16.42.10.png new file mode 100644 index 000000000000..66d7d668182e Binary files /dev/null and b/assets/images/Screenshot 2023-04-18 at 16.42.10.png differ