
Commit 965f127

Author: Ole John Aske
Bug#31790329 PERFORMANCE REGRESSION DUE TO TOO MUCH RELEASE OF NDB API OBJECTS
A performance regression was introduced by WL#8351, where too many NDB API objects were released to the memory allocator instead of being kept in the free-list. The overhead of using the memory allocator too much mainly showed up as increased transaction latency at high load; maximum TPS was also somewhat affected.

The root cause seems to be that the NDB API object usage statistics introduced by WL#8351 assumed that object usage is normally distributed over the transactions executed in each Ndb context. That is not necessarily the case, so the object recycling mechanism in WL#8351 did not behave as intended.

When instrumenting some customer use cases, the typical behavior observed was that a majority (~99%) of transactions used 1-2 NDB API objects, with a few outliers using a lot more. As all object usage was sampled with the same 'priority', these small usages dominated the usage statistics, preventing the larger transactions from being served from the free-list of API objects.

As we want even these larger transactions to be served from the free-list, we should focus on sampling them in the statistics. Smaller usage periods in between should be ignored until it is likely that a more permanent change in behavior has occurred.

This patch implements logic for ignoring such intermediate low-usage periods and sampling only the larger usage peaks - from comments in the patch:

 * update_stats() is called whenever a new local peak of 'm_used_cnt'
 * objects has been observed.
 *
 * The high usage peaks are the most interesting, as we want to scale the
 * free-list to accommodate these - the smaller peaks in between are mostly
 * considered 'noise' in these statistics, which may cause too low usage
 * statistics to be collected, such that the high usage peaks could
 * not be served from the free-list.
 *
 * In order to implement this we use a combination of statistics and
 * heuristics. The heuristics are based on observing the free-list behavior
 * of an instrumented version of this code.
 *
 * 1) A 'high peak' is any peak value above or equal to the current
 *    sampled mean value -> added to the statistics immediately.
 * 2) A sampled peak value of 2 or less is considered 'noise' and
 *    just ignored.
 * 3) Other peak values, less than the current mean:
 *    These are observed over a period of such smaller peaks, and their
 *    max value is collected in 'm_sample_max'. When the window size has expired,
 *    the 'm_sample_max' value is sampled.
 *    The intention with this heuristic is that temporarily reduced usage of objects
 *    should be ignored, but longer-term changes should be accounted for.
 *
 * When we have taken a valid sample, we use the statistics to calculate the
 * 95% percentile for max objects in use of 'class T'.

Reviewed by: Frazer Clement <frazer.clement@oracle.com>
Parent: 20703c7
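
To make the sampling bias described in the commit message concrete, here is a small self-contained illustration (hypothetical numbers, not taken from the bug report) of how a 'mean + 2*stddev' free-list estimate behaves when ~99% of transactions use only 2 objects and a rare outlier uses 50:

// Illustration only - not NDB code. Shows why uniformly sampling every
// transaction's object usage under-estimates the free-list size needed
// for the rare large transactions described above.
#include <cmath>
#include <cstdio>
#include <vector>

int main() {
  // Hypothetical workload: 99 transactions use 2 NDB API objects each,
  // one outlier uses 50.
  std::vector<double> usage(99, 2.0);
  usage.push_back(50.0);

  double mean = 0.0;
  for (double u : usage) mean += u;
  mean /= usage.size();

  double var = 0.0;
  for (double u : usage) var += (u - mean) * (u - mean);
  var /= (usage.size() - 1);
  const double stddev = std::sqrt(var);

  // The free-list is sized from 'mean + 2*stddev' (see update_stats() below).
  std::printf("mean=%.2f stddev=%.2f estimate=%.1f\n", mean, stddev,
              mean + 2 * stddev);
  // Prints an estimate of roughly 12 objects - far below the 50 objects the
  // outlier needs, so that transaction falls back to the memory allocator.
  return 0;
}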

File tree

3 files changed: +67 additions, -8 deletions

storage/ndb/include/util/stat_utils.hpp

Lines changed: 2 additions & 2 deletions
@@ -1,5 +1,5 @@
 /*
-   Copyright (c) 2016, 2020, Oracle and/or its affiliates. All rights reserved.
+   Copyright (c) 2016, 2021, Oracle and/or its affiliates.

    This program is free software; you can redistribute it and/or modify
    it under the terms of the GNU General Public License, version 2.0,
@@ -99,7 +99,7 @@ class NdbStatistics
       /* Add 'sample' as 'simple moving average' */
       m_noOfSamples++;
       m_mean += (delta / m_noOfSamples);
-      m_sumSquare += (delta * (sample - m_mean));
+      m_sumSquare += fabs(delta * (sample - m_mean));
     }
   }
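
For context on the fabs() change above, a minimal sketch (assumed shape of the class, not the real NdbStatistics API) of how the guarded m_sumSquare feeds the standard deviation that update_stats() later consumes; keeping the accumulator non-negative means the square root below is always well defined, even if rounding makes 'delta' and '(sample - m_mean)' briefly disagree in sign:

// Sketch only - member names mirror the hunk above; the getStdDev()
// divisor (n - 1) is an assumption about the real class.
#include <cmath>

struct StatsSketch {
  double m_mean = 0.0;
  double m_sumSquare = 0.0;
  unsigned m_noOfSamples = 0;

  void update(double sample) {
    const double delta = sample - m_mean;
    // Add 'sample' as 'simple moving average'
    m_noOfSamples++;
    m_mean += (delta / m_noOfSamples);
    // fabs() keeps m_sumSquare non-negative, so getStdDev() never takes
    // the square root of a (slightly) negative value.
    m_sumSquare += std::fabs(delta * (sample - m_mean));
  }

  double getMean() const { return m_mean; }

  double getStdDev() const {
    return (m_noOfSamples > 1) ? std::sqrt(m_sumSquare / (m_noOfSamples - 1))
                               : 0.0;
  }
};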

storage/ndb/src/ndbapi/NdbImpl.hpp

Lines changed: 64 additions & 5 deletions
@@ -1,5 +1,5 @@
 /*
-   Copyright (c) 2003, 2018, Oracle and/or its affiliates. All rights reserved.
+   Copyright (c) 2003, 2021, Oracle and/or its affiliates.

    This program is free software; you can redistribute it and/or modify
    it under the terms of the GNU General Public License, version 2.0,
@@ -65,13 +65,64 @@ struct Ndb_free_list_t
   Ndb_free_list_t& operator=(const Ndb_free_list_t&);

   /**
-   * Based on a serie of sampled max. values for m_used_cnt;
-   * calculate the 95% percentile for max objects in use of 'class T'.
+   * update_stats() is called whenever a new local peak of 'm_used_cnt'
+   * objects has been observed.
+   *
+   * The high usage peaks are the most interesting, as we want to scale the
+   * free-list to accommodate these - the smaller peaks in between are mostly
+   * considered 'noise' in these statistics, which may cause too low usage
+   * statistics to be collected, such that the high usage peaks could
+   * not be served from the free-list.
+   *
+   * In order to implement this we use a combination of statistics and
+   * heuristics. The heuristics are based on observing the free-list behavior
+   * of an instrumented version of this code.
+   *
+   * 1) A 'high peak' is any peak value above or equal to the current
+   *    sampled mean value -> added to the statistics immediately.
+   * 2) A sampled peak value of 2 or less is considered 'noise' and
+   *    just ignored.
+   * 3) Other peak values, less than the current mean:
+   *    These are observed over a period of such smaller peaks, and their
+   *    max value is collected in 'm_sample_max'. When the window size has expired,
+   *    the 'm_sample_max' value is sampled.
+   *    The intention with this heuristic is that temporarily reduced usage of objects
+   *    should be ignored, but longer-term changes should be accounted for.
+   *
+   * When we have taken a valid sample, we use the statistics to calculate the
+   * 95% percentile for max objects in use of 'class T'.
    */
   void update_stats()
   {
-    m_stats.update(m_used_cnt);
-    m_estm_max_used = (Uint32)(m_stats.getMean() + (2 * m_stats.getStdDev()));
+    const Uint32 mean = m_stats.getMean();
+    if (m_used_cnt >= mean)
+      // 1) A high-peak value, sample it
+      m_stats.update(m_used_cnt);
+    else if (m_used_cnt <= 2)
+      // 2) Ignore very low sampled values, they are 'noise'
+      return;
+    else
+    {
+      // 3) A local peak, less than the current 'mean'
+      if (m_sample_max < m_used_cnt)
+        m_sample_max = m_used_cnt;
+
+      // Use a decay function of the current 'mean' to decide how many small samples
+      // we may ignore - smaller samples are ignored for a longer time.
+      const Uint32 max_skipped = (mean*5) / m_used_cnt;
+      m_samples_skipped++;
+      if (m_samples_skipped < max_skipped && m_samples_skipped < 10)
+        return;
+
+      // Expired low-value observation period, sample max value seen.
+      m_stats.update(m_sample_max);
+    }
+    m_sample_max = 0;
+    m_samples_skipped = 0;
+
+    // Calculate upper 95% percentile from sampled values
+    const double upper = m_stats.getMean() + (2 * m_stats.getStdDev());
+    m_estm_max_used = (Uint32)(upper+0.999);
   }

   /** Shrink m_free_list such that m_used_cnt+'free' <= 'm_estm_max_used' */
@@ -83,6 +134,12 @@ struct Ndb_free_list_t
   /** Last operation allocated, or grabbed a free object */
   bool m_is_growing;

+  /** Number of consecutive 'low-peak' values skipped */
+  Uint32 m_samples_skipped;
+
+  /** Max sample value seen in the 'm_samples_skipped' period */
+  Uint32 m_sample_max;
+
   /** Statistics of peaks in number of obj 'T' in use */
   NdbStatistics m_stats;

@@ -399,6 +456,8 @@ Ndb_free_list_t<T>::Ndb_free_list_t()
     m_free_cnt(0),
     m_free_list(NULL),
     m_is_growing(false),
+    m_samples_skipped(0),
+    m_sample_max(0),
     m_stats(),
     m_estm_max_used(0)
   {}
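
To read the three sampling cases without the diff markers, here is a standalone, compilable sketch of the heuristic added above. PeakSampler and its helpers are hypothetical stand-ins for Ndb_free_list_t and NdbStatistics; a minimal mean/stddev accumulator is embedded so the sketch is self-contained:

#include <algorithm>
#include <cmath>
#include <cstdint>

struct PeakSampler {
  uint32_t m_samples_skipped = 0;  // consecutive 'low-peak' values skipped
  uint32_t m_sample_max = 0;       // max value seen while skipping
  uint32_t m_estm_max_used = 0;    // upper-percentile estimate of objects in use

  // Minimal stand-in for NdbStatistics (online mean / stddev).
  double m_mean = 0.0, m_sumSquare = 0.0;
  uint32_t m_n = 0;
  void add(double x) {
    m_n++;
    const double delta = x - m_mean;
    m_mean += delta / m_n;
    m_sumSquare += std::fabs(delta * (x - m_mean));
  }
  double stddev() const {
    return (m_n > 1) ? std::sqrt(m_sumSquare / (m_n - 1)) : 0.0;
  }

  // Called whenever a new local peak of objects in use ('used_cnt') is seen.
  void update_stats(uint32_t used_cnt) {
    const uint32_t mean = (uint32_t)m_mean;
    if (used_cnt >= mean) {
      add(used_cnt);              // 1) high peak: sample immediately
    } else if (used_cnt <= 2) {
      return;                     // 2) very low value: ignore as 'noise'
    } else {
      // 3) low peak: remember the max and skip up to (mean*5)/used_cnt
      //    occurrences (capped at 10) before sampling it, so temporarily
      //    reduced usage is ignored but lasting changes are accounted for.
      m_sample_max = std::max(m_sample_max, used_cnt);
      const uint32_t max_skipped = (mean * 5) / used_cnt;
      m_samples_skipped++;
      if (m_samples_skipped < max_skipped && m_samples_skipped < 10) return;
      add(m_sample_max);          // observation window expired
    }
    m_sample_max = 0;
    m_samples_skipped = 0;
    // Upper percentile of the sampled peaks, rounded up.
    m_estm_max_used = (uint32_t)(m_mean + 2 * stddev() + 0.999);
  }
};

The decay factor (mean*5)/used_cnt means that the smaller a low peak is relative to the current mean, the more consecutive occurrences of it are absorbed (capped at 10) before the largest value seen in that window is fed into the statistics.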

storage/ndb/test/ndbapi/testNdbApi.cpp

Lines changed: 1 addition & 1 deletion
@@ -306,7 +306,7 @@ int runTestMaxOperations(NDBT_Context* ctx, NDBT_Step* step){
   }

   maxOpsLimit = 100;
-  Uint32 coolDownLoops = 25;
+  Uint32 coolDownLoops = 250;
   while (coolDownLoops-- > 0){
     int errors = 0;
     const int maxErrors = 5;
