Skip to content

Commit 67a8753

Browse files
committed
In pgbench logging, avoid assuming that instr_times match Unix timestamps.
For aggregated logging, pg_bench supposed that printing the integer part of INSTR_TIME_GET_DOUBLE() would produce a Unix timestamp. That was already broken on Windows, and it's about to get broken on most other platforms as well. As in commit 74baa1e, we can remove the entanglement at the price of one extra syscall per transaction; though here it seems more convenient to use time(NULL) instead of gettimeofday(), since we only need integral-second precision. I took the time to do some wordsmithing on the documentation about pgbench's logging features, too. Discussion: https://postgr.es/m/8837.1483216839@sss.pgh.pa.us
1 parent 74baa1e commit 67a8753

File tree

2 files changed

+84
-74
lines changed

2 files changed

+84
-74
lines changed

doc/src/sgml/ref/pgbench.sgml

+54-42
Original file line numberDiff line numberDiff line change
@@ -355,7 +355,7 @@ pgbench <optional> <replaceable>options</> </optional> <replaceable>dbname</>
355355
<term><option>--log</option></term>
356356
<listitem>
357357
<para>
358-
Write the time taken by each transaction to a log file.
358+
Write information about each transaction to a log file.
359359
See below for details.
360360
</para>
361361
</listitem>
@@ -572,13 +572,9 @@ pgbench <optional> <replaceable>options</> </optional> <replaceable>dbname</>
572572
<term><option>--aggregate-interval=<replaceable>seconds</></option></term>
573573
<listitem>
574574
<para>
575-
Length of aggregation interval (in seconds). May be used only together
576-
with <application>-l</application> - with this option, the log contains
577-
per-interval summary (number of transactions, min/max latency and two
578-
additional fields useful for variance estimation).
579-
</para>
580-
<para>
581-
This option is not currently supported on Windows.
575+
Length of aggregation interval (in seconds). May be used only
576+
with <option>-l</option> option. With this option, the log contains
577+
per-interval summary data, as described below.
582578
</para>
583579
</listitem>
584580
</varlistentry>
@@ -618,8 +614,8 @@ pgbench <optional> <replaceable>options</> </optional> <replaceable>dbname</>
618614
<term><option>--log-prefix=<replaceable>prefix</></option></term>
619615
<listitem>
620616
<para>
621-
Set the filename prefix for the transaction log file created by
622-
<option>--log</>. The default is <replaceable>pgbench_log</>.
617+
Set the filename prefix for the transaction log files created by
618+
<option>--log</>. The default is <literal>pgbench_log</>.
623619
</para>
624620
</listitem>
625621
</varlistentry>
@@ -1128,38 +1124,44 @@ END;
11281124
<title>Per-Transaction Logging</title>
11291125

11301126
<para>
1131-
With the <option>-l</> option but without the <option>--aggregate-interval</option>,
1132-
<application>pgbench</> writes the time taken by each transaction
1127+
With the <option>-l</> option (but without
1128+
the <option>--aggregate-interval</option> option),
1129+
<application>pgbench</> writes information about each transaction
11331130
to a log file. The log file will be named
11341131
<filename><replaceable>prefix</>.<replaceable>nnn</></filename>,
11351132
where <replaceable>prefix</> defaults to <literal>pgbench_log</>, and
11361133
<replaceable>nnn</> is the PID of the
1137-
<application>pgbench</application> process. If the <option>-j</> option is 2 or higher,
1138-
creating multiple worker threads, each will have its own log file. The first worker will
1134+
<application>pgbench</application> process.
1135+
The prefix can be changed by using the <option>--log-prefix</> option.
1136+
If the <option>-j</> option is 2 or higher, so that there are multiple
1137+
worker threads, each will have its own log file. The first worker will
11391138
use the same name for its log file as in the standard single worker case.
11401139
The additional log files for the other workers will be named
1141-
<filename><replaceable>pgbench_log</>.<replaceable>nnn</>.<replaceable>mmm</></filename>,
1140+
<filename><replaceable>prefix</>.<replaceable>nnn</>.<replaceable>mmm</></filename>,
11421141
where <replaceable>mmm</> is a sequential number for each worker starting
1143-
with 1. The prefix can be changed by using the <option>--log-prefix</>
1144-
option.
1142+
with 1.
11451143
</para>
11461144

11471145
<para>
11481146
The format of the log is:
11491147

11501148
<synopsis>
1151-
<replaceable>client_id</> <replaceable>transaction_no</> <replaceable>time</> <replaceable>script_no</> <replaceable>time_epoch</> <replaceable>time_us</> <optional><replaceable>schedule_lag</replaceable></optional>
1149+
<replaceable>client_id</> <replaceable>transaction_no</> <replaceable>time</> <replaceable>script_no</> <replaceable>time_epoch</> <replaceable>time_us</> <optional> <replaceable>schedule_lag</replaceable> </optional>
11521150
</synopsis>
11531151

1154-
where <replaceable>time</> is the total elapsed transaction time in microseconds,
1152+
where
1153+
<replaceable>client_id</> indicates which client session ran the transaction,
1154+
<replaceable>transaction_no</> counts how many transactions have been
1155+
run by that session,
1156+
<replaceable>time</> is the total elapsed transaction time in microseconds,
11551157
<replaceable>script_no</> identifies which script file was used (useful when
11561158
multiple scripts were specified with <option>-f</> or <option>-b</>),
11571159
and <replaceable>time_epoch</>/<replaceable>time_us</> are a
1158-
Unix epoch format time stamp and an offset
1160+
Unix-epoch time stamp and an offset
11591161
in microseconds (suitable for creating an ISO 8601
11601162
time stamp with fractional seconds) showing when
11611163
the transaction completed.
1162-
Field <replaceable>schedule_lag</> is the difference between the
1164+
The <replaceable>schedule_lag</> field is the difference between the
11631165
transaction's scheduled start time, and the time it actually started, in
11641166
microseconds. It is only present when the <option>--rate</> option is used.
11651167
When both <option>--rate</> and <option>--latency-limit</> are used,
@@ -1168,15 +1170,16 @@ END;
11681170
</para>
11691171

11701172
<para>
1171-
Here is a snippet of the log file generated:
1173+
Here is a snippet of a log file generated in a single-client run:
11721174
<screen>
11731175
0 199 2241 0 1175850568 995598
11741176
0 200 2465 0 1175850568 998079
11751177
0 201 2513 0 1175850569 608
11761178
0 202 2038 0 1175850569 2663
11771179
</screen>
11781180

1179-
Another example with --rate=100 and --latency-limit=5 (note the additional
1181+
Another example with <literal>--rate=100</>
1182+
and <literal>--latency-limit=5</> (note the additional
11801183
<replaceable>schedule_lag</> column):
11811184
<screen>
11821185
0 81 4621 0 1412881037 912698 3005
@@ -1203,32 +1206,41 @@ END;
12031206
<title>Aggregated Logging</title>
12041207

12051208
<para>
1206-
With the <option>--aggregate-interval</option> option, the logs use a bit different format:
1209+
With the <option>--aggregate-interval</option> option, a different
1210+
format is used for the log files:
12071211

12081212
<synopsis>
1209-
<replaceable>interval_start</> <replaceable>num_of_transactions</> <replaceable>latency_sum</> <replaceable>latency_2_sum</> <replaceable>min_latency</> <replaceable>max_latency</> <optional><replaceable>lag_sum</> <replaceable>lag_2_sum</> <replaceable>min_lag</> <replaceable>max_lag</> <optional><replaceable>skipped_transactions</></optional></optional>
1213+
<replaceable>interval_start</> <replaceable>num_transactions</> <replaceable>sum_latency</> <replaceable>sum_latency_2</> <replaceable>min_latency</> <replaceable>max_latency</> <optional> <replaceable>sum_lag</> <replaceable>sum_lag_2</> <replaceable>min_lag</> <replaceable>max_lag</> <optional> <replaceable>skipped</> </optional> </optional>
12101214
</synopsis>
12111215

1212-
where <replaceable>interval_start</> is the start of the interval (Unix epoch
1213-
format time stamp), <replaceable>num_of_transactions</> is the number of transactions
1214-
within the interval, <replaceable>latency_sum</replaceable> is a sum of latencies
1215-
(so you can compute average latency easily). The following two fields are useful
1216-
for variance estimation - <replaceable>latency_sum</> is a sum of latencies and
1217-
<replaceable>latency_2_sum</> is a sum of 2nd powers of latencies. The next two
1218-
fields are <replaceable>min_latency</> - a minimum latency within the interval, and
1219-
<replaceable>max_latency</> - maximum latency within the interval. A transaction is
1220-
counted into the interval when it was committed. The fields in the end,
1221-
<replaceable>lag_sum</>, <replaceable>lag_2_sum</>, <replaceable>min_lag</>,
1216+
where
1217+
<replaceable>interval_start</> is the start of the interval (as a Unix
1218+
epoch time stamp),
1219+
<replaceable>num_transactions</> is the number of transactions
1220+
within the interval,
1221+
<replaceable>sum_latency</replaceable> is the sum of the transaction
1222+
latencies within the interval,
1223+
<replaceable>sum_latency_2</replaceable> is the sum of squares of the
1224+
transaction latencies within the interval,
1225+
<replaceable>min_latency</> is the minimum latency within the interval,
1226+
and
1227+
<replaceable>max_latency</> is the maximum latency within the interval.
1228+
The next fields,
1229+
<replaceable>sum_lag</>, <replaceable>sum_lag_2</>, <replaceable>min_lag</>,
12221230
and <replaceable>max_lag</>, are only present if the <option>--rate</>
1223-
option is used. The very last one, <replaceable>skipped_transactions</>,
1224-
is only present if the option <option>--latency-limit</> is present, too.
1225-
They are calculated from the time each transaction had to wait for the
1231+
option is used.
1232+
They provide statistics about the time each transaction had to wait for the
12261233
previous one to finish, i.e. the difference between each transaction's
12271234
scheduled start time and the time it actually started.
1235+
The very last field, <replaceable>skipped</>,
1236+
is only present if the <option>--latency-limit</> option is used, too.
1237+
It counts the number of transactions skipped because they would have
1238+
started too late.
1239+
Each transaction is counted in the interval when it was committed.
12281240
</para>
12291241

12301242
<para>
1231-
Here is example output:
1243+
Here is some example output:
12321244
<screen>
12331245
1345828501 5601 1542744 483552416 61 2573
12341246
1345828503 7884 1979812 565806736 60 1479
@@ -1238,9 +1250,9 @@ END;
12381250
</screen></para>
12391251

12401252
<para>
1241-
Notice that while the plain (unaggregated) log file contains a reference
1242-
to the custom script files, the aggregated log does not. Therefore if
1243-
you need per script data, you need to aggregate the data on your own.
1253+
Notice that while the plain (unaggregated) log file shows which script
1254+
was used for each transaction, the aggregated log does not. Therefore if
1255+
you need per-script data, you need to aggregate the data on your own.
12441256
</para>
12451257

12461258
</refsect2>

src/bin/pgbench/pgbench.c

+30-32
Original file line numberDiff line numberDiff line change
@@ -42,6 +42,7 @@
4242
#include <limits.h>
4343
#include <math.h>
4444
#include <signal.h>
45+
#include <time.h>
4546
#include <sys/time.h>
4647
#ifdef HAVE_SYS_SELECT_H
4748
#include <sys/select.h>
@@ -227,7 +228,7 @@ typedef struct SimpleStats
227228
*/
228229
typedef struct StatsData
229230
{
230-
long start_time; /* interval start time, for aggregates */
231+
time_t start_time; /* interval start time, for aggregates */
231232
int64 cnt; /* number of transactions */
232233
int64 skipped; /* number of transactions skipped under --rate
233234
* and --latency-limit */
@@ -449,7 +450,7 @@ static const BuiltinScript builtin_script[] =
449450
static void setIntValue(PgBenchValue *pv, int64 ival);
450451
static void setDoubleValue(PgBenchValue *pv, double dval);
451452
static bool evaluateExpr(TState *, CState *, PgBenchExpr *, PgBenchValue *);
452-
static void doLog(TState *thread, CState *st, instr_time *now,
453+
static void doLog(TState *thread, CState *st,
453454
StatsData *agg, bool skipped, double latency, double lag);
454455
static void processXactStats(TState *thread, CState *st, instr_time *now,
455456
bool skipped, StatsData *agg);
@@ -780,7 +781,7 @@ mergeSimpleStats(SimpleStats *acc, SimpleStats *ss)
780781
* the given value.
781782
*/
782783
static void
783-
initStats(StatsData *sd, double start_time)
784+
initStats(StatsData *sd, time_t start_time)
784785
{
785786
sd->start_time = start_time;
786787
sd->cnt = 0;
@@ -2425,10 +2426,15 @@ doCustom(TState *thread, CState *st, StatsData *agg)
24252426
}
24262427

24272428
/*
2428-
* print log entry after completing one transaction.
2429+
* Print log entry after completing one transaction.
2430+
*
2431+
* We print Unix-epoch timestamps in the log, so that entries can be
2432+
* correlated against other logs. On some platforms this could be obtained
2433+
* from the instr_time reading the caller has, but rather than get entangled
2434+
* with that, we just eat the cost of an extra syscall in all cases.
24292435
*/
24302436
static void
2431-
doLog(TState *thread, CState *st, instr_time *now,
2437+
doLog(TState *thread, CState *st,
24322438
StatsData *agg, bool skipped, double latency, double lag)
24332439
{
24342440
FILE *logfile = thread->logfile;
@@ -2447,15 +2453,17 @@ doLog(TState *thread, CState *st, instr_time *now,
24472453
if (agg_interval > 0)
24482454
{
24492455
/*
2450-
* Loop until we reach the interval of the current transaction, and
2451-
* print all the empty intervals in between (this may happen with very
2452-
* low tps, e.g. --rate=0.1).
2456+
* Loop until we reach the interval of the current moment, and print
2457+
* any empty intervals in between (this may happen with very low tps,
2458+
* e.g. --rate=0.1).
24532459
*/
2454-
while (agg->start_time + agg_interval < INSTR_TIME_GET_DOUBLE(*now))
2460+
time_t now = time(NULL);
2461+
2462+
while (agg->start_time + agg_interval <= now)
24552463
{
24562464
/* print aggregated report to logfile */
24572465
fprintf(logfile, "%ld " INT64_FORMAT " %.0f %.0f %.0f %.0f",
2458-
agg->start_time,
2466+
(long) agg->start_time,
24592467
agg->cnt,
24602468
agg->latency.sum,
24612469
agg->latency.sum2,
@@ -2485,12 +2493,6 @@ doLog(TState *thread, CState *st, instr_time *now,
24852493
/* no, print raw transactions */
24862494
struct timeval tv;
24872495

2488-
/*
2489-
* We print the current system timestamp in the log, so that entries
2490-
* can be correlated against other logs. On some platforms this is
2491-
* available in *now, but rather than get entangled with that, we just
2492-
* eat the cost of an extra syscall in all cases.
2493-
*/
24942496
gettimeofday(&tv, NULL);
24952497
if (skipped)
24962498
fprintf(logfile, "%d " INT64_FORMAT " skipped %d %ld %ld",
@@ -2518,7 +2520,7 @@ processXactStats(TState *thread, CState *st, instr_time *now,
25182520
double latency = 0.0,
25192521
lag = 0.0;
25202522

2521-
if ((!skipped || agg_interval) && INSTR_TIME_IS_ZERO(*now))
2523+
if ((!skipped) && INSTR_TIME_IS_ZERO(*now))
25222524
INSTR_TIME_SET_CURRENT(*now);
25232525

25242526
if (!skipped)
@@ -2540,7 +2542,7 @@ processXactStats(TState *thread, CState *st, instr_time *now,
25402542
thread->stats.cnt++;
25412543

25422544
if (use_log)
2543-
doLog(thread, st, now, agg, skipped, latency, lag);
2545+
doLog(thread, st, agg, skipped, latency, lag);
25442546

25452547
/* XXX could use a mutex here, but we choose not to */
25462548
if (per_script_stats)
@@ -3202,7 +3204,7 @@ ParseScript(const char *script, const char *desc, int weight)
32023204
ps.desc = desc;
32033205
ps.weight = weight;
32043206
ps.commands = (Command **) pg_malloc(sizeof(Command *) * alloc_num);
3205-
initStats(&ps.stats, 0.0);
3207+
initStats(&ps.stats, 0);
32063208

32073209
/* Prepare to parse script */
32083210
sstate = psql_scan_create(&pgbench_callbacks);
@@ -3972,10 +3974,6 @@ main(int argc, char **argv)
39723974
}
39733975
break;
39743976
case 5:
3975-
#ifdef WIN32
3976-
fprintf(stderr, "--aggregate-interval is not currently supported on Windows\n");
3977-
exit(1);
3978-
#else
39793977
benchmarking_option_set = true;
39803978
agg_interval = atoi(optarg);
39813979
if (agg_interval <= 0)
@@ -3984,7 +3982,6 @@ main(int argc, char **argv)
39843982
optarg);
39853983
exit(1);
39863984
}
3987-
#endif
39883985
break;
39893986
case 6:
39903987
progress_timestamp = true;
@@ -4267,7 +4264,7 @@ main(int argc, char **argv)
42674264
thread->random_state[2] = random();
42684265
thread->logfile = NULL; /* filled in later */
42694266
thread->latency_late = 0;
4270-
initStats(&thread->stats, 0.0);
4267+
initStats(&thread->stats, 0);
42714268

42724269
nclients_dealt += thread->nstate;
42734270
}
@@ -4321,7 +4318,7 @@ main(int argc, char **argv)
43214318
#endif /* ENABLE_THREAD_SAFETY */
43224319

43234320
/* wait for threads and accumulate results */
4324-
initStats(&stats, 0.0);
4321+
initStats(&stats, 0);
43254322
INSTR_TIME_SET_ZERO(conn_total_time);
43264323
for (i = 0; i < nthreads; i++)
43274324
{
@@ -4394,6 +4391,9 @@ threadRun(void *arg)
43944391

43954392
INSTR_TIME_SET_ZERO(thread->conn_time);
43964393

4394+
initStats(&aggs, time(NULL));
4395+
last = aggs;
4396+
43974397
/* open log file if requested */
43984398
if (use_log)
43994399
{
@@ -4429,9 +4429,6 @@ threadRun(void *arg)
44294429
INSTR_TIME_SET_CURRENT(thread->conn_time);
44304430
INSTR_TIME_SUBTRACT(thread->conn_time, thread->start_time);
44314431

4432-
initStats(&aggs, INSTR_TIME_GET_DOUBLE(thread->start_time));
4433-
last = aggs;
4434-
44354432
/* explicitly initialize the state machines */
44364433
for (i = 0; i < nstate; i++)
44374434
{
@@ -4635,7 +4632,7 @@ threadRun(void *arg)
46354632
* (If a read from a 64-bit integer is not atomic, you might
46364633
* get a "torn" read and completely bogus latencies though!)
46374634
*/
4638-
initStats(&cur, 0.0);
4635+
initStats(&cur, 0);
46394636
for (i = 0; i < nthreads; i++)
46404637
{
46414638
mergeSimpleStats(&cur.latency, &thread[i].stats.latency);
@@ -4695,12 +4692,13 @@ threadRun(void *arg)
46954692
INSTR_TIME_ACCUM_DIFF(thread->conn_time, end, start);
46964693
if (thread->logfile)
46974694
{
4698-
if (agg_interval)
4695+
if (agg_interval > 0)
46994696
{
47004697
/* log aggregated but not yet reported transactions */
4701-
doLog(thread, state, &end, &aggs, false, 0, 0);
4698+
doLog(thread, state, &aggs, false, 0, 0);
47024699
}
47034700
fclose(thread->logfile);
4701+
thread->logfile = NULL;
47044702
}
47054703
return NULL;
47064704
}

0 commit comments

Comments
 (0)