Skip to content

Commit faa189c

Browse files
committed
Move libpq's write_failed mechanism down to pqsecure_raw_write().
Commit 1f39a1c implemented write-failure postponement in pqSendSome, which is above SSL/GSS processing. However, we've now seen failures indicating that (some versions of?) OpenSSL have a tendency to report write failures prematurely too. Hence, move the primary responsibility for postponing write failures down to pqsecure_raw_write(), below SSL/GSS processing. pqSendSome now sets write_failed only in corner cases where we'd lost the connection already. A side-effect of this change is that errors detected in the SSL/GSS layer itself will be reported immediately (as if they were read errors) rather than being postponed like write errors. That's reverting an effect of 1f39a1c, and I think it's fine: if there's not a socket-level error, it's hard to be sure whether an OpenSSL error ought to be considered a read or write failure anyway. Another important point is that write-failure postponement is now effective during connection setup. OpenSSL's misbehavior of this sort occurs during SSL_connect(), so that's a change we want. Per bug #17391 from Nazir Bilal Yavuz. Possibly this should be back-patched, but I think it prudent to let it age awhile in HEAD first. Discussion: https://postgr.es/m/17391-304f81bcf724b58b@postgresql.org
1 parent 335fa5a commit faa189c

File tree

2 files changed

+90
-38
lines changed

2 files changed

+90
-38
lines changed

src/interfaces/libpq/fe-misc.c

Lines changed: 20 additions & 27 deletions
Original file line numberDiff line numberDiff line change
@@ -777,19 +777,19 @@ pqReadData(PGconn *conn)
777777
* (putting it in conn->inBuffer) in any situation where we can't send
778778
* all the specified data immediately.
779779
*
780-
* Upon write failure, conn->write_failed is set and the error message is
781-
* saved in conn->write_err_msg, but we clear the output buffer and return
782-
* zero anyway; this is because callers should soldier on until it's possible
783-
* to read from the server and check for an error message. write_err_msg
784-
* should be reported only when we are unable to obtain a server error first.
785-
* (Thus, a -1 result is returned only for an internal *read* failure.)
780+
* If a socket-level write failure occurs, conn->write_failed is set and the
781+
* error message is saved in conn->write_err_msg, but we clear the output
782+
* buffer and return zero anyway; this is because callers should soldier on
783+
* until we have read what we can from the server and checked for an error
784+
* message. write_err_msg should be reported only when we are unable to
785+
* obtain a server error first. Much of that behavior is implemented at
786+
* lower levels, but this function deals with some edge cases.
786787
*/
787788
static int
788789
pqSendSome(PGconn *conn, int len)
789790
{
790791
char *ptr = conn->outBuffer;
791792
int remaining = conn->outCount;
792-
int oldmsglen = conn->errorMessage.len;
793793
int result = 0;
794794

795795
/*
@@ -817,7 +817,7 @@ pqSendSome(PGconn *conn, int len)
817817
if (conn->sock == PGINVALID_SOCKET)
818818
{
819819
conn->write_failed = true;
820-
/* Insert error message into conn->write_err_msg, if possible */
820+
/* Store error message in conn->write_err_msg, if possible */
821821
/* (strdup failure is OK, we'll cope later) */
822822
conn->write_err_msg = strdup(libpq_gettext("connection not open\n"));
823823
/* Discard queued data; no chance it'll ever be sent */
@@ -859,24 +859,6 @@ pqSendSome(PGconn *conn, int len)
859859
continue;
860860

861861
default:
862-
/* pqsecure_write set the error message for us */
863-
conn->write_failed = true;
864-
865-
/*
866-
* Transfer error message to conn->write_err_msg, if
867-
* possible (strdup failure is OK, we'll cope later).
868-
*
869-
* We only want to transfer whatever has been appended to
870-
* conn->errorMessage since we entered this routine.
871-
*/
872-
if (!PQExpBufferBroken(&conn->errorMessage))
873-
{
874-
conn->write_err_msg = strdup(conn->errorMessage.data +
875-
oldmsglen);
876-
conn->errorMessage.len = oldmsglen;
877-
conn->errorMessage.data[oldmsglen] = '\0';
878-
}
879-
880862
/* Discard queued data; no chance it'll ever be sent */
881863
conn->outCount = 0;
882864

@@ -886,7 +868,18 @@ pqSendSome(PGconn *conn, int len)
886868
if (pqReadData(conn) < 0)
887869
return -1;
888870
}
889-
return 0;
871+
872+
/*
873+
* Lower-level code should already have filled
874+
* conn->write_err_msg (and set conn->write_failed) or
875+
* conn->errorMessage. In the former case, we pretend
876+
* there's no problem; the write_failed condition will be
877+
* dealt with later. Otherwise, report the error now.
878+
*/
879+
if (conn->write_failed)
880+
return 0;
881+
else
882+
return -1;
890883
}
891884
}
892885
else

src/interfaces/libpq/fe-secure.c

Lines changed: 70 additions & 11 deletions
Original file line numberDiff line numberDiff line change
@@ -280,9 +280,22 @@ pqsecure_raw_read(PGconn *conn, void *ptr, size_t len)
280280
/*
281281
* Write data to a secure connection.
282282
*
283-
* On failure, this function is responsible for appending a suitable message
284-
* to conn->errorMessage. The caller must still inspect errno, but only
285-
* to determine whether to continue/retry after error.
283+
* Returns the number of bytes written, or a negative value (with errno
284+
* set) upon failure. The write count could be less than requested.
285+
*
286+
* Note that socket-level hard failures are masked from the caller,
287+
* instead setting conn->write_failed and storing an error message
288+
* in conn->write_err_msg; see pqsecure_raw_write. This allows us to
289+
* postpone reporting of write failures until we're sure no error
290+
* message is available from the server.
291+
*
292+
* However, errors detected in the SSL or GSS management level are reported
293+
* via a negative result, with message appended to conn->errorMessage.
294+
* It's frequently unclear whether such errors should be considered read or
295+
* write errors, so we don't attempt to postpone reporting them.
296+
*
297+
* The caller must still inspect errno upon failure, but only to determine
298+
* whether to continue/retry; a message has been saved someplace in any case.
286299
*/
287300
ssize_t
288301
pqsecure_write(PGconn *conn, const void *ptr, size_t len)
@@ -310,16 +323,50 @@ pqsecure_write(PGconn *conn, const void *ptr, size_t len)
310323
return n;
311324
}
312325

326+
/*
327+
* Low-level implementation of pqsecure_write.
328+
*
329+
* This is used directly for an unencrypted connection. For encrypted
330+
* connections, this does the physical I/O on behalf of pgtls_write or
331+
* pg_GSS_write.
332+
*
333+
* This function reports failure (i.e., returns a negative result) only
334+
* for retryable errors such as EINTR. Looping for such cases is to be
335+
* handled at some outer level, maybe all the way up to the application.
336+
* For hard failures, we set conn->write_failed and store an error message
337+
* in conn->write_err_msg, but then claim to have written the data anyway.
338+
* This is because we don't want to report write failures so long as there
339+
* is a possibility of reading from the server and getting an error message
340+
* that could explain why the connection dropped. Many TCP stacks have
341+
* race conditions such that a write failure may or may not be reported
342+
* before all incoming data has been read.
343+
*
344+
* Note that this error behavior happens below the SSL management level when
345+
* we are using SSL. That's because at least some versions of OpenSSL are
346+
* too quick to report a write failure when there's still a possibility to
347+
* get a more useful error from the server.
348+
*/
313349
ssize_t
314350
pqsecure_raw_write(PGconn *conn, const void *ptr, size_t len)
315351
{
316352
ssize_t n;
317353
int flags = 0;
318354
int result_errno = 0;
355+
char msgbuf[1024];
319356
char sebuf[PG_STRERROR_R_BUFLEN];
320357

321358
DECLARE_SIGPIPE_INFO(spinfo);
322359

360+
/*
361+
* If we already had a write failure, we will never again try to send data
362+
* on that connection. Even if the kernel would let us, we've probably
363+
* lost message boundary sync with the server. conn->write_failed
364+
* therefore persists until the connection is reset, and we just discard
365+
* all data presented to be written.
366+
*/
367+
if (conn->write_failed)
368+
return len;
369+
323370
#ifdef MSG_NOSIGNAL
324371
if (conn->sigpipe_flag)
325372
flags |= MSG_NOSIGNAL;
@@ -369,17 +416,29 @@ pqsecure_raw_write(PGconn *conn, const void *ptr, size_t len)
369416
/* FALL THRU */
370417

371418
case ECONNRESET:
372-
appendPQExpBufferStr(&conn->errorMessage,
373-
libpq_gettext("server closed the connection unexpectedly\n"
374-
"\tThis probably means the server terminated abnormally\n"
375-
"\tbefore or while processing the request.\n"));
419+
conn->write_failed = true;
420+
/* Store error message in conn->write_err_msg, if possible */
421+
/* (strdup failure is OK, we'll cope later) */
422+
snprintf(msgbuf, sizeof(msgbuf),
423+
libpq_gettext("server closed the connection unexpectedly\n"
424+
"\tThis probably means the server terminated abnormally\n"
425+
"\tbefore or while processing the request.\n"));
426+
conn->write_err_msg = strdup(msgbuf);
427+
/* Now claim the write succeeded */
428+
n = len;
376429
break;
377430

378431
default:
379-
appendPQExpBuffer(&conn->errorMessage,
380-
libpq_gettext("could not send data to server: %s\n"),
381-
SOCK_STRERROR(result_errno,
382-
sebuf, sizeof(sebuf)));
432+
conn->write_failed = true;
433+
/* Store error message in conn->write_err_msg, if possible */
434+
/* (strdup failure is OK, we'll cope later) */
435+
snprintf(msgbuf, sizeof(msgbuf),
436+
libpq_gettext("could not send data to server: %s\n"),
437+
SOCK_STRERROR(result_errno,
438+
sebuf, sizeof(sebuf)));
439+
conn->write_err_msg = strdup(msgbuf);
440+
/* Now claim the write succeeded */
441+
n = len;
383442
break;
384443
}
385444
}

0 commit comments

Comments
 (0)