Skip to content

Commit efc3289

Browse files
Brian Fosterdchinner
authored andcommitted
xfs: clear ail delwri queued bufs on unmount of shutdown fs
In the typical unmount case, the AIL is forced out by the unmount sequence before the xfsaild task is stopped. Since AIL items are removed on writeback completion, this means that the AIL ->ail_buf_list delwri queue has been drained. This is not always true in the shutdown case, however. It's possible for buffers to sit on a delwri queue for a period of time across submission attempts if said items are locked or have been relogged and pinned since first added to the queue. If the attempt to log such an item results in a log I/O error, the error processing can shutdown the fs, remove the item from the AIL, stale the buffer (dropping the LRU reference) and clear its delwri queue state. The latter bit means the buffer will be released from a delwri queue on the next submission attempt, but this might never occur if the filesystem has shutdown and the AIL is empty. This means that such buffers are held indefinitely by the AIL delwri queue across destruction of the AIL. Aside from being a memory leak, these buffers can also hold references to in-core perag structures. The latter problem manifests as a generic/475 failure, reproducing the following asserts at unmount time: XFS: Assertion failed: atomic_read(&pag->pag_ref) == 0, file: fs/xfs/xfs_mount.c, line: 151 XFS: Assertion failed: atomic_read(&pag->pag_ref) == 0, file: fs/xfs/xfs_mount.c, line: 132 To prevent this problem, clear the AIL delwri queue as a final step before xfsaild() exit. The !empty state should never occur in the normal case, so add an assert to catch unexpected problems going forward. [dgc: add comment explaining need for xfs_buf_delwri_cancel() after calling xfs_buf_delwri_submit_nowait().] Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
1 parent 26ca390 commit efc3289

File tree

2 files changed

+29
-6
lines changed

2 files changed

+29
-6
lines changed

fs/xfs/xfs_buf.c

Lines changed: 7 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -2055,6 +2055,13 @@ xfs_buf_delwri_submit_buffers(
20552055
* is only safely useable for callers that can track I/O completion by higher
20562056
* level means, e.g. AIL pushing as the @buffer_list is consumed in this
20572057
* function.
2058+
*
2059+
* Note: this function will skip buffers it would block on, and in doing so
2060+
* leaves them on @buffer_list so they can be retried on a later pass. As such,
2061+
* it is up to the caller to ensure that the buffer list is fully submitted or
2062+
* cancelled appropriately when they are finished with the list. Failure to
2063+
* cancel or resubmit the list until it is empty will result in leaked buffers
2064+
* at unmount time.
20582065
*/
20592066
int
20602067
xfs_buf_delwri_submit_nowait(

fs/xfs/xfs_trans_ail.c

Lines changed: 22 additions & 6 deletions
Original file line numberDiff line numberDiff line change
@@ -531,17 +531,33 @@ xfsaild(
531531
set_current_state(TASK_INTERRUPTIBLE);
532532

533533
/*
534-
* Check kthread_should_stop() after we set the task state
535-
* to guarantee that we either see the stop bit and exit or
536-
* the task state is reset to runnable such that it's not
537-
* scheduled out indefinitely and detects the stop bit at
538-
* next iteration.
539-
*
534+
* Check kthread_should_stop() after we set the task state to
535+
* guarantee that we either see the stop bit and exit or the
536+
* task state is reset to runnable such that it's not scheduled
537+
* out indefinitely and detects the stop bit at next iteration.
540538
* A memory barrier is included in above task state set to
541539
* serialize again kthread_stop().
542540
*/
543541
if (kthread_should_stop()) {
544542
__set_current_state(TASK_RUNNING);
543+
544+
/*
545+
* The caller forces out the AIL before stopping the
546+
* thread in the common case, which means the delwri
547+
* queue is drained. In the shutdown case, the queue may
548+
* still hold relogged buffers that haven't been
549+
* submitted because they were pinned since added to the
550+
* queue.
551+
*
552+
* Log I/O error processing stales the underlying buffer
553+
* and clears the delwri state, expecting the buf to be
554+
* removed on the next submission attempt. That won't
555+
* happen if we're shutting down, so this is the last
556+
* opportunity to release such buffers from the queue.
557+
*/
558+
ASSERT(list_empty(&ailp->ail_buf_list) ||
559+
XFS_FORCED_SHUTDOWN(ailp->ail_mount));
560+
xfs_buf_delwri_cancel(&ailp->ail_buf_list);
545561
break;
546562
}
547563

0 commit comments

Comments
 (0)