Conversation

@dutow dutow commented Aug 18, 2025

There is at least one corner-case scenario where we have to load the last record into the cache during a write:

  • replica crashes, receives last segment from primary
  • replica replays last segment, reaches end
  • replica activates new key
  • replica replays prepared transaction, has to use old keys again
  • the old-key write function sees that we generated a new key and tries to load it

In this scenario we could get away with detecting that we are in a write and asserting if we tried to use the last key.

But in a release build assertions do not fire, so we would end up writing unencrypted data to disk, and recovery, if we later have to run it, would fail.

It could be a FATAL, but that would still crash the server, and the next startup would crash again and again...

Instead, to properly avoid this situation, we preallocate memory for one more key in the cache during initialization. Since we can add at most one extra key to the cache during the server's run, this means we no longer try to allocate in the critical section in any corner case.

While this is not the nicest solution, it is simple and keeps the current cache and decrypt/encrypt logic the same as before. Any other solution would be more complex, and even more of a hack, as it would require dealing with a possibly out-of-date cache.
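To make the idea concrete, here is a minimal sketch of the preallocation pattern described above. All names (KeyCache, cache_init, cache_add_key, etc.) are illustrative, not the actual pg_tde identifiers: the point is only that the one spare entry is allocated at init time, so adding a key inside a critical section never calls malloc().

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical sketch of a key cache with one preallocated spare entry. */
typedef struct KeyEntry
{
	int		generation;
	char	key[32];
} KeyEntry;

typedef struct KeyCache
{
	KeyEntry  **entries;
	int			nentries;
	KeyEntry   *spare;		/* preallocated outside any critical section */
} KeyCache;

/* Called during initialization, never inside a critical section. */
static void
cache_init(KeyCache *cache, int capacity)
{
	cache->entries = calloc(capacity, sizeof(KeyEntry *));
	cache->nentries = 0;
	cache->spare = malloc(sizeof(KeyEntry));	/* the "one more key" */
}

/* Safe inside a critical section: only consumes the spare, no malloc(). */
static KeyEntry *
cache_add_key(KeyCache *cache, int generation)
{
	KeyEntry   *e = cache->spare;

	/* At most one extra key can be added during the server's run. */
	assert(e != NULL);
	cache->spare = NULL;
	e->generation = generation;
	cache->entries[cache->nentries++] = e;
	return e;
}
```

The design choice mirrors the PR: the cache layout and the encrypt/decrypt paths stay unchanged; only the allocation is moved out of the critical section.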

@codecov-commenter commented Aug 18, 2025

Codecov Report

❌ Patch coverage is 85.71429% with 2 lines in your changes missing coverage. Please review.
✅ Project coverage is 82.49%. Comparing base (167aef2) to head (d01a7c0).

❌ Your project status has failed because the head coverage (82.49%) is below the target coverage (90.00%). You can increase the head coverage or adjust the target coverage.

Additional details and impacted files
@@                  Coverage Diff                  @@
##           TDE_REL_17_STABLE     #539      +/-   ##
=====================================================
- Coverage              82.50%   82.49%   -0.01%     
=====================================================
  Files                     25       25              
  Lines                   3207     3217      +10     
  Branches                 508      510       +2     
=====================================================
+ Hits                    2646     2654       +8     
  Misses                   452      452              
- Partials                 109      111       +2     
Components   Coverage          Δ
access       84.84% <85.71%>   (-0.07%) ⬇️
catalog      87.68% <ø>        (ø)
common       77.77% <ø>        (ø)
encryption   72.97% <ø>        (ø)
keyring      73.21% <ø>        (ø)
src          94.15% <ø>        (ø)
smgr         96.53% <ø>        (ø)
transam      ∅ <ø>             (∅)

@AndersAstrand
Collaborator

AndersAstrand commented Aug 18, 2025

I find the names a bit confusing. Maybe prealloc would be clearer than extra, and record clearer than alloc for the cache record.

To me it's not clear what "extra" refers to at all at least, while "preallocated" is clear.

@AndersAstrand left a comment

This is a good solution for now. We need to clean up the whole WAL key cache setup at some point anyway.

I think the naming could be made clearer. And maybe InitWrite should always call the preallocate function, with a comment like "Ensure there are preallocated key and cache entries because they're sometimes needed in the critical section". That would be easier to understand than working out why this is only done when loading all of the keys.
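The reviewer's suggestion could look roughly like the sketch below. WalKeyCache, preallocate_cache_record, and InitWrite are hypothetical names standing in for the real pg_tde functions; the point is that InitWrite unconditionally ensures the spare record exists, documented with the suggested comment, rather than doing so only on the load-all-keys path.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical cache holding one spare record for critical-section use. */
typedef struct WalKeyCache
{
	void	   *prealloc_record;
} WalKeyCache;

/* Idempotent: allocates the spare record only if it is missing. */
static void
preallocate_cache_record(WalKeyCache *cache)
{
	if (cache->prealloc_record == NULL)
		cache->prealloc_record = malloc(64);
}

static void
InitWrite(WalKeyCache *cache)
{
	/*
	 * Ensure there are preallocated key and cache entries because they're
	 * sometimes needed in the critical section (e.g. the prepared-
	 * transaction replay corner case described above).
	 */
	preallocate_cache_record(cache);

	/* ... fetch keys and continue initialization ... */
}
```

Because the preallocate call is idempotent, calling it unconditionally here costs nothing on the common path and makes the invariant obvious at the call site.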

@dutow
Collaborator Author

dutow commented Aug 18, 2025

> And maybe InitWrite should always call the preallocate function with a comment like

Technically, that will happen; the check there is only a safety net. It could be an assert. But since the init function is the first function that tries to use the WAL cache, it will also be the first function to fetch the keys.

@dutow dutow merged commit ff8a389 into percona:TDE_REL_17_STABLE Aug 18, 2025
18 of 19 checks passed
@dutow dutow deleted the pg1604-2 branch August 18, 2025 16:59