Skip to content

Deprecate rocksdb.transaction-lock-timeout option #21904

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 13 commits into from
Aug 11, 2025

Conversation

jbajic
Copy link
Contributor

@jbajic jbajic commented Aug 8, 2025

Scope & Purpose

  1. The timeout parameters --rocksdb.transaction-lock-timeout do very little since we practically ignore the value and set our own in this part of the code in RocksDBTrxBaseMethods.cpp:
void RocksDBTrxBaseMethods::createTransaction() {
  // start rocks transaction
  rocksdb::TransactionOptions trxOpts;

  if (_state->hasHint(transaction::Hints::Hint::IS_FOLLOWER_TRX)) {
    // write operations for the same keys on followers should normally be
    // serialized by the key locks held on the leaders. so we don't expect
    // to run into lock conflicts on followers. however, the lock_timeout
    // set here also includes locking the striped mutex for _all_ key locks,
    // which may be contended under load. to avoid timeouts caused by
    // waiting for the contented striped mutex, increase it to a higher
    // value on followers that makes this situation unlikely.
    trxOpts.lock_timeout = 3000;
  } else if (_state->options().delaySnapshot) {
    // for single operations we delay acquiring the snapshot so we can lock
    // the key _before_ we acquire the snapshot to prevent write-write
    // conflicts. in this case we obviously also want to use a higher lock
    // timeout
    // TODO - make this configurable
    trxOpts.lock_timeout = 1000;
  } else {
    // when trying to lock the same keys, we want to return quickly and not
    // spend the default 1000ms before giving up
    trxOpts.lock_timeout = 1;
  }

meaning that whatever was the --rocksdb.transaction-lock-timeout value we will override it.

We use RocksDB transactions with set_snapshot set to true and pessimistic concurrency control). This means that
RocksDB locks all objects first and does not wait for Commit to perform checks; it can do them immediately.
And when another transaction wants to change the object that we are locking, then the question is whether the object that we are locking will be changed or not. E.g., using GetForUpdate does lock the object but might not change it; therefore, the lock timeout here serves us to determine for how long we will wait for the lock, but does not guarantee that even when the lock is released, there will not be conflicts.

The timeout plays a role when we have delaySnapshot and we have that only for single operations; the delay snapshot in our case, does not set set_snapshot of transaction, which means it does not take the snapshot and does not perform the check,s which makes it much quicker to execute and in this case the timeout does make sense, since we will not abort on conflict but will just wait until whichever transaction holds it to release it.

Also, changing the lock-timeout to 0 might not be a good option since the lock being acquired is a lock on RocksDB PointLockManager, which enables establishing a lock on the actual key. Meaning the timeout is on a lock at a level above the document lock whose timeout we would want to set. That can cause spurious failures even without any conflicting transactions.

  • 💩 Bugfix
  • 🍕 New feature
  • 🔥 Performance improvement
  • 🔨 Refactoring/simplification

Checklist

  • Tests
    • Regression tests
    • C++ Unit tests
    • integration tests
    • resilience tests
  • 📖 CHANGELOG entry made
  • 📚 documentation written (release notes, API changes, ...)
  • Backports
    • Backport for 3.12.0: (Please link PR)
    • Backport for 3.11: (Please link PR)
    • Backport for 3.10: (Please link PR)

Related Information

(Please reference tickets / specification / other PRs etc)

  • Docs PR:
  • Enterprise PR:
  • GitHub issue / Jira ticket:
  • Design document:

@jbajic jbajic self-assigned this Aug 8, 2025
@cla-bot cla-bot bot added the cla-signed label Aug 8, 2025
@jbajic jbajic marked this pull request as ready for review August 8, 2025 10:00
@jbajic jbajic requested a review from goedderz August 8, 2025 11:13
@KVS85 KVS85 changed the title Depracate rocksdb.transaction-lock-timeout option Deprecate rocksdb.transaction-lock-timeout option Aug 11, 2025
@KVS85 KVS85 added this to the devel milestone Aug 11, 2025
@KVS85 KVS85 added the 3 RocksDB Storage engine related label Aug 11, 2025
jbajic and others added 2 commits August 11, 2025 13:46
Co-authored-by: Tobias Gödderz <tobias@arangodb.com>
Co-authored-by: Tobias Gödderz <tobias@arangodb.com>
@goedderz goedderz merged commit 9d7ee03 into devel Aug 11, 2025
7 checks passed
@goedderz goedderz deleted the bug-fix/depracate-transaction-lock-timeout branch August 11, 2025 12:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
3 RocksDB Storage engine related cla-signed
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants