Skip to content

Deprecate rocksdb.transaction-lock-timeout option #21904

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 13 commits into from
Aug 11, 2025
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
3 changes: 3 additions & 0 deletions CHANGELOG
Original file line number Diff line number Diff line change
@@ -1,6 +1,9 @@
devel
-----

* Deprecate rocksdb.transaction-lock-timeout option since we would
internally change this to a different value.

* Rebuilt included rclone v1.65.2 with go1.23.12 and non-vulnerable
dependencies.

Expand Down
6 changes: 3 additions & 3 deletions arangod/RestHandler/RestDocumentHandler.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -238,7 +238,7 @@ async<void> RestDocumentHandler::insertDocument() {

bool const isMultiple = body.isArray();
transaction::Options trxOpts;
trxOpts.delaySnapshot = !isMultiple; // for now we only enable this for
trxOpts.avoidSnapshot = !isMultiple; // for now we only enable this for
// single document operations

auto trx = co_await createTransaction(
Expand Down Expand Up @@ -588,7 +588,7 @@ async<void> RestDocumentHandler::modifyDocument(bool isPatch) {

bool const isMultiple = body.isArray();
transaction::Options trxOpts;
trxOpts.delaySnapshot = !isMultiple; // for now we only enable this for
trxOpts.avoidSnapshot = !isMultiple; // for now we only enable this for
// single document operations

// find collection given by name or identifier
Expand Down Expand Up @@ -748,7 +748,7 @@ async<void> RestDocumentHandler::removeDocument() {

bool const isMultiple = search.isArray();
transaction::Options trxOpts;
trxOpts.delaySnapshot = !isMultiple; // for now we only enable this for
trxOpts.avoidSnapshot = !isMultiple; // for now we only enable this for
// single document operations

auto trx = co_await createTransaction(
Expand Down
22 changes: 13 additions & 9 deletions arangod/RocksDBEngine/Methods/RocksDBTrxBaseMethods.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -105,7 +105,7 @@ Result RocksDBTrxBaseMethods::beginTransaction() {
TRI_ASSERT(_rocksTransaction != nullptr);
TRI_ASSERT(_rocksTransaction->GetSnapshot() == nullptr);

if (!_state->options().delaySnapshot) {
if (!_state->options().avoidSnapshot) {
// In some cases we delay acquiring the snapshot so we can lock the key(s)
// _before_ we acquire the snapshot to prevent write-write conflicts. In all
// other cases we acquire the snapshot right now to be consistent with the
Expand Down Expand Up @@ -197,7 +197,7 @@ rocksdb::Status RocksDBTrxBaseMethods::Get(rocksdb::ColumnFamilyHandle* cf,
ReadOwnWrites readOwnWrites) {
TRI_ASSERT(cf != nullptr);
rocksdb::ReadOptions const& ro = _readOptions;
TRI_ASSERT(ro.snapshot != nullptr || _state->options().delaySnapshot);
TRI_ASSERT(ro.snapshot != nullptr || _state->options().avoidSnapshot);
if (readOwnWrites == ReadOwnWrites::yes) {
return _rocksTransaction->Get(ro, cf, key, val);
}
Expand All @@ -210,7 +210,7 @@ rocksdb::Status RocksDBTrxBaseMethods::GetForUpdate(
TRI_ASSERT(cf != nullptr);
TRI_ASSERT(_rocksTransaction);
rocksdb::ReadOptions const& ro = _readOptions;
TRI_ASSERT(ro.snapshot != nullptr || _state->options().delaySnapshot);
TRI_ASSERT(ro.snapshot != nullptr || _state->options().avoidSnapshot);
rocksdb::Status s = _rocksTransaction->GetForUpdate(ro, cf, key, val);
if (s.ok()) {
_memoryTracker.increaseMemoryUsage(
Expand Down Expand Up @@ -383,13 +383,13 @@ void RocksDBTrxBaseMethods::createTransaction() {
if (_state->hasHint(transaction::Hints::Hint::IS_FOLLOWER_TRX)) {
// write operations for the same keys on followers should normally be
// serialized by the key locks held on the leaders. so we don't expect
// to run into lock conflicts on followers. however, the lock_timeout
// set here also includes locking the striped mutex for _all_ key locks,
// which may be contended under load. to avoid timeouts caused by
// waiting for the contented striped mutex, increase it to a higher
// value on followers that makes this situation unlikely.
// to run into lock conflicts on followers. the lock_timeout doesn't do
// much here since the RocksDB does not differentiate between
// lock timeout on striped_mutex for PointLockManager and on single
// documents, this timeout is ignored unless it is set to 0, then we
// try_lock otherwise we just lock.
trxOpts.lock_timeout = 3000;
} else if (_state->options().delaySnapshot) {
} else if (_state->options().avoidSnapshot) {
// for single operations we delay acquiring the snapshot so we can lock
// the key _before_ we acquire the snapshot to prevent write-write
// conflicts. in this case we obviously also want to use a higher lock
Expand All @@ -399,6 +399,10 @@ void RocksDBTrxBaseMethods::createTransaction() {
} else {
// when trying to lock the same keys, we want to return quickly and not
// spend the default 1000ms before giving up
// Setting lock_timeout to 0 is not an option because then RocksDB uses
// try_lock on the striped mutex, and if it cannot acquire the lock it
// fails which can lead to spurious failures without having
// conflicting transactions.
trxOpts.lock_timeout = 1;
}

Expand Down
27 changes: 15 additions & 12 deletions arangod/RocksDBEngine/RocksDBOptionFeature.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -408,18 +408,21 @@ values to reduce a potential contention in the lock manager.
The option defaults to the number of available cores, but is increased to a
value of `16` if the number of cores is lower.)");

options->addOption(
"--rocksdb.transaction-lock-timeout",
"If positive, specifies the wait timeout in milliseconds when "
" a transaction attempts to lock a document. A negative value "
"is not recommended as it can lead to deadlocks (0 = no waiting, < 0 no "
"timeout)",
new Int64Parameter(&_transactionLockTimeout),
arangodb::options::makeFlags(
arangodb::options::Flags::DefaultNoComponents,
arangodb::options::Flags::OnAgent,
arangodb::options::Flags::OnDBServer,
arangodb::options::Flags::OnSingle));
options
->addOption(
"--rocksdb.transaction-lock-timeout",
"If positive, specifies the wait timeout in milliseconds when "
" a transaction attempts to lock a document. A negative value "
"is not recommended as it can lead to deadlocks (0 = no waiting, < 0 "
"no timeout). This is deprecated since we internally control the "
"lock timeout for different cases.",
new Int64Parameter(&_transactionLockTimeout),
arangodb::options::makeFlags(
arangodb::options::Flags::DefaultNoComponents,
arangodb::options::Flags::OnAgent,
arangodb::options::Flags::OnDBServer,
arangodb::options::Flags::OnSingle))
.setDeprecatedIn(31206);

options
->addOption(
Expand Down
5 changes: 2 additions & 3 deletions arangod/Transaction/Options.h
Original file line number Diff line number Diff line change
Expand Up @@ -108,10 +108,9 @@ struct Options {
bool requiresReplication = true;

/// @brief If set to true, the transaction is started without acquiring a
/// snapshot. The snapshot can be acquired at a later point by calling
/// `ensureSnapshot`. This allows us to lock the used keys before the
/// snapshot. This allows us to lock the used keys before the
/// snapshot is acquired in order to avoid write-write conflict.
bool delaySnapshot = false;
bool avoidSnapshot = false;

/// @brief if set to true, skips the fast, unordered lock round and always
/// uses the sequential, ordered lock round.
Expand Down
10 changes: 5 additions & 5 deletions arangod/V8Server/v8-collection.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -744,7 +744,7 @@ static void RemoveVocbaseCol(v8::FunctionCallbackInfo<v8::Value> const& args) {

bool payloadIsArray = args[0]->IsArray();
transaction::Options trxOpts;
trxOpts.delaySnapshot = !payloadIsArray; // for now we only enable this for
trxOpts.avoidSnapshot = !payloadIsArray; // for now we only enable this for
// single document operations

VPackSlice toRemove = searchBuilder.slice();
Expand Down Expand Up @@ -845,7 +845,7 @@ static void RemoveVocbase(v8::FunctionCallbackInfo<v8::Value> const& args) {
TRI_ASSERT(toRemove.isObject());

transaction::Options trxOpts;
trxOpts.delaySnapshot = true;
trxOpts.avoidSnapshot = true;

SingleCollectionTransaction trx(
std::shared_ptr<transaction::Context>(
Expand Down Expand Up @@ -1534,7 +1534,7 @@ static void ModifyVocbaseCol(TRI_voc_document_operation_e operation,

bool payloadIsArray = args[0]->IsArray();
transaction::Options trxOpts;
trxOpts.delaySnapshot = !payloadIsArray; // for now we only enable this for
trxOpts.avoidSnapshot = !payloadIsArray; // for now we only enable this for
// single document operations

// Now start the transaction:
Expand Down Expand Up @@ -1656,7 +1656,7 @@ static void ModifyVocbase(TRI_voc_document_operation_e operation,
}

transaction::Options trxOpts;
trxOpts.delaySnapshot = true;
trxOpts.avoidSnapshot = true;

SingleCollectionTransaction trx(
std::shared_ptr<transaction::Context>(
Expand Down Expand Up @@ -1954,7 +1954,7 @@ static void InsertVocbaseCol(v8::Isolate* isolate,
}

transaction::Options trxOpts;
trxOpts.delaySnapshot = !payloadIsArray; // for now we only enable this for
trxOpts.avoidSnapshot = !payloadIsArray; // for now we only enable this for
// single document operations

auto origin = transaction::OperationOriginREST{"inserting document(s)"};
Expand Down
2 changes: 1 addition & 1 deletion tests/sepp/Workloads/WriteWriteConflict.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -88,7 +88,7 @@ void WriteWriteConflict::Thread::run() {
}

transaction::Options trxOpts;
trxOpts.delaySnapshot = _options.delaySnapshot;
trxOpts.avoidSnapshot = _options.avoidSnapshot;

for (;;) {
SingleCollectionTransaction trx(
Expand Down
4 changes: 2 additions & 2 deletions tests/sepp/Workloads/WriteWriteConflict.h
Original file line number Diff line number Diff line change
Expand Up @@ -71,14 +71,14 @@ struct WriteWriteConflict::Options {
struct Thread {
std::string collection;
OperationType operation;
bool delaySnapshot;
bool avoidSnapshot;

template<class Inspector>
friend inline auto inspect(Inspector& f, Thread& o) {
return f.object(o).fields(
f.field("collection", o.collection),
f.field("operation", o.operation),
f.field("delaySnapshot", o.delaySnapshot).fallback(true));
f.field("avoidSnapshot", o.avoidSnapshot).fallback(true));
}
};

Expand Down