The new `MDB_UNSIGNEDKEY` DbiFlag should probably be the default behaviour

This issue relates to this PR #199 (which addressed this issue #198) and the subsequent PR #237 that added MDB_UNSIGNED_KEY.

We have recently upgraded to 0.9.1 (thanks for this release and the fixes in it) and found that a number of our tests are failing due to cursor start/stop key comparisons.  Adding the `MDB_UNSIGNEDKEY` DbiFlag fixes them however it feels like the old `compareBuff` comparator ought to be the default rather than ByteBuffer.compareTo.

While I appreciate it is preferable to use methods from the JDK rather than hand rolled comparators, this is only the case IF the JDK methods behave in the same way as the existing code, which they don't. As @benalexau pointed out in the PR, `ByteBuffer.compareTo` is signed while `compareBuff` is unsigned.  The latter matches the way LMDB orders its entries while the former does not.

It is hard to see a case where you would want to use a comparator for the cursor start/stop keys that does not match that used for the iteration order imposed by LMDB's default comparator.  Maybe if you are only dealing with positive signed numbers they would match, but in that case, you would be better using MDB_INTEGER_kEY for the better performance.  There is always the option to use no stop key and to instead use your own code to determine when to break out of the cursor iteration.

This problem is not just limited to numbers.  As an example, if your key was a UTF16 string and you had keys:

`0x00 0x00`   // a null char
`0xFF 0xFD`   // a � char

They would appear in the DB in that order, but the ByteBuffer.compareTo would treat `0xFF 0xFD` as being the lower of the two.

If `MDB_UNSIGNEDKEY` is not set then it can lead to strange behaviour when using cursors that is highly value dependant, and thus tricky to fix.  A crude performance test between the two ByteBuffer comparators suggests that the JDK is only marginally quicker.

It may be preferable to return `compareBuff` to being the default (when no callback comparator is used) but to use the JDK comparators by setting a flag, essentially the opposite of what is in 0.9.1.  I.e. the developer is making an explicit choice to get the slight performance gain of the JDK method with the understanding that it comes with risks.  If they know their data fits within the constraints then it is ok.

One other option is not to use a java based comparator for the cursor start/stop keys and instead call down to this method on LMDB so that you are sure the comparison is being done in the same way as the DB is storing its entries.  I suspect that it would be more costly do make this call but you would at least be sure the comparison matched LMDB's.

```
int mdb_cmp (
    MDB_txn *         txn,
    MDB_dbi           dbi,
    const MDB_val *   a,
    const MDB_val *   b 
)

Compare two data items according to a particular database.

This returns a comparison as if the two data items were keys in the specified database.

Parameters
[in]	txn	A transaction handle returned by mdb_txn_begin()
[in]	dbi	A database handle returned by mdb_dbi_open()
[in]	a	The first item to compare
[in]	b	The second item to compare
Returns
< 0 if a < b, 0 if a == b, > 0 if a > b
```

Happy to raise a PR address any changes that are agreed.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

The new `MDB_UNSIGNEDKEY` DbiFlag should probably be the default behaviour #249

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

The new MDB_UNSIGNEDKEY DbiFlag should probably be the default behaviour #249

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions

The new `MDB_UNSIGNEDKEY` DbiFlag should probably be the default behaviour #249