INTPYTHON-527 Add Queryable Encryption support #329

Open: wants to merge 3 commits into base: main

Conversation

aclark4life
Collaborator

@aclark4life aclark4life commented Jun 27, 2025

Previous attempts and additional context here:

@timgraham
Collaborator

The encryption tests are passing locally for me on Enterprise and on the Atlas VM.

On GitHub Actions, the first issue (traceback below) was solved by adding "directConnection": True to DATABASES:

  File "/home/runner/work/django-mongodb-backend/django-mongodb-backend/django_repo/django/db/backends/base/base.py", line 197, in check_database_version_supported
    and self.get_database_version() < self.features.minimum_database_version
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/runner/work/django-mongodb-backend/django-mongodb-backend/django_mongodb_backend/base.py", line 235, in get_database_version
    return tuple(self.connection.admin.command("buildInfo")["versionArray"])
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/runner/.local/lib/python3.12/site-packages/pymongo/_csot.py", line 125, in csot_wrapper
    return func(self, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/runner/.local/lib/python3.12/site-packages/pymongo/synchronous/database.py", line 926, in command
    with self._client._conn_for_reads(read_preference, session, operation=command_name) as (
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/runner/.local/lib/python3.12/site-packages/pymongo/synchronous/mongo_client.py", line 1864, in _conn_for_reads
    server = self._select_server(read_preference, session, operation)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/runner/.local/lib/python3.12/site-packages/pymongo/synchronous/mongo_client.py", line 1812, in _select_server
    server = topology.select_server(
             ^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/runner/.local/lib/python3.12/site-packages/pymongo/synchronous/topology.py", line 409, in select_server
    server = self._select_server(
             ^^^^^^^^^^^^^^^^^^^^
  File "/home/runner/.local/lib/python3.12/site-packages/pymongo/synchronous/topology.py", line 387, in _select_server
    servers = self.select_servers(
              ^^^^^^^^^^^^^^^^^^^^
  File "/home/runner/.local/lib/python3.12/site-packages/pymongo/synchronous/topology.py", line 294, in select_servers
    server_descriptions = self._select_servers_loop(
                          ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/runner/.local/lib/python3.12/site-packages/pymongo/synchronous/topology.py", line 344, in _select_servers_loop
    raise ServerSelectionTimeoutError(
pymongo.errors.ServerSelectionTimeoutError: Could not reach any servers in [('3ff0eef351ff', 27017)]. Replica set is configured with internal hostnames or IPs?, Timeout: 30s, Topology Description: <TopologyDescription id: 688e9488ca8c88d98365da45, topology_type: ReplicaSetNoPrimary, servers: [<ServerDescription ('3ff0eef351ff', 27017) server_type: Unknown, rtt: None, error=AutoReconnect('3ff0eef351ff:27017: [Errno -3] Temporary failure in name resolution (configured timeouts: socketTimeoutMS: 20000.0ms, connectTimeoutMS: 20000.0ms)')>]>
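A minimal sketch of that fix, assuming the option goes in the `OPTIONS` dict of the database settings (whose entries are forwarded to `MongoClient`). The host and database name are placeholders, not the project's actual CI settings:

```python
# Hypothetical Django settings fragment; host and database name are placeholders.
DATABASES = {
    "default": {
        "ENGINE": "django_mongodb_backend",
        "HOST": "mongodb://localhost:27017",
        "NAME": "djangotests",
        "OPTIONS": {
            # Talk only to the host given above instead of discovering replica
            # set members, whose advertised hostnames (here a container ID)
            # may not resolve from the CI runner.
            "directConnection": True,
        },
    },
}
```

With a direct connection the driver skips replica set discovery, so the unresolvable internal hostname in the error above is never contacted.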

But this issue remains:

Creating test database for alias 'encrypted' ('test_djangotests-encrypted')...
/home/runner/.local/lib/python3.12/site-packages/pymongo/daemon.py:147: RuntimeWarning: Failed to start mongocryptd: is it on your $PATH?
Original exception: [Errno 2] No such file or directory: 'mongocryptd'
  _silence_resource_warning(_spawn(sys.argv[1:]))
/home/runner/.local/lib/python3.12/site-packages/pymongo/daemon.py:147: RuntimeWarning: Failed to start mongocryptd: is it on your $PATH?
Original exception: [Errno 2] No such file or directory: 'mongocryptd'
  _silence_resource_warning(_spawn(sys.argv[1:]))
  Applying sites.0002_alter_domain_unique... OK
Operations to perform:
  Synchronize unmigrated apps: auth, contenttypes, encryption_, messages, sessions, staticfiles
  Apply all migrations: admin, sites
Synchronizing apps without migrations:
  Creating tables...
    Creating table encryption__appointment
Traceback (most recent call last):
  File "/home/runner/.local/lib/python3.12/site-packages/pymongo/synchronous/encryption.py", line 286, in mark_command
    res = self.mongocryptd_client[database].command(
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/runner/.local/lib/python3.12/site-packages/pymongo/_csot.py", line 125, in csot_wrapper
    return func(self, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/runner/.local/lib/python3.12/site-packages/pymongo/synchronous/database.py", line 926, in command
    with self._client._conn_for_reads(read_preference, session, operation=command_name) as (
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/runner/.local/lib/python3.12/site-packages/pymongo/synchronous/mongo_client.py", line 1864, in _conn_for_reads
    server = self._select_server(read_preference, session, operation)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/runner/.local/lib/python3.12/site-packages/pymongo/synchronous/mongo_client.py", line 1812, in _select_server
    server = topology.select_server(
             ^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/runner/.local/lib/python3.12/site-packages/pymongo/synchronous/topology.py", line 409, in select_server
    server = self._select_server(
             ^^^^^^^^^^^^^^^^^^^^
  File "/home/runner/.local/lib/python3.12/site-packages/pymongo/synchronous/topology.py", line 387, in _select_server
    servers = self.select_servers(
              ^^^^^^^^^^^^^^^^^^^^
  File "/home/runner/.local/lib/python3.12/site-packages/pymongo/synchronous/topology.py", line 294, in select_servers
    server_descriptions = self._select_servers_loop(
                          ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/runner/.local/lib/python3.12/site-packages/pymongo/synchronous/topology.py", line 344, in _select_servers_loop
    raise ServerSelectionTimeoutError(
pymongo.errors.ServerSelectionTimeoutError: localhost:27020: [Errno 111] Connection refused (configured timeouts: socketTimeoutMS: 20000.0ms, connectTimeoutMS: 20000.0ms), Timeout: 10.0s, Topology Description: <TopologyDescription id: 688e9f76697ba6965a378048, topology_type: Unknown, servers: [<ServerDescription ('localhost', 27020) server_type: Unknown, rtt: None, error=AutoReconnect('localhost:27020: [Errno 111] Connection refused (configured timeouts: socketTimeoutMS: 20000.0ms, connectTimeoutMS: 20000.0ms)')>]>
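Both symptoms in this log (the spawn warning and the refused connection on `localhost:27020`) can be checked before running the tests. A rough sketch using only the standard library; the function names are made up for illustration, and port 27020 is simply the one shown in the error above:

```python
import shutil
import socket


def mongocryptd_on_path():
    """Return the full path to the mongocryptd binary, or None if it is not on $PATH."""
    return shutil.which("mongocryptd")


def port_is_open(host="localhost", port=27020, timeout=1.0):
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    print("mongocryptd binary:", mongocryptd_on_path() or "not found")
    print("listening on localhost:27020:", port_is_open())
```

If the binary is not found, the driver cannot spawn the daemon, which would explain the refused connection that follows.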

@aclark4life
Collaborator Author

I'm still unclear on the user workflow. What I envisioned is:

* define model, `makemigrations`, `migrate`

* Put the output of `showencryptedfieldsmap` (which includes keyIds retrieved from server) in `AutoEncryptionOpts(encrypted_fields_map=...)`.

That is correct. We can remove --create if we never want to support this workflow:

  • define model
  • makemigrations
  • Put the output of showencryptedfieldsmap --create (which creates new keys) in AutoEncryptionOpts(encrypted_fields_map=...).
  • migrate

In other words, do we support the chicken and the egg or just the egg 😂
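For readers unfamiliar with what goes into `encrypted_fields_map`, here is a sketch of its general shape as used by MongoDB Queryable Encryption. It is not verified output of `showencryptedfieldsmap`: the namespace, field, and key ID are hypothetical, and a real map carries BSON binary key IDs rather than the `uuid.UUID` placeholder used here.

```python
import uuid

# Placeholder only. A real entry holds the BSON binary (UUID) id of a data key
# stored in the key vault collection.
PLACEHOLDER_KEY_ID = uuid.UUID("00000000-0000-0000-0000-000000000000")

encrypted_fields_map = {
    # Key is "<database>.<collection>"; both names are hypothetical.
    "mydb.myapp_patient": {
        "fields": [
            {
                "path": "ssn",
                "bsonType": "string",
                "keyId": PLACEHOLDER_KEY_ID,
                "queries": {"queryType": "equality"},
            },
        ],
    },
}


def encrypted_paths(fields_map):
    """Map each namespace to the list of encrypted field paths it declares."""
    return {
        namespace: [field["path"] for field in spec["fields"]]
        for namespace, spec in fields_map.items()
    }


print(encrypted_paths(encrypted_fields_map))
```

The ordering question above comes down to where each `keyId` originates: read back from the server after `migrate`, or created up front before the collection exists.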

@aclark4life
Collaborator Author

But this issue remains:

Creating test database for alias 'encrypted' ('test_djangotests-encrypted')...
/home/runner/.local/lib/python3.12/site-packages/pymongo/daemon.py:147: RuntimeWarning: Failed to start mongocryptd: is it on your $PATH?
Original exception: [Errno 2] No such file or directory: 'mongocryptd'
  _silence_resource_warning(_spawn(sys.argv[1:]))
/home/runner/.local/lib/python3.12/site-packages/pymongo/daemon.py:147: RuntimeWarning: Failed to start mongocryptd: is it on your $PATH?
Original exception: [Errno 2] No such file or directory: 'mongocryptd'

Any progress on this? Do we need to ask @blink1073 for help?

@addaleax

addaleax commented Aug 6, 2025

@aclark4life Soooo ... I don't know what you've done around that area so far, but it's great that you're bringing it up, I didn't realize we'd have to still worry about this part because I don't think the PR currently concerns itself at all with how we integrate with crypt_shared/mongocryptd?

So I'll explain the whole story here, knowing that I'm risking telling you things you're already well aware of 🙂

  • For automatic CSFLE/QE, we don't just need the libmongocrypt integration provided by the MongoDB driver, we also need a query analysis engine. This is created from part of the server source code, and comes in two forms:
    1. mongocryptd, a daemon process that can be spawned and run by the driver and talked to via the MongoDB wire protocol. We consider this a legacy solution, but it is fully supported and not officially deprecated for now.
    2. The mongo_crypt_v1 shared library, a dll/shared object that can be loaded by libmongocrypt at runtime.

Operationally, mongocryptd is more difficult to deploy in production applications (you're essentially managing an external daemon process from the driver, which is complex on its own, and potentially sharing this instance between multiple application processes), so while it is currently still supported, we are looking to phase it out in the long run. For backwards compatibility reasons, it is still the default (and fallback) though.

So we generally recommend using the crypt_shared library, which can be achieved by passing crypt_shared_lib_path=... to the driver, and ideally also setting crypt_shared_lib_required=True so that libmongocrypt doesn't fall back to mongocryptd and instead fails if the crypt_shared library could not be loaded.
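A sketch of what that could look like with PyMongo's `AutoEncryptionOpts`. The library path, key vault namespace, and throwaway local master key are placeholders for illustration only; the helper names are hypothetical, and how the backend would expose these options to users is not settled in this thread.

```python
import os


def auto_encryption_kwargs(crypt_shared_lib_path, local_master_key):
    """Build keyword arguments for AutoEncryptionOpts that insist on crypt_shared."""
    return {
        "kms_providers": {"local": {"key": local_master_key}},
        "key_vault_namespace": "encryption.__keyVault",  # placeholder namespace
        "crypt_shared_lib_path": crypt_shared_lib_path,
        # Fail instead of falling back to mongocryptd when the shared
        # library cannot be loaded.
        "crypt_shared_lib_required": True,
    }


def make_auto_encryption_opts(**kwargs):
    """Create the options object; needs pymongo with encryption support installed."""
    # Imported lazily so the rest of the sketch runs without pymongo.
    from pymongo.encryption_options import AutoEncryptionOpts

    return AutoEncryptionOpts(**kwargs)


kwargs = auto_encryption_kwargs(
    "/path/to/mongo_crypt_v1.so",  # placeholder path to the downloaded library
    os.urandom(96),  # throwaway 96-byte local key, for illustration only
)
```

Calling `make_auto_encryption_opts(**kwargs)` then yields an object that can be passed to `MongoClient(auto_encryption_opts=...)`.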

Given that this is a new feature here, I'd strongly consider setting crypt_shared_lib_required=True and only using the crypt_shared library by default, and never using mongocryptd (or only if the user has explicitly requested it). I've asked our PMs about this in https://mongodb.slack.com/archives/C0406ECL478/p1754488650159029, it feels like the right call to me but I'd like to double-check.

As far as the tests here are concerned, I imagine they're passing locally for you because you do have mongocryptd in your path, so the driver is able to spawn it. Regardless of what you decide about mongocryptd, you'll need to install either it or the crypt_shared library in CI, and in the latter case, point the driver to the library's path (happy to provide help with the details of this should you have any questions).

@aclark4life
Collaborator Author

Given that this is a new feature here, I'd strongly consider setting crypt_shared_lib_required=True and only using the crypt_shared library by default, and never using mongocryptd (or only if the user has explicitly requested it). I've asked our PMs about this in https://mongodb.slack.com/archives/C0406ECL478/p1754488650159029, it feels like the right call to me but I'd like to double-check.

Can we document our way around this by recommending crypt_shared_lib_path in production but allowing mongocryptd in development? I think that is something we can get Django folks to accept but I don't think enforcing the use of crypt shared in development would go over well.

Also what about bundling that library in the pymongocrypt wheel? It's the convenience of pip install django-mongodb-backend[encryption] that we are after and we lose that with the enterprise download step.

@addaleax

addaleax commented Aug 6, 2025

Can we document our way around this by recommending crypt_shared_lib_path in production but allowing mongocryptd in development? I think that is something we can get Django folks to accept but I don't think enforcing the use of crypt shared in development would go over well.

Maybe, but I think we'd want to have a conversation around what the typical expectations here are. You'll generally want to have development and production environments behave similarly, and you'd still need to have a plan for what to do when mongocryptd does get deprecated eventually (long-term, I think it's fair to expect this to happen). You'll also still be in a position where you need to download and install mongocryptd, the only case in which this requirement goes away is the one where you happen to have the enterprise MongoDB server binaries already ready in your $PATH.

Also what about bundling that library in the pymongocrypt wheel? It's the convenience of pip install django-mongodb-backend[encryption] that we are after and we lose that with the enterprise download step.

This question comes up on a regular basis 🙂 Here's a Slack thread from April, which was one of the last times we spoke about this.

tl;dr: Yes, the setup process for CSFLE/QE is involved, and we'd like to make it easier. Currently, there is a requirement for the user to explicitly acknowledge that they have read and accepted the enterprise license agreement and that they are an Atlas or EA customer. Bundling this library with regular packages that can be installed via a regular package manager command like pip install is therefore something we can't do right now.

Contributor

@Jibola Jibola left a comment

Overall looks good! I've provided comments and feedback around some documentation and clarifications, but will approve once those are addressed.

- Fix rebase merge conflict edits
- Remove integer field FIXME comments
- Remove pos_int
@@ -11,3 +11,4 @@ know:
embedded-models
transactions
known-issues
queryable-encryption
Collaborator

order above "transactions"


``showencryptedfieldsmap``
--------------------------

Collaborator

.. versionadded:: 5.2.0b2


Encrypted fields
----------------

Collaborator

.. versionadded:: 5.2.0b2

================================
Configuring Queryable Encryption
================================

Collaborator

.. versionadded:: 5.2.0b2

Comment on lines +29 to +32
.. django-admin-option:: --create-new-keys

    If specified, creates the data keys instead of getting them from the
    database.
Collaborator

This option needs some explanation in the howto (use case).
Also, there is no explanation here that the command "gets the keys from the database"; this needs more explanation.

.. admonition:: List of encrypted fields

    See the full list of :ref:`encrypted fields <encrypted-fields>` in the
    :doc:`Model field reference </ref/models/fields>`.
Collaborator

don't need to repeat the name of the linked doc "Model field reference"


def setUp(self):
    self.appointment = Appointment.objects.create(time="8:00")

Collaborator

no blank line needed after every object

    self.assertEqual(json1, json2)

def test_show_encrypted_fields_map(self):
    self.maxDiff = None
Collaborator

You can set maxDiff = None at the class-level.
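A minimal illustration of that suggestion, using plain `unittest` and a made-up class name rather than the PR's actual test case:

```python
import unittest


class ShowEncryptedFieldsMapTests(unittest.TestCase):
    # Set once here; applies to every test method in the class, so
    # individual tests no longer need `self.maxDiff = None`.
    maxDiff = None

    def test_show_encrypted_fields_map(self):
        # Stand-in assertion; the real test compares command output.
        self.assertEqual({"path": "ssn"}, {"path": "ssn"})
```

Django's `TestCase` inherits from `unittest.TestCase`, so the same class attribute works there.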

Comment on lines +42 to +45
.. admonition:: List of encrypted fields

    See the full list of :ref:`encrypted fields <encrypted-fields>` in the
    :doc:`Model field reference </ref/models/fields>`.
Collaborator

It's worth including this reference, but not in the middle of the example.

Comment on lines +64 to +68
{
  _id: ObjectId('68825b066fac55353a8b2b41'),
  ssn: '123-45-6789',
  __safeContent__: [b'\xe0)NOFB\x9a,\x08\xd7\xdd\xb8\xa6\xba$…']
}
Collaborator

I'm not sure what this is telling me. So encrypted fields appear in plain text, but what the heck is "safeContent"? As a developer, do I care about it?

6 participants