Skip to content

INTPYTHON-527 Add Queryable Encryption support #329

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 10 commits into
base: main
Choose a base branch
from

Conversation

aclark4life
Copy link
Collaborator

@aclark4life aclark4life commented Jun 27, 2025

Previous attempts and additional context here:

@aclark4life

This comment was marked as resolved.

@timgraham

This comment was marked as resolved.

@aclark4life

This comment was marked as resolved.

@timgraham

This comment was marked as resolved.

@aclark4life

This comment was marked as resolved.

@aclark4life

This comment was marked as resolved.

@aclark4life

This comment was marked as resolved.

@aclark4life

This comment was marked as resolved.

@timgraham

This comment was marked as resolved.

@aclark4life

This comment was marked as resolved.

@aclark4life

This comment was marked as resolved.

@aclark4life

This comment was marked as resolved.

@timgraham

This comment was marked as resolved.

@aclark4life

This comment was marked as resolved.

@timgraham

This comment was marked as resolved.

Copilot

This comment was marked as resolved.

@mongodb mongodb deleted a comment from Copilot AI Aug 12, 2025
Comment on lines 21 to 47
def setUp(self):
self.appointment = Appointment.objects.create(time="8:00")
self.billing = Billing.objects.create(
cc_type="Visa", cc_number=1234567890123456, account_balance=100.50
)
self.portal_user = PatientPortalUser.objects.create(
ip_address="127.0.0.1",
url="https://example.com",
)
self.patientrecord = PatientRecord.objects.create(
ssn="123-45-6789",
birth_date="1970-01-01",
profile_picture=b"image data",
weight=175.5,
patient_age=47,
)
self.patient = Patient.objects.create(
patient_id=1,
patient_name="John Doe",
patient_notes="patient notes " * 25,
registration_date=datetime(2023, 10, 1, 12, 0, 0),
is_active=True,
email="john.doe@example.com",
)

# TODO: Embed billing and patient_record models in patient model
# then add tests
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the test data is only needed for the fields tests? It looks like most test data is only used in one test method, so it's a bit wasteful to create it before every test method.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's used in schema and fields

Comment on lines 117 to 87
"""
Fields that have not been tested elsewhere.
"""
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's difficult to navigate to navigate this test file and confirm that all fields are tested. Rather than a test method for each method, it would be more clear to have a test method for each field, e.g. test_charfield.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to be clear, don't move the misc untested fields to the patient themed tests right ? That's what I'm inclined to do but I believe you are suggesting the opposite. I like the patient themed tests but we could also re-do as field-first focused.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's fine to keep the models as is. I'm suggesting that each field have its own test method.

Comment on lines 75 to 112
def test_create_new_keys(self):
out = StringIO()
call_command(
"showencryptedfieldsmap",
"--database",
"encrypted",
"--create-new-keys",
verbosity=0,
stdout=out,
)
command_output = json_util.loads(out.getvalue())

# TODO: Until the todo below is fixed, you can test this feature
# manually by uncommenting the following line and pasting the output
# into your client-side `encrypted_fields_map` configuration. You also
# need to temporarily configure server-side encryption for this to
# procedure to work.
# print(command_output)
self._compare_output(
self.expected_patient_record,
command_output["encryption__patientrecord"],
)

# Create a new connection to verify that the keys can be used in a
# client-side configuration to migrate the encrypted fields.
conn_params = connections["encrypted"].get_connection_params()
auto_encryption_opts = AutoEncryptionOpts(
key_vault_namespace="encryption.__keyvault",
kms_providers={"local": {"key": os.urandom(96)}},
encrypted_fields_map=command_output,
)
if conn_params.pop("auto_encryption_opts", False):
# Call MongoClient instead of get_new_connection because
# get_new_connection will return the encrypted connection from the
# connection pool.
with pymongo.MongoClient(**conn_params, auto_encryption_opts=auto_encryption_opts):
call_command("migrate", "--database", "encrypted", verbosity=0)

# TODO: Check the key vault to ensure that the keys created by
# `showencryptedfieldsmap --create-new-keys` are in the key vault.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is more to do to fix this test than just add assertions. The migrate step does nothing because there are no pending migrations to apply the encrypted database. Testing this is complicated and I'm not sure this attempted path is viable.

@timgraham

This comment was marked as resolved.

Comment on lines 34 to 35
If specified, this option will create new encryption keys for use in
production workflows.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"for use in production workflows" need some explanation

Comment on lines 20 to 38
Queryable Encryption requires a KMS provider to encrypt and decrypt data. Django
MongoDB Backend supports configuring KMS credentials and providers for Queryable
Encryption via the ``KMS_CREDENTIALS`` setting in the ``DATABASES``
configuration and the ``kms_provider`` method on the ``DatabaseRouter``.

E.g. to configure AWS KMS credentials:

.. code-block:: python
KMS_CREDENTIALS = {
"aws": {
"key": os.getenv("AWS_KEY_ARN", ""),
"region": os.getenv("AWS_KEY_REGION", ""),
},
}
DATABASES = {
#
}
DATABASES["encrypted"]["KMS_CREDENTIALS"] = KMS_CREDENTIALS
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I imagined this as a section at the bottom of the howto "Configuring KMS Providers", instead of the "KMS providers and credentials" admonition box that links (not really helpfully) to MongoDB's docs. It will also discuss router.kms_provider.

Comment on lines 91 to 102
Encrypted fields map
~~~~~~~~~~~~~~~~~~~~

In addition to the :ref:`settings described in the how-to guide
<queryable-encryption-settings>` you will need to provide a
``encrypted_fields_map`` to the ``AutoEncryptionOpts``.

You can use the :djadmin:`showencryptedfieldsmap` management command to generate
the schema map for your encrypted fields and then use the results in your
settings::

python manage.py showencryptedfieldsmap --database=encrypted
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We're missing the step on defining your models (consolidate from topic guide?), then makemigrations and migrate. Otherwise showencryptedfieldsmap may crash without any keyIds to retrieve?

Comment on lines 94 to 96
In addition to the :ref:`settings described in the how-to guide
<queryable-encryption-settings>` you will need to provide a
``encrypted_fields_map`` to the ``AutoEncryptionOpts``.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But this step is optional for development, right? The howto doesn't explain how --create-new-keys fits in.

"OPTIONS": {
"auto_encryption_opts": AutoEncryptionOpts(
...
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@aclark4life fyi: use three dots rather than the ellipses character. The entire code block won't be highlighted if the syntax is invalid, and the ellipsis character isn't valid.

@@ -6,6 +6,7 @@ set -eux
/opt/python/3.10/bin/python3 -m venv venv
. venv/bin/activate
python -m pip install -U pip
pip install ".[encryption]"
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any objection to combining this line and the next line into pip install -e .[encryption]?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be best to have this line run only on builds that test encryption, so that we don't inadvertently add some top-level import for an optional dependency (similar reason to why we have a separate build for GIS that installs gdal-bin.) Perhaps the encryption build could set an environment variable like RUNNING_ENCRYPTION_TESTS that this script could consult.

from .fields import EmbeddedModelField
from .gis.schema import GISSchemaEditor
from .indexes import SearchIndex
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@timgraham Not sure if you noticed this yet but in anticipation should I revert and leave for a sep PR ?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That would be cleanest. Perhaps there are other offenders, though sometimes importing the full path might be preferable to a large number of dots. Some is some debate on this: django/django#19707, django/django#18813

aclark4life and others added 9 commits August 13, 2025 12:08
QE Doc flow

- Three top level (or close to it) docs
 - How to guide on QE configuration
 - QE field guide (moved out of model field ref to its own section)
 - Topic guide on QE usage
- Planning to finish QE setup in the how-to doc then switch back to the
  topic doc to explain the workflow.
In the QE topics guide I'm going to explain how to develop a patient model
similar to the Python QE tutorial and the encryption test suite.

If embedding were possible now, I could present the same example. Since
embedded models are a WIP I may either:

- design the guide to work around that issue
- design the guide to anticipate resolution of that issue

In ether case, I'll try to wrap this 24-48 hour docs jam having covered
everything I know about QE and Django and having presented it to Django
folks in a way we can support.
- Configuring the encrypted_fields_map section
- Configuring the Automatic Encryption Shared Library
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants