| 1 | +# Using customer-managed encryption keys |
| 2 | + |
| 5 | +This sample demonstrates how to use |
| 6 | +[cryptographic encryption keys](https://cloud.google.com/kms/) |
| 7 | +for the I/O connectors in an |
| 8 | +[Apache Beam](https://beam.apache.org) pipeline. |
| 9 | +For more information, see the |
| 10 | +[Using customer-managed encryption keys](https://cloud.google.com/dataflow/docs/guides/customer-managed-encryption-keys) |
| 11 | +docs page. |
| 12 | + |
| 13 | +## Before you begin |
| 14 | + |
| 15 | +Follow the |
| 16 | +[Getting started with Google Cloud Dataflow](../README.md) |
| 17 | +page, and make sure you have a Google Cloud project with billing enabled |
| 18 | +and a *service account JSON key* set up in your `GOOGLE_APPLICATION_CREDENTIALS` environment variable. |
| 19 | +Additionally, for this sample you need the following: |
| 20 | + |
| 21 | +1. [Enable the APIs](https://console.cloud.google.com/flows/enableapi?apiid=bigquery,cloudkms.googleapis.com): |
| 22 | + the BigQuery and Cloud KMS APIs. |
| 23 | + |
| 24 | +1. Create a Cloud Storage bucket. |
| 25 | + |
| 26 | + ```sh |
| 27 | + export BUCKET=your-gcs-bucket |
| 28 | + gsutil mb gs://$BUCKET |
| 29 | + ``` |
| 30 | + |
| 31 | +1. [Create a key ring and a symmetric key](https://cloud.google.com/kms/docs/creating-keys). |
| 32 | + For best results, use a [regional location](https://cloud.google.com/kms/docs/locations). |
| 33 | + This example uses a `global` key for simplicity. |
| 34 | + |
| 35 | + ```sh |
| 36 | + export KMS_KEYRING=samples-keyring |
| 37 | + export KMS_KEY=samples-key |
| 38 | + |
| 39 | + # Create a key ring. |
| 40 | + gcloud kms keyrings create $KMS_KEYRING --location global |
| 41 | + |
| 42 | + # Create a key. |
| 43 | + gcloud kms keys create $KMS_KEY --location global \ |
| 44 | + --keyring $KMS_KEYRING --purpose encryption |
| 45 | + ``` |
| 46 | + |
| 47 | + > *Note:* Although you can destroy the |
| 48 | + > [*key version material*](https://cloud.google.com/kms/docs/destroy-restore), |
| 49 | + > you [cannot delete keys and key rings](https://cloud.google.com/kms/docs/object-hierarchy#lifetime). |
| 50 | + > Key rings and keys do not have billable costs or quota limitations, |
| 51 | + > so their continued existence does not impact costs or production limits. |
| 52 | + |
| 53 | +1. Grant Encrypter/Decrypter permissions to the *Dataflow*, *Compute Engine*, and *BigQuery* service accounts. |
| 54 | + |
| 55 | + ```sh |
| 56 | + export PROJECT=$(gcloud config get-value project) |
| 57 | + export PROJECT_NUMBER=$(gcloud projects describe $PROJECT --format "value(projectNumber)") |
| 58 | + |
| 59 | + # Grant Encrypter/Decrypter permissions to the Dataflow service account. |
| 60 | + gcloud projects add-iam-policy-binding $PROJECT \ |
| 61 | + --member serviceAccount:service-$PROJECT_NUMBER@dataflow-service-producer-prod.iam.gserviceaccount.com \ |
| 62 | + --role roles/cloudkms.cryptoKeyEncrypterDecrypter |
| 63 | + |
| 64 | + # Grant Encrypter/Decrypter permissions to the Compute Engine service account. |
| 65 | + gcloud projects add-iam-policy-binding $PROJECT \ |
| 66 | + --member serviceAccount:service-$PROJECT_NUMBER@compute-system.iam.gserviceaccount.com \ |
| 67 | + --role roles/cloudkms.cryptoKeyEncrypterDecrypter |
| 68 | + |
| 69 | + # Grant Encrypter/Decrypter permissions to the BigQuery service account. |
| 70 | + gcloud projects add-iam-policy-binding $PROJECT \ |
| 71 | + --member serviceAccount:bq-$PROJECT_NUMBER@bigquery-encryption.iam.gserviceaccount.com \ |
| 72 | + --role roles/cloudkms.cryptoKeyEncrypterDecrypter |
| 73 | + ``` |
| 74 | + |
| 75 | +1. Clone the `java-docs-samples` repository. |
| 76 | + |
| 77 | + ```sh |
| 78 | + git clone https://github.com/GoogleCloudPlatform/java-docs-samples.git |
| 79 | + ``` |
| 80 | + |
| 81 | +1. Navigate to the sample code directory. |
| 82 | + |
| 83 | + ```sh |
| 84 | + cd java-docs-samples/dataflow/encryption-keys |
| 85 | + ``` |
| 86 | + |
| 87 | +## BigQueryKmsKey example |
| 88 | + |
| 89 | +* [BigQueryKmsKey.java](src/main/java/com/example/dataflow/cmek/BigQueryKmsKey.java) |
| 90 | +* [pom.xml](pom.xml) |
| 91 | + |
| 92 | +The following sample reads data from the |
| 93 | +[NASA wildfires public BigQuery dataset](https://console.cloud.google.com/bigquery?p=bigquery-public-data&d=nasa_wildfire&t=past_week&page=table) |
| 94 | +using a customer-managed encryption key, and writes that data into the specified `outputBigQueryTable` |
| 95 | +using the same customer-managed encryption key. |
| 96 | + |
| 97 | +Make sure you have the following variables set up: |
| 98 | + |
| 99 | +```sh |
| 100 | +# Set the project ID and GCS bucket. |
| 101 | +export PROJECT=$(gcloud config get-value project) |
| 102 | +export BUCKET=your-gcs-bucket |
| 103 | + |
| 104 | +# Set the KMS key ID. |
| 105 | +export KMS_KEYRING=samples-keyring |
| 106 | +export KMS_KEY=samples-key |
| 107 | +export KMS_KEY_ID=$(gcloud kms keys list --location global --keyring $KMS_KEYRING --filter $KMS_KEY --format "value(NAME)") |
| 108 | + |
| 109 | +# Output BigQuery dataset and table name. |
| 110 | +export DATASET=samples |
| 111 | +export TABLE=dataflow_kms |
| 112 | +``` |
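
The `KMS_KEY_ID` value passed to `--kmsKey` is the key's full resource name. As a rough sketch, assuming the key lives in the `global` location as in this sample, the same string can be assembled without calling `gcloud`, which may help when checking what the variable should contain. The helper name `kms_key_id` and the `my-project` value are hypothetical and not part of the sample.

```sh
# Hypothetical helper (not part of the sample): builds a Cloud KMS key
# resource name from project, location, key ring, and key name.
kms_key_id() {
  printf 'projects/%s/locations/%s/keyRings/%s/cryptoKeys/%s\n' \
    "$1" "$2" "$3" "$4"
}

# Example call with placeholder values; substitute your own project ID.
kms_key_id my-project global samples-keyring samples-key
```

The output should have the same shape as the value that `gcloud kms keys list` returns for the key.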
| 113 | + |
| 114 | +Create the BigQuery dataset where the output table resides. |
| 115 | + |
| 116 | +```sh |
| 117 | +# Create the BigQuery dataset. |
| 118 | +bq mk --dataset $PROJECT:$DATASET |
| 119 | +``` |
| 120 | + |
| 121 | +Run the sample using the Cloud Dataflow runner. |
| 122 | + |
| 123 | +```sh |
| 124 | +mvn compile exec:java \ |
| 125 | + -Dexec.mainClass=com.example.dataflow.cmek.BigQueryKmsKey \ |
| 126 | + -Dexec.args="\ |
| 127 | + --outputBigQueryTable=$PROJECT:$DATASET.$TABLE \ |
| 128 | + --kmsKey=$KMS_KEY_ID \ |
| 129 | + --project=$PROJECT \ |
| 130 | + --tempLocation=gs://$BUCKET/samples/dataflow/kms/tmp \ |
| 131 | + --runner=DataflowRunner" |
| 132 | +``` |
| 133 | + |
| 134 | +> *Note:* To run locally, omit the `--runner` command line argument; it defaults to the `DirectRunner`. |
| 135 | + |
| 136 | +You can check your submitted Cloud Dataflow jobs in the [GCP Console Dataflow page](https://console.cloud.google.com/dataflow) or by using `gcloud`. |
| 137 | + |
| 138 | +```sh |
| 139 | +gcloud dataflow jobs list |
| 140 | +``` |
| 141 | + |
| 142 | +Finally, check the contents of the BigQuery table. |
| 143 | + |
| 144 | +```sh |
| 145 | +bq query --use_legacy_sql=false "SELECT * FROM \`$PROJECT.$DATASET.$TABLE\`" |
| 146 | +``` |
| 147 | + |
| 148 | +## Cleanup |
| 149 | + |
| 150 | +To avoid incurring charges to your GCP account for the resources used: |
| 151 | + |
| 152 | +```sh |
| 153 | +# Remove only the files created by this sample. |
| 154 | +gsutil -m rm -rf "gs://$BUCKET/samples/dataflow/kms" |
| 155 | + |
| 156 | +# [optional] Remove the Cloud Storage bucket. |
| 157 | +gsutil rb gs://$BUCKET |
| 158 | + |
| 159 | +# Remove the BigQuery table. |
| 160 | +bq rm -f -t $PROJECT:$DATASET.$TABLE |
| 161 | + |
| 162 | +# [optional] Remove the BigQuery dataset and all its tables. |
| 163 | +bq rm -rf -d $PROJECT:$DATASET |
| 164 | + |
| 165 | +# Revoke Encrypter/Decrypter permissions from the Dataflow service account. |
| 166 | +gcloud projects remove-iam-policy-binding $PROJECT \ |
| 167 | + --member serviceAccount:service-$PROJECT_NUMBER@dataflow-service-producer-prod.iam.gserviceaccount.com \ |
| 168 | + --role roles/cloudkms.cryptoKeyEncrypterDecrypter |
| 169 | + |
| 170 | +# Revoke Encrypter/Decrypter permissions from the Compute Engine service account. |
| 171 | +gcloud projects remove-iam-policy-binding $PROJECT \ |
| 172 | + --member serviceAccount:service-$PROJECT_NUMBER@compute-system.iam.gserviceaccount.com \ |
| 173 | + --role roles/cloudkms.cryptoKeyEncrypterDecrypter |
| 174 | + |
| 175 | +# Revoke Encrypter/Decrypter permissions from the BigQuery service account. |
| 176 | +gcloud projects remove-iam-policy-binding $PROJECT \ |
| 177 | + --member serviceAccount:bq-$PROJECT_NUMBER@bigquery-encryption.iam.gserviceaccount.com \ |
| 178 | + --role roles/cloudkms.cryptoKeyEncrypterDecrypter |
| 179 | +``` |
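
The three revoke commands above differ only in the `--member` value, so they can also be written as a loop. The following is a minimal sketch, not part of the sample: `kms_members`, `revoke_kms_role`, and the `DRY_RUN` variable are hypothetical names introduced here, and the project ID and number in the example call are placeholders.

```sh
# Hypothetical helper (not part of the sample): prints the three service
# account members used in this sample for the given project number.
kms_members() {
  printf 'serviceAccount:service-%s@dataflow-service-producer-prod.iam.gserviceaccount.com\n' "$1"
  printf 'serviceAccount:service-%s@compute-system.iam.gserviceaccount.com\n' "$1"
  printf 'serviceAccount:bq-%s@bigquery-encryption.iam.gserviceaccount.com\n' "$1"
}

# Revokes the Encrypter/Decrypter role from each member.
# Arguments: project ID, project number.
# When DRY_RUN is set to a non-empty value, the commands are printed
# instead of executed.
revoke_kms_role() {
  for member in $(kms_members "$2"); do
    ${DRY_RUN:+echo} gcloud projects remove-iam-policy-binding "$1" \
      --member "$member" \
      --role roles/cloudkms.cryptoKeyEncrypterDecrypter
  done
}

# Dry run with placeholder values: prints the three commands only.
DRY_RUN=1 revoke_kms_role my-project 123456789
```

To revoke for real, call `revoke_kms_role $PROJECT $PROJECT_NUMBER` with `DRY_RUN` unset.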