Commit f081bf5

1 parent 6ad0db1 commit f081bf5

11 files changed, +1119 -0 lines changed

dlp/README.md

Lines changed: 115 additions & 0 deletions
@@ -0,0 +1,115 @@
# Cloud Data Loss Prevention (DLP) API Samples
The [Data Loss Prevention API](https://cloud.google.com/dlp/docs/) provides programmatic access to
a powerful detection engine for personally identifiable information and other privacy-sensitive data
in unstructured data streams.

## Setup
- A Google Cloud project with billing enabled.
- [Enable](https://console.cloud.google.com/launcher/details/google/dlp.googleapis.com) the DLP API.
- (Local testing) [Create a service account](https://cloud.google.com/docs/authentication/getting-started)
  and set the `GOOGLE_APPLICATION_CREDENTIALS` environment variable to point to the downloaded credentials file.

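
Before running the samples locally, it can help to confirm that `GOOGLE_APPLICATION_CREDENTIALS` is set and names a readable file. The following is a minimal sketch using only the Java standard library. The class `CredentialsCheck` and its `describe` method are hypothetical helpers for illustration and are not part of these samples. The check does not validate the file contents or the account's permissions.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

/** Hypothetical helper (not part of the samples): sanity-checks the credentials variable. */
final class CredentialsCheck {

  /** Returns "ok" when the value names a readable regular file, otherwise a short description. */
  static String describe(String value) {
    if (value == null || value.trim().isEmpty()) {
      return "GOOGLE_APPLICATION_CREDENTIALS is not set";
    }
    Path path = Paths.get(value);
    if (!Files.isRegularFile(path) || !Files.isReadable(path)) {
      return "GOOGLE_APPLICATION_CREDENTIALS does not point to a readable file: " + value;
    }
    return "ok";
  }

  public static void main(String[] args) {
    // Reads the variable from the current environment and reports its status.
    System.out.println(describe(System.getenv("GOOGLE_APPLICATION_CREDENTIALS")));
  }
}
```
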
## Build
This project uses the [Assembly Plugin](https://maven.apache.org/plugins/maven-assembly-plugin/usage.html) to build an uber jar.
Run:
```
mvn clean package
```

## Retrieve InfoTypes
An [InfoType identifier](https://cloud.google.com/dlp/docs/infotypes-categories) represents an element of sensitive data.

[Info types](https://cloud.google.com/dlp/docs/infotypes-reference#global) are updated periodically. Use the API to retrieve the most current
info types for a given category, e.g. HEALTH or GOVERNMENT.
```
java -cp target/dlp-samples-1.0-jar-with-dependencies.jar com.example.dlp.Metadata -category GOVERNMENT
```

## Retrieve Categories
[Categories](https://cloud.google.com/dlp/docs/infotypes-categories) provide a way to easily access a group of related InfoTypes.
```
java -cp target/dlp-samples-1.0-jar-with-dependencies.jar com.example.dlp.Metadata
```

## Inspect data for sensitive elements
Inspect strings, local files, files on Google Cloud Storage, and Cloud Datastore kinds with the DLP API.

Note: image scanning is not currently supported on Google Cloud Storage.
For more information, refer to the [API documentation](https://cloud.google.com/dlp/docs).
Optional flags are explained in [this resource](https://cloud.google.com/dlp/docs/reference/rest/v2beta1/content/inspect#InspectConfig).
```
Commands:
  -s <string>                                                     Inspect a string using the Data Loss Prevention API.
  -f <filepath>                                                   Inspect a local text, PNG, or JPEG file using the Data Loss Prevention API.
  -gcs -bucketName <bucketName> -fileName <fileName>              Inspect a text file stored on Google Cloud Storage using the Data Loss Prevention API.
  -ds -projectId [projectId] -namespace [namespace] -kind <kind>  Inspect a Datastore instance using the Data Loss Prevention API.

Options:
  --help                        Show help
  -minLikelihood [string]       [choices: "LIKELIHOOD_UNSPECIFIED", "VERY_UNLIKELY", "UNLIKELY", "POSSIBLE", "LIKELY", "VERY_LIKELY"]
                                [default: "LIKELIHOOD_UNSPECIFIED"]
                                specifies the minimum reporting likelihood threshold.
  -f, --maxFindings [number]    [default: 0]
                                maximum number of results to retrieve
  -q, --includeQuote [boolean]  [default: true] include matching string in results
  -t, --infoTypes               restrict to limited set of infoTypes [default: []]
                                [e.g. PHONE_NUMBER US_PASSPORT]
```
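
To make the `-minLikelihood` threshold concrete, here is a small local sketch of how an ordered threshold can filter findings. It is an illustration only: the class `LikelihoodFilter` and its `atLeast` method are hypothetical, the enum simply mirrors the choices listed above, and the sketch assumes that findings at or above the threshold are reported and that `LIKELIHOOD_UNSPECIFIED` filters nothing. The service may apply its own default, so check the API documentation for the actual behavior.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical illustration of a minimum-likelihood threshold; not the DLP client API. */
final class LikelihoodFilter {

  /** The choices listed for -minLikelihood, declared from least to most likely. */
  enum Likelihood {
    LIKELIHOOD_UNSPECIFIED, VERY_UNLIKELY, UNLIKELY, POSSIBLE, LIKELY, VERY_LIKELY
  }

  /**
   * Keeps the likelihoods at or above the threshold.
   * Assumption: an unspecified threshold keeps everything, because it sorts lowest.
   */
  static List<Likelihood> atLeast(List<Likelihood> findings, Likelihood min) {
    List<Likelihood> kept = new ArrayList<>();
    for (Likelihood finding : findings) {
      // Enum.compareTo orders constants by declaration position.
      if (finding.compareTo(min) >= 0) {
        kept.add(finding);
      }
    }
    return kept;
  }

  public static void main(String[] args) {
    List<Likelihood> findings =
        Arrays.asList(Likelihood.UNLIKELY, Likelihood.POSSIBLE, Likelihood.VERY_LIKELY);
    System.out.println(atLeast(findings, Likelihood.LIKELY)); // prints [VERY_LIKELY]
  }
}
```
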
### Examples
- Inspect a string:
```
java -cp target/dlp-samples-1.0-jar-with-dependencies.jar com.example.dlp.Inspect -s "My phone number is (123) 456-7890 and my email address is me@somedomain.com"
```
- Inspect a local file (text / image):
```
java -cp target/dlp-samples-1.0-jar-with-dependencies.jar com.example.dlp.Inspect -f resources/test.txt
java -cp target/dlp-samples-1.0-jar-with-dependencies.jar com.example.dlp.Inspect -f resources/test.png
```
- Inspect a file on Google Cloud Storage:
```
java -cp target/dlp-samples-1.0-jar-with-dependencies.jar com.example.dlp.Inspect -gcs -bucketName my-bucket -fileName my-file.txt
```
- Inspect a Google Cloud Datastore kind:
```
java -cp target/dlp-samples-1.0-jar-with-dependencies.jar com.example.dlp.Inspect -ds -kind my-kind
```

## Automatic redaction of sensitive data
[Automatic redaction](https://cloud.google.com/dlp/docs/classification-redaction) produces output in which sensitive data matches are replaced with a string you supply.

```
Commands:
  -s <string>              Source input string
  -r <replacement string>  String to replace detected info types
Options:
  --help          Show help
  -minLikelihood  [choices: "LIKELIHOOD_UNSPECIFIED", "VERY_UNLIKELY", "UNLIKELY", "POSSIBLE", "LIKELY", "VERY_LIKELY"]
                  [default: "LIKELIHOOD_UNSPECIFIED"]
                  specifies the minimum reporting likelihood threshold.

  -infoTypes      restrict operation to limited set of info types [default: []]
                  [e.g. PHONE_NUMBER US_PASSPORT]
```

### Example
- Replace sensitive data in text with `_REDACTED_`:
```
java -cp target/dlp-samples-1.0-jar-with-dependencies.jar com.example.dlp.Redact -s "My phone number is (123) 456-7890 and my email address is me@somedomain.com" -r "_REDACTED_"
```

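
To show the general shape of a replacement-style redaction, the sketch below performs the substitution locally with two regular expressions. This is a stand-in, not the DLP API: the class `RedactSketch`, its `redact` method, and both patterns are assumptions made for illustration, and the service relies on its own detectors, so its actual output may differ.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical local stand-in for replacement-style redaction; it does not call the DLP API. */
final class RedactSketch {

  // Illustrative patterns only: a "(123) 456-7890" style phone number and a simple email address.
  private static final Pattern PHONE = Pattern.compile("\\(\\d{3}\\) \\d{3}-\\d{4}");
  private static final Pattern EMAIL = Pattern.compile("[\\w.+-]+@[\\w-]+(?:\\.[\\w-]+)+");

  /** Replaces every phone and email match in the input with the given replacement string. */
  static String redact(String input, String replacement) {
    // quoteReplacement keeps characters such as '$' in the replacement literal.
    String literal = Matcher.quoteReplacement(replacement);
    String withoutPhones = PHONE.matcher(input).replaceAll(literal);
    return EMAIL.matcher(withoutPhones).replaceAll(literal);
  }

  public static void main(String[] args) {
    String text =
        "My phone number is (123) 456-7890 and my email address is me@somedomain.com";
    // prints: My phone number is _REDACTED_ and my email address is _REDACTED_
    System.out.println(redact(text, "_REDACTED_"));
  }
}
```
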
## Integration tests
### Setup
- [Create a Google Cloud Storage bucket](https://console.cloud.google.com/storage) and upload [test.txt](src/test/resources/test.txt).
- [Create a Google Cloud Datastore](https://console.cloud.google.com/datastore) kind and add an entity with properties:
  - `property1` : john@doe.com
  - `property2` : 343-343-3435
- Update the Google Cloud Storage path and Datastore kind in [InspectIT.java](src/test/java/com/example/dlp/InspectIT.java).
- Ensure that `GOOGLE_APPLICATION_CREDENTIALS` points to an authorized service account credentials file.

## Run
Run all tests:
```
mvn clean verify
```

dlp/pom.xml

Lines changed: 101 additions & 0 deletions
@@ -0,0 +1,101 @@
<?xml version="1.0" encoding="UTF-8"?>
<!--
  Copyright 2017 Google Inc.

  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License.
-->
<!-- [START pom] -->
<project>
  <modelVersion>4.0.0</modelVersion>
  <packaging>jar</packaging>
  <groupId>com.example</groupId>
  <artifactId>dlp-samples</artifactId>
  <version>1.0</version>

  <!-- Parent defines config for testing & linting. -->
  <parent>
    <artifactId>doc-samples</artifactId>
    <groupId>com.google.cloud</groupId>
    <version>1.0.0</version>
    <relativePath>..</relativePath>
  </parent>

  <properties>
    <maven.compiler.source>1.8</maven.compiler.source>
    <maven.compiler.target>1.8</maven.compiler.target>
    <google.auth.version>0.7.0</google.auth.version>
    <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
  </properties>

  <!-- Temporary workaround for known issue : https://github.com/GoogleCloudPlatform/google-cloud-java/issues/2192 -->
  <dependencyManagement>
    <dependencies>
      <dependency>
        <groupId>com.google.auth</groupId>
        <artifactId>google-auth-library-credentials</artifactId>
        <version>${google.auth.version}</version>
      </dependency>
      <dependency>
        <groupId>com.google.auth</groupId>
        <artifactId>google-auth-library-oauth2-http</artifactId>
        <version>${google.auth.version}</version>
      </dependency>
    </dependencies>
  </dependencyManagement>
  <!-- End of workaround -->

  <dependencies>
    <!-- [START dlp_maven] -->
    <dependency>
      <groupId>com.google.cloud</groupId>
      <artifactId>google-cloud-dlp</artifactId>
      <version>0.20.2-alpha</version>
    </dependency>
    <!-- [END dlp_maven] -->
    <dependency>
      <groupId>commons-cli</groupId>
      <artifactId>commons-cli</artifactId>
      <version>1.4</version>
    </dependency>
    <!-- Test dependencies -->
    <dependency>
      <groupId>junit</groupId>
      <artifactId>junit</artifactId>
      <version>4.12</version>
    </dependency>
  </dependencies>
  <!-- Build jar with dependencies for testing -->
  <build>
    <plugins>
      <plugin>
        <artifactId>maven-assembly-plugin</artifactId>
        <version>3.0.0</version>
        <configuration>
          <descriptorRefs>
            <descriptorRef>jar-with-dependencies</descriptorRef>
          </descriptorRefs>
        </configuration>
        <executions>
          <execution>
            <id>make-assembly</id> <!-- this is used for inheritance merges -->
            <phase>package</phase> <!-- bind to the packaging phase -->
            <goals>
              <goal>single</goal>
            </goals>
          </execution>
        </executions>
      </plugin>
    </plugins>
  </build>
</project>
<!-- [END pom] -->
