-
Notifications
You must be signed in to change notification settings - Fork 2.9k
DLP => v2 #1056
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
(WIP) Pending tasks: -> Update / Add Tests -> Region tag / comment review -> Submit for code review + fixes -> Merge once google-cloud-java PR : googleapis/google-cloud-java#2958 is released
/** | ||
* [START dlp_deidentify_mask] | ||
* | ||
* <p>Deidentify a string by masking sensitive information with a character using the DLP API. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: is the <p>
intentional?
If not, please apply any required changes throughout this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
.setValue(string) | ||
ByteContentItem byteContentItem = | ||
ByteContentItem.newBuilder() | ||
.setType(ByteContentItem.BytesType.TEXT_UTF8) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: is this necessary?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
// (Optional) The encrypted ('wrapped') AES-256 key to use when shifting dates | ||
// This key should be encrypted using the Cloud KMS key specified above | ||
// If this is specified, then 'keyName' and 'contextFieldId' must also be set | ||
// const wrappedKey = 'YOUR_ENCRYPTED_AES_256_KEY' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/const
/String
/ ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
// If this is specified, then 'keyName' and 'contextFieldId' must also be set | ||
// const wrappedKey = 'YOUR_ENCRYPTED_AES_256_KEY' | ||
|
||
// If contextFieldId , keyName or wrappedKey is set : all three arguments must be valid |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit - remove spaces before ,
and :
characters.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
|
||
while ((line = br.readLine()) != null) { | ||
// convert csv rows to Table.Row | ||
rows.add(convertCsvRowToTableRow(line)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should include this in our snippet - IMO it's not intuitive enough to leave the exercise to the reader.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Never mind - it is included in the snippet. :)
assertTrue(Pattern.compile( | ||
"Most common value occurs \\d time\\(s\\)").matcher(output).find()); | ||
|
||
assertThat(output, containsString("Most common value occurs")); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: should we check for the full string here? (e.g. to ensure that a digit is printed)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
private ByteArrayOutputStream bout; | ||
private PrintStream out; | ||
|
||
private String topicId = "dlp-tests"; | ||
private String subscriptionId = "dlp-test"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Question: should these be based on environment variables?
(I don't know how the Java samples are tested - so perhaps I'm wrong about this. If you do change this, apply those changes throughout this PR.)
public void testCreateInspectTemplate() throws Exception { | ||
Templates.main(new String[] { | ||
"-c", | ||
"-displayName", String.format("test-name-%s", new Date()), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we use some sort of UUID, instead of the current time (which may not be globally unique)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
} | ||
|
||
@Test | ||
public void testCreateInspectTemplate() throws Exception { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Question/thought: would it be better to combine the {Create, List, Delete}Template tests into one? That way the test for DeleteTemplate
(which depends on templates having been created) will only run if CreateTemplate
passes
assertFalse(output.contains(text)); | ||
assertTrue(Pattern.compile("My SSN is \\w+").matcher(output).find()); | ||
assertThat( | ||
output, containsString("Successfully saved date-shift output to: results.temp.csv")); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we also check the contents of results.temp.csv
against a file containing the desired results?
@jabubake you were right - that was a lot of Java code 😄 Nice work 👍 (Mostly nits, but there are some bigger things we may want to consider changing) |
* Return results of all tests. * Use for loop instead of while.
LGTM - Les, would you mind taking a look (in case Jisha is busy)? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PTAL - So, in general, I'd prefer smaller snippets, but it's probably ok here.
For testing, at least in Java, it's often better to snippet inside the method, and let the method return a result you check, rather that always looking at the printf results.
Ok to merge after you complete my comments, but please mention me so that I look at the last few commits.
|
||
/** | ||
* [START dlp_inspect_gcs] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't this be before /**
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
} | ||
} else { | ||
System.out.println("No findings."); | ||
} | ||
} catch (Exception e) { | ||
e.printStackTrace(); | ||
System.out.println("Error in inspectGCSFileAsync: " + e.getMessage()); | ||
} | ||
// [END dlp_inspect_gcs] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't this be one line down ?? after final }
since we aren't hiding a return to the test app.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
|
||
/** | ||
* [START dlp_inspect_bigquery] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
b4 /**
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
} | ||
return tableRowBuilder.build(); | ||
} | ||
// [END dlp_deidentify_date_shift] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wow - one long snippet. Should it be shorter?
.build(); | ||
subscriber.startAsync(); | ||
|
||
// wait for job completion |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do you show them this without commenting on what you might do in a production system?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Replaced with the following:
// Wait for job completion semi-synchronously
// For long jobs, consider using a truly asynchronous execution model such as Cloud Functions
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
|
||
/** | ||
* [START dlp_l_diversity] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should be outside the /**
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
// wait for job completion | ||
try{ | ||
done.get(1, TimeUnit.MINUTES); | ||
Thread.sleep(500); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ditto
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
@@ -384,10 +557,9 @@ private static void calculateLDiversity( | |||
// [END dlp_l_diversity] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
move outside final }
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
public class Templates { | ||
|
||
/** | ||
* [START dlp_create_inspect_template] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
move outside /**
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
Ann,01/01/1970,4532908762519852,07/21/1996 | ||
James,03/06/1988,4301261899725540,04/09/2001 | ||
Dan,08/14/1945,4620761856015295,11/15/2011 | ||
Laura,11/03/1992,4564981067258901,01/04/2017 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should there be a newline here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
@simonz130 expressed similar feedback about .NET - perhaps (when we're not concerned with launches) we can chat about whether/how we want to redesign our tests. |
// For long jobs, consider using a truly asynchronous execution model such as Cloud Functions | ||
try{ | ||
done.get(1, TimeUnit.MINUTES); | ||
Thread.sleep(500); // Wait for the job to become available |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
avoid Thread.sleep in the sample. the done blocks anyway.
// For long jobs, consider using a truly asynchronous execution model such as Cloud Functions | ||
try{ | ||
done.get(1, TimeUnit.MINUTES); | ||
Thread.sleep(500); // Wait for the job to become available |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not comfortable with this, since this sleep amount seems aribtrary
@lesv FYI Jisha is wrapping this up - feel free to take a look in the future. |
Well, other than there being a merge conflict that I resolved... (let's see that tests work.) |
@dpebot merge when green |
Okay! I'll merge when all statuses are green and all reviewers approve. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTMing to unblock @dpebot
* DLP => v2 (WIP) Pending tasks: -> Update / Add Tests -> Region tag / comment review -> Submit for code review + fixes -> Merge once google-cloud-java PR : googleapis/google-cloud-java#2958 is released * Update to most recent versioning. * Updated DeIdentification samples and tests. * Revert pubsub to public version. * Fix Inspect samples/tests (minus pubsub). * Updated Jobs and add tests. * Updated Metadata classes. * Updated QuickStart tests and samples. * Updated Redact samples and tests. * Updated RiskAnalysis. * Update Template samples. * Update trigger tests. * Make Checkstyle Happy Again. * Fix (and ignore) tests using pubsub. * Update PR tests to complete all tests before returning results. (#1065) * Return results of all tests. * Use for loop instead of while. * WIP: Address PR feedback, part 1 * Update deps * Address PR feedback * Remove mvn clean verify failure * Add ReID FPE sample * Address PR feedback * Add k-map sample * checkstyle fixes
(WIP)
Pending tasks: