-
Notifications
You must be signed in to change notification settings - Fork 6.5k
[DO NOT MERGE] Upgrade pubsub samples. #543
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Not done, don't merge. Also I want to fix the tests but can't figure out how to run them. |
@jmdobry can you summarize the changes? |
Will do |
Top-level comments:
I did like the separation of iam-related samples in its own file. I organized these to match doc pages.
I'm pretty -1 on this. I've gotten feedback that these confuse users. Principle of least surprise.
I'm -1 on this, the justification "Avoid massive duplication of boilerplate import/instantiate code that gets out of date" isn't relevant when it's literally one statement with zero parameters. |
from google.cloud import pubsub | ||
|
||
|
||
pubsub_client = pubsub.Client() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is no bueno:
- It's a global variable (not a constant)
- There should not be side-effects for top-level statements.
- This is not easily testable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
# References an existing topic, e.g. "my-topic" | ||
topic = pubsub_client.topic(topic_name) | ||
|
||
project_id = os.environ['GCLOUD_PROJECT'] or 'YOUR_PROJECT_ID' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please, lets not do this. pubsub_client.project
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah cool, yeah that works better.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TIL google-cloud-node supports this too, nice!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
|
||
|
||
# [START pubsub_get_subscription_metadata] | ||
def get_subscription_metadata(topic_name, subscription_name): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
get_subscription
is just a good as get_subscription_metadata
and is more succinct.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
# Retrieves the IAM policy for the subscription | ||
policy = subscription.get_iam_policy() | ||
|
||
print('Policy for subscription: {}'.format(policy.to_api_repr())) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
to_api_repr
shouldn't be used.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
e4305ae
to
96ec3bc
Compare
The docs are being completely rewritten and won't have the same organization, so here I just went for consistency with other languages and the somewhat common "organization by resource type"
Yeah, still pending on whether we can reliably track samples using regular expressions.
Fixed |
That seems.. weird. These samples are getting big - the cognitive load of completely understanding one is now getting high. I'm not a fan of it. |
My feedback on the proposed samples
Method Definitions
I propose that code without a wrapping method definition is more clear and makes it easier for the reader to adapt the code to his/her own needs. As an example, see the code samples for the API reference docs, // TODO: Change placeholders below to values for parameters to the Insert() method:
// A valid API project identifier.
string project = "";
// TODO Add code to assign values to properties of 'content'.
Data.Bucket content = new Data.Bucket();
BucketsResource.InsertRequest request = storageService.Buckets.Insert(content, project);
Data.Bucket response = request.Execute(); Documenting snippet variablesMethod definitions typically include simple parameter names such as To solve this, each variable MUST include a description. I see 3 options for how to include a description of the variable in the code snippet:
Method parameters1. For the first option, each language typically uses its own idiomatic commenting style for documenting parameters. For example: /**
* Creates a new bucket with the given name.
*
* @param {string} name The name of the new bucket.
* @param {function} cb The callback function.
*/
function createBucket (name, callback) {
// ...
} These comments can take up many lines of code, using up valuable vertical space. Because they follow language idioms for method parameter documentation, they can often be more syntactically complex and difficult to read at a glance than simple comments. C# parameter documentation, for example, is written in XML and is overkill for use in snippets. These are not ideal due to readability and vertical space. Comments in method code2. The second option might be to add comments near where each variable is used. For example, when # Name of a new bucket to create, e.g. "my-bucket"
bucket = storage.create_bucket bucket_name This is helpful, although... what if multiple variables are introduced on the same line? # extract_uri references a Google Storage file, e.g. "gs://my-bucket/my-file.csv"
# format references a the exported file format, e.g. "csv" or "json"
extract_job = table.extract extract_uri, format These are not ideal because:
Placeholder variables3. The third option might be to add placeholder variables with example values / descriptions for each of the variables required to successfully run the snippet code. This is similar to what is currently used by some of the code snippets shown in API usage documentation 1. def upload_file bucket_name, file_name, local_file_path
# bucket_name = "Name of Google Cloud Storage bucket"
# local_file_path = "Path to local file to upload"
bucket = storage.bucket bucket_name
bucket.create_file local_file_path, file_name
end The third option might allow users to copy/paste the snippet into their own file and uncomment existing variable definitions to run the code successfully without having to define any variables by themselves. Users will very likely also find it useful to see a list of all required variables at the top of the snippet. I propose that this is ideal.
Why we do not need to include method signaturesFor all of the 3 possible options above, none of them benefit from being wrapped in a method definition. Method definitions take up 2 extra lines of vertical space and they result in the first line of code may be something like Import Client Library in snippets
We have found that many users copy/paste our snippets into their own code files. When a developer copy/pastes a snippet following these proposals, they will discover that the snippet DOES NOT FUNCTION independently for 2 reasons:
As a developer, I may search Google looking for how to use a product such as Google Cloud Storage looking for a way to upload a file. When I scan through the documentation and find the documentation for uploading a file, I am likely to read the snippet code and attempt to upload a file using that code. I may not take the time to read the text surrounding the code snippet. I propose that this is ideal.
ExampleAs an example, this is the template that Ruby code sample snippets use today. This snippet follows all of the proposals included in this comment and are the result of templatizing code sample snippets based on opinions of those gathering user study data on sample usability. Translating a string using the Google Translate API # api_key = "Your API access key"
# text = "The text you would like to translate"
# language_code = "The ISO 639-1 code of language to translate to, eg. 'en'"
require "google/cloud"
gcloud = Google::Cloud.new
translate = gcloud.translate api_key
translation = translate.translate text, to: language_code
puts "Translated '#{text}' to '#{translation.text.inspect}'"
puts "Original language: #{translation.from} translated to: #{translation.to}" Occassionally, variables are not simple strings but rather complex objects. BigQuery, for example, includes a snippet for inserting row data into a table and the row data is a complex object ( # project_id = "Your Google Cloud project ID"
# dataset_id = "ID of the dataset containing table"
# table_id = "ID of the table to import data into"
# row_data = [{ column1: value, column2: value }, ...]
require "google/cloud"
gcloud = Google::Cloud.new project_id
bigquery = gcloud.bigquery
dataset = bigquery.dataset dataset_id
table = dataset.table table_id
response = table.insert row_data
if response.success?
puts "Inserted rows successfully"
else
puts "Failed to insert #{response.error_rows.count} rows"
end |
+1 I adore small, simple samples. Whenever I see |
He meant that the
The problem is that most languages can't do imports within a method definition. Perhaps this works out okay in Ruby, but in Node.js dynamic imports are an anti-pattern. They can't be statically analyzed, and when Node.js implements ES2015 modules the language itself will enforce that imports be top-level statements only. This definitely won't work with Go, Java, and .NET, unless we want to do one snippet per file, which I don't think we want to do, and even then the samples for those languages would have to show a method definition. An argument could be made that a method signature can be self-documenting. The majority of our samples deal with primitives like strings and numbers. This example seems pretty clear to me (though maybe not a good measure?): def create_bucket(bucket_name):
storage_client = Storage.client()
# Creates a new bucket, e.g. "my-new-bucket"
bucket = storage_client.create_bucket(bucket_name)
print('Bucket {} created'.format(bucket.name)) Granted, this is a really simple example. For more complex samples, like writing a log entry (and your BigQuery row example), it's not enough for the method signature to simply accept a We could change the Cloud Storage sample to this: def create_bucket(bucket_name):
# [START storage_create_bucket]
storage_client = Storage.client()
# Creates a new bucket, e.g. "my-new-bucket"
bucket = storage_client.create_bucket(bucket_name)
print('Bucket {} created'.format(bucket.name))
# [END storage_create_bucket] but now it's not clear where the def create_bucket(bucket_name):
# [START storage_create_bucket]
storage_client = Storage.client()
# The name for the new bucket, e.g.
# bucket_name = "my-new-bucket"
# Creates a new bucket
bucket = storage_client.create_bucket(bucket_name)
print('Bucket {} created'.format(bucket.name))
# [END storage_create_bucket] But why not just show the method signature? I would say that decomposition and code re-use via functions is more common than writing apps that import, instantiate, and make a method call all at the top level. To reduce inconsistencies between our samples are how people write apps, my vote goes for showing the method definition. (Also, the user doesn't have to copy the method definition when they copy code from our docs. But if it's not there, they definitely can't copy it.) This would be my ideal "quickstart" Cloud Storage sample, which would be shown on the Cloud Storage Client Libraries page: # [START storage_quickstart]
# Imports the Google Cloud client library
from gcloud import storage
# Your Google Cloud Platform project ID
project_id = 'YOUR_PROJECT_ID'
# Instantiates a client
storage_client = storage.Client(project=project_id)
# The name for the new bucket
bucket_name = 'my-new-bucket'
# Creates the new bucket
bucket = storage_client.create_bucket(bucket_name)
# [END storage_quickstart] And this would be my ideal "everything else" type of snippet: (The snippet would be prefaced with a sentence like this: "For how to create a Cloud Storage client, refer to Cloud Storage Client Libraries.". The linked page would show installing and importing the client library.) def create_bucket(bucket_name):
# Instantiates a client
storage_client = Storage.client()
# The name for the new bucket, e.g.
# bucket_name = "my-new-bucket"
# Creates a new bucket
bucket = storage_client.create_bucket(bucket_name)
print('Bucket {} created'.format(bucket.name)) I'm okay with adding more comments that make it clear what type value a variable holds, e.g. |
Summary of changes:
master
before this PR gets merged)topics.py
andsubscriptions.py
to match the Node.js and Ruby PRsget_subscription
samplelist_subscriptions
sample (in addition to the existinglist_topic_subscriptions
sampleCompare to:
Proposals implemented by the 3 PRs: