Add samples for natural language api. #425
Merged
10 commits
5fb9def
Add samples for natural language api.
aff93ad
fixed variable name error
puneithk 03cb67b
logged error message with exception
puneithk e2cc3d5
fixed variable unused error
puneithk 1789677
Refactor for clarity.
1242524
cast samples to int
puneithk 3e4deb1
added sample variable
puneithk 4ffc546
fixed indentation bug
puneithk 7ca9918
Fix lint errors
0be30eb
Remove movie_nl sample until it's more stable.
@@ -9,6 +9,7 @@ include =
    dns/*
    datastore/*
    error_reporting/*
    language/*
    managed_vms/*
    monitoring/*
    speech/*
@@ -0,0 +1,14 @@
# Google Cloud Natural Language API examples

This directory contains Python examples that use the
[Google Cloud Natural Language API](https://cloud.google.com/natural-language/).

- [api](api) has a simple command line tool that shows off the API's features.

- [ocr_nl](ocr_nl) uses the [Cloud Vision API](https://cloud.google.com/vision/)
to extract text from images, then uses the NL API to extract entity information
from those texts, and stores the extracted information in a database in support
of further analysis and correlation.

- [syntax_triples](syntax_triples) uses syntax analysis to find
subject-verb-object triples in a given piece of text.
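
To make the subject-verb-object idea in the last bullet concrete, here is a minimal sketch. It is not the `syntax_triples` sample itself. It assumes each token in a syntax-analysis response carries `text.content` and a `dependencyEdge` with `headTokenIndex` and `label`, and that subjects and direct objects are labelled `NSUBJ` and `DOBJ`. The function name `find_triples` and the hand-built `example_tokens` are hypothetical, written for illustration rather than captured from the API.

```python
def find_triples(tokens):
    """Return (subject, verb, object) tuples found via dependency labels.

    Assumes each token is a dict with 'text': {'content': ...} and
    'dependencyEdge': {'headTokenIndex': ..., 'label': ...}.
    """
    subjects = {}  # head token index -> subject words attached to it
    objects = {}   # head token index -> direct-object words attached to it
    for token in tokens:
        edge = token['dependencyEdge']
        word = token['text']['content']
        if edge['label'] == 'NSUBJ':
            subjects.setdefault(edge['headTokenIndex'], []).append(word)
        elif edge['label'] == 'DOBJ':
            objects.setdefault(edge['headTokenIndex'], []).append(word)

    triples = []
    # A head token that has both a subject and an object is treated as the verb.
    for head in sorted(set(subjects) & set(objects)):
        verb = tokens[head]['text']['content']
        for subject in subjects[head]:
            for obj in objects[head]:
                triples.append((subject, verb, obj))
    return triples


# Hand-built tokens for "Twain wrote books" (illustrative, not API output).
example_tokens = [
    {'text': {'content': 'Twain'},
     'dependencyEdge': {'headTokenIndex': 1, 'label': 'NSUBJ'}},
    {'text': {'content': 'wrote'},
     'dependencyEdge': {'headTokenIndex': 1, 'label': 'ROOT'}},
    {'text': {'content': 'books'},
     'dependencyEdge': {'headTokenIndex': 1, 'label': 'DOBJ'}},
]

print(find_triples(example_tokens))  # [('Twain', 'wrote', 'books')]
```

A real implementation would likely also need to handle passives, conjunctions, and multi-word phrases, which this sketch ignores.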
@@ -0,0 +1,87 @@

# Google Cloud Natural Language API Sample

This Python sample demonstrates the use of the [Google Cloud Natural Language API][NL-Docs]
for sentiment, entity, and syntax analysis.

[NL-Docs]: https://cloud.google.com/natural-language/docs/

## Setup

Please follow the [Set Up Your Project](https://cloud.google.com/natural-language/docs/getting-started#set_up_your_project)
steps in the Quickstart doc to create a project and enable the
Cloud Natural Language API. Following those steps, make sure that you
[Set Up a Service Account](https://cloud.google.com/natural-language/docs/common/auth#set_up_a_service_account),
and export the following environment variable:

```
export GOOGLE_APPLICATION_CREDENTIALS=/path/to/your-project-credentials.json
```

Review comment on this line: does `gcloud beta auth application-default login` not work btw?

Reply: Not sure. See my comment here, though.
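
The exported variable can be sanity-checked before running the sample. The following is a minimal sketch that only covers the service-account route described in the Setup section; it says nothing about whether other application-default login methods would work. The helper name `credentials_path_problem` is hypothetical and not part of the PR.

```python
import os


def credentials_path_problem(environ=os.environ):
    """Return a description of what is wrong, or None if the path looks usable.

    Only checks that GOOGLE_APPLICATION_CREDENTIALS is set and names an
    existing file; it does not validate the file's contents.
    """
    path = environ.get('GOOGLE_APPLICATION_CREDENTIALS')
    if not path:
        return 'GOOGLE_APPLICATION_CREDENTIALS is not set'
    if not os.path.isfile(path):
        return 'GOOGLE_APPLICATION_CREDENTIALS does not point to a file: ' + path
    return None


if __name__ == '__main__':
    problem = credentials_path_problem()
    print(problem or 'Credentials path looks usable.')
```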

## Run the sample

Install [pip](https://pip.pypa.io/en/stable/installing) if not already installed.

To run the example, install the necessary libraries using pip:

```sh
$ pip install -r requirements.txt
```

Then, run the script:

```sh
$ python analyze.py <command> <text-string>
```

where `<command>` is one of: `entities`, `sentiment`, or `syntax`.

The script will write to STDOUT the JSON returned from the API for the requested feature.

For example, if you run:

```sh
$ python analyze.py entities "Tom Sawyer is a book written by a guy known as Mark Twain."
```

You will see something like the following returned:

```
{
  "entities": [
    {
      "salience": 0.49785897,
      "mentions": [
        {
          "text": {
            "content": "Tom Sawyer",
            "beginOffset": 0
          }
        }
      ],
      "type": "PERSON",
      "name": "Tom Sawyer",
      "metadata": {
        "wikipedia_url": "http://en.wikipedia.org/wiki/The_Adventures_of_Tom_Sawyer"
      }
    },
    {
      "salience": 0.12209519,
      "mentions": [
        {
          "text": {
            "content": "Mark Twain",
            "beginOffset": 47
          }
        }
      ],
      "type": "PERSON",
      "name": "Mark Twain",
      "metadata": {
        "wikipedia_url": "http://en.wikipedia.org/wiki/Mark_Twain"
      }
    }
  ],
  "language": "en"
}
```
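
A response shaped like the sample output above can be reduced to a compact summary. This is a sketch under the assumption that real responses have the same layout as that sample. The helper name `summarize_entities` is hypothetical, and `SAMPLE_RESPONSE` is a copy of the README's example rather than fresh API output.

```python
SAMPLE_RESPONSE = {
    'entities': [
        {
            'salience': 0.12209519,
            'mentions': [{'text': {'content': 'Mark Twain', 'beginOffset': 47}}],
            'type': 'PERSON',
            'name': 'Mark Twain',
            'metadata': {'wikipedia_url': 'http://en.wikipedia.org/wiki/Mark_Twain'},
        },
        {
            'salience': 0.49785897,
            'mentions': [{'text': {'content': 'Tom Sawyer', 'beginOffset': 0}}],
            'type': 'PERSON',
            'name': 'Tom Sawyer',
            'metadata': {
                'wikipedia_url':
                    'http://en.wikipedia.org/wiki/The_Adventures_of_Tom_Sawyer'},
        },
    ],
    'language': 'en',
}


def summarize_entities(response):
    """Return one flat dict per entity, most salient first."""
    rows = []
    for entity in response.get('entities', []):
        rows.append({
            'name': entity.get('name'),
            'type': entity.get('type'),
            'salience': entity.get('salience', 0.0),
            # Metadata is treated as optional here; that is an assumption.
            'wikipedia_url': entity.get('metadata', {}).get('wikipedia_url'),
            'offsets': [mention['text']['beginOffset']
                        for mention in entity.get('mentions', [])],
        })
    rows.sort(key=lambda row: row['salience'], reverse=True)
    return rows


for row in summarize_entities(SAMPLE_RESPONSE):
    print('{name} ({type}) salience={salience} offsets={offsets}'.format(**row))
```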
@@ -0,0 +1,115 @@
#!/usr/bin/env python

# Copyright 2016 Google, Inc
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#    http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

"""Analyzes text using the Google Cloud Natural Language API."""

import argparse
import json
import sys

from googleapiclient import discovery
import httplib2
from oauth2client.client import GoogleCredentials


def get_service():
    credentials = GoogleCredentials.get_application_default()
    scoped_credentials = credentials.create_scoped(
        ['https://www.googleapis.com/auth/cloud-platform'])
    http = httplib2.Http()
    scoped_credentials.authorize(http)
    return discovery.build('language', 'v1beta1', http=http)


def get_native_encoding_type():
    """Returns the encoding type that matches Python's native strings."""
    if sys.maxunicode == 65535:
        return 'UTF16'
    else:
        return 'UTF32'


def analyze_entities(text, encoding='UTF32'):
    body = {
        'document': {
            'type': 'PLAIN_TEXT',
            'content': text,
        },
        'encodingType': encoding,
    }

    service = get_service()

    request = service.documents().analyzeEntities(body=body)
    response = request.execute()

    return response


def analyze_sentiment(text):
    body = {
        'document': {
            'type': 'PLAIN_TEXT',
            'content': text,
        }
    }

    service = get_service()

    request = service.documents().analyzeSentiment(body=body)
    response = request.execute()

    return response


def analyze_syntax(text, encoding='UTF32'):
    body = {
        'document': {
            'type': 'PLAIN_TEXT',
            'content': text,
        },
        'features': {
            'extract_syntax': True,
        },
        'encodingType': encoding,
    }

    service = get_service()

    request = service.documents().annotateText(body=body)
    response = request.execute()

    return response


if __name__ == '__main__':
    parser = argparse.ArgumentParser(
        description=__doc__,
        formatter_class=argparse.RawDescriptionHelpFormatter)
    parser.add_argument('command', choices=[
        'entities', 'sentiment', 'syntax'])
    parser.add_argument('text')

    args = parser.parse_args()

    if args.command == 'entities':
        result = analyze_entities(args.text, get_native_encoding_type())
    elif args.command == 'sentiment':
        result = analyze_sentiment(args.text)
    elif args.command == 'syntax':
        result = analyze_syntax(args.text, get_native_encoding_type())

    print(json.dumps(result, indent=2))
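
`get_native_encoding_type` exists because the API reports `beginOffset` in units of the requested `encodingType`, so matching Python's own string indexing lets the offsets be used directly as string indices. The sketch below illustrates how the same position differs between encodings. It assumes Python 3, where strings are indexed by code point, and the helper name `offset_in_code_units` is hypothetical; it computes offsets locally and does not call the API.

```python
def offset_in_code_units(text, needle, encoding_type):
    """Return where `needle` starts in `text`, counted in code units.

    'UTF8' counts bytes, 'UTF16' counts 16-bit units, and 'UTF32' counts
    code points, mirroring the three encodingType values used above.
    """
    prefix = text[:text.index(needle)]
    if encoding_type == 'UTF8':
        return len(prefix.encode('utf-8'))
    if encoding_type == 'UTF16':
        return len(prefix.encode('utf-16-le')) // 2
    if encoding_type == 'UTF32':
        return len(prefix.encode('utf-32-le')) // 4
    raise ValueError('unsupported encoding type: ' + encoding_type)


sentence = 'Tom Sawyer is a book written by a guy known as Mark Twain.'
# Pure ASCII: every encoding agrees, matching the README's beginOffset of 47.
print(offset_in_code_units(sentence, 'Mark Twain', 'UTF32'))  # 47

# A character outside the Basic Multilingual Plane makes the counts diverge.
emoji_text = '\U0001F600 Mark Twain'
for encoding_type in ('UTF8', 'UTF16', 'UTF32'):
    print(encoding_type, offset_in_code_units(emoji_text, 'Mark', encoding_type))
```

With the emoji prefix, the offsets come out as 5, 3, and 2 respectively, which is why slicing a Python 3 string with a UTF8 or UTF16 offset would land on the wrong character.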
think the return type is sufficiently complicated to warrant a doc string
Oh - good point.
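
One way the docstring suggestion could be addressed is sketched below. This is not the code that was merged. The field names in the `Returns` section are taken from the README's sample output, and passing `service` in as a parameter is a hypothetical change made here so the function can be exercised without network access.

```python
def analyze_entities(service, text, encoding='UTF32'):
    """Runs entity analysis on `text` and returns the parsed JSON response.

    Args:
        service: A client for the language API, such as the object returned
            by get_service() in analyze.py (injected here for testability).
        text: The plain-text content to analyze.
        encoding: The encodingType used by the API when computing offsets.

    Returns:
        A dict shaped like the README's sample output:

            {
              'entities': [{
                  'name': str,
                  'type': str,          # e.g. 'PERSON'
                  'salience': float,
                  'mentions': [
                      {'text': {'content': str, 'beginOffset': int}},
                  ],
                  'metadata': {'wikipedia_url': str},
              }],
              'language': str,          # e.g. 'en'
            }

        Which keys are always present is an assumption; only the sample
        output was used to write this description.
    """
    body = {
        'document': {
            'type': 'PLAIN_TEXT',
            'content': text,
        },
        'encodingType': encoding,
    }
    request = service.documents().analyzeEntities(body=body)
    return request.execute()
```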