Document
Definition for Google Cloud Natural Language API documents.
A document is used to hold text to be analyzed and annotated.
-
class google.cloud.language.document.Annotations(sentences, tokens, sentiment, entities)
Bases: tuple
Annotations for a document.
-
entities
Alias for field number 3
-
sentences
Alias for field number 0
-
sentiment
Alias for field number 2
-
tokens
Alias for field number 1
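Because Annotations derives from tuple, each named attribute is an alias for a position in the tuple. The following is a minimal sketch of that behavior. It uses a locally defined stand-in built with collections.namedtuple, which is an assumption about how such a class could be constructed rather than the library's own source, and the values are placeholders.

```python
from collections import namedtuple

# Stand-in with the same field order as the documented class.
# Assumption: this mirrors, but is not, the library's Annotations class.
Annotations = namedtuple(
    "Annotations", ["sentences", "tokens", "sentiment", "entities"]
)

# Placeholder values only; real results would hold objects from the API.
annotations = Annotations(
    sentences=["A sentence."],
    tokens=["A", "sentence", "."],
    sentiment=None,
    entities=[],
)

# Attribute access and positional access refer to the same fields.
assert annotations.sentences is annotations[0]
assert annotations.tokens is annotations[1]
assert annotations.sentiment is annotations[2]
assert annotations.entities is annotations[3]

# Being a tuple, the value can also be unpacked in field order.
sentences, tokens, sentiment, entities = annotations
```

Unpacking in field order is convenient when all four values are needed; attribute access reads better when only one or two are used.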
-
-
google.cloud.language.document.DEFAULT_LANGUAGE = 'en-US'
Default document language, English.
-
class google.cloud.language.document.Document(client, content=None, gcs_url=None, doc_type='PLAIN_TEXT', language='en-US', encoding='UTF8')
Bases: object
Document to send to Google Cloud Natural Language API.
A document represents either plain text or HTML. Its content is either stored on the document itself or referenced by a Google Cloud Storage object.
Parameters:
- client (Client) – A client which holds credentials and other configuration.
- content (str) – (Optional) The document text content (either plain text or HTML).
- gcs_url (str) – (Optional) The URL of the Google Cloud Storage object holding the content, of the form gs://{bucket}/{blob-name}.
- doc_type (str) – (Optional) The type of text in the document. Defaults to plain text. Can be one of PLAIN_TEXT or HTML.
- language (str) – (Optional) The language of the document text. Defaults to DEFAULT_LANGUAGE.
- encoding (str) – (Optional) The encoding of the document text. Defaults to UTF-8. Can be one of UTF8, UTF16 or UTF32.
Raises: ValueError if both content and gcs_url are specified, or if neither is specified.
-
HTML = 'HTML'
HTML document type.
-
PLAIN_TEXT = 'PLAIN_TEXT'
Plain text document type.
-
TYPE_UNSPECIFIED = 'TYPE_UNSPECIFIED'
Unspecified document type.
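The constructor's rule that exactly one of content and gcs_url must be supplied can be sketched as a small standalone check. The helper name check_document_source and the error messages are hypothetical and are not part of the library; the sketch only illustrates the documented ValueError condition.

```python
def check_document_source(content=None, gcs_url=None):
    """Hypothetical helper: require exactly one of content or gcs_url.

    Returns the name of the argument that was supplied.
    """
    if content is not None and gcs_url is not None:
        # Both sources given: ambiguous, so reject.
        raise ValueError("Specify either content or gcs_url, not both.")
    if content is None and gcs_url is None:
        # No source given: nothing to analyze.
        raise ValueError("Specify one of content or gcs_url.")
    return "content" if content is not None else "gcs_url"


# Inline text and a Cloud Storage reference are each valid on their own.
inline_source = check_document_source(content="Hello, world.")
stored_source = check_document_source(gcs_url="gs://{bucket}/{blob-name}")
```

Comparing against None rather than testing truthiness means an empty string still counts as supplied content in this sketch; whether the library treats empty content the same way is not stated here.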
-
analyze_entities()
Analyze the entities in the current document.
Finds named entities in the text (as of August 2016, only proper names are found), along with entity types, salience, mentions for each entity, and other properties.
See analyzeEntities.
Return type: list
Returns: A list of Entity objects returned from the API.
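Since the result is a plain list, ordinary list operations apply to it. The sketch below ranks entities by salience using a stand-in Entity type that models only a name and a salience value; the stand-in, the helper most_salient, and the sample values are all illustrative assumptions, not output from the API.

```python
from collections import namedtuple

# Hypothetical stand-in for the library's Entity; only two fields are modeled.
Entity = namedtuple("Entity", ["name", "salience"])


def most_salient(entities, limit=3):
    """Return up to `limit` entities, highest salience first."""
    return sorted(entities, key=lambda entity: entity.salience, reverse=True)[:limit]


# Illustrative values only.
sample = [
    Entity("Rome", 0.2),
    Entity("Italy", 0.7),
    Entity("Colosseum", 0.1),
]
top = most_salient(sample, limit=2)
top_names = [entity.name for entity in top]
```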
-
analyze_sentiment()
Analyze the sentiment in the current document.
See analyzeSentiment.
Return type: Sentiment
Returns: The sentiment of the current document.
-
annotate_text(include_syntax=True, include_entities=True, include_sentiment=True)
Advanced natural language API: document syntax and other features.
Includes the full functionality of analyze_entities() and analyze_sentiment(), enabled by the flags include_entities and include_sentiment respectively.
In addition, include_syntax adds a new feature that analyzes the document for semantic and syntactic information.
Note
This API is intended for users who are familiar with machine learning and need in-depth text features to build upon.
See annotateText.
Parameters:
- include_syntax (bool) – (Optional) Flag to enable syntax analysis of the current document.
- include_entities (bool) – (Optional) Flag to enable entity extraction from the current document.
- include_sentiment (bool) – (Optional) Flag to enable sentiment analysis of the current document.
Return type: Annotations
Returns: A tuple of each of the four values returned from the API: sentences, tokens, sentiment and entities.
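One way to picture how the three flags select features is a helper that collects only the enabled ones into a features mapping. The helper name build_features is hypothetical, and the key names follow the feature flags of the REST annotateText request as an assumption; this is a sketch of the idea, not the library's implementation.

```python
def build_features(include_syntax=True, include_entities=True,
                   include_sentiment=True):
    """Hypothetical helper: map the three flags to a features dict.

    Only enabled features are included in the result.
    """
    features = {}
    if include_syntax:
        features["extractSyntax"] = True
    if include_entities:
        features["extractEntities"] = True
    if include_sentiment:
        features["extractDocumentSentiment"] = True
    return features


# All features are requested by default.
all_features = build_features()

# Disabling flags narrows the request to the remaining features.
sentiment_only = build_features(include_syntax=False, include_entities=False)
```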
- client (