Cloud Datastore

gcloud.datastore

Shortcut methods for getting set up with Google Cloud Datastore.

You’ll typically use these to get started with the API:

>>> from gcloud import datastore
>>> dataset = datastore.get_dataset('dataset-id-here',
...                                 'long-email@googleapis.com',
...                                 '/path/to/private.key')
>>> # Then do other things...
>>> query = dataset.query().kind('EntityKind')
>>> entity = dataset.entity('EntityKind')

The main concepts with this API are:

  • gcloud.datastore.connection.Connection – a connection to the Cloud Datastore.
  • gcloud.datastore.dataset.Dataset – a dataset (akin to a database) in the Cloud Datastore.
  • gcloud.datastore.entity.Entity – a single record (akin to a row) in the Cloud Datastore.
  • gcloud.datastore.key.Key – an immutable identifier for an entity.
  • gcloud.datastore.query.Query – a query against the Cloud Datastore.
  • gcloud.datastore.transaction.Transaction – a bundle of mutations executed with isolation.

gcloud.datastore.__init__.SCOPE = ('https://www.googleapis.com/auth/datastore', 'https://www.googleapis.com/auth/userinfo.email')

The scope required for authenticating as a Cloud Datastore consumer.

gcloud.datastore.__init__.get_connection(client_email, private_key_path)[source]

Shortcut method to establish a connection to the Cloud Datastore.

Use this if you are going to access several datasets with the same set of credentials (unlikely):

>>> from gcloud import datastore
>>> connection = datastore.get_connection(email, key_path)
>>> dataset1 = connection.dataset('dataset1')
>>> dataset2 = connection.dataset('dataset2')
Parameters:
  • client_email (string) – The e-mail attached to the service account.
  • private_key_path (string) – The path to a private key file (this file was given to you when you created the service account).
Return type:

gcloud.datastore.connection.Connection

Returns:

A connection defined with the proper credentials.

gcloud.datastore.__init__.get_dataset(dataset_id, client_email, private_key_path)[source]

Shortcut method to establish a connection to a particular dataset in the Cloud Datastore.

You’ll generally use this as the first call to working with the API:

>>> from gcloud import datastore
>>> dataset = datastore.get_dataset('dataset-id', email, key_path)
>>> # Now you can do things with the dataset.
>>> dataset.query().kind('TestKind').fetch()
[...]
Parameters:
  • dataset_id (string) – The id of the dataset you want to use. This is akin to a database name and is usually the same as your Cloud Datastore project name.
  • client_email (string) – The e-mail attached to the service account.
  • private_key_path (string) – The path to a private key file (this file was given to you when you created the service account).
Return type:

gcloud.datastore.dataset.Dataset

Returns:

A dataset with a connection using the provided credentials.

Connections

class gcloud.datastore.connection.Connection(credentials=None)[source]

Bases: object

A connection to the Google Cloud Datastore via the Protobuf API.

This class should understand only the basic types (and protobufs) in method arguments; however, it should be capable of returning advanced types.

Parameters:credentials (gcloud.credentials.Credentials) – The OAuth2 Credentials to use for this connection.
API_BASE_URL = 'https://www.googleapis.com'

The base of the API call URL.

API_URL_TEMPLATE = '{api_base}/datastore/{api_version}/datasets/{dataset_id}/{method}'

A template used to craft the URL pointing toward a particular API call.

API_VERSION = 'v1beta2'

The version of the API, used in building the API call’s URL.

begin_transaction(dataset_id, serializable=False)[source]

Begin a transaction.

Parameters:dataset_id (string) – The dataset over which to execute the transaction.
classmethod build_api_url(dataset_id, method, base_url=None, api_version=None)[source]

Construct the URL for a particular API call.

This method is used internally to come up with the URL to use when making RPCs to the Cloud Datastore API.

Parameters:
  • dataset_id (string) – The ID of the dataset to connect to. This is usually your project name in the cloud console.
  • method (string) – The API method to call (ie, runQuery, lookup, ...).
  • base_url (string) – The base URL where the API lives. You shouldn’t have to provide this.
  • api_version (string) – The version of the API to connect to. You shouldn’t have to provide this.
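
The URL construction described here can be sketched as a plain string-format over the class constants documented above. This is an illustrative re-derivation, not the library's actual implementation:

```python
# Illustrative sketch of build_api_url using the class constants
# documented above (API_BASE_URL, API_URL_TEMPLATE, API_VERSION).
API_BASE_URL = 'https://www.googleapis.com'
API_URL_TEMPLATE = ('{api_base}/datastore/{api_version}'
                    '/datasets/{dataset_id}/{method}')
API_VERSION = 'v1beta2'

def build_api_url(dataset_id, method, base_url=None, api_version=None):
    """Fill in the URL template, falling back to the class defaults."""
    return API_URL_TEMPLATE.format(
        api_base=base_url or API_BASE_URL,
        api_version=api_version or API_VERSION,
        dataset_id=dataset_id,
        method=method)

url = build_api_url('my-dataset', 'runQuery')
# 'https://www.googleapis.com/datastore/v1beta2/datasets/my-dataset/runQuery'
```
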
commit(dataset_id, mutation_pb)[source]
dataset(*args, **kwargs)[source]

Factory method for Dataset objects.

Parameters:args – All args and kwargs will be passed along to the gcloud.datastore.dataset.Dataset initializer.
Return type:gcloud.datastore.dataset.Dataset
Returns:A dataset object that will use this connection as its transport.
delete_entities(dataset_id, key_pbs)[source]

Delete keys from a dataset in the Cloud Datastore.

This method deals only with gcloud.datastore.datastore_v1_pb2.Key protobufs and not with any of the other abstractions. For example, it’s used under the hood in the gcloud.datastore.entity.Entity.delete() method.

Parameters:
  • dataset_id (string) – The dataset from which to delete the keys.
  • key_pbs (list of gcloud.datastore.datastore_v1_pb2.Key (or a single Key)) – The key (or keys) to delete from the datastore.
delete_entity(dataset_id, key_pb)[source]
http[source]

A getter for the HTTP transport used in talking to the API.

Return type:httplib2.Http
Returns:A Http object used to transport data.
lookup(dataset_id, key_pbs)[source]

Lookup keys from a dataset in the Cloud Datastore.

This method deals only with protobufs (gcloud.datastore.datastore_v1_pb2.Key and gcloud.datastore.datastore_v1_pb2.Entity) and is used under the hood for methods like gcloud.datastore.dataset.Dataset.get_entity():

>>> from gcloud import datastore
>>> from gcloud.datastore.key import Key
>>> connection = datastore.get_connection(email, key_path)
>>> dataset = connection.dataset('dataset-id')
>>> key = Key(dataset=dataset).kind('MyKind').id(1234)

Using the gcloud.datastore.dataset.Dataset helper:

>>> dataset.get_entity(key)
<Entity object>

Using the connection class directly:

>>> connection.lookup('dataset-id', key.to_protobuf())
<Entity protobuf>
Parameters:
  • dataset_id (string) – The dataset to look up the keys.
  • key_pbs (list of gcloud.datastore.datastore_v1_pb2.Key (or a single Key)) – The key (or keys) to retrieve from the datastore.
Return type:

list of gcloud.datastore.datastore_v1_pb2.Entity (or a single Entity)

Returns:

The entities corresponding to the keys provided. If a single key was provided and no results matched, this will return None. If multiple keys were provided and no results matched, this will return an empty list.
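
The single-key versus multi-key return convention above can be sketched with a small normalization helper. The names here are hypothetical; the real method wraps the lookup RPC:

```python
def lookup(key_pbs, found_by_key):
    """Sketch of lookup()'s return convention: a single key in yields a
    single entity (or None) out; a list of keys yields a possibly empty
    list. `found_by_key` is a stand-in for the RPC response."""
    single = not isinstance(key_pbs, list)
    keys = [key_pbs] if single else key_pbs
    results = [found_by_key[k] for k in keys if k in found_by_key]
    if single:
        return results[0] if results else None
    return results
```
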

mutation()[source]
rollback_transaction(dataset_id, transaction_id)[source]

Rollback an existing transaction.

Raises a ValueError if the connection isn’t currently in a transaction.

Parameters:
  • dataset_id (string) – The dataset to which the transaction belongs.
  • transaction_id (string) – The ID of the transaction to roll back.
run_query(dataset_id, query_pb, namespace=None)[source]

Run a query on the Cloud Datastore.

Given a Query protobuf, sends a runQuery request to the Cloud Datastore API and returns a list of entity protobufs matching the query.

You typically wouldn’t use this method directly; instead, use the gcloud.datastore.query.Query.fetch() method.

Under the hood, the gcloud.datastore.query.Query class uses this method to fetch data:

>>> from gcloud import datastore
>>> connection = datastore.get_connection(email, key_path)
>>> dataset = connection.dataset('dataset-id')
>>> query = dataset.query().kind('MyKind').filter('property =', 'value')

Using the fetch() method...

>>> query.fetch()
[<list of Entity objects>]

Under the hood this is doing...

>>> connection.run_query('dataset-id', query.to_protobuf())
[<list of Entity Protobufs>]
Parameters:
  • dataset_id (string) – The ID of the dataset over which to run the query.
  • query_pb (gcloud.datastore.datastore_v1_pb2.Query) – The Protobuf representing the query to run.
  • namespace (string) – The namespace over which to run the query.
save_entity(dataset_id, key_pb, properties)[source]

Save an entity to the Cloud Datastore with the provided properties.

Parameters:
  • dataset_id (string) – The dataset in which to save the entity.
  • key_pb (gcloud.datastore.datastore_v1_pb2.Key) – The complete or partial key for the entity.
  • properties (dict) – The properties to store on the entity.
transaction(transaction=<sentinel>)[source]

Datasets

class gcloud.datastore.dataset.Dataset(id, connection=None)[source]

Bases: object

A dataset in the Cloud Datastore.

This class acts as an abstraction of a single dataset in the Cloud Datastore.

A dataset is analogous to a database in the relational database world, and corresponds to a single project using the Cloud Datastore.

Typically, you would have only one of these per connection; however, it didn’t seem right to collapse the functionality of a connection and a dataset into a single class.

Datasets (like gcloud.datastore.query.Query) are immutable. That is, you cannot change the ID or connection references. If you need to modify the connection or ID, it’s recommended that you construct a new Dataset.

Parameters:
  • id (string) – The ID of the dataset (usually the same as your Cloud Datastore project name).
  • connection (gcloud.datastore.connection.Connection) – The connection to use as this dataset's transport.
connection()[source]

Get the current connection.

>>> dataset = Dataset('dataset-id', connection=conn)
>>> dataset.connection()
<Connection object>
Return type:gcloud.datastore.connection.Connection
Returns:The current connection.
entity(kind)[source]
get_entities(keys)[source]
get_entity(key)[source]

Retrieves an entity from the dataset, along with all of its attributes.

Parameters:key (gcloud.datastore.key.Key) – The key of the entity to retrieve.
Return type:gcloud.datastore.entity.Entity or None
Returns:The requested entity, or None if there was no match found.
id()[source]

Get the current dataset ID.

>>> dataset = Dataset('dataset-id', connection=conn)
>>> dataset.id()
'dataset-id'
Return type:string
Returns:The current dataset ID.
query(*args, **kwargs)[source]
transaction(*args, **kwargs)[source]

Entities

Class for representing a single entity in the Cloud Datastore.

Entities are akin to rows in a relational database, storing the actual instance of data.

Each entity is officially represented with a gcloud.datastore.key.Key class, however it is possible that you might create an Entity with only a partial Key (that is, a Key with a Kind, and possibly a parent, but without an ID).

Entities in this API act like dictionaries with extras built in that allow you to delete or persist the data stored on the entity.

class gcloud.datastore.entity.Entity(dataset=None, kind=None)[source]

Bases: dict

Parameters:
  • dataset (gcloud.datastore.dataset.Dataset) – The dataset in which this entity belongs.
  • kind (string) – The kind of entity this is, akin to a table name in a relational database.

Entities are mutable and act like a subclass of a dictionary. This means you could take an existing entity and change the key to duplicate the object.

This can be used on its own; however, it is likely easier to use the shortcut methods provided by gcloud.datastore.dataset.Dataset, such as gcloud.datastore.dataset.Dataset.entity().

You can then set values on the entity just like you would on any other dictionary.

>>> entity['age'] = 20
>>> entity['name'] = 'JJ'
>>> entity
<Entity[{'kind': 'EntityKind', id: 1234}] {'age': 20, 'name': 'JJ'}>

And you can cast an entity to a regular Python dictionary with the dict builtin:

>>> dict(entity)
{'age': 20, 'name': 'JJ'}
dataset()[source]

Get the gcloud.datastore.dataset.Dataset in which this entity belongs.

Note

This is based on the gcloud.datastore.key.Key set on the entity. That means that if you have no key set, the dataset might be None. It also means that if you change the key on the entity, this will refer to that key’s dataset.

delete()[source]

Delete the entity in the Cloud Datastore.

Note

This is based entirely off of the gcloud.datastore.key.Key set on the entity. Whatever is stored remotely using the key on the entity will be deleted.

classmethod from_key(key)[source]

Factory method for creating an entity based on the gcloud.datastore.key.Key.

Parameters:key (gcloud.datastore.key.Key) – The key for the entity.
Returns:The Entity derived from the gcloud.datastore.key.Key.
classmethod from_protobuf(pb, dataset=None)[source]

Factory method for creating an entity based on a protobuf.

The protobuf should be one returned from the Cloud Datastore Protobuf API.

Parameters:pb (gcloud.datastore.datastore_v1_pb2.Entity) – The Protobuf representing the entity.
Returns:The Entity derived from the gcloud.datastore.datastore_v1_pb2.Entity.
key(key=None)[source]

Get or set the gcloud.datastore.key.Key on the current entity.

Parameters:key (gcloud.datastore.key.Key) – The key you want to set on the entity.
Returns:Either the current key or the Entity.
>>> entity.key(my_other_key)  # This returns the original entity.
<Entity[{'kind': 'OtherKeyKind', 'id': 1234}] {'property': 'value'}>
>>> entity.key()  # This returns the key.
<Key[{'kind': 'OtherKeyKind', 'id': 1234}]>
kind()[source]

Get the kind of the current entity.

Note

This relies entirely on the gcloud.datastore.key.Key set on the entity. That means that we’re not storing the kind of the entity at all, just the properties and a pointer to a Key which knows its Kind.

reload()[source]

Reloads the contents of this entity from the datastore.

This method takes the gcloud.datastore.key.Key, loads all properties from the Cloud Datastore, and sets the updated properties on the current object.

Warning

This will override any existing properties if a different value exists remotely; however, it will not override any properties that exist only locally.
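
The merge rule in the warning (remote values win for shared properties, locally-only properties survive) amounts to a plain dictionary update; a minimal sketch:

```python
def reload_properties(local, remote):
    """Sketch of reload()'s merge rule: every property fetched from the
    datastore overwrites the local copy, while properties that exist
    only locally are left untouched."""
    local.update(remote)  # remote values win for shared keys
    return local

entity = {'name': 'JJ', 'draft_note': 'local only'}
reload_properties(entity, {'name': 'James', 'age': 50})
# entity is now {'name': 'James', 'age': 50, 'draft_note': 'local only'}
```
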

save()[source]

Save the entity in the Cloud Datastore.

Return type:gcloud.datastore.entity.Entity
Returns:The entity with a possibly updated Key.

Keys

class gcloud.datastore.key.Key(dataset=None, namespace=None, path=None)[source]

Bases: object

An immutable representation of a datastore Key.

dataset(dataset=None)[source]
classmethod from_path(*args, **kwargs)[source]
classmethod from_protobuf(pb, dataset=None)[source]
id(id=None)[source]
id_or_name()[source]
is_partial()[source]
kind(kind=None)[source]
name(name=None)[source]
namespace(namespace=None)[source]
parent()[source]
path(path=None)[source]
to_protobuf()[source]

Queries

class gcloud.datastore.query.Query(kind=None, dataset=None)[source]

Bases: object

A Query against the Cloud Datastore.

This class serves as an abstraction for creating a query over data stored in the Cloud Datastore.

Each Query object is immutable, and a clone is returned whenever any part of the query is modified:

>>> query = Query('MyKind')
>>> limited_query = query.limit(10)
>>> query.limit() == 10
False
>>> limited_query.limit() == 10
True
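
The clone-on-modify behavior shown above can be sketched with a minimal immutable builder. This is a hypothetical class illustrating only the pattern; the real Query tracks far more state:

```python
import copy

class Query:
    """Sketch of the clone-on-modify pattern: hybrid getter/setter
    methods return a new Query and never mutate the original."""
    def __init__(self, kind=None):
        self._kind = kind
        self._limit = None

    def limit(self, limit=None):
        if limit is None:            # getter form
            return self._limit
        clone = copy.deepcopy(self)  # setter form: modify a copy only
        clone._limit = limit
        return clone

query = Query('MyKind')
limited_query = query.limit(10)  # `query` itself is unchanged
```
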

You typically won’t construct a Query by initializing it like Query('MyKind', dataset=...) but instead use the helper gcloud.datastore.dataset.Dataset.query() method which generates a query that can be executed without any additional work:

>>> from gcloud import datastore
>>> dataset = datastore.get_dataset('dataset-id', email, key_path)
>>> query = dataset.query('MyKind')
Parameters:
  • kind (string) – The kind of entities to query for.
  • dataset (gcloud.datastore.dataset.Dataset) – The dataset to run the query against.
OPERATORS = {'=': 5, '<=': 2, '>=': 4, '<': 1, '>': 3}

Mapping of operator strings and their protobuf equivalents.

dataset(dataset=None)[source]

Get or set the gcloud.datastore.dataset.Dataset for this Query.

This is the dataset against which the Query will be run.

This is a hybrid getter / setter, used as:

>>> query = Query('Person')
>>> query = query.dataset(my_dataset)  # Set the dataset.
>>> query.dataset()  # Get the current dataset.
<Dataset object>
Return type:gcloud.datastore.dataset.Dataset, None, or Query
Returns:If no arguments, returns the current dataset. If a dataset is provided, returns a clone of the Query with that dataset set.
fetch(limit=None)[source]

Executes the Query and returns all matching entities.

This makes an API call to the Cloud Datastore, sends the Query as a protobuf, parses the responses to Entity protobufs, and then converts them to gcloud.datastore.entity.Entity objects.

For example:

>>> from gcloud import datastore
>>> dataset = datastore.get_dataset('dataset-id', email, key_path)
>>> query = dataset.query('Person').filter('name =', 'Sally')
>>> query.fetch()
[<Entity object>, <Entity object>, ...]
>>> query.fetch(1)
[<Entity object>]
>>> query.limit()
None
Parameters:limit (integer) – An optional limit to apply temporarily to this query. That is, the Query itself won’t be altered, but the limit will be applied to the query before it is executed.
Return type:list of gcloud.datastore.entity.Entity objects
Returns:The list of entities matching this query’s criteria.
filter(expression, value)[source]

Filter the query based on an expression and a value.

This will return a clone of the current Query filtered by the expression and value provided.

Expressions take the form of:

.filter('<property> <operator>', <value>)

where property is a property stored on the entity in the datastore and operator is one of OPERATORS (ie, =, <, <=, >, >=):

>>> query = Query('Person')
>>> filtered_query = query.filter('name =', 'James')
>>> filtered_query = query.filter('age >', 50)

Because each call to .filter() returns a cloned Query object we are able to string these together:

>>> query = Query('Person').filter('name =', 'James').filter('age >', 50)
Parameters:
  • expression (string) – An expression of a property and an operator (ie, =).
  • value (integer, string, boolean, float, None, datetime) – The value to filter on.
Return type:

Query

Returns:

A Query filtered by the expression and value provided.
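
A filter expression of the form '<property> <operator>' might be split apart using the OPERATORS mapping shown above. This parsing sketch is purely illustrative, not the library's code:

```python
# Operator strings mapped to their protobuf enum values, as documented.
OPERATORS = {'<': 1, '<=': 2, '>': 3, '>=': 4, '=': 5}

def parse_filter(expression):
    """Split '<property> <operator>' and look up the operator's
    protobuf enum value; reject malformed expressions."""
    prop, _, operator = expression.rpartition(' ')
    if not prop or operator not in OPERATORS:
        raise ValueError('Invalid expression: %r' % expression)
    return prop, OPERATORS[operator]

parse_filter('name =')  # ('name', 5)
parse_filter('age >')   # ('age', 3)
```
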

kind(*kinds)[source]

Get or set the Kind of the Query.

Note

This is an additive operation. That is, if the Query is set for kinds A and B, and you call .kind('C'), it will query for kinds A, B, and, C.

Parameters:kinds (string) – The entity kinds for which to query.
Return type:string or Query
Returns:If no arguments, returns the kind. If a kind is provided, returns a clone of the Query with those kinds set.
limit(limit=None)[source]

Get or set the limit of the Query.

This is the maximum number of rows (Entities) to return for this Query.

This is a hybrid getter / setter, used as:

>>> query = Query('Person')
>>> query = query.limit(100)  # Set the limit to 100 rows.
>>> query.limit()  # Get the limit for this query.
100
Return type:integer, None, or Query
Returns:If no arguments, returns the current limit. If a limit is provided, returns a clone of the Query with that limit set.
to_protobuf()[source]

Convert the Query instance to a gcloud.datastore.datastore_v1_pb2.Query.

Return type:gcloud.datastore.datastore_v1_pb2.Query
Returns:A Query protobuf that can be sent to the protobuf API.

Transactions

class gcloud.datastore.transaction.Transaction(dataset)[source]

Bases: object

An abstraction representing datastore Transactions.

Transactions can be used to build up a bulk mutation as well as provide isolation.

For example, the following snippet of code will put the two save operations (either insert_auto_id or upsert) into the same mutation, and execute those within a transaction:

>>> from gcloud import datastore
>>> dataset = datastore.get_dataset('dataset-id', email, key_path)
>>> with dataset.transaction(bulk_mutation=True):  # The default.
...   entity1.save()
...   entity2.save()

To rollback a transaction if there is an error:

>>> from gcloud import datastore
>>> dataset = datastore.get_dataset('dataset-id', email, key_path)
>>> with dataset.transaction() as t:
...   try:
...     do_some_work()
...     entity1.save()
...   except:
...     t.rollback()

If the transaction isn’t rolled back, it will commit by default.

Warning

Inside a transaction, automatically assigned IDs for entities will not be available at save time! That means, if you try:

>>> with dataset.transaction():
...   entity = dataset.entity('Thing').save()

entity won’t have a complete Key until the transaction is committed.

Once you exit the transaction (or call commit()), the automatically generated ID will be assigned to the entity:

>>> with dataset.transaction():
...   entity = dataset.entity('Thing')
...   entity.save()
...   assert entity.key().is_partial()  # There is no ID on this key.
>>> assert not entity.key().is_partial()  # There *is* an ID on this key.
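
The partial-key behavior above can be sketched with a tiny hypothetical Key class: a key stays partial until an ID is assigned (which, inside a transaction, happens at commit time):

```python
class Key:
    """Minimal sketch of partial-key semantics (hypothetical class;
    the real Key also carries a kind, path, namespace, and dataset)."""
    def __init__(self, id=None):
        self._id = id

    def is_partial(self):
        # A key without an ID cannot yet address a stored entity.
        return self._id is None

    def complete(self, assigned_id):
        self._id = assigned_id

key = Key()         # no ID yet: partial
key.complete(1234)  # e.g. the ID assigned when the transaction commits
```
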

Warning

If you’re using the automatically generated ID functionality, it’s important that you only use gcloud.datastore.entity.Entity.save() rather than using gcloud.datastore.connection.Connection.save_entity() directly.

If you mix the two, the results will have extra IDs generated and it could jumble things up.

If you don’t want to use the context manager you can initialize a transaction manually:

>>> transaction = dataset.transaction()
>>> transaction.begin()

>>> entity = dataset.entity('Thing')
>>> entity.save()

>>> if error:
...   transaction.rollback()
... else:
...   transaction.commit()
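
The commit-by-default, rollback-on-demand lifecycle can be sketched as a context manager. This hypothetical minimal class shows only the control flow; the real Transaction also manages the connection and the mutation protobuf:

```python
class Transaction:
    """Sketch of the documented lifecycle: begin() on entry, commit()
    on exit unless rollback() was called first."""
    def __init__(self):
        self.state = 'initial'

    def begin(self):
        self.state = 'active'

    def commit(self):
        self.state = 'committed'

    def rollback(self):
        self.state = 'rolled back'

    def __enter__(self):
        self.begin()
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        if self.state == 'active':  # not explicitly rolled back
            self.commit()
        return False                # never swallow exceptions
```

Used as `with Transaction() as t:`, the block commits on a clean exit; calling `t.rollback()` inside the block suppresses the commit.
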

For now, this library will enforce a rule of one transaction per connection. That is, if you want to work with two transactions at the same time (for whatever reason), that must happen over two separate gcloud.datastore.connection.Connection objects.

For example, this is perfectly valid:

>>> from gcloud import datastore
>>> dataset = datastore.get_dataset('dataset-id', email, key_path)
>>> with dataset.transaction():
...   dataset.entity('Thing').save()

However, this wouldn’t be acceptable:

>>> from gcloud import datastore
>>> dataset = datastore.get_dataset('dataset-id', email, key_path)
>>> with dataset.transaction():
...   dataset.entity('Thing').save()
...   with dataset.transaction():
...     dataset.entity('Thing').save()

Technically, it looks like the Protobuf API supports this type of pattern; however, it makes the code particularly messy. If you really need to nest transactions, try:

>>> from gcloud import datastore
>>> dataset1 = datastore.get_dataset('dataset-id', email, key_path)
>>> dataset2 = datastore.get_dataset('dataset-id', email, key_path)
>>> with dataset1.transaction():
...   dataset1.entity('Thing').save()
...   with dataset2.transaction():
...     dataset2.entity('Thing').save()
Parameters:dataset (gcloud.datastore.dataset.Dataset) – The dataset to which this Transaction belongs.
add_auto_id_entity(entity)[source]

Adds an entity to the list of entities to be updated with automatically generated IDs.

When you call save() on an entity with a partial key inside a transaction, an insert_auto_id entry is added to the mutation. To make sure the entity is updated with its generated ID once the transaction is committed, the entity registers itself through this method (the order of registration is important).

begin()[source]

Begins a transaction.

This method is called automatically when entering a with statement, however it can be called explicitly if you don’t want to use a context manager.

commit()[source]

Commits the transaction.

This is called automatically upon exiting a with statement, however it can be called explicitly if you don’t want to use a context manager.

This method has necessary side-effects:

  • Sets the current connection’s transaction reference to None.
  • Sets the current transaction’s ID to None.
  • Updates paths for any keys that needed an automatically generated ID.
connection()[source]

Getter for the current connection over which the transaction will run.

Return type:gcloud.datastore.connection.Connection
Returns:The connection over which the transaction will run.
dataset()[source]

Getter for the current dataset.

Return type:gcloud.datastore.dataset.Dataset
Returns:The dataset to which the transaction belongs.
id()[source]

Getter for the transaction ID.

Return type:string
Returns:The ID of the current transaction.
mutation()[source]

Getter for the current mutation.

Every transaction is committed with a single Mutation representing the ‘work’ to be done as part of the transaction. Inside a transaction, calling save() on an entity builds up the mutation. This getter returns the Mutation protobuf that has been built-up so far.

Return type:gcloud.datastore.datastore_v1_pb2.Mutation
Returns:The Mutation protobuf to be sent in the commit request.
rollback()[source]

Rolls back the current transaction.

This method has necessary side-effects:

  • Sets the current connection’s transaction reference to None.
  • Sets the current transaction’s ID to None.

Helpers

Helper methods for dealing with Cloud Datastore’s Protobuf API.

gcloud.datastore.helpers.get_protobuf_attribute_and_value(val)[source]

Given a value, return the protobuf attribute name and proper value.

The Protobuf API uses different attribute names based on value types rather than inferring the type. This method simply determines the proper attribute name based on the type of the value provided and returns the attribute name as well as a properly formatted value.

Certain value types need to be coerced into a different type (such as a datetime.datetime into an integer timestamp, or a gcloud.datastore.key.Key into a Protobuf representation). This method handles that for you.

For example:

>>> get_protobuf_attribute_and_value(1234)
('integer_value', 1234)
>>> get_protobuf_attribute_and_value('my_string')
('string_value', 'my_string')
Parameters:val (datetime.datetime, gcloud.datastore.key.Key, bool, float, integer, string) – The value to be scrutinized.
Returns:A tuple of the attribute name and proper value type.
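
The type-to-attribute dispatch described above can be sketched with plain isinstance checks. The 'boolean_value' and 'double_value' names are assumptions by analogy with the documented 'integer_value' and 'string_value'; the datetime and Key coercion mentioned above is omitted:

```python
def get_protobuf_attribute_and_value(val):
    """Sketch of the type dispatch: choose the protobuf attribute name
    from the Python type of the value. Coercion of datetimes and Keys
    is omitted in this illustration."""
    if isinstance(val, bool):  # check bool first: bool subclasses int
        return 'boolean_value', val
    if isinstance(val, int):
        return 'integer_value', val
    if isinstance(val, float):
        return 'double_value', val
    if isinstance(val, str):
        return 'string_value', val
    raise ValueError('Unknown value type: %s' % type(val))

get_protobuf_attribute_and_value(1234)         # ('integer_value', 1234)
get_protobuf_attribute_and_value('my_string')  # ('string_value', 'my_string')
```
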
gcloud.datastore.helpers.get_value_from_protobuf(pb)[source]

Given a protobuf for a Property, get the correct value.

The Cloud Datastore Protobuf API returns a Property Protobuf which has one value set and the rest blank. This method retrieves the one value provided.

Some work is done to coerce the return value into a more useful type (particularly in the case of a timestamp value, or a key value).

Parameters:pb (gcloud.datastore.datastore_v1_pb2.Property) – The Property Protobuf.
Returns:The value provided by the Protobuf.