Under the Covers of Google App Engine Datastore

My notes from an older talk given by Ryan Barrett on the Google App Engine Datastore at Google I/O last year (5/28/2008). Ryan is a co-founder of the App Engine team.

· App Engine Datastore is built on BigTable.

o Scalable structured storage

o Not a sharded database

o Not an RDBMS (MySQL, Oracle, etc.)

o Not a Distributed Hash Table (DHT)

o It IS a sharded sorted array

· Supported operations:

o Read

o Write

o Delete

o Single row transactions (optimistic concurrency control).

o Scans:

1. Prefix scan

2. Range scan
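
To make the "sharded sorted array" idea concrete, here is a minimal single-shard sketch of the operations listed above. The class name SortedTable and its method names are my own for illustration and are not BigTable's API; sharding, persistence, and transactions are left out.

```python
import bisect


class SortedTable:
    """Toy stand-in for one shard of a sorted table (illustrative only)."""

    def __init__(self):
        self._keys = []  # row keys, always kept in sorted order
        self._rows = {}  # row key -> value

    def write(self, key, value):
        if key not in self._rows:
            bisect.insort(self._keys, key)
        self._rows[key] = value

    def read(self, key):
        return self._rows.get(key)

    def delete(self, key):
        if key in self._rows:
            del self._rows[key]
            self._keys.pop(bisect.bisect_left(self._keys, key))

    def range_scan(self, start, end):
        """Yield (key, value) for start <= key < end, in key order."""
        lo = bisect.bisect_left(self._keys, start)
        hi = bisect.bisect_left(self._keys, end)
        for key in self._keys[lo:hi]:
            yield key, self._rows[key]

    def prefix_scan(self, prefix):
        """Yield (key, value) for every key that starts with prefix."""
        lo = bisect.bisect_left(self._keys, prefix)
        for key in self._keys[lo:]:
            if not key.startswith(prefix):
                break  # keys are sorted, so the first miss ends the scan
            yield key, self._rows[key]
```

Because the keys are sorted, both scans touch only a contiguous run of rows, which is why every higher-level query has to reduce to one of them.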

· Primary object: Entity

o Stored in entity table

o Each row has a name, and the row name is fully qualified: /root/parent/entity/child

o Each entity has a parent or is a root entity and may have child entities

o Primary key is the fully qualified name and this can’t change

o An entity can’t be reparented (it can be deleted and created with a different parent)
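
A small sketch of how fully qualified names behave, assuming each path element is written as kind:name. That encoding and the helper names below are my assumptions for illustration; the real key format is not described in these notes.

```python
def make_key(*path):
    """Build a fully qualified row name from (kind, name) pairs, root first."""
    return "/" + "/".join(f"{kind}:{name}" for kind, name in path)


def parent_of(key):
    """Return the parent's key, or None if key names a root entity."""
    head, _, _ = key.rpartition("/")
    return head or None


def root_of(key):
    """Return the key of the root entity at the top of this path."""
    return "/" + key.split("/")[1]


def is_ancestor(ancestor, key):
    """True if ancestor is a proper ancestor of key (a path-prefix test)."""
    return key.startswith(ancestor + "/")


child = make_key(("Grandparent", "Ethel"), ("Parent", "Jane"), ("Child", "Timmy"))
```

Since the parent chain is baked into the row name, changing the parent would change the primary key, which is consistent with reparenting not being allowed.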

· Queries:

o Queries can be filtered on kind and Ryan says kind “is like a table” (kind can be parent, child, grandparent, …)

o Queries can be filtered on ancestor

o Query language is GQL (presumably Google Query Language) which is a small subset of SQL

o All queries must be expressible as range or prefix scans (no sort, orderby, or other unbounded size operations supported)
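
To illustrate the "every query is a scan" rule, here is a rough sketch that answers a one-filter query from a sorted list of index rows. The row layout (kind, property, value, entity key) and the function run_query are assumptions of mine, not the datastore's actual planner; values of one property are assumed to share a comparable type.

```python
import bisect


def run_query(index, kind, prop, op, value):
    """Answer `kind WHERE prop <op> value` with one contiguous scan.

    index is a sorted list of (kind, prop, value, entity_key) tuples.
    """
    prefix = (kind, prop)
    if op == "=":
        # Equality is a prefix scan on (kind, prop, value).
        start = bisect.bisect_left(index, (kind, prop, value))
        in_range = lambda row: row[:3] == (kind, prop, value)
    elif op == ">=":
        # Range scan from the value to the end of this (kind, prop) run.
        start = bisect.bisect_left(index, (kind, prop, value))
        in_range = lambda row: row[:2] == prefix
    elif op == "<":
        # Range scan from the start of the (kind, prop) run up to the value.
        start = bisect.bisect_left(index, prefix)
        in_range = lambda row: row[:2] == prefix and row[2] < value
    else:
        raise ValueError(f"unsupported operator: {op}")

    results = []
    for row in index[start:]:
        if not in_range(row):
            break  # matching rows are contiguous, so stop at the first miss
        results.append(row[3])
    return results
```

The work done is proportional to the number of rows returned rather than to the size of the table, which is the point of the restriction.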

· Secondary index implementation:

o Indexes are also implemented as BigTable tables

o Kind Index:

· Contents: (kind, key)

o Single property index:

· Contents: (kind, name, value)

· Two copies of this index maintained: 1) ascending, and 2) descending

o Composite indexes:

· Contents: (kind, value, value)

· Supports multi-property indexes

· Built on programmer request, not on use (a query returns an error if the required index doesn’t exist)

· Programmer can specify what composite indexes are needed in index.yaml

· SDK creates composite index specs automatically in index.yaml as queries are run
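
As a rough illustration of the three index types, the sketch below lists the index rows that might be written for one entity. Appending the entity key to each row, the function names, and MissingIndexError are my assumptions; the notes above give only the leading columns.

```python
class MissingIndexError(Exception):
    """Hypothetical error for a query whose composite index was never defined."""


def rows_for_entity(kind, key, props, composite_specs):
    """Return (kind_rows, property_rows, composite_rows) for one entity."""
    kind_rows = [(kind, key)]
    # One row per property; the descending copy would hold the same rows
    # sorted in the opposite order.
    property_rows = [(kind, name, value, key) for name, value in sorted(props.items())]
    # One row per defined composite index whose properties are all present.
    composite_rows = [
        (kind,) + tuple(props[name] for name in spec) + (key,)
        for spec in composite_specs
        if all(name in props for name in spec)
    ]
    return kind_rows, property_rows, composite_rows


def require_composite(kind, spec, defined):
    """Fail the query up front when no matching composite index is defined."""
    if (kind, tuple(spec)) not in defined:
        raise MissingIndexError(f"no composite index on {kind} for {tuple(spec)}")


kind_rows, property_rows, composite_rows = rows_for_entity(
    "Person", "/Person:a", {"age": 30, "city": "Seattle"}, [("city", "age")]
)
```

Kind and single-property rows are produced for every entity automatically, while composite rows appear only for the specs that have been defined, mirroring the "built on request" behavior.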

· Entity group

o Supports multi-entity update

· Defined by root entity (all entities under a root are an entity group)

· All journaling and transactions done at root
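
A minimal sketch of optimistic concurrency at the entity group level, assuming the root carries a version counter that every committed transaction bumps. The counter, the EntityGroup class, and ConflictError are my illustrative assumptions; the notes say only that journaling and transactions are done at the root.

```python
class ConflictError(Exception):
    """Raised when another transaction committed to the group first."""


class EntityGroup:
    """Toy entity group: all state changes go through the root's version."""

    def __init__(self):
        self.version = 0    # bumped once per committed transaction
        self.entities = {}  # entity key -> properties

    def begin(self):
        """Start a transaction by remembering the version that was read."""
        return self.version

    def commit(self, read_version, updates):
        """Apply a multi-entity update only if nobody committed in between."""
        if read_version != self.version:
            raise ConflictError("entity group changed since begin(); retry")
        self.entities.update(updates)
        self.version += 1


group = EntityGroup()
first = group.begin()
second = group.begin()  # a concurrent transaction on the same group
group.commit(first, {"/Order:1": {"total": 10}, "/Order:1/Item:1": {"qty": 2}})
```

Here the transaction that began at `second` would fail on commit and have to retry, since nothing is locked and only the version check at commit time detects the conflict.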

· Text and Blobs:

o Not indexed; all other properties are

James Hamilton, Amazon Web Services
