Google App Engine: Using the List Property (db.ListProperty)

I. Many-to-One Relationship with Reference Property

A. Models

class UserInfo(db.Model):
  """key name = user id"""

class Story(db.Model):
  content = db.TextProperty()

class Favorite(db.Model):
  user_info = db.ReferenceProperty(UserInfo)
  story = db.ReferenceProperty(Story)

B. Request Handler

Note: The following code works, but it issues a separate datastore read for every favorite.

user_info = UserInfo.get_by_key_name(users.get_current_user().user_id())
# get all favorites
favorites = user_info.favorite_set.fetch(1000) # equivalent to Favorite.all().filter('user_info =', user_info.key())
# get all favorite stories
story_keys = [f.story.key() for f in favorites]
stories = db.get(story_keys)

Accessing f.story dereferences the reference, so f.story.key() loads the whole story object from the datastore before returning its key. Because this happens inside the list comprehension, the stories are fetched one by one.

To get the key without retrieving the object, use this instead:

user_info = UserInfo.get_by_key_name(users.get_current_user().user_id())
# get all favorites
favorites = user_info.favorite_set # equivalent to Favorite.all().filter('user_info =', user_info.key())
# get all favorite stories
story_keys = [Favorite.story.get_value_for_datastore(f) for f in favorites.fetch(1000)] 
stories = db.get(story_keys)

Using get_value_for_datastore() here retrieves the story key without a datastore lookup of the story object. See this discussion.
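
To see why this matters, here's a rough plain-Python model of a lazy reference. It does not use the real db API: FakeDatastore and LazyReference are made-up names for illustration, and the real property does more than this. The point is only that reading the attribute on an instance costs a read, while asking the property on the class for the stored value costs none:

```python
# Plain-Python model of a lazy reference property (NOT the real db API).

class FakeDatastore(object):
    """Dictionary-backed stand-in that counts entity reads."""
    def __init__(self):
        self.entities = {}
        self.reads = 0

    def get(self, key):
        self.reads += 1
        return self.entities[key]


class LazyReference(object):
    """Descriptor that mimics a reference property's lazy dereference."""
    def __init__(self, store, name):
        self.store = store
        self.attr = '_' + name + '_key'

    def __get__(self, instance, owner):
        if instance is None:
            return self  # class access, e.g. Favorite.story
        # Instance access reads the referenced entity from the store.
        return self.store.get(getattr(instance, self.attr))

    def get_value_for_datastore(self, instance):
        # Return the stored key only; no read is issued.
        return getattr(instance, self.attr)


store = FakeDatastore()


class Story(object):
    def __init__(self, key, content):
        self.key = key
        self.content = content


class Favorite(object):
    story = LazyReference(store, 'story')

    def __init__(self, story_key):
        self._story_key = story_key


for n in range(3):
    store.entities['story-%d' % n] = Story('story-%d' % n, 'text %d' % n)
favorites = [Favorite('story-%d' % n) for n in range(3)]

# Slow path: every f.story access loads a Story just to read its key.
slow_keys = [f.story.key for f in favorites]
reads_slow = store.reads

# Fast path: ask the property (on the class) for the raw key.
store.reads = 0
fast_keys = [Favorite.story.get_value_for_datastore(f) for f in favorites]
reads_fast = store.reads
```

Both lists hold the same keys, but in this model the slow path costs one read per favorite and the fast path costs none.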

C. Why Use This?

Calling user_info.favorite_set (a back-reference) returns a Query instance that you can order and filter.

If your Favorite model had a date property you could do this:

favorites = user_info.favorite_set.order('date')


II. Many-to-One Relationship with List Property

A. Models

class UserInfo(db.Model):
  """key name = user id"""
  favorites = db.ListProperty(db.Key) #list of keys of Story entities

class Story(db.Model):
  content = db.TextProperty()

B. Request Handler

user_info = UserInfo.get_by_key_name(users.get_current_user().user_id())
# get all favorites
favorites_keys = user_info.favorites
# get all favorite stories
stories = db.get(favorites_keys)

C. Why Use This?

The code is definitely cleaner, but you lose some flexibility.

1. You can't order or filter the favorites before retrieving them (although they'll probably be in chronological order, because that's the order in which they were added).

2. The whole list of story keys is loaded whenever you fetch user_info. If a user has favorited a thousand stories, you have a list of a thousand story keys on your hands. You can fetch a limited number of these stories (e.g. db.get(favorites_keys[:5])), but with the first approach you never run into this problem: you query only the favorites you want (e.g. user_info.favorite_set.fetch(5)).

3. With the code as it is, a story has no back-reference to the users who favorited it. In the first approach it does (e.g. story.favorite_set).

You can still find all users who favorited a story by filtering on the list property, like this:

favusers = UserInfo.all().filter('favorites =', somestory.key()).fetch(1000)
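
Here's a small plain-Python sketch of the two workarounds above: slicing the key list to page through favorites, and finding users by list membership. The dict and function names are hypothetical stand-ins, not datastore calls, and the real datastore answers the membership filter from an index rather than looping over users:

```python
# Hypothetical stand-in for UserInfo entities: user id -> list of story keys.
user_favorites = {
    'alice': ['story-1', 'story-2', 'story-3'],
    'bob': ['story-2'],
    'carol': [],
}


def favorites_page(user_id, page, page_size=2):
    """Slice the already-loaded key list, like favorites_keys[:5] above."""
    keys = user_favorites[user_id]
    start = page * page_size
    return keys[start:start + page_size]


def users_who_favorited(story_key):
    """Mimic the result of filter('favorites =', key) with a membership test."""
    return sorted(
        user_id
        for user_id, keys in user_favorites.items()
        if story_key in keys
    )


first_page = favorites_page('alice', 0)
second_page = favorites_page('alice', 1)
fans = users_who_favorited('story-2')
```

Note that the paging only limits how many stories you fetch afterwards; the full key list has already been loaded along with the user entity.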


III. Many-to-One Relationship with List Property (avoid reading the list)

There were some problems with the second approach, even though it's a much simpler design.

Here's another way to approach the problem (from Google I/O: Building Scalable, Complex Apps on App Engine):

A. Models

class FavoriteIndex(db.Model):
  """parent is a Story entity"""
  favorited_by = db.StringListProperty() # list of user ids

class Story(db.Model):
  content = db.TextProperty()

B. Request Handler

index_keys = FavoriteIndex.all(keys_only=True).filter('favorited_by =', users.get_current_user().user_id())
# get keys of favorite stories
story_keys = [k.parent() for k in index_keys] # Key.parent() returns the key of the parent entity
# get all favorite stories
stories = db.get(story_keys)

C. How Does This Work?

Internally, the datastore represents a list property value as multiple values for the property. (source: List Property documentation)

So a query on the 'favorited_by' list property doesn't have to traverse the list: it's as if each item in the list were indexed separately. And since you're only returning the keys of the FavoriteIndex entities (i.e. FavoriteIndex.all(keys_only=True)), you never have to deserialize the list.

After you get the keys of the FavoriteIndex entities that you want it's easy to extract their parents' keys. And then we get the Story instances for the corresponding keys.
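
A rough way to picture this, in plain Python rather than the datastore API: treat the index as one row per list element, and treat a key as a path whose parent is the same path minus its last step. ListIndex and parent_of are hypothetical names, and keys are simplified to tuples of (kind, id) pairs:

```python
from collections import defaultdict


def parent_of(key):
    """Drop the last (kind, id) pair, similar in spirit to Key.parent()."""
    return key[:-1]


class ListIndex(object):
    """Toy index: one row per list element, mapping value -> entity keys."""
    def __init__(self):
        self.rows = defaultdict(list)

    def put(self, key, values):
        # Each distinct list element gets its own index row.
        for value in set(values):
            self.rows[value].append(key)

    def keys_only(self, value, limit=None):
        # Only keys come back; the stored list itself is never touched.
        matches = self.rows.get(value, [])
        return matches[:limit]


index = ListIndex()
index.put((('Story', 1), ('FavoriteIndex', 'favs')), ['alice', 'bob'])
index.put((('Story', 2), ('FavoriteIndex', 'favs')), ['bob'])
index.put((('Story', 3), ('FavoriteIndex', 'favs')), ['alice'])

index_keys = index.keys_only('alice')
story_keys = [parent_of(k) for k in index_keys]
```

In this model, looking up 'alice' touches only the rows for 'alice', no matter how many other user ids each story's list contains.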

D. Why Use This?

1. Clean, simple code. We got rid of the UserInfo model because we're using the user ids of the built-in User objects.

2. The list property is never deserialized at query time, unlike in the second approach, which makes reads much faster.

3. You can request a limited number of stories without serializing/deserializing the list (e.g. FavoriteIndex.all(keys_only=True).filter('favorited_by =', users.get_current_user().user_id()).fetch(5))

4. You can easily get access to all of the user ids of people who favorited a particular story. Here's how:

index = FavoriteIndex.all().ancestor(story).get()
user_ids = index.favorited_by

Note: One of the downsides to this approach is that you still can't order or filter favorite stories before you retrieve them.

E. Some Tips

1. a. If you want to be able to quickly figure out if someone favorited a story: for each favorite, create a 'UserFavorite' entity as a child entity of the relevant Story entity, with the key name set to the user's user id. This way, you can determine if a user has favorited a story with a simple get:

UserFavorite.get_by_key_name(user_id, parent=a_story)

(source: StackOverflow answer)
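
Conceptually, this turns the "did this user favorite this story?" question into a lookup by full key. A minimal plain-Python sketch of that idea, with a dict standing in for the datastore and all names hypothetical:

```python
# Hypothetical stand-in for the datastore: full key path -> entity.
entities = {}


def favorite_key(story_id, user_id):
    """Key path of a UserFavorite child entity under its Story parent."""
    return ('Story', story_id, 'UserFavorite', user_id)


def add_favorite(story_id, user_id):
    # One small entity is written per favorite.
    entities[favorite_key(story_id, user_id)] = {}


def has_favorited(story_id, user_id):
    # A direct key lookup; no query is involved.
    return favorite_key(story_id, user_id) in entities


add_favorite(1, 'uid-alice')
add_favorite(2, 'uid-bob')
```

The trade-off in this design is one extra entity written per favorite in exchange for a cheap membership check.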

1. b. An alternative to the above (untested by me):

Give all the 'FavoriteIndex' objects the key name 'favs'

#retrieve a story
# story = ...
#get current user's id
uid = users.get_current_user().user_id()
#build the key of the story's FavoriteIndex (key() is a method, so call it)
fav_index_key = db.Key.from_path('Story', story.key().id_or_name(), 'FavoriteIndex', 'favs')
#perform a keys-only query (faster than a normal query)
#filter by the favorite index key that was just built and the user's id
#get() returns a key if the user favorited the story, otherwise None
is_fav_story = FavoriteIndex.all(keys_only=True).filter('__key__ =', fav_index_key).filter('favorited_by =', uid).get() is not None

This option is really nice because it doesn't require as many writes to the datastore (or as much datastore space), but it will probably be a little slower than the first option outlined in 1a.

2. If you want to be able to quickly access a user's information, make an entity to store the information in (e.g. UserInfo) and set its key name equal to the user's id. This will make it easy to, for example, look up the usernames of all the users who favorited a story.

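index = FavoriteIndex.all().ancestor(story).get()
# named user_infos so it doesn't shadow the imported users module
user_infos = UserInfo.get_by_key_name(index.favorited_by)
for user_info in user_infos:
  print user_info.name # assumes UserInfo has a name property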
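
One thing to watch for: as far as I know, when you pass a list of key names, get_by_key_name() returns None in the slot of any key name that has no entity, so the loop should skip those. A plain-Python sketch of that behaviour, where the dict and get_by_key_names are hypothetical stand-ins rather than the db API:

```python
# Hypothetical stand-in for the UserInfo kind: key name -> entity dict.
user_info_table = {
    'uid-1': {'name': 'alice'},
    'uid-2': {'name': 'bob'},
}


def get_by_key_names(key_names):
    """Batch lookup that keeps positions, with None for missing entities."""
    return [user_info_table.get(name) for name in key_names]


favorited_by = ['uid-1', 'uid-404', 'uid-2']
results = get_by_key_names(favorited_by)

# Skip the None placeholders before reading any attributes.
names = [info['name'] for info in results if info is not None]
```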

