James Hamilton's Blog
 Thursday, May 29, 2008

Continued from Yesterday (day 1): Rough notes from Selected Sessions at Google IO Day 1.

 

Marissa Mayer Keynote: A Glimpse Under the Hood at Google

·         Showed iGoogle and talked about how Google Gadgets are a great way to get broad distribution and are a form of advertising.

·         Search is number 2 most used application (after email)

·         The ordinary and the everyday

·         Why is search page so simple?

·         Variation of Occam’s Razor: “the simple design is probably right”

·         Sergey did it and it was because “there was nobody else to do it and he doesn’t do HTML”

·         Described process of answering a query (700 to 1,000 machines in .16 seconds):

·         This time of day we’re busy so the query will likely go to one data center and likely get bounced to another (must be a simplification of what really happens – load balancing)

·         Mixer

·         Google Web Server

·         Ads + Websearch (300 to 400 systems)

·         Back to mixer

·         Back to Web server

·         Back to load balancer

·         Split A/B Testing:

·         We give a subset of users a different user experience. Web services allow very detailed views of the results and let you iterate and evolve very quickly.

·         Example: amount of white space under Google logo on results page?

·         This test showed convincingly that less white space is better than more (it produces more usage and more revenue)

·         Example: yellow or blue as background for paid ads

·         Yellow produced both more satisfaction and more revenue.

·         “If you don’t listen to your customers, someone else will” – Sam Walton

·         But you need to test rather than ask since they often don’t know.

·         Example: would you like 10, 20, or 30 results. Users unanimously wanted 30.

·         But 10 did way better in A/B testing (30 was 20% worse) due to lower latency of 10 results

·         30 is about twice the latency of 10 (I would have expected the other overheads to dominate.  Suggests there is another solution waiting to be found here).

·         Example: Maps was 120k for launch page.  We took 30 to 40k out.  Got a proportional increase in usage.

·         Example: Google Video uploads used to be 1 day to watch while YouTube offered “Watch it now”.  Much more compelling.

·         Urgent can drown out important

·         Users go from unskilled to skilled searchers very fast (under 1 month).  Consequently it’s better to optimize for experts, since most users are experts and novices get there quickly thanks to the fast feedback loop.

·         The lesson is to think longer term at all levels in design.

·         Think beyond the current development horizon.  10 years for major products and services.

·         Example: Universal search vs vertical search.  Users want verticals now but what they really want is universal search.  They just want to find the answer they are searching for.

·         Goog-411: don’t know if we can make money off this but it helps us develop voice recognition. Applications of voice recognition are monetizable so, even if Goog-411 doesn’t yield revenue, other applications will.

·         International content:

·         50% of the web is English but only about 1% of the web is Arabic

·         Conclusion: take an Arabic search, translate it, find relevant pages, then translate the results.  Opens up MUCH more content and dramatically improves the results for an Arabic user.

·         Larry Page: ”A Healthy Disrespect for the Impossible” opens up many possibilities.

·         Showed examples of how search is not generally “solvable” but getting to 90 to 95% has HUGE benefit. Search is a hard and unconstrained problem.  Same with health records.

·         Recommendation: Be Scrappy & revel in constraints

·         Google operates in 140 countries and 110 languages.  Described the complexity of pulling out text strings from a web site, sending out to translation, dealing with multiple string versions, etc.

·         Better solution: let the users help with the translated content.  If you don’t see your language, help us do it.  There are now ¼ million users helping with translation from all over the world.

·         Interesting little Easter egg:  one of the languages on the Google home page is “Bork! Bork! Bork!” – it’s the Swedish chef from the Muppets

·         Interesting little example: they took 11k Googlers to Indiana Jones last week

·         Marissa went through a bunch of examples of taking on the impossible and brainstorming possible solutions, showing that some just exercised their thinking and others produced cool products/solutions.  Explained that 20% time is just another way of exercising the brain (“Imagination as a muscle”).  Orkut, Google News, and, during one period, 50% of their new products came from 20% time.

·         Random note: What you last searched for is the best context signal for the current search.
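As a rough illustration of the split A/B testing described in the keynote notes above, here is a minimal Python sketch of one common way to give a stable subset of users a different experience and compare a metric between the two groups. Nothing here comes from the talk: the hashing scheme, the function names (`bucket`, `assign_variant`, `relative_change`), and the numbers in the usage note are all assumptions made for illustration.

```python
import hashlib


def bucket(user_id, experiment):
    """Map a user to a stable number in [0, 1) for one experiment.

    Hashing the experiment name together with the user id means the same
    user always lands in the same group for a given experiment.
    """
    raw = ("%s:%s" % (experiment, user_id)).encode("utf-8")
    digest = hashlib.sha256(raw).hexdigest()
    return int(digest[:8], 16) / 2**32


def assign_variant(user_id, experiment, treatment_fraction):
    """Return 'treatment' for roughly treatment_fraction of users, else 'control'."""
    if bucket(user_id, experiment) < treatment_fraction:
        return "treatment"
    return "control"


def relative_change(control_metric, treatment_metric):
    """Relative difference of the treatment metric against the control metric."""
    return (treatment_metric - control_metric) / control_metric
```

For example, `assign_variant("user-42", "results-per-page", 0.05)` would put about 5% of users in the treatment group, and `relative_change(100.0, 80.0)` expresses a made-up treatment metric that is 20% worse than control, the same shape of result as the 30-results test in the notes.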

 

James Hamilton, Windows Live Platform Services
Bldg RedW-D/2072, One Microsoft Way, Redmond, Washington, 98052
W:+1(425)703-9972 | C:+1(206)910-4692 | H:+1(206)201-1859 |
JamesRH@microsoft.com

H:mvdirona.com | W:research.microsoft.com/~jamesrh  | blog:http://perspectives.mvdirona.com

 

 

Thursday, May 29, 2008 8:59:35 AM (Pacific Standard Time, UTC-08:00)
Services
 Wednesday, May 28, 2008

Rough notes from the sessions I attended at Google IO.  The sessions are going to be available in Video so, if you want more detail (or more accuracy :-)), you can check out the videos.

 

Vic Gundotra Keynote:

·         2 hour session walking through entire conference material mostly with demos: Open Social, Google Web Toolkit, Android, Gears,

·         8 Main Conference Tracks with multiple concurrent sessions in each:

o   AJAX & Javascript

o   APIs & Tools

o   Maps & Geo

o   Mobile

o   Social

o   Code Labs

o   Tech Talks

o   Fireside Chat

·         All recorded and will be available publicly.

 

OpenSocial: A Standard for the Social Web

·         How do we socialize objects online without creating yet another social network (there are already at least 100)?

·         API for controlled exchange of Friends, Profiles, and Activities (the update stream)

·         Recommends Hal Varian’s (Google Chief Economist) ”Information Rules”

o   OpenSocial is an implementation of Chapter 8

·         Association of Google, MySpace, and Yahoo!

o   http://opensocial.org

·         More than 275M users of OpenSocial

·         How to build an OpenSocial application?

o   JavaScript Version 0.7 now and REST services coming soon

o   Three groupings of the API

§  People & friends

§  Activities

§  Persistence

o   Programming model is async. Send a request and set a callback function that gets called on completion.

o   Update of activity field: postActivity(text) – also supports setting priority

o   Example server side REST services:

§  /people/{guid}/@all: collection of all people connected to the user

§  /people/{guid}/@friends: friends

·         Main sell is to allow small sites to gain critical mass when friction of yet another login system and initial lack of users would have blocked.  Make it easier on users.

·         Showed a map of the world showing that different social networks have won in different geographies all over the world.

o   E.g. LiveJournal (Russia), Orkut (Brazil)

·         OpenSocial gets you to all their users so plan to localize your application (OpenSocial is designed to support localization)

·         OpenSocial Terms:

o   Container:  the site (Hi5, MySpace, etc.)

o   Owner: author/owner of the page

o   Viewer: person viewing the page

·         Apache Shindig is an open source implementation with a goal of allowing new sites to host open social applications in well under an hour.

·         Shindig is an Apache Incubator project: http://incubator.apache.org/shindig

·         Summary: make the web more social, current version is 0.7, and 0.8 includes REST.

·         OpenSocial has 11 sessions in addition to this one at Google IO.
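To make the async programming model and the REST paths from these notes a little more concrete, here is a small Python sketch. It is not the OpenSocial JavaScript API: `people_path`, `send_request`, and the in-memory `FAKE_CONTAINER` are hypothetical stand-ins, and only the `/people/{guid}/@all` and `/people/{guid}/@friends` path shapes come from the session. The point is just the pattern of sending a request and supplying a callback that runs on completion.

```python
from concurrent.futures import ThreadPoolExecutor
from urllib.parse import quote

# Hypothetical stand-in for a container's data; a real container would be
# reached over HTTP rather than looked up in a dict.
FAKE_CONTAINER = {
    "/people/1234/@friends": ["alice", "bob"],
    "/people/1234/@all": ["alice", "bob", "carol"],
}


def people_path(guid, selector):
    """Build a path shaped like /people/{guid}/@friends or /people/{guid}/@all."""
    return "/people/%s/@%s" % (quote(guid, safe=""), selector)


def fetch(path):
    """Pretend to fetch a collection from the container."""
    return FAKE_CONTAINER.get(path, [])


def send_request(pool, path, callback):
    """Send the request in the background and call callback(result) when done."""
    future = pool.submit(fetch, path)
    future.add_done_callback(lambda done: callback(done.result()))
    return future


results = []
with ThreadPoolExecutor(max_workers=2) as pool:
    send_request(pool, people_path("1234", "friends"), results.append)
# Leaving the "with" block waits for the request, so the callback has run here.
```

The caller never blocks on the fetch itself; it hands over a callback and carries on, which is the shape the session described for the JavaScript API.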

 

Google App Engine

·         This session was packed.  Others were quite lightly filled.

·         Google App Engine does one thing well

o   App engine handles HTTP requests, nothing else

o   Resources scale automatically

o   Highly scalable store based on BigTable

·         An application is a directory with everything underneath it

·         Single file app.yaml in app root directory

o   Defines app metadata

o   Maps URL patterns in regex to request handlers

o   Separates static files from program files

·         Dev Server (SDK) emulates deployment environment

·         Request Handlers:

o   Python script invoked as though it were a CGI script

o   Environment variables give request parameters

§  PATH_INFO

§  QUERY_STRING

§  HTTP_REFERER

o   Write response to stdout

·         Runtime is Python only but the fact that it is specified in app.yaml suggests that more will eventually be added.

·         Showed Django support and how to use GAE with Django

o   Showed a minimal main.py

§  import os; from google.appengine.ext.webapp import util, ….

o   Also showed minimal settings.py

·         Note: Existing Django apps will NOT port easily to GAE.
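The request handler model in these notes (a Python script run as though it were a CGI script, with request parameters in environment variables and the response written to stdout) can be sketched roughly as follows. This is a minimal illustration and not the App Engine SDK: the function name `handle_request` and the `name` query parameter are made up, and the code uses current Python 3 standard library calls rather than the runtime shown at the conference.

```python
import os
import sys
from urllib.parse import parse_qs


def handle_request(environ=os.environ, out=sys.stdout):
    """Handle one request CGI-style: read the environment, write to a stream."""
    # Request parameters arrive as environment variables.
    path = environ.get("PATH_INFO", "/")
    query = parse_qs(environ.get("QUERY_STRING", ""))
    referer = environ.get("HTTP_REFERER", "(none)")

    # "name" is a hypothetical query parameter used only for this example.
    name = query.get("name", ["world"])[0]

    # A CGI response is headers, a blank line, then the body.
    out.write("Content-Type: text/plain\n")
    out.write("\n")
    out.write("Hello, %s! path=%s referer=%s\n" % (name, path, referer))


if __name__ == "__main__":
    handle_request()
```

Passing the environment and output stream as arguments keeps the handler easy to exercise without a web server, for example with a plain dict and an `io.StringIO` buffer.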

 

Google Docs + Gears == Google Docs Offline

·         Google Docs Offline Architecture:

o   Document editor

o   Spreadsheet editor

o   Presentation editor

o   Authentication

o   Docs Home (doclist)

·         Overall, no big breakthroughs.  It’s just Docs offline but it’s work well done.

·         Challenges to disconnected operation:

o   Upgrade is a challenge: Now that code is being installed remotely, the server needs to support old code at least until the new code is pushed out and installed.

·         Possible solutions for static resources: fail to upgrade, sticky sessions, resource database, or serve the old version.

·         Solution implemented: resource database with a per-server cache

o   Rolling upgrade for HTML: hard code the offlineVersion and request it specifically – it will fail during rolling server upgrades but the speaker argued that it wasn’t worth the cost to avoid this failure.

o   Security: Decided not to do auth remotely and to rely on O/S facilities (if you have access to the data at the O/S level, you get access). But they do provide support for multiple users, since most power users have multiple personas (work and home at least).  Multi-user support is via putting the email address of the user in a cookie.  They have a loggedin and a loggedout manifest.  The loggedout manifest redirects to a dialog to choose one of your existing accounts. This either sets the loggedout cookie to an appropriate email address or fails. (The loggedin cookie doesn’t have an email address – it has the Google security context.)

·         Recommendations:

o   Need to provide debugging tools (online you can look at server logs – need something for offline)

o   Roll out initially to a small number of users

o   Support disabling offline experience for a user
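The upgrade problem above (servers must keep serving old static resources until every installed client has picked up the new code) and the chosen fix, a resource database with a per-server cache, might look something like the Python sketch below. The session gave no implementation detail, so everything here is an assumption: the class name `ResourceServer`, the `(version, path)` key, and the use of a plain dict as the shared resource database.

```python
class ResourceServer:
    """One web server that can serve any version of a static resource.

    resource_db stands in for a shared store keyed by (version, path).
    Each server keeps its own cache so repeated requests skip the shared
    store.
    """

    def __init__(self, resource_db):
        self.resource_db = resource_db
        self.cache = {}

    def get(self, version, path):
        key = (version, path)
        if key not in self.cache:
            # Cache miss: load the requested version from the shared store,
            # even if this server itself is already running newer code.
            self.cache[key] = self.resource_db[key]
        return self.cache[key]


# Hypothetical contents: two versions of the same file live side by side.
shared_db = {
    ("v1", "/editor.js"): b"old editor code",
    ("v2", "/editor.js"): b"new editor code",
}
server = ResourceServer(shared_db)
```

With this shape, a client still running the `v1` code keeps getting `v1` resources from any server during a rolling upgrade, while upgraded clients ask for `v2`.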

 

Under the Covers of the App Engine Datastore

·         Speaker: Ryan Barrett: App engine Data Store Lead

·         Bigtable in one slide:

o   Scalable structured store

o   Types on each value

o   Single row transactions

o   Two types of scans: 1) prefix (physically contiguous), 2) range scan (also physically contiguous)

·         The entities table:

o   Primary GAE table

o   Stores all entities in all apps

o   Generic and schemaless

o   Row name is entity key

o   Only column is serialized entity

·         Entity key is based on parent entities (root to child, to child, etc.)

o   Note: Can’t change a primary key but can delete and create a new entity with new key

·         Queries and indexes:

·         GQL: Google Query Language

o   A tiny subset of SQL.  Most clauses restricted. Added the Ancestor clause.

·         Bigtable only supports scans.  No sorting and no filtering.

o   Because they have no knowledge of the app or data shape, they convert all queries to scans since that is all BigTable can do.

o   Indexes:

§  Kind Index (kind, key) where kind is child, grandparent, parent, …

§  Single-property index (kind, name, value, key): Serves queries on a single property. (There are two indexes: ascending and descending.)

§  Composite index: defined by the user in index.yaml (generated by the dev environment if you run queries over all needed composite types).

o   All index comparisons are lexicographic

o   They support index intersection.  For example, multiple equals filters, or an equals filter plus an ancestor restriction (just do index ANDing).

·         Index space consumption is not charged for since they don’t want to make people go to considerable pain to avoid using, for example, composite indexes.  Ryan went on to explain that this is what he “wants” but it is not a committed decision.

·         If a query can’t be satisfied with a range scan, the query will fail (need index exception).

·         Transaction model: all writes are transactional

o   All writes are written to journal with timestamp

o   No locking – they use optimistic concurrency control.

o   Each entity has a last committed time.  All reads access last committed time.  All writes check to ensure last committed hasn’t changed. The committed timestamp is only updated after the full value is written out and the log entry is written. The log entry is a big table row and each row supports atomic writes.  He didn’t provide enough detail to fully debug/understand the commit protocol implementation.

o   You define entity groups, which are defined by the root entity: all descendants are in the same entity group.  Only the root has the timestamp.

o   He did say that all writes to an entity group are serialized, so make the entity groups small.
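Two ideas from these notes lend themselves to a toy illustration: answering a query with one contiguous scan over lexicographically ordered index rows, and committing writes with optimistic concurrency control against a last committed timestamp. The Python sketch below is an in-memory model only. It is not Bigtable or the App Engine datastore API; the names `build_single_property_index`, `scan_equal`, `EntityGroup`, and `ConflictError` are invented for the example, and the real commit protocol was not described in enough detail to reproduce.

```python
import bisect


def build_single_property_index(entities, prop):
    """Return sorted (kind, name, value, key) rows for one property.

    Values are stored as strings so that ordering is lexicographic.
    """
    rows = [(e["kind"], prop, str(e[prop]), e["key"])
            for e in entities if prop in e]
    rows.sort()
    return rows


def scan_equal(index_rows, kind, prop, value):
    """Prefix scan: keys of all rows that start with (kind, prop, value)."""
    prefix = (kind, prop, str(value))
    start = bisect.bisect_left(index_rows, prefix)
    keys = []
    for row in index_rows[start:]:
        if row[:3] != prefix:
            break  # rows are contiguous, so the first mismatch ends the scan
        keys.append(row[3])
    return keys


class ConflictError(Exception):
    """Raised when another commit landed after this transaction started."""


class EntityGroup:
    """Toy entity group: one timestamp on the root guards the whole group."""

    def __init__(self):
        self.last_committed = 0
        self.data = {}
        self.journal = []

    def begin(self):
        """Start a transaction by remembering the current committed time."""
        return self.last_committed

    def commit(self, read_timestamp, writes):
        """Apply writes only if nothing was committed since read_timestamp."""
        if read_timestamp != self.last_committed:
            raise ConflictError("entity group changed; retry the transaction")
        new_timestamp = self.last_committed + 1
        self.journal.append((new_timestamp, dict(writes)))  # journal first
        self.data.update(writes)
        self.last_committed = new_timestamp  # timestamp moves last
        return new_timestamp
```

Because every commit in a group checks the same timestamp, two concurrent writers to one group cannot both succeed, which is one way to see why the advice was to keep entity groups small.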

 

Working with Google App Engine Models:

·         Speaker: Rafe Kaplan

·         Other object relational mapping systems:

o   ActiveRecord

o   Django

o   Hibernate

·         Does not map to an RDBMS

·         No pre-existing schema

·         No joins, no aggregates, and no functions

·         Showed how to model relationships

 


 

Wednesday, May 28, 2008 5:14:59 PM (Pacific Standard Time, UTC-08:00)
Services
 Friday, May 23, 2008

On Wednesday, Yahoo announced that they have built a petascale, distributed relational database.  In Yahoo Claims Record With Petabyte Database, the details are thin, but they built on the PostgreSQL relational database system. In Size matters: Yahoo claims 2-petabyte database is world's biggest, busiest, the system is described as an over 2 petabyte repository of user click stream and context data with an update rate of 24 billion events per day.  Waqar Hasan, VP of Engineering at Yahoo! Data group, describes the system as updated in real time and live – essentially a real time data warehouse where changes go in as they are made and queries always run against the most current data. I strongly suspect they are bulk parsing logs and pushing the data into the system in large bulk units but, even if it is only near real time, this update rate is impressive.

 

The original work was done at a Seattle startup called Mahat Technologies acquired by Yahoo! in November 2005.

 

The approach appears to be similar to what we did with IBM DB2 Parallel Edition.  13 years ago we had it running on a cluster of 512 RS/6000s at the Maui Super Computer Center and 256 nodes at the Cornell Theory Center.  It’s a shared nothing design, which means that each server in the cluster has independent disks and doesn’t share memory with the others. The upside of this approach is that it scales incredibly well. It looks like Yahoo! has done something similar using PostgreSQL as the base technology.  Each node in the cluster runs a full copy of the storage engine.  The query execution engine is replaced with one modified to run over a cluster and use a communications fabric to interconnect the nodes in the cluster.  The parallel query plans are run over the entire cluster with the plan nodes interconnected by the communication fabric.  The PostgreSQL client, communications protocol, and server side components, with some big exceptions, run mostly unchanged.  The query optimizer is handled in one of two ways. Either it is replaced completely with a cluster parallel aware implementation that models the data layout and cluster topology in making optimization decisions, or the original, non-cluster parallel optimizer is used and the resultant single node plans are then optimized for the cluster in a post optimization phase. The former can yield better plans but it’s also more complex. I’m fearful of complexity around optimizers and, as a consequence, I actually prefer the slightly less optimal, post-optimization phase.  Many other problems have to be addressed, including having the cluster metadata available on each node to support SQL query compilation, but what I’ve sketched here covers the major points required to get such a design running.

 

The result is a modified version of PostgreSQL runs on each node.  A client can connect to any of the nodes in the cluster (or a policy restricted subset).  A query flows from the client to the server it chose to connect with. The SQL compiler on that node compiles and optimizes the query on that single node (no parallelism). The query optimizer is either cluster-aware or uses a post-optimization cluster-aware component.  The resultant query plan when ready for execution is divided up into sub-plans (plan fragments) that run on each node connected over the communication fabric.  Some execution engines initiate top-down and some bottom up. I don’t recall what PostgreSQL uses but bottom-up is easier in this case.  However, either can be made to work.  The plan fragments are distributed to the appropriate nodes in the cluster.  Each runs on local data and pipes results to other nodes which run plan fragments and forward the results yet again toward the root of the plan. The root of the plan runs on the node that started the compilation and the final results end up there to be returned to the client.
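To make the plan fragment flow concrete, here is a toy Python sketch of a shared nothing scatter-gather for a query along the lines of SELECT url, COUNT(*) ... GROUP BY url. It is only an illustration under stated assumptions: the function names, the CRC-based partitioning on user, and the in-process lists standing in for nodes are all made up, a simple loop stands in for parallel execution over the communication fabric, and none of this reflects how Yahoo! or DB2 Parallel Edition actually implement it.

```python
import zlib
from collections import Counter


def node_for(key, num_nodes):
    """Pick the node that owns a row by hashing its partitioning key."""
    return zlib.crc32(key.encode("utf-8")) % num_nodes


def load(rows, num_nodes):
    """Spread (user, url) rows across nodes; each node holds only its own rows."""
    nodes = [[] for _ in range(num_nodes)]
    for user, url in rows:
        nodes[node_for(user, num_nodes)].append((user, url))
    return nodes


def fragment_count_by_url(local_rows):
    """Plan fragment: runs on one node, over that node's local data only."""
    return Counter(url for _user, url in local_rows)


def root_merge(partials):
    """Root of the plan: combine the partial results sent up by the fragments."""
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total


def run_query(nodes):
    """Run the fragment on every node, then merge at the coordinating node."""
    partials = [fragment_count_by_url(local_rows) for local_rows in nodes]
    return root_merge(partials)
```

Each fragment touches only local data and ships a small partial result toward the root, which is why adding nodes adds capacity without adding contention on shared disk or memory.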

 

It’s a nice approach and, as evidenced by Yahoo’s experience, it scales, scales, scales.  I also like the approach in that most tools and applications can continue to work with little change.  Most clusters of this design have some restrictions: for example, unique ID generation is either not supported or slow, as is referential integrity.  Nonetheless, a large class of software can be run without change.

 

If you are interested in digging deeper into Relational Database technology and how the major commercial systems are written, see Architecture of a Database System.

 

Yahoo has a long history of contributing to Open Source and they are the largest contributor to the Apache Hadoop project. It’ll be interesting to see if Yahoo! Data ends up open source or held as an internal only asset.

 

Kevin Merritt pointed me to the Yahoo! Data work.

 

                                                -jrh

 


Friday, May 23, 2008 6:22:38 AM (Pacific Standard Time, UTC-08:00)
Software
 Wednesday, May 21, 2008

Search drives the online commerce world by bringing sellers and buyers together.  As a seller, your most important task is getting your site to rank high organically and having your advertisements placed most prominently and most frequently in front of users interested in buying, and only in front of users interested in your product.  A buyer chooses a search engine on the basis of which one more reliably gets them to what they are looking for and, with commercial queries, gets them to the “best” seller, where best is a fairly complex and hard to define term in this context.  Happy buyers keep using the search engine and paying the sellers.  Sellers who manage their organic and paid placements correctly sell lots of product.  Successful search engines make considerable profit.  That’s just the way the ecosystem has evolved – it’s the broadly used search engine that has all the influence and so they end up with considerable profit.

 

What if the rules changed?  What if some of the search engine profit was returned to users?  Could this change the ecosystem and could it be a good thing?  Let’s watch because Microsoft is about to announce a “cash back service” later today according to Search Engine Land.  In this posting, Playing with Live Cashback, the blog author demonstrates using the Live Cashback system and concludes that it won’t have much impact.  I’m less certain.  I suspect that respecting users and returning some value to them will change this market in a positive way. It’ll be fun to watch over the next 4 to 6 weeks and see