James Hamilton's Blog RSS 2.0
 Friday, June 06, 2008

There was an interesting talk earlier today at Microsoft Research by Jason Cong of the UCLA Computer Science Department on compiling design specifications in C/C++/SystemC and user constraints into ASIC and FPGA design. The advantage of compiler based approaches include, more productivity working at a higher level, automating verification, allows optimization, and allows rapid experimentation at different frequencies and different optimization goals (performance vs power, for example). As design complexity increases higher level language and optimization have excellent potential.  Essentially the same thing is happening in hardware design as happened 30 years ago in operating systems implementation languages. High level implementation languages replace lower level as complexity climbs.  For example, the Intel Core 2 Duo processor is a 1B transistor implementation whereas the 386 was 1/1000th of that complexity at 1M.

 

Also super interesting was the example from a financial institution that is taking a software based stock analysis system where they take the hottest parts of the system and compile these to FPGA implementations. 30x faster at 1/10th the power. Very cool.  

 

Now that AMD supports hyper transport it is possible to implement custom processors with excellent overall system performance.  Intel has opened up the FSB and is also expected to offer a non-compatible hyper transport-like implementation in the future.

 

My rough notes from the talk follow:

 

·         Speaker: Jason Cong, UCLA Computer Science (cong@cs.ucla.edu)

·         Working on on-chip interconnects & communications

o   3D IC design

o   RF-interconnects

§  Note that power restrictions restrict processors to ~5GH

·         But, communications lines can scale to 100s of GH

·         Dividing communications link into 10 or more “channels” that operate at different frequencies

·         This talk focused on  ESL SystemC to FPGA compiler

·         Why?

o   700,000 lines of RTL for a 10M gate design is too much

o   Allows executable specification

o   Verification requires executable design

o   Accelerated computing or reconfigurable computing also need C/C++ based compilation/synthesis to FPGAs

§  CPUs coupled with FPGA to support common functions at high performance and lower power

·         Note that performance limited by communications (getting data to the CPU)

o   Long wires that have to be traversed in a single clock are the limiting factor

o   This research focuses on supporting multi-cycle communications

·         xPilot: Behavior-to-RTL (Register Transfer Level design) synthesis flow

o   takes behavior spec in C/SystemC to front-end compiler to SSDM

o   SSDM is optimized using standard compiler optimization (loop unrolling, strength reduction, scheduling, etc.)

o   SSDM is compiled to:

§  Verilog/VHDL/SystemC

§  FPGAs: Altera, Xilinx

§  ASICs: Magma, Synopsys

o   UPS: Uniform Power Specification

o   During final compilation optimize for power and shut off compute units that are not being used and shut off those that are being during idle periods (a busy disk controller is frequently waiting for mechanicals and not need to execute instructions)

§  Can’t shut FPGAs but can with ASICs

§  The only solution to dynamic power leakage only solution is shut the component of

o   Allows faster experimentation than hand coding.  You can try different frequencies and different power optimizations (too complex for most humans)

o   Scheduling (allocation of operations to compute logic and specific clock cycles) is NP-complete and automated techniques can exceed quality of expert designs

·         Example:

o   Schedule the behavior to RTL using the following characterization, cycle time, constraints, and objectives:

§  Platform characterization: adder (2ns) & multiplier (5ns)

§  Target cycle time (10ns)

§  Resource constraint: only one multiplier is available

§  Objective: high performance or lower power as examples

·         Note as optimizing and reducing component counts, less space is required, which can allow faster clocking

·         Investigating compilation for Reconfigurable Accelerated Computing

o   Take GCC 3DES implementation and synthesis FPGS RTL description

o   Example took 3DES from C level implementation to a FPGA (Xilinx Virtex-5)

·         Investment bank is using this tool to compile financial optimizations from S/W implementation to FPGA accelerators (Black-Scholes S/w kernel)

o   30x speed-up over software implementation and 1/10 the power (6 vs 68W)

 

A related presentation: http://cadlab.cs.ucla.edu/~cong/slides/fpt05_xpilot_final.pdf.

 

James Hamilton, Windows Live Platform Services
Bldg RedW-D/2072, One Microsoft Way, Redmond, Washington, 98052
W:+1(425)703-9972 | C:+1(206)910-4692 | H:+1(206)201-1859 |
JamesRH@microsoft.com

H:mvdirona.com | W:research.microsoft.com/~jamesrh  | blog:http://perspectives.mvdirona.com

 

 

Friday, June 06, 2008 10:52:35 AM (Pacific Standard Time, UTC-08:00)  #    Comments [0] - Trackback
Hardware
 Tuesday, June 03, 2008

 

Last week at Google IO, pricing was announced for Google Application Engine. Actually it was blogged the night before at: http://googleappengine.blogspot.com/2008/05/announcing-open-signups-expected.html.

 

The prices are close to identical with Amazon AWS although GAE differs substantially from the AWS offerings.  The former offers a easy to use Python  execution environment whereas Amazon offers the infinitely flexible run-this-virtual-machine model. Clearly the Amazon model costs more to provide so, by that measure, AWS pricing is somewhat better:

 

Google Application Engine Pricing:

·    $0.10 - $0.12 per CPU core-hour

·    $0.15 - $0.18 per GB-month of storage

·    $0.11 - $0.13 per GB outgoing bandwidth

·    $0.09 - $0.11 per GB incoming bandwidth

·    From: http://googleappengine.blogspot.com/2008/05/announcing-open-signups-expected.html

Compared with AWS Pricing:

·         $0.10 - $0.80 per VM hour (depending upon resources allocated)

·         $0.15 per GB-month of storage

·         $0.100 - $0.170 per GB outgoing bandwidth

·         $0.100 per GB incoming bandwidth

·         From: http://www.amazon.com/S3-AWS-home-page-Money/b/ref=sc_fe_l_2?ie=UTF8&node=16427261&no=3435361&me=A36L942TSJ2AJA and http://www.amazon.com/EC2-AWS-Service-Pricing/b/ref=sc_fe_l_2?ie=UTF8&node=201590011&no=3435361&me=A36L942TSJ2AJA.

There are some important differences that make the pricing comparison somewhat biased in a couple of ways. Two important differences: 1) as mentioned above, Amazon gives an entire virtual machine so EC2 is much more flexible than GAE both in that it can run arbitrary applications in arbitrary languages and that it supports all execution models whereas GAE only supports HTTP request/response.  Another key difference is the storage subsystem.  In the numbers above, we’re comparing the Amazon blob store (S3) with the more structured storage model offered by GAE.  The more comparable AWS SimpleDB pricing is considerably higher than the GAE storage pricing. SimpleDB charges $1.50 GB/month in addition to machine usage and network transmission costs.  GAE is offering much more affordable semi-structured storage and the GAE storage model actually supports data types rather than having to force everything to character format.

 

GAE is still free to start with under 5M page views/month and up to ½ GB storage for free.  Obviously this helps developers get started without strings and that’s a good thing. But, more importantly, it avoids Google from having to go to the expense of billing very small values.  In a weird sort of way, I’m more impressed with AWS billing $0.04 on some accounts in that it shows there billing system is incredibly lean. Scaling down billing is hard, hard, hard.

 

In addition to announcing prices, GAE went from a controlled admission beta to a fully open beta where all comers are welcome.  I’m impressed how quickly they have gone from making the service initially available to a full open beta.  Impressive.

 

Also announced last week was a new GAE Memcached API which appears to have been a 20% project of Brad Fitzpatrick (Sriram Krishnan sent my way). And a set of image manipulation APIs supporting image scaling, rotating, etc. will now be part of the GAE API set.

 

My notes from Google IO:

·         http://perspectives.mvdirona.com/2008/05/29/RoughNotesFromSelectedSessionsAtGoogleIODay1.aspx

·         http://perspectives.mvdirona.com/2008/05/29/IO2008RoughNotesFromMarissaMayerDay2KeynoteAtGoogleIO.aspx

·         http://perspectives.mvdirona.com/2008/05/30/IO2008RoughNotesFromSelectedSessionsAtGoogleIODay2.aspx

 

James Hamilton, Windows Live Platform Services
Bldg RedW-D/2072, One Microsoft Way, Redmond, Washington, 98052
W:+1(425)703-9972 | C:+1(206)910-4692 | H:+1(206)201-1859 |
JamesRH@microsoft.com

H:mvdirona.com | W:research.microsoft.com/~jamesrh  | blog:http://perspectives.mvdirona.com

Tuesday, June 03, 2008 8:05:05 AM (Pacific Standard Time, UTC-08:00)  #    Comments [0] - Trackback
Services
 Sunday, June 01, 2008

Yesterday the Tribute to Honor Jim Gray was held at the University of California at Berkeley. We all miss Jim deeply so it really is a tough topic.  But it was great to get together with literally 100s of Jim’s friends and share stories and talk about some of his accomplishments, his contributions to the field, and his contributions to each of us.  Jim is amazing across all three dimensions but what is most remarkable is the profound way he helped others achieve more throughout the industry.  We’re all better engineers, researchers, and human beings for having been lucky enough to have known and worked with Jim.

 

Also announced yesterday was the creation of the Jim Gray Chair at Cal Berkeley.  Bill Gates Eric Schmidt, Marc Benioff, and Mike Stonebraker each donated $250,000 which were matched by a $1,000,000 from the Hewlett Foundation.

 

Seattle PI coverage: Gathering in Berkeley, Calif., today to honor legendary scientist, Microsoft researcher Jim Gray.

 

The morning, general session agenda:

             Welcome - Shankar Sastry

             Opening Remarks - Joseph Hellerstein

             A Tribute, Not a Memorial: Understanding Ambiguous Loss - Pauline Boss

             The Amateur Search - Michael Olson

             Jim Gray at Berkeley - Michael Harrison

             Knowledge and Wisdom - Pat Helland

             Why Did Jim Gray Win the Turing Award? - Michael Stonebraker

             Jim Gray Chair - Stuart Russell

             500 Special Relationships: Jim as a Mentor to Faculty and Students - Ed Lazowska

             Jim Gray: His Contributions to Industry - David Vaskevitch

             A "Gap Bridger" - Richard Rashid

             Thanks to the U.S. Coast Guard - Paula Hawthorn

 

The afternoon, technical session agenda:

·         Welcome - Shankar Sastry

·         Opening Remarks - Joseph Hellerstein

·         A Tribute, Not a Memorial: Understanding Ambiguous Loss - Pauline Boss

·         The Amateur Search - Michael Olson

·         Jim Gray at Berkeley - Michael Harrison

·         Knowledge and Wisdom - Pat Helland

·         Why Did Jim Gray Win the Turing Award? - Michael Stonebraker

·         Jim Gray Chair - Stuart Russell

·         500 Special Relationships: Jim as a Mentor to Faculty and Students - Ed Lazowska

·         Jim Gray: His Contributions to Industry - David Vaskevitch 

·         A "Gap Bridger" - Richard Rashid

·         Thanks to the U.S. Coast Guard - Paula Hawthorn

 

The event was video recorded and streamed via http://webcast.berkeley.edu/.

 

Update: the video will be at:  Tribute to Honor Jim Gray - General Session (thanks to George Spix for sending my way).

Second Update: A good article by John Markoff of the NY Times: A Tribute to Jim Gray: Sometimes Nice Guys Do Finish First.

 

                                    --jrh

 

James Hamilton, Windows Live Platform Services
Bldg RedW-D/2072, One Microsoft Way, Redmond, Washington, 98052
W:+1(425)703-9972 | C:+1(206)910-4692 | H:+1(206)201-1859 |
JamesRH@microsoft.com

H:mvdirona.com | W:research.microsoft.com/~jamesrh  | blog:http://perspectives.mvdirona.com

 

 

Sunday, June 01, 2008 10:08:01 AM (Pacific Standard Time, UTC-08:00)  #    Comments [0] - Trackback
Ramblings
 Thursday, May 29, 2008

Google IO notes continued from earlier in the day: http://perspectives.mvdirona.com/2008/05/29/IO2008RoughNotesFromMarissaMayerDay2KeynoteAtGoogleIO.aspx and yesterday: http://perspectives.mvdirona.com/2008/05/29/RoughNotesFromSelectedSessionsAtGoogleIODay1.aspx

 

Google Web Toolkit and Client-Server Communications

·         Speaker: Miquel Mendez

·         GWT client/server communication options:

o   Frames

o   Form Panel

o   XHR: RequestBuilder (be careful don’t to start too many—many browsers have limits)

o   XML RPC

·         XML Encoding/Decoding: com.google.gwt.xml defines XML related classes

·         JSON Encoding/Decoding: com.google.gwt.json.JSON defines JSON related classes

·         GWT RPC: Generator that generates code and makes use of RequestBuilder

 

Reusing Google APIs with Google Web Toolkit

·         Speaker: Miquel Mendez

·         GALGWT: Google API Library for GWT.  It’s an open source project lead by Miquel (Javascript bindings to GWT).

o   It’s a collection of easy to use GWT bindings for existing Google JavaScript APIs

o   It’s a Google code open source project

·         Reminder: GWT is a java to Javascript compiler.

·         GWT now has a gadget class.  Google Gadget creation using GWT by extending the Gadget class and implementing the NeedsXXX intrerfaces.

·         Gears support:

o   Exposes database, LocalServer, and WorkerPool JS modules

o   Provides an offline module that automates the process fo going offline (creates the necessary manifests automatically)

·         Google Maps support as well

 

Engaging User Experiences with Google App Engine

·         Speakers: John Skidgel (designer) & & Lindsay Simon (developer).

·         Showed a guest book application written using Djanjo Form.  It’s been modified to run under App Engine (didn’t say how).

·         App engine development environment makes it easy to work with a designer as it’s easy to install and runs well on a Mac.

·         Walked through what they called 3D (Design, Develop, & Deploy) and how they handle it.

·         Authentication options:

·         Do your own

·         Use GAE (any authenticated user or you can  narrow the population to your domain only – all supported out of the box).

·         Don’t make auth a gating factor or you will lose users – auth at the last possible moment

·         Use the App Engine Datastore for sessions

·         Decreasing Latency:

·         Create build rules to concatenate and minify (Yahoo! Minifier) CSS and JS

·         File fingerprinting

·         Set expires headers for a very long time but add a version ID.  He showed how to handle the version number on the server side.  The recommendation was 10 year expiration with version numbers.

·         Recommends “Progressive Enhancement” or “Defensive Enhancement”.  You should still be able to render without JS. JS should give a better experience but you may not have JS (crappy mobile browsers for example).  Another test, shut off CSS and it should still work.

 

James Hamilton, Windows Live Platform Services
Bldg RedW-D/2072, One Microsoft Way, Redmond, Washington, 98052
W:+1(425)703-9972 | C:+1(206)910-4692 | H:+1(206)201-1859 |
JamesRH@microsoft.com

H:mvdirona.com | W:research.microsoft.com/~jamesrh  | blog:http://perspectives.mvdirona.com

 

 

Thursday, May 29, 2008 5:55:26 PM (Pacific Standard Time, UTC-08:00)  #    Comments [0] - Trackback
Services

Continued from Yesterday (day 1): Rough notes from Selected Sessions at Google IO Day 1.

 

Marissa Mayer Keynote: A Glimpse Under the Hood at Google

·         Showed iGoogle and talked about how Google Gadgets are a great way to get broad distribution and are a form of advertising.

·         Search is number 2 most used application (after email)

·         The ordinary and the everyday

·         Why is search page so simple?

·         Variation of Occam’s Razor: “the simple design is probably right”

·         Sergey did it and it was because “there was nobody else to do it and he doesn’t do HTML”

·         Described process of answering a query (700 to 1,000 machines in .16 seconds):

·         This time of day we’re busy so the query will likely go to one data center and likely get bounced to another (must be a simplification of what really happens – load ballancing)

·         Mixer

·         Google Web Server

·         Ads + Websearch (300 to 400 systems)

·         Back to mixer

·         Back to Web server

·         Back to load balancer

·         Split A/B Testing:

·         We given a subset of users a different user experience. Web services allow very detailed views and to iterate very quickly and evolve rapidly.

·         Example: amount of white space under Google logo on results page?

·         This test showed convincingly that less white space rather than more (produces more usage and more revenue)

·         Example: yellow or blue as background for paid adds

·         Yellow produced both more satisfaction and more revenue.

·         “If you don’t listen to your customers, someone else will” – Sam Walton

·         But you need to test rather than ask since they often don’t know.

·         Example: would you like 10, 20, or 30 results. Users unanimously wanted 30.

·         But 10 did way better in A/B testing (30 was 20% worse) due to lower latency of 10 results

·         30 is about twice the latency of 10 (I would have expected the other overheads to dominate.  Suggests there is another solution waiting to be found here).

·         Example: Maps was 120k for launch page.  We took 30 to 40k out.  Got a proportional increase in usage.

·         Example: Google Video uploads used to be 1 day to watch while YouTube offered “Watch it now”.  Much more compelling.

·         Urgent can drown out important

·         Users go from unskilled to skilled searchers very fast (under 1 month).  Consequently it’s better to optimize for expert since most are and novices get there fast due to fast feedback loop.

·         The lesson is to think longer term at all levels in design.

·         Think beyond the current development horizon.  10 years for major products and services.

·         Example: Universal search vs vertical search.  Users want verticals now but what they really want is universal search.  They just want to find the answer they are searching for.

·         Goog-411: don’t know if we can make money off this but it helps us develop voice recognition. Applications of voice recognition are monetizable so, even if Goog-411 doesn’t yield revenue, other applications will.

·         International content:

·         50% of the web is English but only aobut 1% of the web is Arabic

·         Conclusion: take an Arabic search, translate find relevant pages, then translate the result.  Opens up MUCH more content and dramatically improves the results for an Arabic user.

·         Larry Paige: ”A Healthy Disrespect for the Impossible” opens up many possibilities.

·