James Hamilton's Blog
 Friday, January 18, 2008

Dave DeWitt and Michael Stonebraker posted an article worth reading yesterday titled MapReduce: A Major Step Backwards (thanks to Kevin Merrit and Sriram Krishnan for sending this one my way). Their general argument is that MapReduce isn’t better than current generation RDBMS, which is certainly true in many dimensions, and that it isn’t a new invention, which is also true.  I don’t agree with the conclusion that MapReduce is a major step backwards, but I fully agree with many of the points building towards that conclusion.  Let’s look at some of the major points made by the article:

 

1. MapReduce is a step backwards in database access

In this section, the authors argue that schema is good, that separation of schema and application is good, and that high level language access is good. On the first two points, I agree: schema is good, and application/schema separation long ago proved to be a good thing.  The thing to keep in mind is that MapReduce is only an execution framework.  The data store is GFS or sometimes Bigtable in the case of Google, or HDFS or HBase in the case of Hadoop. So it’s not 100% correct to argue that MapReduce doesn’t support schema. That’s a store issue, and it is true that most stores that MapReduce is run over don’t implement these features today.

 

I argue that a separation of execution framework from store and indexing technology is a good thing in that MapReduce can be run over many stores.  You can use MapReduce over either Bigtable (which happens to be implemented on GFS) or over GFS directly, depending upon the type of data you have at hand.  I think that DeWitt and Stonebraker would both agree that breaking up monolithic database management systems into extensible components is a very good thing to do. In fact, much of the early work in extensible database management systems was done by David DeWitt.  The point here is that DeWitt and Stonebraker would like to see schema enforcement as part of the store and, generally, I agree that this would be useful.  However, MapReduce is not a store.

 

They also argue that high level languages are good.  I agree. Any language can be used with MapReduce systems, so this isn’t a problem and is supported today.

 

2. MapReduce is a poor implementation

The argument here is that any reasonable structured store will support indexes.  I agree that for many workloads you absolutely must have indexes. However, many data mining and analysis algorithms access all the data in a data set.  Indexes, in these cases, don’t help.  This is one of the reasons why many data mining algorithms run poorly over RDBMS: if all they are going to do is repeatedly scan the same data, a flat file is faster.  It depends upon the application access pattern and the amount of data that is accessed.  A common execution approach for data mining algorithms is to export the data to a flat file and then operate on it there.  An index helps when you are looking at a small subset of the data, and there is a point N where, if you are looking at less than N% of the data, the index helps and should be used. But, if you are looking at more than N%, you are better off table scanning.  The point N is implementation dependent, but storage technology trends have been pushing this number down over the years.  Basically, some algorithms look at all the data and aren’t helped by indexes, and of those that look at only a portion of the data, any that touch more than N% aren’t helped either.
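
To make the N% crossover concrete, here is a toy cost model in Python. It is a rough sketch under stated assumptions, not a measurement of any real system: it assumes a table scan reads every page sequentially, an index lookup pays one random page read per matching row, and the cost constants and function names (`scan_cost`, `index_cost`, `choose_plan`, `crossover_percent`) are hypothetical values picked for illustration.

```python
def scan_cost(total_rows, rows_per_page=100, seq_page_cost=1.0):
    """Cost of a full table scan: every page is read sequentially."""
    pages = -(-total_rows // rows_per_page)  # ceiling division
    return pages * seq_page_cost


def index_cost(matching_rows, random_page_cost=4.0):
    """Cost of an index plan: assume one random page read per matching row."""
    return matching_rows * random_page_cost


def choose_plan(total_rows, matching_rows, rows_per_page=100,
                seq_page_cost=1.0, random_page_cost=4.0):
    """Pick whichever plan is cheaper under the toy model."""
    scan = scan_cost(total_rows, rows_per_page, seq_page_cost)
    index = index_cost(matching_rows, random_page_cost)
    return "index" if index < scan else "scan"


def crossover_percent(rows_per_page=100, seq_page_cost=1.0, random_page_cost=4.0):
    """The point N: percentage of rows at which both plans cost the same."""
    return 100.0 * seq_page_cost / (rows_per_page * random_page_cost)


if __name__ == "__main__":
    print(choose_plan(1_000_000, 1_000))      # small subset of the data
    print(choose_plan(1_000_000, 500_000))    # half of the data
    print(crossover_percent())                # N under the default constants
    print(crossover_percent(random_page_cost=40.0))  # random reads relatively costlier
```

In this model N depends only on the ratio of sequential to random read cost and on how many rows fit on a page. Making random reads relatively more expensive lowers N, which is one way to read the storage trend described above.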

 

There is no question that indexes are a good thing, and there is no arguing that much of the world’s persistent storage access is done through indexes.  Indexes are good.  But they are not good for all workloads and all access patterns.  Remember, MapReduce is not a store, only an execution framework. Implementing indexing in a store used by MapReduce would be easy, and presumably someone will do it when the need is broadly noticed.  In the interim, indexes can be built using MapReduce jobs and then used by subsequent MapReduce jobs.  This is certainly more of a hassle than stores that automatically maintain indexes, but it is acceptable for some workloads.
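
As a rough illustration of building an index with one MapReduce job and using it from a later one, here is a minimal single-process sketch in Python. Everything in it is an assumption made for the example: `run_mapreduce` is a hypothetical stand-in for a real framework (it just maps, groups by key, and reduces in memory), and the documents and function names are invented.

```python
from collections import defaultdict


def run_mapreduce(records, mapper, reducer):
    """Toy in-memory MapReduce: map each record, group by key, reduce each group."""
    groups = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            groups[key].append(value)
    return {key: reducer(key, values) for key, values in groups.items()}


# Job 1: build an inverted index (word -> sorted list of document ids).
def index_mapper(record):
    doc_id, text = record
    for word in set(text.lower().split()):
        yield word, doc_id


def index_reducer(word, doc_ids):
    return sorted(doc_ids)


# Job 2: count words per document, run only over documents the index selects.
def length_mapper(record):
    doc_id, text = record
    yield doc_id, len(text.split())


def length_reducer(doc_id, counts):
    return sum(counts)


documents = [
    (1, "MapReduce is an execution framework"),
    (2, "an index helps selective queries"),
    (3, "MapReduce jobs can build an index"),
]

index = run_mapreduce(documents, index_mapper, index_reducer)

# Use the index to avoid scanning documents that cannot match.
by_id = dict(documents)
selected = [(doc_id, by_id[doc_id]) for doc_id in index.get("index", [])]
word_counts = run_mapreduce(selected, length_mapper, length_reducer)
```

The second job never touches document 1, because the index built by the first job shows it does not contain the search term. The hassle mentioned above shows up here too: nothing keeps `index` current if `documents` changes, so the first job has to be rerun.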

 

3. MapReduce is not novel

This is clearly true. These ideas have been fully and deeply investigated by the database community in the distant past.  What is innovative is scale.  I’ve seen MapReduce clusters of 3,000 nodes and I strongly suspect that clusters of 5,000+ servers can be found if you look in the right places.  I’ve been around parallel database management systems for many years but have never seen multi-thousand node clusters of Oracle RAC or IBM DB2 Parallel Edition.  The innovative part of MapReduce is that it REALLY scales and, for where MapReduce is used today, scale matters more than everything else.  I’ll claim that 3,000 server query engines ARE novel but I agree that the constituent technologies have been around for some time.

 

4.  MapReduce is missing features

All of the missing features (bulk loader, indexing, updates, transactions, referential integrity (RI), views) are features that could be implemented in a store used by MapReduce.  As these features become important in domains over which MapReduce is used, they can be implemented in the underlying stores.  I suspect that, as long as MapReduce is used for analysis and data mining workloads, the need for RI may never get strong enough to motivate someone to implement it.  However, it clearly could be done, and the absence of RI in many stores is not a shortcoming of MapReduce.

 

5.  MapReduce is incompatible with the DBMS tools

 

I 100% agree. Tools are useful and today many of these tools target RDBMS.  It’s not mentioned by the authors, but another useful characteristic of RDBMS is that developers understand them and many people know how to write SQL.  It’s a data access and manipulation language that is broadly understood.  The thing to keep in mind is that MapReduce is part of a componentized system.  It’s just the execution framework. I could easily write a SQL compiler that emitted MapReduce jobs (SQL doesn’t dictate or fundamentally restrict the execution engine design).  MapReduce can be run over simple stores, as it mostly is today, or over stores with near database level functionality if needed.

 

I’m arguing that the languages with which MapReduce jobs are expressed could be higher level, and there have been research projects to do this (for example: http://research.microsoft.com/research/sv/dryad/). Even a SQL compiler is possible over MapReduce.  And I’m arguing that MapReduce could be run over very rich stores with indexes and integrity constraints should that become broadly interesting.  MapReduce is just an execution engine that happens to scale extremely well. For example, in the MapReduce-like system used around Microsoft, there exist layers of languages above the execution engine that offer different levels of abstraction and control on the same engine.
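
To show why a SQL front end fits naturally over this kind of engine, here is a small Python sketch that turns one restricted query shape, `SELECT g, AGG(c) FROM t [WHERE ...] GROUP BY g`, into a mapper and reducer pair. It is only an illustration under loud assumptions: there is no SQL parser (the query arrives as already-parsed arguments), `compile_group_by` and `run_job` are hypothetical names, and `run_job` is an in-memory stand-in for a real execution engine.

```python
from collections import defaultdict

# Aggregate functions this toy "compiler" supports.
AGGREGATES = {"SUM": sum, "COUNT": len, "MIN": min, "MAX": max}


def compile_group_by(group_col, agg, agg_col, where=None):
    """Compile SELECT group_col, agg(agg_col) ... GROUP BY group_col into (mapper, reducer)."""
    agg_fn = AGGREGATES[agg.upper()]

    def mapper(row):
        # WHERE becomes a filter in the map phase; GROUP BY column becomes the key.
        if where is None or where(row):
            yield row[group_col], row[agg_col]

    def reducer(key, values):
        # The aggregate runs once per group in the reduce phase.
        return agg_fn(values)

    return mapper, reducer


def run_job(rows, mapper, reducer):
    """Toy in-memory execution engine: map, shuffle by key, reduce."""
    groups = defaultdict(list)
    for row in rows:
        for key, value in mapper(row):
            groups[key].append(value)
    return {key: reducer(key, values) for key, values in groups.items()}


rows = [
    {"region": "east", "sales": 10},
    {"region": "west", "sales": 7},
    {"region": "east", "sales": 5},
    {"region": "west", "sales": -1},
]

# SELECT region, SUM(sales) FROM rows WHERE sales > 0 GROUP BY region
mapper, reducer = compile_group_by(
    "region", "SUM", "sales", where=lambda row: row["sales"] > 0
)
result = run_job(rows, mapper, reducer)
```

The query text says nothing about how the work is executed; the same compiled pair could be handed to any engine that offers map, group, and reduce.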

 

An execution engine that runs on multi-thousand node clusters really is an important step forward.  The separation of execution engine and storage engine into extensible parts isn’t innovative but it is a very flexible approach that current generation commercial RDBMS could profit from.

 

I love MapReduce because I love high scale data manipulation.  What can be frustrating for database folks is 1) most of the ideas of MapReduce have been around for years and 2) there have been decades of good research in the DB community focusing on execution engine techniques and algorithms that haven’t yet been applied to the MapReduce engines. Many of these optimizations from the DB world will help make better MapReduce engines.  But, for all these faults, MapReduce sure does scale, and it’s hard not to love being able to submit a job and see several thousand nodes churning over several petabytes of data.  Priceless.

 

                                    --jrh

 

James Hamilton, Windows Live Platform Services
Bldg RedW-D/2072, One Microsoft Way, Redmond, Washington, 98052
W:+1(425)703-9972 | C:+1(206)910-4692 | H:+1(206)201-1859 |
JamesRH@microsoft.com

H:mvdirona.com | W:research.microsoft.com/~jamesrh

Friday, January 18, 2008 12:34:49 AM (Pacific Standard Time, UTC-08:00)
Software
 Tuesday, January 15, 2008

The article below is a restricted version of what I view to be the next big thing.  If I were doing a start-up today, it would be data analysis and optimization as a service.  The ability to run real time optimization over an understandable programming platform for a multi-thousand node cluster is very valuable to almost every industry.  The smaller their margins, the harder it is for them to afford not to use yield management and optimization systems.  The airlines showed us years ago that yield management systems are a big part of what it takes to optimize the profitability of an airline.  Walmart did the same thing to brick and mortar retail.  Amazon applied the same for online commerce.  The financial community has been doing this for years.  The online advertising business is driven by data analysis.  What’s different as a service is that the technology can be made available to the remaining 99% of the business world: those that don’t have the scale to afford doing their own.

 

Storing blobs in the sky is fine but pretty reproducible by any competitor.  Storing structured data as well as blobs is considerably more interesting, but what has even more lasting business value is storing data in the cloud AND providing a programming platform for multi-thousand node data analysis.  Almost every reasonable business on the planet has a complex set of dimensions that need to be optimized. The example below is a good one but it’s only one.  Almost every business can be made more profitable by employing some form of yield management.  And it’s ideally set up for a service based solution: the output data is small, the algorithms to be run are small, and the data that forms the input is HUGE but accumulates slowly over time.  It is historically large in aggregate but not that large per unit time.  The input data per unit time is of manageable size and could be sent to the service.  The service provides petabytes of storage and thousands of processors and a platform to run data analysis and optimization programs.  This service would also be great for ISVs, with companies springing up all over to write interesting algorithms that can be rented and run in the infrastructure.

 

                                                                --jrh

 


From: Jeff Carnahan
Subject: Fluidity better than perfection?

Good reading for a Saturday morning…

 

Flight Plan

The math wizards at DayJet are building a smarter air taxi--and it could change the way you do business.

From: Issue 115 | May 2007 | Page 100 | By: Greg Lindsay | Photographs By: Jill Greenberg and Courtesy DayJet

It's only fitting that a service pitched to traveling salesmen should find itself confronting an especially nasty version of what's known as the "traveling-salesman problem." Stated simply: Given a salesman and a certain number of cities, what's the shortest possible path he should take before returning home? It's a classic conundrum of resource allocation that rears its ugly head in industries ranging from logistics (especially trucking) to circuit design to, yes, flesh-and-blood traveling salesmen: How do you minimize the cost and maximize your efficiency of movement?

Solving for X: Once the FAA clears the way for the Eclipse 500, Iacobucci will get to see how good his models really are.

Back in 2002, that was the question facing DayJet, a new air-taxi service hoping to take off this spring. Based in Delray Beach, Florida, DayJet will fly planes, but its business model isn't built around its growing fleet of spanking-new Eclipse 500 light jets. It's built on math and silicon, and the near-prophetic powers that have in turn emerged from them. "We're a software and logistics company that only happens to make money flying planes," insists Ed Iacobucci, an IBM (NYSE:IBM) veteran and cofounder of Citrix Systems (NASDAQ:CTXS), who started DayJet as his third act.

The advent of affordable air taxis has been heralded by a steady drumbeat of press over the past few years, with an understandable fixation on the sexy new technology that's generally credited with making the market possible: the planes. The Eclipse 500 is a clean-sheet design for a tiny jet that seats up to six and costs about $1.5 million (the Federal Aviation Administration may clear it for mass production as early as next month). It is also the most fuel-efficient certified jet in the sky. Cessna, meanwhile, has rolled out its own, if pricier, "very light jet" (VLJ), with Honda's (NYSE:HMC) set to appear in 2010. No less an authority than The Innovator's Dilemma author and Harvard Business School professor Clayton Christensen has mused in print that the E500 and its ilk "could radically change the airline industry" by disrupting the hub-and-spoke system we all know and despise.

But Iacobucci, who wrote a check long ago for more than 300 orders and options on Eclipse's first planes, isn't relying on the aircraft to make or break him. Instead, it's his company's software platform--and the novel way it attacks the traveling-salesman problem--that will set DayJet apart. On day one of operations, flying from just five cities in Florida with only 12 planes, DayJet's dispatchers will already have millions of interlocking flight plans to choose from. As the company's geographic footprint spreads (with luck) across the Southeast--and as its fleet expands as well--the computational challenge only gets worse. Factor in such variables as pilot availability, plane maintenance schedules, and the downpours that drench the peninsula like clockwork in the summer, and well, you get the idea: Finding the shortest, fastest, and least-expensive combination of routes could take every computer in the universe until the end of time.

"I knew what the complexities were and how the problem degenerates once you reach a threshold," Iacobucci says. So he didn't try to find the optimal solution. Instead, DayJet began looking for a family of options that create positive (if imperfect) results--following a discipline known as "complexity science."

For the past five years, with no planes, pilots, or customers, DayJet has been running every aspect of its business thousands of times a day, every day, in silicon. Feeding in whatever data they could find, Iacobucci and his colleagues were determined to see how the business would actually someday behave. When DayJet finally starts flying, they'll switch to real-time flight data, using their operating system to shuttle planes back and forth the way computers shuttle around bits and bytes.

Iacobucci is an expert at building operating systems--he did it for decades at IBM and Citrix. Because of that, he has zero interest in the loosey-goosey world of Web 2.0. He sees the next great opportunities in business as a series of operating systems designed to model activities in the real world. DayJet looks to be the first, but he has no doubt there will be others, and that new companies, and even new industries, will appear overnight as computers tease answers out of previously intractable problems.

Which brings us back to the traveling salesmen. Iacobucci says his computer models predict that DayJet's true competitors are not the airlines, but Bimmers and Benzes--he says 80% of his revenue will come from business travelers who would otherwise drive. In other words, DayJet, which closed an additional $50 million round of financing in March, is creating a market where none exists, an astonishing mathematical feat. To get there, all Iacobucci needed was five years, a professor with a bank of 16 parallel processors, two so-called Ant Farmers, and a pair of "Russian rocket scientists" who, it turns out, are neither Russian nor rocket scientists.

"This is way nastier than any of the other airline-scheduling work we've ever done," says Georgia Tech professor George Nemhauser, whose PhD students have been helping to map the scope of DayJet's mountain-sized scheduling dilemma. "You can think of this as a traveling-salesman problem with a million cities, and that's a problem DayJet has to solve every day."

Tapping into the school's computing power, Nemhauser and his students have figured out how to calculate a near-perfect solution for 20 planes in a few seconds' worth of computing time and a solution for 300 planes in 30 hours. But as impressive as that is, in the real world, it's not nearly enough. That's because in order for DayJet's reservations system to succeed, Iacobucci and company need an answer and a price in less than five seconds, the limit for anyone conditioned to Orbitz or Expedia (NASDAQ:EXPE). Because DayJet has no preset schedule--and because overbooking is out of the question (DayJet will fly two pilots and three passengers maximum)--any request to add another customer to a given day's equation requires its software to crunch the entire thing again.

One of Iacobucci's oldest pals and investors, former Microsoft (NASDAQ:MSFT) CFO and Nasdaq chairman Mike Brown, pointed him toward a shortcut--a way to cheat on the math. Brown had retired with his stock options to pursue his pet projects in then bleeding-edge topics such as pattern recognition, artificial intelligence, nonlinear optimization, and computational modeling. His dabblings led him first to Wall Street, where he invested in a trading algorithm named FATKAT and eventually to Santa Fe, New Mexico, ground zero for complexity science.

Invented by scientists at the nearby Los Alamos National Laboratory in the 1980s, complexity science is a gumbo of insights drawn from fields as diverse as biology, physics, and economics. At its core is the belief that any seemingly complex and utterly random system or phenomenon--from natural selection to the stock market--emerges from the simple behavior of thousands or millions of individuals. Using computer algorithms to stand in for those individual "agents," scientists discovered they could build fantastically powerful and detailed models of these systems if only they could nail down the right set of rules.

When Brown arrived in town in the late 1990s, many of the scientists-in-residence at the Santa Fe Institute--the serene think tank dedicated to the contemplation of complexity--were rushing to commercialize their favorite research topics. The Prediction Co. was profitably gaming Wall Street by spotting and exploiting small pockets of predictability in capital flows. An outfit called Complexica was working on a simulator that could basically model the entire insurance industry, acting as a giant virtual brain to foresee the implications of any disaster. And the BiosGroup was perfecting agent-based models that today would fall under the heading of "artificial life."

By the time Iacobucci mentioned his logistical dilemma to Brown in 2002, however, most of Santa Fe's Info Mesa startups were bobbing in the dotcom wreckage. But Brown knew that Bios had produced astonishingly elegant solutions a few years earlier by creating virtual "ants" that, when turned loose, revealed how a few false assumptions or bottlenecks could throw an entire system out of whack. A model Bios built of Southwest's cargo operations, for example, cost $60,000 and found a way to save the airline $2 million a year.

Brown proposed that Iacobucci supplement his tool kit with a healthy dose of complexity science. Iacobucci was already hard at work building an "optimizer" program that employed nonlinear algorithms and other mathematical shortcuts to generate scheduling solutions in seconds. But what he really needed, Brown suggested, was an agent-based model (ABM) that would supply phantom traveling salesmen to train the optimizer. Without it, he'd essentially be guessing at the potential number and behavior of his future customers. "Eddy took no convincing," Brown says. "He was telling me, 'Get some guys down here and let's do this.'"

Brown dug up the Ant Farmers, a pair of Bios refugees and expert modelers named Bruce Sawhill and Jim Herriot. Sawhill had been a theoretical physicist at the Santa Fe Institute, while Herriot had been a member of the original team that invented Java at Sun Microsystems (NASDAQ:SUNW). Together, they're DayJet's own Mutt and Jeff, with Herriot playing congenial science professor and Sawhill his mischievous sidekick.

Meanwhile, to build the optimizer, Iacobucci recruited his pair of Russian rocket scientists: Alex Khmelnitsky and Eugene Taits, mathematical wizards he'd hired once before at Citrix. Rather than tackle every scheduling contingency via brute-force computing, the not-Russians cheated by slicing and dicing them into more manageable chunks. They used opaque mathematical techniques such as heuristics and algebraic multigrids, which elegantly subdivide a sprawling problem like this one into discrete patches that can be solved (within limits) simultaneously.

Ironically, the more they slaved over the problem, the less it seemed that throwing a perfect bull's-eye every time was the key to their salvation. The speed of their solutions was proving to be more crucial. If they could provide DayJet with a minute-to-minute snapshot of near-perfect solutions, the system could essentially run the company for them. DayJet would become faster--both in the air and operationally--than any of its competitors could ever hope to be.

With one team working on modeling demand and the other calculating baroque flight plans, Iacobucci and his engineers then concocted a third software system called the Virtual Operation Center. The VOC runs the company in silicon, feeding the phantom customers inside the ABM into the optimizer, which does its best to meet each of their demands with optimal efficiency and maximum gain. Seen on-screen, the VOC is a time-lapse photograph of DayJet's daily operations, also drawing upon maintenance and real-time weather information to produce a final data feed that factors in nearly every facet of the business. Iacobucci compares each run of the VOC with a game of baseball in which the ABM is continually pitching to the optimizer; DayJet has already played several thousand lifetimes' worth of seasons.

Armed with its real-time operating system, DayJet is pursuing a very different idea of optimality than, say, the airlines. With their decades of expertise in the dark arts of yield management, the airlines know exactly how to squeeze every last dollar out of their seats, which is indeed pretty optimal. But they also lack an effective plan B--let alone a plan C or D--in the event that the weather intervenes and schedules collapse. In fact, while, say, JetBlue (NASDAQ:JBLU) may now finally have a contingency plan or two, DayJet's business model is nothing but contingency plans.

Herriot offers another sports metaphor: "Total soccer," popularized by the Dutch in the 1970s, replaced brute-force attacks to the goal with continuous ball movement. "Moving straight to the goal is an excellent way to score, except for one slight problem--the other team," Herriot says. "They're a human version of Murphy's Law. In total soccer, you continually place the ball in a position with not the straightest but the greatest number of ways to reach the goal, the richest set of pathways."

"Each individual pathway may have a lower possibility of reaching the goal than a straight shot," Sawhill chimes in, "but the combinatorial multiplicity overwhelms the other team." The Dutch discovered that a better strategy was a series of good, seamlessly connected solutions rather than a single brittle one.

"The Dutch won a lot of games that way," Herriot adds. "It also created a different kind of player, a more agile, intelligent one. In some sense, we're teaching DayJet how to play total soccer."

In complexity lingo, a chart of all the pathways those Dutch teams exploited would be called a "fitness landscape," a sort of topographical map of every theoretical solution in which the best are visualized as peaks and the worst as deep valleys. "We're dealing with a problem where the problem specification itself is changing as you go along," Sawhill says. "You no longer want to find the best solution--you want to be living in a space of good solutions, so when the problem changes, you're still there." Fluidity is the greater goal than perfection.

To that end, the company has been changing the problem inside its simulators every day for the past four and a half years, looking for those broad mesas of good solutions. And after a million or so spins of the VOC, DayJet has produced a clear vision of the total market and its likely place in it. Iacobucci expects to siphon off somewhere between 1% and 1.5% of all regional business trips within DayJet's markets by 2008, with "regional trips" defined as being between 100 and 500 miles. In the southeast states the company initially has its eye on, that's 500,000 to 750,000 trips a year, out of a total of 52 million, more than 80% of which are currently traversed by car. Yes, DayJet's life-or-death competition is Florida's SUV dealerships, not the airlines. DayJet may even help the airlines slightly: The model predicts some customers who fly DayJet one way will take a commercial flight back home.

The reams of data produced by the VOC have already coalesced into a thick sheaf of battle plans framing best- to worst-case scenarios. And having run the scenarios so relentlessly for so long, Iacobucci is now utterly sanguine about his prospects. When I ask over dinner for the dozenth time about DayJet's presumptive break-even number, he flat out admits there isn't one. "Within the realm of all realistic possibilities--at least 25% of our projected demand to 125% demand--we maintain profitability." Even at 25%? "Sure," Iacobucci replies, "it just takes longer, and takes more [airports], and the margin is much lower. But this isn't going to be what the venture capitalists call the 'walking dead.' If it's a hit, it's going to be a hit pretty quickly."

I'm not the only one who has trouble wrapping his head around the numbers, or lack thereof. Iacobucci tells the story of one analyst asked to crunch the numbers ahead of an investment. "He asked a direct question: 'All I want to know is, what formula do I put into this cell to tell me how you come up with a revenue number?'" Iacobucci says. "I told him, 'There ain't no formula to put in that cell! It can't be done! We'll sit you down with our modelers, who will explain the range of numbers we came up with, but they can't be encapsulated in a spreadsheet.'" The would-be investors passed.

Not everyone is so put out by the math involved. Esther Dyson, the veteran technologist and venture capitalist, now runs an annual conference called "Flight School," in which DayJet has played a starring role. "I have no doubt it will work," she says, referring to the software, "and I have no doubt they will spend time refining it and that there will be glitches here and there. But I do think Ed knows how to design very highly available systems"--a reference to his days building operating systems--"and that's exactly what they're doing."

Mike Brown, who did ante up and today sits on DayJet's board, is convinced that businesses big and small will increasingly turn to modeling as a way of developing--or troubleshooting--their business plans, mapping out strategies and market expectations that go far, far beyond spreadsheets and PowerPoint (NASDAQ:MSFT) decks. "We'll see more and more companies integrate modeling into the heart of their business. This is just like the Internet: One day no one had heard of it, the next day we were all using it."

Since Iacobucci sees himself as being in the operating-systems business, he has no intention of giving that system away. (He learned that lesson the hard way at IBM.) He doesn't want to build what he calls "horizontal" software that gets shared, e.g., Web 2.0 and Windows, the two great platforms for which every programmer in Silicon Valley seems to be writing widgets these days. Where everyone else in the business sees limitless opportunities in snap-together applications, Iacobucci sees a playing field so flat as to have no barriers to entry at all, and he doesn't like it.

According to Dyson, DayJet's competitors have so far pooh-poohed its software, assuming they'll be able to buy their own off the shelf at some point. Eclipse Aviation's Vern Raburn hopes Iacobucci might be persuaded to license his tools, because Raburn's own business model depends upon air taxis' taking off. Iacobucci says that isn't going to happen. "There's a shift away from building another platform toward building highly integrated, vertical, special-purpose, high-performance systems," he argues. Iacobucci envisions more companies like his own, in which the competitive advantage resides in custom-built, deeply proprietary, real-world operating systems that don't just streamline accounting, but become the central nervous systems of entirely new, scalable businesses. He's looking to build barriers to entry out of brainpower--so much of it that rivals can never catch up. ("It's like in Dr. Strangelove," Sawhill quips. "'Our German scientists are better than their German scientists.'")

Iacobucci points to Google (NASDAQ:GOOG) as an example of what a vertical system can accomplish. While everyone raves about free services on Google, the largely invisible supercomputers in Google's data centers are themselves invisibly tackling a variation on the traveling-salesman problem: How do you solve millions of searches in parallel at any given second? "When you get into mesh computing," the name for Google's technique, "that's what it's all about: managing the complexity," Iacobucci insists.

But no company has ever built a business model around complexity from the ground up--until DayJet. Thumbing his nose at the prevailing ethos in software circles of "the wisdom of crowds," let alone that "IT doesn't matter," Iacobucci has set out to first invent and then dominate a market he might have otherwise just sold software to. "When we built generic software at IBM and Citrix, the other side would always reverse-engineer it," he says. "The only thing the customer sees here is an incredible service. This is 'software as a service.'"

 Source: FastCompany.com

 


Tuesday, January 15, 2008 12:36:57 AM (Pacific Standard Time, UTC-08:00)
Services

Tim O’Reilly of O’Reilly Media spoke at Microsoft Research earlier today. It was a great, wide-ranging talk pounding through 103 slides, roaming from social networking, through sensor and ambient computing, to Web 2.0.

 

Four themes for the talk:

·         Thoughts on social networking

·         Sensors and Ambient Computing

·         Web 2.0 and Wall Street

·         Open Source Hardware

 

“The future is here. It’s just not evenly distributed yet” -- William Gibson

 

Thoughts on social networking:

·         Social networking sites are becoming personal CRM

·         Need to have it reflect my REAL friends vs acquaintances, hangers on, etc.

·         “What’s new” is going to be everywhere

·         Need to be able to have groups of friends based upon what “type” of friends they are: where they were met, in what context, more dimensions, etc.

·         Want sharing of data with control across different social networking sites (need data portability)

·         Want to make social networking applicable to enterprise problems (huge opportunities in that the existing tools are weak)

·         Open social is weak.  We’re bringing together 60 or 70 social networking sites to get sharing that works

·         Wallop was GREAT as a research project but seems to have lost what was cool about the original ideas when it was spun out.

·         Photosynth is very cool as well.  Use ‘em and don’t lose ‘em?

·         Need to bring social networking features into Outlook. It would be killer!

·         What differentiated Web 2.0: real time user-facing services based upon data

·         How do we take the deep wells of data that all corporations serving customers have and turn them into value for their users?

o   Note: my phone and my email know who my friends are. They should be running deep heuristics on this data.  They should know who my real friends are.

·         We grew up where the answers were black and white. We’re entering a world where the answers are softer.  We need close enough and more “pretty good” …

o   I shouldn’t have to tell the system who my friends are

o   Xobni does a pretty good job in this dimension

·         There are huge opportunities to re-invent enterprise software along Web 2.0 lines

o   Social networking meets Web 2.0.

o   Exploit the vast wells of data

·         ½ of all mashups are Google Maps based (programmableweb.com)

·         The most useful APIs:

o   Aren’t about you or your services

o   Let developers try new things

o   Exist to be stretched out of shape

o   If someone violates your TOS, perhaps it is wrong

·         Amazon AWS is doing this best (S3 is great)

·         P2p hasn’t been close to fully exploited yet.

·         Amazon ASIN is an extension of ISBN.  It’s a name space for all products rather than just books.

·         On the web, open source doesn’t matter. It’s from an old era. What matters today is open data.

·         Open Source was the solution to software distribution over a fragmented set of computer architectures.  That’s no longer the problem.  If I gave you the source to Google, you couldn’t run it. What’s interesting today is large aggregate data sources rather than access to the programs.

 

Sensors and Ambient Computing

·         We are moving out of the world of people typing on keyboards.  Increasingly apps will be driven by new kinds of sensors.

·         There will be more and more sensor based applications.

·         For example, Nintendo Wii

·         Photosynth is this amazing Web 2.0 application.  All of our cameras become sensors in this large collective DB.

·         Path Intelligence is using GNU Radio to track cell-phone-carrying customers (everyone) and their shopping patterns

·         Jaiku is a smart presence application (using cell tower triangulations)

·         LastFM delivers results (the music you like) without asking me to enter data.  Just watch me using sensors.  This is the future.

o   PageRank found meaning in hidden data (page links).  Find data in large aggregations and extract meaning statistically.

·         It’s about “Programming Collective Intelligence”

 

Web 2.0 and Wall Street

·         It’s all about networked intelligence applications

·         Dark pools are ultra-high-speed, anonymous trading markets.  Hedge funds use them to operate privately and to avoid moving the market, executing large numbers of small trades through the pool.  Liquidnet, for example, does several multiples of the traffic of the NYSE.  I expect to see a backlash across the Web 2.0 economy and a bigger focus on privacy and anonymous operation.

 

Open Source Hardware

·         Didn’t have time to cover it in this talk.

 

WS-* was a failed attempt by big companies to make it so hard that we needed their tools.

Why don’t my address book, email, and phone software tell me about interesting things that relate to me? For example, I know Elop, who just took over Office.  Why couldn’t this be done automatically?

There is LOTS of innovation happening right now in hardware. It’s the next big frontier.

 

                        --jrh

 


Tuesday, January 15, 2008 12:35:53 AM (Pacific Standard Time, UTC-08:00)
Services | Software
 Thursday, January 10, 2008

Many massively multi-player games have substantial parts of the game played and scored locally.  The only way to get a sufficiently responsive gaming experience is to keep the high-speed game-to-player interactions local. Offloading some interactions from the server to the client is also an important way to reduce costs: the players pay for the client-side hardware, and making some decisions locally reduces server load and network traffic, lowering the game provider’s costs.  Offloading works extremely well at maintaining game responsiveness and reducing server-side costs, but it does open up the opportunity to cheat.  If the client program can be modified or otherwise manipulated, a player can gain an unfair advantage.

 

There are now markets for virtual merchandise like special armor, more powerful weapons, etc.  These markets charge real money for “ownership” of virtual merchandise.  What one player has earned through playing the game can be transferred to another player for use.  Many of these transactions are conducted using eBay.

 

When the possibility of cheating by modifying or otherwise manipulating the client side game components is coupled with a financial incentive, cheating is almost assured. That’s exactly what’s happening in many instances. The December issue of IEEE Spectrum includes a story about Richard Thurman who, during his game manipulating days, had a fleet of 30 servers running day and night playing  Ultima Online using modified game client software.  At peak, Thurman claims he was making $25,000 each month.

 

Even without cheating, where there is value, there is a potential business model to be found. Companies, often called gold farmers, set up shop in low-wage countries where employees play games 12 hours a day. Recent estimates peg the gold farming population in China at over 100,000 farmers.  The bounty earned by each farmer is then sold for hard currency to players interested in the virtual asset. Gold farmers play by the same rules as ordinary players and typically use standard clients, but they have an advantage in numbers and can be more organized.  Many players can gang up on a single foe that would ordinarily be very difficult to kill, execute the mission, reap the reward for their employer, and move on to the next kill.

 

Clearly there are willing sellers of virtual wares. Some obtain their virtual goods by employing others, some by using automata as described above, and some the old-fashioned way, by playing the game for hours at a time.  There are also willing purchasers of virtual wares, as evidenced by the traffic on eBay. Whenever there are buyers and sellers, there is an opportunity to make the market more efficient by bringing together more buyers and more sellers and helping to match their interests.  Several companies have emerged to make markets in virtual goods.  The largest example of a virtual-to-real-world broker is IGE, based in Hong Kong, but there are many such companies out there matching buyers to sellers and taking a slice of each transaction as a fee.

 

The IEEE Spectrum article is posted at: http://spectrum.ieee.org/dec07/5719.

 

                                                -jrh

 


 

Thursday, January 10, 2008 10:44:45 AM (Pacific Standard Time, UTC-08:00)
Services
 Monday, January 07, 2008

It’s finally done!  Back in August of 2006 Joe Hellerstein asked me to join him and Mike Stonebraker in producing an article for Foundations and Trends in Database Systems.  The project ended up being bigger than I originally understood, and the review process always takes longer than any of us expect.  The goal for the paper is to document those aspects of commercial database management systems that aren’t well-covered in the literature.  We focused on modern relational systems and let that criterion drive what material was included.

 

The paper was published in Volume 1, Issue 2 of Foundations and Trends in Database Systems and is posted here:  Architecture of a Database System.

 

The abstract:

Database Management Systems (DBMSs) are a ubiquitous and critical component of modern computing, and the result of decades of research and development in both academia and industry. Historically, DBMSs were among the earliest multi-user server systems to be developed, and thus pioneered many systems design techniques for scalability and reliability now in use in many other contexts. While many of the algorithms and abstractions used by a DBMS are textbook material, there has been relatively sparse coverage in the literature of the systems design issues that make a DBMS work. This paper presents an architectural discussion of DBMS design principles, including process models, parallel architecture, storage system design, transaction system implementation, query processor and optimizer architectures, and typical shared components and utilities. Successful commercial and open-source systems are used as points of reference, particularly when multiple alternative designs have been adopted by different groups.

 

                                                                --jrh

 


Monday, January 07, 2008 12:37:40 AM (Pacific Standard Time, UTC-08:00)
Software
 Wednesday, January 02, 2008

FusionIO has released specs and pricing data on their new NAND flash SSD: http://www.fusionio.com/products.html (Lintao Zhang of msft Research sent it my way).  100,000 IOPS, 700 MB/s sustained read, and 600 MB/s sustained write. Impressive numbers but let’s dig deeper.  In what follows, I compare the specs of the FusionIO part with a “typical” SATA disk.  For comparison with the 80GB FusionIO part, I’m using the following SATA disk specs: $200, 750GB, 70 MB/s sustained transfer rate, and 70 random IO operations per second (IOPS).  Specs change daily but it’s a good approximation of what we get from commodity SATA disk these days.

 

Obviously, the sustained read and write rates that FusionIO advertises are substantial. But to be truly interesting, they have to produce higher sustained I/O rates per dollar than magnetic disks, the high-volume commodity competitor.  A SATA disk runs around $200 and produces roughly 70 MB/s sustained I/O.  Looking at reads and normalizing for price by comparing MB/s per dollar, we see that the FusionIO part can do roughly 0.29 MB/s/$ whereas the SATA disk produces 0.35 MB/s/$. The disk produces slightly better sequential transfer rates per dollar. This isn’t surprising: we know that disks are actually respectable at sequential access, and this workload pattern is not where flash really excels.  For sequential workloads or workloads that can be made sequential, at the current price point, I wouldn’t recommend flash in general or the FusionIO SSD in particular. Where they really look interesting is in workloads with highly random I/Os.

 

Looking at capacity, there are no surprises.  Normalizing to dollars/GB, we see the FusionIO part at $30/GB and the SATA disk at $0.26/GB.  Where capacity is the deciding factor, magnetic media is considerably cheaper and will be for many years.  Capacity per dollar is not where flash SSDs look best.

 

Where flash SSDs really excel, and where the FusionIO part is particularly good, is in random I/Os per second.  They advertise over 100,000 random 4k IOPS (87,500 8k IOPS) whereas our SATA disk can deliver about 70.  Again normalizing for costs and looking at IOPS per dollar, we see the FusionIO SSD at 41 IOPS/$ whereas the SATA disk is only 0.35 IOPS/$ (70 IOPS for $200).  Flash SSDs win, and win big, on random I/O workloads like OLTP systems (usually random-I/O-operation bound).  These workloads typically run the smallest and fastest disks they can buy, and yet still can’t use the entire disk since the workload I/O rates are so high.  To support these extremely hot workloads using magnetic disk, you must spread the data over a large number of disks to effectively dilute the workload I/O rate to that which disks can support.
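The three normalized comparisons above (MB/s per dollar, dollars per GB, and IOPS per dollar) are simple enough to recompute directly from the quoted specs. What follows is a minimal sketch, not a benchmark. The Device class and its field names are made up for illustration, and the $2,400 FusionIO price is an assumption derived from the $30/GB figure times the 80GB capacity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Device:
    """Headline specs for one storage device, as quoted in the post."""
    name: str
    price_usd: float
    capacity_gb: float
    seq_read_mb_s: float  # sustained sequential read rate
    random_iops: float    # random I/O operations per second

    @property
    def mb_s_per_dollar(self) -> float:
        return self.seq_read_mb_s / self.price_usd

    @property
    def dollars_per_gb(self) -> float:
        return self.price_usd / self.capacity_gb

    @property
    def iops_per_dollar(self) -> float:
        return self.random_iops / self.price_usd


# Assumption: FusionIO price derived as $30/GB * 80 GB = $2,400.
FUSIONIO = Device("FusionIO 80GB", 30.0 * 80, 80, 700, 100_000)
# The "typical" SATA disk used for comparison: $200, 750GB, 70 MB/s, 70 IOPS.
SATA = Device("SATA 750GB", 200.0, 750, 70, 70)

for d in (FUSIONIO, SATA):
    print(
        f"{d.name}: {d.mb_s_per_dollar:.2f} MB/s/$, "
        f"${d.dollars_per_gb:.2f}/GB, {d.iops_per_dollar:.2f} IOPS/$"
    )
```

Sequential throughput per dollar comes out close for the two devices, capacity per dollar favors the disk by roughly two orders of magnitude, and random IOPS per dollar favors flash by roughly two orders of magnitude.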

 

For workloads where the random I/O rates are prodigious and the overall database sizes fairly small, flash SSDs are an excellent choice.  How do we define “fairly small”? I look at it this way: it’s a question of I/O density, and I define I/O density to be random IOPS per GB. The SATA disk we are using as an example can support 0.09 IOPS/GB (70/750).  If the workload requires less than 0.09 IOPS/GB, then it will be disk-capacity bound, whereas if it needs more than 0.09 IOPS/GB, then it’s I/O bound.  Assuming the workload is I/O bound, how do you decide whether SSDs are the right choice? Start by figuring out how many disks would be required to support the workload and what they would cost: take the sustained random IOPS required by the application and divide by the number of IOPS each disk can sustain (70 in the case of our example SATA drive, or 180 to 200 if using enterprise disks). Multiply that disk count by the price per disk to get the cost of supporting this application using magnetic disk.  Now figure the same number for flash SSD: divide the aggregate I/O rate the workload requires by the sustained random IOPS the SSD under consideration can deliver.  This determines how many SSDs are needed to support the I/O rate.  Given that flash SSDs can deliver very high I/O densities, you also need to ensure you have enough SSDs to store the entire database.  Take the maximum of the number of SSDs required to store the database (if capacity bound) and the number of SSDs required to support the I/O rate (if IOPS bound), and that’s the number of SSDs needed. Compare the cost of the SSDs with the cost of the disks required to support the same workload, and see if SSD is cheaper.  For VERY hot workloads, flash SSDs will be cheaper than hard disk drives.
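The sizing procedure above can be sketched in a few lines. This is only an illustration under stated assumptions. The function names are hypothetical, the 20,000 IOPS / 200 GB workload is invented for the example, the SSD is taken at its advertised 100,000 IOPS, and its $2,400 price is derived from $30/GB times 80GB. The sketch applies the "maximum of capacity count and I/O count" rule to both device types; for an I/O-bound workload on disk, that gives the same answer as dividing by IOPS alone.

```python
import math


def io_density(random_iops: float, capacity_gb: float) -> float:
    """I/O density as defined in the post: random IOPS per GB."""
    return random_iops / capacity_gb


def devices_needed(workload_iops: float, db_size_gb: float,
                   device_iops: float, device_capacity_gb: float) -> int:
    """Devices required to satisfy both the I/O rate and the capacity."""
    for_io_rate = math.ceil(workload_iops / device_iops)
    for_capacity = math.ceil(db_size_gb / device_capacity_gb)
    return max(for_io_rate, for_capacity)


def farm_cost(workload_iops: float, db_size_gb: float, device_iops: float,
              device_capacity_gb: float, unit_price_usd: float) -> float:
    """Total purchase cost of the devices needed for the workload."""
    count = devices_needed(workload_iops, db_size_gb,
                           device_iops, device_capacity_gb)
    return count * unit_price_usd


# Hypothetical hot workload: 20,000 random IOPS over a 200 GB database.
WORKLOAD_IOPS = 20_000
DB_SIZE_GB = 200

# SATA disk from the post: 70 IOPS, 750 GB, $200.
disk_cost = farm_cost(WORKLOAD_IOPS, DB_SIZE_GB, 70, 750, 200.0)
# SSD: advertised 100,000 IOPS, 80 GB, assumed $2,400 ($30/GB * 80 GB).
ssd_cost = farm_cost(WORKLOAD_IOPS, DB_SIZE_GB, 100_000, 80, 2_400.0)

print(f"workload density: {io_density(WORKLOAD_IOPS, DB_SIZE_GB):.0f} IOPS/GB")
print(f"disk farm: ${disk_cost:,.0f}  SSD: ${ssd_cost:,.0f}")
```

With these made-up numbers the workload needs 100 IOPS/GB, far above the 0.09 IOPS/GB the SATA disk can support, so the disk farm is sized by I/O rate (286 disks) while the SSD count is set by capacity (3 devices). Other workloads will tip the other way; the point is only the shape of the calculation.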

 

I should point out there are many other factors potentially worth considering when deciding whether a flash SSD is the right choice for your workload, including the power consumption of the disk farm, the failure rate and cost of service, and the wear-out rate and exactly how it was computed for the SSDs. The random I/O rate is the biggest differentiator and the most important for many workloads, so I haven’t considered these other factors here.

 

Looking more closely at the FusionIO specs, we see they give specs on random IOPS but they don’t specify