James Hamilton's Blog
 Wednesday, April 23, 2008

It’s not often I come across three interesting notes in the same day but here’s another. Earlier today the Jim Gray Systems Lab was announced and it will be led by longtime database pioneer David DeWitt. This is great to see for a large variety of reasons. First of all, it’s wonderful to see the contribution of Jim Gray to the entire industry recognized in the naming of this new lab. Very appropriate. Second, I’m really looking forward to working more closely with DeWitt. This is going to be fun.

 

This is “earned” in that Madison has been contributing great database developers to the industry for what seems like forever – I’ve probably worked with more Madison graduates over the years than any other single school. It’s good to see a systems focused research lab opened up there. 

 

It’s also good to see this project come together. I was involved in earlier discussions on this project some years back and, although we didn’t find a way to make it happen then, I really liked the idea.  I’m glad others were successful in doing the hard work to get this project to reality.

 

·         University of Wisconsin at Madison News: http://www.news.wisc.edu/15097  

·         DeWitt Interview (from above): http://www.microsoft.com/presspass/features/2008/apr08/04-23DeWitt.mspx

·         Server and Tools Business News Blog: http://blogs.technet.com/stbnewsbytes/archive/2008/04/23/an-addition-to-the-sql-server-team.aspx

·         Information Week: http://www.informationweek.com/news/software/database/showArticle.jhtml;jsessionid=2PMY2VDAXNZHSQSNDLOSKHSCJUNN2JVN?articleID=207401497

 

                                                --jrh

 

James Hamilton, Windows Live Platform Services
Bldg RedW-D/2072, One Microsoft Way, Redmond, Washington, 98052
W:+1(425)703-9972 | C:+1(206)910-4692 | H:+1(206)201-1859 |
JamesRH@microsoft.com

H:mvdirona.com | W:research.microsoft.com/~jamesrh  | blog:http://perspectives.mvdirona.com

 

 

Wednesday, April 23, 2008 11:07:22 AM (Pacific Standard Time, UTC-08:00)
Software

Earlier today, Amazon AWS announced a reduction in egress charges.  The new charges:

·         $0.100 per GB - data transfer in

·         $0.170 per GB - first 10 TB / month data transfer out

·         $0.130 per GB - next 40 TB / month data transfer out

·         $0.110 per GB - next 100 TB / month data transfer out

·         $0.100 per GB - data transfer out / month over 150 TB

 

Compared with the old:

·         $0.100 per GB - data transfer in

·         $0.180 per GB - first 10 TB / month data transfer out

·         $0.160 per GB - next 40 TB / month data transfer out

·         $0.130 per GB - data transfer out / month over 50 TB

 

Most networking contracts charge symmetrically for ingress and egress: you pay for the max of the two. As long as egress is the larger of the two, the incremental ingress cost to Amazon is effectively zero.

 

Note that it’s a non-linear reduction favoring higher volume users. TechCrunch reported a couple of days back that the Amazon AWS customer base has rapidly swung from a nearly pure start-up community to more of a mix of startups and very large enterprises, with the enterprise customers now bringing the largest workloads (http://www.techcrunch.com/2008/04/21/who-are-the-biggest-users-of-amazon-web-services-its-not-startups/). Not really all that surprising: I expected this to happen and talked about it in the Next Big Thing. What is surprising to me is the speed with which the transformation is taking place. I was predicting that this workload mix shift would happen at AWS 3 to 5 years from now. Things are moving quickly in the services world.
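To make the non-linear part concrete, here is a minimal sketch that prices a month of outbound transfer under both schedules. The tier boundaries and prices come from the two lists above; the function names and the choice of 1 TB = 1,000 GB are my assumptions for illustration, not anything Amazon publishes in this form.

```python
# Tier tables taken from the two price lists above: (tier size in TB, $ per GB).
# A size of None means "everything beyond the previous tiers".
NEW_TIERS = [(10, 0.17), (40, 0.13), (100, 0.11), (None, 0.10)]
OLD_TIERS = [(10, 0.18), (40, 0.16), (None, 0.13)]

GB_PER_TB = 1000  # assumption: decimal terabytes


def egress_cost(tb_out, tiers):
    """Return the monthly data-transfer-out charge in dollars."""
    remaining = tb_out
    total = 0.0
    for size_tb, price_per_gb in tiers:
        # Fill the current tier, or take everything left in the open-ended tier.
        in_tier = remaining if size_tb is None else min(remaining, size_tb)
        total += in_tier * GB_PER_TB * price_per_gb
        remaining -= in_tier
        if remaining <= 0:
            break
    return total


def percent_saved(tb_out):
    """Percentage reduction of the new schedule relative to the old one."""
    old = egress_cost(tb_out, OLD_TIERS)
    new = egress_cost(tb_out, NEW_TIERS)
    return 100.0 * (old - new) / old


for tb in (1, 10, 50, 150, 500):
    print(f"{tb:>4} TB/month: old ${egress_cost(tb, OLD_TIERS):>10,.0f}  "
          f"new ${egress_cost(tb, NEW_TIERS):>10,.0f}  "
          f"saved {percent_saved(tb):4.1f}%")
```

Under these assumptions a 10 TB/month user saves roughly 5.6% while a 500 TB/month user saves roughly 20.7%, which is the skew toward large customers noted above.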

 

                                                --jrh

 

Wednesday, April 23, 2008 7:51:41 AM (Pacific Standard Time, UTC-08:00)
Services

Live Mesh has been under development for a couple of years now. Now it’s here in “technology preview” form. I think the first public mention was probably back in March of last year in a blog entry by Mary Jo Foley that mentioned Windows Live Core (http://blogs.zdnet.com/microsoft/?p=349). Last night Amit Mital, General Manager of Windows Live Core, did a blog entry that covers Live Mesh in more detail than previously seen: http://dev.live.com/blogs/devlive/archive/2008/04/22/279.aspx.

 

UPDATE: The report above attributing first mention of Windows Live Core to Mary Jo Foley was incorrect.  The sleuths at LiveSide appear to have reported this one first: http://www.liveside.net/blogs/main/archive/2007/03/25/windows-live-core.aspx.

 

Live Mesh is a platform for synchronizing data across devices and for deploying and managing apps that run on multiple devices. It also supports screen remoting, making all your devices and applications available from anywhere, and it strikes an interesting balance between features supported by cloud services and unique device capabilities. The initial device support is Windows only but Mac and other device clients are coming as well.

 

Screen shots are up on CrunchBase: http://www.crunchbase.com/product/windows-live-mesh.

 

Ray Ozzie did a 36 min Channel 9 interview with Jon Udell: http://channel9.msdn.com/showpost.aspx?postid=399578.

 

Abolade Gbadegesin, Live Mesh Architect, did a video on Live Mesh Architecture that is worth checking out: http://channel9.msdn.com/Showpost.aspx?postid=399577.

 

Demo video: http://www.on10.net/blogs/nic/Hands-on-with-Live-Mesh/.

 

                                                --jrh   

 

Wednesday, April 23, 2008 7:15:15 AM (Pacific Standard Time, UTC-08:00)
Services
 Tuesday, April 22, 2008

Here’s a statistic I love: Facebook is running 1,800 MySQL Servers with only 2 DBAs. Impressive. I love seeing services show how far you can go towards admin-free operation. 2:1,800 is respectable and for database servers it’s downright impressive. This data is from a short but interesting report at: http://www.paragon-cs.com/wordpress/?p=144.

 

The Facebook fleet has grown fairly dramatically of late. For example, Facebook is the largest Memcached installation, and the most recent reports I had come across had 200 Memcached servers at Facebook. At the Scaling MySQL panel, they reported 805 Memcached servers.

 

1,800 MySQL Servers, insulated by 805 Memcached servers, and driven by 10,000 web servers. Smells like success.

 

                                                --jrh

 

Thanks to Dare Obasanjo for pointing me to this one.

 

Tuesday, April 22, 2008 7:36:00 AM (Pacific Standard Time, UTC-08:00)
Services
 Monday, April 21, 2008

Back in March I speculated that Google was soon to announce a third party service platform. Well, on the evening of April 7th, Google App Engine was announced. It’s been heavily covered over the last couple of weeks and I’ve been waiting to get a beta account so I can write some code against it. I’ve not yet got an account but Sriram Krishnan has been playing with it and sent me the following excellent review.

 

·         Guest book development video: Developing and deploying an application on Google App Engine (9:29)

·         Techcrunch: Google Jumps Head First Into Web Services With Google App Engine.

·         Google App Engine Limitations: evan_tech.

·         What’s coming up: We're up and Running!

·         High Scalability: Google App Engine – A second Look

 

Sriram’s review of Application Engine.

 

Overall

-       It’s well designed from end to end, builds on a good ecosystem of tools, and covers most scenarios for a typical web 2.0 app. If I were to ever get into the Facebook-app writing business, AppEngine would be my first choice. However, any startup which requires code to execute outside the web request-reply cycle is out of luck and would need to use EC2.

-       The mailing list is overflowing so there is obviously huge community interest and lots of real coders building stuff.

-       The datastore is a bit wonky for my taste. It neither fits into SQL/RDBMS nor the clean spreadsheet model of Amazon SimpleDb. It’s an ORM with some querying thrown in, and that leads to some abstraction leakages. The limitations on queries are going to take a bit of getting used to since they’re not intuitive at all (they only support queries where they can scan the index sequentially for results, and the choice of datatype is not straightforward). The datastore was the area where I found myself consulting the docs most frequently.

-       Python-only is probably a big con at the moment. I’m a big Python fan but it’s pretty apparent that a lot of people want PHP and Ruby. However, when you poke around the framework, it is equally apparent that it is built to be language agnostic and that the creators had support for other languages in mind from the beginning.

-       Lack of SSL support and lack of unique IPs per app instance are other problems. The latter really kicks in when you’re calling other Web 2.0 APIs. A lot of them do quota calculations based on IP address and this won’t work when you’re sharing your IP with a bunch of other apps. Lack of SSL support is not a blocker for simple apps (since you can use Google’s inbuilt authentication system) but will block any serious app.

-       The beta limits are too conservative and they are too aggressive in enforcing them. They kept nuking my benchmarking apps for relatively short bursts of activity (more on that later). This really makes me hesitate to put anything non-trivial on AppEngine. If I were them, I would loosen up these limits or get customers to pay a bit extra for more CPU/network slices.

 

The Web Framework

-       I’m familiar with Python and Django so I’m probably not the best person to judge the learning curve. It’s very clean and usable (I like it much better than ASP.NET) and I found myself being reasonably productive within a few minutes.

-       They have also put hooks in so that you can use almost any Python framework of your choice with a bit of work, so you’re not stuck with the one provided. On the mailing list, there’s a lot of activity around porting other frameworks (pylons, web.py, cherrypy, etc.) to AppEngine. If it were up to me, I would be using Aaron Swartz’s web.py but that is more a stylistic personal preference.

-       Python was not originally designed to be sandboxed so Google had to make some major cuts to make it ‘safe’ – they don’t allow opening sockets for example. This has caused a lot of open source Python code to stop working – essential libraries like urllib (the equivalent of .net’s HttpWebRequest) need some porting work.

-       The tools support is a bit sparse: debugging is mostly through printf/exception stack traces. However, what it lacks in tooling is made up for in the speed of its edit cycle. Just edit a .py file and then refresh the page.

-       Some people are going to have trouble getting used to the lack of sessions but I think the pain will be temporary (some people have started working on using the datastore as a Django session store to hold session state). From my limited testing, I didn’t see much machine affinity. Google seems happy to spin up processes on different machines and kill them the moment they finish serving the request.

 

The Datastore

-       You specify your data models in Python and there’s some ORM magic that takes place behind the scenes. They have a few inbuilt data types and you can use expando (dynamic) properties to assign properties at runtime which haven’t been defined in your model. Data schema versioning is a big question mark at the moment. If I were Google, I would look into supporting something like RoR’s migrations.

-       Querying is done through a SQL-subset called GQL on specifically defined indexes. For a query to succeed, the query must be supported by an index and the scan needs to find sequential results and this puts some restrictions on the kinds of queries you can execute (you can’t have inequality operators on more than one attribute, for example). Several indexes are auto-generated and you can request others to be created.

-       Entities can be grouped together through ReferenceProperties into groups. Each group is stored together. Queries within one group can be bunched together into a transaction (everything is optimistic concurrency by default). Bunching together lots of entities into one group is bad since Google seems to do some sort of locking on the entity group – the docs say some updates might fail.

-       No join support. Like SimpleDb, they suggest de-normalization.

-       The datastore tools are sparse at the moment. I had to write code to delete stale data from my datastore since the website would only show me 20 items at a time.

-       All the APIs (the datastore, user auth, mail) are offered through Google’s internal RPC mechanism. Google calls the individual RPC messages protocol buffers, and all the AppEngine APIs are implemented using stubs generated from them (this is what you get with the local SDK as well).
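The "sequential index scan" restriction above is easier to see with a toy model. What follows is a minimal sketch in plain Python, not the App Engine API: the index contents, the `scan` function, and the property names are all made up for illustration. It assumes a composite index is just a sorted list of tuples, and shows that an equality filter plus one inequality maps to a single contiguous run of index rows, while inequalities on two properties do not.

```python
import bisect

# A composite index on (author, rating), kept sorted like an on-disk index.
# Each row is (author, rating, entity_key); the data is invented.
INDEX = sorted([
    ("alice", 2, "k1"), ("alice", 5, "k2"), ("alice", 9, "k3"),
    ("bob", 1, "k4"), ("bob", 7, "k5"), ("carol", 4, "k6"),
])


def scan(author, min_rating):
    """Equality on author plus one inequality on rating.

    All matching rows are adjacent in the index, so the query is one
    binary search followed by a sequential read.
    """
    start = bisect.bisect_left(INDEX, (author, min_rating, ""))
    results = []
    for row_author, _rating, key in INDEX[start:]:
        if row_author != author:
            break  # left the contiguous run; nothing further can match
        results.append(key)
    return results


def matching_positions(predicate):
    """Index positions of every row that satisfies the predicate."""
    return [i for i, row in enumerate(INDEX) if predicate(row)]


def is_contiguous(positions):
    """True when the positions form one unbroken run."""
    return positions == list(range(positions[0], positions[-1] + 1))


print(scan("alice", 5))

# Inequalities on two different properties: the matches are scattered,
# so no single sequential scan of this index can return exactly them.
two_inequalities = matching_positions(lambda r: r[0] >= "alice" and r[1] >= 5)
print(two_inequalities, is_contiguous(two_inequalities))
```

If the real datastore works anything like this, it would explain why a second inequality is rejected rather than silently executed as a slow filter.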

 

Benchmarks

This section is woefully short. It is very hard to run benchmarks since Google will keep killing apps with high activity. Here’s what I got:

 

-       Gets/puts/deletes are all really fast. I benchmarked a tight loop running a fixed number of iterations, each query operating on a single object or retrieving a single object (which I kept tuning to avoid hitting the Google limits). Each averaged 0.001 s (next to nothing, almost noise).

-       Turning up the number of results to retrieve meant a linear increase in numbers. I inserted multiple entities with just a single byte in each to have the least possible serialization/de-serialization overhead.  For 50 results, the query execution time was around 0.15s, for 100, around 0.30s and so on. I saw a linear increase all the way until I hit Google’s limits on CPU usage.

-       I can’t measure this correctly but a ballpark guesstimate is that Google nukes your app if you use up close to 100% CPU (by running in a tight loop like I did) for over 2 seconds for any given request. For every app, they tell you the number of CPU cycles used (a typical benchmark app cost me around 50 megacycles) and I think they do some quota calculations based on megacycles used per second.
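As a rough sanity check on the linear claim, the two data points quoted above can be turned into a per-result cost. This is only a sketch: the sample values are the ones in the review, while the function names and the straight-line extrapolation are my assumptions, and the numbers are wall-clock times rather than the CPU megacycles Google actually meters.

```python
# Two measurements from the review: (results returned, seconds).
SAMPLES = [(50, 0.15), (100, 0.30)]


def per_result_seconds(samples):
    """Slope of the line through the two measurements."""
    (n1, t1), (n2, t2) = samples
    return (t2 - t1) / (n2 - n1)


def estimated_query_seconds(n_results, samples=SAMPLES):
    """Straight-line extrapolation; only meaningful while scaling stays linear."""
    slope = per_result_seconds(samples)
    (n1, t1), _ = samples
    intercept = t1 - slope * n1
    return intercept + slope * n_results


print(f"{per_result_seconds(SAMPLES) * 1000:.1f} ms per result")
print(f"{estimated_query_seconds(500):.2f} s for 500 results")
```

That works out to about 3 ms per returned entity with essentially no fixed overhead, so a 500-result query would already be in the neighborhood of 1.5 s.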

 

Overall, perf seems excellent but I would worry about hitting quota limits due to a Digg/Slashdot effect. I plan on trying out some more complex queries and I’ll let you know if I see something weird.

 

The Tools

-       The dashboard is excellent. Gives you nice views on error logs, what’s in the datastore, usage patterns for all your important counters (requests, CPU, bandwidth, etc.).

-       Good end-to-end flow for the common tasks: registering a domain and assigning it to your application, managing multiple versions of your app, looking at logs, etc.

 

Monday, April 21, 2008 4:59:22 AM (Pacific Standard Time, UTC-08:00)
Services
 Friday, April 18, 2008

In the Rules of Thumb post, I argued that many of the standard engineering rules of thumb are changing. On a closely related point, Nishant Dani and Vlad Sadovsky both pointed me towards The Landscape of Parallel Computing Research: A View from Berkeley by David Patterson et al. Dave Patterson is best known for foundational work on RISC and for co-inventing RAID. He has an amazing ability to spot a problem where the solution is near and the problem is worth solving, and then come up with practical solutions. This paper has many co-authors but shows some of that same style. It focuses on parallel systems and on some of the conventional wisdom that has driven systems designs for some time and is no longer correct. The Berkeley web site with more detail is at: http://view.eecs.berkeley.edu/wiki/Main_Page.

 

In the paper they argue that 13 computational kernels can be used to characterize most workloads. Then they go on to observe that over ½ of these kernels are memory bound today and that they expect more to be in the future. In effect, the problem is getting data up the storage and memory hierarchy to the processors, not the speed of the processors themselves. This has been true for years and the problem worsens each year, and yet it still seems that the problem gets less focus than scaling processor speeds even though the latter won’t help without the former.

 

If you are interested in parallel systems, it’s worth reading the paper.  I’ve included the key changes in conventional wisdom below:

 

1. Old CW: Power is free, but transistors are expensive.

· New CW is the “Power wall”: Power is expensive, but transistors are “free”. That is, we can put more transistors on a chip than we have the power to turn on.

2. Old CW: If you worry about power, the only concern is dynamic power.

· New CW: For desktops and servers, static power due to leakage can be 40% of total power. (See Section 4.1.)

3. Old CW: Monolithic uniprocessors in silicon are reliable internally, with errors occurring only at the pins.

· New CW: As chips drop below 65 nm feature sizes, they will have high soft and hard error rates. [Borkar 2005] [Mukherjee et al 2005]

4. Old CW: By building upon prior successes, we can continue to raise the level of abstraction and hence the size of hardware designs.

· New CW: Wire delay, noise, cross coupling (capacitive and inductive), manufacturing variability, reliability (see above), clock jitter, design validation, and so on conspire to stretch the development time and cost of large designs at 65 nm or smaller feature sizes. (See Section 4.1.)

5. Old CW: Researchers demonstrate new architecture ideas by building chips.

· New CW: The cost of masks at 65 nm feature size, the cost of Electronic Computer Aided Design software to design such chips, and the cost of design for GHz clock rates means researchers can no longer build believable prototypes. Thus, an alternative approach to evaluating architectures must be developed. (See Section 7.3.)

6. Old CW: Performance improvements yield both lower latency and higher bandwidth.

· New CW: Across many technologies, bandwidth improves by at least the square of the improvement in latency. [Patterson 2004]

7. Old CW: Multiply is slow, but load and store is fast.

· New CW is the “Memory wall” [Wulf and McKee 1995]: Load and store is slow, but multiply is fast. Modern microprocessors can take 200 clocks to access Dynamic Random Access Memory (DRAM), but even floating-point multiplies may take only four clock cycles.

8. Old CW: We can reveal more instruction-level parallelism (ILP) via compilers and architecture innovation. Examples from the past include branch prediction, out-of-order execution, speculation, and Very Long Instruction Word systems.

· New CW is the “ILP wall”: There are diminishing returns on finding more ILP. [Hennessy and Patterson 2007]

9. Old CW: Uniprocessor performance doubles every 18 months.

· New CW is Power Wall + Memory Wall + ILP Wall = Brick Wall. Figure 2 plots processor performance for almost 30 years. In 2006, performance is a factor of three below the traditional doubling every 18 months that we enjoyed between 1986 and 2002. The doubling of uniprocessor performance may now take 5 years.

10. Old CW: Don’t bother parallelizing your application, as you can just wait a little while and run it on a much faster sequential computer.

· New CW: It will be a very long wait for a faster sequential computer (see above).

11. Old CW: Increasing clock frequency is the primary method of improving processor performance.

· New CW: Increasing parallelism is the primary method of improving processor performance. (See Section 4.1.)

12. Old CW: Less than linear scaling for a multiprocessor application is failure.

· New CW: Given the switch to parallel computing, any speedup via parallelism is a success.
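Items 9 and 10 are really a statement about compounding, and a few lines of arithmetic show how big the difference is. This is a minimal sketch: the 18 month and 5 year doubling periods are the ones quoted in the list, while the function name and the time horizons are my own choices for illustration.

```python
def speedup(years, doubling_years):
    """Performance multiple after `years` if performance doubles every `doubling_years`."""
    return 2.0 ** (years / doubling_years)


OLD_DOUBLING_YEARS = 1.5  # "doubles every 18 months"
NEW_DOUBLING_YEARS = 5.0  # "may now take 5 years"

for years in (5, 10, 15):
    old = speedup(years, OLD_DOUBLING_YEARS)
    new = speedup(years, NEW_DOUBLING_YEARS)
    print(f"after {years:>2} years: old CW {old:7.1f}x, new CW {new:4.1f}x")
```

Over a decade the old rule gives roughly 100x from sequential hardware alone, and the new one gives 4x. That gap is why "just wait for a faster machine" stops being a plan.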

 

Friday, April 18, 2008 4:42:25 AM (Pacific Standard Time, UTC-08:00)
Software
 Wednesday, April 16, 2008

How do you ensure that data written to disk is REALLY on disk? Yeah, I know, this shouldn’t be hard but the I/O stack is deep, everyone is looking for performance, everyone is caching along the way, so it’s more interesting than you might like. If you are writing code that needs reliable write-through semantics, such as write-ahead logging, then you need to ensure you are writing through to media. If you are writing to a SAN or SCSI, it’s pretty straightforward, but if you are using EIDE or SATA, then things get a bit more interesting. What follows is Windows-specific but you need to be aware of these issues on non-Windows systems as well.

 

If it’s a SCSI disk (not SATA or EIDE), then setting FILE_FLAG_WRITE_THROUGH and FILE_FLAG_NO_BUFFERING is sufficient. FILE_FLAG_WRITE_THROUGH forces all data written to the file to be written through the cache directly to disk. All writes are to the media. FILE_FLAG_NO_BUFFERING ensures that all reads come directly from the media as well by preventing any read ahead and disk caching. What’s happening behind the scenes when these parameters are specified on CreateFile() is that the filesystem and memory manager are not caching and Force Unit Access (FUA) is being sent to the device on writes to ensure they go directly to the media rather than being cached in the device cache.

 

The reason the above is not typically sufficient with EIDE and SATA drives is that FUA is dropped by the standard SATA and EIDE miniport driver.  The filesystem and memory manager will respect the parameters but the device will likely still cache writes without FUA.

 

FUA is dropped for performance reasons since SATA and EIDE can only process one command at a time and the full flush required by FUA is slow. SCSI can process multiple commands in parallel and the flush is less expensive. Is Native Command Queuing (NCQ) the solution to the performance problem? Unfortunately, no. NCQ allows multiple commands to be sent to the drive and gives the drive flexibility in what order to execute them, but the restriction of only one command executing at a time remains.

 

What’s the solution when you need guaranteed writes on commodity disks? The simple answer is to set the registry flag that turns off the discarding of FUA. This solves the correctness problem but at considerable performance expense. Essentially this will be semantically correct but slow due to the SATA single-command limitation and the length of time it takes to go directly to the media. Shutting off Write Cache Enable (WCE) on a per-drive basis is another option.

 

Another option is FlushFileBuffers(), which is a call fully honored by all device types. FlushFileBuffers() takes a file handle argument, flushes the filesystem/memory manager cache for that handle, and flushes the entire volume that holds that file. This again works but is broader than required in that the entire device cache will get flushed. I’m told that you can also use FLUSH_CACHE on the device as an alternative to FlushFileBuffers() on a handle. A paper that shows the use of FLUSH_CACHE to achieve correct write ahead logging semantics is up at: Enforcing Database Recoverability on Disks that Lack Write-Through. In this paper, using SQL Server running a mini-TPC-C as a test case, the authors measure a performance degradation of as little as 2% using FLUSH_CACHE calls to the device as needed. A small price to pay for correctness.
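For anyone not calling the Win32 API directly, here is a minimal sketch of the flush-based approach in Python. The Python documentation states that os.fsync() calls the Microsoft _commit() function on Windows, which, as far as I know, is implemented on top of FlushFileBuffers(); on Unix it calls fsync(). The helper names, the file names, and the toy "log then data" layout are hypothetical, and whether the device cache really gets flushed still depends on the driver behavior described above.

```python
import os
import tempfile


def append_durably(path, record):
    """Append one log record and return only after asking the OS to flush it."""
    with open(path, "ab") as log:
        log.write(record)
        log.flush()             # drain Python's user-space buffer
        os.fsync(log.fileno())  # ask the OS to push the file's data to the device


def apply_update(log_path, data_path, record):
    """Write-ahead rule: force the log record before touching the data file."""
    append_durably(log_path, record)
    with open(data_path, "ab") as data:
        data.write(record)      # the data write itself can stay lazily cached


workdir = tempfile.mkdtemp()
log_path = os.path.join(workdir, "wal.log")    # hypothetical file names
data_path = os.path.join(workdir, "data.bin")
apply_update(log_path, data_path, b"set x=1\n")
```

The point of the ordering is that only the log append pays for the flush. The data file can be written lazily because the log is enough to redo the update after a crash.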

 

                             --jrh

 

Wednesday, April 16, 2008 5:59:43 PM (Pacific Standard Time, UTC-08:00)
Software
 Monday, April 14, 2008

Wow, the pace is starting to pick up in the service platform world. Google announced their long awaited entrant with Google App Engine last Monday, April 7th. Amazon announced SimpleDB to answer the largest requirement they were hearing from AWS customers: persistent, structured storage. Yesterday, another major step was made with Werner Vogels announcing availability of persistent storage for EC2.

Persistence for EC2 is a big one. I’ve been amazed at how hard customers were willing to work to get persistent storage in EC2. The most common trick is to periodically snapshot the up to 160GB of ephemeral state allocated to each Amazon EC2 instance to S3. This does work but is very clunky, and losing all changes between the last snapshot and a non-orderly shutdown is a bit nasty. A solution I like is a replicated block storage layer like DRBD. One innovative solution to all EC2 state being transient is to use DRBD to maintain a replicated file system between two EC2 instances. Not bad. In fact I really like it, but it’s hard to set up and, last time I checked, only supported 2-way redundancy when 3 is where you want to be when using commodity hardware.

 

It appears the solution is (nearly) here with EC2 persistence. The model they have chosen uses the storage volume as the abstraction. Any number of storage volumes can be created in sizes of up to 1TB. Each storage volume is created in a developer specified availability zone and each volume supports snapshots to S3. A volume can be created from a snapshot. The supported redundancy and recovery models were not specified but I would expect that they are using redundant, commodity storage. Werner did say it was file system semantics, which I interpret as cached, asynchronous writes with optional application-controlled write-through/flush. It is not clear if shared volumes are supported (multiple EC2 instances accessing the same volume).

 

Another blog entry from Amazon demonstrates usage: “I spent some time experimenting with this new feature on Saturday. In a matter of minutes I was able to create a pair of 512 GB volumes, attach them to an EC2 instance, create file systems on them with mkfs, and then mount them. When I was done I simply unmounted, detached, and then finally deleted them.”

 

Unfortunately, persistent storage for EC2 won’t be available until “later this year” but it looks like a good feature that will be well received by the development community.

 

Update: This may be closer to beta than I thought. I just (5:52am 4/14) received a limited beta invitation.

 

                                                --jrh

 

Thanks to David Golds and Dare Obasanjo for sending pointers my way.

 

Monday, April 14, 2008 4:37:39 AM (Pacific Standard Time, UTC-08:00)
Services
 Saturday, April 12, 2008

The only thing worse than no backups is restoring bad backups. A database guy should get these things right.  But, I didn’t, and earlier today I made some major site-wide changes and, as a side effect, this blog was restored to December 4th, 2007.  I’m working on recovering the content and will come up with something over the next 24 hours. However it’s very likely that comments between Dec 4th and earlier today will be lost.  My apologies.

 

Update 2008.04.13: I was able to restore all content other than comments between 12/4/2007 and yesterday morning.  All else is fine.  I'm sorry about the RSS noise during the restore and for the lost comments.  The backup/restore procedure problem is resolved.  Please report any broken links or lingering issues. Thanks,

 

                        -jrh

 

 
