Last year, I pointed out that Amazon has a highly diversified DBMS strategy. Now Mike Vizard has a great interview with Werner Vogels, Amazon’s CTO, where he unearths a lot more detail. And it turns out that Amazon has been a hardcore adopter of DBMS2, since long before DBMS2 was named.
This is most easily seen from the interview in two aspects. One is Amazon’s extreme commitment to SOA — again, since long before SOA was named. The other is in this passage, directly about data management diversity:
Let me step back a bit. We have to kind of think about what storage means. Are we really talking about block level storage, using storage area networks or ANSCSI(?) and things like that? One important thing to realize there is that with our ownership model, we ... Read more
Borrowing the “Fact or fiction?” meme from the sports world:
Data warehouse appliances have to have specialized hardware. Fiction. Indeed, most contenders except Teradata and Netezza — for example, DATAllegro, Vertica, ParAccel, Greenplum, and Infobright — offer Type 2 appliances. (Dataupia is another exception.)
Specialized hardware is a dead-end for data warehouse appliances. Fiction. If it were easy for Teradata to replace its specialized switch technology, it would have done so a decade ago. And Netezza’s strategy has a lot of appeal.
Data warehouse appliances are nothing new, and failed long ago. Fiction, but only because of Teradata. 1980s appliance pioneer Britton-Lee didn’t do so well (it was actually bought by Teradata). IBM an... Read more
SAS has its own data store, called SAS Intelligence Storage. It’s a relational system running on SMP boxes, whose unique feature is that it has fixed-length records and hence is a perfect array, for speedy lookup. This is highly analogous to classical MOLAP systems. However, SAS reports that customers store up to several hundred terabytes of data in SAS Intelligence Storage, which is definitely not very analogous to what goes on in the MOLAP world.
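To make the "perfect array" point concrete, here is a minimal sketch of why fixed-length records allow fast lookup: the byte offset of record n is simply n times the record size, so no index or scan is needed. The record layout, field names, and helper functions below are hypothetical illustrations, not SAS's actual storage format.

```python
import struct

# Hypothetical fixed-length layout (an assumption, not SAS's real format):
# 4-byte int id, 8-byte float amount, 12-byte ASCII name = 24 bytes per record.
RECORD = struct.Struct("<id12s")


def pack_records(rows):
    """Serialize (id, amount, name) rows into one contiguous byte buffer."""
    return b"".join(
        RECORD.pack(rid, amount, name.encode("ascii")) for rid, amount, name in rows
    )


def lookup(buf, n):
    """Fetch record n directly by computing its offset: n * record size."""
    rid, amount, name = RECORD.unpack_from(buf, n * RECORD.size)
    # Short names are padded with NUL bytes to keep every record the same length.
    return rid, amount, name.rstrip(b"\x00").decode("ascii")


buf = pack_records([(1, 9.5, "alpha"), (2, 3.25, "beta"), (3, 7.0, "gamma")])
second = lookup(buf, 1)
```

The trade-off in a layout like this is wasted space from padding, in exchange for constant-time positional access.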
It sounds as if the product is optimized for data mining and generic OLAP alike. Indeed, SAS Intelligence Storage is used to power both SAS’s data mining and other advanced analytics, and also its more conventional BI suite.... Read more
It’s unfortunate that Dataupia has concepts like “Utopia” and “Satori” in its marketing, as those serve to obscure what the company really offers – data warehouse appliances designed for the market’s low end. Indeed, it seems that they’re currently very low-end, because they were just rolled out in May and are correspondingly immature.
Basic aspects include:
Type 1 appliances, which most other data warehouse appliance vendors (Teradata excepted) have moved away from. And there actually seems to be very little special about the hardware design to take advantage of the proprietary opportunity.
Apparently limited redistribution of intermediate query result sets – i.e., the “fat head” architecture most competitors have moved away from. Bu... Read more
I’ve posted several times about Amazon as an innovative, super-high-end user: doing transactional object caching with ObjectStore, building an in-house less-than-DBMS called Dynamo, or just generally adopting a very DBMS2-like approach to data management. Now Amazon is bringing the Dynamo idea to the public, via a SaaS offering called SimpleDB. (Hat tip to Tim Anderson.)
SimpleDB is obviously meant to be a data server for online applications. There are no joins, and queries don’t run over 5 seconds, so serious analytics are out of the question. Domains are limited to 10GB for now, so extreme media file serving also isn’t what’s intended; indeed, Amazon encourages one to use SimpleDB to store pointers to larger objects stored as files in Amazon S3.
... Read more
I don’t know for a fact that the Amazon.com bookstore is the world’s biggest OLTP application — but if it isn’t, it’s close.
And the thing is — that’s never been an entirely relational application. Oh, the ordering part surely is. But the inventory lookup is currently driven by an OODBMS (from Progress). The personalization used to be done in Red Brick (I knew which software replaced it, but I’m forgetting at the moment — it may even be one of the relational warehouse appliance vendors). And of course the full-text search is a custom in-house system.... Read more
Amazon has a very decentralized technical operation. But even the individual pieces have interestingly huge scale. Thus, various different things they’re doing are of interest.
They recently presented a research paper on a high-performance transactional system called Dynamo. (Hat tip to Dare Obasanjo.) A key point is the following:
There are many services on Amazon’s platform that only need primary-key access to a data store. For many services, such as those that provide best seller lists, shopping carts, customer preferences, session management, sales rank, and product catalog, the common pattern of using a relational database would lead to inefficiencies and limit scale and availability. Dynamo provides a simple primary-key only interface to meet the requirements... Read more
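As a rough illustration of what "primary-key only" access means for something like a shopping cart, here is a minimal in-memory sketch. The class and method names are hypothetical and this is not Dynamo's actual API; it leaves out everything that makes Dynamo interesting at scale (partitioning, replication, versioning) and shows only the shape of the interface: values are stored and fetched by key, with no joins or secondary queries.

```python
class PrimaryKeyStore:
    """Hypothetical key-value store: the only access path is the primary key."""

    def __init__(self):
        self._items = {}

    def put(self, key, value):
        """Store or overwrite the value held under this key."""
        self._items[key] = value

    def get(self, key, default=None):
        """Return the value for this key, or `default` if the key is absent."""
        return self._items.get(key, default)


# Usage: a shopping cart keyed by customer id (illustrative data only).
carts = PrimaryKeyStore()
carts.put("customer-42", ["book-123", "book-456"])
cart = carts.get("customer-42")
missing = carts.get("customer-99")
```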
Please do not rely on the parts of this post that draw a distinction between in-memory and disk-based operation. See our February 18, 2008 post about ParAccel instead. It turns out that communication with ParAccel was yet worse than I had realized.
Officially launched today at the TDWI conference, ParAccel is out to compete with Netezza. Right out of the chute, ParAccel may have surpassed Netezza in at least one area: pointlessly annoying secrecy. (In other regards I love them dearly, but that paranoia can be a real pain.) As best I can remember, here are some things about ParAccel that I both am allowed to say and find interesting:
ParAccel offers a columnar, MPP data warehouse DBMS, called the ParAccel Analytic Database.
ParAccel’s product runs in two mai... Read more
I’m getting a flood of press releases today, because many of the companies I write about were selected to Intelligent Enterprise’s list of 12 most influential vendors plus 36 more to watch in the areas Intelligent Enterprise covers (which seems to be pretty much the analytics-related parts of what I write about here and on Text Technologies). It looks like a pretty reasonable list, although I think they forced the issue in some of the small analytics vendors they selected, and of course anybody can quibble with some of the omissions.
Among the companies they cited, you can find topical categories here for IBM (and Cognos), Informatica, Microsoft, Netezza, Oracle, SAP/Business Objects (both), SAS, and Teradata; QlikTech; Cast Iron, Coral8, DATAllegro, HP, ParAccel, and St... Read more
MySQL 4.0 is an OLTP joke. MySQL 5.0, however, shows a lot of progress in terms of real transactions, foreign keys, referential integrity, triggers, stored procedures and so on. In anticipation of the MySQL user conference next week, I got a quick briefing from Paola Lubet and Murat Demiroglu at Solid Information Technology, whose SolidDB is one of the two transactional storage engines for MySQL (the other is InnoDB, now owned by Oracle).
The layer provided by MySQL actually does most of what I think of as “language processing” – parsing, optimization, drivers, triggers, stored procedures, referential integrity, etc. SolidDB is a storage engine providing actual execution. Its features and virtues include:
• Online backup. (Note: Apparently, the extra-cost InnoDB o... Read more