There was once a big hoopla about the MySQL Storage Engine Architecture and how easy it was to just slot in some other method of storage instead of the provided ones. Over the years I’ve repeatedly mentioned that this wasn’t really the case and that it was remarkably non-trivial.
Over the years many storage engines have cropped up and then disappeared. So… where are they now?
- ISAM
This became MyISAM… you know you’ve been around MySQL a long time if you’ve ever had to deal with an ISAM table.
- Gemini
This was the first big test of the GPL in court. Basically, you have to obey the GPL (see Wikipedia for more info). The code was released as GPL and development stopped. This has been dead since ca. 2002.
- Amira – http://launchpad.net/amira
Antony first mentioned this in 2008 on his blog. This was a continuation of the Gemini engine; you can actually go over to Launchpad and get the code. This was one of the projects to have a transactional storage engine not owned by Oracle after Innobase Oy was acquired by them. It went nowhere special as Netfrastructure was acquired, which became Falcon.
- BDB
Otherwise known as the BerkeleyDB engine. It was seldom used and never gained much of a userbase. It was unceremoniously dropped back in 2006 and both users didn’t really exist.
- PBXT - http://pbxt.blogspot.com/
I think we can credit PBXT with at least half of the features and performance improvements to InnoDB since it first emerged back in 2006. It got attention very quickly. Why? Because it was different. It had the very rare ability to outperform InnoDB in some places. You can still find PBXT in MariaDB, but sadly it can be hard to fund development of a MySQL storage engine, especially one as tied to MySQL as PBXT is, and it’s no longer under active development. Closely related was the Blob Streaming project, which was way ahead of its time as an AlsoSQL access method. The good news is that the code was released under a BSD license in 2012 (it was previously GPL). We even had PBXT in Drizzle for a while.
- Blob Streaming (PBMS) - http://bpbdev.blogspot.com/
This project was closely related to (but not depending exclusively on) PBXT. It embedded an HTTP server inside the database and could use it to read and write BLOBs. This was not only fairly cool but way ahead of its time. We owe the existence of both HandlerSocket and the memcached interface to InnoDB to PBMS (it was also an inspiration for the JSON server plugin for Drizzle, to address some of the use cases of the PBMS plugin).
- Federated
It’s still there… but is effectively unmaintained and dead. There’s even FederatedX in MariaDB, which is an improvement, but still, the MySQL server really doesn’t lend itself kindly to this type of engine… it’s always been an oddity only suitable for very specific tasks.
- Archive
Although useful, effectively unmaintained. I kinda don’t want to say dead… but if it went away, I wouldn’t exactly be surprised.
- CSV
Currently used to access the log tables in MySQL… and hardly used otherwise. It’s odd that the same code doesn’t deal with SELECT INTO OUTFILE and LOAD DATA INFILE, and I doubt this will ever change. I’d say effectively niche/dead.
- SolidDB
Purchased by IBM, abandoned.
- DB2
Only ever on System i. Useful for very, very few people… but you can still find it around if you’re one of them.
- Infobright
OMG it exists! This is probably because they’re largely just using the MySQL server as a way to implement the MySQL network protocol and all of the heavy lifting is done by their own code.
- Xeround
I’m quite surprised these guys are still around, as they’re a proprietary storage engine as a service, and initial testing wasn’t entirely promising.
- TokuDB
I cannot emphasize enough how much more interesting TokuDB would be if it were open source. It actually holds some promise… and with their recent work with mongo, perhaps this is a good way forward for them…
- Maria/Aria
Another “OMG Oracle just bought Innobase Oy” engine. This was a project to take MyISAM and turn it into a lean, mean, transactional storage engine machine. It’s still not there and I don’t think it ever will be.
- Falcon
This was the hot new thing. It came out of Netfrastructure, which MySQL AB acquired in order to help get a transactional storage engine after Innobase Oy was acquired by Oracle. If you’re keeping count, that’s three projects for a transactional storage engine. Falcon was the star though, receiving all the press and publicity (well before it was ready). There are many reasons why Falcon isn’t around today – the chief one probably being that Oracle bought Sun, who had bought MySQL, and thus the need for an “InnoDB replacement” instantly vanished. There was also immense management pressure for performance to be greater than InnoDB, without any allowance for or focus on correctness… and this showed. This was quite disappointing as Falcon had a lot of good architectural things going for it.
- BlitzDB - https://launchpad.net/blitzdb
I had hoped we’d replace MyISAM with BlitzDB in Drizzle. It was a wrapper around Tokyo Cabinet to the storage engine API in Drizzle. Unfortunately, the ties to MyISAM are incredibly deep (see my recent post on internal temporary tables) and we never quite got there.
I think this is all the notable engines that were aimed at widespread adoption… what ones have I forgotten?
It’s interesting to note that only Archive, CSV, Xeround, TokuDB and Infobright can be gotten anywhere, and the latter two only in their own distribution (one proprietary) and Xeround only as a service.
MySQL supports multiple storage engines…InnoDB 1.1 and InnoDB 2.0. And MyISAM (for now…) ;) plug away.
Ahem.. Yeah.. A certain large non-Oracle ISV set a series of tests of which, as far as I know, only two storage engines for MySQL have successfully passed. One of them is InnoDB. The other is … not Falcon.
pretty much. I wonder how long that’ll last though – the InnoDB memcached plugin could have used the storage engine interface after all…
You didn’t mention the Spider storage engine, which is now abandonware AFAICT.
You forgot KFDB, the storage engine used by Kickfire. Some Kickfire appliances were shipped to customers, so it isn’t a phantom storage engine. Kickfire was purchased by Teradata and the appliance business was terminated.
Oh yeah… there was Kickfire… although that was all dependent on specific hardware, so it’s kinda not mass-market for everybody :)
I only remember it because I worked there :)
http://t.co/4LeOBt1POz — great truths by Stewart Smith about storage engines, he missed #sphinx, great mention of #tokudb #mysql #mariadb
My guess is that Falcon has been revived in NuoDB.
Three projects to replace InnoDB. Why was not a single one of them a fork of InnoDB?
Missed Sphinx SE and Akiban, and the Cassandra SE in the MariaDB 10 release. It’s certainly not trivial to write a storage engine.
Will we see you next week in Santa Clara?
Bill,
Spider just got a shiny new 3.0 release and is working on MariaDB integration. They added the ability to connect to remote Oracle tables too.
MySQL AB would want to sell commercial licenses and they’d only have GPL license to InnoDB code, so they wouldn’t have been able to.
That being said, the whole libmysqld thing was always a mess… the *CORRECT* way would be to have libmysql be able to fork() and exec() a mysqld….
Of course, not included on Stewart’s list are the many closed-source storage engines; one such storage engine was central to a mid-sized company for ten years, for example.
I assume you intentionally omitted NDBCLUSTER since it is still going strong. The MyISAM_MERGE engine could have been listed though, as all of its functionality has been replaced by partitioning.
http://t.co/zqibDerK2N # A pile of engines…
Heh, I once had something to say about all the transactional storage engines in MySQL:
http://code.openark.org/blog/mysql/tales-of-the-trade-1-a-day-in-the-life-of-a-mysql-instructor
You are becoming the MySQL historian :-) I like it!
BDB was also acquired by Oracle at about the same time as they bought InnoDB. That makes sense, as with that Oracle bought both of the existing transactional engines.
PBXT was secretly sponsored by MySQL/Sun, so really it was the secret fourth InnoDB replacement project. They say Sun would have acquired it if Oracle hadn’t bought Sun the day before. An interesting parallel universe to think about!
The list is missing InfiniDB, another columnar storage engine. They have a poor approach to handling communications with the community, so no wonder you forgot them.
Also missing are several one-developer engines, like the S3 engine and the wormhole engine.
Oh, one more thing: I seem to remember that DB2 engine is still supported by Zend, and Percona helps Zend do that. You should know!
Yep, i’ll be there next week.
The only experience I’ve had with the Falcon engine is seeing it core dump as soon as I started any benchmark test.
Rgds,
-Dimitri
The S3 storage engine is missing: http://fallenpegasus.com/code/mysql-awss3/
And PBXT is now considered a “legacy storage engine” by MariaDB:
https://kb.askmonty.org/en/legacy-storage-engines/
(together with the also missing IBM DB2i storage engine)
Whoops, DB2 is on your list already. Ignore me :)
This blog post made me nostalgic for 2009. http://t.co/ADSafPaqH0 I’m gonna have to hug Stewart at #PerconaLive. http://t.co/Kn48MkdYx9
There were more — NitroDB, for example.
Don’t even mention all the community engines for some-specific-task-you-don’t-really-need-but-8-people-in-the-world-need.
You could argue that this is evidence of a bad idea, but I think the size of the marketplace means it was a success. http://t.co/rBdhvvb76t
A history overview of exotic MySQL storage engines : http://t.co/g53VX38PPf
I heard of some of these MySQL storage engines for the first time, e.g. #Gemini, #Amira. Nice post Stewart Smith
http://t.co/dV13dlHA5q
You did not mention MEMORY, which had a couple of local hybrids.
You also forgot BlackHole, which is relatively widely used.
You are as forgetful as when you were with us …. ;-)
OQgraph too. It is still being worked on.
Pingback: The MERGE storage engine: not dead, just resting…. or forgotten. | Ramblings
Maybe you didn’t mention ScaleDB (http://www.scaledb.com/), because they are still in development.
Otherwise, thanks for the summary. The MySQL storage engine boom was really fun! I have many great memories of that time.
Pingback: The new CONNECT Storage Engine with MariaDB 10.0.2 « Serge Frezefond 's blog
Pingback: TokuDB | Ramblings
Pingback: MariaDB CONNECT Storage Engine vs FEDERATED(X) « Serge Frezefond 's blog
I have been compiling them for some time for the Spanish Wikipedia. So far I have found:
2.1 Archive
2.2 Aria
2.3 AWSS3
2.4 BDB
2.5 Blackhole
2.6 Cassandra SE
2.7 ClouSE
2.8 Connect
2.9 DDE-GAN
2.10 CSV
2.11 Example
2.12 Federated
2.13 Federated/X
2.14 IBMDB2I
2.15 InfiniDB
2.16 Infobright
2.17 InnoDB
2.18 Mdbtools
2.19 MemcacheDB
2.20 Memory
2.21 Merge
2.22 Mroonga
2.23 MyBS
2.24 MyISAM
2.25 NDB
2.26 OQGraph
2.27 PBXT
2.28 Q4M
2.29 RitmarkFS
2.30 ScaleDB
2.31 SphinxSE
2.32 Spider
2.33 TokuDB
2.34 XtraDB
Some are discontinued, some exotic and some mainstream…
Re Archive, it was based on a false premise – namely that disk I/O was the bottleneck. I proved with my insert test tool that the actual bottleneck is CPU first – even when you clean up the parser overhead compared to 5.0 (as Mark Callaghan did).
So that makes ARCHIVE irrelevant in terms of writes.
Of course the data will take up less disk space, which might be relevant for very large datasets, and table scans for reads are faster than MyISAM in that table size range (because there’s less disk I/O).
In a nutshell, once you take out the false “reduce write I/O” hypothesis, the actual use case is very limited.
ARCHIVE was always a limited use case – and really won on reduced disk usage for highly compressible data sets.
now that is a pretty impressive list!
Pingback: The MySQL Cluster storage engine | Ramblings