Drizzle Meeting Photos

Posted on 2009-09-18 by Stewart Smith

I didn’t take many photos at the Drizzle Meeting, although I did take a couple at the end at the Hopvine (just down the road from Brian’s place).

A good read is Brian’s wrap up of the meeting.

But we have (courtesy of Brian):

and a couple I took at the Hopvine:

The last one there of Lee looks almost scary… strange light, moving subject, all part of the fun :)

It was really good to get a number of people together and chat. In future, we’ll no doubt have larger gatherings that are really inclusive.

stringstream is completely useless (and why C++ should have a snprintf)

Posted on 2009-08-04 by Stewart Smith

It’s easy to screw up thread safety.
If you’re trying to format something for output (e.g. leading zeros, only 1 decimal place or whatever… you know, format specifiers in printf) you are setting a property on the stream, not on what you’re converting. So if you have a thread running that sets a format, adds something to the stream, and then unsets the format, you cannot have another thread able to come in and do something to that stream. Look out for thread unsafe cout code.
You cannot use streams for any text that may need to be translated.
gettext is what everybody uses. You cannot get a page into the manual before it tells you that translators may want to change the order of what you’re printing. This goes directly against stringstream.
You need another reason? Number 2 rules it out for so much handling it’s not funny.

Table discovery for Drizzle (take 2, now merged!)

Posted on 2009-07-29 by Stewart Smith

Table discovery looks a bit different from the previous time I blogged about it. Everything is now just hanging off the StorageEngine. If you want to not have dfe files on disk and just use your own data dictionary, you need to implement two things:

A method to get table metadata
A iterator over table names in a database in your engine

I’ve done this for the ARCHIVE storage engine (and that’s in Drizzle trunk now), and have been reading up on the Embedded InnoDB docs to see their API to the InnoDB data dictionary and am rather excited about getting it going at some point in the future (feel free to beat me to it and submit a patch though!)

LCA2010 CFP Closing soon!

Posted on 2009-07-24 by Stewart Smith

http://www.lca2010.org.nz/ and get those proposals in before time runs out!
We’re looking for all sorts of talks, so get yours in while you still can!

(Database stuff is damn cool, kernel is kewl, app development is awesome and web is wonderful)

Debian unstable on a Sun Fire T1000

Posted on 2009-07-09 by Stewart Smith

So i got the T1000 working again (finally, after much screwing about trying to get the part). I then hit the ever annoying “no console” problem, where the console didn’t work – kind of problematic.

After a firmware upgrade, and passing “console=/dev/ttyS0” to the kernel, things work.

So the T1000 firmware 6.3 doesn’t work with modern debian kernels. Thing swork with 6.7 though.

Dogfooding a pastebin

Posted on 2009-07-03 by Stewart Smith

http://pastebin.flamingspork.com/

A pastebin running Drizzle andÂ the Drizzle PHP Extension (which is on top of libdrizzle).

On my way to MySQL Conference and Expo, Drizzle Developer Day all in sunny California

Posted on 2009-04-17 by Stewart Smith

Well… Santa Clara – not as cool as California leads you to believe…. but it’ll be an awesome week. See you all there soon.

Drizzle low-hanging-fruit

Posted on 2009-04-16 by Stewart Smith

We have an ongoing Drizzle milestone called low-hanging-fruit. The idea is that when there’s something thatÂ could be done, but we don’t quite have the time to do it immediately, we’ll add a low-hanging-fruit blueprint so that people looking to get a start on the codebase and contributing code to Drizzle have a place to go to find things to do.

Some of my personal favourites are:

Also relatively low hanging fruit can be writing some plugins. Some simple plugin types include:

Authentication
Got somewhere that you could authenticate against for connecting to a DB? Write a plugin for it! Current auth plugins are auth_http and auth_pam.
- Perhaps you want to authenticate against a central DB? checking in memcached first?
- Perhaps a htaccess style method
Functions
Apply some function to a column. These are pretty simple to write (see md5, compress examples). Perhaps interfaces to encryption/decryption? a hashing function?
- ROT13
- 3DES
- AES
  Bonus points if you get any of these to use the T2000 crypto accellerator stuff
- ID3 tag decoding
- file type detection (well.. BLOB)

So there’s a fair bit you can do to get started. Best of all, you can chat with the Drizzle developers next week at the MySQL Conference and Expo and Drizzle Developer Day.

Drizzle Developer Day reminder

Posted on 2009-04-16 by Stewart Smith

We’re having a Drizzle Developer Day just after the MySQL Conference and Expo next week. You don’t have to be attending the conference to come to the Drizzle Developer Day. Just bring your enthusiasm for free databases, Drizzle and good software. Spaces are limited, so head on over to the signup page and fill in your name if you haven’t already

If coming from the Hyatt Regency Santa Clara (where the MySQL Conference and Expo is), at least I will be driving from there, so let me know if you want a lift.

Using Dtrace to find out if the hardware or Solaris is slow (but really just working around the problem)

Posted on 2009-04-08 by Stewart Smith

A little while ago, I was the brave soul tasked with making sure Drizzle was working properly and passing all tests on Solaris and OpenSolaris. Brian recently blogged about some of the advantages of also running on Solaris and the SunStudio compilers – more warnings from the compiler is a good thing. Many kudos goes to Monty Taylor for being the brave soul who fixed most of the compiler warnings (and for us, warnings=errors – so we have to fix them) for the SunStudio compilers before I got to making te tests work.

So, I got to the end of it all and got pointed to an OpenSolaris x86 box where the drizzleslap test was timing out. The timeout for tests is some amazingly long amount of time – 15 minutes. All the drizzle-test-run tests are rather short tests.

To make running the tests quick, I usually LD_PRELOAD libeatmydata – a simple way of disabling pesky things like fsync that take a long time (rumors that I nickname it libmacosxsimulation are entirely true). It’s pretty simple to build libeatmydata on Solaris too (I periodically do this and always intend to check in the associated Makefile but never do).

Unfortunately, on OpenSolaris a bunch of things are built 32bit and others 64bit and just doing “LD_PRELOAD=libeatmydata.so ./dtr” doesn’t work – I’d have to modify the test script to only do the LD_PRELOAD for drizzled – which is annoying.

On my T1000 running Debian, the drizzleslap test takes 42 seconds to complete with libeatmydata, or 393 seconds when it’s really doing fsyncs. So for it to be timing out on this OpenSolaris x86 box – i.e. taking more than 15 minutes, was strange.

So… what was going on? Step 1: is anything actually going on? One way to test this is to see if disk IO is being generated. On Linux, we can use “iostat”. On Solaris, we can use “zpool iostat”. Things were going to disk for the whole time of the test. Time to compare what the difference between the platforms is.

Well.. a typical way that tests have taken forever have been because of lots of transactions: i.e. lots of fsync(). You are then dependent on the fsync() performance.

If we look at “iostat -x” and the avgrq-sz field on Linux, we’ll see that the average request size is on 10-12 sectors (512 byte blocks). i.e. about 5 or 6kb.

If we look at “zpool iostat 1” on OpenSolaris, we see a bit of a different story, but similar enough that you could safely assume that lots of small synchronous IOs were going on. After a bit of reading of the ZFS on-disk format documents, I had a slightly better idea what was going on that could be causing me seeing a larger average request size on ZFS than on Linux with XFS.

So… perhaps it’s the speed of these syncs? Ordinarily, I’d just write up a quick LD_PRELOAD library that wraps fsync() and times it (perhaps writing to a file so I could do analysis on it later). Since I was working on Solaris… I thought I’d try DTrace. Some google-foo and dtrace hacking later, I tried this:

stewart@drizzle-dev:~/drizzle/sparc$ time pfexec dtrace -n ‘syscall::fdsync:entry /execname == “drizzled” / { self->ts[self->stack++] = timestamp; } syscall::fdsync:return /self->ts[self->stack – 1]/ { this->elapsed = timestamp – self->ts[–self->stack]; @[probefunc] = count(); @a[probefunc] = quantize(this->elapsed); self->ts[self->stack] =0; }’

dtrace: description 'syscall::fdsync:entry ' matched 2 probes
^C

  fdsync                                                         1600
  fdsync
           value  ------------- Distribution ------------- count
        33554432 |                                         0
        67108864 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@   1520
       134217728 |@@                                       79
       268435456 |                                         1
       536870912 |                                         0        

real	4m26.837s
user	0m0.657s
sys	0m0.566s

Which did seem like an awful long time for an fsync() to take. Although the filesystem was on a single disk, it was meant to be made remotely recently, and it’s sitting on a Sun controller… so it should be a bit better than that. From reading some of the ZFS on-disk spec, it could be some bug that means we’re waiting for a checkpoint to be written instead of forcing the sync out when we call fsync() – but I sought another solution (as on other Solaris/OpenSolaris systems this wasn’t a problem – so perhaps fixed in newer kernels or it’s a driver issue).

So I went and added “–commit=100” to a bunch of places in the drizzleslap test to batch things into transactions. The idea being to greatly reduce the number of fsync() calls to bring the execution time of the drizzleslap test on the machine to get below 15minutes. A bit of jiggerypokery later (some tests needed to not have the –commit to avoid various locking foo) and I had something that should run.

Now, ~113 seconds on the T1000 on Linux (with a single SATA disk, down from an original 393 seconds) and ~437 seconds on the OpenSolaris box. For giggles, tried it on a Solaris box that’s running UFS on a 10k RPM SAS drive: ~44 seconds.

In Summary:

T1000, Linux, libeatmydata, XFS: ~42 seconds (before optim)
T1000, Linux, 7200RPM SATA, XFS: ~113 seconds
T5240, Solaris 10, 10k RPM SAS, UFS: ~44 seconds
16 core Xeon, OpenSolaris, 7200RPM, ZFS: ~437 seconds

So, on that hardware setup – something is strange. The 10k SAS drive on UFS on the CoolThreads box is really nice though…. makes me want that kind of disk here.

This page was useful, and I used it as a basis for some of my DTrace scripts: http://fav.or.it/post/1146360/dtrace-and-the-mighty-hercules

Also thanks to several people on #opensolaris on Freenode who helped me out with various Solaris specific commands in tracking this down.

Drizzle tests all pass on Solaris/Sparc

Posted on 2009-03-26 by Stewart Smith

Stopping All Servers
All 221 tests were successful.
The servers were restarted 14 times
Spent 1424.921 of 1521 seconds executing testcases

(All tests have passed on OpenSolaris on x86 for a while now).

drizzle group – Identi.ca

Posted on 2009-02-26 by Stewart Smith

drizzle group – Identi.ca

Join us – you know you want to.

Keep up to date with what’s happenning in Drizzle land.

Doing something with Drizzle? just add “!drizzle” to your identi.ca update.

Drizzle Commit Statistics

Posted on 2009-02-25 by Stewart Smith

Per day:

Per Month:

Or more interestingly… What day are commits being made? Are we working over the weekend?

Do we work all night?

Drizzle hackers are just as likely to commit something at 3am as they are at 10am.

No implicit defaults

Posted on 2009-02-25 by Stewart Smith

Improving the Storage Engine “API”

Posted on 2009-02-20 by Stewart Smith

I increasingly enclose the API part of “Storage Engine API” in quotes as it does score a rather large number on the API Design Rusty levels (Coined by Rusty Russell). I give it a 15 (out of 18. lower is better) in this case “The obvious use is wrong”.

The ideas is that your handler gets called to write a row (the amazingly named handler::write_row()). It’s passed a buffer which is the row to be stored. An engine that uses the MySQL row format (lets say, ARCHIVE) will simply pack the row and write it out.

Unless there is a TIMESTAMP field with auto set on insert. Up until now (and still now in MySQL) the engine had to check if this was the case and make sure the timestamp field was updated.

To remove this particular bonghit is actually a really small patch, which Jay recently got merged:

~drizzle-developers/drizzle/development : revision 873.1.16

Hopefully somebody does this soon for MySQL as well.

linux.conf.au 2009 wrap-up (incl Open Source Databases Mini-conf): Day 0-1

Posted on 2009-02-20 by Stewart Smith

It’s no secret that I love linux.conf.au. My first was linux.conf.au 2003, in Perth and I’ve been to every one since (there are at least two people who’ve been to every single one, including CALU as it was called in 1999).

I’ve been on the board of Linux Australia for some insane proportion of the years since then (joining in 2003). Linux Australia is the not-for-profit community organisation that puts on linux.conf.au. It’s all volunteers and amazingly enough we have more than one group of people wanting to put on linux.conf.au each year!

This year, we Marched South to Hobart.

Here I detail what I saw, what I wish I saw and whatever else comes to mind.

Sunday – Before the conference

Ran into Bdale while checking in. Short flight down. A million and one people on the plane and on the ground that I knew. It must be linux.conf.au.

Seeing way too many awesome people I know, checking into accommodation (oh my, what a hill), registering for conf, beer and then off to a “ghosts of conferences past” dinner – where a few people who had organised previous linux.conf.au’s were hastily gathered together to chat to part of the 2010 team.

Monday – Open Source Databases Miniconf Day 1

Oh, that’s right – I’m running the OSDB Miniconf :)

First up, Monty Taylor spoke on “NDB/Bindings – Use the MySQL Cluster Direct API from languages you actually like for fun and profit”. Possibly taking the prize for the longest talk title of the conference. The NDB API is not SQL, it’s what the MySQL server (and one day, when Monty and I get around to it, Drizzle) translates SQL into for NDB. That being said, you can (pretty much always) write NDB API code that dramatically outperforms equivilent SQL (for a variety of reasons). Monty maintains the NDB/Bindings project that lets you use languages other than C++ for the NDB API.

At the same time as Monty was speaking, I wish I’d been able to fork() and go and see “Is Parallel Programming Hard, And, If So, Why?” by Paul McKenney and Michael Still talking about MythNetTV (pull RSS feeds of video in as MythTV programs).

After morning tea, we were meant to have “InnoDB scaling up and performance” by Bruce Huang, but he was a no-show. Hint: if you don’t want bad things to be said about you by conference organisers, either show up or let them know you’re not able to make it.

Instead, we led a crazy Q&A type session around the room which was a whole lot of fun. Really a “ask the experts” meets running up-and-down stairs with a microphone.

Next up, Arjen Lentz who runs Open Query spoke on “OurDelta: Builds for MySQL”. The best way to describe OurDelta is a “distribution of MySQL”. It’s the MySQL server plus a bunch of patches provided by various people that haven’t yet made it into the main source tree (for any number of reasons).

At the same time (if you’ve never been to linux.conf.au, you’ll find that you often want to be in at least 3 places at once) I would have really liked to see “MythTV Internals by Nigel Pearson” (I co-wrote Practical MythTV with Michael Still, which is having a “second edition” in wiki form over at http://www.mythtvbook.com/) as well as the panel on geek parenting as this may be something I’m one day faced with.

Up next: Russell Coker filled in for Kaigai (same talk, different speaker) to talk on The Security-Enhanced PostgreSQL – “System-wide consistency” in access controls. I found this quite interesting and different approaches to database security are worth looking at. Modern applications (read: web applications) don’t map their uses to database users at all. There are usually two users on the database server: the super user and the user that the app uses. It would be nice to have a good solution for those who want it.

Again, If I had the ability to be in two places at once, I would have also seen “How I Learned To Stop Worrying And Love ACPI” by the extremely handsome Matthew Garrett.

Monty Widenius (blog here – and yes, we have two Monty’s now… which does cause confusion) talking about the Maria storage engine. Maria is based on MyISAM, but adding crash safety and transactions (among other things).

Again, if I was able to be in several places at once I would have also seen Rusty‘s “Large CPUmasks”, Nathan Scott talking about “System level performance management with PCP” and Bdale’s “Collaborating Successfully with large corporations”.

An awesome start to the conference.

The FRM file format

Posted on 2009-02-15 by Stewart Smith

It’s fortunate that I’m watching Veronica Mars again with a mate; a more-than-you-think amount of detective work is required to understand the relationship (and format) of the TABLE_SHARE, the FRM file and HA_CREATE_INFO. Oh, also you’ll need drizzled/base.h and drizzled/structs.h and drizzled/table_share.h is also a good one to have open.

The FRM file is really a FoRM file from UNIREG (see copies of really old mysql docs around the place or even better, the links off Sheeri‘s blog post). Also, Jan has some thoughts on FRM too and Thava has a scary frmdump php script.

I have to agree completely with Jan:

“the internals document is missing all the interesting parts”
I’ve just read the source. Everything in the internals doc is easily gotten from the source, so when I finally did take a close look it was “I know all this”. I can’t fault the docs team here at all – I’d place this 3rd from the bottom in priorities. In fact, it’s better to fix it than to document it.
” get the hands dirty and get into the code … it got really dirty”
Oh yeah – and it’s only gotten worse with things added to it.

It contains interesting nuggets like unireg_check (or unireg_type, depending on where you read) that does:

 enum utype  { NONE,DATE,SHIELD,NOEMPTY,CASEUP,PNR,BGNR,PGNR,YES,NO,REL,
                CHECK,EMPTY,UNKNOWN_FIELD,CASEDN,NEXT_NUMBER,INTERVAL_FIELD,
                BIT_FIELD, TIMESTAMP_OLD_FIELD, CAPITALIZE, BLOB_FIELD,
                TIMESTAMP_DN_FIELD, TIMESTAMP_UN_FIELD, TIMESTAMP_DNUN_FIELD};

But really only the timestamp things… which should be default magic, but it’s somewhere tied in (Jay had thoughts last time I spoke to him… hopefully going away soon). A bunch of these aren’t ever used and are just relics from UNIREG. In fact… I went and removed what wasn’t needed and just ended up with:

Â  enum utypeÂ  { NONE,
Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â  NEXT_NUMBER,
Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â  TIMESTAMP_OLD_FIELD,
Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â  TIMESTAMP_DN_FIELD, TIMESTAMP_UN_FIELD, TIMESTAMP_DNUN_FIELD};

Which does seem a bit nicer. TheÂ fact that TIMESTAMP_OLD_FIELD is used as in interim value is, wel, scary. At least with a smaller set of possiblities it will be easier to convert into the proto format.

A hint of a brighter future is in the comment there:

/*
    We use three additional unireg types for TIMESTAMP to overcome limitation
    of current binary format of .frm file. We'd like to be able to support
    NOW() as default and on update value for such fields but unable to hold
    this info anywhere except unireg_check field. This issue will be resolved
    in more clean way with transition to new text based .frm format.
    See also comment for Field_timestamp::Field_timestamp().
  */

Hrrm… a text based FRM? That would be much nicer to read the code for. Unfortunately, it doesn’t really exist. Some FRMs are text in MySQL, but not ones to do with tables. You can look at the FRM for a VIEW in a text editor and see the SQL quite easily (the file format is text).

So I can’t go look at any nice text based format code – it’s all uint2korr() and friends. Yes folks, this is about the only place left in the code with function names in Swedish. What does korr mean? “accurate, correct, correctly”. If you look at korr.h, you’ll see that it’s just for storing in machine independent format: low byte first.

My favourite korr functions:

uint3korr
which reads 4 bytes, so remember to alloc it, initialise it or Valgrind will make you its bitch.
uint5korr
err… 5 bytes of course
uint6korr
6 bytes (getting the pattern now)
uint7korr
which doesn’t actually exist. Nobody loves Seven – George Costanza was wrong.

It also (as Jan showed) does the whole layout on a 80 column terminal for you! This functionality is going, going gone in Drizzle and won’t be coming back.

There’s also an “empty record on start of formfile” (see make_empty_rec in unireg.cc). This bit is going to cause me some pain relatively soon. Not so much for writing something like it out (default values can be easily put in the proto) but by then constructing it on open (with some careful footing around the issue of the egg coming before the chicken).

Incidently, when discussing with Daniel Stone about this (and explaining all the weirdness) it did cause him to exclaim “omg, it’s XKB!” – so that probably helps the X hackers in the room to relate.

The biggest test in moving from FRM to proto is to only rewrite this part of the code – the TABLE_SHARE, field, Create_foo etc have sooo many bits I want to change/fix. Going down the rat-hole into an endless cycle of fixing is always a possibility. Sometimes (like with unireg_type) the cleanup lets me really discover what the code is doing, so that’s being done (but will go away “soon”).

Performance Schema: Show me the code

Posted on 2009-02-12 by Stewart Smith

For such a long worked on feature, with such potential – I find the resistence to publishing a source tree curious (my comments on the topic have been moderated away but others have asked too). I could go and grep through the commits list searching for things (hint: look for mysql-6.0-perf), and then start to re-construct a tree; but I have more important things to do (yes, Brian, like FRM patches :)

Instead of re-inventing the wheel in Drizzle for a performance schema like interface, it’d be great to go with existing work. Evaluating the code as it’s coming along is important.

I also have concerns about the code itself:

Mutex instrumentation:
- how expensive is this in the common case of not instrumenting.
- Is this yet-another wrapper around pthread_mutex_t?
- Could this be done in another, more generic way?
- How can engine devs use this? Do you have to completely be integrated with the MySQL server way of doing things (and give up being able to be a sep piece of software) if you’re going to use this. If there’s a null header, what license is it under?
- Can we use some of the code in (for example) the ndbd and then pass back mutex data from remote systems to the SQL server in a usable mannar? If we do this via NDB$INFO, could this show up in the performance schema?
- Is the code clean or littered with ugly #ifdef?
- Could this be done without having a special mutex type?
memory instrumentation
- is it there at all?
- how are they doing it?
- MEM_ROOT based? what about new/delete malloc/free and various buffers and how this all ties to session etc?
  - In drizzle I’ve been seriously looking at talloc to help with instrumentation of memory usage.
IO instrumentation
- mmap?
how are the performance schema tables being generated (constructing table shares, running CREATE TABLE manually by the user, CREATE TABLE in a helper thread, similar to I_S or some black magic?)
- for NDB$INFO, we generate SQL CREATE TABLE statements and run them to generate a FRM file (I architected and wrote the base kernel code, Martin wrote the MySQL code) as other approaches were considered too hairy and likely to produce bugs.
- For Drizzle, I’m completely removing the FRM files in favour of a discovery based interface that’ll let the engines be in charge of metadata. It’s all in protobufs, so a standard and easy to read data format. So I’m one of the few people on the planet that know about the related data structures. I would like our proto based code to work for performance schema as well.
Is the instrumentation always-in, or using d-trace style no-op funkyness?
Is it better to hook into d-trace (or similar) on platforms that have it instead of providing custom code?
Could performance be gained by using LD_PRELOAD or similar linker foo to only enable some instrumentation but allow others to be selectable on server startup?

So stuck waiting for some code to look at to answer the questions (random commits on the commits list doesn’t really do it like the finished product – i really hate commit lists).

So I can’t currently comment on the performance schema work much at all, nor if it’s useful to Drizzle. Hopefully will soon.

row id in MySQL and Drizzle (and the engines)

Posted on 2009-02-02 by Stewart Smith

Some database engines have a fundamental concept of a row id. The row id is everything you need to know to locate a row. Common uses include secondary indexes (key is what’s indexed, value is rowid which you then use to lookup the row).

One design is the InnoDB method of having secondary indexes have the value in the index be the primary key of the row. Another is to store the rowid instead. Usually (or often… or sometimes…) rowid is much smaller than the pkey of the row. This is how innodb can answer some queries just out of the index. If it used rowid, it may involve more IO to answer the query. All this is irrelevant if you never want just the primary key from a secondary index.

Some engines are designed from the start to have rowid, others it’s added later (e.g. NDB).

Anyway… all beside the point. Did you know you can do this in mysql or drizzle:

drizzle> create table t1 (a int primary key);
Query OK, 0 rows affected (0.02 sec)

drizzle> insert into t1 (a) values (1);
Query OK, 1 row affected (0.01 sec)

drizzle> select _rowid from t1;
+--------+
| _rowid |
+--------+
|      1 |
+--------+
1 row in set (0.00 sec)

Is that the rowid from the engine? No (although at least NDB will let you select the real ROWID through a pseudo column through NDBAPI). Quoting from the MySQL manual:

If a PRIMARY KEY or UNIQUE index consists of only one column that has an integer type, you can also refer to the column as _rowid in SELECT statements.

Unfortunately, this isn’t correct… as this lovely bit of “oh my, what an excellent way to obfuscate my database app!” shows:

drizzle> create table t1 (a int primary key, b varchar(100));
Query OK, 0 rows affected (0.02 sec)

drizzle> insert into t1 values (1,”foo”);
Query OK, 1 row affected (0.00 sec)

drizzle> update t1 set b=”foobar!” where _rowid=1;
Query OK, 1 row affected (0.00 sec)
Rows matched: 1 Changed: 1 Warnings: 0

drizzle> select * from t1;
+—+———+
| a | b |
+—+———+
| 1 | foobar! |
+—+———+
1 row in set (0.00 sec)

So how is this implemented? In two places: in sql_base.cc find_field_in_table() and in table.cc during FRM parsing (this is how I found it). We can even do things Oracle can’t (insert, update and delete):

drizzle> update t1 set a=2 where _rowid=1;
Query OK, 1 row affected (0.00 sec)
Rows matched: 1  Changed: 1  Warnings: 0

drizzle> select * from t1;
+---+---------+
| a | b       |
+---+---------+
| 2 | foobar! |
+---+---------+
1 row in set (0.00 sec)

drizzle> update t1 set _rowid=3 where _rowid=2;
Query OK, 1 row affected (0.01 sec)
Rows matched: 1  Changed: 1  Warnings: 0

drizzle> select * from t1;
+---+---------+
| a | b       |
+---+---------+
| 3 | foobar! |
+---+---------+
1 row in set (0.00 sec)

SQLite also has something similar (see the autoinc docs).

I do wonder if anybody uses this functionality. It’s even tested (I was quite shocked at this) in the auto_increment and heap_auto_increment tests.

Share this:

Like this:

Share this:

Like this:

Share this:

Like this:

Share this:

Like this:

Share this:

Like this:

Share this:

Like this:

Share this:

Like this:

Share this:

Like this:

Share this:

Like this:

Share this:

Like this:

Share this:

Like this:

Share this:

Like this:

Share this:

Like this:

Share this:

Like this:

Share this:

Like this:

Share this:

Like this:

Sunday – Before the conference

Monday – Open Source Databases Miniconf Day 1

Share this:

Like this:

Share this:

Like this:

Share this:

Like this:

Share this:

Like this: