ndb_mgmd on Win32 (an Alpha)

So, here is an alpha-quality port of the MySQL Cluster management server to Win32, based on the current MySQL 5.0 tree.

This isn’t going into 5.0, so don’t expect to ever have that.

This isn’t going into 5.1 either, so don’t expect it there.

It’ll go into some future release at some level of general “supported” status that has yet to be decided.

ONLY USE THIS FOR EXPERIMENTAL PURPOSES.
IT IS EARLY RELEASE – IT HARMS PUPPIES!

But, it would be great for those who may be interested in having an ndb_mgmd on Win32 at some point to grab the binary, have a play and find some bugs.

Please file any bugs at bugs.mysql.com, and explicitly mention that the version is “5.0.50-ndbwin32r1” and that it’s this specific build (i.e. it shouldn’t go through the normal bug verification procedure and should instead end up with me looking at it directly).

So, here’s the files:

Hopefully this brings you joy.

Oh, and yes, you can go and run it under WINE so you don’t have to actually use MS Windows.

enjoy!

ratting on “leading” platforms…

Yes, I really, really really dislike the Microsoft Windows platform. I think you have to approach insanity to even remotely consider using it in a HA environment.

That doesn’t mean that we shouldn’t support it. Switching an entire software stack can be a lot of work. Much better to gradually move to complete freedom and sanity.

How not to get a sensible response on a mailing list/web forum

For a start, you shouldn’t be using a web forum. Are you too stupid to use email or what?

Ask yourself, does your question boil down to “pls do my job 4 me. kthxbye” (because it’s in stupid speak because it’s an obviously stupid question to be asking). If it does, don’t post it. Instead, RTFFM (where F is for Fine…. at least one of them is anyway), look at the archives. You will then find approximately eleventy-billion messages pointing you to exactly the right tool and docs to let you answer your question – easily.

Applying BK produced patches with new files using patch…

So, if anybody is crazy enough to grab patches from the commits@lists.mysql.com list and try and use them… you may have run into this problem (which I do every few weeks/months): new files aren’t created. That’s right kids, BitKeeper (or at least the post-commit hook that mails out the patches) produces patch files where new files are in a completely different format than GNU patch expects. Not even BK can import these patches (bk import -temail or -tpatch just don’t do it).

So… I present this script – unfuck_bk_patches.pl – which, once run across a patch that includes new files, allows you to apply it using patch (or, for example, quilt).

I release it under the “sworn at software license”… which means you’re allowed to do anything you want with it as long as at some point you have sworn at computer software for being crappy.

thoughts on MySQL release cycle

Thoughts on latest changes:

  • don’t think there’s really much to it.
  • I rather disagree with this slashdot headline (MySQL Closing Off Its Source) as I just don’t think it’s true.

However, I have other thoughts (that are a lot more interesting to discuss):

We should:

  • Release major version every 6 months. e.g. N.0, N.2, N.
  • Odd numbers are used during the 6 months of development, with very frequent releases. In fact, with a strict policy of keeping pushbuild green, you could automate this. Yes, some of these releases would be utter shit due to whatever problem seeped in. Get over it – it’s called a development release.
    • No new features merged for last 3 months of cycle for release.
    • source only releases… if you can’t build from source then you’re too stupid to run this. (or, diplomatically “you shouldn’t run this”)
    • For features that take longer than 3 months, we can have a “-proposed” patch set. i.e. a staging area for new things before they go in.
  • Minor versions to latest N.x when needed (N.x.z)
    • Until the next N.x version, where N.x-1 is forgotten. I mean forgotten as in no patches at all… others can if they care.
  • Pick a N.x and support it for Y years (a good N.x that is)
    • i.e. provide N.x.z
    • this can be our “RHEL” so to speak.
    • fixes go here.
    • call it ‘enterprise’ or whatever.

Past problems:

  • 5.0 took too long to get to a real GA status
    • A bunch of things were broken in that release cycle… Although I joined it relatively late.
    • It’s been a decent release for a good while now, so that’s a good thing.
  • 5.1 has taken too long to get to GA
    • good news is that 5.1 at GA should be a lot better than 5.0 at GA
    • As a developer I can honestly say I think we’ve improved processes a lot for making sure that a release doesn’t suck.
      • and as a result of this… I feel like 5.1 is the release where a lot of this stuff is fixed, and others should go a lot smoother.
    • It’s passed the dot-twenty rule for a release that doesn’t annoy you.

There are a lot of things looking good and being done right too (or if not right, a lot better than a year or two ago). e.g.

  • NDB -telco (Carrier Grade Edition) releases
  • worklog (open to the wider web)
  • forge
  • bugs db
  • commits, code reviews and all that

Things we should fix with commits, code review and all that:

  • drop the commits list altogether except for crazy people.
  • everything posted to internals@lists for review, and reviews take place there
    • or IRC or whatever… but outcome posted there
    • hrrm… i should make people do that to get me to review things… (i.e. i should listen to myself)

Things we should fix internally:

  • We should have 20% time… if only for random MySQL related things… lots of cool stuff has come out of engineers just hacking… even when we weren’t 100% meant to.

Things I don’t think will happen but could be useful…:

  • dropping commercially licensed product
    • It would be really nice to use GPL licensed libraries around the place instead of either having #ifdef or reinventing the wheel.

What if it all goes proprietary:

  • Some people speculate this could happen. Well, what then happens is a crapload of engineers leave the company – and not on good terms. So at least unlikely to happen without a massive implosion.
  • I’ll say it here: if the code I’m writing isn’t available under the GPL (or other good free software license), I’m looking for work (and you should contact me with offers).

My thoughts on the non-free Network Monitoring and Advisory Service:

  • It’s not free software… so really isn’t interesting to me personally.
  • Others see it differently and attach value to it – good for them. I hear it makes us money as well – which does keep me in adequate supplies of scotch.
  • I have used it a bit and it is quite neat – so hats off for a neat product.

P.S. there’s nothing here I wouldn’t say to anybody… and they’re welcome to disagree (and they do… sometimes even for good reasons).

linux.conf.au 2008 Mini-Conf Selection

So, last night a group of us sat down and went through all the mini-conf proposals for linux.conf.au 2008.

There were a lot of proposals. There were also a lot of good ones.

We’re not announcing anything yet… but in the interest of openness… here’s the procedure.

We started out as any responsible group of selectors would…. looking at the proposals over beer:

dsc_8260.JPG dsc_8261.JPG

a few jokes thrown in… frank discussion and all that. But really, we came to the conclusion that it’d been all done before and we needed to somehow narrow down all the excellent suggestions…

Luckily, the pub we were meeting at had the right facilities!

dsc_8262.JPG

And we went about selecting a few more…

dsc_8263.JPG

Of course, there are simply some mini-confs that we all agreed were a must have… although nothing was certain…

dsc_8266.JPG

One of the more hilarious suggestions of the evening was to force somebody to organise a PostgreSQL miniconf, convince Marten to hold a MySQL company meeting in Melbourne around Jan 2008 and have all of MySQL AB come and sit in the back of the room for the PostgreSQL miniconf.

MySQL Storage Engine API Gotcha #42

(Filling 1 through 41 is an exercise left to the reader… I just like the number 42)

handler::info can be called on a handler that has never had handler::external_lock called. So if you rely on a call to handler::external_lock to set up something (e.g. a pointer to a transaction object), you may explode in a heap.

See: Bug#26793
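
The real storage engine API is C++, so what follows is only a toy model of the gotcha in Python, not MySQL code. The class names, the `trx` attribute, the `row_estimate` field and the fallback value of 0 are all made up for illustration. It shows one way to survive the call order: have info() check whether the state that external_lock sets up actually exists before touching it.

```python
class Transaction:
    """Hypothetical per-statement state that external_lock() would set up."""

    def __init__(self):
        self.row_estimate = 42


class FragileHandler:
    """Toy stand-in for a handler; not the real MySQL handler class."""

    def __init__(self):
        self.trx = None  # only set once external_lock() has been called

    def external_lock(self):
        self.trx = Transaction()

    def info(self):
        # Blows up with AttributeError if external_lock() never ran.
        return self.trx.row_estimate


class DefensiveHandler(FragileHandler):
    """Same toy handler, but info() copes with a missing transaction."""

    def info(self):
        if self.trx is None:
            # Assumption: returning a conservative default is acceptable here.
            return 0
        return self.trx.row_estimate
```

Whether a default, a lazily created transaction or something else is the right fallback depends on the engine; the point is only that info() cannot assume external_lock came first.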

Backup and Recovery (the book)

A little while ago now, I did some tech reviewing of a book called Backup & Recovery… specifically MySQL related things (and MySQL Cluster). Curtis was kind enough to send me a copy of the book as well – and I’ve been reading the rest of it bits at a time since I got it.

I’m rather impressed… it gives a good mix of overview and digging deeper on just about every way to back up and recover systems. It also discusses several products that I didn’t know about (and have partly investigated now because of it).

It also has good sections on process: as in how to decide what to backup, encouraging the use of checklists and all that. Heck… recently when doing a restore I realised I never backed up /boot (annoying, not catastrophic… as I know my way around).

I recommend getting a copy of it if you need to back up and restore systems and don’t know everything already.

ndb_mgm pr0n

ndb_mgm> all report MemoryUsage

Node 1: Data usage is 11%(632 32K pages of total 5440)
Node 1: Index usage is 22%(578 8K pages of total 2592)
Node 2: Data usage is 61%(3331 32K pages of total 5440)
Node 2: Index usage is 40%(1039 8K pages of total 2592)
ndb_mgm>

Oh, and that’s coming from saved command history.

(as seen when upgrading my cluster here to mysql-5.1.19 ndb-6.2.3 – i.e. MySQL Cluster Carrier Grade Edition – i.e. the -telco tree)
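
If you want to do something with that report other than admire it, here is a minimal sketch that parses the lines above in Python. It assumes the output looks exactly like the sample shown here (the regular expression is derived from these four lines only, not from any documented format), and the function and field names are my own.

```python
import re

# Derived from the sample lines above, e.g.
#   Node 1: Data usage is 11%(632 32K pages of total 5440)
LINE_RE = re.compile(
    r"Node (?P<node>\d+): (?P<kind>Data|Index) usage is (?P<pct>\d+)%"
    r"\((?P<used>\d+) (?P<page_kb>\d+)K pages of total (?P<total>\d+)\)"
)


def parse_memory_usage(text):
    """Return one dict per report line; lines that do not match are skipped."""
    rows = []
    for line in text.splitlines():
        match = LINE_RE.search(line)
        if match is None:
            continue
        used = int(match["used"])
        total = int(match["total"])
        page_kb = int(match["page_kb"])
        rows.append({
            "node": int(match["node"]),
            "kind": match["kind"],
            "reported_pct": int(match["pct"]),
            "used_kb": used * page_kb,
            "free_pages": total - used,
        })
    return rows


SAMPLE = """\
Node 1: Data usage is 11%(632 32K pages of total 5440)
Node 1: Index usage is 22%(578 8K pages of total 2592)
Node 2: Data usage is 61%(3331 32K pages of total 5440)
Node 2: Index usage is 40%(1039 8K pages of total 2592)
"""

for row in parse_memory_usage(SAMPLE):
    print(row["node"], row["kind"], row["used_kb"], "KB used,",
          row["free_pages"], "pages free")
```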

Things that have recently stalled….

  • compressed backup patch
    • actually works rather well… and restoring from compressed backup too.
    • need to modify the rate-limiting code though… may as well rate limit the writing of the *compressed* data stream… otherwise the option isn’t nearly as useful
  • compressed LCP patch
    • well… the *restoring* of compressed LCPs is what’s stalled… writing them already works
  • working out exactly what more information I want out of the linux memory manager to find out what kswapd is really doing (and the patch that exports the right info)
  • re-jigging my procmail filters for commits@lists.mysql.com
  • fixing up my offlineimap patch and getting it in upstream
  • disk pre-allocation for MythTV recordings
  • buying workstation
  • unpacking the last few boxes around the house
  • finishing this list.

Run Backup, Run!

Over the past N weeks/couple of months, we’ve been making a number of improvements to how backups are done in MySQL Cluster.

Once you get to large data sets, you start to really care about how long a backup takes.

Traditionally, MySQL Cluster has been in-memory only. The way to back this up is to just write from memory to disk (rate limited), synchronised across the cluster. Since memory is really fast (compared to the rate we’re writing out to disk), we never had a problem.

In MySQL 5.1 (and Cluster Carrier Grade Edition, CGE), disk based attributes are supported. This means that a row has both in-memory and disk based parts. As we all (should) know, disk seeks take a very long time. We don’t want to seek.

So, at some point recently we changed the scanning order from in-memory order (which previously made perfect sense) to on disk order. Randomly seeking through RAM is much cheaper than all the disk seeks. This greatly improved backup performance.
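
To see why the scan order matters, here is a toy model in Python. It is not NDB code: the rows, their page numbers and the idea that any jump other than "same page or next page" costs one seek are all assumptions made for illustration.

```python
def count_seeks(page_reads):
    """Count reads that land on neither the previous page nor the one after it."""
    seeks = 0
    previous = None
    for page in page_reads:
        if previous is not None and page not in (previous, previous + 1):
            seeks += 1
        previous = page
    return seeks


# Hypothetical rows as (row_id, disk_page); list order is the in-memory order.
rows_in_memory_order = [
    (0, 7), (1, 2), (2, 9), (3, 3), (4, 8), (5, 1), (6, 2), (7, 7),
]

# Scanning in on-disk order: sort by the page the disk part lives on.
rows_in_disk_order = sorted(rows_in_memory_order, key=lambda row: row[1])

memory_order_seeks = count_seeks([page for _, page in rows_in_memory_order])
disk_order_seeks = count_seeks([page for _, page in rows_in_disk_order])

print("seeks scanning in memory order:", memory_order_seeks)
print("seeks scanning in disk order:  ", disk_order_seeks)
```

The disk-order scan now hops around in RAM to find the in-memory part of each row, but that is the cheap kind of random access.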

We also did some read-ahead work, which again, greatly improved performance.

Today, I see mail from Jonas about changing the way we read tuples for backup (and LCP) to make it even more efficient (READ_PACKED). This should also reduce CPU usage for LCP/Backup… which is a casual issue. I should really take the time to look closely at this and review.

I also wrote a patch to the code in NDB that writes files to disk to write a compressed gzio stream instead of an uncompressed one. This happens in a different thread, so potentially using one of those CPU cores that ndb wouldn’t otherwise use… and also dramatically reducing the amount of data written to disk…. this patch isn’t in any tree yet, and I’ve yet to try it with the READ_PACKED patch, which together should work rather well.
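
The shape of that patch, very roughly, sketched in Python rather than in the NDB file system code. Python’s gzip module stands in for gzio, and the queue size, function names and in-memory sink are assumptions for illustration only. One thread produces backup data; a separate thread compresses it and writes it out.

```python
import gzip
import io
import queue
import threading

_DONE = object()  # sentinel telling the writer thread to finish


def compressed_writer(chunks, sink):
    """Drain chunks from the queue and write them to sink as one gzip stream."""
    with gzip.GzipFile(fileobj=sink, mode="wb") as gz:
        while True:
            chunk = chunks.get()
            if chunk is _DONE:
                break
            gz.write(chunk)


def backup_compressed(records, sink):
    """Feed records to a writer thread so compression happens off this thread."""
    chunks = queue.Queue(maxsize=16)  # bounded, so the producer cannot run away
    writer = threading.Thread(target=compressed_writer, args=(chunks, sink))
    writer.start()
    for record in records:
        chunks.put(record)
    chunks.put(_DONE)
    writer.join()


# Hypothetical backup data: repetitive rows compress well.
records = [b"row %06d some column data\n" % i for i in range(1000)]
sink = io.BytesIO()
backup_compressed(records, sink)

raw_size = sum(len(r) for r in records)
print("uncompressed bytes:", raw_size)
print("compressed bytes:  ", len(sink.getvalue()))
```

The bounded queue is the design choice worth noting: if the disk (or the compressor) falls behind, the producing thread blocks instead of buffering without limit.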

I also need to grab Brian at some point and find out why azio (as used by the ARCHIVE engine) doesn’t work the same way as gzio for basic stream writing…