MemberDB speed improvements

So I finally installed the xdebug PHP extension and started doing some performance analysis of MemberDB using xdebug and kcachegrind. The upshot of which is a number of commits to the bzr tree that dramatically improve performance in several key areas. The answer? Caching.

I’m not even talking using memcached or caching things in database tables or anything like that – just about everything is still the same dynamically produced content as before, but I’m now caching some simple things avoiding many round-trips to the database while executing a script.

There were a few things that were taking a fair bit of execution time:

  1. The generation of the menu. In MemberDB, there’s a menu on the left. There’s also a powerful (read: non-trivial) permissions system allowing relatively fine grained granting of permissions. So, we need to check that the user has permission to go to the page before showing the page in the menu.
    Previously, for each item in the menu, we’d do a lookup to the database – checking if they have the permission or they are an admin. This ended up taking a bit of time – up to 30% of the time for the front page was taken up just generating the menu!
    So, now I cache the set of permissions for the user. One function to fetch it from the DB into a structure, another function to check the permissions of the user in that struct.
    While testing this, I actually used memcached to cache the menu to see how much of an improvement I could get… I’m about 69/70ths of the speed of using memcached with a purely PHP implementation caching the permissions info.
  2. Getting the information about a member is done in a variety of places. On some pages, you want information on the current logged in user (or just need to find their member ID). These are now cached for the duration of the script. Saved quite a few DB round trips
  3. When viewing an election (not the results, just the normal “view election” page that lists candidates), we need to get the membership information on a number of users (okay… so technically I should rewrite some of the queries to use joins in the DB… but this was easier). I now have a (limited) cache of membership info. So now, when a member has nominated multiple people, we only pull the member info out of the database once.
  4. Rewrite the “current_members” view. The old one was not as efficient as it could be. While the new one has slightly different semantics (can have duplicate rows, it turns out the use of DISTINCT was adding a bit of execution time, which for a bunch of queries is not needed) it’s significantly quicker.

I used the faithful Apache Bench (ab) to do benchmarks against the modified PHP code. I think the biggest improvement was the view election page which went from about 6seconds/page to 0.2seconds/page.

My 2nd book is available! (MySQL 5.1 Cluster DBA Certification Study Guide)

Neither of the books I’ve been an author of has been just me. For Practical MythTV (Christmas is coming, buy it for all your TV and tech loving friends!), Michael Still and I worked hard to get a well rounded and practical (not to mention good) book. I think we succeeded – certainly has gotten positive reviews (check the amazon page).

For my second endeavor (just to make it fun, I was working on both at the same time) we have a much longer list of authors. The aim was to write a study guide for those wishing to be certified in MySQL Cluster. Being a developer with a fair bit of knowledge in the product (and somebody who also presents and writes) – I was a natural fit to join the team (some may say “roped into the team”… and they could possibly be me, but I couldn’t possibly comment).

My fellow authors:

  • Jon Stephens
    Among other things, he’s brought the MySQL Cluster section of the MySQL Manual forward leaps and bounds. He also edited the study guide – no doubt a daunting task.
  • Mike Krukenberg
    Author of Pro MySQL (along with Jay Pipes) and numerous blog posts on MySQL and related topics.
  • Roland Bouman
    Who took on the brave task of the actual Certification. This is a certification to be proud of – really makes sure that people deserve it without being overly hard or tricky.
  • Solomon Chang
    founding member of LAMPSIG of Los Angeles and a professional DBA.

So, the book is now shipping, from lulu.com (an on-demand printing service) for $49.99USD.
MySQL 5.1 Cluster DBA Certification Study Guide by Jon Stephens, Mike Kruckenberg, Roland Bouman, Stewart Smith, Solomon Chang

It feels good to have it out there now. Daniel va Eeden has received his copy (shipping box and all!). This is another book you should buy for all your database friends  everybody you’ve ever met.

libeatmydata

Following my successful linux.conf.au talk “Eat My Data: How Everybody Gets POSIX File I/O Wrong“, I started to feel the need to easily be able to have my data eaten.

Okay, not quite. However, when you’ve written your software properly, so it uses fsync() correctly, opening files with O_SYNC or whatever – tests take longer as you’re having to wait for things to hit the rust.

So….. LD_PRELOAD=libeatmydata.so to the rescue! With a POSIX compliant fsync() (that does nothing) and filtering on open(2), it can take your test run times down dramatically.

The only time you shouldn’t use it for your tests is when you end up crashing the machine to test durability (i.e. when the OS doesn’t have the opportunity to cleanly write out the data to disk).

See the libeatmydata project page: http://www.flamingspork.com/projects/libeatmydata/

and the bazaar repository: http://www.flamingspork.com/src/libeatmydata

(it’s seemed to have saved somewhere between 20 and 30% of the time for innodb/ndb tests in mysql-test-run).

mysql-5.1.22-stew2

New:

  • Updated NDB Compressed LCP and BACKUP patches (now with O_DIRECT support)
  • InnoDB patch for Windows that should give ~5x improvement on commits/sec (Bug31876)
  • Everything in current telco-6.3 tree (ndb ~6.3.5)
    • Lots of NDB improvements and new features over regular 5.1.
        • WL3686 Remove read before update
        • WL2680 NDB Batched Update
        • WL2679 NDB Batched Delete
        • WL4108 NDB Handler statistics
        • WL4096 NDB Realtime performance and settings
        • WL3126 and WL3127 Client and Replication bind address
        • NDB Online ALTER TABLE ADD COLUMN
        • NDB Multi-Master replication conflict resolution (limitations apply :)
        • NDB prepare for endian independence
        • NDB micro-gcp (reduces replication lag)
        • NDB SendBuffer throttling
        • NDB MySQL Server TC selection (improve performance)

    Old (In previous patchset too):

    • Remove ndb_use_exact_count giving up to 300% performance improvements on Joins in NDB
    • INFORMATION_SCHEMA table for NDB node status
    • NDB Cluster Log as CSV file (suitable for ENGINE=CSV)
    • Skeleton Engine (build from storage/skeleton)
    • MyHTTP Engine (build from storage/myhttp)
    • PBXT Engine (build from storage/pbxt)
    • Make ARCHIVE faster at compressing (at slight expense of space usage)

    Availability:

    • Patch (apply with -p1 to mysql 5.1.22) 4.0 MB
      • Applies cleanly on a BK source tree… a few files don’t exist in the tarball on dev.mysql.com (due to the way it’s built)… so when asked for “file to patch” just hit enter and then choose y to skip that patch.
    • README (list of patches, descriptions) 13kb
    • quilt patch series tarball (individual patches) 4.1MB
    • diffstat 228k

    Feedback much appreciated.

    Speaking at VITTA (Victorian IT Teachers Association Inc) Conference

    I’m speaking at the upcoming VITTA conference.

    Title:MySQL database administration for non DBAs

    Abstract: MySQL is incredibly ubiquitous. MySQL database administrators are not everywhere; MySQL is. Often MySQL is run to power a small web site or two, an application or two, or run on a machine purely for someone else’s use (and the install made MySQL just work so you don’t have to care). This session goes over the things you need to know about your MySQL installations to keep them healthy without burdening you with work, including MySQL Basics, installation, security, backup, restore, performance and upgrades.

    When: 12:15 PM, Wednesday 21 November 2007

    Should be fun!

    MySQL 5.1.22(ish)-stew1

    I’ve decided to publish my patch series. The goal of the -stew patches is to collect things I find interesting and that at some point could (should) make it into the main MySQL tree (even if others don’t think so).

    It’s not designed for use in production.. I don’t really care if there’s failing test cases…. if it builds it’s perfect.

    It includes the following which could be interesting:

    • Removal of ndb_use_exact_count (performance for NDB)
    • NDB node status in an INFORMATION_SCHEMA table
    • Compressed Backup and LCP for NDB
    • Cluster log as CSV
    • Skeleton Engine
    • MyHTTP Engine
    • PBXT Engine
    • Skeleton of MyBS support for NDB
      • (in the hope that somebody finishes it)

    Currently the additional engines have to be built separately in their storage/ENGINE directories. I have some preliminary patches to get them to build in-tree via the plug.in file, but it’s not finished (patches welcome).

    This is all currently based off the 5.1-ndb tree. In future, it will likely be based off 5.1-telco.

     http://www.flamingspork.com/mysql/patch-5.1-ndb-stew1-20071016.patch.gz

    and broken out in:
    http://www.flamingspork.com/mysql/patch-5.1-ndb-stew1-20071016/
    (including the not-quite-working ENGINE_in_tree_build patches)

    Known to apply against this tree:

    http://www.flamingspork.com/mysql/mysql-5.1-ndb-20071016.tar.bz2

    Comments, thoughts, patches to include, all welcome!

    Compressed LCP and Compressed Backup (and switching them on/off online)

    Quick experiment with online changing of enabling/disabling compressed backups and local checkpoints (LCPs).

    Backup is incredibly trivial and correct (even have some nodes do compressed, some not).

    LCPs are a bit trickier when it comes to restore… currently how the code sits is that a block using the compressed file interface in NDBFS must specify if it wants to use the compressed read/write interface or not. So when you have LCPs that differ in compressed/non-compressed than the current config file setting, you’re not going to be able to restore them (although setting CompressedLCP=1 should let you restore either compressed or non-compressed LCPs).

    At some point, I’ll probably move AsyncFile (our async file IO class) to just use azio alway, and modify azio to be transparent for non-compressed files…. I just have to fix up azio for direct io.

    Things that break while travelling….

    This year, it seesm that whenever I go out for significant travel, the following things will break on my trip:

    • a laptop power supply
    • a disk

    At least this time the disk is part of a RAID1 array.

    Oh, and for some reason my mythbackend stopped doing anything a few days ago…. and I wasn’t checking it. grr… annoying. At least there’s not much on TV.

    ndb_mgmd on Win32 (an Alpha)

    So, here is an Alpha quality port of the MySQL Cluster management server to Win32 based on the current MySQL 5.0 tree.

    This isn’t going into 5.0, so don’t expect to ever have that.

    This isn’t going into 5.1 either, so don’t expect it there.

    It’ll go into some future release at some level of general “supported” status that has yet to be decided.

    ONLY USE THIS FOR EXPERIMENTAL PURPOSES.
    IT IS EARLY RELEASE – IT HARMS PUPPIES!

    But, it would be great for those who may be interested in having a ndb_mgmd on Win32 at some point to grab the binary, have a play and find some bugs.

    For any bugs filed, please submit to bugs.mysql.com and explicitly mention that it’s version “5.0.50-ndbwin32r1” and mention that it’s the specific build (i.e. it shouldn’t go through the normal bug verification procedure and instead end up with me looking at it directly).

    So, here’s the files:

    Hopefully this brings you joy.

    Oh, and yes, you can go and run it under WINE so you don’t have to actually use MS Windows.

    enjoy!

    ratting on “leading” platforms…

    Yes, I really, really really dislike the Microsoft Windows platform. I think you have to approach insanity to even remotely consider using it in a HA environment.

    That doesn’t mean that we shouldn’t support it. Switching an entire software stack can be a lot of work. Much better to gradually move to complete freedom and sanity.