Cool Dolphin SCI interconnect stuff

Dolphin (http://www.dolphinics.com/) make this cool SCI socket hardware that can be used with MySQL Cluster (as in their example setup).

Their tech provides high performance (350MB/sec available to the user) and low latency (worst case is about 2 or 3 microseconds to send 512 bytes to another node). So it can pretty much kick the butt of gig-e.
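As a rough sanity check on those numbers (a back-of-the-envelope sketch, not a benchmark): at 350MB/sec, just pushing 512 bytes through takes about a microsecond and a half, so a 2 to 3 microsecond worst case leaves very little room for overhead. The function name below is made up for illustration, and it assumes MB means 10^6 bytes.

```python
def transfer_time_us(payload_bytes, bandwidth_mb_per_sec):
    """Time in microseconds to move payload_bytes at the given bandwidth.

    Assumes 1 MB = 1,000,000 bytes and ignores all per-message overhead.
    """
    bytes_per_sec = bandwidth_mb_per_sec * 1_000_000
    return payload_bytes / bytes_per_sec * 1_000_000


if __name__ == "__main__":
    # 512 bytes at the quoted 350MB/sec user-visible bandwidth
    print(f"{transfer_time_us(512, 350):.2f} microseconds")
```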

We could probably do some really cool stuff with boosting performance (even further) when using SCI with some of the things I have in mind for multithreaded ndb kernel – basically changing some of the ways we do sending and receiving signals and improvements in shared memory stuff.

Big points from the presentation are:

  • small messages sent using basic CPU instructions (it’s remote memory mapping)
  • low cost to write to remote memory address
  • raw worst case send latency for 8 bytes is about 210 nanoseconds
  • no need to lock down or register memory
  • TCP/IP processing not done in software
  • Just LD_PRELOAD the library and it does your (user specified) IP communication over the SCI interconnect
  • can be fully redundant (dual cards, distributed switching)
  • each card is about 5W of power (rather insignificant compared to other techs apparently)
  • really small time for failover

It’s also good to note that 10 gigabit ethernet doesn’t really buy you anything in reducing latency. SCI gives you both improved bandwidth and latency.

People wanting more performance out of MySQL Cluster should have a good look at it.

It’s also used in fighter planes – which make cool loud jet noises.

(err… I didn’t mean to sound too rah rah. Hopefully I’ve just sounded like I think the tech is shiny)

UPDATE: corrected milli to nano.

UPDATE mk2: corrected nano to micro. Oh how I wish I just typed correctly to begin with. At least I’ve had some rest now :)

Mike Hillyer’s laptop melting and backup fun at UC

Mike Hillyer’s Personal Web Space » Blog Archive » It’s Alive!!

Mike’s laptop went funny, but he had a backup of his presentation.

So, something about my backup strategy.

I have a policy that anything that I really care about is backed up. If it’s not backed up, I don’t care about it.

e.g. while I’d be sad if my mythtv box suddenly had a disk failure, I can always put in a blank disk and I don’t lose too much.

My email is fetched onto a server at home, and I use offlineimap to keep a (nearly) up-to-date copy on my laptop. I also, at least weekly, burn the entire thing to DVD (it still fits, when bzip2 compressed).

Also, for all that other stuff that is pretty important (/home), I do an xfsdump to external disk.

I also now (on a paranoid spending trip at Fry’s) have a small portable drive that is roughly twice the size of my /home partition. The idea is that on the road I can regularly do an xfsdump to this – in fact, it has room for two complete dumps (and one or two incrementals).
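Roughly what those dumps look like, as a sketch rather than my actual script: the mount point, file names and labels below are invented for the example, and it only prints the commands unless you turn off the dry run. The -l, -L, -M and -f options are standard xfsdump ones (dump level, session label, media label, destination).

```python
import subprocess


def xfsdump_command(level, dump_file, filesystem="/home", label="home"):
    """Build the xfsdump argv for a dump at the given level.

    Level 0 is a complete dump; higher levels are incrementals relative
    to the most recent dump at a lower level.
    """
    return [
        "xfsdump",
        "-l", str(level),                # dump level
        "-L", f"{label}-level{level}",   # session label (avoids the prompt)
        "-M", "portable-drive",          # media label (avoids the prompt)
        "-f", dump_file,                 # destination file
        filesystem,
    ]


def run_dump(level, dump_file, dry_run=True):
    cmd = xfsdump_command(level, dump_file)
    if dry_run:
        print(" ".join(cmd))
    else:
        subprocess.run(cmd, check=True)  # needs root and xfsdump installed
    return cmd


if __name__ == "__main__":
    # Hypothetical paths on the portable drive
    run_dump(0, "/mnt/portable/home.level0.xfsdump")
    run_dump(1, "/mnt/portable/home.level1.xfsdump")
```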

Call me paranoid, but I like my data.

I also make sure I burn photos to DVD, but that’s more periodic as there’s a lot of them now.

All your Cluster BOF is belong to us

So, we had a really good Cluster BOF last night. Started at 8:30 and at 11pm everybody was tired enough to go to bed :)

Healthy mix of people with deployed clusters, prototype clusters and even some who have looked at MySQL Cluster in the past, decided it wasn’t for them at that point in time, but are still interested enough to show up to the BOF.

It was really freeform (as in I got up and said “there is no agenda for this – what do we want to talk about?”).

We got some really valuable feedback about what people like, dislike and even did hands-up polls of “what do you want us to do first?”. Also got some good suggestions on what to tweak (small fixes) to make people’s lives a lot easier.

The room was pretty well populated as well. My guess was half full (which means nothing until you see the size of the rooms. I’ll try and get a photo at some point).

The consensus at the end seemed to be that people found the BOF valuable as well.

I’d love to hear further feedback. Be brutally honest too.

My UC Talk tomorrow (what you should know)

MySQL Users Conference 2006 – MySQL Cluster: New Features and Enhancements

If you are coming to my talk, make sure you know a bit about cluster beforehand. Being at Johan’s talk today was a good idea.

Or reading the manual chapter.

Otherwise you may end up being quite lost for a lot of the talk. Mine isn’t an intro to cluster one.

PostgreSQL 7.3: SQL Key Words

PostgreSQL: Documentation: Manuals: PostgreSQL 7.3: SQL Key Words

It’s very annoying that ‘user’ is a reserved word in postgresql. You also get really crappy error messages (at least with the various forms of quoting I’ve tried to use) when you try to create a table called ‘user’:

$ psql web
Welcome to psql 7.4.8, the PostgreSQL interactive terminal.

Type:  \copyright for distribution terms
       \h for help with SQL commands
       \? for help on internal slash commands
       \g or terminate with semicolon to execute query
       \q to quit

web=# create table user (a int(10), b int);
ERROR:  syntax error at or near "user" at character 14
web=# create table "user" (a int(10), b int);
ERROR:  syntax error at or near "(" at character 27
web=# create table 'user' (a int(10), b int);
ERROR:  syntax error at or near "'user'" at character 14
web=# create table `user` (a int(10), b int);
ERROR:  syntax error at or near "`" at character 14
web=#
web=#

You will even get the “at character X” if you’re piping something into psql. Hrrm… a line number would be useful.

It also means that I can’t compare results from MySQL and Postgresql involving a table called ‘user’. Bummer.

Any postgresql gurus out there got a solution for me?
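One possible answer, as a hedged sketch: the double-quoted attempt above most likely did work as far as the table name goes. Counting along that statement, character 27 is the ( in int(10), not anything in the quoted name. PostgreSQL follows the SQL standard in quoting identifiers with double quotes, and its integer type doesn't take a display width the way MySQL's does. The helper below just wraps a name in double quotes and doubles any quote inside it (quote_ident is a name I picked for the example; PostgreSQL happens to have a server-side function of the same name). For the comparison, MySQL should accept the same double-quoted statement when ANSI_QUOTES is set in its sql_mode.

```python
def quote_ident(name):
    """Quote an SQL identifier the standard way: wrap it in double
    quotes and double any double quote inside it."""
    return '"' + name.replace('"', '""') + '"'


def create_table_sql(table, columns):
    """Build a CREATE TABLE statement from (column_name, type) pairs."""
    cols = ", ".join(
        f"{quote_ident(col)} {sql_type}" for col, sql_type in columns
    )
    return f"CREATE TABLE {quote_ident(table)} ({cols});"


if __name__ == "__main__":
    # Plain "int" rather than MySQL's "int(10)"
    print(create_table_sql("user", [("a", "int"), ("b", "int")]))
```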

Rusty on floating point (and keeping neat code)

Rusty talks about the “fun” of floating point and how this all ties into Wesnoth.

Platform consistency is certainly a good thing – so I’m guessing the attack_prediction code isn’t run by each node in a network game in a way where machines could disagree on the outcome.
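To make the consistency worry concrete, here is a small sketch (mine, not Rusty's code, and not how Wesnoth does it): floating point addition isn't associative, so the same numbers summed in a different order can give a different answer. Anything that changes the order or the intermediate precision could make two machines disagree. Doing the sums in scaled integers is one common way around that; the scale of 1000 is an arbitrary choice for the example.

```python
def sum_left_to_right(values):
    """Plain floating point sum, in the order given."""
    total = 0.0
    for v in values:
        total += v
    return total


def sum_fixed_point(values, scale=1000):
    """Sum in integer thousandths. Once the values are converted,
    integer addition is exact and does not depend on the order."""
    return sum(round(v * scale) for v in values)


if __name__ == "__main__":
    probs = [0.1, 0.2, 0.3]
    forward = sum_left_to_right(probs)
    backward = sum_left_to_right(list(reversed(probs)))
    # The two float sums need not agree; the integer sums do.
    print(forward == backward)
    print(sum_fixed_point(probs) == sum_fixed_point(list(reversed(probs))))
```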

This does however bring up an interesting thing. What if, in the future, it was going to be on a per-node basis and people wanted it to be consistent. How do you warn that this isn’t the case (to somebody who is really just reading the docs on this function)?

Is it easy (or is there even a good way) to separate code that’s on one machine versus every one? In NDB we have some protocols where some things are done on a master and others on the slaves (and sometimes, when we go back to refactor the code, we move some of this stuff around – e.g. some work on the BACKUP block that I did a while ago).

In NDB we rely on separate documentation (a diagram showing what signals go where and from who) and keep the code for executing the signals together in the code. We require the coder to think when they’re changing things about where the code is going to be executed (on the master, the slave or both).

We’ve also started to get some better habits in naming structures that are only going to be filled out on the master (or slave) or both. Writing code that looks at the wrong thing has been a source of bugs (especially while hacking on something) that are annoying to track down.

So how do we handle these functions that in some cases shouldn’t be used (e.g. when consistency across platforms is important, or that should only be used on the slave side of a distributed protocol)? Or rather, how do we stop others (and ourselves) from getting it wrong in the future?
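One way to make the warning harder to miss, sketched here in Python rather than NDB's C++ and with every name invented for the example: tag each function with the role it is meant to run under and have it refuse to run anywhere else. This is not how NDB does it; it only illustrates turning "think about where this executes" into something the code checks.

```python
import functools


class WrongRoleError(RuntimeError):
    pass


class Node:
    def __init__(self, role):
        self.role = role  # "master" or "slave"


def runs_on(*allowed_roles):
    """Decorator: the wrapped function may only execute on these roles."""
    def decorate(func):
        @functools.wraps(func)
        def wrapper(node, *args, **kwargs):
            if node.role not in allowed_roles:
                raise WrongRoleError(
                    f"{func.__name__} is for {allowed_roles}, "
                    f"but this node is a {node.role}")
            return func(node, *args, **kwargs)
        wrapper.allowed_roles = allowed_roles  # visible to docs and tools
        return wrapper
    return decorate


@runs_on("master")
def start_backup(node, backup_id):
    return f"master starting backup {backup_id}"


@runs_on("master", "slave")
def report_status(node):
    return f"{node.role} ok"


if __name__ == "__main__":
    print(start_backup(Node("master"), 1))
    try:
        start_backup(Node("slave"), 1)
    except WrongRoleError as err:
        print("refused:", err)
```

The check only fires at run time, so it is a safety net rather than a replacement for reading the code; the allowed_roles attribute is there so documentation or a lint script could pick the information up as well.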

Is the ultimate answer just that “you should read the code and understand it before you use it”? Probably, because any comments are going to be out of date anyway….

I now look forward to some sort of discussion.

Beat on “state of the dolphin” (or: Why Software is never really ready until a .20 release)

Beat Vontobel blogs about “fuþark: The silence of futhark and the state of the dolphin” which is basically about how he’s found that the 5.0.20 release of MySQL is when the 5.0 release is really starting to shine.

This confirms my theory (that I’ve had for quite a while now… like years) that a software release is never really mature until it hits about .20 (that’s dot twenty, not dot two).

When something reaches .10 (dot ten) it’s no longer going to be annoying for most uses, but .20 means that you’re going to be happy. Don’t ask me really why this is the case, but it is.
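For the record, the whole theory fits in a few lines. This is a tongue-in-cheek sketch: the function name and the labels are made up, and it only looks at the last number in the version string.

```python
def maturity(version):
    """Apply the .10/.20 rule of thumb to the last numeric component."""
    last = int(version.split(".")[-1])
    if last >= 20:
        return "happy"
    if last >= 10:
        return "no longer annoying"
    return "still annoying"


if __name__ == "__main__":
    for v in ["5.0.20", "2.6.10", "4.0.26", "0.19"]:
        print(v, "->", maturity(v))
```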

Think about the 2.6 kernel (yes, Linux Kernel – honestly, you think I was talking about something else?). At about 2.6.10, it was no longer a pain to use and get things going – everything was starting to be smooth. As we’re getting closer to .20, things are getting better too. Mind you, everything here does run 2.6 now (and so does my mum’s machine – which is always a good sign of something being ready). Once 2.4 hit .20, you’d never even think about using 2.2; 2.4 was perfect (except when you wanted 2.6).

GNOME (and everything attached to it) is getting to be a really good desktop – ever since about the 2.10 release I’ve been using much more of the GNOMEy way of doing things because it’s actually getting useful and usable (don’t get me wrong, previous releases were good too – but a lot more things annoyed me). As the releases have progressed, I’m increasingly convinced that 2.20 will be the “we’re here” release. 2.14 is a lot better, but there’s still a bunch of stuff that has to be done before it’s totally kick-ass.

There are no surprises in MySQL 4.0 (it’s past .20 – at .26 now). Everybody knows and trusts it. 4.1 is at 4.1.18 – which is about as good as a .20 and it’s a pretty happy release. But due to 4.0 being rather solid – a lot of people have just stuck there. We’re seeing a bunch move to 5.0 – but my theory is that this will be 5.0.20 or above. Hrrm… anybody see a pattern?

MySQL 5.1 is at 5.1.10 (or so) and it’s stopped being annoying, and that great march towards a .20 is healthy and active.

GCC 2.95 had a lot of respect for a very long time (now it’s just a bit old). Note that .95 is higher than .20 :)

EMACS is at version 21, but ed is only at .2 (hrrm.. and which is used by more people as their editor, I wonder).

aptitude at 0.2.15 (getting to .20) – while apt is at 0.6.40 (above .20). RPM is only at 4.0.4 – so a bit to go there :)

The version of postgresql is 7.5.9 over here… so getting to the .1 stage, but away from the .20. (Now I’m going to watch comments fill up with postgresql guys going on about something, I just know it :) But there is 7.3.14 – a lot closer to .20!

MythTV is at 0.19 – getting closer to the .20 release (it’s a lot better than even just a few releases ago).

(versions here mostly taken from whatever ubuntu 5.04 has)

Note that attempting to skip a whole bunch of versions and label your software 95, 98, 2003 or whatever doesn’t get you “.20” status. Neither does just skipping to “.20” automatically. It’s about hard work and removing annoying things (we tend to call them bugs).

This is a really stupid metric of software maturity. It is, however, disturbingly accurate.