Continuing on from my previous posts, “MySQL code size over releases” and “MariaDB code size”, I’ve decided to also look into some other code branches. I’ve used the same methodology as in my previous few posts: sloccount, for C and C++ code only.
There are also other branches around in pretty widespread use (if only within a single company). I grabbed the Google, Facebook and Twitter patches and examined them too, along with Percona Server 5.1 and 5.5.
Codebase | LoC (C, C++) | +/- from MySQL | |
Google v4 patch 5.0.37 | 970,110 | +26,378 (from MySQL 5.0.37) | |
MySQL@Facebook | 1,087,715 | +15,768 (from MySQL 5.1.52) | |
Twitter 5.5.29.t10 | 1,192,718 | +3,624 (from MySQL 5.5.29) | |
Percona Server 5.1 trunk | 1,066,418 | +14,878 (from MySQL 5.1.66) | |
Percona Server 5.5 trunk | 1,208,577 | +19,483 (from MySQL 5.5.29) | +142,159 (from PS 5.1) |
Drizzle trunk | 334,810 | | |
The Google patch has always had a reputation for being large, and with an extra 26kLOC of code it is certainly bigger than any of the more current branches. It actually surprises me that it adds this much code.
The Facebook and Percona Server 5.1 branches are amazingly similar in how much extra code they add, and they’re not carbon copies of each other. The Twitter patch is quite notable for how little extra code it adds.
For giggles, I included Drizzle – which is (even with all the plugins) less than a third of the size of MySQL 5.1.
It’s clear that the Percona Server and Facebook patches introduce much less code than MariaDB does, which does go with the general wisdom of them being closer to Oracle MySQL than MariaDB is.
If we look at Percona Server, we see that Percona Server 5.5 does indeed carry a bunch more code than Percona Server 5.1 did: it adds 19,483 lines on top of MySQL 5.5 where Percona Server 5.1 adds 14,878 on top of MySQL 5.1, so roughly 5,000 more lines of code than we’d expect from a simple port from MySQL 5.1 to MySQL 5.5. This feels about right: we’ve added new things to Percona Server 5.5 that weren’t in Percona Server 5.1.
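As a quick sanity check on the arithmetic above, here is a small Python sketch that works only from the figures in the table. The variable names are mine, and the upstream MySQL totals are derived by subtraction rather than taken from a fresh sloccount run.

```python
# Figures copied from the table in this post (sloccount, C and C++ only).
loc = {
    "Percona Server 5.1": 1_066_418,
    "Percona Server 5.5": 1_208_577,
    "Drizzle trunk": 334_810,
}
delta_from_mysql = {
    "Percona Server 5.1": 14_878,  # relative to MySQL 5.1.66
    "Percona Server 5.5": 19_483,  # relative to MySQL 5.5.29
}

# Upstream size = branch size minus what the branch adds on top.
mysql_51 = loc["Percona Server 5.1"] - delta_from_mysql["Percona Server 5.1"]
mysql_55 = loc["Percona Server 5.5"] - delta_from_mysql["Percona Server 5.5"]

# Code in PS 5.5 beyond what a straight port of the 5.1 additions would give.
extra_in_55 = (delta_from_mysql["Percona Server 5.5"]
               - delta_from_mysql["Percona Server 5.1"])

# Drizzle as a fraction of (derived) MySQL 5.1.66.
drizzle_ratio = loc["Drizzle trunk"] / mysql_51

print(f"MySQL 5.1.66 (derived): {mysql_51:,}")
print(f"MySQL 5.5.29 (derived): {mysql_55:,}")
print(f"Extra in PS 5.5 over a simple port: {extra_in_55:,}")
print(f"Drizzle / MySQL 5.1: {drizzle_ratio:.3f}")
```

The derived MySQL 5.5.29 total also matches what you get by subtracting the Twitter delta from the Twitter total, which is a nice consistency check on the table.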
I think there might be something wrong with the way lines are counted:
$ git diff mysql-5.5.29..mysql-5.5.29.t10 --stat
429 files changed, 23707 insertions(+), 1766 deletions(-)
$ git diff mysql-5.5.29..mysql-5.5.29.t10 --stat --relative sql/ include/ client/ mysys/ storage/ unittest/
142 files changed, 7110 insertions(+), 821 deletions(-)
That’s because I’m not counting diff size; I’m taking the difference between sloccount totals. This means that these counts are more of a “hey, what did they add” than a “went and changed code to fix bugs”.
I plan to look at diff size in the not too distant future, as it likely tells a very different story.
It might be worth updating the post to reflect that. Also, if I understand correctly, it’s “hey, what new lines they added”. Which, in my humble opinion, is meaningless because if you add 10 lines to a file, but later remove 10, the count will be zero using this methodology.
Yep.. it’s flawed. So is diffstat too though, as if you just ran indent over everything you’d look like you completely rewrote the thing :)
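To make the two flaws concrete, here is a minimal Python sketch using the standard library's difflib. It is only an illustration: `net_change` stands in for the difference between sloccount totals, and `diffstat` is a rough stand-in for what `git diff --stat` reports, not a reimplementation of either tool. The tiny C fragments are made up for the example.

```python
import difflib

def net_change(old, new):
    """Difference between line totals, like subtracting two sloccount runs."""
    return len(new) - len(old)

def diffstat(old, new):
    """Count inserted and deleted lines, roughly what a diffstat reports."""
    inserted = deleted = 0
    matcher = difflib.SequenceMatcher(a=old, b=new, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag in ("replace", "delete"):
            deleted += i2 - i1
        if tag in ("replace", "insert"):
            inserted += j2 - j1
    return inserted, deleted

# Hypothetical source fragments, invented for illustration.
old = ["int f(void)", "{", "  return 1;", "}"]
bugfix = ["int f(void)", "{", "  return 2;", "}"]   # one line changed
reindented = ["    " + line for line in old]        # whitespace-only change

print(net_change(old, bugfix), diffstat(old, bugfix))
print(net_change(old, reindented), diffstat(old, reindented))
```

The bug fix is invisible to the totals method (net change of zero) but shows up as one insertion and one deletion, while the re-indent also nets to zero yet looks like every line was rewritten in the diff count.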
aaannd updated to mention sloccount.