Not only is it hard to define "defect" (and it is obvious that some defects are worse than others), but this code review sounds like it only spots "grammatical" or style errors in the code. It doesn't sound like it could find a defect in an algorithm implementation or logic. To me, that is where the true defects are: in the logic/reasoning breakdowns.
It does indeed sound a bit like that, and with good reason. If you notice, the "independent review" was carried out by Reasoning Inc., and we've heard of them before in these parts.
For the benefit of those who haven't seen this trollfest^H^H^H^H^H^H^H^H^Hstory in its previous incarnations, Reasoning's services spot what some people call "systematic" errors, things like NULL pointer dereferencing or the use of uninitialized variables. As many people note every time this subject comes up, any smart development team will use a tool like Lint to check their code anyway, as a required step before check-in and/or as a regular, automated check of the entire code base, and so any smart development team should find all such errors immediately. IOW, it's grossly unfair to compare open and closed source "code quality" on this basis. Any project that has errors like this in it at all isn't serious about quality, and it shouldn't take an external study to point this out.
Serious code quality is not dictated by how many mechanical errors slip through because of weaknesses in the implementation language. Rather, it is indicated by how many "genuine" logic errors (cases where the output differs unintentionally from the specifications) there are. Of course, no automated process can identify those, but to get a meaningful comparison of code quality, you'd need to investigate that aspect rather than kindergarten mistakes.
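To make the distinction concrete, here is a minimal sketch (the function names and the "spec" are made up for illustration). Assume the specification says "return the arithmetic mean". The first version never crashes and contains no null dereference or uninitialized variable, yet its output differs from the spec; that is the kind of defect a mechanical scan is not looking for.

```python
def mean_buggy(values):
    """Hypothetical spec: return the arithmetic mean of a non-empty list."""
    # Logic error: floor division silently discards the fractional part.
    return sum(values) // len(values)


def mean_correct(values):
    """Same hypothetical spec, implemented with true division."""
    return sum(values) / len(values)


if __name__ == "__main__":
    sample = [1, 2]
    print(mean_buggy(sample))    # 1   (violates the spec)
    print(mean_correct(sample))  # 1.5 (matches the spec)
```

Both functions agree on inputs like [2, 4], so a casual test could easily miss the difference.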
There are other objections to their principal metric as well. For starters, source code layout is not normally significant in C, C++ or Java, so any metric based on line count is going to be flawed at best. But the big objection is that they're talking about childish mistakes, and comparing supposedly world-class software based on childish mistakes isn't helpful (except to dispel the myth that some big name products have sensible development processes).
Re: they quantified it by dividing verified defects by lines of code.
Problem with that is that it assumes the same "code density". Granted, it's probably not going to differ by a factor of six, but remember the old question about programmer productivity:
Who's more productive: the coder who solves a given problem with 100 lines of code written in 1 hour, or the coder who solves it with 10 lines in 2 hours?
I mean, simple stuff like doing this:
bool function(int i);
//blah blah blah
bool function(int i);
foo = false;
foo = function(i);
//blah blah blah
...will give you a threefold difference in line count (specifically counting lines in the main() function). Throw in an identical line using malloc in each, both forgetting to free it later, and you've got a "bug density" of .33 for the former, and .14 for the latter. Heck, you could have two un-freed mallocs in the latter and it'd still only be at .25! I'm not saying the study is wrong (I'd rather have the code out where I can see it, no matter WHAT the "bug density"). I'm just saying that I wouldn't take any statistic that is derived using "lines of code" as a variable as a serious, hard number.
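The arithmetic behind those figures, as a quick sketch. The line counts of 3, 7 and 8 are an assumption inferred from the quoted densities (1/3, 1/7 and 2/8); they are not stated outright above, and the function name is made up.

```python
def defect_density(defects, lines):
    """Defects per line of code, the metric under discussion."""
    return defects / lines


if __name__ == "__main__":
    # (label, leaked mallocs, lines in main) -- line counts are assumed.
    cases = [
        ("terse, one leak", 1, 3),
        ("verbose, one leak", 1, 7),
        ("verbose, two leaks", 2, 8),
    ]
    for label, defects, lines in cases:
        print(f"{label}: {defect_density(defects, lines):.2f}")
    # terse, one leak: 0.33
    # verbose, one leak: 0.14
    # verbose, two leaks: 0.25
```

The verbose version comes out "cleaner" even with twice as many leaks, purely because the denominator is padded.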
If only it were just a matter of MySQL lacking features that would, after much mudslinging at the ideas themselves, be grudgingly retrofitted into a new table type.
MySQL's attitude toward data integrity can be summed up as "if the constraint can't be satisfied, do it half-assed anyway." I find myself having to write application code to manage data integrity with MySQL, something I can take for granted with a real database.
No defects != good software.
A flawless implementation of a crap algorithm is still crap. I don't care if your bubble-sort routine has no memory leaks or buffer overruns; it still scales O(N^2). Likewise, a so-called "database" which does not implement key features like transactions and stored procedures is fundamentally flawed even if there are zero coding errors.
MySQL may be well-written, but it's still a piece of crap by the standards of any professional DBA.
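The bubble-sort point can be checked with a small sketch (a plain textbook bubble sort with a comparison counter added for illustration; the names are made up). The routine is correct and has nothing a defect scanner would flag, yet doubling the input roughly quadruples the work.

```python
def bubble_sort(items):
    """Return (sorted copy, number of comparisons made)."""
    data = list(items)
    comparisons = 0
    # After each pass the largest remaining element sits at index `end`.
    for end in range(len(data) - 1, 0, -1):
        for i in range(end):
            comparisons += 1
            if data[i] > data[i + 1]:
                data[i], data[i + 1] = data[i + 1], data[i]
    return data, comparisons


if __name__ == "__main__":
    for n in (100, 200):
        _, count = bubble_sort(range(n, 0, -1))
        print(n, count)
    # 100 4950
    # 200 19900
```

This version always performs n(n-1)/2 comparisons, so zero defects per line tells you nothing about whether it was the right algorithm to ship.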
Sorry, but until MySQL has a mode where ALL tables are transaction safe, or at least throws an error when you try to create a fk reference to a non-transaction safe table, its transactions are too prone to data loss due to human error.
It's a good data store, but the guys programming it have to "get it" that transactions can't be optional in certain types of databases, and neither can constraints, or fk enforcement.
MySQL has a tendency of failing to do what you thought it did, and failing to report an error so you know. This is a legacy left over from being a SQL interpreter over ISAM files. It makes MySQL a great choice for content management, but a dangerous choice for transactional systems.
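For anyone who hasn't been bitten by this class of problem, here is a rough sketch of what "optional" constraint enforcement looks like. To be clear, this uses SQLite through Python's standard sqlite3 module as a stand-in, not MySQL, and the table names are made up; SQLite happens to expose foreign key enforcement as a switch, which makes the failure mode easy to reproduce without a server.

```python
import sqlite3


def try_orphan_insert(enforce):
    """Insert an order for a customer that does not exist.

    Returns "accepted" or "rejected". SQLite stand-in, hypothetical schema.
    """
    conn = sqlite3.connect(":memory:")
    # The pragma must be set before any transaction is open.
    conn.execute(f"PRAGMA foreign_keys = {'ON' if enforce else 'OFF'}")
    conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY)")
    conn.execute(
        "CREATE TABLE orders ("
        "id INTEGER PRIMARY KEY, "
        "customer_id INTEGER REFERENCES customers(id))"
    )
    try:
        conn.execute("INSERT INTO orders (id, customer_id) VALUES (1, 999)")
        conn.commit()
        return "accepted"
    except sqlite3.IntegrityError:
        return "rejected"
    finally:
        conn.close()


if __name__ == "__main__":
    print(try_orphan_insert(enforce=False))  # accepted, with no error raised
    print(try_orphan_insert(enforce=True))   # rejected
```

With enforcement off, the schema still declares the reference and the bad row still goes in without complaint, which is the "failing to report an error so you know" behavior described above.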
Yeah, and the 3 users on the planet who actually need a full-fledged SQL database can install Oracle or DB2. Although I've had my indexes corrupted and other horrible things with both those database packages.
I've worked on several projects interacting with SQL databases and I've only seen one really take advantage of the power of the database. Most of them are using Oracle as a glorified dBASE III, and as a glorified dBASE III, MySQL is much less expensive. And I've seen entire companies built around dBASE III applications.
Re: Six times better?
Sadly, this isn't what most people assume it means. Reasoning's software only finds "obvious" defects, such as null pointer assignments. It doesn't (and can't) determine if a bit of code does what it's supposed to do, only that it does whatever it does without any danger of crashing.
Basically, it's no different from running your code through BoundsChecker or CodeWizard, or any number of other such tools that check for obvious errors (null pointers, obvious buffer overflows, dangling references, etc.).
While I have no doubt that MySQL's code is perhaps "cleaner" than your typical unpublished code, I have plenty of doubt that MySQL's code is "better" than unpublished code in terms of efficiency, logic errors, etc.
Re: On paper it looks better
That's like asking how a little red wagon compares to a Formula 1 racecar -- there is simply no comparison. The list of missing features in MySQL could fill a book. MySQL is not a true relational database, so comparing it to Oracle, Sybase, DB2, or MS-SQL is like comparing apples and very small rocks. They're not the same thing at all.
It would be more accurate to compare MySQL to dBASE III, Berkeley DB, or Microsoft Access. Against those products, MySQL compares favorably. MySQL performs well for tasks in a narrowly-constrained domain of problems, and is totally incapable of dealing with anything else.
This "proves" that MySQL is better than commercial offerings. Good. A lot of people knew that. Hats off to the developers. But...
1. This cannot be generalized into a property of all open source projects.
2. It's more a tribute to the architecture and original core developers of MySQL than anything else.
3. Realize that even though MySQL is an open source product, MySQL AB is the *company* that organizes and pays for MySQL development. So, again, you can't generalize this into something that covers late night hackers working on personal projects in their basements (the open source geek fantasy).
MySQL is awesome! But let's be careful about this story, okay? It's the over-generalization that gives OSS/Linux advocates a bad name ("The Gimp is equivalent to Photoshop!").
MySQL is a "TOY" as far as RDBMSs go
First off, I think MySQL is a fantastic product. It's the perfect mix of speed and ease of use, well suited for small to medium sized datastores where speed and reliability are a must. That being said, I think it's unfair to describe this product alongside others such as Oracle, MSSQL (blow me guys, it's a great product) and even PostgreSQL and SAP DB (which is the best open source option in my opinion). The codebase for MySQL will never achieve the magnitude of the aforementioned products, so it should be used that way. Just my 2 cents.