In a letter sent last week to Linux companies, The SCO Group Inc. made a number of specific claims about programs within Linux that it contends were stolen from its Unix intellectual property. However, several Linux experts, including Linux founder Linus Torvalds, on Monday countered SCO’s assessment, arguing that the programs cited by SCO are Linux through and through.
Eric Raymond, president of the Open Source Initiative, told eWEEK.com there was a good reason why some of the code looked similar. “Do you know that there is not one bit of executable code in those files? They’re pretty much all macros and declarations forced by POSIX and other technical standards.”
Meanwhile, Bruce Perens, an open-source leader, told eWEEK.com that some parts of the code seemed to show gaps in Lindon, Utah-based SCO’s interpretation of evolutionary history. “There are mistakes in the Linux versions that don’t exist in the Unix ones, and i386 Linux doesn’t even use the same numbers as in Unix,” Perens said.
Torvalds went into far deeper detail. “I’m pretty sure the same is true of the errno.h file too (which is then duplicated several times for each architecture),” Torvalds told eWEEK.com.
“In fact, I’m pretty sure the error numbers aren’t even the same on Linux/x86 as they are on traditional Unix, exactly because the Linux header file was written independently,” he said.
“But [the errno.h files] obviously have the same error names. That’s not because they were copied; it’s because that’s specified by several standards, not Unix per se; you’ll find those error names in any operating system that has a C compiler,” Torvalds said.
Torvalds said he picked two of the 71 files SCO listed as examples of intellectual-property theft, both of which he had written himself.
“This is just a quick analysis, but it boils down to the fact that SCO is [yet again] claiming copyright on something that they did not write, and that I can prove that they did not write,” Torvalds said.
Inside the Code
Torvalds moved his discussion into the code itself.
“SCO lists the files include/linux/ctype.h and lib/ctype.h, and some trivial digging shows that those files are actually there in the original 0.01 distribution of Linux [of September 1991]. I can state I wrote them. Looking at the original ones, I’m a bit ashamed: the toupper() and tolower() macros are so horribly ugly that I wouldn’t admit to writing them if it wasn’t because somebody else claimed to have done so!”
He continued that “the details in them aren’t even the same as in the BSD/Unix files. The approach is the same, but if you look at actual implementation details you will notice that it’s not just that my original tolower/toupper were embarrassingly ugly; a number of other details differ, too.”
“In short: for the files where I personally checked the history, I can definitely say that those files are trivially written by me personally, with no copying from any Unix code, ever. So it’s definitely not a question of all derivative branches, [rather] it’s a question of the fact that I can show, and SCO should have been able to see, that [SCO’s] list clearly shows original work, not copied work,” Torvalds asserted.
In addition, Torvalds claimed that some similarities (and differences) between Linux and traditional Unix can be attributed to the limited number of ways available to efficiently implement programming functions and other features.
“Both Linux and traditional Unix use a naming scheme of underscore and a capital letter for the flag names. There are flags for ‘is upper case’ (_U) and ‘is lower case’ (_L), and surprise, surprise, both Unix and Linux use the same name. But think about it: If you wanted to use a short flag name, and you were limited by the C standard naming, what names would you use? Maybe you’d select U for Upper case and L for Lower case?”
“Looking at the other flags, Linux uses _D for Digit, while traditional Unix instead uses _N for Number. Both make sense, but they are different.”
“I personally think that the Linux naming makes more sense (the function that tests for a digit is called isdigit(), not isnumber()), but on the other hand I can certainly understand why Unix uses _N: the function that checks for whether a character is alphanumeric is called isalnum(), and that checks whether the character is an upper-case letter, a lower-case letter or a digit (a.k.a. number),” Torvalds said.
“In short, there aren’t that many ways you can choose the names, and there is lots of overlap, but it’s clearly not 100 percent,” he said.