Normally I don’t just rant about technology, but sometimes I want to document things so I can come back later and see whether anything has gotten better. The other thing is that these items are just plain broken, and people tolerate them even though there are better options on the market.
Three broken items in Linux:
- Process priorities do not translate to I/O priorities
- Load Average includes processes that are blocked waiting for I/O
- SCSI I/O busy for “too long” causes a kernel panic
So before I elaborate, the better options on the market are Solaris and OpenBSD.
Now to elaborate on the issues:
- Process priorities and the associated I/O priority. In most multitasking operating systems you can set a process priority. So imagine my web server should be faster and my backups should be slower. Normally this works. But in Linux it’s ALL JACKED UP. Let’s say the web server has a high priority and the backup process has a low priority. All is good until the backup process starts actually … you know … backing up files. It asks the system to open a file and starts reading the contents so that it can copy the file to the backup. This triggers system calls that read large chunks of the file, and that I/O is NOT prioritized lower, as it should be. So the disk traffic degrades the system’s performance even though the process has a low priority, and the web server responds more slowly. Which is completely unacceptable.
- Load Average in LinSux is wrong. Load average on every other system is an indication of how busy the CPU is. In more detail, there are two concepts here: processes that are running and processes that are sleeping. Running processes are churning through stuff, like maybe some complicated and lengthy mathematical computation. Sleeping processes are waiting for things to happen, maybe waiting for a web browser to request a page. Most systems have hundreds of sleeping processes and very few running processes. Load average is the average number of processes that are running at any given time. So maybe it’s 0.63, which means one process is running 63% of the time. Or maybe it’s 2.26, which means there are 2.26 processes on the run queue on average. Now, in LinSux, processes that are WAITING for the DISK are counted in that number. This is HIGHLY misleading because they aren’t bogging down the CPU at all. A load of 10 can still yield a very usable system. On Solaris or BSD, a load of 10 would mean the system is VERY busy and you should DEFINITELY add CPU. But in Linux you can’t tell what’s going on from just that number. You have to probe much further. I love the standard Linux bigot reaction: “It’s just Linux, it can deal with higher load than other systems.” Okay JACKASS, get off your ego. That’s the dumbest assessment on the planet.
- Busy SCSI channels cause a panic. This is just plain dumb. Apparently, if the machine is very busy, say running VMware Server, and the SCSI channel gets too much traffic, the Linux kernel assumes there’s a problem with the SCSI driver or SCSI devices and just decides to stop (panic), rather than assume the system is being USED LIKE IT SHOULD BE! That’s just whack. I haven’t actually verified this behavior, but it’s supposedly a known issue with VMware. How can that be a KNOWN issue?!
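Going back to the load average item for a second, here is a rough Python sketch of the difference between the two ways of counting. Everything in it is made up for illustration: the sample data and the function names (`count_states`, `average_load`) are hypothetical, and it uses a plain mean over a few samples, whereas real kernels use an exponentially damped average. The state letters follow the `ps` convention: `R` is running or runnable, `D` is uninterruptible sleep (usually waiting on disk), and `S` is ordinary sleep.

```python
from collections import Counter


def count_states(states):
    """Return (running, io_blocked) for a list of process state letters.

    'R' = running or runnable, 'D' = uninterruptible sleep (typically
    waiting on disk I/O). Any other letter is ignored.
    """
    counts = Counter(states)
    return counts["R"], counts["D"]


def average_load(samples, include_io_wait):
    """Plain mean of the per-sample task count.

    Simplification: a real kernel uses an exponentially damped moving
    average, not a plain mean. With include_io_wait=True the 'D' tasks
    are counted too, which is the Linux-style number.
    """
    total = 0
    for states in samples:
        running, blocked = count_states(states)
        total += running + (blocked if include_io_wait else 0)
    return total / len(samples)


# Hypothetical snapshots of a four-process box that is mostly waiting on disk.
SAMPLES = [list("RSDD"), list("SSDD"), list("RRDS"), list("SSDD")]

print("CPU-only load:   ", average_load(SAMPLES, include_io_wait=False))
print("Linux-style load:", average_load(SAMPLES, include_io_wait=True))
```

With these made-up samples the CPU-only figure is 0.75 and the Linux-style figure is 2.5. Same machine, same moment, and the second number tells you nothing about whether the CPU is the bottleneck.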
I think that about sums it up. The moral of the story: use OpenBSD or Solaris.
This entry was posted on Thursday, July 19th, 2007 at 2:36 pm and is filed under Technology.