Monday, September 12, 2005

Windows 2000: got pid?

If you get that joke, we want to hire you. We spent the entire last month working out various Windows 2000 problems in BitKeeper, most of them related to Windows 2000's aggressive reuse of PIDs.

BitKeeper uses MSYS's bash as a shell. It turns out that bash does not like PIDs to be reused. Here's the scenario. Bash launches processes A, B, and C. Due to the PID recycling of Windows 2000, both A and C get the same PID. This is possible because PIDs are guaranteed to be unique only while the process is running. If A finishes before C is started, C can, and will, get the same PID as A. The problem is that whenever bash waits for a process, it records whether it has already waited on that PID. When bash needs to wait for C, it sees in its table that it has already waited for that PID (because of A) and decides not to wait.
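To make the scenario concrete, here is a toy simulation in Python. It is only a sketch of the bookkeeping described above, not bash's real job-control code, and the names (`NaiveShell`, `FixedShell`, `run_script`) are made up for illustration. The "fix" shown, forgetting a PID's stale record when a new process is launched with it, is an assumption about one plausible remedy and not necessarily what the actual bash patch does.

```python
class NaiveShell:
    """Toy shell that remembers every PID it has ever waited on."""

    def __init__(self):
        self.waited = set()  # PIDs the shell believes it has already waited on

    def launch(self, pid):
        # The naive shell keeps no per-launch state, so a recycled PID
        # inherits whatever record the previous owner left behind.
        pass

    def wait(self, pid):
        """Return True if the shell really blocks on the process."""
        if pid in self.waited:
            return False  # stale record: the shell skips the wait
        self.waited.add(pid)
        return True


class FixedShell(NaiveShell):
    """Hypothetical fix: drop the stale record when a PID is reused."""

    def launch(self, pid):
        self.waited.discard(pid)


def run_script(shell, pids):
    """Launch and wait for each process in order; record whether each wait happened."""
    results = []
    for pid in pids:
        shell.launch(pid)
        results.append(shell.wait(pid))
    return results


# A, B, and C, where C is handed A's old PID (the numbers are arbitrary).
pids = [100, 104, 100]
naive = run_script(NaiveShell(), pids)
fixed = run_script(FixedShell(), pids)
print(naive)  # [True, True, False]: the shell never waits for C
print(fixed)  # [True, True, True]
```

The `False` in the naive result is process C running on unattended while the script moves on to its next line.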

What are the symptoms? Well, to sum it up in one phrase: unintended parallelism. Needless to say, this wreaks havoc in every shell script ever written. It's like the system randomly adding an ampersand (&) at the end of lines in the script. Not very nice.

Microsoft seems to have fixed the problem of PID recycling in later versions of Windows. Neither XP nor 2003 reuses PIDs the way Windows 2000 does.

I back-ported a bash 2.05b patch that fixes this issue to bash 2.04. I'll post it to the MSYS mailing list later this week.

By the way, I think that a cool idea for a T-Shirt would be the title of this post: "Windows 2000: got PID?"
