>pid locking

>So I’ve had some issues with duplicate processes running, which confused me. I looked in the code, and I’m using Proc::Pid::File to control the processes, exiting if the old process is still running. I finally found some time to dig into this late last week. The logic the module currently has you use is essentially:

if pidAlive
  then exit
else
  updatePid
end if

The problem here is that there is a race condition between checking whether the old process is alive and updating the pid file to take control. What is happening on our system is:

process1: is pidAlive? no.
process2: is pidAlive? no.
process1: update pidFile.
process2: update pidFile.

Now both process1 and process2 are running, but the pid file records only process2; nothing indicates that process1 is alive. If the restart or shutdown systems are used, they talk to process2 while process1 keeps on trucking. Once process2 has been stopped, process3 can check the pid, find it not alive, and start up alongside process1, perpetuating the problem.
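
To make the interleaving concrete, here is a minimal Python sketch (not the module's Perl code) that replays the four steps above in a single process. The helper names read_pid, pid_alive, update_pid and replay_race, and the PIDs 1001 and 1002, are made up for illustration; it assumes a POSIX system.

```python
import os


def read_pid(path):
    """Return the PID stored in the file, or None if it is missing or unparsable."""
    try:
        with open(path) as f:
            return int(f.read().strip())
    except (FileNotFoundError, ValueError):
        return None


def pid_alive(path):
    """The 'is pidAlive?' step: True if the recorded PID is a running process."""
    pid = read_pid(path)
    if pid is None or pid <= 0:
        return False
    try:
        os.kill(pid, 0)  # signal 0 only checks existence, nothing is delivered
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def update_pid(path, pid):
    """The 'update pidFile' step: overwrite the file with the given PID."""
    with open(path, "w") as f:
        f.write(f"{pid}\n")


def replay_race(path):
    """Replay the bad interleaving: both checks happen before either update."""
    p1_sees_alive = pid_alive(path)   # process1: is pidAlive? no.
    p2_sees_alive = pid_alive(path)   # process2: is pidAlive? no.
    if not p1_sees_alive:
        update_pid(path, 1001)        # process1: update pidFile.
    if not p2_sees_alive:
        update_pid(path, 1002)        # process2: update pidFile.
    return p1_sees_alive, p2_sees_alive, read_pid(path)
```

Called on a path where no pid file exists yet, replay_race returns (False, False, 1002): both "processes" decided to start, and only the second one is recorded.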

What is really needed is a single method that performs the check and the update under one lock, which eliminates the race condition:

begin getPid
    create pid file if it does not exist
    exclusive lock pid file
    if pid and pid is running then
        result is false
    else
        result is true
        update pid file to our pid
    end if
    release exclusive lock
    return result
end getPid
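
The pseudocode above could be sketched in Python roughly as follows. This is an illustration under assumptions, not the module's implementation: it uses the advisory lock from fcntl.flock (so it only protects against other processes that take the same lock), it assumes a POSIX system, and the names get_pid and process_running are mine.

```python
import fcntl
import os


def process_running(pid):
    """True if a process with this PID currently exists."""
    try:
        os.kill(pid, 0)  # signal 0 only checks existence
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # exists, but owned by another user
    return True


def get_pid(path):
    """Return True if we took over the pid file, False if a live owner exists."""
    # O_CREAT without O_TRUNC: create the file if missing, never clobber it.
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        # Blocks until no other process holds the lock, so the check and
        # the update below happen as one step relative to other callers.
        fcntl.flock(fd, fcntl.LOCK_EX)
        raw = os.read(fd, 64).decode("ascii", "ignore").strip()
        old_pid = int(raw) if raw.isdigit() else 0
        if old_pid > 0 and process_running(old_pid):
            return False
        # No live owner: rewrite the file with our own PID.
        os.lseek(fd, 0, os.SEEK_SET)
        os.ftruncate(fd, 0)
        os.write(fd, f"{os.getpid()}\n".encode())
        return True
    finally:
        fcntl.flock(fd, fcntl.LOCK_UN)
        os.close(fd)
```

With this shape, the second of two simultaneous callers waits on the lock, then reads the first caller's PID, sees it running, and gets False back. One caveat the pseudocode shares: a recycled PID belonging to an unrelated process still counts as "running".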

The problem with this solution?  Proc::Pid::File has been around for quite a while and a lot of people are using it.  We can’t just change the API like that.  We need to find a way to keep the same API, but change the logic to match what I’ve proposed.

I’ll need to give some more thought to how to fix this without changing the API. Perhaps the answer is simply to keep supporting the current API and add a new method that does both steps at once, implemented cleanly under a lock rather than as a wrapper that calls the two existing methods in sequence.
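
One possible shape for that, again as a hedged Python sketch rather than a patch to the Perl module: keep the two existing entry points so current callers are unaffected, and add a third that does the check and the update under a single lock. The class name PidFile and the method names alive, touch and acquire are hypothetical stand-ins chosen for this example, and a POSIX system is assumed.

```python
import fcntl
import os
from contextlib import contextmanager


class PidFile:
    """Hypothetical stand-in; names are illustrative, not the module's API."""

    def __init__(self, path):
        self.path = path

    @contextmanager
    def _locked_fd(self):
        # Create the file if needed and hold an exclusive advisory lock.
        fd = os.open(self.path, os.O_RDWR | os.O_CREAT, 0o644)
        try:
            fcntl.flock(fd, fcntl.LOCK_EX)
            yield fd
        finally:
            os.close(fd)  # closing the descriptor also drops the lock

    @staticmethod
    def _owner_alive(fd):
        os.lseek(fd, 0, os.SEEK_SET)
        raw = os.read(fd, 64).decode("ascii", "ignore").strip()
        pid = int(raw) if raw.isdigit() else 0
        if pid <= 0:
            return False
        try:
            os.kill(pid, 0)  # signal 0 only checks existence
        except ProcessLookupError:
            return False
        except PermissionError:
            return True
        return True

    @staticmethod
    def _write_own_pid(fd):
        os.lseek(fd, 0, os.SEEK_SET)
        os.ftruncate(fd, 0)
        os.write(fd, f"{os.getpid()}\n".encode())

    def alive(self):
        """Legacy check, kept for existing callers (still racy with touch)."""
        with self._locked_fd() as fd:
            return self._owner_alive(fd)

    def touch(self):
        """Legacy update, kept for existing callers."""
        with self._locked_fd() as fd:
            self._write_own_pid(fd)

    def acquire(self):
        """New method: check and update while holding one lock throughout."""
        with self._locked_fd() as fd:
            if self._owner_alive(fd):
                return False
            self._write_own_pid(fd)
            return True
```

The point of the design is that acquire is not alive followed by touch: those would each take and release the lock separately, leaving the same gap between them. Old code that calls the two legacy methods keeps its current behavior, race included, and can migrate to acquire when convenient.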