05.16.07
Mind the P’s and Q’s
Redhat Cluster Services is very touchy. It expects everything to be set up properly or it’ll snap like a rubber band stretched too far. One of the things it expects is LSB-compliant init scripts.
Some init scripts do not implement these specs properly or completely. MySQL is a noted violator (both the scripts from MySQL and from Redhat - go figure.)
In particular, you need to be aware of two issues:
- does the script implement a ’status’ function?
- does the script return success when the service is already stopped?
Item #1 is critical to RHCS. It calls the init script once a minute, passing it the ’status’ parameter. If the script does not implement a status function, then RHCS will continuously bounce the service, thinking it has failed somehow.
Item #2 is a little more subtle. When a service is already stopped, stopping it again is of no real benefit/use and according to the LSB, it should be considered a successful action. If a stop-after-stop returns an error, then RHCS will assume something happened that cannot be resolved by failing over to another node and flags that service as failed, a situation that would require manual intervention to correct.
Since MySQL init scripts are notoriously broken, here’s what to look for. In the ’stop)’ branch of the script argument handling case statement (or in the ’stop()’ function if it’s used) there is usually an ‘if’ statement that checks for a MySQL pid file. Many times, it looks like this:
MYSQLPID=`cat "$mypidfile" 2>/dev/null `
if [ -n "$MYSQLPID" ]; then
Or even this:
if test -s "$pid_file"
then
Now, we need to check what happens if there is no pid file in the else clause. Here’s one example:
else
ret=0
action $"Stopping $prog: " /bin/true
fi
return $ret
Here’s another example:
else
echo "No mysqld pid file found. Looked for $pid_file."
fi
In the first example, the return code, zero, is stored in the ret variable. The action call prints a message to the screen and the /bin/true tells action to print a success message. This would be the preferred method of coding an init script.
In the second example, a statement is written to the screen. The return code is simply the exit status of the echo command, which is success (unless the echo command, itself, failed to write to standard out - something that isn’t very likely to happen.) While this example achieves the desired result, it also isn’t 100% LSB-compliant (since it doesn’t use logging functions for reporting the status) and should be avoided.
MySQL may not be the only init script to fail LSB-compliance. If you use RHCS for other services, you should check the init scripts to make sure they pass muster.
