Oracle consultant Tim Hall, after explaining how to write a shell script that starts two Oracle services and then wrap it up in a System 5 rc script, went on to explain how to replace System 5 rc with systemd.
RemainAfterExit=yes
ExecStart=/home/oracle/scripts/startup.sh >> /home/oracle/scripts/startup_shutdown.log 2>&1 &
ExecStop=/home/oracle/scripts/shutdown.sh >> /home/oracle/scripts/startup_shutdown.log 2>&1
This clearly has never been tested.
Anyone who had ever used this service definition would have wondered, just for starters, why the log files were never written to or even created.
The problem is that this has just transplanted a line of shell script into a systemd service unit without either thought or even the simplest of tests.
ExecStart and ExecStop do not use shells as interpreters, and are not lines of shell script.
The systemd manual itself explicitly mentions this:
Specifically, redirection using <, <<, >, and >>, pipes using |, running programs in the background using &, and other elements of shell syntax are not supported.
What is actually happening is that all of those characters are being passed as arguments to the two shell scripts. Fortunately, they both entirely ignore their command-line arguments.
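For comparison, here is a sketch of two ways that the intended logging could be expressed in terms that systemd actually understands. The paths are the ones from the quoted unit. The append: form is an assumption about the systemd version in use, as it needs systemd 240 or later. Neither alternative addresses the structural problems described in the rest of this page; they only make the redirection do something.

```ini
[Service]
# Alternative 1 (assumes systemd 240 or later): let systemd open the log file.
# These settings apply to every process that the service runs,
# so both ExecStart and ExecStop output land in the same file.
StandardOutput=append:/home/oracle/scripts/startup_shutdown.log
StandardError=inherit
ExecStart=/home/oracle/scripts/startup.sh
ExecStop=/home/oracle/scripts/shutdown.sh

# Alternative 2: if shell syntax is wanted, invoke a shell explicitly.
# Note that the trailing & is dropped; backgrounding the script is not wanted.
#ExecStart=/bin/sh -c 'exec /home/oracle/scripts/startup.sh >> /home/oracle/scripts/startup_shutdown.log 2>&1'
#ExecStop=/bin/sh -c 'exec /home/oracle/scripts/shutdown.sh >> /home/oracle/scripts/startup_shutdown.log 2>&1'
```

With neither alternative in place, the output simply goes wherever systemd sends service output by default, which is normally the journal.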
But that is just for starters. There is worse to come.
The scripts operate as follows:
They set up a bunch of environment variables.
The startup script forks off a process, a child of the main shell script interpreter process, to run lsnrctl start.
The shutdown script forks off a process, a child of the main shell script interpreter process, to run lsnrctl stop.
They also fork off a process, a child of the main shell script interpreter process, to run sqlplus with a here document that passes commands to its interactive command line.
They are not the actual dæmon program, which systemd is expecting them to be.
But lsnrctl isn't the actual dæmon program either.
It is a control program that either launches or stops the actual dæmon process.
The actual dæmon process is a grandchild of the main process that systemd spawned.
(Gunther Pippèrr took this and managed to turn it into an even worse version where the actual dæmon process is the great-great-grandchild of the main process that systemd spawned.
systemd spawns a shell that interprets a startdb.sh script, which forks a child process to execute a shell that interprets a startStop.sh script, which forks a child process to execute a shell that interprets a dbstart script, which forks a child process to run sqldba, which forks a child process to be the actual database dæmon.)
There are three models that a systemd-managed dæmon can follow, and this doesn't match any of them.
In the Type=simple model, the process spawned by systemd is the dæmon.
But that's not the case here.
Here, it's an instance of the shell interpreting the startup.sh script and forking child processes.
In the Type=forking model, the directly forked child of the process spawned by systemd is the dæmon.
But that's not the case here, either.
Here, that child is the process running lsnrctl start, and there's another layer to go before reaching the actual dæmon process.
Once again a dæmon is not even attempting to speak the forking dæmon readiness protocol.
In the Type=notify model there's a lot more flexibility about which process ends up being the one that systemd knows is the dæmon.
But that involves a direct communication mechanism between systemd and the dæmon, the "sd_notify" protocol, which Oracle's DB Listener doesn't implement.
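To make the three models concrete, here is a sketch of what a service definition looks like when a dæmon does fit each one. The dæmon name exampled, its command-line options, and its PID file path are all hypothetical and exist only for illustration. These are three alternative unit files shown together, not one file.

```ini
# Alternative unit 1: the spawned process is itself the dæmon and does not fork.
[Service]
Type=simple
ExecStart=/usr/local/sbin/exampled --foreground

# Alternative unit 2: the spawned process forks the dæmon as its direct child
# and then exits; the PID file tells systemd which process is the main one.
[Service]
Type=forking
PIDFile=/run/exampled.pid
ExecStart=/usr/local/sbin/exampled --daemonize

# Alternative unit 3: the dæmon reports readiness itself by sending READY=1
# over the sd_notify socket that systemd provides.
[Service]
Type=notify
NotifyAccess=main
ExecStart=/usr/local/sbin/exampled --foreground
```

The startup.sh arrangement matches none of these: the spawned shell is not the dæmon, its direct child is not the dæmon, and nothing in the chain sends a readiness notification.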
Add to that the matter that the scripts are also running sqlplus, another control utility program that is not the actual dæmon program either.
This creates another grandchild that is another dæmon process that has a separate job to do.
This ridiculous edifice ends up running two main server processes as a single systemd service, and to systemd neither of them is the actual single main server process that it needs to monitor and to send signals to.
To systemd, it appears that the main process rather swiftly exits. That's an indication that the service is going back to the inactive state, and usually systemd then proceeds to clean up any stray child processes left around. But here, the stray child processes are the dæmons, two of them no less. systemd originally ended up killing them almost as soon as they were started.
Hence the bodge that was suggested by other people, who did test this and couldn't get it to work.
People have bodged this by marking the service as RemainAfterExit=true to prevent the dæmons from being cleaned up.
But this just leaves systemd with two services in one.
It reports that the main service process has "exited" and does not track the individual service statuses or their actual main dæmon processes.
If you have two services, have two service definitions. It really is that straightforward and obvious.
An example of this was created by an anonymous ServerWorld person. The two service definitions are:
lsnrctl.service
This runs lsnrctl start and lsnrctl stop.
oracledb.service
This runs dbstart and dbshut; and could quite easily be adapted to instead run sqlplus with the STARTUP and SHUTDOWN commands.
Donghua Luo almost got this, too, but resorted to using /bin/su - oracle all over the place instead of simply defining the service as User=oracle as the anonymous ServerWorld person did.
This is an abuse of su, as su is a tool for adding privileges, not for dropping them.
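A minimal sketch of the two-service shape follows. It is not a copy of the ServerWorld definitions. The ORACLE_HOME path /opt/oracle/product/dbhome is hypothetical, and the choice of Type=forking is an assumption, made on the grounds that both control programs launch their dæmons and then exit. The two units are shown together here but belong in two separate files.

```ini
# File 1: lsnrctl.service (hypothetical ORACLE_HOME path throughout)
[Unit]
Description=Oracle Net Listener

[Service]
Type=forking
# Drop privileges with User=, rather than wrapping commands in /bin/su - oracle.
User=oracle
Environment=ORACLE_HOME=/opt/oracle/product/dbhome
ExecStart=/opt/oracle/product/dbhome/bin/lsnrctl start
ExecStop=/opt/oracle/product/dbhome/bin/lsnrctl stop

[Install]
WantedBy=multi-user.target

# File 2: oracledb.service
[Unit]
Description=Oracle Database

[Service]
Type=forking
User=oracle
Environment=ORACLE_HOME=/opt/oracle/product/dbhome
ExecStart=/opt/oracle/product/dbhome/bin/dbstart /opt/oracle/product/dbhome
ExecStop=/opt/oracle/product/dbhome/bin/dbshut /opt/oracle/product/dbhome

[Install]
WantedBy=multi-user.target
```

With this split, each service can be started, stopped, and inspected on its own, and no shell redirection or su wrapper appears anywhere in the definitions.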
Of course, this is not perfect, merely better.
Strictly speaking, lsnrctl is just unnecessarily duplicating work that systemd already does and can do directly.
systemd will start and stop the dæmon.
systemd will ensure that only one dæmon is ever started at any time.
Having an intermediary special-purpose lsnrctl program, that forks one fixed child process and attempts to track it so that it can kill it later, is unnecessary when one already has a general-purpose dæmon manager that can do this, and one is already using it.
© Copyright 2016
Jonathan de Boyne Pollard.
"Moral" rights asserted.
Permission is hereby granted to copy and to distribute this web page in its original, unmodified form as long as its last modification datestamp is preserved.