tame/bin at 7c9d6837fe18e444c33bfa855613d31460ccb051 - tame

Mike Gerwitz 7c9d6837fe 2023-10-03 14:14:47 -04:00

Improve runner reload stability

The `tame` client has the ability to request a runner reload by issuing SIGHUP to the runner PID. This will cause `tamed` to kill the runner and respawn it.

There were situations where this process hung or did not operate as expected; the process was not reliable.

This does a number of things to make sure the process is in a good state before proceeding:

  1. The HUP trap is set a single time, rather than being reset each time the signal is received for the runner.

  2. A `reloading` flag is set by the `tame` client when reloading is requested, and the flag is cleared by `tamed` when spawning a new runner. This allows the client to wait until the reload is complete, and bail out otherwise. Without this, there is a race, where the client may proceed to issue additional requests before the original runner terminates.

  3. Kill the runner with SIGKILL (signal 9). This gives no opportunity for the runner to ignore the signal, and gives us confidence that the runner is actually going to be killed.

This may have caused errors that look like this (where 129 is the HUP reload, which is expected):

  warning: failed runner 0 ack; requesting reload
  warning: runner 0 exited with code 129; restarting
  error: runner 0 still unresponsive; giving up

The last line may also be omitted, with instead an empty `xmlo` being generated.

DEV-10806

dslc.in	dscl: Replace process with `java`	2023-10-03 14:14:43 -04:00
tame	Improve runner reload stability	2023-10-03 14:14:47 -04:00
tamed	Improve runner reload stability	2023-10-03 14:14:47 -04:00
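
The flag handshake and SIGKILL step described in the commit message above (points 2 and 3) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code in `tame` or `tamed`: the function names, the use of a flag file, the polling interval, and the `sleep` stand-in for the runner are all hypothetical.

```shell
# Hypothetical sketch of a reload handshake; none of these names or
# paths are taken from the real `tame`/`tamed` scripts.

# Client side: set the `reloading` flag, then signal the runner.
request_reload() {
  local flag="$1" pid="$2"
  : > "$flag"                            # flag exists => reload pending
  kill -HUP "$pid" 2>/dev/null || true   # runner may already be gone
}

# Client side: wait for the daemon to clear the flag.
# Returns non-zero (bail out) if it is still set after the given tries.
await_reload() {
  local flag="$1" tries="${2:-10}"
  while [ -e "$flag" ]; do
    tries=$((tries - 1))
    [ "$tries" -gt 0 ] || return 1
    sleep 1
  done
}

# Daemon side: make sure the old runner is dead, spawn a new one,
# and only then clear the flag so waiting clients may proceed.
respawn_runner() {
  local flag="$1" old_pid="$2"
  kill -KILL "$old_pid" 2>/dev/null || true   # SIGKILL cannot be ignored
  wait "$old_pid" 2>/dev/null || true         # reap it if it is our child
  sleep 1000 &                                # stand-in for a real runner
  runner_pid=$!
  rm -f "$flag"                               # reload complete
}
```

A client would call `request_reload` followed by `await_reload`, and treat a non-zero status from `await_reload` as a reason to give up rather than issue further requests. Clearing the flag only after the new runner has been spawned is what closes the race the commit message describes.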