[flow-tools] Zombies on solaris from flow-capture -R ...
Mark Fullmer
maf@eng.oar.net
Mon, 13 May 2002 13:13:41 -0400
You're right, this is broken. The handler/reaper code has multiple
race conditions. The sig_chld_counter should be a flag with a loop
around non blocking wait*().
Most configurations would create a child every 5 or 15 minutes so I
doubt anyone has ever run into the bug, still I don't know what I was
thinking when I wrote that :)
support.c has a function mysignal() that should fix the signal() semantic
problem.
mark
On Mon, May 13, 2002 at 06:15:31PM +0300, Jarkko Torppa wrote:
>
> flow-capture -R creates zombies on solaris. Singals are only delivered
> once on solaris, after that the signal is reset.
>
> I see 3 ways to fix this:
>
> use sigaction to set the signal
> set signals again in the handler
> use bsd_signal on solaris
>
> Third alternative seems is easiest to implement, but I doubt it is
> portable. Here is patch (agains 0.56) to use bsd_signal where appropriate.
>
> This patch also tries to be bit more sensible about collecting zombies,
> current code has the following problems:
> - does not set WNOHANG on wait
> - if two child processes terminate at the samemoment, it is not
> guaranteed that the
> handler will be called twice
>
> This has not been tested too heavily yet
>
> Index: flow-capture.c
> ===================================================================
> RCS file: /usr/cvsroot/kbus/flow-tools/src/flow-capture.c,v
> retrieving revision 1.1.1.1
> diff -u -r1.1.1.1 flow-capture.c
> --- flow-capture.c 2002/02/11 20:43:41 1.1.1.1
> +++ flow-capture.c 2002/05/13 15:13:31
> @@ -133,6 +133,13 @@
> int calc_rotate (int next, double *trotate, int *cur);
> double doubletime(void);
>
> +/* On solaris signal are reset to default after being delivered
> + once. bsd_signal behaves more nicely, something needs to be done
> + for other sysV platforms */
> +#ifndef HAVE_BSDSIGNAL
> +# define bsd_signal(a,b) signal(a,b)
> +#endif
> +
> int main(argc, argv)
> int argc;
> char **argv;
> @@ -408,16 +415,16 @@
> * configure signal handlers
> */
>
> - if (signal(SIGPIPE, sig_pipe) < 0)
> + if (signal(SIGPIPE, sig_pipe) == SIG_ERR)
> fterr_err(1, "signal(SIGPIPE)");
>
> - if (signal(SIGHUP, sig_hup) < 0)
> + if (bsd_signal(SIGHUP, sig_hup) == SIG_ERR)
> fterr_err(1, "signal(SIGHUP)");
>
> - if (signal(SIGQUIT, sig_quit) < 0)
> + if (signal(SIGQUIT, sig_quit) == SIG_ERR)
> fterr_err(1, "signal(SIGQUIT)");
>
> - if (signal(SIGCHLD, sig_chld) < 0)
> + if (bsd_signal(SIGCHLD, sig_chld) == SIG_ERR)
> fterr_err(1, "signal(SIGCHLD)");
>
> /* sandbox */
> @@ -583,31 +590,39 @@
> tt_now = now = doubletime();
>
> /* stake the zombies */
> - while (sig_chld_counter) {
> -
> - /* one less zombie */
> - sig_chld_counter --;
> -
> - child_pid = wait3(&child_status, 0, 0);
> -
> - if (WIFEXITED(child_status)) {
> -
> - if (WEXITSTATUS(child_status))
> - fterr_warnx("Child %d exit_status=%d", (int)child_pid,
> - WEXITSTATUS(child_status));
> -
> - } else if (WIFSIGNALED(child_status)) {
> -
> - fterr_warnx("Child %d signal=%d", (int)child_pid,
> - WTERMSIG(child_status));
> -
> - } else {
> -
> - fterr_warnx("PID %d exited status=%d", (int)child_pid,
> - child_status);
> -
> + if(sig_chld_counter) {
> + do {
> + while((child_pid = waitpid(-1,&child_status, WNOHANG)) > 0) {
> +
> + /* one less zombie */
> + sig_chld_counter --;
> +
> + if (WIFEXITED(child_status)) {
> + if (WEXITSTATUS(child_status))
> + fterr_warnx("Child %d exit_status=%d", (int)child_pid,
> + WEXITSTATUS(child_status));
> + } else if (WIFSIGNALED(child_status)) {
> + fterr_warnx("Child %d signal=%d", (int)child_pid,
> + WTERMSIG(child_status));
> + } else {
> + fterr_warnx("PID %d exited status=%d", (int)child_pid,
> + child_status);
> + }
> + }
> + } while(child_pid > 0 || ( pid == -1 && errno == EINTR));
> + if(child_pid == -1) {
> + if(errno == ECHILD) {
> + /* No children left */
> + sig_chld_counter = 0;
> + } else {
> + fterr_warn("Did not get child exit status sig_chld_counter=%d",
> + sig_chld_counter);
> + }
> + } else if (child_pid == 0) {
> + fterr_warnx("Nobody was ready to report status sig_chld_counter=%d",
> + sig_chld_counter);
> }
> -
> + if (sig_chld_counter < 0) sig_chld_counter = 0;
> } /* buffy */
>
> if (stat_interval) {
>
>
> --
> Jarkko Torppa, Elisa Internet
>
>
>
> _______________________________________________
> flow-tools@splintered.net
> http://www.splintered.net/sw/flow-tools