Discussion: setting distinct core file names
Willem Jan Withagen
2018-11-27 17:27:59 UTC
Hi,

Looking at core(5) and sysctl it looks like these are system wide
settings....
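
For reference, this is roughly what I mean; the only knobs I can find are
the global ones (the values below are just examples, not my real config):

$ sysctl kern.corefile kern.coredump
kern.corefile: %N.core
kern.coredump: 1
$ sysctl kern.corefile="/var/coredumps/%N.%P.core"  # root only, affects every process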

Is there a possibility that a program can set its own corefile name (and
path?)

During parallel testing I'm running into these scripts that generate
cores, but they all end up in the same location. But it would be nice if
I could one way or another determine which file came from what script.

But for that I would need to be able to set something like
%N."script".core
as the core name. I could then put that in the ENV of the script and
the program would pick it up and set its own corefile name.

Possible??
--WjW
Konstantin Belousov
2018-11-27 17:43:55 UTC
Post by Willem Jan Withagen
Hi,
Looking at core(5) and sysctl it looks like these are system wide
settings....
Is there a possibility that a program can set its own corefile name (and
path?)
During parallel testing I'm running into these scripts that generate
cores, but they all end up in the same location. But it would be nice if
I could one way or another determine which file came from what script.
But for that I would need to be able to set something like
%N."script".core
as the core name. I could then put that in the ENV of the script and
the program would pick it up and set its own corefile name.
Possible??
No.

Do not expect any proposal that requires the kernel to read a user-mode
environment variable to work.
Lowell Gilbert
2018-11-27 18:31:04 UTC
Post by Willem Jan Withagen
Looking at core(5) and sysctl it looks like these are system wide
settings....
Is there a possibility that a program can set its own corefile name
(and path?)
During parallel testing I'm running into these scripts that generate
cores, but they all end up in the same location. But it would be nice
if I could one way or another determine which file came from what
script.
But for that I would need to be able to set something like
%N."script".core
as the core name. I could then put that in the ENV of the script and
the program would pick it up and set its own corefile name.
Possible??
If you can run the scripts in arbitrary paths, you can encode any extra
information you need in a directory name. [I'd recommend just changing
the process name, but I'm guessing that the cores themselves are being
generated by something running in a subshell.]
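
For instance, something like this (just a sketch; it assumes the default
relative kern.corefile of %N.core, so the core is written in the crashing
process's working directory, and the paths are made up):

$ mkdir -p /tmp/tests/test_foo
$ ( cd /tmp/tests/test_foo && /path/to/run_test_foo.sh )
$ ls /tmp/tests/test_foo/*.core   # any core in here came from test_foo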
Conrad Meyer
2018-11-27 20:46:49 UTC
One (ugly) trick is to use multiple filesystem links to the script
interpreter, where the link names distinguish the scripts. E.g.,

$ ln /bin/sh /libexec/my_script_one_sh
$ ln /bin/sh /libexec/my_script_two_sh
$ cat myscript1.sh
#!/libexec/my_script_one_sh
...

Cores will be dumped with %N of "my_script_one_sh."

Best,
Conrad
Post by Willem Jan Withagen
Hi,
Looking at core(5) and sysctl it looks like these are system wide
settings....
Is there a possibility that a program can set its own corefile name (and
path?)
During parallel testing I'm running into these scripts that generate
cores, but they all end up in the same location. But it would be nice if
I could one way or another determine which file came from what script.
But for that I would need to be able to set something like
%N."script".core
as the core name. I could then put that in the ENV of the script and
the program would pick it up and set its own corefile name.
Possible??
--WjW
Willem Jan Withagen
2018-11-28 10:43:50 UTC
Post by Conrad Meyer
One (ugly) trick is to use multiple filesystem links to the script
interpreter, where the link names distinguish the scripts. E.g.,
$ ln /bin/sh /libexec/my_script_one_sh
$ ln /bin/sh /libexec/my_script_two_sh
$ cat myscript1.sh
#!/libexec/my_script_one_sh
...
Cores will be dumped with %N of "my_script_one_sh."
Neat trick... got to try and remember this.
But it is not the shell scripts that are crashing...

When running Ceph tests during Jenkins builds, some programs/executables
intentionally crash, leaving cores.
Others (scripts) use some of these programs with correct input and should
NOT crash. They also check, during startup and termination, that there are
no cores left.

One Jenkins test run takes about 4 hours when not executed in parallel.
I'm testing 4 versions multiple times a day so I don't end up with a huge
list of PRs to go through when testing fails.

But the intentional cores and the failure cores collide here.
And when I have a core program_x.core, I can't tell whether it is from a
failure or from an intentional crash.

Now if I could tell, per program, how to name its core, that would allow
me to fix the problem without overturning the complete Ceph testing
infrastructure and still keep parallel tests.

It would also help in that "regular" cores keep being named the way they
are, so other applications still have the same behaviour and are still
picked up by periodic processing.

--WjW
Post by Conrad Meyer
Best,
Conrad
Post by Willem Jan Withagen
Hi,
Looking at core(5) and sysctl it looks like these are system wide
settings....
Is there a possibility that a program can set its own corefile name (and
path?)
During parallel testing I'm running into these scripts that generate
cores, but they all end up in the same location. But it would be nice if
I could one way or another determine which file came from what script.
But for that I would need to be able to set something like
%N."script".core
as the core name. I could then put that in the ENV of the script and
the program would pick it up and set its own corefile name.
Possible??
--WjW
Willem Jan Withagen
2018-11-28 11:21:33 UTC
Post by Willem Jan Withagen
Post by Conrad Meyer
One (ugly) trick is to use multiple filesystem links to the script
interpreter, where the link names distinguish the scripts.  E.g.,
$ ln /bin/sh /libexec/my_script_one_sh
$ ln /bin/sh /libexec/my_script_two_sh
$ cat myscript1.sh
#!/libexec/my_script_one_sh
...
Cores will be dumped with %N of "my_script_one_sh."
Neat trick... got to try and remember this.
But it is not the shell scripts that are crashing...
When running Ceph tests during Jenkins builds, some programs/executables
intentionally crash, leaving cores.
Others (scripts) use some of these programs with correct input and should
NOT crash. They also check, during startup and termination, that there are
no cores left.
One Jenkins test run takes about 4 hours when not executed in parallel.
I'm testing 4 versions multiple times a day so I don't end up with a huge
list of PRs to go through when testing fails.
But the intentional cores and the failure cores collide here.
And when I have a core program_x.core, I can't tell whether it is from a
failure or from an intentional crash.
Now if I could tell, per program, how to name its core, that would allow
me to fix the problem without overturning the complete Ceph testing
infrastructure and still keep parallel tests.
It would also help in that "regular" cores keep being named the way they
are, so other applications still have the same behaviour and are still
picked up by periodic processing.
So I read a bit more about procctl and prctl (the Linux variant), and it
turns out that Linux can set PR_SET_DUMPABLE. That is actually used in
some of the Ceph applications...

Being able to set this to 0 or 1 would perhaps be a nice start as well.
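
If I read procctl(2) right, the nearest FreeBSD knob might be its trace
control: disabling tracing for a process also disables core dumping. So
on a new enough FreeBSD the intentional crashers could perhaps be wrapped
like this (untested sketch, binary name invented):

$ proccontrol -m trace -s disable ./ceph_test_crasher
# this process can no longer be traced, and it will not dump core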

--WjW
Post by Willem Jan Withagen
--WjW
Post by Conrad Meyer
Best,
Conrad
Post by Willem Jan Withagen
Hi,
Looking at core(5) and sysctl it looks like these are system wide
settings....
Is there a possibility that a program can set its own corefile name (and
path?)
During parallel testing I'm running into these scripts that generate
cores, but they all end up in the same location. But it would be nice if
I could one way or another determine which file came from what script.
But for that I would need to be able to set something like
         %N."script".core
as the core name. I could then put that in the ENV of the script and
the program would pick it up and set its own corefile name.
Konstantin Belousov
2018-11-28 14:43:28 UTC
Post by Willem Jan Withagen
Post by Willem Jan Withagen
Post by Conrad Meyer
One (ugly) trick is to use multiple filesystem links to the script
interpreter, where the link names distinguish the scripts.  E.g.,
$ ln /bin/sh /libexec/my_script_one_sh
$ ln /bin/sh /libexec/my_script_two_sh
$ cat myscript1.sh
#!/libexec/my_script_one_sh
...
Cores will be dumped with %N of "my_script_one_sh."
Neat trick... got to try and remember this.
But it is not the shell scripts that are crashing...
When running Ceph tests during Jenkins builds, some programs/executables
intentionally crash, leaving cores.
Others (scripts) use some of these programs with correct input and should
NOT crash. They also check, during startup and termination, that there are
no cores left.
One Jenkins test run takes about 4 hours when not executed in parallel.
I'm testing 4 versions multiple times a day so I don't end up with a huge
list of PRs to go through when testing fails.
But the intentional cores and the failure cores collide here.
And when I have a core program_x.core, I can't tell whether it is from a
failure or from an intentional crash.
Now if I could tell, per program, how to name its core, that would allow
me to fix the problem without overturning the complete Ceph testing
infrastructure and still keep parallel tests.
It would also help in that "regular" cores keep being named the way they
are, so other applications still have the same behaviour and are still
picked up by periodic processing.
So I read a bit more about procctl and prctl (the Linux variant), and it
turns out that Linux can set PR_SET_DUMPABLE. That is actually used in
some of the Ceph applications...
Being able to set this to 0 or 1 would perhaps be a nice start as well.
Isn't setrlimit(RLIMIT_CORE, 0) enough ? It is slightly different syntax,
but the idea is that you set RLIMIT_CORE to zero, then we do not even
start dumping.
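
Something along these lines, i.e. drop the limit in the wrapper whose
cores you do not care about (sketch only, names invented):

$ cat run_intentional_crashers.sh
#!/bin/sh
ulimit -c 0    # RLIMIT_CORE = 0 for this shell and everything it starts
./ceph_test_intentional_crasher
# any *.core left on disk now comes from a test that was NOT supposed to crash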
Willem Jan Withagen
2018-11-28 15:27:26 UTC
Post by Konstantin Belousov
Post by Willem Jan Withagen
Post by Willem Jan Withagen
Post by Conrad Meyer
One (ugly) trick is to use multiple filesystem links to the script
interpreter, where the link names distinguish the scripts.  E.g.,
$ ln /bin/sh /libexec/my_script_one_sh
$ ln /bin/sh /libexec/my_script_two_sh
$ cat myscript1.sh
#!/libexec/my_script_one_sh
...
Cores will be dumped with %N of "my_script_one_sh."
Neat trick... got to try and remember this.
But it is not the shell scripts that are crashing...
When running Ceph tests during Jenkins builds, some programs/executables
intentionally crash, leaving cores.
Others (scripts) use some of these programs with correct input and should
NOT crash. They also check, during startup and termination, that there are
no cores left.
One Jenkins test run takes about 4 hours when not executed in parallel.
I'm testing 4 versions multiple times a day so I don't end up with a huge
list of PRs to go through when testing fails.
But the intentional cores and the failure cores collide here.
And when I have a core program_x.core, I can't tell whether it is from a
failure or from an intentional crash.
Now if I could tell, per program, how to name its core, that would allow
me to fix the problem without overturning the complete Ceph testing
infrastructure and still keep parallel tests.
It would also help in that "regular" cores keep being named the way they
are, so other applications still have the same behaviour and are still
picked up by periodic processing.
So I read a bit more about procctl and prctl (the Linux variant), and it
turns out that Linux can set PR_SET_DUMPABLE. That is actually used in
some of the Ceph applications...
Being able to set this to 0 or 1 would perhaps be a nice start as well.
Isn't setrlimit(RLIMIT_CORE, 0) enough ? It is slightly different syntax,
but the idea is that you set RLIMIT_CORE to zero, then we do not even
start dumping.
Right,

At one point I think I had this code in some test code...
I also think this is the default on CentOS when I tested it there.

So I set it from the top shell and let it propagate.
But then, when I do want dumps, I could have run into:
     [EPERM]            The limit specified to setrlimit() would have
                        raised the maximum limit value, and the caller
                        is not the super-user.

I'm not sure, it was quite some time ago.
But that might be a nice suggestion to look into.
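
Maybe lowering only the soft limit from the top shell avoids that: a
program that really wants a dump can then raise RLIMIT_CORE back up to
the unchanged hard limit without needing super-user rights. Untested
sketch (script name invented):

$ ulimit -S -c 0   # soft limit only; the hard limit stays where it was
$ ./run_tests.sh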

--WjW
Matthew D. Fuller
2018-11-29 13:36:21 UTC
On Wed, Nov 28, 2018 at 11:43:50AM +0100 I heard the voice of
Post by Willem Jan Withagen
Neat trick... got to try and remember this.
But it is not the shell scripts that are crashing...
You could still make per-test hardlinks to the binary and run from
them instead.
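
Roughly like this (a sketch; the names are made up, the link must be on
the same filesystem as the binary, and the link name should stay short
since the process name gets truncated):

$ ln ./test_prog ./test_prog.foo
$ ./test_prog.foo --some-args
# a crash now leaves test_prog.foo.core instead of test_prog.core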
--
Matthew Fuller (MF4839) | ***@over-yonder.net
Systems/Network Administrator | http://www.over-yonder.net/~fullermd/
On the Internet, nobody can hear you scream.