crashinfo doesn't support compressed crashdumps

Discussion:

crashinfo doesn't support compressed crashdumps - which way forward?

Alexander Leidinger

2018-07-12 12:14:09 UTC

Hi,

the crashinfo script doesn't know how to handle compressed coredumps.
What would be acceptable behavior (ordered by my preference)?
1) decompress in /var/crash and then proceed normally (already
implemented locally)
2) decompress to CRASHTMPDIR:/var/tmp/xxx and delete when finished
3) keep it like it is
4) teach tools to understand compressed dumps (gzip / zstd)

Implicitly there is the question of what the purpose of compressing
crashdumps is: to support more RAM than there is space in dumpdev
(which is valid in my case), or to save space in /var/crash (which I
don't care much about).

Bye,
Alexander.

--
http://www.Leidinger.net ***@Leidinger.net: PGP 0x8F31830F9F2772BF
http://www.FreeBSD.org ***@FreeBSD.org : PGP 0x8F31830F9F2772BF

Mark Johnston

2018-07-12 14:31:18 UTC

Post by Alexander Leidinger
Hi,
the crashinfo script doesn't know how to handle compressed coredumps.
What would be acceptable behavior (ordered by my preference)?
1) decompress in /var/crash and then proceed normally (already
implemented locally)
2) decompress to CRASHTMPDIR:/var/tmp/xxx and delete when finished
3) keep it like it is
4) teach tools to understand compressed dumps (gzip / zstd)
Implicitly there is the question of what the purpose of compressing
crashdumps is: to support more RAM than there is space in dumpdev
(which is valid in my case), or to save space in /var/crash (which I
don't care much about).

I think jhb has a patch which implements 2). I do not have strong
feelings on which is the right way forward, but I mildly prefer 2) to
1). It looks like crashinfo can be disabled in rc.conf, so users who
are space-constrained in /var/crash can take the additional step of
setting crashinfo_enable=NO. FWIW, when I committed the compression
support, my use-case involved both a small dump device and a small
/var filesystem.
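
For reference, the rc.conf knob mentioned above is an ordinary sh-style assignment. A minimal fragment, assuming the stock /etc/rc.conf location; the sysrc(8) form in the comment is an alternative way to set the same variable:

```shell
# /etc/rc.conf fragment: skip crashinfo(8) after savecore(8), so a
# compressed dump in /var/crash is never expanded automatically.
# Equivalent one-liner on FreeBSD: sysrc crashinfo_enable=NO
crashinfo_enable="NO"
```

With this set, the compressed vmcore stays as written and can be decompressed by hand on a machine with more space.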

Alan Somers

2018-07-12 14:50:08 UTC

Post by Mark Johnston

I briefly looked into option 4), but it's harder than it sounds. There
are several libraries that would need to be modified, and I think some of
them work by mmap(2)ping the core file rather than by fread(3)ing it. I
don't know of any way to mmap a compressed file.
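
To illustrate the point about mapping: a tool that maps or reads the file directly sees the compressed byte stream, so offsets inside the dump no longer correspond to offsets in the file. A small sketch, using a made-up stand-in file rather than a real vmcore (the file names and contents are assumptions for illustration only):

```shell
#!/bin/sh
# Stand-in for a core file; the contents are arbitrary.
work=$(mktemp -d)
printf 'ELF-like header bytes' > "$work/core"
gzip -c "$work/core" > "$work/core.gz"

# First two bytes of each file, as hex with whitespace removed.
magic_plain=$(od -An -tx1 -N2 "$work/core" | tr -d ' \n')
magic_gz=$(od -An -tx1 -N2 "$work/core.gz" | tr -d ' \n')

# The plain file starts with its own data ("EL"); the .gz file starts
# with the gzip magic number instead.
echo "plain: $magic_plain  gzip: $magic_gz"
```

A reader that streams the file can put a decompressor in front of it; a reader that indexes into a mapping cannot do so without first materializing the uncompressed data somewhere.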

Post by Mark Johnston

Post by Alexander Leidinger
Implicitly there is the question of what the purpose of compressing
crashdumps is: to support more RAM than there is space in dumpdev
(which is valid in my case), or to save space in /var/crash (which I
don't care much about).

Compressed crashdumps are also great for systems with slow dumpdevs. They
greatly speed up the dumping process.

Post by Mark Johnston
I think jhb has a patch which implements 2). I do not have strong
feelings on which is the right way forward, but I mildly prefer 2) to
1). It looks like crashinfo can be disabled in rc.conf, so users who
are space-constrained in /var/crash can take the additional step of
setting crashinfo_enable=NO. FWIW, when I committed the compression
support, my use-case involved both a small dump device and a small
/var filesystem.

Larry Rosenman

2018-07-12 14:51:49 UTC

Post by Alan Somers

Post by Mark Johnston

Compressed crashdumps are also great for systems with slow dumpdevs. They
greatly speed up the dumping process.

I have jhb's patch applied and it works great.

borg.lerctr.org /usr/src $ svn diff
Index: usr.sbin/crashinfo/crashinfo.sh
===================================================================
--- usr.sbin/crashinfo/crashinfo.sh (revision 336184)
+++ usr.sbin/crashinfo/crashinfo.sh (working copy)
@@ -38,6 +38,13 @@
 	exit 1
 }
 
+# Remove an uncompressed copy of a dump
+cleanup()
+{
+
+	[ -e $VMCORE ] && rm -f $VMCORE
+}
+
 # Find a gdb binary to use and save the value in GDB.
 find_gdb()
 {
@@ -133,7 +140,7 @@
 
 	# Figure out the crash directory and number from the vmcore name.
 	CRASHDIR=`dirname $1`
-	DUMPNR=$(expr $(basename $1) : 'vmcore\.\([0-9]*\)$')
+	DUMPNR=$(expr $(basename $1) : 'vmcore\.\([0-9]*\)')
 	if [ -z "$DUMPNR" ]; then
 		echo "Unable to determine dump number from vmcore file $1."
 		exit 1
@@ -174,8 +181,16 @@
 fi
 
 if [ ! -e $VMCORE ]; then
-	echo "$VMCORE not found"
-	exit 1
+	if [ -e $VMCORE.gz ]; then
+		trap cleanup EXIT HUP INT QUIT TERM
+		gzcat $VMCORE.gz > $VMCORE
+	elif [ -e $VMCORE.zst ]; then
+		trap cleanup EXIT HUP INT QUIT TERM
+		zstdcat $VMCORE.zst > $VMCORE
+	else
+		echo "$VMCORE not found"
+		exit 1
+	fi
 fi
 
 if [ ! -e $INFO ]; then
borg.lerctr.org /usr/src $
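
The second hunk touches the expr(1) pattern that derives the dump number from the file name. A minimal sketch of how such a pattern behaves, assuming the intent is to accept names like vmcore.3.gz as well as vmcore.3; the patterns are written out here for illustration, and the variable names are made up:

```shell
#!/bin/sh
# expr(1) anchors the pattern at the start of the string and prints the
# text captured by \( \). It exits non-zero when nothing matches, hence
# the "|| true" guards.

# Pattern anchored at the end: only a bare "vmcore.N" matches.
old_plain=$(expr "vmcore.3" : 'vmcore\.\([0-9]*\)$' || true)
old_gz=$(expr "vmcore.3.gz" : 'vmcore\.\([0-9]*\)$' || true)

# Pattern without the trailing anchor: a suffix such as ".gz" is ignored.
new_gz=$(expr "vmcore.3.gz" : 'vmcore\.\([0-9]*\)' || true)

echo "old/plain=[$old_plain] old/gz=[$old_gz] new/gz=[$new_gz]"
```

With the end anchor in place, a compressed file name yields an empty dump number and the script would bail out with "Unable to determine dump number".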

--
Larry Rosenman http://www.lerctr.org/~ler
Phone: +1 214-642-9640 E-Mail: ***@lerctr.org
US Mail: 5708 Sabbia Drive, Round Rock, TX 78665-2106

John Baldwin

2018-07-12 16:09:02 UTC

Post by Mark Johnston

Yes, here's the patch for 2), which has been tested. 4) is pretty hard
to do in practice, as you basically have to decompress the whole core
into RAM before you can do anything useful with it. The alternative
would be a format that compresses only certain parts (e.g. leaving the
page tables uncompressed and compressing only the payload in a
minidump) and compresses on some kind of block boundaries, so that you
could locate a given block and decompress it when reading specific
data. Coming up with such a format would be more useful but requires
more work.
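
The block-boundary idea can be sketched with ordinary tools. This is only an illustration of the principle, not an existing or proposed dump format: the block size, the blk.N.gz file layout, and the read_at helper are all made up for the example.

```shell
#!/bin/sh
BS=4096                       # hypothetical block size in bytes
work=$(mktemp -d)

# Stand-in for a dump: 2000 fixed-width 12-byte records.
i=0
while [ "$i" -lt 2000 ]; do
    printf 'line %06d\n' "$i"
    i=$((i + 1))
done > "$work/dump"

# Compress each block independently, so the block number is the index.
size=$(( $(wc -c < "$work/dump") ))
nblocks=$(( (size + BS - 1) / BS ))
b=0
while [ "$b" -lt "$nblocks" ]; do
    dd if="$work/dump" bs="$BS" skip="$b" count=1 2>/dev/null |
        gzip -c > "$work/blk.$b.gz"
    b=$((b + 1))
done

# read_at OFFSET LENGTH: decompress only the block containing OFFSET.
# For brevity the requested range must not cross a block boundary.
read_at() {
    blk=$(( $1 / BS ))
    within=$(( $1 % BS ))
    gzip -dc "$work/blk.$blk.gz" |
        dd bs=1 skip="$within" count="$2" 2>/dev/null
}

# Record 1000 starts at byte 12000, which lies in block 2.
read_at 12000 11; echo
```

Only one 4 KB block is decompressed per lookup, which is what makes random access workable; a single gzip or zstd stream over the whole dump offers no such entry points.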

Index: usr.sbin/crashinfo/crashinfo.sh
===================================================================
--- usr.sbin/crashinfo/crashinfo.sh (revision 335896)
+++ usr.sbin/crashinfo/crashinfo.sh (working copy)
@@ -38,6 +38,13 @@
 	exit 1
 }
 
+# Remove an uncompressed copy of a dump
+cleanup()
+{
+
+	[ -e $VMCORE ] && rm -f $VMCORE
+}
+
 # Find a gdb binary to use and save the value in GDB.
 find_gdb()
 {
@@ -133,7 +140,7 @@
 
 	# Figure out the crash directory and number from the vmcore name.
 	CRASHDIR=`dirname $1`
-	DUMPNR=$(expr $(basename $1) : 'vmcore\.\([0-9]*\)$')
+	DUMPNR=$(expr $(basename $1) : 'vmcore\.\([0-9]*\)')
 	if [ -z "$DUMPNR" ]; then
 		echo "Unable to determine dump number from vmcore file $1."
 		exit 1
@@ -174,8 +181,16 @@
 fi
 
 if [ ! -e $VMCORE ]; then
-	echo "$VMCORE not found"
-	exit 1
+	if [ -e $VMCORE.gz ]; then
+		trap cleanup EXIT HUP INT QUIT TERM
+		gzcat $VMCORE.gz > $VMCORE
+	elif [ -e $VMCORE.zst ]; then
+		trap cleanup EXIT HUP INT QUIT TERM
+		zstdcat $VMCORE.zst > $VMCORE
+	else
+		echo "$VMCORE not found"
+		exit 1
+	fi
 fi
 
 if [ ! -e $INFO ]; then
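
The decompress-then-clean-up pattern in the last hunk can be tried in isolation. A minimal sketch, not the crashinfo.sh code itself: the analyze function, the temporary directory, and the fake dump contents are assumptions, and gzip -dc stands in for gzcat so the example also runs where gzcat is not installed.

```shell
#!/bin/sh
# analyze CORE: if CORE is missing but CORE.gz exists, expand it for the
# duration of the analysis and remove the expanded copy afterwards.
# The body is a subshell, so the trap fires when the function returns.
analyze() (
    core=$1
    cleanup() { rm -f "$core"; }
    if [ ! -e "$core" ]; then
        if [ -e "$core.gz" ]; then
            trap cleanup EXIT HUP INT QUIT TERM
            gzip -dc "$core.gz" > "$core"
        else
            echo "$core not found" >&2
            exit 1
        fi
    fi
    cat "$core"               # placeholder for the real analysis step
)

work=$(mktemp -d)
printf 'pretend dump\n' | gzip -c > "$work/vmcore.0.gz"
analyze "$work/vmcore.0" > "$work/report"
cat "$work/report"
```

Because the trap is installed only when the script itself created the uncompressed file, a dump that was already uncompressed on disk is left alone.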

--
John Baldwin

Mark Johnston

2018-07-12 18:35:57 UTC

Post by John Baldwin

Post by Mark Johnston

That patch looks ok to me, FWIW. As I pointed out, it's easy enough to
just disable crashinfo if one doesn't want the extraction to take place.

Alexander Leidinger

2018-07-21 17:15:53 UTC

Post by Mark Johnston

Post by John Baldwin
Yes, here's the patch for 2), which has been tested. 4) is pretty hard
to do in practice, as you basically have to decompress the whole core
into RAM before you can do anything useful with it. The alternative
would be a format that compresses only certain parts (e.g. leaving the
page tables uncompressed and compressing only the payload in a
minidump) and compresses on some kind of block boundaries, so that you
could locate a given block and decompress it when reading specific
data. Coming up with such a format would be more useful but requires
more work.

That patch looks ok to me, FWIW. As I pointed out, it's easy enough to
just disable crashinfo if one doesn't want the extraction to take place.

Hi John,

what about committing this patch?
If you are too busy, do you mind if I commit it?

Bye,
Alexander.

--
http://www.Leidinger.net ***@Leidinger.net: PGP 0x8F31830F9F2772BF
http://www.FreeBSD.org ***@FreeBSD.org : PGP 0x8F31830F9F2772BF