Discussion:
Help diagnose my Ryzen build problem
Meowthink
2018-08-26 11:20:00 UTC
Hello all,

Recently I tried to build up a Ryzen system and run FreeBSD on it.
CPU: AMD Ryzen 5 2400G with Radeon Vega Graphics (0x810f10)
Mobo: Asrock Fatal1ty AB350 Gaming-ITX/ac ( with up-to-date BIOS with
PinnaclePI-AM4_1.0.0.4, microcode 0x810100b )
Mem: 2x Crucial 16GB DDR4-2400 EUDIMM CL17 ( ECC Unregistered but ECC
actually won't work :( )

But the system is unstable: it cannot stay up for more than a few days,
even when nearly idle. It panics even at midnight. It almost always
panics while or after I build something large. Surprisingly, I haven't
encountered any user program faults or badly built binaries, only panics.

Then I tried lots of BIOS settings, e.g. SMT, C6 idle current, and
underclocking the RAM, but none seemed to have any effect.
The system passes memtest86 V7.5 without errors, as well as various
benchmarks under Windows, so I think the problem is in the software
rather than the hardware.

In the meantime, I noticed that the rate of IRQs from xhci0 is too
high, about 1998/s. I found [1] and tried to MFC r331665. That didn't
fix the problem, but disabling the Bluetooth module does stop the IRQ
storm.

After that the system lasts much longer before panicking. It can finally
compile the ports tree, build world, and scrub the zpool, all without
annoying reboots.
So I assumed this is related to [2], and I also tried cpuctl, binding
all processes to CPUs 2-7.
But the problem is still there; only the chance has become much lower.
It still panics occasionally, after idling for a week or being stressed
for a few hours. Stress seems to raise the chance of a panic, but it
depends on the type of load: things like llvm always build, whereas gcc
causes a panic every few passes.

The system was on 11.2 but has since moved to stable/11 (currently
r337906). I have the last 10 coredumps saved, but my kernel isn't
compiled with debugging, so I'll put some backtraces from core.txt.? at
the end.

I really want to eliminate this problem. Could someone guide me on how
to track it down? What should I try next?

Best regards,
Meowthink

[1] https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=224886
[2] https://reviews.freebsd.org/D11780

Backtraces, newest to oldest:
------------------------------------------------------------------------
Panic while compiling gcc:

#0 doadump (textdump=<value optimized out>) at pcpu.h:230
230 pcpu.h: No such file or directory.
in pcpu.h
(kgdb) #0 doadump (textdump=<value optimized out>) at pcpu.h:230
#1 0xffffffff80afa5fb in kern_reboot (howto=260)
at /usr/src/sys/kern/kern_shutdown.c:383
#2 0xffffffff80afaa21 in vpanic (fmt=<value optimized out>,
ap=<value optimized out>) at /usr/src/sys/kern/kern_shutdown.c:776
#3 0xffffffff80afa863 in panic (fmt=<value optimized out>)
at /usr/src/sys/kern/kern_shutdown.c:707
#4 0xffffffff80f7c14f in trap_fatal (frame=0xfffffe081e962790,
eva=18446735309538549504) at /usr/src/sys/amd64/amd64/trap.c:877
#5 0xffffffff80f7c1a9 in trap_pfault (frame=0xfffffe081e962790, usermode=0)
at pcpu.h:230
#6 0xffffffff80f7b984 in trap (frame=0xfffffe081e962790)
at /usr/src/sys/amd64/amd64/trap.c:415
#7 0xffffffff80f5bccc in calltrap ()
at /usr/src/sys/amd64/amd64/exception.S:231
#8 0xffffffff822950a8 in arc_change_state ()
at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c:1800
#9 0xffffffff8229328b in arc_access () at time.h:145
#10 0xffffffff82296232 in arc_write_done (zio=0xfffff8065f886410)
at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c:6169
#11 0xffffffff82334cbe in zio_done (zio=<value optimized out>)
at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c:4032
#12 0xffffffff8233070c in zio_execute (zio=0xfffff8065f886410)
at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c:1768
#13 0xffffffff80b52cc4 in taskqueue_run_locked (queue=0xfffff8000d9e6e00)
at /usr/src/sys/kern/subr_taskqueue.c:463
#14 0xffffffff80b53e28 in taskqueue_thread_loop (arg=<value optimized out>)
at /usr/src/sys/kern/subr_taskqueue.c:755
#15 0xffffffff80abd813 in fork_exit (
callout=0xffffffff80b53d90 <taskqueue_thread_loop>,
arg=0xfffff8000d967030, frame=0xfffffe081e962ac0)
at /usr/src/sys/kern/kern_fork.c:1072
#16 0xffffffff80f5cc7e in fork_trampoline ()
at /usr/src/sys/amd64/amd64/exception.S:972
#17 0x0000000000000000 in ?? ()
Current language: auto; currently minimal
(kgdb)

------------------------------------------------------------------------
Panic while shutting down:

#0 doadump (textdump=<value optimized out>) at pcpu.h:230
230 pcpu.h: No such file or directory.
in pcpu.h
(kgdb) #0 doadump (textdump=<value optimized out>) at pcpu.h:230
#1 0xffffffff80afa5fb in kern_reboot (howto=260)
at /usr/src/sys/kern/kern_shutdown.c:383
#2 0xffffffff80afaa21 in vpanic (fmt=<value optimized out>,
ap=<value optimized out>) at /usr/src/sys/kern/kern_shutdown.c:776
#3 0xffffffff80afa863 in panic (fmt=<value optimized out>)
at /usr/src/sys/kern/kern_shutdown.c:707
#4 0xffffffff80f7c14f in trap_fatal (frame=0xfffffe081ed30700, eva=0)
at /usr/src/sys/amd64/amd64/trap.c:877
#5 0xffffffff80f7c1a9 in trap_pfault (frame=0xfffffe081ed30700, usermode=0)
at pcpu.h:230
#6 0xffffffff80f7b984 in trap (frame=0xfffffe081ed30700)
at /usr/src/sys/amd64/amd64/trap.c:415
#7 0xffffffff80f5bccc in calltrap ()
at /usr/src/sys/amd64/amd64/exception.S:231
#8 0xffffffff80dfe4ad in vm_object_terminate (object=0xfffff805bf66d5a0)
at /usr/src/sys/vm/vm_object.c:768
#9 0xffffffff80dfd0f8 in vm_object_deallocate (object=0x0)
at /usr/src/sys/vm/vm_object.c:677
#10 0xffffffff80df3189 in _vm_map_unlock (map=<value optimized out>,
file=<value optimized out>, line=<value optimized out>)
at /usr/src/sys/vm/vm_map.c:2939
#11 0xffffffff80df7be2 in vm_map_remove (map=0xfffff80018673000, start=4096,
end=140737488351232) at /usr/src/sys/vm/vm_map.c:3137
#12 0xffffffff80df2e49 in vmspace_exit (td=0xfffff80039ec0620)
at /usr/src/sys/vm/vm_map.c:337
#13 0xffffffff80ab72b9 in exit1 (td=0xfffff80039ec0620,
rval=<value optimized out>, signo=<value optimized out>)
at /usr/src/sys/kern/kern_exit.c:401
#14 0xffffffff80ab6ced in sys_sys_exit (td=<value optimized out>,
uap=<value optimized out>) at /usr/src/sys/kern/kern_exit.c:180
#15 0xffffffff80f7d1d8 in amd64_syscall (td=0xfffff80039ec0620, traced=0)
at subr_syscall.c:132
#16 0xffffffff80f5c5ad in fast_syscall_common ()
at /usr/src/sys/amd64/amd64/exception.S:494
#17 0x00000008028d034a in ?? ()
Previous frame inner to this frame (corrupt stack?)
Current language: auto; currently minimal
(kgdb)

------------------------------------------------------------------------
Panic while running only my single-threaded Python script

#0 doadump (textdump=<value optimized out>) at pcpu.h:230
230 pcpu.h: No such file or directory.
in pcpu.h
(kgdb) #0 doadump (textdump=<value optimized out>) at pcpu.h:230
#1 0xffffffff80afa5fb in kern_reboot (howto=260)
at /usr/src/sys/kern/kern_shutdown.c:383
#2 0xffffffff80afaa21 in vpanic (fmt=<value optimized out>,
ap=<value optimized out>) at /usr/src/sys/kern/kern_shutdown.c:776
#3 0xffffffff80afa863 in panic (fmt=<value optimized out>)
at /usr/src/sys/kern/kern_shutdown.c:707
#4 0xffffffff80f7c14f in trap_fatal (frame=0xfffffe081f635ec0, eva=952)
at /usr/src/sys/amd64/amd64/trap.c:877
#5 0xffffffff80f7c1a9 in trap_pfault (frame=0xfffffe081f635ec0, usermode=0)
at pcpu.h:230
#6 0xffffffff80f7b984 in trap (frame=0xfffffe081f635ec0)
at /usr/src/sys/amd64/amd64/trap.c:415
#7 0xffffffff80f5bccc in calltrap ()
at /usr/src/sys/amd64/amd64/exception.S:231
#8 0xffffffff80af57ad in __rw_wlock_hard (c=0xfffff80016f8f798,
v=<value optimized out>) at /usr/src/sys/kern/kern_rwlock.c:977
#9 0xffffffff80bbca92 in bufobj_invalbuf (bo=<value optimized out>, flags=1,
slpflag=1017770744, slptimeo=<value optimized out>)
at /usr/src/sys/kern/vfs_subr.c:1609
#10 0xffffffff80bbf8be in vgonel (vp=0xfffff8053ca9f1d8)
at /usr/src/sys/kern/vfs_subr.c:1655
#11 0xffffffff80bbbcc4 in vnlru_free_locked (count=1, mnt_op=0x0)
at /usr/src/sys/kern/vfs_subr.c:1227
#12 0xffffffff80bbbe14 in getnewvnode_reserve (count=1)
at /usr/src/sys/kern/vfs_subr.c:1287
#13 0xffffffff82327fb4 in zfs_zget (zfsvfs=0xfffff80076574000, obj_num=34941,
zpp=0xfffffe081f6362a8)
at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_znode.c:1122
#14 0xffffffff823421ad in zfs_dirent_lookup (dzp=0xfffff804ff94e420,
name=0xfffffe081f6363e0 "filename.ext", zpp=0xfffffe081f6362a8, flag=2)
at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_dir.c:187
#15 0xffffffff82342267 in zfs_dirlook (dzp=0xfffff804ff94e420,
name=<value optimized out>, zpp=0xfffffe081f636360)
at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_dir.c:238
#16 0xffffffff8235a4ef in zfs_lookup ()
at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c:1658
#17 0xffffffff8235ac1e in zfs_freebsd_lookup (ap=0xfffffe081f636548)
at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c:4956
#18 0xffffffff810fe89c in VOP_CACHEDLOOKUP_APV (vop=<value optimized out>,
a=0xfffffe081f636548) at vnode_if.c:195
#19 0xffffffff80ba8d56 in vfs_cache_lookup (ap=<value optimized out>)
at vnode_if.h:80
#20 0xffffffff810fe77c in VOP_LOOKUP_APV (vop=<value optimized out>,
a=0xfffffe081f636610) at vnode_if.c:127
#21 0xffffffff80bb2761 in lookup (ndp=0xfffffe081f636748) at vnode_if.h:54
#22 0xffffffff80bb1c29 in namei (ndp=0xfffffe081f636748)
at /usr/src/sys/kern/vfs_lookup.c:448
#23 0xffffffff80bc8238 in kern_statat (td=0xfffff8013ba1b620,
flag=<value optimized out>, fd=-100,
path=0x80332c910 <Address 0x80332c910 out of bounds>,
pathseg=UIO_USERSPACE, sbp=0xfffffe081f636900, hook=0)
at /usr/src/sys/kern/vfs_syscalls.c:2023
#24 0xffffffff80bc817d in sys_stat (td=<value optimized out>,
uap=0xfffff8013ba1bb58) at /usr/src/sys/kern/vfs_syscalls.c:1978
#25 0xffffffff80f7d1d8 in amd64_syscall (td=0xfffff8013ba1b620, traced=0)
at subr_syscall.c:132
#26 0xffffffff80f5c5ad in fast_syscall_common ()
at /usr/src/sys/amd64/amd64/exception.S:494
#27 0x0000000801a5b9ca in ?? ()
Previous frame inner to this frame (corrupt stack?)
Current language: auto; currently minimal
(kgdb)

------------------------------------------------------------------------
Panic while using mplayer

#0 doadump (textdump=<value optimized out>) at pcpu.h:230
230 pcpu.h: No such file or directory.
in pcpu.h
(kgdb) #0 doadump (textdump=<value optimized out>) at pcpu.h:230
#1 0xffffffff80af91cb in kern_reboot (howto=260)
at /usr/src/sys/kern/kern_shutdown.c:383
#2 0xffffffff80af95f1 in vpanic (fmt=<value optimized out>,
ap=<value optimized out>) at /usr/src/sys/kern/kern_shutdown.c:776
#3 0xffffffff80af9433 in panic (fmt=<value optimized out>)
at /usr/src/sys/kern/kern_shutdown.c:707
#4 0xffffffff80f7a13f in trap_fatal (frame=0xfffffe081f70d380, eva=0)
at /usr/src/sys/amd64/amd64/trap.c:877
#5 0xffffffff80f7a199 in trap_pfault (frame=0xfffffe081f70d380, usermode=0)
at pcpu.h:230
#6 0xffffffff80f79974 in trap (frame=0xfffffe081f70d380)
at /usr/src/sys/amd64/amd64/trap.c:415
#7 0xffffffff80f5a00c in calltrap ()
at /usr/src/sys/amd64/amd64/exception.S:231
#8 0xffffffff8088c030 in hdac_stream_start (dev=<value optimized out>,
child=<value optimized out>, dir=0, stream=1, buf=1889533952, blksz=2048,
blkcnt=2) at /usr/src/sys/dev/sound/pci/hda/hdac.c:1927
#9 0xffffffff8088437d in hdaa_channel_start (ch=<value optimized out>)
at hdac_if.h:84
#10 0xffffffff80887e0d in hdaa_channel_trigger (obj=<value optimized out>,
data=0xfffff8007102c480, go=1)
at /usr/src/sys/dev/sound/pci/hda/hdaa.c:2161
#11 0xffffffff80893b8e in chn_trigger (c=0xfffff80071058400, go=1)
at channel_if.h:131
#12 0xffffffff8089751b in chn_notify (c=0xfffff80071058400,
flags=<value optimized out>) at /usr/src/sys/dev/sound/pcm/channel.c:2281
#13 0xffffffff808b697f in vchan_trigger (obj=<value optimized out>,
data=<value optimized out>, go=1)
at /usr/src/sys/dev/sound/pcm/vchan.c:171
#14 0xffffffff80893b8e in chn_trigger (c=0xfffff80071057c00, go=1)
at channel_if.h:131
#15 0xffffffff8089de10 in dsp_ioctl (i_dev=<value optimized out>,
cmd=<value optimized out>, arg=0xfffffe081f70d8d0 "\003",
mode=<value optimized out>, td=<value optimized out>)
at /usr/src/sys/dev/sound/pcm/dsp.c:1733
#16 0xffffffff809c5b38 in devfs_ioctl_f (fp=0xfffff802c5563c80,
com=2147766288, data=0xfffffe081f70d8d0, cred=0xfffff8004c482500,
td=0xfffff802f24ac000) at /usr/src/sys/fs/devfs/devfs_vnops.c:791
#17 0xffffffff80b5c00d in kern_ioctl (td=0xfffff802f24ac000, fd=51,
com=2147766288, data=<value optimized out>) at file.h:323
#18 0xffffffff80b5bd2c in sys_ioctl (td=0xfffff802f24ac000,
uap=0xfffff802f24ac538) at /usr/src/sys/kern/sys_generic.c:745
#19 0xffffffff80f7b1c8 in amd64_syscall (td=0xfffff802f24ac000, traced=0)
at subr_syscall.c:132
#20 0xffffffff80f5a8ed in fast_syscall_common ()
at /usr/src/sys/amd64/amd64/exception.S:494
#21 0x0000000801fb94aa in ?? ()
Previous frame inner to this frame (corrupt stack?)
Current language: auto; currently minimal
(kgdb)

------------------------------------------------------------------------
Panic while idle; it seems cron jobs triggered ZFS ARC cleanup.

#0 doadump (textdump=<value optimized out>) at pcpu.h:230
230 pcpu.h: No such file or directory.
in pcpu.h
(kgdb) #0 doadump (textdump=<value optimized out>) at pcpu.h:230
#1 0xffffffff80af95fb in kern_reboot (howto=260)
at /usr/src/sys/kern/kern_shutdown.c:383
#2 0xffffffff80af9a21 in vpanic (fmt=<value optimized out>,
ap=<value optimized out>) at /usr/src/sys/kern/kern_shutdown.c:776
#3 0xffffffff80af9863 in panic (fmt=<value optimized out>)
at /usr/src/sys/kern/kern_shutdown.c:707
#4 0xffffffff80f7b13f in trap_fatal (frame=0xfffffe081ee186e0, eva=201697507)
at /usr/src/sys/amd64/amd64/trap.c:877
#5 0xffffffff80f7b199 in trap_pfault (frame=0xfffffe081ee186e0, usermode=0)
at pcpu.h:230
#6 0xffffffff80f7a974 in trap (frame=0xfffffe081ee186e0)
at /usr/src/sys/amd64/amd64/trap.c:415
#7 0xffffffff80f5a5bc in calltrap ()
at /usr/src/sys/amd64/amd64/exception.S:231
#8 0xffffffff80ad596e in free (addr=0xfffff802472af200,
mtp=0xffffffff825bfc00) at /usr/src/sys/kern/kern_malloc.c:583
#9 0xffffffff8232a667 in zfs_inactive (vp=<value optimized out>,
cr=<value optimized out>, ct=<value optimized out>)
at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c:4333
#10 0xffffffff82332a1d in zfs_freebsd_inactive (ap=<value optimized out>)
at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c:5364
#11 0xffffffff810ff6b2 in VOP_INACTIVE_APV (vop=<value optimized out>,
a=0xfffffe081ee18858) at vnode_if.c:1955
#12 0xffffffff80bbd7bc in vinactive (vp=0xfffff803ae8b3760,
td=0xfffff803ae23b620) at vnode_if.h:807
#13 0xffffffff80bbdcc7 in vputx (vp=0xfffff803ae8b3760, func=1)
at /usr/src/sys/kern/vfs_subr.c:2688
#14 0xffffffff80bc5180 in sys_fchdir (td=0xfffff803ae23b620,
uap=<value optimized out>) at /usr/src/sys/kern/vfs_syscalls.c:724
#15 0xffffffff80f7c1c8 in amd64_syscall (td=0xfffff803ae23b620, traced=0)
at subr_syscall.c:132
#16 0xffffffff80f5ae9d in fast_syscall_common ()
at /usr/src/sys/amd64/amd64/exception.S:494
#17 0x00000008008a99aa in ?? ()
Previous frame inner to this frame (corrupt stack?)
Current language: auto; currently minimal
(kgdb)
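
The eva values that kgdb prints in the trap_fatal frames above are
fault addresses shown in decimal, which makes them hard to compare. As
a reading aid only (this is not part of the original post), the Python
sketch below prints each one in hex and sorts it into a coarse bucket.
The labels and the bucket boundaries (one 4 KiB page, the 2^47 split
between the lower and upper canonical halves on amd64) are assumptions
chosen for illustration, not FreeBSD's own definitions.

```python
# Fault addresses (eva) copied from the trap_fatal frames above, in decimal.
EVA_VALUES = {
    "gcc build": 18446735309538549504,
    "shutdown": 0,
    "python script": 952,
    "mplayer": 0,
    "idle / cron": 201697507,
}

PAGE_SIZE = 4096                      # assumed 4 KiB page
USER_TOP = 1 << 47                    # assumed end of the lower canonical half
KERNEL_BASE = (1 << 64) - (1 << 47)   # assumed start of the upper canonical half


def classify(eva: int) -> str:
    """Return a coarse, illustrative label for a fault address."""
    if eva < PAGE_SIZE:
        return "NULL pointer or small offset from NULL"
    if eva < USER_TOP:
        return "low (user-range) address seen in kernel mode"
    if eva >= KERNEL_BASE:
        return "kernel address"
    return "non-canonical address"


def describe(eva: int) -> str:
    """Format a fault address as zero-padded hex plus its label."""
    return f"{eva:#018x}  {classify(eva)}"


if __name__ == "__main__":
    for label, eva in EVA_VALUES.items():
        print(f"{label:14s} {describe(eva)}")
```

Three of the five addresses (0, 0 and 952) fall within the first page,
and the gcc one decodes to 0xfffff8076eb35f00. The classification is
only a hint about what kind of pointer was dereferenced, not a
diagnosis.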
Phil Norman
2018-08-27 08:13:10 UTC
Hi.

I have a similar setup: Ryzen 3 and Fatal1ty X370 mini-ITX. I had some
trouble with instability, although my problems weren't panics, but rather
two issues. One was random lockups (with no evidence left in logs), but I
*think* this was down to an inadequately cooled graphics card.

The other problem I had was with USB. I got quite a spam of log messages
about the USB reinitialisation. However, eventually I figured out that the
problem didn't occur if I booted the system from a completely powered-down
state. That is, use the physical switch on the PSU to cut power entirely,
re-enable, then boot from that state. Since then I've had 67 days of
uninterrupted uptime, with no USB issues at all.

It sounds like your problem is different, but trying a boot-from-cold might
be worthwhile, just in case ASRock have a consistent problem in this regard.

Cheers,
Phil
_______________________________________________
https://lists.freebsd.org/mailman/listinfo/freebsd-stable
Gary Jennejohn
2018-08-27 11:29:05 UTC
On Mon, 27 Aug 2018 10:13:10 +0200
Post by Phil Norman
Hi.
I have a similar setup: Ryzen 3 and Fatal1ty X370 mini-ITX. I had some
trouble with instability, although my problems weren't panics, but rather
two issues. One was random lockups (with no evidence left in logs), but I
*think* this was down to an inadequately cooled graphics card.
I had instability problems with my Ryzen 5 - lockups for no
apparent reason. The only recourse was a hard reset.

It turned out that there were two causes:
1) old CPU microcode
2) unhandled errata in the CPU

I installed the /usr/ports/sysutils/devcpu-data port, which
allowed me to install the latest microcode using cpucontrol(8).

I also used a shell script called amd_errata.sh provided by one of
the FreeBSD committers. To my shame I can't remember exactly
who. Note that the errata fixups are now part of the kernel in
FreeBSD 12.

After taking these steps about two months ago I have had no more
lockups and the machine runs very stably.

[big snip]
--
Gary Jennejohn
Meowthink
2018-08-27 12:18:46 UTC
Permalink
Hi Gary,
Post by Gary Jennejohn
On Mon, 27 Aug 2018 10:13:10 +0200
Post by Phil Norman
Hi.
I have a similar setup: Ryzen 3 and Fatal1ty X370 mini-ITX. I had some
trouble with instability, although my problems weren't panics, but rather
two issues. One was random lockups (with no evidence left in logs), but I
*think* this was down to an inadequately cooled graphics card.
I had instability problems with my Ryzen 5 - lockups for no
apparent reason. The only recourse was a hard reset.
It turned out that there were two causes:
1) old CPU microcode
2) unhandled errata in the CPU
I installed the /usr/ports/sysutils/devcpu-data port, which
allowed me to install the latest microcode using cpucontrol(8).
I also used a shell script called amd_errata.sh provided by one of
the FreeBSD committers. To my shame I can't remember exactly
who. Note that the errata fixups are now part of the kernel in
FreeBSD 12.
That's kib, who has committed things in that script to both 12 [1] and
stable/11 [2].

Unfortunately, that applies to Ryzen family 17h models 00h-0fh, whereas my
Ryzen 5 2400G is model 11h.
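
For reference, the family/model split can be checked from the CPUID signature. This is only a minimal sketch: it assumes the 0x810f10 shown in the CPU line of the original post is the CPUID leaf 1 EAX value, and it uses the AMD rule that the extended family and model fields only count when the base family is 0xf. The function names are made up for illustration.

```python
def decode_cpuid_signature(sig):
    """Split a CPUID leaf 1 EAX signature into family, model and stepping."""
    stepping = sig & 0xF
    base_model = (sig >> 4) & 0xF
    base_family = (sig >> 8) & 0xF
    ext_model = (sig >> 16) & 0xF
    ext_family = (sig >> 20) & 0xFF

    if base_family == 0xF:
        # Extended fields are only meaningful when the base family is 0xf.
        family = base_family + ext_family
        model = (ext_model << 4) | base_model
    else:
        family = base_family
        model = base_model
    return {"family": family, "model": model, "stepping": stepping}


def in_errata_range(sig):
    """True if the CPU is family 17h with a model in 00h-0fh."""
    cpu = decode_cpuid_signature(sig)
    return cpu["family"] == 0x17 and 0x00 <= cpu["model"] <= 0x0F


if __name__ == "__main__":
    cpu = decode_cpuid_signature(0x810F10)
    print("family %02xh model %02xh stepping %d"
          % (cpu["family"], cpu["model"], cpu["stepping"]))
    print("in the 00h-0fh range:", in_errata_range(0x810F10))
```

With that signature the sketch gives family 17h, model 11h, which is outside the 00h-0fh range discussed here.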

As for the microcode, it should be updated through UEFI/BIOS updates. I
think mine is now PinnaclePI-AM4_1.0.0.4 with microcode patch level
0x810100b.

Seems like ... the only thing I can do is sit down and wait?
Post by Gary Jennejohn
After taking these steps about two months ago I have had no more
lockups and the machine runs very stably.
[big snip]
--
Gary Jennejohn
[1] https://svnweb.freebsd.org/base?view=revision&revision=336763
[2] https://svnweb.freebsd.org/base?view=revision&revision=337235
karu.pruun
2018-08-27 13:16:47 UTC
Permalink
Post by Meowthink
That's kib, who has committed things in that script to both 12 [1] and
stable/11 [2].
Unfortunately, that's for Ryzens family 17h model 00h-0fh, whereas my
Ryzen 5 2400G's model is 11h.
On the microcode. It shall be updated through UEFI/BIOS updates. I
think mine is now PinnaclePI-AM4_1.0.0.4 with microcode patchlevel
0x810100b.
Seems like ... the only thing I can do is sit down and wait?
The revision

https://svnweb.freebsd.org/base/head/sys/x86/x86/cpu_machdep.c?r1=336763&r2=336762&pathrev=336763

works around the mwait issue, i.e. it sets

sysctl machdep.idle_mwait=0
sysctl machdep.idle=hlt

Now it may or may not relate to your problem, but it appears that
Ryzen 2400G also has another issue with HLT, see the DragonFly bug
report

https://bugs.dragonflybsd.org/issues/3131

which AMD is aware of and is possibly working on, but it may not have
appeared in the errata yet. The bug report says that until this is
fixed, the workaround is to also disable HLT in cpu_idle. I am not
sure what is the correct value for the sysctl on FreeBSD, perhaps

sysctl machdep.idle=0

or some other value?

Cheers

Peeter

--
Gary Jennejohn
2018-08-27 14:25:37 UTC
Permalink
On Mon, 27 Aug 2018 16:16:47 +0300
Post by karu.pruun
Post by Meowthink
That's kib, who has committed things in that script to both 12 [1] and
stable/11 [2].
Unfortunately, that's for Ryzens family 17h model 00h-0fh, whereas my
Ryzen 5 2400G's model is 11h.
On the microcode. It shall be updated through UEFI/BIOS updates. I
think mine is now PinnaclePI-AM4_1.0.0.4 with microcode patchlevel
0x810100b.
Seems like ... the only thing I can do is sit down and wait?
The revision
https://svnweb.freebsd.org/base/head/sys/x86/x86/cpu_machdep.c?r1=336763&r2=336762&pathrev=336763
works around the mwait issue, i.e. it sets
sysctl machdep.idle_mwait=0
sysctl machdep.idle=hlt
Now it may or may not relate to your problem, but it appears that
Ryzen 2400G also has another issue with HLT, see the DragonFly bug
report
https://bugs.dragonflybsd.org/issues/3131
which AMD is aware of and is possibly working on, but it may not have
appeared in the errata yet. The bug report says that until this is
fixed, the workaround is to also disable HLT in cpu_idle. I am not
sure what is the correct value for the sysctl on FreeBSD, perhaps
sysctl machdep.idle=0
or some other value?
It is in the latest errata and there are no plans to fix it.

Based on the detailed description, this is a problem only in a
hypervisor. AMD has a suggested workaround for it.
--
Gary Jennejohn
Meowthink
2018-08-27 15:07:28 UTC
Permalink
Hi peeter,
Post by karu.pruun
Post by Meowthink
That's kib, who has committed things in that script to both 12 [1] and
stable/11 [2].
Unfortunately, that's for Ryzens family 17h model 00h-0fh, whereas my
Ryzen 5 2400G's model is 11h.
On the microcode. It shall be updated through UEFI/BIOS updates. I
think mine is now PinnaclePI-AM4_1.0.0.4 with microcode patchlevel
0x810100b.
Seems like ... the only thing I can do is sit down and wait?
The revision
https://svnweb.freebsd.org/base/head/sys/x86/x86/cpu_machdep.c?r1=336763&r2=336762&pathrev=336763
works around the mwait issue, i.e. it sets
sysctl machdep.idle_mwait=0
sysctl machdep.idle=hlt
I think that does not apply to the 2400G, which is model 11h, not 01h.
Here is what I have now:

machdep.idle: acpi
machdep.idle_available: spin, mwait, hlt, acpi
machdep.idle_apl31: 0
machdep.idle_mwait: 1
Post by karu.pruun
Now it may or may not relate to your problem, but it appears that
Ryzen 2400G also has another issue with HLT, see the DragonFly bug
report
https://bugs.dragonflybsd.org/issues/3131
Thanks a lot for that info.
Your problem is much easier to prove, since it is reproducible. Mine
is too random to catch...
Anyway, it seems the IRET issue [1] is still not fixed? I strongly
suspect that my issue is related to it, because my system became
significantly more stable once I stopped that irq storm from the
bluetooth module, though it still panics occasionally.
So could anybody tell me what the difference is between the FreeBSD
workaround [2] and the DragonFly BSD one?
Post by karu.pruun
which AMD is aware of and is possibly working on, but it may not have
appeared in the errata yet. The bug report says that until this is
fixed, the workaround is to also disable HLT in cpu_idle. I am not
sure what is the correct value for the sysctl on FreeBSD, perhaps
sysctl machdep.idle=0
or some other value?
In the meantime, I have this microcode

# cpucontrol -m 0x8b /dev/cpuctl0
MSR 0x8b: 0x00000000 0x0810100b

Hence I should use mwait?
I still don't know what I should set. Any ideas?
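
To compare the running patch level with what the BIOS reports, the cpucontrol line above can be picked apart mechanically. A minimal sketch, assuming the two words printed after the MSR number are the high and then the low 32 bits of the register, and that the patch level sits in the low word of MSR 0x8b. The helper names are hypothetical.

```python
import re

# One line of `cpucontrol -m` output: MSR number, then two 32-bit words.
MSR_LINE = re.compile(
    r"^MSR (0x[0-9a-fA-F]+): (0x[0-9a-fA-F]{8}) (0x[0-9a-fA-F]{8})$"
)


def parse_msr_line(line):
    """Return (msr_number, 64-bit value), assuming high word first."""
    match = MSR_LINE.match(line.strip())
    if match is None:
        raise ValueError("unrecognised cpucontrol output: %r" % line)
    msr, high, low = (int(group, 16) for group in match.groups())
    return msr, (high << 32) | low


def patch_level(line):
    """Low 32 bits of MSR 0x8b, taken here as the microcode patch level."""
    msr, value = parse_msr_line(line)
    if msr != 0x8B:
        raise ValueError("expected MSR 0x8b, got 0x%x" % msr)
    return value & 0xFFFFFFFF


if __name__ == "__main__":
    level = patch_level("MSR 0x8b: 0x00000000 0x0810100b")
    print("patch level: 0x%x" % level)
```

Under those assumptions the value matches the 0x810100b patch level quoted for the PinnaclePI-AM4_1.0.0.4 BIOS.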
Post by karu.pruun
Cheers
Peeter
--
Thank you for your direction.

Cheers,
meowthink

[1] http://lists.dragonflybsd.org/pipermail/commits/2017-August/626190.html
[2] https://reviews.freebsd.org/D11780
karu.pruun
2018-08-28 07:54:17 UTC
Permalink
Post by Meowthink
Post by karu.pruun
Post by Meowthink
Unfortunately, that's for Ryzens family 17h model 00h-0fh, whereas my
Ryzen 5 2400G's model is 11h.
On the microcode. It shall be updated through UEFI/BIOS updates. I
think mine is now PinnaclePI-AM4_1.0.0.4 with microcode patchlevel
0x810100b.
Seems like ... the only thing I can do is sit down and wait?
The revision
https://svnweb.freebsd.org/base/head/sys/x86/x86/cpu_machdep.c?r1=336763&r2=336762&pathrev=336763
works around the mwait issue, i.e. it sets
sysctl machdep.idle_mwait=0
sysctl machdep.idle=hlt
I think that shall not apply to 2400G, which is model 11h not 1h.
machdep.idle: acpi
machdep.idle_available: spin, mwait, hlt, acpi
machdep.idle_apl31: 0
machdep.idle_mwait: 1
Post by karu.pruun
Now it may or may not relate to your problem, but it appears that
Ryzen 2400G also has another issue with HLT, see the DragonFly bug
report
https://bugs.dragonflybsd.org/issues/3131
Thanks a lot for that info.
It's much easier to prove your problem, since it's reproducible. But
mine was so random to catch...
Anyway, it seems like the IRET issue [1] is still not fixed? I'm
highly doubt that my issue is this related because my system became
significantly more stable since I stop that irq storm from bluetooth
module - Though it still panics occasionally.
So could anybody tell, what's the difference between FreeBSD
workaround [2] and the DragonflyBSD one?
Post by karu.pruun
which AMD is aware of and is possibly working on, but it may not have
appeared in the errata yet. The bug report says that until this is
fixed, the workaround is to also disable HLT in cpu_idle. I am not
sure what is the correct value for the sysctl on FreeBSD, perhaps
sysctl machdep.idle=0
or some other value?
In the meantime, I have this microcode
# cpucontrol -m 0x8b /dev/cpuctl0
MSR 0x8b: 0x00000000 0x0810100b
Hence I should use mwait?
Still don't know what should I set. Any idea?
If I were you, I'd play around with the sysctls mentioned above and see
if it helps. Start with disabling both mwait and hlt, perhaps

machdep.idle=spin
machdep.idle_mwait=0

(assuming that 'spin' means hlt will not be used) and then, if that does
not lead to a panic, try enabling mwait. I can't test the 2400G since I
don't have it any more. I booted FreeBSD on it a couple of times but did not
run it over long periods of time.
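
The first step above can be sketched as a small helper that checks the chosen idle method against the list the kernel reports before printing the sysctl commands. This is only an illustration: the sample text is the listing posted earlier in the thread, the function names are hypothetical, and the check is done by the sketch itself, not by sysctl.

```python
def parse_sysctl(text):
    """Turn 'name: value' lines of sysctl output into a dict."""
    values = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        if sep:
            values[name.strip()] = value.strip()
    return values


def idle_commands(values, idle, idle_mwait):
    """Build the sysctl commands for one trial, rejecting unknown idle names."""
    available = [m.strip() for m in values["machdep.idle_available"].split(",")]
    if idle not in available:
        raise ValueError(
            "%r is not in machdep.idle_available: %s" % (idle, ", ".join(available))
        )
    return [
        "sysctl machdep.idle=%s" % idle,
        "sysctl machdep.idle_mwait=%d" % (1 if idle_mwait else 0),
    ]


# Listing as posted earlier in the thread.
CURRENT = """\
machdep.idle: acpi
machdep.idle_available: spin, mwait, hlt, acpi
machdep.idle_apl31: 0
machdep.idle_mwait: 1
"""

if __name__ == "__main__":
    for command in idle_commands(parse_sysctl(CURRENT), "spin", False):
        print(command)
```

A value such as "0" is rejected by this check because it is not one of the names in machdep.idle_available.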

Cheers

Peeter

--

Gary Jennejohn
2018-08-27 14:13:13 UTC
Permalink
On Mon, 27 Aug 2018 20:18:46 +0800
Post by Meowthink
Hi Gary,
Post by Gary Jennejohn
On Mon, 27 Aug 2018 10:13:10 +0200
Post by Phil Norman
Hi.
I have a similar setup: Ryzen 3 and Fatal1ty X370 mini-ITX. I had some
trouble with instability, although my problems weren't panics, but rather
two issues. One was random lockups (with no evidence left in logs), but I
*think* this was down to an inadequately cooled graphics card.
I had instability problems with my Ryzen 5 - lockups for no
apparent reason. The only recourse was a hard reset.
It turned out that there were two causes:
1) old CPU microcode
2) unhandled errata in the CPU
I installed the /usr/ports/sysutils/devcpu-data port, which
allowed me to install the latest microcode using cpucontrol(8).
I also used a shell script called amd_errata.sh provided by one of
the FreeBSD committers. To my shame I can't remember exactly
who. Note that the errata fixups are now part of the kernel in
FreeBSD 12.
That's kib, who has committed things in that script to both 12 [1] and
stable/11 [2].
Unfortunately, that's for Ryzens family 17h model 00h-0fh, whereas my
Ryzen 5 2400G's model is 11h.
AMD has also released a Revision Guide for Family 11h. Lots of
errata are listed there, but I didn't look at it closely enough to say whether
any are relevant to lockups.
Post by Meowthink
On the microcode. It shall be updated through UEFI/BIOS updates. I
think mine is now PinnaclePI-AM4_1.0.0.4 with microcode patchlevel
0x810100b.
Well, I installed the latest BIOS for my ASUS B350M-A also, but it
was no help. The lockups disappeared only after I installed the
latest microcode using the port.
Post by Meowthink
Seems like ... the only thing I can do is sit down and wait?
Post by Gary Jennejohn
After taking these steps about two months ago I have had no more
lockups and the machine runs very stably.
[big snip]
--
Gary Jennejohn
[1] https://svnweb.freebsd.org/base?view=revision&revision=336763
[2] https://svnweb.freebsd.org/base?view=revision&revision=337235
--
Gary Jennejohn
Conrad Meyer
2018-08-27 16:30:43 UTC
Permalink
Post by Meowthink
That's kib, who has committed things in that script to both 12 [1] and
stable/11 [2].
Unfortunately, that's for Ryzens family 17h model 00h-0fh, whereas my
Ryzen 5 2400G's model is 11h.
On the microcode. It shall be updated through UEFI/BIOS updates. I
think mine is now PinnaclePI-AM4_1.0.0.4 with microcode patchlevel
0x810100b.
[1] https://svnweb.freebsd.org/base?view=revision&revision=336763
It seems AMD has only published an errata document for Ryzen 1, models
00h-0fh, unfortunately.

Best,
Conrad
Mitchell
2018-08-27 12:28:07 UTC
Permalink
Hi Meowthink:

I'm planning a Home Build, and I came across an issue which might apply
to your design.

Some AMD CPUs are designed for Over-Clocking automatically. But when I
investigated Memory Compatibility I saw that some Memory wasn't.

The "AMD Ryzen 5 2400G" looks like it can Over-Clock itself when it
feels safe to do so.

But the "Crucial 16GB DDR4-2400 EUDIMM CL17" seems to be classified as
Server Memory, which could mean it's designed for a single speed. I
couldn't find more details about Crucial Memory Over-Clocking.

The Crucial Web Pages do feature a Help Facility which might enable you
to check further if you input all your system details.

I'm no expert here. This will be my first Home Build attempt and I
haven't even started yet. You probably need a 2nd and 3rd opinion on
this topic. I'm just hoping my contribution will prompt further comments
from FreeBSD people with more know-how than I've got.

Yours truly: Frank Mitchell
Post by Phil Norman
Hi.
I have a similar setup: Ryzen 3 and Fatal1ty X370 mini-ITX. I had some
trouble with instability, although my problems weren't panics, but rather
two issues. One was random lockups (with no evidence left in logs), but I
*think* this was down to an inadequately cooled graphics card.
The other problem I had was with USB. I got quite a spam of log messages
about the USB reinitialisation. However, eventually I figured out that the
problem didn't occur if I booted the system from a completely powered-down
state. That is, use the physical switch on the PSU to cut power entirely,
re-enable, then boot from that state. Since then I've had 67 days of
uninterrupted uptime, with no USB issues at all.
It sounds like your problem is different, but trying a boot-from-cold might
be worthwhile, just in case ASRock have a consistent problem in this regard.
Cheers,
Phil
Meowthink
2018-08-27 13:07:34 UTC
Permalink
Hi frank,
Post by Mitchell
I'm planning a Home Build, and I came across an issue which might apply
to your design.
Some AMD CPUs are designed for Over-Clocking automatically. But when I
investigated Memory Compatibility I saw that some Memory wasn't.
Many Intel CPUs have Turbo Boost as well. I think it's safe to
trust these designs: they talk to memory at a steady clock
rate, which is provided by the SPD chips on the DIMMs.

Ryzens are known to have compatibility issues with memory. An easier
way is to choose a module which is on the qualified vendor list, "QVL".
Post by Mitchell
The "AMD Ryzen 5 2400G" looks like it can Over-Clock itself when it
feels safe to do so.
But the "Crucial 16GB DDR4-2400 EUDIMM CL17" seems to be classified as
Server Memory, which could mean it's designed for a single speed. I
couldn't find more details about Crucial Memory Over-Clocking.
The Crucial Web Pages do feature a Help Facility which might enable you
to check further if you input all your system details.
That was a mistake I made months ago. What I cared about was ECC.
I knew Ryzens (1x00) support ECC. I then misread the mobo's
specification, as Asrock didn't mention Raven Ridge (2x00G) at
that time. I thought my build with the 2400G would get ECC, but sadly not.
Now Asrock says this on their website:

- AMD Ryzen series CPUs (Raven Ridge) support DDR4
3200+(OC)/2933(OC)/2667/2400/2133 non-ECC, un-buffered memory*

*For Ryzen Series CPUs (Raven Ridge), ECC is only supported with PRO CPUs.

In the end I got my system running, but without ECC enabled.
Post by Mitchell
I'm no expert here. This will be my first Home Build attempt and I
haven't even started yet. You probably need a 2nd and 3rd opinion on
this topic. I'm just hoping my contribution will prompt further comments
from FreeBSD people with more know-how than I've got.
Yours truly: Frank Mitchell
You are welcome.

Cheers,
meowthink
Stefan Blachmann
2018-08-27 21:30:25 UTC
Permalink
It is remarkable that AMD's list only contains "brands" like Crucial
and the like, but not a single first-party-manufacturer.
Why? Because the first-party-manufacturers do not sell bad memories,
for simple reputation reasons.

The question of where all those masses of B-grade chips end up,
which the memory manufacturers reject for use under their own brand,
is an old taboo in the industry.

My personal impression is that these are dumped via these third party
memory module manufacturers.
The typical gamer/overclocker customer unaware of this will readily
explain away problems on her non-ECC systems equipped with memory
chips rejected by the original manufacturer as "the usual Windows
crashes".
The consumers will even happily take the fancy "coolers" on the
modules as "sign of quality and worthiness", whose actual function is
to hide the crap inside.

Thus my personal advice:
Do not use memory modules from third-party-manufacturers.
The time and data you lose does not justify the savings when buying
stuff from B-grade-stuff remarketers.
Only buy first-party-memory modules, i.e. Samsung, Hynix, Micron etc.
(If you really insist on using third-party-modules, take Kingston, who
have a comparatively small history of using unreliable chips compared
to other "brands".)
Don Lewis
2018-08-28 00:44:04 UTC
Permalink
Post by Stefan Blachmann
It is remarkable that AMD's list only contains "brands" like Crucial
and the like, but not a single first-party-manufacturer.
Why? Because the first-party-manufacturers do not sell bad memories,
for simple reputation reasons.
The question where all these masses of B-grade selection chips remain,
which the memory manufacturers reject for use under their own brand,
is an old taboo in the industry.
My personal impression is that these are dumped via these third party
memory module manufacturers.
The typical gamer/overclocker customer unaware of this will readily
explain away problems on her non-ECC systems equipped with memory
chips rejected by the original manufacturer as "the usual Windows
crashes".
The consumers will even happily take the fancy "coolers" on the
modules as "sign of quality and worthiness", whose actual function is
to hide the crap inside.
Do not use memory modules from third-party-manufacturers.
The time and data you lose does not justify the savings when buying
stuff from B-grade-stuff remarketers.
Only buy first-party-memory modules, i.e. Samsung, Hynix, Micron etc.
(If you really insist on using third-party-modules, take Kingston, who
have a comparatively small history of using unreliable chips compared
to other "brands".)
Crucial == Micron. There is a Micron copyright notice at the bottom of
the Crucial home page, and a link to Crucial at the bottom of the Micron
home page.

When I put together my Ryzen machine last year, I purchased the Crucial
DDR4-2400 ECC RAM that was listed on the motherboard vendor's qualified
list. I also looked up the part number for the Micron-branded
equivalent, but didn't find any available for retail sale. I've used a
fair amount of Kingston RAM in the past, but at the time they didn't
have DDR4 ECC RAM in that speed grade.
Johannes Lundberg
2018-08-27 12:56:45 UTC
Permalink
Post by Meowthink
Hello all,
Recently I tried to build up a Ryzen system and run FreeBSD on it.
CPU: AMD Ryzen 5 2400G with Radeon Vega Graphics (0x810f10)
Mobo: Asrock Fatal1ty AB350 Gaming-ITX/ac ( with up-to-date BIOS with
PinnaclePI-AM4_1.0.0.4, microcode 0x810100b )
Mem: 2x Crucial 16GB DDR4-2400 EUDIMM CL17 ( ECC Unregistered but ECC
actually won't work :( )
But the system is unstable - it can't last few days even is nearly
idle. System panics even at midnight. It almost panic while or after I
built something large. Surprisly I didn't encourage a user program
fault, bad binaries built etc., panics only.
Then I tried lots of BIOS settings e.g. SMT, C6 idle current,
underclock RAM, but none seems effect.
It could pass memtest86 V7.5 without error, or various benchmarks
under Windows. thus I think the problem is not in the hardware but
software.
In the mean time, I realized that the rate of irqs from xhci0 are too
high - it's about 1998/s. I found [1] and tried to MFC r331665. It
didn't fix the problem though, but disabling that bluetooth module
stops the irq storm, after all.
Then the system lasts much longer before panic. It eventually can
compile ports tree, build the world, scrub the zpool, all done without
annoying reboots.
Then I assume this is [2] related? So I also tried cpuctl, bounding
all processes to 2-7.
But the problem is still there, only the chance become very low. It
still panics occasionally, idling a week or stressing few hours -
Stress seems to rise the chance of panic, but differently by types.
Things like llvm will always build, but gcc will cause a panic per few
passes.
The system was 11.2 but then moved on to stable/11 (r337906
currently). I've got last 10 coredumps saved but my kernel isn't
compile as debug. So I'll put some backtrace from core.txt.? in the
end.
Indeed I want to eliminate this problem. Could someone guide me how to
figure out the problem? What should I try next?
Best regards,
Meowthink
Hi

I have a similar setup and also experience random hard resets without a
kernel dump. FreeBSD can usually run for days; Windows 10 (a fresh install, I
think) doesn't last more than a few minutes before a BSOD.

Windows 10 came with the machine when I bought it. I can't run Windows Update
or anything else since it hangs so soon.
The box came with an earlier mobo+CPU that I replaced with an ASRock+Ryzen.
The earlier AMD Kaveri was stable in both Windows (I think) and FreeBSD.

On occasion I get a kernel panic directly in early boot. See attached image.

I have the CPU microcode update enabled in rc.conf.
Post by Meowthink
[1] https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=224886
[2] https://reviews.freebsd.org/D11780
------------------------------------------------------------------------
#0 doadump (textdump=<value optimized out>) at pcpu.h:230
230 pcpu.h: No such file or directory.
in pcpu.h
(kgdb) #0 doadump (textdump=<value optimized out>) at pcpu.h:230
#1 0xffffffff80afa5fb in kern_reboot (howto=260)
at /usr/src/sys/kern/kern_shutdown.c:383
#2 0xffffffff80afaa21 in vpanic (fmt=<value optimized out>,
ap=<value optimized out>) at /usr/src/sys/kern/kern_shutdown.c:776
#3 0xffffffff80afa863 in panic (fmt=<value optimized out>)
at /usr/src/sys/kern/kern_shutdown.c:707
#4 0xffffffff80f7c14f in trap_fatal (frame=0xfffffe081e962790,
eva=18446735309538549504) at /usr/src/sys/amd64/amd64/trap.c:877
#5 0xffffffff80f7c1a9 in trap_pfault (frame=0xfffffe081e962790, usermode=0)
at pcpu.h:230
#6 0xffffffff80f7b984 in trap (frame=0xfffffe081e962790)
at /usr/src/sys/amd64/amd64/trap.c:415
#7 0xffffffff80f5bccc in calltrap ()
at /usr/src/sys/amd64/amd64/exception.S:231
#8 0xffffffff822950a8 in arc_change_state ()
at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c:1800
#9 0xffffffff8229328b in arc_access () at time.h:145
#10 0xffffffff82296232 in arc_write_done (zio=0xfffff8065f886410)
at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c:6169
#11 0xffffffff82334cbe in zio_done (zio=<value optimized out>)
at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c:4032
#12 0xffffffff8233070c in zio_execute (zio=0xfffff8065f886410)
at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c:1768
#13 0xffffffff80b52cc4 in taskqueue_run_locked (queue=0xfffff8000d9e6e00)
at /usr/src/sys/kern/subr_taskqueue.c:463
#14 0xffffffff80b53e28 in taskqueue_thread_loop (arg=<value optimized out>)
at /usr/src/sys/kern/subr_taskqueue.c:755
#15 0xffffffff80abd813 in fork_exit (
callout=0xffffffff80b53d90 <taskqueue_thread_loop>,
arg=0xfffff8000d967030, frame=0xfffffe081e962ac0)
at /usr/src/sys/kern/kern_fork.c:1072
#16 0xffffffff80f5cc7e in fork_trampoline ()
at /usr/src/sys/amd64/amd64/exception.S:972
#17 0x0000000000000000 in ?? ()
Current language: auto; currently minimal
(kgdb)
------------------------------------------------------------------------
#0 doadump (textdump=<value optimized out>) at pcpu.h:230
230 pcpu.h: No such file or directory.
in pcpu.h
(kgdb) #0 doadump (textdump=<value optimized out>) at pcpu.h:230
#1 0xffffffff80afa5fb in kern_reboot (howto=260)
at /usr/src/sys/kern/kern_shutdown.c:383
#2 0xffffffff80afaa21 in vpanic (fmt=<value optimized out>,
ap=<value optimized out>) at /usr/src/sys/kern/kern_shutdown.c:776
#3 0xffffffff80afa863 in panic (fmt=<value optimized out>)
at /usr/src/sys/kern/kern_shutdown.c:707
#4 0xffffffff80f7c14f in trap_fatal (frame=0xfffffe081ed30700, eva=0)
at /usr/src/sys/amd64/amd64/trap.c:877
#5 0xffffffff80f7c1a9 in trap_pfault (frame=0xfffffe081ed30700, usermode=0)
at pcpu.h:230
#6 0xffffffff80f7b984 in trap (frame=0xfffffe081ed30700)
at /usr/src/sys/amd64/amd64/trap.c:415
#7 0xffffffff80f5bccc in calltrap ()
at /usr/src/sys/amd64/amd64/exception.S:231
#8 0xffffffff80dfe4ad in vm_object_terminate (object=0xfffff805bf66d5a0)
at /usr/src/sys/vm/vm_object.c:768
#9 0xffffffff80dfd0f8 in vm_object_deallocate (object=0x0)
at /usr/src/sys/vm/vm_object.c:677
#10 0xffffffff80df3189 in _vm_map_unlock (map=<value optimized out>,
file=<value optimized out>, line=<value optimized out>)
at /usr/src/sys/vm/vm_map.c:2939
#11 0xffffffff80df7be2 in vm_map_remove (map=0xfffff80018673000, start=4096,
end=140737488351232) at /usr/src/sys/vm/vm_map.c:3137
#12 0xffffffff80df2e49 in vmspace_exit (td=0xfffff80039ec0620)
at /usr/src/sys/vm/vm_map.c:337
#13 0xffffffff80ab72b9 in exit1 (td=0xfffff80039ec0620,
rval=<value optimized out>, signo=<value optimized out>)
at /usr/src/sys/kern/kern_exit.c:401
#14 0xffffffff80ab6ced in sys_sys_exit (td=<value optimized out>,
uap=<value optimized out>) at /usr/src/sys/kern/kern_exit.c:180
#15 0xffffffff80f7d1d8 in amd64_syscall (td=0xfffff80039ec0620, traced=0)
at subr_syscall.c:132
#16 0xffffffff80f5c5ad in fast_syscall_common ()
at /usr/src/sys/amd64/amd64/exception.S:494
#17 0x00000008028d034a in ?? ()
Previous frame inner to this frame (corrupt stack?)
Current language: auto; currently minimal
(kgdb)
------------------------------------------------------------------------
Panic while running only my single-threaded Python script
#0 doadump (textdump=<value optimized out>) at pcpu.h:230
230 pcpu.h: No such file or directory.
in pcpu.h
(kgdb) #0 doadump (textdump=<value optimized out>) at pcpu.h:230
#1 0xffffffff80afa5fb in kern_reboot (howto=260)
at /usr/src/sys/kern/kern_shutdown.c:383
#2 0xffffffff80afaa21 in vpanic (fmt=<value optimized out>,
ap=<value optimized out>) at /usr/src/sys/kern/kern_shutdown.c:776
#3 0xffffffff80afa863 in panic (fmt=<value optimized out>)
at /usr/src/sys/kern/kern_shutdown.c:707
#4 0xffffffff80f7c14f in trap_fatal (frame=0xfffffe081f635ec0, eva=952)
at /usr/src/sys/amd64/amd64/trap.c:877
#5 0xffffffff80f7c1a9 in trap_pfault (frame=0xfffffe081f635ec0, usermode=0)
at pcpu.h:230
#6 0xffffffff80f7b984 in trap (frame=0xfffffe081f635ec0)
at /usr/src/sys/amd64/amd64/trap.c:415
#7 0xffffffff80f5bccc in calltrap ()
at /usr/src/sys/amd64/amd64/exception.S:231
#8 0xffffffff80af57ad in __rw_wlock_hard (c=0xfffff80016f8f798,
v=<value optimized out>) at /usr/src/sys/kern/kern_rwlock.c:977
#9 0xffffffff80bbca92 in bufobj_invalbuf (bo=<value optimized out>, flags=1,
slpflag=1017770744, slptimeo=<value optimized out>)
at /usr/src/sys/kern/vfs_subr.c:1609
#10 0xffffffff80bbf8be in vgonel (vp=0xfffff8053ca9f1d8)
at /usr/src/sys/kern/vfs_subr.c:1655
#11 0xffffffff80bbbcc4 in vnlru_free_locked (count=1, mnt_op=0x0)
at /usr/src/sys/kern/vfs_subr.c:1227
#12 0xffffffff80bbbe14 in getnewvnode_reserve (count=1)
at /usr/src/sys/kern/vfs_subr.c:1287
#13 0xffffffff82327fb4 in zfs_zget (zfsvfs=0xfffff80076574000,
obj_num=34941,
zpp=0xfffffe081f6362a8)
at
/usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_znode.c:1122
#14 0xffffffff823421ad in zfs_dirent_lookup (dzp=0xfffff804ff94e420,
name=0xfffffe081f6363e0 "filename.ext", zpp=0xfffffe081f6362a8, flag=2)
at
/usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_dir.c:187
#15 0xffffffff82342267 in zfs_dirlook (dzp=0xfffff804ff94e420,
name=<value optimized out>, zpp=0xfffffe081f636360)
at
/usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_dir.c:238
#16 0xffffffff8235a4ef in zfs_lookup ()
at
/usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c:1658
#17 0xffffffff8235ac1e in zfs_freebsd_lookup (ap=0xfffffe081f636548)
at
/usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c:4956
#18 0xffffffff810fe89c in VOP_CACHEDLOOKUP_APV (vop=<value optimized out>,
a=0xfffffe081f636548) at vnode_if.c:195
#19 0xffffffff80ba8d56 in vfs_cache_lookup (ap=<value optimized out>)
at vnode_if.h:80
#20 0xffffffff810fe77c in VOP_LOOKUP_APV (vop=<value optimized out>,
a=0xfffffe081f636610) at vnode_if.c:127
#21 0xffffffff80bb2761 in lookup (ndp=0xfffffe081f636748) at vnode_if.h:54
#22 0xffffffff80bb1c29 in namei (ndp=0xfffffe081f636748)
at /usr/src/sys/kern/vfs_lookup.c:448
#23 0xffffffff80bc8238 in kern_statat (td=0xfffff8013ba1b620,
flag=<value optimized out>, fd=-100,
path=0x80332c910 <Address 0x80332c910 out of bounds>,
pathseg=UIO_USERSPACE, sbp=0xfffffe081f636900, hook=0)
at /usr/src/sys/kern/vfs_syscalls.c:2023
#24 0xffffffff80bc817d in sys_stat (td=<value optimized out>,
uap=0xfffff8013ba1bb58) at /usr/src/sys/kern/vfs_syscalls.c:1978
#25 0xffffffff80f7d1d8 in amd64_syscall (td=0xfffff8013ba1b620, traced=0)
at subr_syscall.c:132
#26 0xffffffff80f5c5ad in fast_syscall_common ()
at /usr/src/sys/amd64/amd64/exception.S:494
#27 0x0000000801a5b9ca in ?? ()
Previous frame inner to this frame (corrupt stack?)
Current language: auto; currently minimal
(kgdb)
------------------------------------------------------------------------
Panic while using mplayer
#0 doadump (textdump=<value optimized out>) at pcpu.h:230
230 pcpu.h: No such file or directory.
in pcpu.h
(kgdb) #0 doadump (textdump=<value optimized out>) at pcpu.h:230
#1 0xffffffff80af91cb in kern_reboot (howto=260)
at /usr/src/sys/kern/kern_shutdown.c:383
#2 0xffffffff80af95f1 in vpanic (fmt=<value optimized out>,
ap=<value optimized out>) at /usr/src/sys/kern/kern_shutdown.c:776
#3 0xffffffff80af9433 in panic (fmt=<value optimized out>)
at /usr/src/sys/kern/kern_shutdown.c:707
#4 0xffffffff80f7a13f in trap_fatal (frame=0xfffffe081f70d380, eva=0)
at /usr/src/sys/amd64/amd64/trap.c:877
#5 0xffffffff80f7a199 in trap_pfault (frame=0xfffffe081f70d380, usermode=0)
at pcpu.h:230
#6 0xffffffff80f79974 in trap (frame=0xfffffe081f70d380)
at /usr/src/sys/amd64/amd64/trap.c:415
#7 0xffffffff80f5a00c in calltrap ()
at /usr/src/sys/amd64/amd64/exception.S:231
#8 0xffffffff8088c030 in hdac_stream_start (dev=<value optimized out>,
child=<value optimized out>, dir=0, stream=1, buf=1889533952, blksz=2048,
blkcnt=2) at /usr/src/sys/dev/sound/pci/hda/hdac.c:1927
#9 0xffffffff8088437d in hdaa_channel_start (ch=<value optimized out>)
at hdac_if.h:84
#10 0xffffffff80887e0d in hdaa_channel_trigger (obj=<value optimized out>,
data=0xfffff8007102c480, go=1)
at /usr/src/sys/dev/sound/pci/hda/hdaa.c:2161
#11 0xffffffff80893b8e in chn_trigger (c=0xfffff80071058400, go=1)
at channel_if.h:131
#12 0xffffffff8089751b in chn_notify (c=0xfffff80071058400,
flags=<value optimized out>) at
/usr/src/sys/dev/sound/pcm/channel.c:2281
#13 0xffffffff808b697f in vchan_trigger (obj=<value optimized out>,
data=<value optimized out>, go=1)
at /usr/src/sys/dev/sound/pcm/vchan.c:171
#14 0xffffffff80893b8e in chn_trigger (c=0xfffff80071057c00, go=1)
at channel_if.h:131
#15 0xffffffff8089de10 in dsp_ioctl (i_dev=<value optimized out>,
cmd=<value optimized out>, arg=0xfffffe081f70d8d0 "\003",
mode=<value optimized out>, td=<value optimized out>)
at /usr/src/sys/dev/sound/pcm/dsp.c:1733
#16 0xffffffff809c5b38 in devfs_ioctl_f (fp=0xfffff802c5563c80,
com=2147766288, data=0xfffffe081f70d8d0, cred=0xfffff8004c482500,
td=0xfffff802f24ac000) at /usr/src/sys/fs/devfs/devfs_vnops.c:791
#17 0xffffffff80b5c00d in kern_ioctl (td=0xfffff802f24ac000, fd=51,
com=2147766288, data=<value optimized out>) at file.h:323
#18 0xffffffff80b5bd2c in sys_ioctl (td=0xfffff802f24ac000,
uap=0xfffff802f24ac538) at /usr/src/sys/kern/sys_generic.c:745
#19 0xffffffff80f7b1c8 in amd64_syscall (td=0xfffff802f24ac000, traced=0)
at subr_syscall.c:132
#20 0xffffffff80f5a8ed in fast_syscall_common ()
at /usr/src/sys/amd64/amd64/exception.S:494
#21 0x0000000801fb94aa in ?? ()
Previous frame inner to this frame (corrupt stack?)
Current language: auto; currently minimal
(kgdb)
------------------------------------------------------------------------
Panic while idle; it seems cron jobs triggered ZFS ARC cleanup.
#0 doadump (textdump=<value optimized out>) at pcpu.h:230
230 pcpu.h: No such file or directory.
in pcpu.h
(kgdb) #0 doadump (textdump=<value optimized out>) at pcpu.h:230
#1 0xffffffff80af95fb in kern_reboot (howto=260)
at /usr/src/sys/kern/kern_shutdown.c:383
#2 0xffffffff80af9a21 in vpanic (fmt=<value optimized out>,
ap=<value optimized out>) at /usr/src/sys/kern/kern_shutdown.c:776
#3 0xffffffff80af9863 in panic (fmt=<value optimized out>)
at /usr/src/sys/kern/kern_shutdown.c:707
#4 0xffffffff80f7b13f in trap_fatal (frame=0xfffffe081ee186e0, eva=201697507)
at /usr/src/sys/amd64/amd64/trap.c:877
#5 0xffffffff80f7b199 in trap_pfault (frame=0xfffffe081ee186e0, usermode=0)
at pcpu.h:230
#6 0xffffffff80f7a974 in trap (frame=0xfffffe081ee186e0)
at /usr/src/sys/amd64/amd64/trap.c:415
#7 0xffffffff80f5a5bc in calltrap ()
at /usr/src/sys/amd64/amd64/exception.S:231
#8 0xffffffff80ad596e in free (addr=0xfffff802472af200,
mtp=0xffffffff825bfc00) at /usr/src/sys/kern/kern_malloc.c:583
#9 0xffffffff8232a667 in zfs_inactive (vp=<value optimized out>,
cr=<value optimized out>, ct=<value optimized out>)
at
/usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c:4333
#10 0xffffffff82332a1d in zfs_freebsd_inactive (ap=<value optimized out>)
at
/usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c:5364
#11 0xffffffff810ff6b2 in VOP_INACTIVE_APV (vop=<value optimized out>,
a=0xfffffe081ee18858) at vnode_if.c:1955
#12 0xffffffff80bbd7bc in vinactive (vp=0xfffff803ae8b3760,
td=0xfffff803ae23b620) at vnode_if.h:807
#13 0xffffffff80bbdcc7 in vputx (vp=0xfffff803ae8b3760, func=1)
at /usr/src/sys/kern/vfs_subr.c:2688
#14 0xffffffff80bc5180 in sys_fchdir (td=0xfffff803ae23b620,
uap=<value optimized out>) at /usr/src/sys/kern/vfs_syscalls.c:724
#15 0xffffffff80f7c1c8 in amd64_syscall (td=0xfffff803ae23b620, traced=0)
at subr_syscall.c:132
#16 0xffffffff80f5ae9d in fast_syscall_common ()
at /usr/src/sys/amd64/amd64/exception.S:494
#17 0x00000008008a99aa in ?? ()
Previous frame inner to this frame (corrupt stack?)
Current language: auto; currently minimal
(kgdb)
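
To compare the saved dumps side by side, it may help to reduce each backtrace to the function that faulted (the frame right after calltrap) and the fault address (eva) in hex rather than decimal. The following is only a rough sketch: summarize_backtrace and SAMPLE are made-up names, and the regular expressions assume kgdb output shaped like the traces above.

```python
import re

# Frame lines look like "#8 0xffffffff822950a8 in arc_change_state ()"
# or "#0 doadump (textdump=...)", optionally prefixed by "(kgdb) ".
FRAME_RE = re.compile(r"^(?:\(kgdb\)\s*)?#(\d+)\s+(?:0x[0-9a-f]+\s+in\s+)?(\w+)\s*\(")
# trap_fatal prints the faulting virtual address as a decimal "eva=" value.
EVA_RE = re.compile(r"\beva=(\d+)")


def summarize_backtrace(text):
    """Return (faulting_function, fault_address_hex) for one kgdb backtrace.

    Either element is None when it cannot be found in the text.
    """
    frames = []
    eva = None
    for raw in text.splitlines():
        line = raw.strip()
        frame = FRAME_RE.match(line)
        if frame:
            frames.append(frame.group(2))
        if eva is None:
            found = EVA_RE.search(line)
            if found:
                eva = int(found.group(1))
    faulting = None
    if "calltrap" in frames:
        idx = frames.index("calltrap")
        # The frame after calltrap is the code that was running when the
        # page fault was taken.
        if idx + 1 < len(frames):
            faulting = frames[idx + 1]
    return faulting, (hex(eva) if eva is not None else None)


# Excerpt of the first backtrace in this post, used as sample input.
SAMPLE = """\
(kgdb) #0 doadump (textdump=<value optimized out>) at pcpu.h:230
#4 0xffffffff80f7c14f in trap_fatal (frame=0xfffffe081e962790,
eva=18446735309538549504) at /usr/src/sys/amd64/amd64/trap.c:877
#5 0xffffffff80f7c1a9 in trap_pfault (frame=0xfffffe081e962790, usermode=0)
#6 0xffffffff80f7b984 in trap (frame=0xfffffe081e962790)
#7 0xffffffff80f5bccc in calltrap ()
#8 0xffffffff822950a8 in arc_change_state ()
#9 0xffffffff8229328b in arc_access () at time.h:145
"""

if __name__ == "__main__":
    print(summarize_backtrace(SAMPLE))
```

Running this over each core.txt.? file would show at a glance whether the panics cluster in one subsystem or, as in the traces above, land in unrelated code (ZFS ARC, VM, VFS, sound).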
_______________________________________________
https://lists.freebsd.org/mailman/listinfo/freebsd-hackers