Discussion:
head -r339076 amd64 -> armv7 port cross build attempt with native tools involved: hangs between a cc (wait) and its child ld (uwait)
Mark Millard via freebsd-hackers
2018-10-27 03:42:27 UTC
In trying to amd64 -> armv7 cross build ports via poudriere-devel
use with native cross tools involved (and UFS, not ZFS), I'm
getting about 117 ports that built and then one that ends up stuck
in wait/uwait . ^C to poudriere and restarting it repeats the
stuck behavior at the same point (a cc and its ld), for example:


[00:02:51] [01] [00:00:00] Building print/texinfo | texinfo-6.5,1

ps output extraction (blank lines added for ease of
scanning):

UID PID PPID CPU PRI NI VSZ RSS MWCHAN STAT TT TIME COMMAND
. . .
0 42312 32181 0 52 0 12904 3904 select I 1 0:00.02 sh: poudriere[FBSDFSSDjailArmV7-default][01]: build_pkg (texinfo-6.5,1) (sh)
0 42974 42312 0 52 0 12904 3900 wait I 1 0:00.00 sh: poudriere[FBSDFSSDjailArmV7-default][01]: build_pkg (texinfo-6.5,1) (sh)
0 42975 42974 0 52 0 10408 1840 wait IJ 1 0:00.01 /usr/bin/make -C /usr/ports/print/texinfo configure
0 43077 42975 0 52 0 10252 1792 wait IJ 1 0:00.00 /bin/sh -e -c (cd /wrkdirs/usr/ports/print/texinfo/work/texinfo-6.5 && _LATE_CONFIGURE_ARGS="" ; if [ -z "" ] && ./configure --help
0 43375 43077 0 52 0 11164 2392 wait IJ 1 0:00.19 /bin/sh ./configure --enable-nls --prefix=/usr/local --localstatedir=/var --mandir=/usr/local/man --disable-silent-rules --infodir=/usr
0 46850 43375 0 52 0 11164 2388 wait IJ 1 0:00.00 /bin/sh ./configure --enable-nls --prefix=/usr/local --localstatedir=/var --mandir=/usr/local/man --disable-silent-rules --infodir=/usr
0 46857 46850 0 52 0 11080 2060 wait IJ 1 0:00.04 /bin/sh ./configure --disable-option-checking --prefix=/usr/local --enable-nls --localstatedir=/var --mandir=/usr/local/man --disable-s

0 47796 46857 0 52 0 113840 26184 wait IJ 1 0:00.15 /usr/local/bin/qemu-arm-static /usr/bin/cc -o conftest -O2 -pipe -mcpu=cortex-a7 -DLIBICONV_PLUG -g -fno-strict-aliasing -mcpu=cortex-a

0 47801 47796 0 52 0 285300 39672 uwait IJ 1 0:00.22 qemu-arm-static -L /usr/gnemul/qemu-arm /usr/bin/ld --eh-frame-hdr -dynamic-linker /libexec/ld-elf.so.1 --hash-style=both --enable-new-

So the "/usr/local/bin/qemu-arm-static /usr/bin/cc . . ."
creates the child "qemu-arm-static -L /usr/gnemul/qemu-arm /usr/bin/ld . . ."
process and the two get hung up. Letting it sit for long periods does
not let it progress.

The full commands are (note the "-pipe" vs. the "/tmp/conftest-6c0832.o"):

/usr/local/bin/qemu-arm-static /usr/bin/cc -o conftest -O2 -pipe -mcpu=cortex-a7 -DLIBICONV_PLUG -g -fno-strict-aliasing -mcpu=cortex-a7 -DLIBICONV_PLUG conftest.c

and:

qemu-arm-static -L /usr/gnemul/qemu-arm /usr/bin/ld --eh-frame-hdr -dynamic-linker /libexec/ld-elf.so.1 --hash-style=both --enable-new-dtags -o conftest /usr/lib/crt1.o /usr/lib/crti.o /usr/lib/crtbegin.o -L/usr/lib /tmp/conftest-6c0832.o -lgcc --as-needed -lgcc_s --no-as-needed -lc -lgcc --as-needed -lgcc_s --no-as-needed /usr/lib/crtend.o /usr/lib/crtn.o

For reference for /tmp/conftest-6c0832.o :

# ls -lTd /usr/local/poudriere/data/.m/FBSDFSSDjailArmV7-default/01/tmp/conftest-6c0832.o
-rw-r--r-- 1 root wheel 4204 Oct 26 17:33:13 2018 /usr/local/poudriere/data/.m/FBSDFSSDjailArmV7-default/01/tmp/conftest-6c0832.o

(I'm not using tmpfs or the like at all.)


The context is based on head -r339076 and is on a
Ryzen Threadripper 1950X system, natively booted
(not Hyper-V). (I've not tried under Hyper-V yet.)

Note: I have built ports similarly before --but the
last time was back in March-May sometime.

# poudriere jail -jFBSDFSSDjailArmV7 -i
Jail name: FBSDFSSDjailArmV7
Jail version: 12.0-ALPHA8
Jail arch: arm.armv7
Jail method: null
Jail mount: /usr/obj/DESTDIRs/clang-armv7-installworld-poud
Jail fs:
Jail updated: 2018-10-26 16:42:55
Tree name: default
Tree method: null
Status: parallel_build:
Building started: 2018-10-26 17:29:36
Elapsed time: 02:47:50
Packages built: 0
Packages failed: 0
Packages ignored: 0
Packages skipped: 0
Packages total: 84
Packages left: 84

# poudriere ports -l
PORTSTREE METHOD TIMESTAMP PATH
default null 2017-08-14 21:07:05 /usr/ports


I have yet to think of a way to look into this or to work around
it. But my long running build on an Orange Pi Plus 2nd Edition
has finished so I'll update from that for now.

===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went
away in early 2018-Mar)
Mark Millard via freebsd-hackers
2018-10-27 05:35:33 UTC
[Attaching to the ld process with gdb and detaching let things
continue.]
I tried again and when it hung up I used gdb to
attach to the ld process and later to detach:

# gdb `which qemu-arm-static`
. . .
(gdb) attach 18703
Attaching to program: /usr/local/bin/qemu-arm-static, process 18703
Couldn't get registers: Device busy.
. . .
(gdb) bt
#0 _umtx_op () at _umtx_op.S:3
#1 0x0000000060050cd4 in _umtx_wait_uint_private (where=0x0, addr=<optimized out>, target_val=<optimized out>, tsz=<optimized out>, t=<optimized out>)
at /wrkdirs/usr/ports/emulators/qemu-user-static/work/qemu-bsd-user-495fb3a/bsd-user/freebsd/os-thread.c:258
#2 freebsd_lock_umutex (target_addr=4102556064, id=100867, ts=0x0, mode=<optimized out>) at /wrkdirs/usr/ports/emulators/qemu-user-static/work/qemu-bsd-user-495fb3a/bsd-user/freebsd/os-thread.c:890
#3 0x000000006004a808 in do_freebsd__umtx_op (obj=4102556064, op=<optimized out>, val=0, uaddr=0, target_time=0)
at /wrkdirs/usr/ports/emulators/qemu-user-static/work/qemu-bsd-user-495fb3a/bsd-user/freebsd/os-thread.h:359
#4 0x00000000600414d5 in do_freebsd_syscall (cpu_env=0x8607a4c58, num=454, arg1=<optimized out>, arg2=<optimized out>, arg3=<optimized out>, arg4=0, arg5=0, arg6=-185272152, arg7=0, arg8=0)
at /wrkdirs/usr/ports/emulators/qemu-user-static/work/qemu-bsd-user-495fb3a/bsd-user/syscall.c:1364
#5 0x0000000060038d03 in target_cpu_loop (env=0x8607a4c58) at /wrkdirs/usr/ports/emulators/qemu-user-static/work/qemu-bsd-user-495fb3a/bsd-user/arm/target_arch_cpu.h:207
#6 0x00000000600386a9 in cpu_loop (env=0xf48809bc) at /wrkdirs/usr/ports/emulators/qemu-user-static/work/qemu-bsd-user-495fb3a/bsd-user/main.c:121
#7 0x0000000060039922 in main (argc=-10608, argv=0x7fffffffd1d8) at /wrkdirs/usr/ports/emulators/qemu-user-static/work/qemu-bsd-user-495fb3a/bsd-user/main.c:513
(gdb) detach
Detaching from program: /usr/local/bin/qemu-arm-static, process 18703

Things started back up from there.

We will see if it hangs up again.

===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went
away in early 2018-Mar)
Mark Millard via freebsd-hackers
2018-10-27 23:33:03 UTC
[Some of this discussion occurred off list. The point here
is not specific to the hang that I originally reported.]
. . .
There are bugs in qemu that can cause such a deadlock; you can try these:
https://github.com/MikaelUrankar/qemu-bsd-user/commit/9424a5ffde4de2768ab6baa45fdbe0dbb56a7371
https://github.com/MikaelUrankar/qemu-bsd-user/commit/d6f65a7f07d280b6906d499d8e465d4d2026c52b

I'll try those later. Thanks. (I need to get back to sleep.)

It was interesting that attach/detach to the ld process
caused it to progress. The rest of the build completed
just fine. But that one spot consistently hung up before
trying gdb to look at the back trace.

Looking at the qemu code related to the 2nd patch: the
structure of the field copies (via __get_user) seems
very sensitive to the ABI rules for the target and
how things align and such, given that the structure
description and code are host code. __packed vs. not
is possibly not sufficient control to always make things
match right across all the potential combinations of
host and target from what I can see.

Lack of __packed may prove sufficient for my specific
context (amd64 host and armv7 target) but it seems
non-obvious what to do in general.

There would also seem to be big endian vs. little endian
issues on the individual __get_user styles of copies
when the host and target do not match for a multi-byte
numeric encoding.
Well, I get the following for:

#include "/usr/include/sys/event.h" // kevent
#include <stddef.h> // offsetof
#include <stdio.h> // printf

int
main(void)
{
    printf("%lu\n", (unsigned long) sizeof(struct kevent));
    printf("ident %lu\n", (unsigned long) offsetof(struct kevent, ident));
    printf("filter %lu\n", (unsigned long) offsetof(struct kevent, filter));
    printf("flags %lu\n", (unsigned long) offsetof(struct kevent, flags));
    printf("fflags %lu\n", (unsigned long) offsetof(struct kevent, fflags));
    printf("data %lu\n", (unsigned long) offsetof(struct kevent, data));
    printf("udata %lu\n", (unsigned long) offsetof(struct kevent, udata));
    printf("ext %lu\n", (unsigned long) offsetof(struct kevent, ext));
    return 0;
}

(This code avoided warnings for type mismatches with the
printf strings and such.)

amd64 native [host of qemu use] (comments hand added):

# ./a.out
64
ident 0
filter 8 // NOTE!
flags 10 // NOTE!
fflags 12 // NOTE!
data 16
udata 24
ext 32

(The above is not particularly important but I
include it for completeness.)

armv7 native [target in qemu use] (comments hand added):

# ./a.out
64 // NOTE vs. below!
ident 0
filter 4 // NOTE vs. above!
flags 6 // NOTE vs. above!
fflags 8 // NOTE vs. above!
data 16 // NOTE vs. below!
udata 24 // NOTE vs. below!
ext 32 // NOTE vs. below!

/usr/include/sys/event.h lacks __packed in both cases.

With __packed present in qemu-arm-static's source code
for target_freebsd_kevent, I used gdb on qemu-arm-static
to check the layout:

p/d sizeof(struct target_freebsd_kevent)
p/d &((struct target_freebsd_kevent *)0)->ident
p/d &((struct target_freebsd_kevent *)0)->filter
p/d &((struct target_freebsd_kevent *)0)->flags
p/d &((struct target_freebsd_kevent *)0)->fflags
p/d &((struct target_freebsd_kevent *)0)->data
p/d &((struct target_freebsd_kevent *)0)->udata
p/d &((struct target_freebsd_kevent *)0)->ext

This reports the same values as the 2nd patch's
problem-report material (56,0,4,6,8,12,20,24): not
even the right size.

I also confirm that removing __packed in qemu's
code and rebuilding and then checking with gdb
reported a match to the above armv7 native report
(64,0,4,6,8,16,24,32).

I have not verified __packed used vs. not for any
other combination of host and target platforms.



===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went
away in early 2018-Mar)
Mark Millard via freebsd-hackers
2018-10-28 00:30:03 UTC
[Just the __packed removal patch was sufficient to no longer
have the hang problem that I originally reported for the
print/texinfo build in poudriere.]
Removing the 2 examples of __packed, including the
1 for target_freebsd_kevent, as in Mikaël Urankar's
2nd listed patch, was sufficient to avoid the hang
that I originally reported. (Technically FreeBSD 11
is not involved and so one of the __packed removals
is not relevant to my example.)

I have not applied Mikaël Urankar's first listed
patch at all. It did not prove necessary for my
context.

Again: the only tested context is amd64 -> armv7
(host -> target) under a head -r339076 based
build. (So still 12.)

I'm doing a larger amd64 -> armv7 rebuild (around
210 ports overall) that originally included the
problematical hang and a full-bootstrap build
of lang/gcc8 (so extensive emulation use after
the clang-based stages). Prior to the patch,
all smaller attempts also hung at the same
place for print/texinfo.

But I'll only report if this larger test has
a problem.


===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went
away in early 2018-Mar)
Mark Millard via freebsd-hackers
2018-10-28 01:00:34 UTC
[The bigger test still hung up.]
The bigger test still hung up in the same old place.
A gdb attach/detach sequence against the qemu-arm-static
for the ld again let it continue from there.

Drat. But good to know.

===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went
away in early 2018-Mar)
Mark Millard via freebsd-hackers
2018-10-28 13:58:13 UTC
[I have a work around for the specific activity to avoid
the hang.]
Having lld use -Wl,--no-threads avoids the problem.

Without the option, lld for N "cpus" creates N
or so extra worker threads (besides the thread
for main) plus one more that does something
different. Having only the thread for main (and
possibly one more) avoids the hangups.

In my context, N==28 (Hyper-V) or N==32 (native
FreeBSD boot) was in use.

Also: The hangups when there were around N+2 threads
total only happened when lld was executed as
emulated code instead of as host-native code. Some
autoconfig activity does not use ${CC} or the like
and so some lld use ends up emulated even when most
of the clang/llvm activity in the poudriere bulk
run is host-native.


Side note:

The ports infrastructure does not use LINKER_TYPE
the way buildworld and buildkernel do, so I did
not use LDFLAGS.lld+=-Wl,--no-threads as I do
for buildworld and buildkernel. For now I'm using
LDFLAGS.clang+=-Wl,--no-threads with
LDFLAGS+=${LDFLAGS.${CHOSEN_COMPILER_TYPE}} in
order to select the option when lld is more likely
to be in use. I also avoid the LDFLAGS.clang
assignment for powerpc* families, because lld is
not used in that context (so far).

===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went
away in early 2018-Mar)
