Discussion:
embed endian info in locale data files magic (PR231965)
Yuri Pankov
2018-10-18 00:25:26 UTC
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=231965 ([PowerPC64]
Cross compiling powerpc64 from amd64 results in nonfunctional locale
installations) describes the issue with locale data built on an LE system
(amd64) when used on a BE system (powerpc).

The fix introduced in rS308170 depends on the locale data being built
on an LE system, and will likely fail when the data is built natively on
mips (please correct me if I'm wrong). Moreover, we shouldn't be hardcoding
the conversion in libc, and I see 2 options here:

1. fix localedef to output data in the target system's byte order
2. embed the endian info in locale data files (updating magic signature)
and enhance the previous fix with runtime selection of the needed
conversion
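
To make option 2 concrete, here is a minimal sketch of what a byte-order tag in the magic and a runtime conversion choice could look like. It is not the code from D17603: the magic prefix "DEMOLC1", the 'L'/'B' tag byte, and all demo_* names are hypothetical and only illustrate the idea.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical magic: a 7-byte prefix followed by a byte-order tag ('L' or 'B'). */
#define DEMO_MAGIC_PREFIX "DEMOLC1"
#define DEMO_MAGIC_LEN    8

/* Report the host byte order as 'L' or 'B' by looking at the first byte of 1. */
char
demo_host_tag(void)
{
	const uint32_t probe = 1;
	unsigned char first;

	memcpy(&first, &probe, 1);
	return (first == 1 ? 'L' : 'B');
}

/* Reverse the four bytes of a 32-bit value. */
uint32_t
demo_swap32(uint32_t v)
{
	return ((v >> 24) | ((v >> 8) & 0x0000ff00u) |
	    ((v << 8) & 0x00ff0000u) | (v << 24));
}

/* 0: data is already in host order, 1: swap on read, -1: magic not recognised. */
int
demo_needs_swap(const unsigned char *magic)
{
	unsigned char tag;

	assert(magic != NULL);
	if (memcmp(magic, DEMO_MAGIC_PREFIX, DEMO_MAGIC_LEN - 1) != 0)
		return (-1);
	tag = magic[DEMO_MAGIC_LEN - 1];
	if (tag != 'L' && tag != 'B')
		return (-1);
	return (tag != (unsigned char)demo_host_tag());
}

/* Load one 32-bit field from the file image, swapping only when required. */
uint32_t
demo_read32(const unsigned char *p, int needs_swap)
{
	uint32_t v;

	memcpy(&v, p, sizeof(v));
	return (needs_swap ? demo_swap32(v) : v);
}
```

The swap decision is made once per file when the magic is checked, but every multi-byte field still goes through the conditional in demo_read32() afterwards.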

I have put the change for #2 together at
https://reviews.freebsd.org/D17603 (more a PoC at the moment than a real
review), and I am wondering whether it looks sane enough or whether there
is anything obvious I'm missing.

I have briefly tested the libc and locale files (LC_CTYPE and
LC_COLLATE) built on amd64 on a powerpc system, and it seems to work.

TIA
Yuri Pankov
2018-10-18 01:58:58 UTC
Post by Yuri Pankov
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=231965 ([PowerPC64]
Cross compiling powerpc64 from amd64 results in nonfunctional locale
installations), describes the issue with locale data built on LE system
(amd64) when used on BE system (powerpc).
Fix introduced in rS308170 depends on the fact that locale data is built
on LE system, and will likely fail when it's built natively on mips
(please correct me if I'm wrong). More so, we shouldn't be hardcoding
the conversion in libc, and I see 2 options here:
1. fix localedef to output data in target's system endian
2. embed the endian info in locale data files (updating magic signature)
and enhance the previous fix with runtime selection of needed
conversion
Thinking more about this, or:

3. Always store the data in LE (or BE, doesn't matter), and
convert appropriately while reading. This will likely require the least
change.
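
A minimal sketch of what option 3 means for the reader side, assuming little-endian is picked as the on-disk order. The demo_* names are hypothetical; on FreeBSD the same decoding is available as le32dec() and le32toh() from <sys/endian.h>, and the sketch spells it out with shifts only to stay portable.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decode a 32-bit little-endian value byte by byte; correct on any host. */
uint32_t
demo_le32dec(const unsigned char *p)
{
	return ((uint32_t)p[0] | ((uint32_t)p[1] << 8) |
	    ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24));
}

/* Encode a 32-bit value as little-endian; this is what the writer would emit. */
void
demo_le32enc(unsigned char *p, uint32_t v)
{
	p[0] = (unsigned char)(v & 0xff);
	p[1] = (unsigned char)((v >> 8) & 0xff);
	p[2] = (unsigned char)((v >> 16) & 0xff);
	p[3] = (unsigned char)((v >> 24) & 0xff);
}

/* Convert a table of n little-endian words from the file into host order. */
void
demo_load_table(uint32_t *dst, const unsigned char *src, size_t n)
{
	size_t i;

	assert(dst != NULL && src != NULL);
	for (i = 0; i < n; i++)
		dst[i] = demo_le32dec(src + 4 * i);
}
```

On an LE host the decode is a plain load after optimisation; on a BE host it costs a byte swap per field, either at load time as in demo_load_table() or on every access if the data is used in place.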
Post by Yuri Pankov
I have put the change for #2 together at
https://reviews.freebsd.org/D17603 (more a PoC at the moment than a real
review), and wondering if it looks sane enough or if there's anything
obvious I'm missing.
I have briefly tested the libc and locale files (LC_CTYPE and
LC_COLLATE) built on amd64 on a powerpc system, and it seems to work.
TIA
Warner Losh
2018-10-18 04:21:05 UTC
Post by Yuri Pankov
Post by Yuri Pankov
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=231965 ([PowerPC64]
Cross compiling powerpc64 from amd64 results in nonfunctional locale
installations), describes the issue with locale data built on LE system
(amd64) when used on BE system (powerpc).
Fix introduced in rS308170 depends on the fact that locale data is built
on LE system, and will likely fail when it's built natively on mips
(please correct me if I'm wrong). More so, we shouldn't be hardcoding
the conversion in libc, and I see 2 options here:
1. fix localedef to output data in target's system endian
2. embed the endian info in locale data files (updating magic signature)
and enhance the previous fix with runtime selection of needed
conversion
3. Always store the data in LE (or BE, doesn't matter), and
appropriately convert while reading. This will likely require least change.
I like this.

Warner
Poul-Henning Kamp
2018-10-18 06:16:25 UTC
--------
Post by Warner Losh
Post by Yuri Pankov
3. Always store the data in LE (or BE, doesn't matter), and
appropriately convert while reading. This will likely require least change.
I like this.
The locale stuff can have a surprisingly large performance impact, so
it might be a good idea to run some benchmarks first.
--
Poul-Henning Kamp | UNIX since Zilog Zeus 3.20
***@FreeBSD.ORG | TCP/IP since RFC 956
FreeBSD committer | BSD since 4.3-tahoe
Never attribute to malice what can adequately be explained by incompetence.
Yuri Pankov
2018-10-18 20:52:10 UTC
Post by Poul-Henning Kamp
--------
Post by Warner Losh
Post by Yuri Pankov
3. Always store the data in LE (or BE, doesn't matter), and
appropriately convert while reading. This will likely require least change.
I like this.
The locale stuff can have a surprisingly large performance impact, so
it might be a good idea to run some benchmarks first.
Thanks for the answers, and especially for mentioning the magic word,
"performance", which I seem to keep forgetting about. I have
updated the change to skip any runtime conversion and instead make
localedef output data in the target system's byte order.
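
Roughly, the writer side of that approach could look like the sketch below. It is not taken from the review: the demo_order enum and demo_put32() are hypothetical names, and how localedef would learn the target byte order (build option, command-line flag, or something else) is left open here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical selector for the byte order of the system the data is built for. */
enum demo_order { DEMO_LE, DEMO_BE };

/* Serialise one 32-bit field in the target's byte order, whatever the build host is. */
void
demo_put32(unsigned char *out, uint32_t v, enum demo_order target)
{
	int i;

	assert(out != NULL);
	for (i = 0; i < 4; i++) {
		/* LE puts the low byte first, BE puts the high byte first. */
		int shift = (target == DEMO_LE) ? 8 * i : 8 * (3 - i);

		out[i] = (unsigned char)((v >> shift) & 0xff);
	}
}
```

With the conversion done once at build time, libc on the target can keep reading the fields natively with no per-access cost, which is the point of moving the work into localedef.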

If anyone wants to take a look, it's still at:

https://reviews.freebsd.org/D17603

Baptiste Daroussin
2018-10-18 07:57:29 UTC
Post by Yuri Pankov
Post by Yuri Pankov
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=231965 ([PowerPC64]
Cross compiling powerpc64 from amd64 results in nonfunctional locale
installations), describes the issue with locale data built on LE system
(amd64) when used on BE system (powerpc).
Fix introduced in rS308170 depends on the fact that locale data is built
on LE system, and will likely fail when it's built natively on mips
(please correct me if I'm wrong). More so, we shouldn't be hardcoding
the conversion in libc, and I see 2 options here:
1. fix localedef to output data in target's system endian
2. embed the endian info in locale data files (updating magic signature)
and enhance the previous fix with runtime selection of needed
conversion
3. Always store the data in LE (or BE, doesn't matter), and
appropriately convert while reading. This will likely require least change.
I agree with that approach; some benchmarking would be needed to make sure
we are not killing performance.

Best regards,
Bapt
Brooks Davis
2018-10-18 17:14:50 UTC
Post by Baptiste Daroussin
Post by Yuri Pankov
Post by Yuri Pankov
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=231965 ([PowerPC64]
Cross compiling powerpc64 from amd64 results in nonfunctional locale
installations), describes the issue with locale data built on LE system
(amd64) when used on BE system (powerpc).
Fix introduced in rS308170 depends on the fact that locale data is built
on LE system, and will likely fail when it's built natively on mips
(please correct me if I'm wrong). More so, we shouldn't be hardcoding
the conversion in libc, and I see 2 options here:
1. fix localedef to output data in target's system endian
2. embed the endian info in locale data files (updating magic signature)
and enhance the previous fix with runtime selection of needed
conversion
3. Always store the data in LE (or BE, doesn't matter), and
appropriately convert while reading. This will likely require least change.
I agree on that approach, just some benchmarking would be needed to make sure
we are not killing the performance
While benchmarking is not a bad idea, we should probably focus on modern
power systems. None of the other BE systems are all that important IMO
(I say this as someone who uses BE mips64 daily).

-- Brooks