Discussion:
Celeron J3160 with enabled Turbo mode stays at 480MHz (lowest setting) forever and can not lower frequency without Turbo mode
Cy Schubert
2018-09-05 14:51:28 UTC
I don't think you need something accurate.

We don't know whether it is implemented through ACPI or in a way similar to the old turbo jumper on the MB, which increased the clock rate and sometimes the voltage (required to maintain stability when increasing the clock rate). We don't know how your MB manufacturer implemented this.

My guess is it's probably through ACPI but you don't know until you have some rough measurements.
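
For a rough measurement of this kind, timing a fixed amount of CPU-bound work is usually enough. A minimal sketch in Python follows; the function name `rough_benchmark` and the loop size are illustrative and not from the thread. Run it once with Turbo enabled and once with Turbo disabled, then compare the times.

```python
import time


def rough_benchmark(iterations=2_000_000):
    """Time a fixed amount of integer work.

    Returns (elapsed_seconds, checksum). The checksum only exists so the
    loop has a result that depends on every iteration.
    """
    start = time.perf_counter()
    acc = 0
    for i in range(iterations):
        acc = (acc * 31 + i) & 0xFFFFFFFF  # cheap, purely CPU-bound work
    elapsed = time.perf_counter() - start
    return elapsed, acc


if __name__ == "__main__":
    # Take the best of a few runs to reduce noise from other activity.
    best = min(rough_benchmark()[0] for _ in range(3))
    print(f"best of 3: {best:.3f} s")
```

Because the work is fixed, the elapsed time should scale roughly inversely with the effective clock rate, which is all that is needed to tell 1.6GHz from a much lower or higher frequency.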

---
Sent using a tiny phone keyboard.
Apologies for any typos and autocorrect.
Also, this old phone only supports top post. Apologies.

Cy Schubert
<***@cschubert.com> or <***@freebsd.org>
The need of the many outweighs the greed of the few.
---

-----Original Message-----
From: Lev Serebryakov
Sent: 05/09/2018 06:46
To: Cy Schubert; Eric van Gyzen
Cc: FreeBSD Current; freebsd-***@freebsd.org
Subject: Re: Celeron J3160 with enabled Turbo mode stays at 480MHz (lowest setting) forever and can not lower frequency without Turbo mode
1601 is not the actual frequency. That is just how it is reported. It
is almost certainly running much higher than 1601.
We don't know this until we can independently verify it. Do you mind
running some benchmarks with and without turbo mode?
What would be adequate benchmarks for this? Something like "openssl
speed aes128-cbc", or do I need a more specific one?
--
// Lev Serebryakov
Lev Serebryakov
2018-09-05 15:43:04 UTC
Post by Cy Schubert
I don't think you need something accurate.
1.6GHz and 2.48GHz... Maybe. I'm trying now.
Post by Cy Schubert
We don't know whether it is implemented through ACPI or similar to the
old turbo jumper on the MB, which increased the clock rate and sometimes
the voltage (required to maintain stability when increasing the clock
rate). We don't know how your MB manufacturer implemented this.
I thought it could be implemented in only one official way, as it is
Intel's official technology and not the motherboard's.
--
// Lev Serebryakov
Lev Serebryakov
2018-09-05 16:27:06 UTC
Post by Cy Schubert
I don't think you need something accurate.
OK, here are the results. I'm working in single-user mode.

TL;DR: "Turbo" mode makes "openssl" much slower (x3.5)!

I can not properly interpret this result.

But "turbostat" properly detects Turbo/No-Turbo mode, so it is not a
mistake in the BIOS.

(1) Turbo ENABLED, powerd IS NOT started

dev.cpu.0.freq=480 no matter what.

turbostat shows DIFFERENT speeds, like this (I've removed IRQ-related
fields to fit in one line):

Package Core CPU Avg_MHz Busy% Bzy_MHz TSC_MHz CPU%c1 CPU%c3 CPU%c6
-       -    -         2  0.30    1359    1600   6.18   0.00  93.52 31
0       0    0         2  0.36     863    1600   0.08   0.00  99.56 31
0       1    1         4  0.47    1462    1600  24.56   0.00  74.96 31
1       0    2         2  0.22    1670    1600   0.05   0.00  99.72 29
1       1    3         2  0.14    1792    1600   0.02   0.00  99.84 29

"openssl speed aes-256-cbc":

type          16 bytes   64 bytes  256 bytes 1024 bytes 8192 bytes
aes-256 cbc   5653.98k   6159.49k   6356.31k  17271.70k  17517.23k

(2) Turbo ENABLED, powerd IS started

dev.cpu.0.freq shows different values, from 60 at idle to 1601 under load.

turbostat shows the same values, but at idle Bzy_MHz drops low.

openssl gives the same results:

type          16 bytes   64 bytes  256 bytes 1024 bytes 8192 bytes
aes-256 cbc   5580.78k   6155.97k   6349.23k  17273.51k  17511.77k

(3) Turbo DISABLED, powerd IS NOT started

dev.cpu.0.freq=1600 no matter what.

turbostat shows higher numbers:

Package Core CPU Avg_MHz Busy% Bzy_MHz TSC_MHz CPU%c1 CPU%c3 CPU%c6
-       -    -         2  0.21    1807    1600   1.44   0.00  98.35 38
0       0    0         3  0.28    1764    1600   0.06   0.00  99.66 38
0       1    1         3  0.24    2052    1600   1.72   0.00  98.03 38
1       0    2         1  0.09    1629    1600   0.02   0.00  99.89 36
1       1    3         2  0.22    1664    1600   3.94   0.00  95.84 36

"openssl speed aes-256-cbc":

type          16 bytes   64 bytes  256 bytes 1024 bytes 8192 bytes
aes-256 cbc  18939.95k  20638.71k  21281.24k  57836.36k  58736.39k

(3.5 times faster than with Turbo ENABLED!)
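
The ratio can be checked directly from the numbers posted for cases (1) and (3). The short Python sketch below hard-codes those figures; the variable and function names are only illustrative. For every block size the ratio comes out at roughly x3.35, close to the x3.5 quoted above.

```python
# Throughput in thousands of bytes per second, copied from the
# "openssl speed aes-256-cbc" output of cases (1) and (3) in this post
# (powerd not started in either case).
BLOCK_SIZES = [16, 64, 256, 1024, 8192]
TURBO_ENABLED = [5653.98, 6159.49, 6356.31, 17271.70, 17517.23]
TURBO_DISABLED = [18939.95, 20638.71, 21281.24, 57836.36, 58736.39]


def speedups(slow, fast):
    """Per-block-size ratio fast/slow."""
    return [f / s for s, f in zip(slow, fast)]


if __name__ == "__main__":
    for size, ratio in zip(BLOCK_SIZES, speedups(TURBO_ENABLED, TURBO_DISABLED)):
        print(f"{size:5d} bytes: x{ratio:.2f}")
```

That the ratio is nearly identical across block sizes suggests a uniform slowdown, such as a lower effective clock, rather than something specific to one code path; that reading is an inference, not something measured here.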

(4) Turbo DISABLED, powerd IS started

dev.cpu.0.freq shows different values, from 60 at idle to 1600 under load.

turbostat shows very low Bzy_MHz at idle, but high (suspiciously high)
under load:

Package Core CPU Avg_MHz Busy% Bzy_MHz TSC_MHz CPU%c1 CPU%c3 CPU%c6
-       -    -      1475 92.22    2666    1600   1.66   0.00   6.12 41
0       0    0      1475 92.22    2666    1600   1.62   0.00   6.16 41
0       1    1      1475 92.21    2666    1600   1.41   0.00   6.38 41
1       0    2      1475 92.21    2666    1600   1.78   0.00   6.01 38
1       1    3      1476 92.24    2666    1600   1.84   0.00   5.92 38
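
One way to sanity-check the suspicious Bzy_MHz value is to use the other two columns. As turbostat describes them, Avg_MHz is cycles executed over the whole interval, Busy% is the share of the interval spent in C0, and Bzy_MHz is the average clock while busy, so Avg_MHz should be about Busy% x Bzy_MHz / 100. The sketch below applies that to the summary row above; the function name is illustrative.

```python
def implied_busy_mhz(avg_mhz, busy_percent):
    """Clock rate while busy that Avg_MHz and Busy% together imply."""
    return avg_mhz / (busy_percent / 100.0)


# Summary row of the loaded turbostat table in case (4).
AVG_MHZ, BUSY_PCT, BZY_MHZ, TSC_MHZ = 1475, 92.22, 2666, 1600

if __name__ == "__main__":
    implied = implied_busy_mhz(AVG_MHZ, BUSY_PCT)
    print(f"implied busy clock : {implied:.0f} MHz")
    print(f"reported Bzy_MHz   : {BZY_MHZ} MHz")
    print(f"reported / implied : {BZY_MHZ / implied:.3f}")
```

The implied busy clock is about 1599 MHz, which matches TSC_MHz and not the reported 2666. The reported value is about 1.67 times larger, so the Bzy_MHz column may be scaled by a wrong base clock on this CPU. That is only a guess from the arithmetic.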


openssl is almost the same

type 16 bytes 64 bytes 256 bytes 1024 bytes 8192 bytes
aes-256 cbc 16277.18k 20620.71k 21272.10k 57998.35k 58687.83k
--
// Lev Serebryakov
Benjamin Kaduk
2018-09-05 22:32:46 UTC
Post by Lev Serebryakov
Post by Cy Schubert
I don't think you need something accurate.
OK, here are the results. I'm working in single-user mode.
TL;DR: "Turbo" mode makes "openssl" much slower (x3.5)!
I can not properly interpret this result.
You need to say more about what openssl is doing (i.e., how it was
configured, what architecture it's on, etc.). In particular, there
was for a time an AVX2 implementation for some primitives, that ended up
being a net loss, since heavy use of those instructions would cause
overheating and throttling. OpenSSL has a lot of custom assembly for these
common primitives, with some logic to select among them both at
configuration time and at runtime, so results such as this may or may not
be widely transferable.

-Ben
Lev Serebryakov
2018-09-06 00:02:29 UTC
Hello Benjamin,
It is the system (very fresh ALPHA4) openssl, built with default settings.
A simple single run with one thread, without AES-NI:

openssl speed aes-256-cbc

It is as simple as that.
--
Best regards,
Lev mailto:***@FreeBSD.org
Benjamin Kaduk
2018-09-06 01:15:36 UTC
Okay, "system openssl" and the FreeBSD version is enough to nail down the
code and configuration, and I see the processor type is in the subject
line. I guess posting the CPU features bits from dmesg might save whoever
tries to track down the codepaths being used some time (unless that was
posted already and I missed it?).

-Ben
Lev Serebryakov
2018-09-06 11:32:29 UTC
Post by Benjamin Kaduk
Okay, "system openssl" and the FreeBSD version is enough to nail down the
code and configuration, and I see the processor type is in the subject
line. I guess posting the CPU features bits from dmesg might save whoever
tries to track down the codepaths being used some time (unless that was
posted already and I missed it?).
I'll post it tonight, but I don't think it is very openssl-specific or
thermal throttling. I've monitored temperatures, of course, and
monitored frequencies with turbostat. With Turbo enabled, frequencies jumped
wildly and were lower than with Turbo disabled. And anyway, even the
frequency jumps were not large enough to explain the x3.5 difference.

Another thing that puzzles me is that with Turbo disabled (!) I see
frequencies of 2666MHz according to turbostat, which seems impossible, as
that is higher than the official Turbo frequency (!). I don't know how to
explain this. Maybe turbostat fails?
--
// Lev Serebryakov
Lev Serebryakov
2018-09-06 13:41:27 UTC
Post by Benjamin Kaduk
Okay, "system openssl" and the FreeBSD version is enough to nail down the
code and configuration, and I see the processor type is in the subject
line. I guess posting the CPU features bits from dmesg might save whoever
tries to track down the codepaths being used some time (unless that was
posted already and I missed it?).
CPU: Intel(R) Celeron(R) CPU J3160 @ 1.60GHz (1600.05-MHz K8-class CPU)
  Origin="GenuineIntel" Id=0x406c4 Family=0x6 Model=0x4c Stepping=4
  Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
  Features2=0x43d8e3bf<SSE3,PCLMULQDQ,DTES64,MON,DS_CPL,VMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,SSE4.1,SSE4.2,MOVBE,POPCNT,TSCDLT,AESNI,RDRAND>
  AMD Features=0x28100800<SYSCALL,NX,RDTSCP,LM>
  AMD Features2=0x101<LAHF,Prefetch>
  Structured Extended Features=0x2282<TSCADJ,SMEP,ERMS,NFPUSG>
  VT-x: PAT,HLT,MTF,PAUSE,EPT,UG,VPID
  TSC: P-state invariant, performance statistics
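
To see which instruction-set extensions matter for the openssl code paths, the feature lines can be parsed mechanically. The Python sketch below is illustrative only: `parse_flags` is a made-up helper, and the bit positions used (CPUID leaf 1 ECX bit 25 for AES-NI and bit 28 for AVX, leaf 7 EBX bit 5 for AVX2) are taken from Intel's CPUID documentation, not from this thread.

```python
import re

# Lines copied from the dmesg output above.
FEATURES2_LINE = (
    "Features2=0x43d8e3bf<SSE3,PCLMULQDQ,DTES64,MON,DS_CPL,VMX,EST,TM2,"
    "SSSE3,CX16,xTPR,PDCM,SSE4.1,SSE4.2,MOVBE,POPCNT,TSCDLT,AESNI,RDRAND>"
)
STRUCT_EXT_LINE = "Structured Extended Features=0x2282<TSCADJ,SMEP,ERMS,NFPUSG>"


def parse_flags(line):
    """Return (mask, set of flag names) from a 'Name=0x...<A,B,C>' line."""
    m = re.search(r"=0x([0-9a-fA-F]+)<([^>]*)>", line)
    if m is None:
        raise ValueError(f"not a feature line: {line!r}")
    return int(m.group(1), 16), set(m.group(2).split(","))


def bit(mask, n):
    """True if bit n of mask is set."""
    return bool((mask >> n) & 1)


if __name__ == "__main__":
    ecx, names = parse_flags(FEATURES2_LINE)
    ebx7, ext_names = parse_flags(STRUCT_EXT_LINE)
    print("AES-NI:", "AESNI" in names, bit(ecx, 25))
    print("AVX   :", "AVX" in names, bit(ecx, 28))
    print("AVX2  :", "AVX2" in ext_names, bit(ebx7, 5))
```

By both the names and the raw masks, this CPU reports AES-NI but neither AVX nor AVX2, so an AVX2 code path does not look like a candidate for the slowdown here.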

And here are two outputs of turbostat with additional settings:

turbostat, Turbo DISABLED:

CPUID(0): GenuineIntel 11 CPUID levels; family:model:stepping 0x6:4c:4 (6:76:4)
CPUID(1): SSE3 MONITOR - EIST TM2 TSC MSR ACPI-TM TM
CPUID(6): APERF, No-TURBO, DTS, No-PTM, No-HWP, No-HWPnotify, No-HWPwindow, No-HWPepp, No-HWPpkg, EPB
cpu3: MSR_IA32_MISC_ENABLE: 0x4000850089 (TCC EIST No-MWAIT PREFETCH No-TURBO)
CPUID(7): No-SGX
cpu3: MSR_PLATFORM_INFO: 0x60002001400
6 * 133.3 = 800.0 MHz max efficiency frequency
20 * 133.3 = 2666.6 MHz base frequency
cpu3: MSR_IA32_POWER_CTL: 0x00000000 (C1E auto-promotion: DISabled)
cpu3: MSR_TURBO_RATIO_LIMIT: 0x00000000
cpu3: MSR_PKG_CST_CONFIG_CONTROL: 0x0014000f (UNlocked: pkg-cstate-limit=15: unknown)
NSFOD /sys/devices/system/cpu/cpu3/cpufreq/scaling_driver
cpu0: MSR_IA32_ENERGY_PERF_BIAS: 0x00000000 (performance)
cpu2: MSR_IA32_ENERGY_PERF_BIAS: 0x00000000 (performance)
cpu0: MSR_IA32_TEMPERATURE_TARGET: 0x005a0000 (90 C)
cpu2: MSR_IA32_TEMPERATURE_TARGET: 0x005a0000 (90 C)

turbostat, Turbo ENABLED:

CPUID(0): GenuineIntel 11 CPUID levels; family:model:stepping 0x6:4c:4 (6:76:4)
CPUID(1): SSE3 MONITOR - EIST TM2 TSC MSR ACPI-TM TM
CPUID(6): APERF, TURBO, DTS, No-PTM, No-HWP, No-HWPnotify, No-HWPwindow, No-HWPepp, No-HWPpkg, EPB
cpu1: MSR_IA32_MISC_ENABLE: 0x00850089 (TCC EIST No-MWAIT PREFETCH TURBO)
CPUID(7): No-SGX
cpu1: MSR_PLATFORM_INFO: 0x60002001400
6 * 133.3 = 800.0 MHz max efficiency frequency
20 * 133.3 = 2666.6 MHz base frequency
cpu1: MSR_IA32_POWER_CTL: 0x00000000 (C1E auto-promotion: DISabled)
cpu1: MSR_TURBO_RATIO_LIMIT: 0x00000000
cpu1: MSR_PKG_CST_CONFIG_CONTROL: 0x0014000f (UNlocked: pkg-cstate-limit=15: unknown)
NSFOD /sys/devices/system/cpu/cpu1/cpufreq/scaling_driver
cpu0: MSR_IA32_ENERGY_PERF_BIAS: 0x00000000 (performance)
cpu2: MSR_IA32_ENERGY_PERF_BIAS: 0x00000000 (performance)
cpu0: MSR_IA32_TEMPERATURE_TARGET: 0x005a0000 (90 C)
cpu2: MSR_IA32_TEMPERATURE_TARGET: 0x005a0000 (90 C)
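
The raw MSR values in both dumps can be decoded by hand. The Python sketch below does that under stated assumptions: the bit layouts (MSR_PLATFORM_INFO bits 15:8 and 47:40 for the ratios, IA32_MISC_ENABLE bit 38 for turbo disable, bits 23:16 of the temperature target) follow Intel's documentation as I understand it, and the helper names are made up for illustration.

```python
ASSUMED_BCLK_MHZ = 133.33   # bus clock the turbostat output multiplies by
DMESG_BASE_MHZ = 1600.05    # CPU frequency reported by dmesg


def platform_info_ratios(msr):
    """Return (max efficiency ratio, base ratio) from MSR_PLATFORM_INFO."""
    base_ratio = (msr >> 8) & 0xFF          # bits 15:8
    efficiency_ratio = (msr >> 40) & 0xFF   # bits 47:40
    return efficiency_ratio, base_ratio


def turbo_disabled(misc_enable):
    """Bit 38 of IA32_MISC_ENABLE set means turbo is disabled."""
    return bool((misc_enable >> 38) & 1)


def temperature_target_celsius(msr):
    """Bits 23:16 of MSR_IA32_TEMPERATURE_TARGET, in degrees Celsius."""
    return (msr >> 16) & 0xFF


if __name__ == "__main__":
    eff, base = platform_info_ratios(0x60002001400)
    print(f"ratios: efficiency={eff}, base={base}")
    print(f"base at assumed bus clock: {base * ASSUMED_BCLK_MHZ:.1f} MHz")
    print(f"bus clock implied by dmesg: {DMESG_BASE_MHZ / base:.1f} MHz")
    print("turbo disabled:", turbo_disabled(0x4000850089), turbo_disabled(0x00850089))
    print("temperature target:", temperature_target_celsius(0x005A0000), "C")
```

The two dumps differ only in bit 38 of MSR_IA32_MISC_ENABLE, which matches the BIOS setting. The base ratio is 20 in both; with the 1600.05 MHz from dmesg that implies a bus clock of about 80 MHz, not 133.3 MHz. If that is right, the "2666.6 MHz base frequency" line and the 2666 MHz Bzy_MHz readings could be an artifact of the assumed bus clock and not a real frequency. Nothing in this thread confirms that, so treat it as a hypothesis to check.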
--
// Lev Serebryakov