Discussion:
FreeBSD 11.x thread creation time is 9000+ microseconds on Intel Xeon Gold series CPU
Steevan Rodrigues
2018-11-06 04:01:30 UTC
Hi,
I am seeing poor performance on FreeBSD 11.x.
The CPU is an Intel(R) Xeon(R) Gold 5115 CPU @ 2.40GHz (dual socket, 10
cores per socket).

I have attached a simple program which creates a thread and measures the
time taken to create it. On this CPU with FreeBSD 11.x it takes 9000
to 15000 microseconds (us) to create just one thread.

On other platforms thread creation usually takes only 20 to 30 us.
Any idea why it takes so much longer with FreeBSD 11.x?
Is there any processor-specific tuning that needs to be done?

Thanks,
Steevan
Václav Haisman
2018-11-06 06:46:33 UTC
Post by Steevan Rodrigues
I have attached a simple program which creates thread and computes time
There is no attachment. The mailing list software probably stripped that.
_______________________________________________
https://lists.freebsd.org/mailman/listinfo/freebsd-hackers
--
VH
Steevan Rodrigues
2018-11-06 09:01:09 UTC
OK, sorry about that. I have attached the test program again, this time as a
tar file, though I am not sure whether the mailing list software will strip
that too.

Anyway, the program is given below.

#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <pthread.h>
#include <sys/time.h>
#include <sys/param.h>
#include <sys/cpuset.h>

#define CPU_NUM 5
#define CPUBIND

void *myThreadFun(void *vargp)
{
    (void)vargp;
    sleep(1);
    printf("Printing from demo for thread creation time\n");
    return NULL;
}

int main(void)
{
    pthread_t thread_id;
    struct timeval tv, tv2;
    pthread_attr_t attr;
    long sec_diff, usec_diff;
#ifdef CPUBIND
    cpuset_t cset;
    cpusetid_t setid;
    const pid_t pid = getpid();

    /* Bind the whole process to CPU_NUM. */
    CPU_ZERO(&cset);
    CPU_SET(CPU_NUM, &cset);
    cpuset(&setid);
    /*
     * Alternatives that were tried:
     * sched_setaffinity(pid, sizeof(cpu_set_t), &cset);
     * cpuset_setaffinity(CPU_LEVEL_CPUSET, CPU_WHICH_CPUSET, setid,
     *     sizeof(cpuset_t), &cset);
     */
    cpuset_setaffinity(CPU_LEVEL_WHICH, CPU_WHICH_PID, pid,
        sizeof(cpuset_t), &cset);
#endif

    pthread_attr_init(&attr);
    pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_JOINABLE);

    gettimeofday(&tv, NULL);   /* take time before creating thread */
    if (pthread_create(&thread_id, &attr, myThreadFun, NULL) != 0) {
        fprintf(stderr, "pthread_create failed\n");
        return 1;
    }
    gettimeofday(&tv2, NULL);  /* take time after creation of thread */

    pthread_join(thread_id, NULL);
    pthread_attr_destroy(&attr);

    /* Borrow from the seconds field so the usec difference is never negative. */
    sec_diff = (long)(tv2.tv_sec - tv.tv_sec);
    usec_diff = (long)(tv2.tv_usec - tv.tv_usec);
    if (usec_diff < 0) {
        sec_diff--;
        usec_diff += 1000000;
    }

    printf("Thread creation time details are:\n");
    printf("Before: Sec %ld usec %ld After: Sec %ld usec %ld : "
        "SecDiff %ld UsecDiff %ld\n",
        (long)tv.tv_sec, (long)tv.tv_usec,
        (long)tv2.tv_sec, (long)tv2.tv_usec, sec_diff, usec_diff);

    return 0;
}

---------------------------------------------

Thanks
Steevan
Guido Falsi
2018-11-06 09:47:26 UTC
Post by Steevan Rodrigues
Ok. Sorry about that. Again I have attached that test program as a tar
file. I am not sure whether even this gets stripped by mailing list
software.
Anyway here is the program given below.
I'm not a kernel guy or an expert in such things, but I see that
flamegraphs are often used to analyze such issues. They give an idea of
where the kernel is actually spending its time.

There's also a port to help with this:

https://www.freshports.org/benchmarks/flamegraph/

I think if you could gather such information and make it available you'd
make it easier for experts to explain what is happening and maybe also
do something about this.
--
Guido Falsi <***@madpilot.net>
Steevan Rodrigues
2018-11-06 11:05:20 UTC
Thanks to all of you for the quick responses.

I see this issue only on one particular server.
I do not see it on another server with a slightly older processor, an
Intel(R) Xeon(R) CPU E5-2620 v2 @ 2.10GHz (dual socket, 6 cores per socket).
On that system the same program takes only about 800 usec.

I wonder whether it is a problem with the FreeBSD 11.x release on this
particular processor.
I came across two results (given below) from the Phoronix OSbench test suite
which also show poor thread creation time for FreeBSD 11.x:

https://openbenchmarking.org/result/1804094-AR-1804096AR58
https://openbenchmarking.org/result/1807114-RA-WINDOWSSE47

In these results, look for the "Create Threads" row.

Thanks
Steevan
Stefan Esser
2018-11-06 12:48:12 UTC
Post by Steevan Rodrigues
Thanks to all of you for the quick responses.
I see this issue only on a particular server.
I do not see this issue on another server with slightly older processor
In this system the same program takes only about 800 usec
I wonder whether it is problem with 11.x FreeBSD release on this particular
processor.
I came across two results (given below) from Phoronix OSbench test suite
which also show poor thread creation time for FreeBSD 11.x
https://openbenchmarking.org/result/1804094-AR-1804096AR58
https://openbenchmarking.org/result/1807114-RA-WINDOWSSE47
In this look for "Create Threads" row.
This list shows that FreeBSD-12 is fast; only FreeBSD-11 shows the long
thread creation time (16000 vs 36 in whatever units), even though FreeBSD-12
was running with the KPTI patches.

This makes me think that the issue that caused these delays has been fixed,
but not yet merged to FreeBSD-11.

I guess that the cause of the delays is the synchronization IPIs between
sockets, and that their latency might be very high due to some cores being
in deep sleep modes (with correspondingly long wake-up delays).

Regards, Stefan
Steevan Rodrigues
2018-11-19 12:47:05 UTC
I tried FreeBSD 12 Beta 3 on this server with the Xeon Gold 5115 CPU.
All these problems have disappeared. Thread creation time has improved
greatly: it is now below 100 usec (versus 9000+ usec on FreeBSD 11.x).

I also had another issue with FreeBSD 11.x, related to contigmalloc
and contigfree: contigfree was taking too much time on FreeBSD 11.x on this
same server with the Xeon Gold 5115 CPU.

In FreeBSD 12 Beta 3, contigfree still takes much longer than contigmalloc.
However, compared with the FreeBSD 11.x numbers, I see a huge
improvement in FreeBSD 12 Beta 3.
Because of this contigfree issue, my driver unload used to take 5 to 20
minutes on FreeBSD 11.x.
Now my driver takes only a few seconds to load and a few seconds to unload
on FreeBSD 12 Beta 3.

Hence it looks like the problem is with FreeBSD 11.x.
I am still waiting for the final release of FreeBSD 12.0.

Regards,
Steevan
Eugene Grosbein
2018-11-06 13:43:13 UTC
Post by Steevan Rodrigues
Hi ,
I am seeing a FreeBSD 11.x OS poor performance issue .
cores per socket )
I have attached a simple program which creates thread and computes time
taken to create this thread. On this CPU with FreeBSD 11.x OS it takes 9000
to 15000 micro seconds ( us) to create
just one thread.
On other platforms this thread creation time is usually 20 to 30 us only.
Any idea why it takes so much more time with FreeBSD 11.x ?
Is there any processor specific tuning that needs to be done ?
Please set "sysctl kern.eventtimer.periodic=1" and retry your tests.
Eugene Grosbein
2018-11-06 13:48:21 UTC
Post by Eugene Grosbein
Please set "sysctl kern.eventtimer.periodic=1" and retry your tests.
Also "sysctl kern.timecounter.alloweddeviation=0" may help, too.