Double Tong
2018-08-16 16:28:33 UTC
Hello,
I want to create a thread with its own file descriptor table to get better performance with kevent(2). On Linux I used the unshare(2) syscall for this; as far as I know, FreeBSD has no equivalent or similar syscall.
I have posted on the FreeBSD forums (https://forums.freebsd.org/threads/create-a-thread-with-a-separate-file-descriptor-table-set-rffdg-flag.67143/), and my understanding is now the following: rfork_thread(3) is deprecated in favor of pthread_create(3). rfork_thread(3) is written in assembly language to perform the stack switch, so if it is ever removed from the build, any program that relies on it loses portability.
With that in mind, I still have the following questions:
1. Is there an elegant way to create a thread with a separate file descriptor table?
2. If your suggestion is to use rfork_thread(3), that is the direction I am already working in. I use waitpid to join these "threads", and the thread exits in the middle of execution with status 0x8b collected by waitpid. I guess this status means an invalid page access. I wrote a tiny program (attached below) that mirrors the code in my real program; I would appreciate it if someone could take a look and tell me whether I am doing anything incorrectly.
3. From reading the code of pthread_create, it appears to allocate a pthread struct on top of the thread and then call clone, and FreeBSD implements its own version of clone that actually calls rfork (I could not find the source of FreeBSD's clone; can someone provide a link?). So I believe that, in theory, there should be a way to achieve this in user space. And if I do not use any pthread-related APIs, the missing pthread struct should be fine as well?
4. On Linux, after calling unshare(CLONE_FILES), I got a performance increase of around 10% with 1000 concurrent TCP connections. My supervisor instructed me to implement this, and I do not have many details about why the performance increases. Would this also work on FreeBSD (for kevent calls)?
Thank you in advance for any help or comments!
my_rfork_test.cc:
--------------------------
// This program runs well, except status code is non zero. In my bigger program, it terminates in the middle of the routine with status 0x8b
#include <iostream>
#include <unistd.h>
#include <sys/wait.h>
#include <sys/mman.h>
#include <vector>
const size_t VEC_SIZE = 1000;
static int thread_routine(void* arg) {
    std::cout<<"init thread "<<arg<<std::endl;
    // the problem seems related to some memory allocation
    std::vector<int>** vectors = new std::vector<int>*[VEC_SIZE];
    for(size_t i = 0; i < VEC_SIZE; i++) {
        vectors[i] = new std::vector<int>(10000);
        std::cout<<"vec "<<i<<" initialized"<<std::endl;
    }
}
int main() {
    const int STACK_SIZE = 8000000;
    //void* stackaddr = malloc(STACK_SIZE); // should also work
    void* stackaddr = mmap(NULL, STACK_SIZE, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_STACK, -1, 0);
    std::cout<<"stackaddr: 0x"<<std::hex<<stackaddr<<std::endl;
    void* stacktop = (char*)stackaddr + STACK_SIZE; // assuming stack going downwards
    pid_t child = 0;
    child = rfork_thread(RFFDG|RFPROC|RFMEM|RFSIGSHARE,stacktop,&thread_routine, reinterpret_cast<void*>(2));
    int status = 0;
    waitpid(child, &status, 0x0);
    std::cout<<"return status 0x"<<std::hex<<status<<std::endl; // should return 0? but usually not
}
--------------------------
Best regards,
Shuangyi Tong