Using libhugetlbfs to transparently back glibc malloc calls in a multi-threaded application
1 vote · 0 answers · 418 views
I'm trying to back the memory allocations of a multi-threaded application with 1GiB hugepages using libhugetlbfs. However, only the main thread's allocations are assigned hugepages. If I restrict the maximum number of glibc malloc arenas to 1, the allocations of all threads are backed by hugepages. This is not ideal, because of the contention introduced by all threads concurrently accessing a single arena.
Is there any way to transparently force all threads to use huge pages by means of libhugetlbfs?
**Note**: I'm aware of transparent huge pages (THP). However, allocations smaller than 1GiB are not automatically assigned hugepages; smaller pages are only compacted into bigger ones when the khugepaged kernel thread processes them, which is something I would rather not rely on. Ideally, I would like all malloc calls to be serviced by huge pages even when the allocations are small. This is useful for applications that do a lot of small allocations.
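For reference, huge pages can also be requested explicitly, without relying on khugepaged, by mapping them straight from the reserved pool. Below is a minimal standalone sketch (not my actual application) that asks for a single 1GiB page via mmap(MAP_HUGETLB); the fallback #defines assume a kernel with 1GiB hugetlb support, in case the libc headers don't provide MAP_HUGE_1GB:

#include <stdio.h>
#include <string.h>
#include <sys/mman.h>

/* Fallback definitions for older headers; 30 == log2(1GiB) */
#ifndef MAP_HUGE_SHIFT
#define MAP_HUGE_SHIFT 26
#endif
#ifndef MAP_HUGE_1GB
#define MAP_HUGE_1GB (30 << MAP_HUGE_SHIFT)
#endif

int main(void)
{
    size_t size = 1024*1024*1024UL; /* one 1GiB page */

    /* MAP_HUGETLB allocates directly from the hugetlb pool, so the
     * backing page size is guaranteed (khugepaged is not involved) */
    char *addr = mmap(NULL, size, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB | MAP_HUGE_1GB,
                      -1, 0);
    if (addr == MAP_FAILED) {
        perror("mmap");
        return 1;
    }
    memset(addr, 1, size); /* touch the page to fault it in */
    munmap(addr, size);
    return 0;
}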
Experimentation
===============
These are the steps that I have followed to set up 1GiB hugepages:
sudo mkdir /dev/hugepages1G
sudo mount -t hugetlbfs -o uid=,pagesize=1g,min_size=50g none /dev/hugepages1G
sudo hugeadm --pool-pages-min 1G:50
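As a sanity check that the pool is actually usable, libhugetlbfs also offers an explicit allocation API in hugetlbfs.h. Below is a minimal sketch; it assumes the request size is a multiple of the library's default huge page size (1GiB would have to be the default, e.g. via HUGETLB_DEFAULT_PAGE_SIZE=1G), and it needs to be linked with -lhugetlbfs:

#include <stdio.h>
#include <string.h>
#include <hugetlbfs.h>

int main(void)
{
    long hpage_size = gethugepagesize(); /* default huge page size */

    /* len must be a multiple of the default huge page size */
    void *p = get_huge_pages((size_t)hpage_size, GHP_DEFAULT);
    if (!p) {
        perror("get_huge_pages");
        return 1;
    }
    printf("allocated %ld bytes backed by huge pages\n", hpage_size);
    memset(p, 1, (size_t)hpage_size);
    free_huge_pages(p);
    return 0;
}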
I'm using the dummy application below for testing. The main thread allocates and initializes 1GiB of memory. Then, it creates three pthreads, each of which allocates and initializes 10GiB of memory.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <pthread.h>
#include <sys/syscall.h>
#include <sys/types.h>
void *iamathread(void *data)
{
    char *addr;
    char dummy;
    size_t size, i;

    size = 10*1024*1024*1024UL;
    pid_t x = syscall(__NR_gettid);

    addr = malloc(size);
    if (!addr) {
        perror("cannot allocate memory");
        pthread_exit(NULL);
    }
    memset(addr, 1, size);
    printf("%d:\t sleeping\n", x);
    sleep(1000000U);
    return NULL;
}

int main(int argc, char *argv[])
{
    char *addr;
    char dummy;
    size_t size, i;
    int npt;

    npt = 3;
    size = 1*1024*1024*1024UL;
    pthread_t pt[npt];

    for (i = 0; i < npt; i++) {
        if (pthread_create(&pt[i], NULL, iamathread, NULL)) {
            fprintf(stderr, "Error creating thread\n");
            return 1;
        }
    }

    pid_t x = syscall(__NR_gettid);
    printf("%d:\t I'm main\n", x);

    addr = malloc(size);
    if (!addr) {
        perror("cannot allocate memory");
        return 1;
    }
    memset(addr, 1, size);

    printf("Press any key to exit and release memory\n");
    scanf("%c", &dummy);
    return 0;
}
I have created the following script to count the number of pages per page size used by an application:
#!/usr/bin/bash
PID=$1

awk '
BEGIN {
    tmp_size = -1
}
$1 == "Size:" {
    tmp_size = $2
    next
}
$1 == "KernelPageSize:" {
    page_size = $2
    vmas[page_size]["count"] += 1
    vmas[page_size]["pages"] += tmp_size/page_size
    tmp_size = -1
    next
}
END {
    for (key in vmas) {
        print(key " KiB VMAs: " vmas[key]["count"])
    }
    for (key in vmas) {
        print(key " KiB num pages: " vmas[key]["pages"])
    }
}
' /proc/$PID/smaps
And these are the results obtained when running with and without the MALLOC_ARENA_MAX environment variable to limit the number of arenas:
$ LD_PRELOAD=/usr/lib64/libhugetlbfs.so HUGETLB_MORECORE=1G HUGETLB_PATH=/dev/hugepages1G ./main &
$ hugepagecount.sh $(pgrep main)
4 KiB VMAs: 41
1048576 KiB VMAs: 2
4 KiB num pages: 7922277
1048576 KiB num pages: 2
$ MALLOC_ARENA_MAX=1 LD_PRELOAD=/usr/lib64/libhugetlbfs.so HUGETLB_MORECORE=1G HUGETLB_PATH=/dev/hugepages1G ./main &
$ hugepagecount.sh $(pgrep main)
4 KiB VMAs: 37
1048576 KiB VMAs: 5
4 KiB num pages: 8802
1048576 KiB num pages: 32
When the number of arenas is not limited, only two 1GiB (1048576KiB) pages are allocated. In contrast, when a single arena is forced, 32 1GiB pages are allocated (the program requests 1GiB + 3 × 10GiB = 31GiB; the extra page is presumably allocator overhead).
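For completeness, the single-arena workaround can also be applied from inside the program instead of through the environment. Below is a minimal sketch using glibc's mallopt() (M_ARENA_MAX is glibc-specific, and the call has to happen before any threads are created):

#include <malloc.h>
#include <stdlib.h>

int main(void)
{
    /* Equivalent to MALLOC_ARENA_MAX=1: keep all threads on the main
     * arena. mallopt() returns 1 on success and 0 on error. */
    if (mallopt(M_ARENA_MAX, 1) == 0)
        return 1;

    void *p = malloc(4096);
    free(p);
    return 0;
}

As far as I understand, this keeps every thread on the main arena, which grows through the morecore/sbrk path that HUGETLB_MORECORE hooks, but it reintroduces exactly the lock contention described above.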
Asked by aleixrocks (305 rep), Dec 3, 2019, 07:55 AM
Last activity: Dec 3, 2019, 08:27 AM