
Linux HugeTLB: What is the advantage of the filesystem approach?

5 votes
1 answer
1291 views
Moved Post Notice
-----------------

I just moved this question (with slight modifications) from a StackOverflow question (which I have deleted, since cross-posting is strongly discouraged), which had not been answered over there and might be better suited here. There were two comments (but no answers) on the StackOverflow question. This is a short summary of those (note that you might need to read the actual question to understand this):

* The filesystem approach enables you to use libhugetlbfs, which can do all sorts of things.
* That does not really convince me - if I as an application programmer can allocate huge pages without going via the filesystem, so could libhugetlbfs, right?
* Going via the filesystem allows you to set permissions on who can allocate huge pages.
* Sure, but it's not required to go via the filesystem. If anyone can do mmap(…, MAP_HUGETLB, …), anyone who is denied access on the filesystem level can still exhaust all huge pages by going the mmap way.

Actual Question
===============

I am currently exploring the various ways of allocating memory in huge pages under Linux. I somehow cannot wrap my head around the concept of the HugeTLB 'filesystem'. Note that I'm not talking about transparent huge pages - those are a whole different beast.

## The Conventional Way

The conventional wisdom (as presented, e.g., in [the Debian Wiki](https://wiki.debian.org/Hugepages#Enabling_HugeTlbPage) or [the Kernel docs](https://www.kernel.org/doc/html/latest/admin-guide/mm/hugetlbpage.html#using-huge-pages)) seems to be:

- make sure your kernel configuration is set correctly
- set the various kernel parameters right
- mount a special filesystem (hugetlbfs) to some arbitrary directory, say /dev/hugepages/ (that seems to be the default on Fedora…)
- mmap() a file within that directory into your address space, i.e., something like:
```c
#include <fcntl.h>    /* open() */
#include <sys/mman.h> /* mmap() */

int fd = open("/dev/hugepages/myfile", O_CREAT | O_RDWR, 0755);
void *addr = mmap(0, 10 * 1024 * 1024, (PROT_READ | PROT_WRITE), MAP_SHARED, fd, 0);
```
… and if these two calls succeed, I should have addr pointing to 10 MB of memory allocated in five 2 MB huge pages. Nice.

## The Easy Way

However, this seems awfully overcomplicated? At least on Linux 5.15 the whole filesystem thing seems to be completely unnecessary. I just tried this:

* kernel configured with HugeTLBfs
* kernel parameters set correctly (i.e., vm.nr_hugepages > 0)
* no hugetlbfs mounted anywhere

And then just do an mmap of anonymous memory:
```c
#include <sys/mman.h> /* mmap(), MAP_HUGETLB */

void *addr = mmap(0, 10 * 1024 * 1024, (PROT_READ | PROT_WRITE),
                  (MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB), -1, 0);
```
This gives me 10 MB of memory allocated in huge pages (at least if I don't fail at interpreting the flags in the page table - a sketch of one way to check this follows at the end of the question).

## Why the Filesystem?

So my question is: Why the filesystem? Is it actually "necessary" to go via the filesystem, as the various guides suggest, and was my attempt above just lucky? Does the filesystem approach have other advantages (aside from having a file which represents parts of your RAM, which seems like a huge footgun…)? Or is this maybe just a remnant from some previous time, when MAP_ANONYMOUS | MAP_HUGETLB was not allowed?
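For reference, this is a minimal sketch of how I check that last point: it maps the memory anonymously as above, then scans /proc/self/smaps for the mapping and prints its KernelPageSize field, which should read 2048 kB for a hugetlb-backed mapping. It assumes the default 2 MB huge page size on x86-64 and at least five free huge pages in the pool:

```c
/* Minimal sketch: map anonymous huge pages, then check /proc/self/smaps
 * to confirm the mapping is backed by huge pages. Assumes the default
 * 2 MB huge page size on x86-64 and vm.nr_hugepages >= 5. */
#define _GNU_SOURCE
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>

int main(void)
{
    size_t len = 10 * 1024 * 1024;
    void *addr = mmap(0, len, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB, -1, 0);
    if (addr == MAP_FAILED) {
        perror("mmap");
        return EXIT_FAILURE;
    }

    /* Touch the memory so the pages are actually faulted in. */
    memset(addr, 0xab, len);

    /* smaps header lines start with "<start>-<end> ...", addresses in
     * lowercase hex without a 0x prefix - build that prefix for lookup. */
    char needle[32];
    snprintf(needle, sizeof needle, "%lx-", (unsigned long)addr);

    FILE *smaps = fopen("/proc/self/smaps", "r");
    if (!smaps) {
        perror("fopen");
        return EXIT_FAILURE;
    }

    char line[256];
    int in_mapping = 0;
    while (fgets(line, sizeof line, smaps)) {
        if (strstr(line, needle) == line)
            in_mapping = 1;     /* header line of our mapping */
        else if (in_mapping && strstr(line, "KernelPageSize:") == line) {
            printf("%s", line); /* e.g. "KernelPageSize:     2048 kB" */
            break;
        }
    }
    fclose(smaps);

    munmap(addr, len);
    return EXIT_SUCCESS;
}
```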
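And for completeness, the conventional file-backed variant from above as a complete program with error checking - again just a sketch, assuming hugetlbfs is mounted at /dev/hugepages (myfile is an arbitrary placeholder name):

```c
/* Sketch of the conventional route: a file in a mounted hugetlbfs,
 * mmap()ed into the address space. Assumes hugetlbfs is mounted at
 * /dev/hugepages and five 2 MB huge pages are available; "myfile"
 * is an arbitrary placeholder name. */
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void)
{
    size_t len = 10 * 1024 * 1024;

    int fd = open("/dev/hugepages/myfile", O_CREAT | O_RDWR, 0755);
    if (fd < 0) {
        perror("open");
        return EXIT_FAILURE;
    }

    /* Unlike regular files, hugetlbfs allows mapping past end-of-file,
     * so no ftruncate() is needed here. */
    void *addr = mmap(0, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (addr == MAP_FAILED) {
        perror("mmap");
        close(fd);
        return EXIT_FAILURE;
    }

    /* The memory is now usable like any other mapping. */
    ((char *)addr)[0] = 42;

    munmap(addr, len);
    close(fd);
    /* The file persists in /dev/hugepages until unlink()ed. */
    unlink("/dev/hugepages/myfile");
    return EXIT_SUCCESS;
}
```

Note that the file keeps its huge pages allocated for as long as it exists, even after the process exits - which is part of the "file which represents parts of your RAM" footgun I mean above.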
Asked by Lukas Barth (231 rep)
Aug 2, 2023, 10:23 AM
Last activity: Dec 4, 2024, 02:13 PM