Unix & Linux Stack Exchange
Q&A for users of Linux, FreeBSD and other Unix-like operating systems
Latest Questions
0 votes • 0 answers • 110 views
Allocating contiguous physical memory using huge pages in kernel module
I need a kernel module that allocates 8MB of physically contiguous memory using 2MB huge pages, in response to a user-space mmap() request. While I’ve successfully used alloc_pages() with 4KB pages to allocate smaller contiguous chunks like 256KB or 512KB, I’m unsure if this approach can be used to allocate 8MB of physically contiguous memory backed by 2MB huge pages.
To consistently allocate 8MB using 2MB huge pages, is there a way to reserve these huge pages? And in scenarios where sufficient 2MB pages aren't available, is there a fallback mechanism to allocate the same 8MB region using 4KB pages instead?
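For illustration, a minimal, untested sketch of the try-contiguous-then-fall-back idea using only standard page-allocator calls; the function and variable names are made up, and on many kernels an order-11 (8MB) request exceeds the buddy allocator's maximum order, so a boot-time hugetlb reservation or CMA is likely the more reliable way to get a guaranteed contiguous 8MB block:

```c
#include <linux/mm.h>
#include <linux/gfp.h>
#include <linux/slab.h>
#include <linux/sizes.h>

static struct page *big_block;     /* one physically contiguous 8 MiB block */
static struct page **small_pages;  /* fallback: individual 4 KiB pages */
static unsigned int nr_small;

static int try_alloc_region(void)
{
    unsigned int order = get_order(SZ_8M);  /* order 11 for 8 MiB */
    unsigned int i;

    /* First try a single contiguous block. Order 11 can exceed the
     * buddy allocator's maximum order, in which case this simply fails. */
    big_block = alloc_pages(GFP_KERNEL | __GFP_COMP | __GFP_NOWARN, order);
    if (big_block)
        return 0;

    /* Fallback: back the same 8 MiB mmap region with 4 KiB pages,
     * remapped one by one in the mmap/fault handler (not shown). */
    nr_small = SZ_8M >> PAGE_SHIFT;
    small_pages = kvcalloc(nr_small, sizeof(*small_pages), GFP_KERNEL);
    if (!small_pages)
        return -ENOMEM;
    for (i = 0; i < nr_small; i++) {
        small_pages[i] = alloc_page(GFP_KERNEL);
        if (!small_pages[i])
            goto undo;
    }
    return 0;

undo:
    while (i--)
        __free_page(small_pages[i]);
    kvfree(small_pages);
    small_pages = NULL;
    return -ENOMEM;
}
```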
ReturnAddress
(3 rep)
Apr 18, 2025, 04:23 PM
0 votes • 0 answers • 27 views
hugepages allocated via mmap not being freed by the kernel even after unmounting the hugetlbfs filesystem (OL9/UEK7)
We have a C++ application that allocates and uses hugepages memory (via Jemalloc hooks). Chunk (2MB) allocation happens via `mmap` with protection flags `PROT_READ | PROT_WRITE` and flags `MAP_SHARED | MAP_POPULATE` into a chunk file created on a `hugetlbfs` filesystem. The call looks roughly like this:
void* chunk = mmap(nullptr, size, protection, flags, fd, offset);
The application itself doesn't free up allocated chunks on exit. We rely on unmounting and remounting hugetlbfs on each restart to free up everything.
Everything was fine until we recently migrated to Oracle Linux 9 (UEK7 5.15.0-303.171.5.2.1). After the migration, we are noticing that once the application exits, even after unmounting the filesystem, the hugepages aren't freed (verified via `/proc/meminfo`). Nothing seems to clear up those hugepages until the system is rebooted. We have verified that none of the other processes (via `/proc/<pid>/maps`) are using hugepages.
Any pointers on why the kernel thinks those hugepages are still in use? The application runs as a systemd service and uses cgroup v2 (no special memory customization), in case these are somehow impacting the cleanup.
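For reference, the allocation pattern described above written out as a standalone sketch (hypothetical chunk path and size, no Jemalloc hooks), with the explicit teardown the application currently skips; a populated hugetlbfs file keeps its huge pages allocated for as long as the file, or any mapping or open descriptor of it, is still around:

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

#define CHUNK_SIZE (2UL * 1024 * 1024)      /* one 2 MB huge page */
#define CHUNK_PATH "/mnt/hugetlbfs/chunk0"  /* hypothetical mount point */

int main(void)
{
    int fd = open(CHUNK_PATH, O_CREAT | O_RDWR, 0600);
    if (fd < 0) { perror("open"); return 1; }

    /* Size the chunk file; must be a multiple of the huge page size. */
    if (ftruncate(fd, CHUNK_SIZE) < 0) { perror("ftruncate"); return 1; }

    void *chunk = mmap(NULL, CHUNK_SIZE, PROT_READ | PROT_WRITE,
                       MAP_SHARED | MAP_POPULATE, fd, 0);
    if (chunk == MAP_FAILED) { perror("mmap"); return 1; }

    /* ... use the chunk ... */

    /* Explicit teardown: drop the mapping, the descriptor and the file
     * itself, so nothing keeps the huge pages pinned after exit. */
    munmap(chunk, CHUNK_SIZE);
    close(fd);
    unlink(CHUNK_PATH);
    return 0;
}
```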
WafflingDoodle
(1 rep)
Apr 17, 2025, 05:03 AM
0 votes • 0 answers • 34 views
How can my VMs overcommit memory using hugetlbfs?
I am trying to reproduce the experiments in the paper "Progressive Memory Adjustment with Performance Guarantee in Virtualized Systems". It says: "For example, we use hugetlbfs to limit the available memory in the host to 100GB, but the memory size requested by all VMs is 120GB, and the memory overcommitment rate value at this scenario is (120 − 100)/100 = 20%."
This is what I have done: I boot two VMs with `-m 6G` while allocating only 10G to hugetlbfs, which means a 20% overcommitment rate. I have to use `prealloc=off,reserve=off`, because otherwise QEMU errors out saying there is not enough memory to boot the second VM.
qemu-system-x86_64 \
-m 6G \
-object memory-backend-memfd,id=mem1,size=6G,hugetlb=on,hugetlbsize=2M,share=on,prealloc=off,reserve=off \
-numa node,memdev=mem1 \
......
Then I try to make VM1 use 6GB of memory, and make VM2 use 6GB of memory as VM1 releases memory. My idea is that VM2 should allocate memory faster than VM1 releases it, so after a few seconds VM1 should have no hugepages left to allocate and would have to wait. This worsens the performance of VM1, which is what I want. That way I can test a better algorithm in which VM1 releases memory faster than the normal mechanism (which is FreePageReporting).
My problem is that VM2 just breaks when hugetlbfs has zero hugepages left. That matches the documentation: "If no huge page exists at page fault time, the task is sent a SIGBUS and often dies an unhappy death." But it is not what I want, because I want VM2 to wait or to use memory from swap. I have enabled the swapfile, but it is just not used at all.
Could anyone help? I have been stuck here for 2 months and my deadline is coming. PLEASE!
qi chen
(1 rep)
Feb 21, 2025, 07:42 AM
5 votes • 1 answer • 1291 views
Linux HugeTLB: What is the advantage of the filesystem approach?
Moved Post Notice
--------------------
I just moved this question (with slight modifications) from a StackOverflow question (which I have deleted, since cross-posting is strongly discouraged), which has not been answered over there and might be better suited here.
There were two comments (but no answers) made at the StackOverflow question. This is a short summary of those (note that you might need to read the actual question to understand this):
* The filesystem approach enables you to use `libhugetlbfs`, which can do all sorts of things.
* That does not really convince me - if I as an application programmer can allocate huge pages without going via the filesystem, so could `libhugetlbfs`, right?
* Going via the filesystem allows you to set permissions on who can allocate huge pages.
* Sure, but it's not required to go via the filesystem. If anyone can do `mmap(…, MAP_HUGETLB, …)`, anyone who is denied access on a filesystem level can still exhaust all huge pages by going the `mmap` way.
Actual Question
===============
I am currently exploring the various ways of allocating memory in huge pages under Linux. I somehow can not wrap my head around the concept of the HugeTLB 'filesystem'. Note that I'm not talking about transparent huge pages - those are a whole different beast.
## The Conventional Way
The conventional wisdom (as e.g. presented in [the Debian Wiki](https://wiki.debian.org/Hugepages#Enabling_HugeTlbPage) or [the Kernel docs](https://www.kernel.org/doc/html/latest/admin-guide/mm/hugetlbpage.html#using-huge-pages)) seems to be:
- Make sure your kernel configuration is set correctly
- set the various kernel parameters right
- mount a special filesystem (`hugetlbfs`) to some arbitrary directory, say `/dev/hugepages/` (that seems to be the default on Fedora…)
- `mmap()` a file within that directory into your address space, i.e., something like:
int fd = open("/dev/hugepages/myfile", O_CREAT | O_RDWR, 0755);
void * addr = mmap(0, 10*1024*1024, (PROT_READ | PROT_WRITE), MAP_SHARED, fd, 0);
… and if these two calls succeed, I should have `addr` pointing to 10 MB of memory allocated in five 2 MB huge pages. Nice.
## The Easy Way
However, this seems awfully overcomplicated?
At least on Linux 5.15 the whole filesystem thing seems to be completely unnecessary. I just tried this:
* kernel configured with HugeTLBfs
* kernel parameters set correctly (i.e., `vm.nr_hugepages > 0`)
* no `hugetlbfs` mounted anywhere
And then just do an `mmap` of anonymous memory:
void *addr = mmap(0, 10*1024*1024, (PROT_READ | PROT_WRITE),
(MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB), 0, 0);
This gives me 10 MB of memory allocated in huge pages (at least if I don't fail at interpreting the flags in the page table).
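(As an aside, a less error-prone way to verify this than reading page-table flags is to fault the mapping in and look for a `KernelPageSize: 2048 kB` entry in `/proc/self/smaps` — a minimal sketch:)

```c
/* Sketch: map 10 MB anonymously with MAP_HUGETLB, fault it in, and print
 * the KernelPageSize lines from /proc/self/smaps. A "2048 kB" entry
 * confirms the mapping is backed by 2 MB huge pages. */
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>

int main(void)
{
    size_t len = 10 * 1024 * 1024;
    void *addr = mmap(NULL, len, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB, -1, 0);
    if (addr == MAP_FAILED) {
        perror("mmap(MAP_HUGETLB)");
        return 1;
    }
    memset(addr, 1, len);   /* touch the pages so they are actually faulted in */

    FILE *f = fopen("/proc/self/smaps", "r");
    if (!f) {
        perror("fopen");
        return 1;
    }
    char line[256];
    while (fgets(line, sizeof(line), f))
        if (strstr(line, "KernelPageSize:"))
            fputs(line, stdout);   /* mostly "4 kB"; expect one "2048 kB" */
    fclose(f);
    munmap(addr, len);
    return 0;
}
```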
## Why the Filesystem?
So my question is: Why the filesystem? Is it actually "necessary" to go via the filesystem, as the various guides suggest, and was my attempt above just lucky? Does the filesystem approach have other advantages (aside from having a file which represents parts of your RAM, which seems like a huge footgun…)? Or is this maybe just a remnant from some previous time, when `MAP_ANONYMOUS | MAP_HUGETLB` was not allowed?
Lukas Barth
(231 rep)
Aug 2, 2023, 10:23 AM
• Last activity: Dec 4, 2024, 02:13 PM
0 votes • 0 answers • 49 views
Discrepancy between vm.nr_hugepages and HugePages_Total
My `/etc/sysctl.conf` has: `vm.nr_hugepages = 20484`, but (even after `sysctl -p`) I get:
[root@vm04-oracle-19c ~]# cat /proc/meminfo | grep -i hugepages
AnonHugePages: 0 kB
ShmemHugePages: 0 kB
FileHugePages: 0 kB
HugePages_Total: 18933
HugePages_Free: 46
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 2048 kB
[root@vm04-oracle-19c ~]#
Why does this happen? `HugePages_Total` should be 20484.
Current memory:
[opc@vm04-oracle-19c ~]$ free -h
total used free shared buff/cache available
Mem: 63Gi 41Gi 887Mi 7.2Gi 21Gi 14Gi
Swap: 7.8Gi 7.0Mi 7.8Gi
Astora
(509 rep)
Sep 29, 2024, 01:59 AM
• Last activity: Sep 29, 2024, 10:56 PM
6 votes • 1 answer • 3595 views
Understanding main memory fragmentation and hugepages
I have a machine that is intended for general use and which I also use to run a QEMU virtual machine. Because the virtual machine should be as performant as possible, I want to back the VM memory with hugepages, ideally 1GB hugepages. The machine has 32GB of RAM and I want to give 16GB to the VM. The problem is that during my normal use of the machine I might need all 32GB, so allocating the 16G of hugepages at boot is not an option.
To work around this I have a hook script that allocates the 16G of hugepages when the VM boots. As you might expect, for 1GB hugepages, this fails if the host machine has been used for any amount of time (it seems to work reliably with 2M hugepages though this is not ideal).
What I don't understand is exactly why this is happening. For example, I can open several applications (browser window, code editor, etc., just to force some fragmentation for testing), then close them so that only my desktop is open. My memory usage in this case is around 2.5G/32G.
Is there really no way for the kernel to find 16 contiguous, aligned 1G pages in the remaining 30G of RAM? That seems like very high fragmentation. Furthermore, I can run
$ sudo tee /proc/sys/vm/compact_memory <<<1
to try to defragment the RAM, but even then I have never successfully allocated 16 1G hugepages for the VM. This in particular is really shocking to me, since after defragging, with only 2.5G of RAM in use, the remaining 30G *still* isn't contiguous or aligned.
What am I misunderstanding about this process? Does this seem like expected behavior? Additionally, is there any way to check whether `compact_memory` actually did anything? I don't see any output in `dmesg` or similar after running that command.
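(On the last point, one thing that can be done without relying on `dmesg` is to snapshot the compaction counters and the per-order free lists before and after triggering `compact_memory` — a minimal sketch that just dumps the relevant `/proc` files:)

```c
/* Sketch: dump the compaction-related counters from /proc/vmstat and the
 * per-order free-block counts from /proc/buddyinfo. Run it before and
 * after `echo 1 > /proc/sys/vm/compact_memory` and compare. */
#include <stdio.h>
#include <string.h>

static void dump(const char *path, const char *prefix)
{
    FILE *f = fopen(path, "r");
    char line[512];
    if (!f) {
        perror(path);
        return;
    }
    while (fgets(line, sizeof(line), f))
        if (!prefix || strncmp(line, prefix, strlen(prefix)) == 0)
            fputs(line, stdout);
    fclose(f);
}

int main(void)
{
    dump("/proc/vmstat", "compact_");   /* compact_stall, compact_success, ... */
    dump("/proc/buddyinfo", NULL);      /* free blocks per order, per zone */
    return 0;
}
```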
Max Ehrlich
(111 rep)
Jun 20, 2018, 02:53 PM
• Last activity: May 2, 2024, 09:44 PM
2 votes • 0 answers • 302 views
How to allocate huge pages forcibly and synchronously?
On Linux, (non-transparent) huge pages may be allocated by writing into the `vm.nr_hugepages` sysctl or, equivalently, into the `/sys/kernel/mm/hugepages/hugepages-<size>kB/nr_hugepages` sysfs file:
# to allocate 2 GiB worth of huge pages, assuming a huge page size of 2 MiB (default on x86)
$ sysctl vm.nr_hugepages=1024
# likewise, but explicitly
$ echo 1024 > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages; grep . /sys/kernel/mm/hugepages/hugepages-2048kB/*
/sys/kernel/mm/hugepages/hugepages-2048kB/free_hugepages:1024
/sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages:1024
/sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages_mempolicy:1024
/sys/kernel/mm/hugepages/hugepages-2048kB/nr_overcommit_hugepages:0
/sys/kernel/mm/hugepages/hugepages-2048kB/resv_hugepages:0
/sys/kernel/mm/hugepages/hugepages-2048kB/surplus_hugepages:0
However, if Linux fails to allocate the desired amount of hugepages, the rest will be skipped silently:
$ sysctl vm.nr_hugepages=1024
vm.nr_hugepages = 1024
$ sysctl vm.nr_hugepages
vm.nr_hugepages = 640
Chances of allocating the requested amount of huge pages on a long-running system can be raised somewhat by allocating in a loop, mixed with page cache reset requests and memory compaction requests:
while :; do
sysctl vm.drop_caches=3
sysctl vm.compact_memory=1
sysctl vm.nr_hugepages=1024
nr=$(sysctl -n vm.nr_hugepages)
echo "vm.nr_hugepages = $nr"
if (( nr >= 1024 )); then
break
fi
done
However, this is a dirty hack at best, and incurs unnecessary invalidation and thrashing of the entire page cache.
---
Is there a less hacky method to request N huge pages to be allocated, automatically triggering any and all work necessary to fulfill the allocation (e. g. memory compaction, page cache reclaim, ...) and synchronously waiting for it to complete?
In other words, how do I ask the kernel "do whatever you must to get me N huge pages, and do not return unless N huge pages are allocated or unless it has been firmly established that it is absolutely impossible to fulfill such an allocation"?
intelfx
(5699 rep)
Feb 16, 2024, 09:21 PM
2 votes • 0 answers • 419 views
How to change the permission of /dev/hugepages?
I have an app that `open()`s a file under `/dev/hugepages` to allocate a huge page. For now, it requires root. How can I change the permissions?
It's automatically mounted by F38 with:
#/usr/lib/systemd/system/dev-hugepages.mount
# SPDX-License-Identifier: LGPL-2.1-or-later
#
# This file is part of systemd.
#
# systemd is free software; you can redistribute it and/or modify it
# under the terms of the GNU Lesser General Public License as published by
# the Free Software Foundation; either version 2.1 of the License, or
# (at your option) any later version.
[Unit]
Description=Huge Pages File System
Documentation=https://docs.kernel.org/admin-guide/mm/hugetlbpage.html
Documentation=https://www.freedesktop.org/wiki/Software/systemd/APIFileSystems
DefaultDependencies=no
Before=sysinit.target
ConditionPathExists=/sys/kernel/mm/hugepages
ConditionCapability=CAP_SYS_ADMIN
ConditionVirtualization=!private-users
[Mount]
What=hugetlbfs
Where=/dev/hugepages
Type=hugetlbfs
I tried to add this line under `[Mount]`:
Options=uid=1000
but without luck: the permissions look correct with `ls`, but it just doesn't work.
無名前
(729 rep)
Oct 13, 2023, 07:28 AM
• Last activity: Oct 13, 2023, 08:27 AM
2 votes • 2 answers • 1350 views
How to enable HugeTLB controller in cgroup v2 on Ubuntu
I am trying to enable [HugeTLB][1] Controller on cgroup v2 on my system but can't figure out how. This is the list of controllers on my system: cat /sys/fs/cgroup/cgroup.controllers cpuset cpu io memory pids rdma And this is what I see for the meminfo on my system: cat /proc/meminfo | grep Huge Anon...
I am trying to enable HugeTLB Controller on cgroup v2 on my system but can't figure out how.
This is the list of controllers on my system:
cat /sys/fs/cgroup/cgroup.controllers
cpuset cpu io memory pids rdma
And this is what I see for the meminfo on my system:
cat /proc/meminfo | grep Huge
AnonHugePages: 0 kB
ShmemHugePages: 0 kB
FileHugePages: 0 kB
HugePages_Total: 0
HugePages_Free: 0
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 2048 kB
Hugetlb: 0 kB
Am I missing something to enable the HugeTLB controller on cgroup v2? Is there a kernel flag, or some other setting that I need to enable?
Harshdeep Gupta
(21 rep)
Oct 13, 2022, 10:37 PM
• Last activity: Jun 9, 2023, 02:54 PM
2 votes • 1 answer • 1526 views
Can a Linux Swap Partition Be Too Big?
Can a Linux swap partition be too big?
I'm pretty certain the answer is, "no" but I haven't found any resources on-point, so thought I'd ask.
In contrast, the main Windows swap file, pagefile.sys, can be too large. A commonly cited cap is 3x installed RAM, else the system may have trouble functioning.
The distinction seems to lie in the fact that Linux virtual memory is highly configurable with kernel parameters, not to mention compile options, whereas Windows virtual memory is barely so. Windows virtual memory management consequently seems to rely on algorithms that are immutable or seem to rely on swap file size and how it is configured.
Linux has its own virtual memory management algorithms, of course, but the question is whether and how they are affected by the size of the specified swap partition or file.
This comes up because I have a system with 16GB physical RAM configured with a series of 64GB partitions to facilitate a multi-boot capability. For convenience / laziness, I've simply designated one of these 64GB partitions as swap, *i.e.*, 4x physical RAM in contrast to Windows' 3x cap (the latter being relevant only as a frame of reference because this is a Linux-only system). I'm debugging some issues around memory management and VMware Workstation and have come to wonder what, if any, effect the swap partition's size has on compaction, swappiness, page faults, and performance generally.
Many thanks for any constructive input.
ebsf
(399 rep)
Aug 23, 2022, 07:31 PM
• Last activity: Aug 23, 2022, 08:31 PM
0 votes • 2 answers • 5152 views
How do I view the number of 1GB hugetables (and what documentation should I follow)?
I am trying to figure out hugepages for use by KVM under Ubuntu 20.04.
If I change the number of 2048 KiB (the default size) pages, I see that reflected in `/proc/meminfo`:
:~$ echo 0 |sudo tee /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages
0
:~$ cat /proc/meminfo | grep Huge
AnonHugePages: 0 kB
ShmemHugePages: 0 kB
FileHugePages: 0 kB
HugePages_Total: 0
HugePages_Free: 0
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 2048 kB
Hugetlb: 50331648 kB
:~$ echo 512 |sudo tee /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages
512
:~$ cat /proc/meminfo | grep Huge
AnonHugePages: 0 kB
ShmemHugePages: 0 kB
FileHugePages: 0 kB
HugePages_Total: 512
HugePages_Free: 512
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 2048 kB
Hugetlb: 51380224 kB
:~$
However, when I change the number of 1GB pages, I don't see anything to reflect that.
:~$ echo 0 | sudo tee /sys/kernel/mm/hugepages/hugepages-1048576kB/nr_hugepages
0
:~$ cat /proc/meminfo | grep Huge
AnonHugePages: 0 kB
ShmemHugePages: 0 kB
FileHugePages: 0 kB
HugePages_Total: 512
HugePages_Free: 512
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 2048 kB
Hugetlb: 51380224 kB
:~$ echo 16 | sudo tee /sys/kernel/mm/hugepages/hugepages-1048576kB/nr_hugepages
16
:~$ cat /proc/meminfo | grep Huge
AnonHugePages: 0 kB
ShmemHugePages: 0 kB
FileHugePages: 0 kB
HugePages_Total: 512
HugePages_Free: 512
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 2048 kB
Hugetlb: 51380224 kB
And as I understand it, this means that 1GB hugepages are supported by my system, right?
ls /sys/kernel/mm/hugepages
hugepages-1048576kB hugepages-2048kB
Are 1GB pages listed somewhere else? Can I check on their status?
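(A minimal sketch that reads the 1 GB pool's own counters straight from sysfs — the `HugePages_*` lines in `/proc/meminfo` only describe the default page size, while the `Hugetlb:` line aggregates all sizes:)

```c
/* Sketch: read the 1 GB pool's counters from sysfs, since /proc/meminfo's
 * HugePages_* fields only cover the default (here 2 MB) huge page size. */
#include <stdio.h>

static long read_long(const char *path)
{
    FILE *f = fopen(path, "r");
    long val = -1;
    if (f) {
        if (fscanf(f, "%ld", &val) != 1)
            val = -1;
        fclose(f);
    }
    return val;
}

int main(void)
{
    const char *base = "/sys/kernel/mm/hugepages/hugepages-1048576kB";
    char path[256];

    snprintf(path, sizeof(path), "%s/nr_hugepages", base);
    printf("1G pages total: %ld\n", read_long(path));
    snprintf(path, sizeof(path), "%s/free_hugepages", base);
    printf("1G pages free:  %ld\n", read_long(path));
    return 0;
}
```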
Edit: I can see my 1GB hugepages thanks to @Krackout, but I am still confused about what documentation I should even be following:
I am confused about varying procedures for setting up and monitoring hugepages. I seem to have got them working, but there is still quite a bit that is not clear to me.
**Main resources:**
* https://help.ubuntu.com/community/KVM%20-%20Using%20Hugepages
* https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/7/html/virtualization_tuning_and_optimization_guide/sect-virtualization_tuning_optimization_guide-memory-tuning
* https://mathiashueber.com/configuring-hugepages-use-virtual-machine/
* https://wiki.archlinux.org/index.php/KVM
* https://wiki.debian.org/Hugepages
Each of the above links describes partially overlapping procedures. It seems that the differences are based on kernel and distro, but it isn't clear to me exactly what they are, and I can't seem to find it explicitly spelled out anywhere.
On my Ubuntu 20.04 setup, what works for me is putting the following in `crontab -e`:
@reboot echo 64 | sudo tee /sys/kernel/mm/hugepages/hugepages-1048576kB/nr_hugepages
@reboot mount -t hugetlbfs -o pagesize=1G none /dev/hugepages
After which I can start a KVM VM in virt-manager containing the following XML:
So the way I was able to do it isn't exactly what any of the guides said.
Stonecraft
(869 rep)
Jun 22, 2020, 03:06 AM
• Last activity: Mar 2, 2022, 02:19 PM
2 votes • 2 answers • 989 views
Increasing hw.pagesize in FreeBSD
I have a server rocking _FreeBSD 13_.
From the documentation of `sysctl` I can read that `hw.pagesize` *cannot be changed* on the go. This makes sense to me, as this type of parameter depends on the kernel.
There I can also read the following:
Some of the variables which cannot be modified during normal system operation can be initialized via loader(8) tunables. This can for example be done by setting them in loader.conf(5). Please refer to loader.conf(5) for more information on which tunables are available and how to set them.
Sadly, I cannot find in the documentation of `loader(8)` or `loader.conf(5)` any reference to what I need.
In a naive attempt, I just added `hw.pagesize=...` to my /etc/sysctl config file, without any success. Now, when I run `pagesize`, I get my sad 4096-byte value:
jose@miner:~ $ pagesize
4096
But how can I make it larger? I would like to use 1GB pages on this system, yet I cannot find anywhere how to enable that.
Navarro
(490 rep)
Jun 11, 2021, 11:16 AM
• Last activity: Jun 11, 2021, 02:37 PM
1 vote • 0 answers • 234 views
linux enable large page management
I am doing some experiments. Some huge pages (2MB) are used in the experiment, so that the 21-bit page offset can remain unchanged when performing virtual address translation.
I found some methods for enabling huge pages on the Internet, and they work. But I am not very clear about the principle behind them, so I would like to ask.
It requires hugepages and assumes they are mounted on `/mnt/hugetlbfs/`. This value can be modified by changing the value of FILE_NAME. The mount point must be created beforehand:
$ sudo mkdir /mnt/hugetlbfs
Once reserved, hugepages can be mounted:
$ sudo mount -t hugetlbfs none /mnt/hugetlbfs
Note that this may require using `sudo` for the examples, or changing the permissions of the `/mnt/hugetlbfs/` folder.
To enable a fixed amount of huge pages, after a reboot the number of huge pages must be set:
$ echo 100 > /proc/sys/vm/nr_hugepages
At the beginning, my understanding was that the system was originally managed in 4 KB pages, and that once I enable huge pages, all memory will be managed in huge pages.
But I read some explanations and compared the commands. It feels like this just creates a folder: the files in this folder are managed with huge pages, and anything not in this folder is still managed with 4 KB pages. In C, I can use `buffer = mmap(NULL, size, PROT_READ|PROT_WRITE, MAP_ANON|MAP_PRIVATE|MAP_HUGETLB, -1, 0);` to create huge-page-backed memory.
Is my understanding correct?
Gerrie
(193 rep)
May 17, 2021, 02:27 AM
0 votes • 1 answer • 122 views
Enabling Huge Pages on RHEL6 for Oracle 18C xe
I have been trying to switch from Oracle AMM to ASMM with huge pages. I have made the following changes on RHEL 6.
Added the following entry in /etc/sysctl.conf (as suggested by hugepages_setting.sh):
vm.nr_hugepages=777
Added the following entries in /etc/security/limits.conf:
oracle soft memlock 2831155
oracle hard memlock 2831155
Rebooted the server.
Changed the Oracle parameters memory_target, memory_max_target, sga_target, sga_max_target and use_large_pages to specific values.
After a database restart, I can see the following:
[root@rheloracle ~]# grep -i huge /proc/meminfo
AnonHugePages: 0 kB
HugePages_Total: 777
HugePages_Free: 8
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 2048 kB
And when I shut down the database, I can see that HugePages_Free is equal to HugePages_Total.
[root@rheloracle ~]# grep -i huge /proc/meminfo
AnonHugePages: 0 kB
HugePages_Total: 777
HugePages_Free: 777
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 2048 kB
It looks like the HugePages configuration at the DB level and the OS level is in sync and in use. But all the examples and documents I have referred to indicate that HugePages_Rsvd should have a non-zero value after enabling huge pages, which is not happening in my case. Can you please suggest whether I am missing something, or whether it's normal to have HugePages_Rsvd at 0?
(I am running oracle 18c xpress edition on RHEL6)
Prem
(243 rep)
Aug 20, 2020, 05:00 AM
• Last activity: Jan 26, 2021, 09:07 PM
1 vote • 1 answer • 301 views
Linux use huge pages only
I have an x64 Linux system. The page size reported by `getconf` is 4 k:
$ getconf PAGESIZE
4096
I want the kernel to use only large pages (2 M or 4 M) for all memory allocations. I've calculated that I have enough RAM to handle the memory that will be wasted because of this.
How do I configure the Linux kernel so that it uses large pages for all allocations?
fctorial
(203 rep)
Dec 31, 2020, 12:03 AM
• Last activity: Dec 31, 2020, 08:56 AM
3 votes • 2 answers • 1182 views
Discover huge page support on POSIX or Linux
I'm working on a program which needs to detect at runtime whether the system it's running on supports hugepages, and if so, what sizes are available. Ideally I'd like this to work for any POSIX platform, but a Linux-specific solution would be a start.
POSIX supports [`sysconf(_SC_PAGESIZE)`](http://man7.org/linux/man-pages/man3/sysconf.3.html) to get the default page size on the platform, but doesn't seem to similarly support asking for any hugepage sizes. I could also potentially check by trying [`mmap`](http://man7.org/linux/man-pages/man2/mmap.2.html) with `MAP_HUGE_2MB` or `MAP_HUGE_1GB` arguments, but that would be slow and, in the case of 1GB huge pages, incredibly wasteful (and it could easily fail due to a lack of available memory).
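(For the Linux-specific part, a minimal sketch of the sysfs route: every supported size shows up as a `hugepages-<size>kB` directory under `/sys/kernel/mm/hugepages`, so the sizes can be discovered without mapping anything:)

```c
/* Sketch (Linux-specific): report the huge page sizes the running kernel
 * supports by parsing the hugepages-<size>kB directory names in sysfs.
 * No mapping is attempted, so nothing is allocated or wasted. */
#include <dirent.h>
#include <stdio.h>

int main(void)
{
    DIR *d = opendir("/sys/kernel/mm/hugepages");
    struct dirent *e;

    if (!d) {
        /* Directory absent usually means no hugetlb support in this kernel. */
        perror("opendir");
        return 1;
    }
    while ((e = readdir(d)) != NULL) {
        unsigned long kb;
        /* "." and ".." don't match the pattern and are skipped. */
        if (sscanf(e->d_name, "hugepages-%lukB", &kb) == 1)
            printf("huge page size: %lu kB (%lu bytes)\n", kb, kb * 1024UL);
    }
    closedir(d);
    return 0;
}
```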
joshlf
(395 rep)
May 11, 2017, 11:01 PM
• Last activity: Dec 30, 2020, 10:27 PM
0 votes • 1 answer • 678 views
benefits of allocating huge pages at boot
[ moving the question from StackOverflow where it seems less appropriate ]
The kernel boots with the `default_hugepagesz=1G` option, which defines the default huge page size. So when an application wants a large amount of memory, the kernel will allocate it with 1G pages.
If the kernel boots with `hugepages=N`, it allocates N huge pages at boot. So in this case, will the kernel automatically take a page from this pool, thus saving time on allocating memory?
When this pool runs out of available pages, how will the kernel allocate huge memory?
Mark
(1943 rep)
Dec 30, 2020, 05:11 PM
• Last activity: Dec 30, 2020, 07:19 PM
0 votes • 1 answer • 2251 views
Is it possible to disable Transparent Huge pages on the fly?
In order to disable THP, we did the following on all 635 RHEL machines (we run RHEL 7.5).
These lines are from a bash script that we run on all machines.
**Step 1**
[[ -f /sys/kernel/mm/transparent_hugepage/enabled ]] && echo never > /sys/kernel/mm/transparent_hugepage/enabled
[[ -f /sys/kernel/mm/transparent_hugepage/defrag ]] && echo never > /sys/kernel/mm/transparent_hugepage/defrag
*Verification:*
cat /sys/kernel/mm/transparent_hugepage/enabled
always madvise [never]
cat /sys/kernel/mm/transparent_hugepage/defrag
always madvise [never]
But, as everyone knows, these steps do not persist when the machine is restarted/rebooted.
**Step 2**
So we also did this: we append the following lines to /etc/rc.local:
if test -f /sys/kernel/mm/transparent_hugepage/enabled; then
echo never > /sys/kernel/mm/transparent_hugepage/enabled
fi
if test -f /sys/kernel/mm/transparent_hugepage/defrag; then
echo never > /sys/kernel/mm/transparent_hugepage/defrag
fi
**The question is:**
Does step 1 as described above really disable THP on the fly?
Note: some additional info from one typical machine:
sysctl -a | grep hugepage
vm.hugepages_treat_as_movable = 0
vm.nr_hugepages = 0
vm.nr_hugepages_mempolicy = 0
vm.nr_overcommit_hugepages = 0
reference - [Configuring Transparent Huge Pages](https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/7/html/performance_tuning_guide/sect-red_hat_enterprise_linux-performance_tuning_guide-configuring_transparent_huge_pages) .
yael
(13936 rep)
Feb 20, 2020, 02:04 PM
• Last activity: Feb 20, 2020, 05:13 PM
1 vote • 2 answers • 3361 views
Zend OPcache huge_code_pages: madvise(HUGEPAGE) failed
I've got this error while running a PHP command-line script on a freshly installed server:
> PHP Warning: Zend OPcache huge_code_pages: madvise(HUGEPAGE) failed: Invalid argument
The server is running CentOS 7.3, with PHP 7.1.4 from the remi repository.
According to this thread on the remi forum and this thread on plesk.com, the solution is to disable `huge_code_pages` in php-opcache.ini:
opcache.huge_code_pages=0
However, Remi said that this problem should only occur on CentOS 6, not CentOS 7.
Before I disable `huge_code_pages` for good, **is there a solution to make it work?**
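(Not a fix, but a way to narrow it down: a minimal sketch that makes the same kind of `madvise(MADV_HUGEPAGE)` call on an anonymous mapping, outside of PHP — if this also fails with `EINVAL`, the kernel itself is refusing transparent huge pages for the mapping:)

```c
/* Sketch: ask for transparent huge pages on a plain anonymous mapping,
 * similar to what OPcache's huge_code_pages feature does, and print the
 * errno if the kernel rejects the request. */
#include <stdio.h>
#include <sys/mman.h>

int main(void)
{
    size_t len = 4 * 1024 * 1024;
    void *addr = mmap(NULL, len, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (addr == MAP_FAILED) {
        perror("mmap");
        return 1;
    }
    if (madvise(addr, len, MADV_HUGEPAGE) != 0) {
        perror("madvise(MADV_HUGEPAGE)");   /* EINVAL => kernel refuses THP here */
        return 1;
    }
    puts("madvise(MADV_HUGEPAGE) succeeded");
    return 0;
}
```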
BenMorel
(4849 rep)
May 5, 2017, 03:27 PM
• Last activity: Dec 15, 2019, 12:30 AM
1 vote • 0 answers • 418 views
Using libhugetlbfs to transparently back glibc malloc calls in a multi-threaded application
I'm trying to back the memory allocations of a multi-threaded application with 1GiB hugepages using libhugetlbfs. However, only the main thread's allocations are being assigned hugepages. If I restrict the maximum number of glibc malloc arenas to 1, all the allocations of all threads are backed by hugepages. This is not ideal due to the contention introduced by concurrently accessing a single arena.
Is there any way to transparently force all threads to use huge pages by means of libhugetlbfs?
**Note**: I'm aware of transparent huge pages (THP). However, allocations smaller than 1GiB are not automatically assigned hugepages. Smaller pages will only be compacted into bigger pages when the khugepaged kernel thread processes them, which is something I would not like to rely on. Ideally, I would like all malloc calls to be serviced using huge pages even if the allocations are small. This is useful for applications that do a lot of small allocations.
Experimentation
===============
These are the steps that I have followed to set up 1GiB hugepages:
sudo mkdir /dev/hugepages1G
sudo mount -t hugetlbfs -o uid=,pagesize=1g,min_size=50g none /dev/hugepages1G
sudo hugeadm --pool-pages-min 1G:50
I'm using the dummy application below for testing. The main thread allocates and initializes 1GiB of memory. Then, it creates three pthreads, each of which allocates and initializes 10GiB of memory.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <pthread.h>
#include <sys/types.h>
#include <sys/syscall.h>
void *iamathread(void *data)
{
char *addr;
char dummy;
size_t size, i;
size = 10*1024*1024*1024UL;
pid_t x = syscall(__NR_gettid);
addr = malloc(size);
if (!addr) {
perror("cannot allocate memory");
pthread_exit(NULL);
}
memset(addr, 1, size);
printf("%d:\t sleeping\n", x);
sleep(1000000U);
return NULL;
}
int main(int argc, char *agv[])
{
char *addr;
char dummy;
size_t size, i;
int npt;
npt = 3;
size = 1*1024*1024*1024UL;
pthread_t pt[npt];
for (i = 0; i < npt; i++) {
if (pthread_create(&pt[i], NULL, iamathread, NULL)) {
fprintf(stderr, "Error creating thread\n");
return 1;
}
}
pid_t x = syscall(__NR_gettid);
printf("%d:\t I'm main\n", x);
addr = malloc(size);
if (!addr) {
perror("cannot allocate memory");
return 1;
}
memset(addr, 1, size);
printf("Press any key to exit and release memory\n");
scanf("%c", &dummy);
return 0;
}
I have created the following script to count the number of pages per page size used by an application:
#!/usr/bin/bash
PID=$1
awk '
BEGIN {
tmp_size = -1
}
$1 == "Size:" {
tmp_size = $2
next
}
$1 == "KernelPageSize:" {
page_size = $2
vmas[page_size]["count"] += 1
vmas[page_size]["pages"] += tmp_size/page_size
tmp_size = -1
next
}
END {
for (key in vmas) {
print(key " KiB VMAs: " vmas[key]["count"])
}
for (key in vmas) {
print(key " KiB num pages: " vmas[key]["pages"])
}
}
' /proc/$PID/smaps
And these are the results obtained when running with and without the MALLOC_ARENA_MAX environment variable to limit the number of arenas:
$ LD_PRELOAD=/usr/lib64/libhugetlbfs.so HUGETLB_MORECORE=1G HUGETLB_PATH=/dev/hugepages1G ./main &
$ hugepagecount.sh $(pgrep main)
4 KiB VMAs: 41
1048576 KiB VMAs: 2
4 KiB num pages: 7922277
1048576 KiB num pages: 2
$ MALLOC_ARENA_MAX=1 LD_PRELOAD=/usr/lib64/libhugetlbfs.so HUGETLB_MORECORE=1G HUGETLB_PATH=/dev/hugepages1G ./main &
$ hugepagecount.sh $(pgrep main)
4 KiB VMAs: 37
1048576 KiB VMAs: 5
4 KiB num pages: 8802
1048576 KiB num pages: 32
When not limiting the number of arenas, only 2 1GiB (1048576 KiB) pages are allocated. Instead, when forcing a single arena, 32 1GiB pages are allocated.
aleixrocks
(305 rep)
Dec 3, 2019, 07:55 AM
• Last activity: Dec 3, 2019, 08:27 AM
Showing page 1 of 20 total questions