Unix & Linux Stack Exchange
Q&A for users of Linux, FreeBSD and other Unix-like operating systems
Latest Questions
0 votes • 0 answers • 110 views
Allocating contiguous physical memory using huge pages in kernel module
I need a kernel module that allocates 8MB of physically contiguous memory using 2MB huge pages, in response to a user-space mmap() request. While I’ve successfully used alloc_pages() with 4KB pages to allocate smaller contiguous chunks like 256KB or 512KB, I’m unsure if this approach can be used to allocate 8MB of physically contiguous memory backed by 2MB huge pages.
To consistently allocate 8MB using 2MB huge pages, is there a way to reserve these huge pages? And in scenarios where sufficient 2MB pages aren't available, is there a fallback mechanism to allocate the same 8MB region using 4KB pages instead?
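For illustration, a minimal, untested sketch of the try-contiguous-then-fall-back idea using only standard page-allocator calls; the function and variable names are made up, and on many kernels an order-11 (8MB) request exceeds the buddy allocator's maximum order, so a boot-time hugetlb reservation or CMA is likely the more reliable way to get a guaranteed contiguous 8MB block:

```c
#include <linux/mm.h>
#include <linux/gfp.h>
#include <linux/slab.h>
#include <linux/sizes.h>

static struct page *big_block;     /* one physically contiguous 8 MiB block */
static struct page **small_pages;  /* fallback: individual 4 KiB pages */
static unsigned int nr_small;

static int try_alloc_region(void)
{
    unsigned int order = get_order(SZ_8M);  /* order 11 for 8 MiB */
    unsigned int i;

    /* First try a single contiguous block. Order 11 can exceed the
     * buddy allocator's maximum order, in which case this simply fails. */
    big_block = alloc_pages(GFP_KERNEL | __GFP_COMP | __GFP_NOWARN, order);
    if (big_block)
        return 0;

    /* Fallback: back the same 8 MiB mmap region with 4 KiB pages,
     * remapped one by one in the mmap/fault handler (not shown). */
    nr_small = SZ_8M >> PAGE_SHIFT;
    small_pages = kvcalloc(nr_small, sizeof(*small_pages), GFP_KERNEL);
    if (!small_pages)
        return -ENOMEM;
    for (i = 0; i < nr_small; i++) {
        small_pages[i] = alloc_page(GFP_KERNEL);
        if (!small_pages[i])
            goto undo;
    }
    return 0;

undo:
    while (i--)
        __free_page(small_pages[i]);
    kvfree(small_pages);
    small_pages = NULL;
    return -ENOMEM;
}
```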
ReturnAddress
(3 rep)
Apr 18, 2025, 04:23 PM
0 votes • 0 answers • 27 views
hugepages allocated via mmap not being freed by the kernel even after unmounting the hugetlbfs filesystem (OL9/UEK7)
We have a C++ application that allocates and uses hugepages memory (via Jemalloc hooks). Chunk (2MB) allocation happens via `mmap` with protection flags `PROT_READ | PROT_WRITE` and flags `MAP_SHARED | MAP_POPULATE` into a chunk file created on a `hugetlbfs` filesystem. The call looks roughly like this:
void* chunk = mmap(nullptr, size, protection, flags, fd, offset);
The application itself doesn't free up allocated chunks on exit. We rely on unmounting and remounting hugetlbfs on each restart to free up everything.
Everything was fine until we recently migrated to Oracle Linux 9 (UEK7 5.15.0-303.171.5.2.1). After the migration, we are noticing that once the application exits, even after unmounting the filesystem, the hugepages aren't freed (verified via `/proc/meminfo`). Nothing seems to clear up those hugepages until the system is rebooted. We have verified that none of the other processes (via `/proc/<pid>/maps`) are using hugepages.
Any pointers on why the kernel thinks those hugepages are still in use? The application runs as a systemd service and uses cgroup v2 (no special memory customization), in case these are somehow impacting the cleanup.
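For reference, the allocation pattern described above written out as a standalone sketch (hypothetical chunk path and size, no Jemalloc hooks), with the explicit teardown the application currently skips; a populated hugetlbfs file keeps its huge pages allocated for as long as the file, or any mapping or open descriptor of it, is still around:

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

#define CHUNK_SIZE (2UL * 1024 * 1024)      /* one 2 MB huge page */
#define CHUNK_PATH "/mnt/hugetlbfs/chunk0"  /* hypothetical mount point */

int main(void)
{
    int fd = open(CHUNK_PATH, O_CREAT | O_RDWR, 0600);
    if (fd < 0) { perror("open"); return 1; }

    /* Size the chunk file; must be a multiple of the huge page size. */
    if (ftruncate(fd, CHUNK_SIZE) < 0) { perror("ftruncate"); return 1; }

    void *chunk = mmap(NULL, CHUNK_SIZE, PROT_READ | PROT_WRITE,
                       MAP_SHARED | MAP_POPULATE, fd, 0);
    if (chunk == MAP_FAILED) { perror("mmap"); return 1; }

    /* ... use the chunk ... */

    /* Explicit teardown: drop the mapping, the descriptor and the file
     * itself, so nothing keeps the huge pages pinned after exit. */
    munmap(chunk, CHUNK_SIZE);
    close(fd);
    unlink(CHUNK_PATH);
    return 0;
}
```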
WafflingDoodle
(1 rep)
Apr 17, 2025, 05:03 AM
0 votes • 0 answers • 34 views
How can my VMs overcommit memory using hugetlbfs?
I am trying to reproduce the experiments in the paper "Progressive Memory Adjustment with Performance Guarantee in Virtualized Systems". It says: "For example, we use hugetlbfs to limit the available memory in the host to 100GB, but the memory size requested by all VMs is 120GB, and the memory overcommitment rate value at this scenario is (120 − 100)/100 = 20%."
This is what I have done: I boot two VMs with `-m 6G` while allocating only 10G to hugetlbfs, which means a 20% overcommitment rate. I have to use `prealloc=off,reserve=off`, because otherwise QEMU errors out saying there is not enough memory to boot the second VM.
qemu-system-x86_64 \
-m 6G \
-object memory-backend-memfd,id=mem1,size=6G,hugetlb=on,hugetlbsize=2M,share=on,prealloc=off,reserve=off \
-numa node,memdev=mem1 \
......
Then I try to make VM1 use 6GB of memory, and make VM2 use 6GB of memory as VM1 releases memory. My idea is that VM2 should allocate memory faster than VM1 releases it, so after a few seconds VM1 should have no hugepages left to allocate and would have to wait. This worsens the performance of VM1, which is what I want. That way I can test a better algorithm in which VM1 releases memory faster than the normal mechanism (which is FreePageReporting).
My problem is that VM2 just breaks when hugetlbfs has zero hugepages left. That matches the documentation: "If no huge page exists at page fault time, the task is sent a SIGBUS and often dies an unhappy death." But it is not what I want, because I want VM2 to wait or to use memory from swap. I have enabled the swapfile, but it is just not used at all.
Could anyone help? I have been stuck here for 2 months and my deadline is coming. PLEASE!
qi chen
(1 rep)
Feb 21, 2025, 07:42 AM
5 votes • 1 answer • 1291 views
Linux HugeTLB: What is the advantage of the filesystem approach?
Moved Post Notice
--------------------
I just moved this question (with slight modifications) from a StackOverflow question (which I have deleted, since cross-posting is strongly discouraged), which has not been answered over there and might be better suited here.
There were two comments (but no answers) made at the StackOverflow question. This is a short summary of those (note that you might need to read the actual question to understand this):
* The filesystem approach enables you to use `libhugetlbfs`, which can do all sorts of things.
* That does not really convince me - if I as an application programmer can allocate huge pages without going via the filesystem, so could `libhugetlbfs`, right?
* Going via the filesystem allows you to set permissions on who can allocate huge pages.
* Sure, but it's not required to go via the filesystem. If anyone can do `mmap(…, MAP_HUGETLB, …)`, anyone who is denied access on a filesystem level can still exhaust all huge pages by going the `mmap` way.
Actual Question
===============
I am currently exploring the various ways of allocating memory in huge pages under Linux. I somehow can not wrap my head around the concept of the HugeTLB 'filesystem'. Note that I'm not talking about transparent huge pages - those are a whole different beast.
## The Conventional Way
The conventional wisdom (as e.g. presented in [the Debian Wiki](https://wiki.debian.org/Hugepages#Enabling_HugeTlbPage) or [the Kernel docs](https://www.kernel.org/doc/html/latest/admin-guide/mm/hugetlbpage.html#using-huge-pages)) seems to be:
- Make sure your kernel configuration is set correctly
- set the various kernel parameters right
- mount a special filesystem (`hugetlbfs`) to some arbitrary directory, say `/dev/hugepages/` (that seems to be the default on Fedora…)
- `mmap()` a file within that directory into your address space, i.e., something like:
int fd = open("/dev/hugepages/myfile", O_CREAT | O_RDWR, 0755);
void * addr = mmap(0, 10*1024*1024, (PROT_READ | PROT_WRITE), MAP_SHARED, fd, 0);
… and if these two calls succeed, I should have `addr` pointing to 10 MB of memory allocated in five 2 MB huge pages. Nice.
## The Easy Way
However, this seems awfully overcomplicated?
At least on Linux 5.15 the whole filesystem thing seems to be completely unnecessary. I just tried this:
* kernel configured with HugeTLBfs
* kernel parameters set correctly (i.e., `vm.nr_hugepages > 0`)
* no `hugetlbfs` mounted anywhere
And then just do an `mmap` of anonymous memory:
void *addr = mmap(0, 10*1024*1024, (PROT_READ | PROT_WRITE),
(MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB), 0, 0);
This gives me 10 MB of memory allocated in huge pages (at least if I don't fail at interpreting the flags in the page table).
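(As an aside, a less error-prone way to verify this than reading page-table flags is to fault the mapping in and look for a `KernelPageSize: 2048 kB` entry in `/proc/self/smaps` — a minimal sketch:)

```c
/* Sketch: map 10 MB anonymously with MAP_HUGETLB, fault it in, and print
 * the KernelPageSize lines from /proc/self/smaps. A "2048 kB" entry
 * confirms the mapping is backed by 2 MB huge pages. */
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>

int main(void)
{
    size_t len = 10 * 1024 * 1024;
    void *addr = mmap(NULL, len, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB, -1, 0);
    if (addr == MAP_FAILED) {
        perror("mmap(MAP_HUGETLB)");
        return 1;
    }
    memset(addr, 1, len);   /* touch the pages so they are actually faulted in */

    FILE *f = fopen("/proc/self/smaps", "r");
    if (!f) {
        perror("fopen");
        return 1;
    }
    char line[256];
    while (fgets(line, sizeof(line), f))
        if (strstr(line, "KernelPageSize:"))
            fputs(line, stdout);   /* mostly "4 kB"; expect one "2048 kB" */
    fclose(f);
    munmap(addr, len);
    return 0;
}
```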
## Why the Filesystem?
So my question is: Why the filesystem? Is it actually "necessary" to go via the filesystem, as the various guides suggest, and was my attempt above just lucky? Does the filesystem approach have other advantages (aside from having a file which represents parts of your RAM, which seems like a huge footgun…)? Or is this maybe just a remnant from some previous time, when `MAP_ANONYMOUS | MAP_HUGETLB` was not allowed?
Lukas Barth
(231 rep)
Aug 2, 2023, 10:23 AM
• Last activity: Dec 4, 2024, 02:13 PM
0 votes • 0 answers • 49 views
Discrepancy between vm.nr_hugepages and HugePages_Total
My `/etc/sysctl.conf` has: `vm.nr_hugepages = 20484`, but (even after `sysctl -p`) I get:
[root@vm04-oracle-19c ~]# cat /proc/meminfo | grep -i hugepages
AnonHugePages: 0 kB
ShmemHugePages: 0 kB
FileHugePages: 0 kB
HugePages_Total: 18933
HugePages_Free: 46
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 2048 kB
[root@vm04-oracle-19c ~]#
Why does this happen? `HugePages_Total` should be 20484.
Current memory:
[opc@vm04-oracle-19c ~]$ free -h
total used free shared buff/cache available
Mem: 63Gi 41Gi 887Mi 7.2Gi 21Gi 14Gi
Swap: 7.8Gi 7.0Mi 7.8Gi
Astora
(509 rep)
Sep 29, 2024, 01:59 AM
• Last activity: Sep 29, 2024, 10:56 PM
6 votes • 1 answer • 3595 views
Understanding main memory fragmentation and hugepages
I have a machine that is intended for general use and which I also use to run a QEMU virtual machine. Because the virtual machine should be as performant as possible, I want to back the VM memory with hugepages, ideally 1GB hugepages. The machine has 32GB of RAM and I want to give 16GB to the VM. The problem is that during my normal use of the machine I might need all 32GB, so allocating the 16G of hugepages at boot is not an option.
To work around this I have a hook script that allocates the 16G of hugepages when the VM boots. As you might expect, for 1GB hugepages, this fails if the host machine has been used for any amount of time (it seems to work reliably with 2M hugepages though this is not ideal).
What I don't understand is exactly why this is happening. For example, I can open several applications (browser window, code editor, etc., just to force some fragmentation for testing), then close them so that only my desktop is open. My memory usage in this case is around 2.5G/32G.
Is there really no way for the kernel to find 16 contiguous, aligned 1G pages in the remaining 30G of RAM? That seems like very high fragmentation. Furthermore, I can run
$ sudo tee /proc/sys/vm/compact_memory <<<1
to try to defragment the RAM, but even then I have never successfully allocated 16 1G hugepages for the VM. This in particular is really shocking to me, since after defragging, with only 2.5G of RAM in use, the remaining 30G *still* isn't contiguous or aligned.
What am I misunderstanding about this process? Does this seem like expected behavior? Additionally, is there any way to check whether `compact_memory` actually did anything? I don't see any output in `dmesg` or similar after running that command.
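(On the last point, one thing that can be done without relying on `dmesg` is to snapshot the compaction counters and the per-order free lists before and after triggering `compact_memory` — a minimal sketch that just dumps the relevant `/proc` files:)

```c
/* Sketch: dump the compaction-related counters from /proc/vmstat and the
 * per-order free-block counts from /proc/buddyinfo. Run it before and
 * after `echo 1 > /proc/sys/vm/compact_memory` and compare. */
#include <stdio.h>
#include <string.h>

static void dump(const char *path, const char *prefix)
{
    FILE *f = fopen(path, "r");
    char line[512];
    if (!f) {
        perror(path);
        return;
    }
    while (fgets(line, sizeof(line), f))
        if (!prefix || strncmp(line, prefix, strlen(prefix)) == 0)
            fputs(line, stdout);
    fclose(f);
}

int main(void)
{
    dump("/proc/vmstat", "compact_");   /* compact_stall, compact_success, ... */
    dump("/proc/buddyinfo", NULL);      /* free blocks per order, per zone */
    return 0;
}
```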
Max Ehrlich
(111 rep)
Jun 20, 2018, 02:53 PM
• Last activity: May 2, 2024, 09:44 PM
2 votes • 0 answers • 302 views
How to allocate huge pages forcibly and synchronously?
On Linux, (non-transparent) huge pages may be allocated by writing into the `vm.nr_hugepages` sysctl or, equivalently, into the `/sys/kernel/mm/hugepages/hugepages-<size>kB/nr_hugepages` sysfs file:
# to allocate 2 GiB worth of huge pages, assuming a huge page size of 2 MiB (default on x86)
$ sysctl vm.nr_hugepages=1024
# likewise, but explicitly
$ echo 1024 > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages; grep . /sys/kernel/mm/hugepages/hugepages-2048kB/*
/sys/kernel/mm/hugepages/hugepages-2048kB/free_hugepages:1024
/sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages:1024
/sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages_mempolicy:1024
/sys/kernel/mm/hugepages/hugepages-2048kB/nr_overcommit_hugepages:0
/sys/kernel/mm/hugepages/hugepages-2048kB/resv_hugepages:0
/sys/kernel/mm/hugepages/hugepages-2048kB/surplus_hugepages:0
However, if Linux fails to allocate the desired amount of hugepages, the rest will be skipped silently:
$ sysctl vm.nr_hugepages=1024
vm.nr_hugepages = 1024
$ sysctl vm.nr_hugepages
vm.nr_hugepages = 640
Chances of allocating the requested amount of huge pages on a long-running system can be raised somewhat by allocating in a loop, mixed with page cache reset requests and memory compaction requests:
while :; do
sysctl vm.drop_caches=3
sysctl vm.compact_memory=1
sysctl vm.nr_hugepages=1024
nr=$(sysctl -n vm.nr_hugepages)
echo "vm.nr_hugepages = $nr"
if (( nr >= 1024 )); then
break
fi
done
However, this is a dirty hack at best, and incurs unnecessary invalidation and thrashing of the entire page cache.
---
Is there a less hacky method to request N huge pages to be allocated, automatically triggering any and all work necessary to fulfill the allocation (e. g. memory compaction, page cache reclaim, ...) and synchronously waiting for it to complete?
In other words, how do I ask the kernel "do whatever you must to get me N huge pages, and do not return unless N huge pages are allocated or unless it has been firmly established that it is absolutely impossible to fulfill such an allocation"?
intelfx
(5699 rep)
Feb 16, 2024, 09:21 PM
2 votes • 0 answers • 419 views
How to change the permission of /dev/hugepages?
I have an app that `open()`s a file under `/dev/hugepages` to allocate a huge page. For now, it requires root. How can I change the permissions?
It's automatically mounted by F38 with:
#/usr/lib/systemd/system/dev-hugepages.mount
# SPDX-License-Identifier: LGPL-2.1-or-later
#
# This file is part of systemd.
#
# systemd is free software; you can redistribute it and/or modify it
# under the terms of the GNU Lesser General Public License as published by
# the Free Software Foundation; either version 2.1 of the License, or
# (at your option) any later version.
[Unit]
Description=Huge Pages File System
Documentation=https://docs.kernel.org/admin-guide/mm/hugetlbpage.html
Documentation=https://www.freedesktop.org/wiki/Software/systemd/APIFileSystems
DefaultDependencies=no
Before=sysinit.target
ConditionPathExists=/sys/kernel/mm/hugepages
ConditionCapability=CAP_SYS_ADMIN
ConditionVirtualization=!private-users
[Mount]
What=hugetlbfs
Where=/dev/hugepages
Type=hugetlbfs
I tried to add this line under `[Mount]`:
Options=uid=1000
but without luck: the permissions look correct with `ls`, but it just doesn't work.
無名前
(729 rep)
Oct 13, 2023, 07:28 AM
• Last activity: Oct 13, 2023, 08:27 AM
2 votes • 2 answers • 1350 views
How to enable HugeTLB controller in cgroup v2 on Ubuntu
I am trying to enable [HugeTLB][1] Controller on cgroup v2 on my system but can't figure out how. This is the list of controllers on my system: cat /sys/fs/cgroup/cgroup.controllers cpuset cpu io memory pids rdma And this is what I see for the meminfo on my system: cat /proc/meminfo | grep Huge Anon...
I am trying to enable HugeTLB Controller on cgroup v2 on my system but can't figure out how.
This is the list of controllers on my system:
cat /sys/fs/cgroup/cgroup.controllers
cpuset cpu io memory pids rdma
And this is what I see for the meminfo on my system:
cat /proc/meminfo | grep Huge
AnonHugePages: 0 kB
ShmemHugePages: 0 kB
FileHugePages: 0 kB
HugePages_Total: 0
HugePages_Free: 0
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 2048 kB
Hugetlb: 0 kB
Am I missing something to enable the HugeTLB controller on cgroup v2? Is there a kernel flag, or some other setting that I need to enable?
Harshdeep Gupta
(21 rep)
Oct 13, 2022, 10:37 PM
• Last activity: Jun 9, 2023, 02:54 PM
2 votes • 1 answer • 1526 views
Can a Linux Swap Partition Be Too Big?
Can a Linux swap partition be too big?
I'm pretty certain the answer is, "no" but I haven't found any resources on-point, so thought I'd ask.
In contrast, the main Windows swap file, pagefile.sys, can be too large. A commonly cited cap is 3x installed RAM, else the system may have trouble functioning.
The distinction seems to lie in the fact that Linux virtual memory is highly configurable with kernel parameters, not to mention compile options, whereas Windows virtual memory is barely so. Windows virtual memory management consequently seems to rely on algorithms that are immutable or seem to rely on swap file size and how it is configured.
Linux has its own virtual memory management algorithms, of course, but the question is whether and how they are affected by the size of the specified swap partition or file.
This comes up because I have a system with 16GB physical RAM configured with a series of 64GB partitions to facilitate a multi-boot capability. For convenience / laziness, I've simply designated one of these 64GB partitions as swap, *i.e.*, 4x physical RAM in contrast to Windows' 3x cap (the latter being relevant only as a frame of reference because this is a Linux-only system). I'm debugging some issues around memory management and VMware Workstation and have come to wonder what, if any, effect the swap partition's size has on compaction, swappiness, page faults, and performance generally.
Many thanks for any constructive input.
ebsf
(399 rep)
Aug 23, 2022, 07:31 PM
• Last activity: Aug 23, 2022, 08:31 PM
0 votes • 2 answers • 5152 views
How do I view the number of 1GB hugetables (and what documentation should I follow)?
I am trying to figure out hugepages for use by KVM under Ubuntu 20.04.
If I change the number of 2048 KiB (the default size) pages, I see that reflected in `/proc/meminfo`:
:~$ echo 0 |sudo tee /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages
0
:~$ cat /proc/meminfo | grep Huge
AnonHugePages: 0 kB
ShmemHugePages: 0 kB
FileHugePages: 0 kB
HugePages_Total: 0
HugePages_Free: 0
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 2048 kB
Hugetlb: 50331648 kB
:~$ echo 512 |sudo tee /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages
512
:~$ cat /proc/meminfo | grep Huge
AnonHugePages: 0 kB
ShmemHugePages: 0 kB
FileHugePages: 0 kB
HugePages_Total: 512
HugePages_Free: 512
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 2048 kB
Hugetlb: 51380224 kB
:~$
However, when I change the number of 1GB pages, I don't see anything to reflect that.
:~$ echo 0 | sudo tee /sys/kernel/mm/hugepages/hugepages-1048576kB/nr_hugepages
0
:~$ cat /proc/meminfo | grep Huge
AnonHugePages: 0 kB
ShmemHugePages: 0 kB
FileHugePages: 0 kB
HugePages_Total: 512
HugePages_Free: 512
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 2048 kB
Hugetlb: 51380224 kB
:~$ echo 16 | sudo tee /sys/kernel/mm/hugepages/hugepages-1048576kB/nr_hugepages
16
:~$ cat /proc/meminfo | grep Huge
AnonHugePages: 0 kB
ShmemHugePages: 0 kB
FileHugePages: 0 kB
HugePages_Total: 512
HugePages_Free: 512
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 2048 kB
Hugetlb: 51380224 kB
And as I understand it, this means that 1GB hugepages are supported by my system, right?
ls /sys/kernel/mm/hugepages
hugepages-1048576kB hugepages-2048kB
Are 1GB pages listed somewhere else? Can I check on their status?
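(A minimal sketch that reads the 1 GB pool's own counters straight from sysfs — the `HugePages_*` lines in `/proc/meminfo` only describe the default page size, while the `Hugetlb:` line aggregates all sizes:)

```c
/* Sketch: read the 1 GB pool's counters from sysfs, since /proc/meminfo's
 * HugePages_* fields only cover the default (here 2 MB) huge page size. */
#include <stdio.h>

static long read_long(const char *path)
{
    FILE *f = fopen(path, "r");
    long val = -1;
    if (f) {
        if (fscanf(f, "%ld", &val) != 1)
            val = -1;
        fclose(f);
    }
    return val;
}

int main(void)
{
    const char *base = "/sys/kernel/mm/hugepages/hugepages-1048576kB";
    char path[256];

    snprintf(path, sizeof(path), "%s/nr_hugepages", base);
    printf("1G pages total: %ld\n", read_long(path));
    snprintf(path, sizeof(path), "%s/free_hugepages", base);
    printf("1G pages free:  %ld\n", read_long(path));
    return 0;
}
```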
Edit: I can see my 1GB hugepages thanks to @Krackout, but I am still confused about what documentation I should even be following:
I am confused about varying procedures for setting up and monitoring hugepages. I seem to have got them working, but there is still quite a bit that is not clear to me.
**Main resources:**
* https://help.ubuntu.com/community/KVM%20-%20Using%20Hugepages
* https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/7/html/virtualization_tuning_and_optimization_guide/sect-virtualization_tuning_optimization_guide-memory-tuning
* https://mathiashueber.com/configuring-hugepages-use-virtual-machine/
* https://wiki.archlinux.org/index.php/KVM
* https://wiki.debian.org/Hugepages
Each of the above links describes partially overlapping procedures. It seems that the differences are based on kernel and distro, but it isn't clear to me exactly what they are, and I can't seem to find it explicitly spelled out anywhere.
On my Ubuntu 20.04 setup, what works for me is putting the following in `crontab -e`:
@reboot echo 64 | sudo tee /sys/kernel/mm/hugepages/hugepages-1048576kB/nr_hugepages
@reboot mount -t hugetlbfs -o pagesize=1G none /dev/hugepages
After which I can start a KVM VM in virt-manager containing the following XML:
So the way I was able to do it isn't exactly what any of the guides said.
Stonecraft
(869 rep)
Jun 22, 2020, 03:06 AM
• Last activity: Mar 2, 2022, 02:19 PM
2 votes • 2 answers • 989 views
Increasing hw.pagesize in FreeBSD
I have a server rocking _FreeBSD 13_.
From the documentation of `sysctl` I can read that `hw.pagesize` *cannot be changed* on the go. This makes sense to me, as this type of parameter depends on the kernel.
There I can also read the following:
Some of the variables which cannot be modified during normal system operation can be initialized via loader(8) tunables. This can for example be done by setting them in loader.conf(5). Please refer to loader.conf(5) for more information on which tunables are available and how to set them.
Sadly, I cannot find in the documentation of `loader(8)` or `loader.conf(5)` any reference to what I need.
In a naive attempt, I just added `hw.pagesize=...` to my /etc/sysctl config file, without any success. Now, when I run `pagesize`, I get my sad 4096-byte value:
jose@miner:~ $ pagesize
4096
But how can I make it larger? I would like to use 1GB pages on this system, yet I cannot find anywhere how to enable that.
Navarro
(490 rep)
Jun 11, 2021, 11:16 AM
• Last activity: Jun 11, 2021, 02:37 PM
1 vote • 0 answers • 234 views
linux enable large page management
I am doing some experiments. Some huge pages (2MB) are used in the experiment, so that the 21-bit page offset can remain unchanged when performing virtual address translation.
I found some methods for enabling huge pages on the Internet, and they work. But I am not very clear about the principle behind them, so I would like to ask.
It requires hugepages and assumes they are mounted on `/mnt/hugetlbfs/`. This value can be modified by changing the value of FILE_NAME. The mount point must be created beforehand:
$ sudo mkdir /mnt/hugetlbfs
Once reserved, hugepages can be mounted:
$ sudo mount -t hugetlbfs none /mnt/hugetlbfs
Note that this may require using `sudo` for the examples, or changing the permissions of the `/mnt/hugetlbfs/` folder.
To enable a fixed amount of huge pages, after a reboot the number of huge pages must be set:
$ echo 100 > /proc/sys/vm/nr_hugepages
At the beginning, my understanding was that the system was originally managed in 4 KB pages, and that once I enable huge pages, all memory will be managed in huge pages.
But I read some explanations and compared the commands. It feels like this just creates a folder: the files in this folder are managed with huge pages, and anything not in this folder is still managed with 4 KB pages. In C, I can use `buffer = mmap(NULL, size, PROT_READ|PROT_WRITE, MAP_ANON|MAP_PRIVATE|MAP_HUGETLB, -1, 0);` to create huge-page-backed memory.
Is my understanding correct?
Gerrie
(193 rep)
May 17, 2021, 02:27 AM
0 votes • 1 answer • 122 views
Enabling Huge Pages on RHEL6 for Oracle 18C xe
I have been trying to switch from Oracle AMM to ASMM with huge pages. I have made the following changes on RHEL 6.
Added the following entry in /etc/sysctl.conf (as suggested by hugepages_setting.sh):
vm.nr_hugepages=777
Added the following entries in /etc/security/limits.conf:
oracle soft memlock 2831155
oracle hard memlock 2831155
Rebooted the server.
Changed the Oracle parameters memory_target, memory_max_target, sga_target, sga_max_target and use_large_pages to specific values.
After a database restart, I can see the following:
[root@rheloracle ~]# grep -i huge /proc/meminfo
AnonHugePages: 0 kB
HugePages_Total: 777
HugePages_Free: 8
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 2048 kB
And when I shut down the database, I can see that HugePages_Free is equal to HugePages_Total.
[root@rheloracle ~]# grep -i huge /proc/meminfo
AnonHugePages: 0 kB
HugePages_Total: 777
HugePages_Free: 777
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 2048 kB
It looks like the HugePages configuration at the DB level and the OS level is in sync and in use. But all the examples and documents I have referred to indicate that HugePages_Rsvd should have a non-zero value after enabling huge pages, which is not happening in my case. Can you please suggest whether I am missing something, or whether it's normal to have HugePages_Rsvd at 0?
(I am running oracle 18c xpress edition on RHEL6)
Prem
(243 rep)
Aug 20, 2020, 05:00 AM
• Last activity: Jan 26, 2021, 09:07 PM
1 vote • 1 answer • 301 views
Linux use huge pages only
I have an x64 Linux system. The page size reported by `getconf` is 4 k:
$ getconf PAGESIZE
4096
I want the kernel to use only large pages (2 M or 4 M) for all memory allocations. I've calculated that I have enough RAM to handle the memory that will be wasted because of this.
How do I configure the Linux kernel so that it uses large pages for all allocations?
fctorial
(203 rep)
Dec 31, 2020, 12:03 AM
• Last activity: Dec 31, 2020, 08:56 AM
3 votes • 2 answers • 1182 views
Discover huge page support on POSIX or Linux
I'm working on a program which needs to detect at runtime whether the system it's running on supports hugepages, and if so, what sizes are available. Ideally I'd like this to work for any POSIX platform, but a Linux-specific solution would be a start.
POSIX supports [`sysconf(_SC_PAGESIZE)`](http://man7.org/linux/man-pages/man3/sysconf.3.html) to get the default page size on the platform, but doesn't seem to similarly support asking for any hugepage sizes. I could also potentially check by trying [`mmap`](http://man7.org/linux/man-pages/man2/mmap.2.html) with `MAP_HUGE_2MB` or `MAP_HUGE_1GB` arguments, but that would be slow and, in the case of 1GB huge pages, incredibly wasteful (and it could easily fail due to a lack of available memory).
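(For the Linux-specific part, a minimal sketch of the sysfs route: every supported size shows up as a `hugepages-<size>kB` directory under `/sys/kernel/mm/hugepages`, so the sizes can be discovered without mapping anything:)

```c
/* Sketch (Linux-specific): report the huge page sizes the running kernel
 * supports by parsing the hugepages-<size>kB directory names in sysfs.
 * No mapping is attempted, so nothing is allocated or wasted. */
#include <dirent.h>
#include <stdio.h>

int main(void)
{
    DIR *d = opendir("/sys/kernel/mm/hugepages");
    struct dirent *e;

    if (!d) {
        /* Directory absent usually means no hugetlb support in this kernel. */
        perror("opendir");
        return 1;
    }
    while ((e = readdir(d)) != NULL) {
        unsigned long kb;
        /* "." and ".." don't match the pattern and are skipped. */
        if (sscanf(e->d_name, "hugepages-%lukB", &kb) == 1)
            printf("huge page size: %lu kB (%lu bytes)\n", kb, kb * 1024UL);
    }
    closedir(d);
    return 0;
}
```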
joshlf
(395 rep)
May 11, 2017, 11:01 PM
• Last activity: Dec 30, 2020, 10:27 PM
0 votes • 1 answer • 678 views
benefits of allocating huge pages at boot
[ moving the question from StackOverflow where it seems less appropriate ]
The kernel boots with the `default_hugepagesz=1G` option, which defines the default huge page size. So when an application wants a large amount of memory, the kernel will allocate it with 1G pages.
If the kernel boots with `hugepages=N`, it allocates N huge pages at boot. So in this case, will the kernel automatically take a page from this pool, thus saving time on allocating memory?
When this pool runs out of available pages, how will the kernel allocate huge memory?
Mark
(1943 rep)
Dec 30, 2020, 05:11 PM
• Last activity: Dec 30, 2020, 07:19 PM
0 votes • 1 answer • 2251 views
Is it possible to disable Transparent Huge pages on the fly?
In order to disable THP, we did the following on all 635 RHEL machines (we run RHEL 7.5).
These lines are from a bash script that we run on all machines.
**Step 1**
[[ -f /sys/kernel/mm/transparent_hugepage/enabled ]] && echo never > /sys/kernel/mm/transparent_hugepage/enabled
[[ -f /sys/kernel/mm/transparent_hugepage/defrag ]] && echo never > /sys/kernel/mm/transparent_hugepage/defrag
*Verification:*
cat /sys/kernel/mm/transparent_hugepage/enabled
always madvise [never]
cat /sys/kernel/mm/transparent_hugepage/defrag
always madvise [never]
But, as everyone knows, these steps do not persist when the machine is restarted/rebooted.
**Step 2**
So we also did this: we append the following lines to /etc/rc.local:
if test -f /sys/kernel/mm/transparent_hugepage/enabled; then
echo never > /sys/kernel/mm/transparent_hugepage/enabled
fi
if test -f /sys/kernel/mm/transparent_hugepage/defrag; then
echo never > /sys/kernel/mm/transparent_hugepage/defrag
fi
**The question is:**
Does step 1 as described above really disable THP on the fly?
Note: some additional info from one typical machine:
sysctl -a | grep hugepage
vm.hugepages_treat_as_movable = 0
vm.nr_hugepages = 0
vm.nr_hugepages_mempolicy = 0
vm.nr_overcommit_hugepages = 0
reference - [Configuring Transparent Huge Pages](https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/7/html/performance_tuning_guide/sect-red_hat_enterprise_linux-performance_tuning_guide-configuring_transparent_huge_pages) .
yael
(13936 rep)
Feb 20, 2020, 02:04 PM
• Last activity: Feb 20, 2020, 05:13 PM
1 vote • 2 answers • 3361 views
Zend OPcache huge_code_pages: madvise(HUGEPAGE) failed
I've got this error while running a PHP command-line script on a freshly installed server:
> PHP Warning: Zend OPcache huge_code_pages: madvise(HUGEPAGE) failed: Invalid argument
The server is running CentOS 7.3, with PHP 7.1.4 from the remi repository.
According to this thread on the remi forum and this thread on plesk.com, the solution is to disable `huge_code_pages` in php-opcache.ini:
opcache.huge_code_pages=0
However, Remi said that this problem should only occur on CentOS 6, not CentOS 7.
Before I disable `huge_code_pages` for good, **is there a solution to make it work?**
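(Not a fix, but a way to narrow it down: a minimal sketch that makes the same kind of `madvise(MADV_HUGEPAGE)` call on an anonymous mapping, outside of PHP — if this also fails with `EINVAL`, the kernel itself is refusing transparent huge pages for the mapping:)

```c
/* Sketch: ask for transparent huge pages on a plain anonymous mapping,
 * similar to what OPcache's huge_code_pages feature does, and print the
 * errno if the kernel rejects the request. */
#include <stdio.h>
#include <sys/mman.h>

int main(void)
{
    size_t len = 4 * 1024 * 1024;
    void *addr = mmap(NULL, len, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (addr == MAP_FAILED) {
        perror("mmap");
        return 1;
    }
    if (madvise(addr, len, MADV_HUGEPAGE) != 0) {
        perror("madvise(MADV_HUGEPAGE)");   /* EINVAL => kernel refuses THP here */
        return 1;
    }
    puts("madvise(MADV_HUGEPAGE) succeeded");
    return 0;
}
```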
BenMorel
(4849 rep)
May 5, 2017, 03:27 PM
• Last activity: Dec 15, 2019, 12:30 AM
1 vote • 0 answers • 418 views
Using libhugetlbfs to transparently back glibc malloc calls in a multi-threaded application
I'm trying to back the memory allocations of a multi-threaded application with 1GiB hugepages using libhugetlbfs. However, only the main thread's allocations are being assigned hugepages. If I restrict the maximum number of glibc malloc arenas to 1, all the allocations of all threads are backed by hugepages. This is not ideal due to the contention introduced by concurrently accessing a single arena.
Is there any way to transparently force all threads to use huge pages by means of libhugetlbfs?
**Note**: I'm aware of transparent huge pages (THP). However, allocations smaller than 1GiB are not automatically assigned hugepages. Smaller pages will only be compacted into bigger pages when the khugepaged kernel thread processes them, which is something I would not like to rely on. Ideally, I would like all malloc calls to be serviced using huge pages even if the allocations are small. This is useful for applications that do a lot of small allocations.
Experimentation
===============
These are the steps that I have followed to set up 1GiB hugepages:
sudo mkdir /dev/hugepages1G
sudo mount -t hugetlbfs -o uid=,pagesize=1g,min_size=50g none /dev/hugepages1G
sudo hugeadm --pool-pages-min 1G:50
I'm using the dummy application below for testing. The main thread allocates and initializes 1GiB of memory. Then, it creates three pthreads, each of which allocates and initializes 10GiB of memory.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <pthread.h>
#include <sys/types.h>
#include <sys/syscall.h>
void *iamathread(void *data)
{
char *addr;
char dummy;
size_t size, i;
size = 10*1024*1024*1024UL;
pid_t x = syscall(__NR_gettid);
addr = malloc(size);
if (!addr) {
perror("cannot allocate memory");
pthread_exit(NULL);
}
memset(addr, 1, size);
printf("%d:\t sleeping\n", x);
sleep(1000000U);
return NULL;
}
int main(int argc, char *agv[])
{
char *addr;
char dummy;
size_t size, i;
int npt;
npt = 3;
size = 1*1024*1024*1024UL;
pthread_t pt[npt];
for (i = 0; i < npt; i++) {
if (pthread_create(&pt[i], NULL, iamathread, NULL)) {
fprintf(stderr, "Error creating thread\n");
return 1;
}
}
pid_t x = syscall(__NR_gettid);
printf("%d:\t I'm main\n", x);
addr = malloc(size);
if (!addr) {
perror("cannot allocate memory");
return 1;
}
memset(addr, 1, size);
printf("Press any key to exit and release memory\n");
scanf("%c", &dummy);
return 0;
}
I have created the following script to count the number of pages per page size used by an application:
#!/usr/bin/bash
PID=$1
awk '
BEGIN {
tmp_size = -1
}
$1 == "Size:" {
tmp_size = $2
next
}
$1 == "KernelPageSize:" {
page_size = $2
vmas[page_size]["count"] += 1
vmas[page_size]["pages"] += tmp_size/page_size
tmp_size = -1
next
}
END {
for (key in vmas) {
print(key " KiB VMAs: " vmas[key]["count"])
}
for (key in vmas) {
print(key " KiB num pages: " vmas[key]["pages"])
}
}
' /proc/$PID/smaps
And these are the results obtained when running with and without the MALLOC_ARENA_MAX environment variable to limit the number of arenas:
$ LD_PRELOAD=/usr/lib64/libhugetlbfs.so HUGETLB_MORECORE=1G HUGETLB_PATH=/dev/hugepages1G ./main &
$ hugepagecount.sh $(pgrep main)
4 KiB VMAs: 41
1048576 KiB VMAs: 2
4 KiB num pages: 7922277
1048576 KiB num pages: 2
$ MALLOC_ARENA_MAX=1 LD_PRELOAD=/usr/lib64/libhugetlbfs.so HUGETLB_MORECORE=1G HUGETLB_PATH=/dev/hugepages1G ./main &
$ hugepagecount.sh $(pgrep main)
4 KiB VMAs: 37
1048576 KiB VMAs: 5
4 KiB num pages: 8802
1048576 KiB num pages: 32
When not limiting the number of arenas, only 2 1GiB (1048576 KiB) pages are allocated. Instead, when forcing a single arena, 32 1GiB pages are allocated.
aleixrocks
(305 rep)
Dec 3, 2019, 07:55 AM
• Last activity: Dec 3, 2019, 08:27 AM
Showing page 1 of 20 total questions