HugePage: Enable for PostgreSQL

How to allocate precise huge pages for PostgreSQL clusters?

Pros and Cons of Huge Pages

For databases, enabling huge pages has both advantages and disadvantages.

Advantages:

  • Significant performance benefits in OLAP scenarios: large data scanning and batch computing
  • More controlled memory allocation model: “lock” required memory at startup
  • Improved memory access efficiency, reduced TLB misses
  • Lower kernel page table maintenance overhead

Disadvantages:

  • Additional configuration and maintenance complexity
  • Locked huge page memory, less flexibility for systems requiring high resource elasticity
  • Limited benefits in small memory scenarios, potentially counterproductive

Note: Huge Pages and Transparent Huge Pages (THP) are different concepts. Pigsty will force disable Transparent Huge Pages to follow database best practices.


When to Enable Huge Pages?

We recommend enabling huge pages if your scenario meets the following conditions:

  • OLAP analytical workloads
  • Memory exceeding several tens of GB
  • PostgreSQL 15+
  • Linux kernel version > 3.10 (> EL7, > Ubuntu 16)

Pigsty does not enable huge pages by default, but you can easily enable them through configuration and dedicate them to PostgreSQL.


Allocate Node Huge Pages

To enable huge pages on a node, users can use these two parameters:

Choose one of these parameters. You can either specify the exact number of (2MB) huge pages or the memory ratio to be allocated as huge pages (0.00 - 0.90), with the former having higher priority.

node_hugepage_count: 0            # Precisely specify number of 2MB huge pages, higher priority
node_hugepage_ratio: 0            # Memory ratio allocated as 2MB huge pages, lower priority

Apply changes:

./node.yml -t node_tune

Essentially, this writes the vm.nr_hugepages parameter to /etc/sysctl.d/hugepage.conf and executes sysctl -p to apply changes.

./node.yml -t node_tune -e node_hugepage_count=3000    # Allocate 3000 2MB huge pages (6GB)
./node.yml -t node_tune -e node_hugepage_ratio=0.30    # Allocate 30% of memory as huge pages

Note: These parameters enable huge pages for the entire node, not just PostgreSQL.

PostgreSQL server attempts to use huge pages by default at startup. If insufficient huge pages are available, PostgreSQL will continue to start using regular pages.

When reducing huge page count, only unused and free huge pages will be released. Used huge pages will be released after process termination.

Pigsty allows allocating up to 90% of memory as huge pages, but for PostgreSQL databases, a reasonable range is typically 25% - 40% of memory.

We recommend setting: node_hugepage_ratio=0.30 and further adjusting huge page count after PostgreSQL startup as needed.


Monitor Huge Page Status

The most intuitive way is to use Pigsty’s monitoring system. Here’s a sample monitoring chart when adjusting huge pages:

  1. Default state
  2. Huge pages enabled, unused
  3. PG restarted, using/reserving some huge pages
  4. Further PG usage, using more huge pages
  5. Reduced huge page count, reclaimed unused pages
  6. PG restarted, completely released reserved pages

You can directly check huge page status with cat /proc/meminfo | grep Huge:

$ cat /proc/meminfo  | grep Huge

By default, with no huge pages enabled, the total count is 0:

AnonHugePages:      8192 kB
ShmemHugePages:        0 kB
FileHugePages:         0 kB
HugePages_Total:       0
HugePages_Free:        0
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:       2048 kB
Hugetlb:               0 kB

With huge pages enabled, showing 6015 total huge pages, all free and available:

AnonHugePages:      8192 kB
ShmemHugePages:        0 kB
FileHugePages:         0 kB
HugePages_Total:    6015
HugePages_Free:     6015
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:       2048 kB
Hugetlb:        12318720 kB

If PostgreSQL is restarted now (it attempts to use huge pages by default):

sudo su - postgres
pg-restart

PostgreSQL will reserve (Rsvd, Reserved) the required huge pages for shared buffers, for example reserving 5040 here:

AnonHugePages:      8192 kB
ShmemHugePages:        0 kB
FileHugePages:         0 kB
HugePages_Total:    6015
HugePages_Free:     5887
HugePages_Rsvd:     5040
HugePages_Surp:        0
Hugepagesize:       2048 kB
Hugetlb:        12318720 kB

If we add some load to PostgreSQL, like pgbench -is10, PostgreSQL will start using more huge pages (Alloc = Total - Free).

Note: Once huge pages are (allocated or reserved), even if you reduce the system’s vm.nr_hugepages parameter, these pages will remain reserved until used up. Therefore, to truly reclaim these huge pages, you need to restart the PostgreSQL service.

./node.yml -t node_tune -e node_hugepage_count=3000    # Allocate 3000 huge pages

Precise Huge Page Allocation

Before PostgreSQL starts, you need to allocate sufficient huge pages, otherwise PostgreSQL won’t be able to use them.

In Pigsty, the default SharedBuffer doesn’t exceed 25% of memory, so you can allocate 26% ~ 27% of memory as huge pages to ensure PostgreSQL can use them.

node_hugepage_ratio: 0.27  # Allocate 27% of memory as huge pages, definitely enough for PG

If you don’t mind slight resource waste, you can directly allocate around 27% of memory as huge pages.


Reclamation Script

After PG starts, you can query the actual number of huge pages PostgreSQL uses with this SQL:

SHOW shared_memory_size_in_huge_pages;

Finally, you can precisely specify the required number of huge pages:

node_hugepage_count: 3000   # Precisely allocate 3000 2MB huge pages (6GB)

However, to precisely count the required number of huge pages without missing any, you typically need to wait until the PostgreSQL server starts.

A compromise approach is to:

  1. Initially over-allocate huge pages
  2. Start PostgreSQL
  3. Query the exact required number from PG
  4. Then precisely modify the huge page count

Dedicate Huge Pages to PG

By default, all processes can use huge pages. If you want to restrict huge page usage to PostgreSQL only, you can modify the vm.hugetlb_shm_group kernel parameter.

You can adjust the node_sysctl_params parameter to include PostgreSQL’s GID.

node_sysctl_params:
  vm.hugetlb_shm_group: 26
node_sysctl_params:
  vm.hugetlb_shm_group: 543

Note: Default PostgreSQL UID/GID values differ between EL/Debian, being 26 and 543 respectively (can be explicitly modified via pg_dbsu_uid)

To remove this change:

sysctl -p /etc/sysctl.d/hugepage.conf

Quick Adjustment Script

The pg-tune-hugepage script can reclaim wasted huge pages, but it’s only available for PostgreSQL 15+.

If your PostgreSQL is already running, you can enable huge pages (PG15+ only) using:

sync; echo 3 > /proc/sys/vm/drop_caches   # Flush disk, free system cache (be prepared for performance impact)
sudo /pg/bin/pg-tune-hugepage             # Write nr_hugepages to /etc/sysctl.d/hugepage.conf
pg restart <cls>                          # Restart postgres to use hugepage

Sample output of executing pg-tune-hugepage:

$ /pg/bin/pg-tune-hugepage
[INFO] Querying PostgreSQL for hugepage requirements...
[INFO] Added safety margin of 0 hugepages (5168 → 5168)
[INFO] ==================================
PostgreSQL user: postgres
PostgreSQL group ID: 26
Required hugepages: 5168
Configuration file: /etc/sysctl.d/hugepage.conf
[BEFORE] ================================
Current memory information:
AnonHugePages:      8192 kB
ShmemHugePages:        0 kB
FileHugePages:         0 kB
HugePages_Total:   10025
HugePages_Free:     9896
HugePages_Rsvd:     5039
HugePages_Surp:        0
Hugepagesize:       2048 kB
Hugetlb:        20531200 kB
Current sysctl settings:
vm.hugetlb_shm_group = 26
vm.nr_hugepages = 10025
vm.nr_hugepages_mempolicy = 10025
[EXECUTE] ===============================
Writing new hugepage configuration...
Applying new settings...
vm.nr_hugepages = 5168
vm.hugetlb_shm_group = 26
[AFTER] =================================
Updated memory information:
AnonHugePages:      8192 kB
ShmemHugePages:        0 kB
FileHugePages:         0 kB
HugePages_Total:    5168
HugePages_Free:     5039
HugePages_Rsvd:     5039
HugePages_Surp:        0
Hugepagesize:       2048 kB
Hugetlb:        10584064 kB
Updated sysctl settings:
vm.hugetlb_shm_group = 26
vm.nr_hugepages = 5168
vm.nr_hugepages_mempolicy = 5168
[DONE] ==================================
PostgreSQL hugepage configuration complete.

Consider adding the following to your inventory file:
node_hugepage_count: 5168
node_sysctl_params: {vm.hugetlb_shm_group: 26}

References


Last modified 2025-03-22: update task tutorial (a20aa5b)