
Is it a good idea to have inode size of 2048 bytes + inline_data on an ext4 filesystem?

3 votes
1 answer
70 views
I only recently "discovered" the inline_data feature of ext4, although it seems to have been around for 10+ years. I ran a few statistics on several of my systems (desktop/notebook + server), specifically on the root filesystems, and found that:

- Around 5% of all files are < 60 bytes in size. The 60 byte threshold is relevant because that's how much inline data you can fit in a standard 256 byte inode.
- Another ~20-25% of files are between 60 and 896 bytes in size. Again, the "magic number" 896 is how much you can fit in a 1KB inode.
- A further 20% are in the 896-1920 byte range (you guessed it: 1920 is what you can fit into a 2KB inode).
- The percentages are even more striking for directories: 30-35% are below 60 bytes, and a further 60% are below 1920 bytes.

This means that with an inode size of 2048 bytes you can ***inline roughly half of all files and 95% of all directories on an average root filesystem***! This came as quite a shock to me...

Now, of course, since inodes are preallocated and fixed for the lifetime of a filesystem, large inodes lead to a lot of "wasted" space if you have a lot of them (i.e. a low inode_ratio setting). But then again, allocating a 4KB block for a 5 byte file is also a waste of space. And according to the statistics above, half of all files on the filesystem and virtually all directories can't even fill half of a 4KB block, so that wasted space is not insignificant. The only difference between wasting that space in the inode table and wasting it in the data blocks is that with data blocks you have one more level of indirection, plus the potential for fragmentation, etc.

The advantages I see in that setup are:

- When the kernel loads an inode, it reads at least one page (4KB) from disk, no matter if the inode is 128 bytes or 2KB, so you have zero overhead in terms of raw disk IO...
- ... but you have the data preloaded as soon as you stat the file, with no additional IO needed to read the contents.
- The kernel caches inodes more aggressively than data blocks, so inlined data is more likely to stay in cache longer.
- Inodes are stored in a fixed, contiguous region of the partition, so you can't ever have fragmentation there.
- Inlining is especially useful for directories, a) since such a high portion of them are small, and b) because you're very likely to need the contents of the directory, so having it preloaded makes a lot of sense.

What do you think about this setup? Am I missing something here, and are there some potential risks I don't see?

I stress again that I'm talking about a root filesystem, hosting basically the operating system, config files, and some caches and logs. Obviously the picture would be quite different for a /home partition hosting user directories, and even more different for a fileserver, webserver, mailserver, etc.

(I know there are a few threads describing some corner cases where inline_data does not play well with journaling, but those are 5+ years old, so I hope those issues have been sorted out.)

**EDIT**: Since doubts were expressed in the comments about whether directory inlining works: it does. I have already implemented the setup described here, and the machine I'm writing on right now is actually running on a root filesystem with 2KB inodes and inlining. Here's what /usr looks like in ls:
```
# ls -l /usr
total 160
drwxr-xr-x   2 root root 36864 Jul  1 00:35 bin
drwxr-xr-x   2 root root    60 Mar  4 13:20 games
drwxr-xr-x   4 root root  1920 Jun 16 21:32 include
drwxr-xr-x  64 root root  1920 Jun 25 21:16 lib
drwxr-xr-x   2 root root  1920 Jun  9 01:48 lib64
drwxr-xr-x  16 root root  4096 Jun 22 02:58 libexec
drwxr-xr-x  11 root root  1920 Jun  9 00:10 local
drwxr-xr-x   2 root root 12288 Jun 26 20:22 sbin
drwxr-xr-x 191 root root  4096 Jun 26 20:22 share
drwxr-xr-x   2 root root    60 Mar  4 13:20 src
```

And if you dive even deeper and use debugfs to examine those directories, the ones with a size of 60 or 1920 bytes have 0 allocated data blocks, while those with a size of 4096 bytes or more do have data blocks.
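
For anyone who wants to check the file-size percentages on their own system, here is a minimal sketch of how such a distribution could be gathered. It is not the script used for the numbers above. The function names are made up for illustration, the 60/896/1920 byte limits are simply taken from the question, and "fits inline" is assumed to mean a size less than or equal to the limit.

```python
import os
import stat
from collections import Counter

# Inline capacities quoted in the question (taken as given, not verified here).
LIMITS = [
    (60, "<=60"),      # said to fit a 256-byte inode
    (896, "<=896"),    # said to fit a 1 KiB inode
    (1920, "<=1920"),  # said to fit a 2 KiB inode
]


def bucket(size):
    """Return the label of the smallest inline capacity that holds `size` bytes."""
    for limit, label in LIMITS:
        if size <= limit:
            return label
    return ">1920"


def histogram(sizes):
    """Count how many sizes fall into each bucket."""
    return Counter(bucket(s) for s in sizes)


def inlinable_share(sizes, limit):
    """Fraction of sizes that are at most `limit` bytes (0.0 for an empty input)."""
    sizes = list(sizes)
    if not sizes:
        return 0.0
    return sum(1 for s in sizes if s <= limit) / len(sizes)


def _same_device(path, dev):
    """True if `path` (not following symlinks) lives on device `dev`."""
    try:
        return os.lstat(path).st_dev == dev
    except OSError:
        return False


def regular_file_sizes(root):
    """Collect sizes of regular files under `root`, staying on one filesystem."""
    root_dev = os.lstat(root).st_dev
    sizes = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune subdirectories on other devices so the walk skips other mounts.
        dirnames[:] = [
            d for d in dirnames
            if _same_device(os.path.join(dirpath, d), root_dev)
        ]
        for name in filenames:
            try:
                st = os.lstat(os.path.join(dirpath, name))
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if stat.S_ISREG(st.st_mode):
                sizes.append(st.st_size)
    return sizes
```

Running `histogram(regular_file_sizes("/"))` as root gives the per-bucket counts, and `inlinable_share(sizes, 1920)` gives the fraction of files that would fit into a 2KB inode under the assumptions above. Zero-length files land in the smallest bucket even though they never occupy a data block, so they may be worth counting separately. Directories are left out on purpose: on a filesystem without inline_data their reported size is a multiple of the block size, so `st_size` says nothing about whether they would fit inline.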
Asked by Mike (477 rep)
Jul 1, 2025, 02:05 PM
Last activity: Jul 2, 2025, 04:28 PM