.. SPDX-License-Identifier: GPL-2.0

Orphan file
-----------

In Unix there can be inodes that are unlinked from the directory hierarchy but
are still alive because they are open. In case of a crash the filesystem has to
clean up these inodes, as otherwise they (and the blocks referenced from them)
would leak. Similarly, if we truncate or extend a file, we may not be able
to perform the operation in a single journalling transaction. In such a case we
track the inode as an orphan so that in case of a crash the extra blocks
allocated to the file get truncated.

Traditionally ext4 tracks orphan inodes in the form of a singly linked list
where the superblock contains the inode number of the last orphan inode
(s_last_orphan field) and each inode contains the inode number of the
previously orphaned inode (we overload the i_dtime inode field for this).
However, this filesystem-global singly linked list is a scalability bottleneck
for workloads that result in heavy creation of orphan inodes. When the orphan
file feature (COMPAT_ORPHAN_FILE) is enabled, the filesystem has a special
inode (referenced from the superblock through s_orphan_file_inum) with several
blocks. Each of these blocks has the following structure:

============= ================ =============== ===============================
Offset        Type             Name            Description
============= ================ =============== ===============================
0x0           Array of         Orphan inode    Each __le32 entry is either
              __le32 entries   entries         empty (0) or it contains
                                               inode number of an orphan
                                               inode.
blocksize-8   __le32           ob_magic        Magic value stored in orphan
                                               block tail (0x0b10ca04)
blocksize-4   __le32           ob_checksum     Checksum of the orphan block.
============= ================ =============== ===============================

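
The block layout can be illustrated with a small userspace sketch. It is not
the kernel implementation: the names ``orphan_block_tail``,
``orphan_slots_per_block`` and ``orphan_block_collect`` are hypothetical and
chosen only for this example, and checksum verification is left out because
the checksum algorithm is not described here. The sketch assumes only what the
table states: little-endian 32-bit entries starting at offset 0, followed by an
8-byte tail holding the magic and the checksum.

```c
#include <stddef.h>
#include <stdint.h>

/* Magic value taken from the table above. */
#define ORPHAN_BLOCK_MAGIC 0x0b10ca04u

/* Tail stored in the last 8 bytes of each orphan block (little-endian on disk). */
struct orphan_block_tail {
	uint32_t ob_magic;
	uint32_t ob_checksum;
};

/* Decode a little-endian 32-bit value independently of host byte order. */
static uint32_t get_le32(const uint8_t *p)
{
	return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
	       ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Number of inode slots that fit in front of the tail. */
static size_t orphan_slots_per_block(size_t blocksize)
{
	return (blocksize - sizeof(struct orphan_block_tail)) / sizeof(uint32_t);
}

/*
 * Hypothetical helper: store the non-empty entries of one orphan block in
 * out[] (at most out_cap of them). Returns the number of non-empty entries
 * found, or -1 if the magic in the tail does not match. The checksum is
 * deliberately not verified in this sketch.
 */
static int orphan_block_collect(const uint8_t *block, size_t blocksize,
				uint32_t *out, size_t out_cap)
{
	size_t slots = orphan_slots_per_block(blocksize);
	size_t i;
	int found = 0;

	if (get_le32(block + blocksize - sizeof(struct orphan_block_tail)) !=
	    ORPHAN_BLOCK_MAGIC)
		return -1;

	for (i = 0; i < slots; i++) {
		uint32_t ino = get_le32(block + i * sizeof(uint32_t));

		if (ino == 0)	/* empty slot */
			continue;
		if ((size_t)found < out_cap)
			out[found] = ino;
		found++;
	}
	return found;
}
```

With a 4096-byte block this leaves (4096 - 8) / 4 = 1022 inode slots per block.
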
When a filesystem with the orphan file feature is mounted writable, we set the
RO_COMPAT_ORPHAN_PRESENT feature in the superblock to indicate that there may
be valid orphan entries. If we see this feature when mounting the
filesystem, we read the whole orphan file and process all orphan inodes found
there as usual. When cleanly unmounting the filesystem, we remove the
RO_COMPAT_ORPHAN_PRESENT feature to avoid unnecessary scanning of the orphan
file and to make the filesystem fully compatible with older kernels.
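
The life cycle of the RO_COMPAT_ORPHAN_PRESENT feature can be sketched with a
toy in-memory model. Everything in it (``struct fs_model``, ``model_mount_rw``,
``model_add_orphan``, ``model_unmount_clean``) is hypothetical and exists only
for illustration. It models writable mounts only, flattens the orphan file into
one array of slots, and reduces "processing" an orphan to clearing its slot.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical in-memory model, not the kernel's data structures. */
struct fs_model {
	bool orphan_file;	/* models COMPAT_ORPHAN_FILE */
	bool orphan_present;	/* models RO_COMPAT_ORPHAN_PRESENT */
	uint32_t *slots;	/* orphan file entries, 0 means empty */
	size_t nr_slots;
};

/*
 * Writable mount: if the previous session left the "present" feature set,
 * scan every slot and process (here: clear) each orphan found. Afterwards
 * set the feature, since new orphans may appear while mounted.
 * Returns the number of orphans processed.
 */
static int model_mount_rw(struct fs_model *fs)
{
	int processed = 0;
	size_t i;

	if (!fs->orphan_file)
		return 0;
	if (fs->orphan_present) {
		for (i = 0; i < fs->nr_slots; i++) {
			if (fs->slots[i] != 0) {
				fs->slots[i] = 0;	/* stand-in for cleanup */
				processed++;
			}
		}
	}
	fs->orphan_present = true;
	return processed;
}

/* Record an orphan in the first empty slot; -1 if the file is full. */
static int model_add_orphan(struct fs_model *fs, uint32_t ino)
{
	size_t i;

	for (i = 0; i < fs->nr_slots; i++) {
		if (fs->slots[i] == 0) {
			fs->slots[i] = ino;
			return 0;
		}
	}
	return -1;
}

/* Clean unmount: drop the feature so the next mount can skip the scan. */
static void model_unmount_clean(struct fs_model *fs)
{
	fs->orphan_present = false;
}
```

In this model a crash is simply the absence of a ``model_unmount_clean()``
call: the feature stays set, so the next ``model_mount_rw()`` scans the slots
and cleans up whatever was left behind.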