How does mount load the right kernel module?

On any recent Linux distro, you can mount a filesystem without naming its type or loading its kernel module first. For example:

# dd if=/dev/zero of=/tmp/test.img bs=4k count=4096
# mkfs.xfs /tmp/test.img
# mount -v -o loop /tmp/test.img /mnt/tmp

The mount command works even though I didn’t have the xfs.ko kernel module loaded, and I didn’t tell mount that it’s xfs.

How does it do that? I asked several people at work and no one could give me the correct answer. So in this article I’ll describe exactly how it works.

First of all, I’ll mention two wrong answers to this: (a) the kernel doesn’t have a magic “mount any filesystem” syscall, and (b) it’s nothing to do with either /proc/filesystems or /etc/filesystems.

For (a), the mount(2) syscall clearly takes a filesystem type (string). As for (b), /proc/filesystems only lists filesystems which are already known to the kernel, i.e. ones which are built in or for which we’ve already loaded the right module. Since I didn’t have the xfs module loaded, xfs wasn’t listed in /proc/filesystems before I ran the mount command.
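If you want to check point (b) on your own machine, something like the following should do. It is only a sketch: fs_registered is a name I made up, and it assumes the usual /proc/filesystems layout in which the filesystem type is the last field on each line (optionally preceded by "nodev").

```shell
# fs_registered TYPE [FILE]
# Succeeds if TYPE is listed in FILE (default: /proc/filesystems).
# The optional FILE argument exists only so the function can be tried on a copy.
fs_registered() {
    awk -v t="$1" '$NF == t { found = 1 } END { exit !found }' \
        "${2:-/proc/filesystems}"
}
```

Running "fs_registered xfs && echo registered" before and after the mount command shows whether the kernel knew about xfs at each point.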

This should be enough of a clue that there must be some utility in userspace which knows how to probe the type by just looking at the header of any arbitrary filesystem. This utility is blkid, which used to be part of e2fsprogs but has now been combined with util-linux-ng.

blkid can probe a filesystem that it has not seen before and tell what type it is:

# blkid /tmp/test.img 
/tmp/test.img: UUID="c80ebc11-3b26-4b93-acbb-f52bdfaa9ac5" TYPE="xfs" 

Looking at the source for blkid confirms there is a directory full of probe tools for every conceivable filesystem.
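To give a flavour of what those probe routines do, here is a rough sketch in shell. It is not libblkid’s actual code, and the names hex_at and probe_fs_type are made up for this example. It knows just two signatures: an XFS superblock begins with the bytes "XFSB" at offset 0, and ext2/3/4 store the little-endian magic 0xEF53 at byte 1080 (offset 56 in the superblock, which starts at byte 1024). The real probes check considerably more than a magic number.

```shell
# hex_at FILE OFFSET COUNT: print COUNT bytes at OFFSET as lowercase hex.
hex_at() {
    dd if="$1" bs=1 skip="$2" count="$3" 2>/dev/null | od -An -tx1 | tr -d ' \n'
}

# probe_fs_type FILE: print a guess at the filesystem type held in FILE.
probe_fs_type() {
    # "XFSB" is 58 46 53 42 in hex.
    if [ "$(hex_at "$1" 0 4)" = "58465342" ]; then
        echo xfs
        return 0
    fi
    # 0xEF53 stored little-endian appears on disk as 53 ef.
    if [ "$(hex_at "$1" 1080 2)" = "53ef" ]; then
        echo ext
        return 0
    fi
    echo unknown
    return 1
}
```

Running "probe_fs_type /tmp/test.img" on the image created earlier should print xfs.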

The mount utility calls out to blkid — actually to the libblkid library, not to the command line tool, but it comes to the same thing.

So /bin/mount knows what it’s mounting, and requests the “xfs” filesystem type when it issues the system call into the kernel.

That still leaves the question of how the xfs module gets loaded. The answer is that the mount syscall eventually calls the kernel function __request_module. This strange function actually calls out to the userspace /sbin/modprobe binary, causing the module to get loaded. Meanwhile the mount syscall itself is paused. And yes, it even deals with the recursive situation where modprobe might need to mount filesystems or load other modules in order to succeed.
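You can watch this from userspace. The sketch below uses two made-up helper names: modprobe_helper prints the path of the helper binary the kernel is configured to run, which I am assuming is exposed in /proc/sys/kernel/modprobe as on the systems I have seen, and module_loaded checks /proc/modules, where the module name is the first field of each line. Note that a filesystem compiled into the kernel never appears in /proc/modules.

```shell
# modprobe_helper [FILE]
# Print the module loader path the kernel will call out to.
modprobe_helper() {
    cat "${1:-/proc/sys/kernel/modprobe}"
}

# module_loaded MODULE [FILE]
# Succeeds if MODULE is currently loaded according to FILE
# (default: /proc/modules).
module_loaded() {
    awk -v m="$1" '$1 == m { found = 1 } END { exit !found }' \
        "${2:-/proc/modules}"
}
```

Calling "module_loaded xfs" just before and just after the mount command should fail the first time and succeed the second, provided xfs is built as a module.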

So there you have it: mounting a filesystem can magically load the right kernel module for that filesystem, all done using some userspace probing and some kernel trickery.


6 responses to “How does mount load the right kernel module?”

  1. Nice post. I didn’t know blkid!

    Thank you for sharing.

  2. What a coincidence! I looked up blkid command today to fetch the UUIDs of my external HDD partitions.
    Nice post. Very informative. Thanks.

  3. Good post.

    Personally I’m a big fan of the modalias system used to automatically load appropriate modules for the hardware in your system. I think few people realize the kernel has been doing this for a few years now, whereas previously distributions had gigantic lookup tables which were used for generating static modprobe.conf files…and probably even fewer follow the whole chain the modalias system uses. Which is great, because it means it just works and mostly people don’t need to worry about it. But it’s awesomely elegant, IMO.

  4. Any idea how 2.4 kernels used to do this? To my knowledge blkid is a relatively recent effort…

    • rich

      No, I’ve no idea. The reason we looked at this was because our RHEL 3 test machine didn’t autoload a filesystem. Maybe Linux 2.4 didn’t do this.

  5. vicky

    Very informative post…I also have a similar requirement to load a cifs fs. Currently it is not loading at the mount time. Hope this info will help me out
