This is fun. I added a new command to guestfish which lets you create sparse disk files. This makes it really easy to test out the limits of partitions and Linux filesystems.
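Under the hood, a sparse file is simply a file whose length has been set without allocating any blocks. If you want to try the same thing outside guestfish, something like this works on any Linux host (a sketch of the general technique, not the actual guestfish implementation):

$ truncate -s 1T /tmp/test.img

or, equivalently, with dd seeking past the end without writing any data:

$ dd if=/dev/zero of=/tmp/test.img bs=1 count=0 seek=1T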
Starting modestly, I tried a 1 terabyte disk:
$ guestfish

Welcome to guestfish, the libguestfs filesystem interactive shell for
 editing virtual machine filesystems.

 Type: 'help' for help with commands
       'quit' to quit the shell

><fs> sparse /tmp/test.img 1T
><fs> run
The real disk image so far isn’t so big, just 4K according to “du”:
$ ll -h /tmp/test.img
-rw-rw-r-- 1 rjones rjones 1T 2009-11-04 17:52 /tmp/test.img
$ du -h /tmp/test.img
4.0K    /tmp/test.img
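Another way to see both numbers at once is stat, which can report the apparent size alongside the blocks actually allocated (a side note, not part of the original session):

$ stat -c '%s bytes, %b blocks' /tmp/test.img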
Let’s partition it (the lone “,” is sfdisk shorthand for a single partition spanning the whole disk):
><fs> sfdiskM /dev/vda ,
The partition table only uses 1 sector, so the disk image has increased to just 8K. Let’s make an ext2 filesystem on the first partition:
><fs> mkfs ext2 /dev/vda1
This command takes some time, and while it runs the sparse disk file grows, ending up at 17 GB. 17 GB out of a 1T virtual disk is roughly 1.7%, so that is ext2’s approximate metadata overhead.
We can mount the filesystem and look at it (the gap between Size and Avail below is ext2’s default 5% reserved-blocks allowance):
><fs> mount /dev/vda1 /
><fs> df-h
Filesystem            Size  Used Avail Use% Mounted on
/dev/vda1            1008G   72M  957G   1% /sysroot
Can we try this with larger and larger virtual disks? In theory yes; in practice the 1.7% overhead proves to be a problem. A 10T experiment would require a very real 170 GB of local disk space, and where I was hoping to go, 100T and beyond, would be too large for my test machines.
In fact there is another limitation before we even get there. Local sparse files on my host ext4 filesystem are themselves limited to just under 16T:
><fs> sparse /tmp/test.img 16T
write: File too large
><fs> sparse /tmp/test.img 15T
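The 16T ceiling comes from ext4 itself: with the default 4K block size, the maximum file size is 16TiB. You can reproduce the failure from the host shell directly (a host-side check, not part of the original session):

$ truncate -s 16T /tmp/test.img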
Although the appliance does boot with that 15T virtual disk (15 × 2^40 = 16492674416640 bytes):
><fs> blockdev-getsize64 /dev/vda
16492674416640
Update
I noticed from Wikipedia that XFS has a maximum file size of 8 exabytes minus 1 byte. By creating a temporary XFS filesystem on the host, I was able to create a 256TB virtual disk:
><fs> sparse /mnt/tmp/test/test.img 256T
><fs> run
><fs> blockdev-getsize64 /dev/vda
281474976710656
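For anyone wanting to reproduce this, the host-side setup might have looked something like the following sketch. /dev/sdb1 here is a hypothetical spare partition; any spare block device or LVM volume would do:

# mkfs.xfs /dev/sdb1
# mkdir -p /mnt/tmp/test
# mount /dev/sdb1 /mnt/tmp/test

Note that 256T is exactly 2^48 bytes, which matches the blockdev-getsize64 output above.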
Unfortunately at this point things break down. MBR partitions won’t work on such a huge disk (the MBR format stores sector offsets in 32-bit fields, which with 512-byte sectors tops out around 2T), or at least sfdisk can’t partition it correctly.
I’m not sure what my options are at this point, but at least this is an interesting experiment in hitting limitations.
You can partition it using GPT.
parted /dev/vda mklabel gpt
parted /dev/vda mkpart p …..
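For example, a complete invocation might look like this (a sketch with guessed values, not part of the original comment; on a GPT disk mkpart takes a partition name followed by the start and end positions):

parted -s /dev/vda mklabel gpt
parted -s /dev/vda mkpart p1 1MiB 100%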
Yeah, we really really need to implement parted support in libguestfs. This issue makes it more pressing. sfdisk just doesn’t cut it.
That’s correct, >2TB filesystems are normal these days.
parted is the way to go.