febootstrap now includes an image minimization tool which can remove some “non-essential” data, such as locales. (You can configure exactly what it removes.)
The bootable minimal install is now around 16 MB. There’s still room for improvement, for example by removing shared libraries that are never used.
Even my realistic image, which includes LVM, NTFS, NFS server and rescue-disk utilities, isn’t too bad, weighing in comfortably under 32 MB:
Holy crap that’s pretty amazing.
Question 1: what is the high watermark on disk space needed during the whole process? Package caching included.
Question 2: Now that you have identified non-essential data, can packages be adjusted so that some of that post-install pruning can be avoided by never downloading that data to begin with? Are there some obvious candidates for sub-packaging?
Question 3: How much of this pruning survives updating the image with package component updates? This bears on question 2.
-jef
Jef, I was going to have a really detailed answer, but it’s probably better just to look at the script itself:
http://www.annexia.org/tmp/febootstrap-minimize.sh.txt
As you can see, some of these render the question of updates meaningless – eg. ripping out the RPM database.
Some of them are only appropriate for non-interactive cases (dropping locales, timezones).
And a few of them would be appropriate for either subpackages or even bug reports. glibc’s massive locale cache, which I covered before, is one example. The coreutils -> pam -> cracklib dependency is another.
Once the image gets very small, even relatively small files start to consume a significant percentage of the space. For example /etc/services is 409K in Fedora 10, and that’s a few percent of a 16 MB image. The minimization script replaces it, but equally we could consider compressing it (gzipped 115K) or removing some obsolete entries.
I’ve done this exercise a couple times in the past. Once with Fedora 6 (for an appliance which needed a JVM and only had 250MB of flash in it) and again for another project using CentOS 5. Both times I rebuilt glibc-common without all of the locale and i18n garble-de-gook. Would be very nice if this stuff was in a separate subpackage already.
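To put numbers on the /etc/services point above: 409K is roughly 2.5% of a 16 MB image, and the gzipped 115K would be roughly 0.7%. Here is a minimal Python sketch of that arithmetic. The helper names are hypothetical, and the sample data is a synthetic stand-in for a services file (not the real Fedora 10 one), so the compression ratio it prints is illustrative only.

```python
import gzip


def gzip_saving(data: bytes, image_bytes: int) -> dict:
    """Report how much of an image a file occupies, raw and gzipped."""
    packed = gzip.compress(data, compresslevel=9)
    return {
        "raw_bytes": len(data),
        "gzip_bytes": len(packed),
        "raw_share": len(data) / image_bytes,
        "gzip_share": len(packed) / image_bytes,
    }


def synthetic_services(entries: int = 2000) -> bytes:
    """Build made-up services-style lines; this is NOT the real file."""
    lines = [
        f"svc{port}\t{port}/tcp\t# hypothetical service {port}\n"
        for port in range(1, entries + 1)
    ]
    return "".join(lines).encode("ascii")


if __name__ == "__main__":
    image_bytes = 16 * 1024 * 1024  # the 16 MB minimal image from the post

    # Figures quoted in the post: 409K raw, 115K gzipped.
    for label, kib in (("raw", 409), ("gzipped", 115)):
        print(f"{label}: {kib * 1024 / image_bytes:.1%} of the image")

    # Same calculation on the synthetic stand-in data.
    report = gzip_saving(synthetic_services(), image_bytes)
    print(report)
```

To try it on a real file, read its bytes with `open(path, "rb").read()` and pass them to `gzip_saving`.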
WRT cracklib and things like it, it is a perfect example of why implementing “soft requires” in RPM would be nice/helpful.
Another suggestion of something I’d like to see is the use of a %{minimal} macro of some sort that could be honored in spec files. Then one can set it to true when rpmbuild-ing packages. For example, when %{minimal} is true, don’t include docs in the build of the package. Or don’t require cracklib. Or don’t require/build against SELinux libs, or kerberos libs, or whatnot.
yeah…ripping out the rpmdb sort of makes traditional updating a pointless question.
I guess I need to understand the use case better.
I would imagine even for non-interactive customized virtual appliance instances that do useful work that you might want to build up and tear down, there’s going to be a need to refresh that image with new components on occasion to fix or extend functionality: essentially versioning the image on the shelf.
-jef
Jef, I’m hoping in that case to just build a new image. That’s why keeping the time taken to build images right down is important. So is keeping the image size small, and making it boot “instantly”.
I’m going to talk about the use case (which is very specialized indeed) in some upcoming postings, and I don’t want to spoil the surprise 🙂
This is some nice work. I’ve only been reading planet fedora for a few days, but I’m happy I joined in time to read this.
Is there an easy way to try febootstrap on another distro (preferably gentoo or ubuntu)? Or do I need half of fedora to get it to work?
Michael, on Ubuntu/Debian you should probably just use debootstrap. Not sure about Gentoo though.
I meant to bootstrap a minimal fedora system, not to build a minimal version of some other distro. I’m still looking for a nice binary-based small setup that might be usable on stuff like the Sheevaplug, and this also seems like a nice chance to play with fedora.
Michael, I see. Very much depends on whether yum has been ported to Debian. If it has, then it should “just work”.
Try UPX, maybe it would be useful.
Yes UPX is a good idea. Although the whole image gets compressed for the initrd.img, it is uncompressed in memory when the machine boots. So compressing binaries is useful.
I’ve tried it and it’s not a big win.
For my custom livecd I told upx to compress all executables that exist on the livecd. This reduced the size of the internal ext3 image by around 50MB, but since this gets compressed inside the squashfs image the final result was the same. My livecd ISO was 81MB both with and without upx.
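The livecd result above is what one would expect when an already compressed payload goes through a second compressor. The following Python sketch illustrates the general effect only: zlib stands in for both stages, and neither UPX nor squashfs is actually involved. The sample payload and function names are made up for the illustration.

```python
import random
import zlib


def sample_payload(n_words: int = 50000, seed: int = 0) -> bytes:
    """Generate compressible text from a small made-up vocabulary."""
    rng = random.Random(seed)
    vocab = [f"sym{i:03d}" for i in range(200)]
    return " ".join(rng.choice(vocab) for _ in range(n_words)).encode("ascii")


def sizes(data: bytes) -> tuple:
    """Return (raw, compressed once, compressed twice) sizes in bytes."""
    once = zlib.compress(data, 9)
    twice = zlib.compress(once, 9)  # second pass over compressed bytes
    return len(data), len(once), len(twice)


if __name__ == "__main__":
    raw, once, twice = sizes(sample_payload())
    print(f"raw={raw} once={once} twice={twice}")
```

The first pass shrinks the data a lot, while the second pass has almost nothing left to remove. That matches the observation that packing executables first did not change the final compressed ISO size. As noted earlier in the thread, it could still matter for the in-memory footprint once the image is uncompressed.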
When are you planning to put this up for review? Without being in the repository, the visibility of this tool is far less.
Rahul, soon, in a few days probably. Specifically I need to sort out the fakechroot patch first.
I have installed Fedora 11 and am using KVM on an HP Pavilion laptop that supports AMD-V. This configuration is very similar to running VMWare workstation. Ultimately, I would prefer to install a minimal bare metal hypervisor like ESX and run my application-based OSes as guests. I am intrigued by febootstrap as a candidate for this. I could install the bare metal OS and drivers, then boot, use the command line to install my virtual appliance guest (Fedora 12) and, once up and running, use a full Fedora guest to connect back to the bare metal hypervisor for monitoring and control. Any advice on the best way to proceed?
Performance-wise I am not yet impressed with KVM. Though I easily converted my VMWare images to KVM, the performance on my AMD-V based computer is not as good as that of a less powerful 32-bit x86 processor without KVM running VMWare workstation. My processor is a 2.0 GHz 64-bit dual-core RM-70 Turion. Currently, one guest OS is about all it can muster. Is there room for more performance improvement in KVM?
PJAL:
You need to install virtio drivers in the guest to get good performance from KVM.
Any other questions should go to the fedora-virt mailing list.
Richard,
On a system that has Perl you can delete the pod files as well: /usr/lib/perl5/5.*/pod
This saved me about 5MB.
There could be some others as well, I have to check.
If you have Python you can delete all *.pyc and *.pyo files and replace them with symlinks to /dev/null. Anaconda uses this trick.
This saved another 2MB approximately for me.
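The *.pyc / *.pyo suggestion above could be sketched in Python roughly as follows. This is only an illustration under assumptions: the function name is hypothetical, it is not taken from the febootstrap minimization script, and it defaults to a dry run so that nothing is modified by accident.

```python
import os


def null_out_bytecode(root: str, dry_run: bool = True) -> int:
    """Replace *.pyc and *.pyo files under root with symlinks to /dev/null.

    Returns the number of bytes occupied by the affected files. With
    dry_run=True the tree is only scanned and nothing is changed.
    """
    saved = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.endswith((".pyc", ".pyo")):
                continue
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                continue  # already a symlink, e.g. from an earlier run
            saved += os.lstat(path).st_size
            if not dry_run:
                os.remove(path)
                os.symlink("/dev/null", path)
    return saved


if __name__ == "__main__":
    # "./image-root" is a placeholder for the unpacked image tree.
    print(null_out_bytecode("./image-root", dry_run=True), "bytes reclaimable")
```

Point it at the unpacked image tree, never at the root of the running host, and only pass `dry_run=False` once the reported size looks right.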
Hi
Would it make sense to run febootstrap and create instances of fedora on a server, then serving the kernel and ramdisk images (created with dracut after chrooting into the minimal systems created by febootstrap) to diskless nodes?
Thanks
…sorry… and of course NFS-mount the created fc dirs to the diskless nodes.
I guess in any case your question refers to the old febootstrap (2.x). The new febootstrap only creates supermin appliances, which are tied to the host distro and version.
https://rwmj.wordpress.com/2009/10/22/supermin-appliance-now-in-febootstrap/
https://rwmj.wordpress.com/2010/12/10/tip-creating-throwaway-appliances-with-febootstrap/