https://github.com/ocaml/ocaml/commit/8f3833c4d0ef656c826359f4137c1eb3d46ea0ef
We’ve been using this patch in Fedora since Nov 2016.
Filed under Uncategorized
libguestfs is a C library for creating and editing disk images. In the most common (but not the only) configuration, it uses KVM to sandbox access to disk images. The C library talks to a separate daemon running inside a KVM appliance, as in this Unicode-art diagram taken from the fine manual:
┌───────────────────┐
│ main program      │
│                   │
│                   │           child process / appliance
│                   │          ┌──────────────────────────┐
│                   │          │ qemu                     │
├───────────────────┤   RPC    │      ┌─────────────────┐ │
│ libguestfs  ◀╍╍╍╍╍╍╍╍╍╍╍╍╍╍╍╍╍╍╍╍╍╍╍▶ guestfsd        │ │
│                   │          │      ├─────────────────┤ │
└───────────────────┘          │      │ Linux kernel    │ │
                               │      └────────┬────────┘ │
                               └───────────────│──────────┘
                                               │
                                               │ virtio-scsi
                                        ┌──────┴──────┐
                                        │  Device or  │
                                        │  disk image │
                                        └─────────────┘
The library has to be written in C because it needs to be linked to any main program. The daemon (guestfsd in the diagram) is also written in C, but there is no specific reason for that, except that it’s what we did historically.
The daemon is essentially a big pile of functions, most corresponding to a libguestfs API. Writing the daemon in C is painful to say the least. Because it’s a long-running process running in a memory-constrained environment, we have to be very careful about memory management, religiously checking every return from malloc, strdup etc., making even the simplest task non-trivial and full of untested code paths.
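To give a flavour of what that checking looks like, here is a minimal sketch. It is not code from the daemon: join_path and resolve_pair are made-up names, and the task (building two "dir/name" paths) is only an illustration. Even here, every allocation adds an error branch that ordinary tests almost never reach.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not from the libguestfs daemon): build a newly
 * allocated "dir/name" string.  Returns NULL if malloc fails; the
 * caller must free the result. */
char *join_path(const char *dir, const char *name)
{
    assert(dir != NULL && name != NULL);

    size_t dlen = strlen(dir), nlen = strlen(name);
    char *path = malloc(dlen + 1 + nlen + 1);
    if (path == NULL)
        return NULL;                 /* error path 1 */

    memcpy(path, dir, dlen);
    path[dlen] = '/';
    memcpy(path + dlen + 1, name, nlen + 1);   /* also copies the \0 */
    return path;
}

/* Hypothetical caller: build two paths, returning both or neither.
 * Returns 0 on success, -1 on allocation failure. */
int resolve_pair(const char *dir, const char *a, const char *b,
                 char **out_a, char **out_b)
{
    char *pa = join_path(dir, a);
    if (pa == NULL)
        return -1;                   /* error path 2 */

    char *pb = join_path(dir, b);
    if (pb == NULL) {
        free(pa);                    /* error path 3: must not leak pa */
        return -1;
    }

    *out_a = pa;
    *out_b = pb;
    return 0;
}
```

Three of the branches above run only when malloc fails, which is exactly the kind of code path that tends to go untested.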
So last week I modified libguestfs so you can now write APIs in OCaml if you want to. OCaml is a high level language that compiles down to object files, and it’s entirely possible to link the daemon from a mix of C object files and OCaml object files. Another advantage of OCaml is that you can call from C ↔ OCaml with relatively little glue code (although a disadvantage is that you still need to write that glue mostly by hand). Most simple calls turn into direct CALL instructions with just a simple bitshift required to convert between ints and bools on the C and OCaml sides. More complex calls passing strings and structures are not too difficult either.
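The bitshift mentioned above comes from the way OCaml represents ints and bools as tagged immediate values: the integer is shifted left by one bit and the low bit is set. The sketch below mimics that arithmetic in plain C. The function names are made up for illustration; real glue code would use the Val_int, Int_val, Val_bool and Bool_val macros from OCaml’s caml/mlvalues.h instead.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative stand-ins for the Val_int / Int_val / Val_bool / Bool_val
 * macros in OCaml's <caml/mlvalues.h>.  The names below are hypothetical. */
typedef intptr_t ml_value;

/* C integer -> OCaml immediate: shift left by one and set the low tag bit. */
static ml_value ml_of_int(intptr_t n)
{
    return (ml_value)(((uintptr_t)n << 1) | 1);
}

/* OCaml immediate -> C integer: shift right by one to drop the tag bit.
 * Assumes >> on a negative signed value is an arithmetic shift, which is
 * what mainstream compilers do. */
static intptr_t int_of_ml(ml_value v)
{
    assert(v & 1);   /* low bit set means "immediate", not a heap pointer */
    return v >> 1;
}

/* OCaml's false and true are the immediates for 0 and 1. */
static ml_value ml_of_bool(int b) { return ml_of_int(b != 0); }
static int bool_of_ml(ml_value v) { return int_of_ml(v) != 0; }
```

So the C integer 21 crosses the boundary as the machine word 43, and true as 3. Strings and structures live on the OCaml heap and need more care than this.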
OCaml also turns memory errors into a single exception, which unwinds the stack cleanly, so we don’t litter the code with memory handling. We can still run the mixed C/OCaml binary under valgrind.
Code gets quite a bit shorter. For example the case_sensitive_path API — all string handling and directory lookups — goes from 183 lines of C code to 56 lines of OCaml code (and much easier to understand too).
I’m reimplementing a few APIs in OCaml, but the plan is definitely not to convert them all. I think we’ll have C and OCaml APIs in the daemon for a very long time to come.
I pushed OCaml 4.04.0 to Fedora Rawhide last week. There are loads of new features for OCaml users, but the ones that particularly affect Fedora are:
And talking about Fedora/RISC-V, it took a month, but the mass-rebuild of all Fedora packages completed, and now we’ve got about ⅔ of all Fedora packages available for RISC-V. That’s quite a lot:
$ du -sh SRPMS/ RPMS/
31G SRPMS/
27G RPMS/
For more half-baked ideas, see the ideas tag.
If you prefer just to see the code, then it’s here.
Chris Siebenmann wrote a couple of interesting articles about C’s null terminated strings and how they pre-date C.
Chris notes an alternative is a length + string representation, as used in Pascal. Although there are libraries for this in C, there are several drawbacks and approximately no one uses them.
However it’s possible to have the best of both worlds: strings using an implicit length field that takes up no extra storage. These strings are backwards compatible with ordinary C strings (you can literally pass them to legacy functions or cast them to char *), yet the equivalent of a strlen operation is O(1).
There are two ideas here. Firstly, when you use the C malloc function, malloc stashes some extra metadata about your allocation, and with most malloc implementations there is a function to obtain the size of the allocation from a pointer. In glibc, the function is called malloc_usable_size. Note that because of alignment concerns, the amount allocated is usually larger than the amount you originally requested.
The second idea comes from OCaml. OCaml stores strings in a clever internal representation which is both backwards compatible with C (a fancy way to say they are null terminated), and it allows you to get the real length of the string even though OCaml — like C — allocates more than requested for alignment reasons.
So here’s how we do it: When allocating an “implicit length string” (ilenstr) we store extra data in the final byte of the “full” malloced space, in the byte marked B in the diagram below:

+-------------------------+----+------------+----+
| the string              | \0 |   ....     | B  |
+-------------------------+----+------------+----+
<----- malloc we requested ---->
<----------- malloc actually allocated ---------->

If malloc allocated exactly the same amount of space as is used by our string + terminating null, then B is simply the terminating \0:

+-------------------------+----+
| the string              | \0 |
+-------------------------+----+

If malloc allocated 1 spare byte, we store B = 1:

+-------------------------+----+----+
| the string              | \0 | 1  |
+-------------------------+----+----+

If malloc allocated 4 spare bytes, we store B = 4:

+-------------------------+----+----+----+----+----+
| the string              | \0 |   ....       | 4  |
+-------------------------+----+----+----+----+----+
Getting the true length of the string is simply a matter of asking malloc for the allocated length (i.e. calling malloc_usable_size), reading the last byte (B), and subtracting B plus one for the terminating \0 from the allocated length. So we can get the true string length in an O(1) operation (usually, although this may depend on your malloc implementation).
ilenstr strings can contain \0 characters within the string.
ilenstr strings are also backwards compatible, in that we can pass one to any “legacy” C function, and assuming the string itself doesn’t contain any \0 inside it, everything just works.
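The scheme can be sketched in a few lines of C. This is only an illustration under some assumptions, not the code in the repository: the names ilenstr_new and ilenstr_len are made up, it needs glibc for malloc_usable_size, and it simply gives up if the number of spare bytes does not fit in the single byte B.

```c
#include <assert.h>
#include <malloc.h>   /* malloc_usable_size (glibc-specific) */
#include <stdlib.h>
#include <string.h>

/* Hypothetical names, for illustration only. */

/* Allocate an implicit-length string holding a copy of len bytes from
 * data (which may contain \0 bytes).  Returns NULL on failure. */
char *ilenstr_new(const char *data, size_t len)
{
    assert(data != NULL);

    char *s = malloc(len + 1);
    if (s == NULL)
        return NULL;
    memcpy(s, data, len);
    s[len] = '\0';

    /* How much did malloc really give us? */
    size_t usable = malloc_usable_size(s);
    size_t spare = usable - (len + 1);

    /* Assumption of this sketch: B is one byte, so bail out if the
     * spare space is too large to record. */
    if (spare > 255) {
        free(s);
        return NULL;
    }

    /* Store B in the final byte.  When spare == 0 this just rewrites
     * the terminating \0 with 0. */
    s[usable - 1] = (char)spare;
    return s;
}

/* O(1) length: allocated size, minus the terminating \0, minus B. */
size_t ilenstr_len(const char *s)
{
    size_t usable = malloc_usable_size((void *)s);
    unsigned char b = (unsigned char)s[usable - 1];
    return usable - 1 - b;
}
```

For example, ilenstr_new("a\0b", 3) gives a string for which ilenstr_len returns 3, while strlen on the same pointer stops at the embedded \0 and returns 1.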
Alright. This is terrible. DO NOT USE IT IN PRODUCTION CODE! It breaks all kinds of standards, is unportable etc. There are security issues with allowing \0-containing strings to be passed to legacy functions. Still, it’s a nice idea. With proper cooperation from libc, standards authorities and so on, it could be made to work.
Here is my git repo:
You can now write OCaml plugins for nbdkit – the liberally licensed NBD server. You will, however, need OCaml ≥ 4.02.2+rc1 because of this fix.
Last year I wrote and rewrote a little command line tool for managing my virtualization cluster.
Of course I could use OpenStack RDO but OpenStack is a vast box of somewhat working bits and pieces. I think for a small cluster like mine you can get the essential functionality of OpenStack a lot more simply — in 1300 lines of code as it turns out.
The first thing that small cluster management software doesn’t need is any permanent daemon running on the nodes. The reason is that we already have sshd (for secure management access) and libvirtd (to manage the guests) out of the box. That’s quite sufficient to manage all the state we care about. My Mini Cloud/Cluster software just goes out and queries each node for that information whenever it needs it (in parallel of course). Nodes that are switched off are handled by ignoring them.
The second thing is that for a small cloud we can toss features that aren’t needed at all: multi-user/multi-tenant, failover, VLANs, a nice GUI.
The old mclu (Mini Cluster) v1.0 was written in Python and used Ansible to query nodes. If you’re not familiar with Ansible, it’s basically parallel ssh on steroids. This was convenient to get the implementation working, but I ended up rewriting this essential feature of Ansible in ~ 60 lines of code.
The huge down-side of Python is that even such a small program has loads of hidden bugs, because there’s no safety at all. The rewrite (in OCaml) is 1,300 lines of code, so a fraction larger, but I have a far higher confidence that it is mostly bug free.
I also changed around the way the software works to make it more “cloud like” (and hence the name change from “Mini Cluster” to “Mini Cloud”). Guests are now created from templates using virt-builder, and are stateless “cattle” (although you can mix in “pets” and mclu will manage those perfectly well because all it’s doing is remote libvirt-over-ssh commands).
$ mclu status
ham0 on
  total: 8pcpus 15.2G
  used: 8vcpus 8.0G by 2 guest(s)
  free: 6.2G
ham1 on
  total: 8pcpus 15.2G
  free: 14.2G
ham2 on
  total: 8pcpus 30.9G
  free: 29.9G
ham3 off
You can grab mclu v2.0 from the git repository.
If you ever used the old version of virt-v2v, our software that converts guests to run on KVM, then you probably found it slow. Worse still, it could fail at the end of the conversion (after possibly an hour or more). No one liked that, least of all the developers and support people who had to help people use it.
A V2V conversion is intrinsically going to take a long time, because it always involves copying huge disk images around. These can be gigabytes or even terabytes in size.
My main aim with the rewrite was to do all the work up front (and if the conversion is going to fail, then fail early), and leave the huge copy to the last step. The second aim was to work much harder to minimize the amount of data that we need to copy, so the copy is quicker. I achieved both of these aims using a lot of new technology that we developed for qemu in RHEL 7.
Virt-v2v works (now) by putting an overlay on top of the source disk. This overlay protects the source disk from being modified. All the writes done to the source disk during conversion (eg. modifying config files and adding device drivers) are saved into the overlay. Then we qemu-img convert the overlay to the final target. Although this sounds simple and possibly obvious, none of this could have been done when we wrote old virt-v2v. It is possible now because:
Pictured above is my 64 bit ARM server. It’s under NDA so I cannot tell you who supplied it or even show you a proper photo.
However it runs Fedora 21 & Rawhide:
Linux arm64.home.annexia.org 3.16.0-0.rc6.git1.1.efirtcfix1.fc22.aarch64 #1 SMP Wed Jul 23 12:15:58 BST 2014 aarch64 aarch64 aarch64 GNU/Linux
libvirt and libguestfs run fine, with full KVM acceleration, although right now you have to use qemu from git as the Rawhide version of qemu is not new enough.
Also OCaml 4.02.0 beta works (after we found and fixed a few bugs in the arm64 native code generator last week).
PG’OCaml is a type-safe macro binding to PostgreSQL from OCaml that I wrote many moons ago.
You can write code like:
let hostid = 33 in
let name = "john.smith" in
let rows = PGSQL(dbh)
  "select id, subject from contacts where hostid = $hostid and name = $name"
and the compiler checks (at compile time) that hostid and name have the correct types in the program to match the database schema. And it’ll ensure that the type of rows is something like (int * string) list, and integrate that with type inference in the rest of the program.
The program won’t compile if you use the wrong types. It integrates OCaml’s type safety and type inference with the PostgreSQL database engine.
It also avoids SQL injection by automatically creating a safe prepared statement. What is executed when the program runs will have: ... where hostid = ? and name = ?
As a side-effect of the type checking, it also verifies that the SQL code is syntactically correct.
Update: Thanks to Peter Robinson, there is now a build of OCaml for aarch64 in the Fedora repository.
I have backported the upstream ARM64 support into Fedora 21’s OCaml, so you can now use it to generate native ARM64/AArch64 binaries. If you don’t have hardware, use qemu to emulate it instead.