Alignment errors on Fedora ARM

Here’s what you should do if you get a compile error like this on Fedora ARM:

error: cast increases required alignment of target type [-Werror=cast-align]

  1. If it’s an easy fix with little chance of breaking things, then fix it.
  2. If it’s on a performance-critical path, especially if it causes a measurable slowdown, then fix it.
  3. Otherwise, disable the warning. One way is to add:
    #pragma GCC diagnostic ignored "-Wcast-align"
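
As a rough sketch of options 1 and 3 (the function names are made up for illustration): copying the bytes out with memcpy is usually the low-risk fix. If you silence the warning instead, GCC's diagnostic push/pop pragmas keep the change limited to code you have actually looked at, rather than the whole file.

```c
#include <stdint.h>
#include <string.h>

/* Option 1: the easy fix. memcpy has no alignment requirement, and
 * compilers typically turn a small fixed-size copy like this into
 * ordinary load instructions where the target allows it.
 * (read_i32 is a hypothetical helper name.) */
static int32_t read_i32(const unsigned char *buf)
{
  int32_t v;
  memcpy(&v, buf, sizeof v);
  return v;
}

/* Option 3: silence the warning, but only around this one function.
 * The cast itself is unchanged, so it can still hit the slow kernel
 * fixup at run time if buf is not 4-byte aligned. */
#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wcast-align"
static int32_t read_i32_cast(const unsigned char *buf)
{
  return *(const int32_t *)buf;
}
#pragma GCC diagnostic pop
```

The pop restores whatever warning settings were in force before the push, so -Wcast-align keeps working for the rest of the file.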
    

BTW I will delete any comment on this post unless you show you have read the next part:

The background is that on certain architectures (ARM and MIPS are the main ones) the processor cannot load or store values at memory addresses which are not aligned. Say you had a protocol which sent a 1-byte length field followed by a 4-byte data field:

+-------+-------+-------+-------+-------+- - - -
| len   | <---------- data -----------> |
+-------+-------+-------+-------+-------+- - - -

If you loaded this into malloc’d memory and used a C struct like this to access it:

struct packet {
  uint8_t len;
  int32_t data;
} __attribute__((packed));

then accesses to p->data (where p points to the struct) might trap. The malloc’d memory, and hence the start of the struct, is aligned, so the data field, which starts one byte later, is not.
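
To make the layout concrete, here is a small sketch that checks where the data field ends up. The name struct packet and both helper functions are picked for this example, and the 4-byte alignment is what int32_t needs on the usual ARM and x86 ABIs.

```c
#include <stddef.h>
#include <stdint.h>

/* Same layout as the struct above; the name is chosen for this sketch. */
struct packet {
  uint8_t len;
  int32_t data;
} __attribute__((packed));

/* packed removes the padding, so data starts at byte offset 1
 * and the whole struct is 5 bytes long. */
_Static_assert(offsetof(struct packet, data) == 1, "data follows len directly");
_Static_assert(sizeof(struct packet) == 5, "1 + 4 bytes, no padding");

/* Returns 1 if p is a multiple of align, 0 otherwise. */
static int is_aligned(const void *p, size_t align)
{
  return (uintptr_t)p % align == 0;
}

/* Returns 1 if the data field of a packet starting at base would be
 * 4-byte aligned. The address is computed with offsetof rather than
 * by taking the address of the packed member. */
static int data_field_aligned(const void *base)
{
  const unsigned char *field =
    (const unsigned char *)base + offsetof(struct packet, data);
  return is_aligned(field, 4);
}
```

For any base address that is itself 4-byte aligned, such as one returned by malloc, data_field_aligned returns 0.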

What happens in Fedora is the kernel gets called. It sees a FAULT_CODE_ALIGNMENT fault, looks at the failed load instruction, emulates it, then returns to your code. This fixup step is slow.

However, it’s not a bug. Fedora only runs on recent ARM chips that can now do some unaligned accesses in hardware and always generate traps for ones they can’t handle.

Your code will still run fine assuming you don’t change the /proc/cpu/alignment setting. But if these fixups are frequent it is plausible that they could cause a performance problem.
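
If you want to see whether fixups are happening at all, one option is to look at /proc/cpu/alignment before and after running your program. The sketch below just copies the file to a stream. The function name is hypothetical, and it makes no assumption about the file's exact contents, which depend on the kernel. On machines where the file does not exist (anything that is not ARM Linux with alignment trapping built in) it reports failure.

```c
#include <stdio.h>

/* Copy /proc/cpu/alignment to out, line by line.
 * Returns 0 on success, or -1 if the file cannot be opened,
 * for example on a non-ARM machine. */
static int dump_alignment_stats(FILE *out)
{
  FILE *f = fopen("/proc/cpu/alignment", "r");
  char line[256];

  if (f == NULL)
    return -1;
  while (fgets(line, sizeof line, f) != NULL)
    fputs(line, out);
  fclose(f);
  return 0;
}
```

Calling dump_alignment_stats(stderr) at program start and again at exit gives a crude before-and-after comparison.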

Fixing these can be hard, and can easily introduce new bugs (as we found out when trying to fix alignment bugs in hivex).

Since the most popular development architecture (x86) has always been able to handle unaligned accesses in hardware, developers are going to keep adding unaligned accesses to their programs. In the Intel Haswell chips, there isn’t even a performance penalty (in fact, it’s likely to be faster to squeeze your structs as much as possible, even if it makes them unaligned). ARM has gradually been adding the ability to handle unaligned access too, so eventually one hopes the problem will go away for developers.

5 responses to “Alignment errors on Fedora ARM”

  1. Wow, we have the kernel actually emulate unaligned accesses on ARM? On TI calculators with a Motorola 68000 CPU, if you try to code something like that, it will simply crash the whole calculator OS (not just your program) with an “Address Error”. And the version of GCC in TIGCC doesn’t even have the -Wcast-align warnings, or at least I don’t remember seeing them, ever. And by the way, “Address Error” is also the most common error thrown when accidentally executing non-code data, so good luck debugging that. Programmers these days are so spoiled!

  2. Zack B.

    So, just to check my understanding: in your example the FAULT_CODE_ALIGNMENT is triggered by dereferencing a pointer (virtual address) in an explicit definition of a 5-byte sized (packed) struct. In other words, you’re trying to access memory on a 1-byte boundary where perhaps it should be on a 2 or 4 byte boundary only, and during address translation, this fault code is triggered?
    So would one fix for this example be to use uint16_t instead of uint8_t for len, at the expense of extra bytes?

    At least according to something I just read (which is for ARM11, so I’m not sure if it applies to my Cortex hardware: http://infocenter.arm.com/help/topic/com.arm.doc.ddi0333h/ch06s09s01.html), “Alignment fault checking is independent of the MMU being enabled.” Does this mean that the MMU on ARM can never be fully ‘disabled’ or does it mean there are other places where alignment fault handlers might exist?

    Interesting post!

    • rich

      Except you can’t change the type of len since (for the sake of this example) that comes from some externally defined protocol. We have a lot of this in hivex where we are unpacking data structures from the Windows registry that are not aligned.

      Basically this is a peculiarity of ARM hardware that it cannot do loads and stores in hardware for various addresses. x86 doesn’t have this peculiarity — it can always do the load or store, although (on older x86) it may be a tiny bit slower. At least ARM traps so that software can emulate the missing functionality. In the past I have worked on hardware that didn’t trap, so you had to use all sorts of compiler juggling to make things work.

      “ARM11” (ie. armv6) is not supported by Fedora. Minimum supported in Fedora is armv7.

  3. James Taswell

    Isn’t the compiler at fault here for generating code that assumes unaligned accesses are always possible on architectures where they are not?

    The compiler knows that the access is unaligned so shouldn’t it be generating a sequence of instructions that emulates the unaligned access?

    Why does the kernel have to get involved?

    • rich

      I suspect the compiler doesn’t know, at least not always. If the values are on the stack, it should have a reasonably good idea. A little known fact about x86-64 is that gcc will generate aligned instructions in some cases when it knows (or thinks it knows) the alignment of the stack.

      However if the compiler is generating functions that have to work with heap pointers, it doesn’t know at compile time how they are aligned. So I guess it would have to generate a bunch of extra runtime code, or slower runtime code, which wouldn’t make people happy.

      Really this is something I think needs to be fixed in the architecture.
