page maintained by Joshua LeVasseur (jtl ∂does-not-exist.ira uka de)
The Problem
The kernel is compiled with gcc, which supports several standards for inter-function calling conventions. Given the goals of a microkernel environment, L4Ka::Pistachio uses a modified SVR4 embedded ABI. The kernel deviates from the embedded SVR4 ABI in how registers are divided between callee-saved and caller-saved; this allocation is adjusted via gcc command-line switches.
One issue remains in the SVR4 ABI, however, and it severely impacts the performance of the kernel's main code paths, because kernel data types are abstracted as bit-fields and classes.
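As a rough illustration of the kind of type involved, here is a minimal sketch of a word-sized wrapper class. The names (word_fields_t, word_id_t, add_ids, add_raw) are hypothetical and are not taken from the kernel sources. The sketch only shows that such a type fits in one 32-bit word, which is why passing it through memory instead of a register is pure overhead.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

// Hypothetical kernel-style type: one 32-bit word, viewed either raw
// or as two 16-bit fields. All names here are illustrative only.
struct word_fields_t {
    std::uint32_t version : 16;
    std::uint32_t number  : 16;
};

class word_id_t {
public:
    union {
        std::uint32_t raw;
        word_fields_t x;
    };
};

// The wrapper occupies exactly one 32-bit word and is trivially copyable,
// so nothing in the language prevents it from traveling in one register.
static_assert(sizeof(word_id_t) == sizeof(std::uint32_t), "one word");
static_assert(std::is_trivially_copyable<word_id_t>::value, "plain data");

// Aggregate parameters passed by value: under the SVR4 rule quoted in the
// Details section, each argument reaches the callee as a pointer to memory.
inline std::uint32_t add_ids(word_id_t a, word_id_t b)
{
    return a.raw + b.raw;
}

// The same arithmetic on scalar parameters, which are passed in registers.
inline std::uint32_t add_raw(std::uint32_t a, std::uint32_t b)
{
    return a + b;
}
```

Both functions compute the same result. The difference lies only in how the arguments reach the callee, which is visible in the generated assembly rather than in the source.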
The Solution
A patch to gcc has been developed internally, but isn't yet ready for release.
Details
Rather than rewrite a description of the problem, I quote my post to the freebsd-ppc list (refer to list archives for the entire thread):
From: Joshua LeVasseur
Date: Sat Aug 17, 2002 10:29:44 PM Europe/Berlin
To: freebsd-ppc ∂does-not-exist.freebsd org
Subject: freebsd-ppc: gcc's SysV ABI and parameter passing
While analyzing the code generated by gcc, I noticed that gcc implements
quite a literal interpretation of the System V ABI for parameter
passing.
From the spec:
"A struct, union, or long double, any of which shall be treated as a
pointer to the object, or to a copy of the object where necessary to
enforce call-by-value semantics. Only if the caller can ascertain that
the object is "constant" can it pass a pointer to the object itself."
Some example code, which declares a union for representing a 32-bit
bit-field (common for kernel code).
-----------------------
class simple_t {
public:
    union {
        unsigned raw;
        struct {
            unsigned yoda : 16;
            unsigned vader : 16;
        } x;
    };
};

int add( simple_t a, simple_t b )
{
    return a.raw + b.raw;
}

int main( void )
{
    simple_t a, b;
    a.raw = 1;
    b.raw = 2;
    return add( a, b );
}
----------------------
gcc, using the SysV ABI, will generate the following code:
add:
    lwz r0,0(r3)
    lwz r3,0(r4)
    add r3,r0,r3
    blr
main:
    li r0,1
    li r9,2
    stw r0,8(r1)
    stw r9,12(r1)
    addi r3,r1,8
    addi r4,r1,12
    bl add
Notice how gcc writes the values to the stack before calling add(), and
then loads them off the stack in the add() function. Rather than
passing them as 32-bit parameters.
Now inspect the code generated by an alternative ABI (I use a modified
eabi to generate tight code). This code is also generated by Apple's
MachO ABI, and the AIX ABI (although I lack access to an AIX box).
add:
    add r3,r3,r4
    blr
main:
    li r3,1
    li r4,2
    b add