Here we look in depth at how C functions return values. Before reading this page, you should be familiar with how C functions set up and tear down a stack frame (a.k.a. activation record). See these notes from CMSC 313: C Function Call Conventions and the Stack.
In the following, each program was originally written in C. The C program is compiled into assembly using gcc -S, which produces the .s files. The assembly language in the .s files is in GNU Assembler (GAS) syntax. This is converted to the .asm format using a combination of intel2gas -g, disassembling in gdb and manual conversion. The syntax of the resulting .asm files was checked using NASM, but typos might still exist. The comments are, of course, manually generated.
The first example is a very simple function foo() in C that returns an integer value. Note in the assembly version that the return value is simply stored in the EAX register.
Files: return1.c, return1.s and return1.asm.
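For readers without the files at hand, here is a minimal sketch of the pattern. It is not the contents of return1.c; the name foo_sketch, its parameters and its body are assumptions made for illustration. Under the 32-bit cdecl convention discussed on this page, a compiler would typically leave the result of such a function in EAX just before the ret instruction.

```c
/* Hypothetical example, NOT the actual return1.c. */
int foo_sketch(int a, int b)
{
    int sum = a + b;   /* computed in a local variable in the stack frame */

    /* On 32-bit x86 (cdecl) the returned int is placed in EAX;
       the caller reads it from there after the call. */
    return sum;
}
```

Compiling a file like this with gcc -S (adding -m32 on a 64-bit machine) should let you look for the instruction that loads EAX before the function epilogue.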
Continuing with the same example, call1.c has a function proc() that calls the function foo() in return1.c. Note that the value in the EAX register is immediately stored in a temporary variable after the return from foo().
Files: call1.c, call1.s and call1.asm.
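The calling side can be sketched the same way. The functions below are stand-ins, not the actual call1.c: foo_callee, proc_sketch and the constants are assumed names and values. In unoptimized code, the assignment to tmp is usually where EAX gets copied into a slot in the caller's stack frame.

```c
/* Hypothetical callee standing in for foo(). */
int foo_callee(void)
{
    return 17;              /* value comes back in EAX on 32-bit x86 */
}

/* Hypothetical caller standing in for proc(). */
int proc_sketch(void)
{
    int tmp;

    tmp = foo_callee();     /* EAX is saved into tmp right after the call */
    return tmp + 1;         /* later uses read tmp, not EAX */
}
```

With optimization turned on, the compiler may skip the temporary and use EAX directly, so the store is easiest to see at -O0.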
Files: return2.c, return2.s and return2.asm.
Here's what the assembly language looks like from the calling end. Note that in the assembly code generated by gcc, floating point values are kept on the FPU stack for as long as possible (more on this below). Thus, in this example, the value returned by foo is kept on the FPU stack and not saved in a temporary location.
Files: call2.c, call2.s and call2.asm.
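A small sketch of a floating-point return, under the assumption that the example involves a function returning a double; foo_double and proc_double are hypothetical names and not the contents of return2.c or call2.c. With the 32-bit x87 conventions described above, the result comes back on top of the FPU stack, so the caller can use it in further arithmetic without first storing it to memory. (On x86-64, the convention is different and the result is returned in an SSE register.)

```c
/* Hypothetical callee: on 32-bit x86 the double result is left in ST0. */
double foo_double(double x)
{
    return x * 2.0;
}

/* Hypothetical caller: the returned value can stay on the FPU stack
   while 1.0 is added to it, with no temporary variable needed. */
double proc_double(void)
{
    return foo_double(1.5) + 1.0;
}
```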
There are two additional notes. First, the address of the return value is stored in EAX before returning. (The i386 System V ABI specifies this for functions that return a structure through memory supplied by the caller.) Second, in the return instruction "ret 4", the 4 means that 4 additional bytes are popped off the stack after the return address has been popped. Note that in call3.asm the caller adjusts the stack pointer by 4 bytes immediately after the call returns, which cancels the effect of the 4 in the "ret 4".
Files: return3.c, return3.s and return3.asm.
Files: call3.c, call3.s and call3.asm.
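The "address in EAX" and "ret 4" behavior is what the 32-bit System V convention uses when a function returns a structure by value: the caller passes a hidden pointer to space for the result, and the callee removes that pointer from the stack when it returns. The sketch below assumes that this is the situation in return3.c; the type triple_sketch and both functions are hypothetical. The second function spells out in plain C roughly what the compiler arranges behind the scenes for the first.

```c
/* Hypothetical structure type, too large to fit in a register. */
struct triple_sketch {
    int a;
    int b;
    int c;
};

/* The usual way to write it: return the structure by value. */
struct triple_sketch make_by_value(int n)
{
    struct triple_sketch r;

    r.a = n;
    r.b = n + 1;
    r.c = n + 2;
    return r;
}

/* Roughly the same thing with the hidden pointer made explicit:
   the caller supplies the storage, and the function hands the
   same address back (the role played by EAX in the assembly). */
struct triple_sketch *make_by_pointer(struct triple_sketch *out, int n)
{
    out->a = n;
    out->b = n + 1;
    out->c = n + 2;
    return out;
}
```

The explicit-pointer version is only an analogy. In the real convention the extra argument is invisible in the C source, and it is the callee, not the caller, that pops it.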
The example below shows that gcc tries to keep values on the FPU stack for as long as possible. In the expression x1 + (x2 + (x3 + x4)), all the intermediate results are stored on the FPU stack. Note in the assembly code that there are no FINIT instructions to reset the FPU stack. The function simply assumes that all 8 FPU registers are available.
Files: fpu1.c, fpu1.s and fpu1.asm.
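For reference, a C function containing the expression above might look like the following. This is a sketch and not the actual fpu1.c; the function name sum4_sketch and the use of parameters for x1 through x4 are assumptions.

```c
/* Hypothetical wrapper around the expression discussed above. */
double sum4_sketch(double x1, double x2, double x3, double x4)
{
    /* The inner sums (x3 + x4) and (x2 + (x3 + x4)) are intermediate
       values; with x87 code they can stay in FPU registers instead of
       being written to temporaries in the stack frame. */
    return x1 + (x2 + (x3 + x4));
}
```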
In the next example, the floating point expression has more than 8 intermediate values. Still, gcc uses the FPU stack as much as possible. Only (x2 + 1.3) and (x3 + 1.4) are moved to temporary locations.
Files: fpu2.c, fpu2.s and fpu2.asm.
Finally, in the last example, the complicated floating point expression contains a function call to g(). Now, all the intermediate values computed prior to the call to g() are stored in temporary locations in the stack frame. After the function has returned, the FPU stack is exploited as before.
Files: fpu3.c, fpu3.s and fpu3.asm.
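The situation can be sketched with a much smaller expression than the one in fpu3.c. Everything below is hypothetical: g_sketch stands in for g(), and the expression and constants are made up. The point is that a callee is free to use all eight FPU registers, so any intermediate value the caller has already computed needs to be somewhere safe, such as the stack frame, while the call runs.

```c
/* Hypothetical stand-in for g(). */
double g_sketch(double x)
{
    return x * 0.5;
}

/* Hypothetical expression mixing arithmetic with a function call. */
double expr_with_call(double x1, double x2, double x3)
{
    /* If (x1 + 1.0) and (x2 + 2.0) are evaluated before the call,
       they cannot simply be left on the FPU stack across it; a
       compiler would typically spill them to temporaries first.
       Note that C does not fix the order in which these operands
       are evaluated, so a compiler may also choose to call
       g_sketch() first. */
    return (x1 + 1.0) * ((x2 + 2.0) * g_sketch(x3));
}
```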