Tuesday, 21 January 2014

Hello World and the Compiler

SPO600 lab 2 instructions can be found here.

To begin, we wrote a basic 'hello world'-esque program in C. You know, a printf("hello world\n") sort of thing. After writing that program, we were to change the options passed to the compiler or to objdump as required for each test case.

TL;DR - compilers are power tools, not to be underestimated. Respect them, or they will retaliate.

Test 4 with additional integer arguments
As part of the group tasked with Test 4, we were to add additional arguments to the printf call (the simple integers 1 to 10) and use the objdump command with a few additional options to examine the binary file created by the compile. When additional arguments are added to the basic 'hello world' program, the compiler emits an additional mov for each one, in the form of movl (move long, a 32-bit move), so ten more in our case. That the integers were not moved into registers but written straight to the stack is part of the infinite wisdom of the gcc compiler: a movl to an offset from %esp does the same job as a stack push, and the calling convention puts them there (on 32-bit x86 every argument goes on the stack, and on x86-64 only the first six integer arguments travel in registers). When the last argument was changed from a simple integer to a long integer (18 digits long), as seen in the screenshots, the compiler set aside two separate stack locations for it instead of one, apparently because an 18-digit value does not fit in 32 bits and has to be stored as two 32-bit halves. In an act of Little Endian vs Big Endian, the halves were listed out of order (with regard to hex address); x86 is little-endian, so the low half sits at the lower address, which is why the two can look swapped when read top to bottom. All the other arguments' stack offsets decreased by 4 as the argument values decreased (starting from nine, down to one), like normal.
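A minimal sketch of what the Test 4 source might have looked like, plus a small helper that mimics the "two halves" behaviour described above. The format string and the names print_with_ints and split_halves are my own assumptions for illustration, not the lab's actual code.

```c
#include <stdint.h>
#include <stdio.h>

/* Ten extra integer arguments, as in Test 4.
 * The exact format string is an assumption, not the lab's. */
int print_with_ints(void)
{
    /* printf returns the number of characters it wrote. */
    return printf("Hello World! %d %d %d %d %d %d %d %d %d %d\n",
                  1, 2, 3, 4, 5, 6, 7, 8, 9, 10);
}

/* Hypothetical helper: splits a 64-bit value into the two 32-bit
 * halves that a pair of movl stores would write to the stack. */
void split_halves(uint64_t value, uint32_t *low, uint32_t *high)
{
    *low  = (uint32_t)(value & 0xFFFFFFFFu); /* goes to the lower address on x86 */
    *high = (uint32_t)(value >> 32);         /* goes 4 bytes above it */
}
```

Calling split_halves on an 18-digit value gives the two constants you would expect to see as separate movl immediates in a 32-bit disassembly.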

Test 4 with ridiculous long integer argument
For the next trick, Test 1: adding -static when compiling changes the size of the executable dramatically, from about eight kilobytes without -static to over 800 kilobytes with it. The biggest change in the main function is that the callq target goes from printf@plt without -static to _IO_printf with it. printf@plt is essentially a pointer into the shared C library: a small stub in the procedure linkage table that jumps to wherever the dynamic linker found printf at run time, so the program follows that pointer each time printf is called. With -static, on the other hand, the linker takes the library code the program needs and actually copies it into the executable, so the call skips the pointer and takes a more direct route to where printf's code is held. This also changes the order of the functions in the disassembly, with many more added, which makes sense given how much library code now lives in the file. <main> is now nowhere near the end of the disassembly, with all the library functions coming after it.
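To make the "pointer versus direct route" idea concrete, here is a rough analogy in plain C using a function pointer. This is only a sketch of the concept: the real PLT and dynamic linker work at the machine-code level, and every name here (real_print, lazy_stub, print_slot and so on) is hypothetical.

```c
#include <stdio.h>

/* Stand-in for the real printf living in a shared library. */
static int real_print(const char *text)
{
    return fputs(text, stdout);
}

static int resolve_count = 0;            /* how many times the lookup ran */

static int lazy_stub(const char *text);  /* forward declaration */

/* The "slot": starts out pointing at the resolver stub. */
static int (*print_slot)(const char *) = lazy_stub;

/* First call lands here, "resolves" the symbol, and patches the slot. */
static int lazy_stub(const char *text)
{
    resolve_count++;
    print_slot = real_print;
    return print_slot(text);
}

/* Dynamic-style call: always goes through the slot. */
int call_dynamic(const char *text)
{
    return print_slot(text);
}

/* Static-style call: goes straight to the code, no slot involved. */
int call_static(const char *text)
{
    return real_print(text);
}

/* Lets a caller check how often the resolver ran. */
int resolver_runs(void)
{
    return resolve_count;
}
```

After two calls to call_dynamic the resolver has only run once, which is the point of the indirection: pay for the lookup on first use, then follow the pointer from then on.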

With -fno-builtin removed from the compile as part of Test 2, the compiler is free to use its own built-in (ahem) knowledge of standard functions to make the code more efficient than it would be normally (with -fno-builtin enabled for compilation). In this case, the callq doesn't call printf, instead calling puts, which is a more direct way of showing output on screen than printf, which has to parse its format string before printing anything. The executable size is also slightly smaller. The change also removes a mov $0x0,%eax (so a 0 is no longer being moved into the eax register; printf is variadic and expects a count of vector-register arguments in %al, while puts does not need one), and changes the nopw to %cs:0x0(%rax,%rax,1) from 0x0(%rax,%rax,1). As far as I can tell, that nopw is just padding after the function and never executes; the different operand is a longer encoding of the same no-operation, used to fill the bytes freed up by the missing mov.
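The substitution can be sketched in C as two functions that print the same thing. This is an illustration under my own naming (hello_printf and hello_puts are hypothetical); as far as I know gcc only makes the swap when the format string has no conversions, ends in a newline, and printf's return value is ignored.

```c
#include <stdio.h>

/* What we wrote: a printf with a constant string ending in '\n'.
 * The return value is deliberately ignored, since gcc only swaps
 * in puts when nobody looks at printf's result. */
void hello_printf(void)
{
    printf("Hello World!\n");
}

/* Roughly what the compiler turns it into when builtins are enabled:
 * puts adds the trailing newline itself. */
int hello_puts(void)
{
    return puts("Hello World!");
}
```

Both functions put the same 13 characters on screen; only the second avoids the format-string machinery.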

--Note: on my laptop, even though I'm doing all the processing on the Ireland server, there was no change in the object dump between enabling and disabling -fno-builtin. I had another student watch me compile and objdump both versions, with no noticeable difference. My initial thought was that my laptop's CPU, being as old as it is (and 32-bit as well), may be the cause of these issues.

Test 3 asked to have the -g option disabled during compilation. Without -g, the debug sections are left out of the ELF file, as is any extra debugging information provided for each section of the file. Without the debug sections and their headers, there is a lot less data (relatively speaking), and as such the executable is about a kilobyte smaller than with -g enabled (8k without, compared to 9k with). There is a lot more debugging information than one might think for such a small, simple program like "Hello world\n".

Test 5 required moving the printf into a separate function of its own, which causes the string and integer arguments to be loaded outside of main. Instead, main pushes %rbp onto the stack (saving the caller's frame pointer), calls the new function, which does all the instructions/work of setting up and calling printf and then returns, and finally retrieves the saved value from the stack with pop %rbp. In the disassembly, the new function sits right below main.
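For reference, the restructured source for Test 5 might look something like this. The function name output is my own guess, and lab_main is a stand-in for the real main so the snippet can be dropped into another program.

```c
#include <stdio.h>

/* Hypothetical helper: the printf now lives here instead of in main,
 * so the argument loading shows up under this symbol in objdump. */
void output(void)
{
    printf("Hello World!\n");
}

/* Stand-in for main(): all that is left is the call itself,
 * wrapped in the usual frame setup and teardown at -O0. */
int lab_main(void)
{
    output();
    return 0;
}
```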


The last test, Test 6, called for replacing the -O0 in the compile with -O3, which increases optimization from nothing to [shall we say] level 3. Doing so removes the push and pop instructions; in other words, nothing is being put on the stack for later use. -O3 simply loads the argument values directly into place, uses xor to set the eax register to 0 (whereas -O0 moved an immediate 0 into that register to achieve the same goal), and instead of using callq on address 400410 <printf@plt> it uses jmpq 400410 <printf@plt>. This is a tail call: main jumps directly to printf's stub at that address, and when printf finishes it returns straight to whoever called main. The no-operation instruction (nop) that follows the jump is padding and is never reached, so the jmpq more or less ends the program.
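Here is a small C sketch of the two shapes involved. The names greet_tail and greet_then_zero are hypothetical, and whether a given gcc version actually emits jmpq or xor for them depends on the version and flags, so treat the comments as what optimizers commonly do rather than a guarantee.

```c
#include <stdio.h>

/* The printf call is the very last thing this function does, and its
 * result is returned unchanged. An optimizer may turn the call + ret
 * pair into a single jump to printf (a tail call). */
int greet_tail(void)
{
    return printf("Hello World!\n");
}

/* Here something still has to happen after printf comes back
 * (returning 0), so a plain tail call is not possible. Optimizing
 * compilers commonly produce the 0 with xor %eax,%eax. */
int greet_then_zero(void)
{
    printf("Hello World!\n");
    return 0;
}
```

Compiling both at -O0 and -O3 and comparing the objdump output is a quick way to see the callq/jmpq difference for yourself.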


