Saturday 8 February 2014

SPO600 Assembler in Linux Packages lab 4

As part of group four, I looked at Ogre while another group member looked into NSPR. We discussed the required questions between ourselves before one package was presented to the rest of the class (that honour went to NSPR).

Ogre is an open source, multi-platform 3D rendering engine used to build 3D applications. I'd guess it is used for game creation more than for any other type of application, which is where I first came across it.

Most of the assembly code was spread across a small number of source directories, all under OgreMain/src, the main Ogre directory. The assembler code seemed to be split into two separate uses. One was CPUID: determining which type of CPU the code is running on, so that the program can choose which instruction set to use. The other seemed to be atomics and locking shared memory under specific circumstances, the details of which I couldn't fully suss out from the code (partly down to my lack of experience reading real-world code on a regular basis). Even so, most of the time it still seemed to relate back to which set of CPU instructions was to be used. Oddly enough, the comments in the code itself note that the assembler used for atomics is included for only slight performance gains over the C++ versions of the same code.
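To make the CPUID idea concrete, here is a minimal sketch of how a program can ask the CPU what it supports. This is not Ogre's code: has_sse2 is a name I made up for illustration, and the sketch assumes GCC or Clang, whose <cpuid.h> header provides __get_cpuid. On non-x86 targets it simply reports that the feature is missing.

```c
#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>   /* GCC/Clang wrapper around the CPUID instruction */
#endif

/* Hypothetical helper (not from Ogre): returns 1 if the CPU reports
 * SSE2 support, 0 otherwise. */
static int has_sse2(void)
{
#if defined(__x86_64__) || defined(__i386__)
    unsigned int eax, ebx, ecx, edx;

    /* Leaf 1 returns the basic feature flags; __get_cpuid returns 0
     * if the requested leaf is not supported. */
    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx))
        return 0;

    /* SSE2 is reported in bit 26 of EDX. */
    return (int)((edx >> 26) & 1u);
#else
    /* Not an x86 CPU, so there is no CPUID instruction to ask. */
    return 0;
#endif
}
```

A caller would check has_sse2() once at startup and then pick between an SSE2 code path and a plain C/C++ fallback, which is roughly the kind of decision the CPUID assembler in Ogre appeared to feed into.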

One example of assembler's use in Ogre (found in OgreMain/src/nedmalloc/malloc.c.h.ppc):

/* place args to cmpxchgl in locals to evade oddities in some gccs */
int cmp = 0;
int val = 1;
int ret;
__asm__ __volatile__ ("lock; cmpxchgl %1, %2"
: "=a" (ret)
: "r" (val), "m" (*(lp)), "0"(cmp)
: "memory", "cc");

It's not 100% apparent whether the assembler in Ogre was written specifically for it or taken from an existing library, though some of the comments in-line with the code lead me to believe it was a little of both. Judging from those same comments and the syntax used, the assembly code is meant for x86_64, and for most of it more architecture-agnostic versions (C++ specifically) exist, with a slight performance loss. If the team is putting those kinds of comments in their own source code, I have to imagine that yes, this version of Ogre could be ported to and built for aarch64 relatively easily, with very little difference from the current x86_64 version.
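As an illustration of what an architecture-agnostic version of the lock; cmpxchgl snippet quoted earlier could look like, here is a minimal sketch using C11 atomics. It is my own sketch and not what Ogre or nedmalloc actually ships: spinlock_sketch, try_lock and unlock are hypothetical names, and I am assuming that the lock word is 0 when free and 1 when held, as the cmp = 0 and val = 1 locals in the original suggest.

```c
#include <stdatomic.h>

/* Hypothetical lock word: 0 = free, 1 = held (assumed from cmp/val above). */
typedef struct {
    atomic_int word;
} spinlock_sketch;

/* One attempt to take the lock. Returns 1 if we got it, 0 if it was
 * already held. The compare-and-swap writes 1 only if the word is
 * currently 0, which is the job cmpxchgl does in the x86 version. */
static int try_lock(spinlock_sketch *l)
{
    int expected = 0;   /* plays the role of `cmp` in the asm snippet */

    return atomic_compare_exchange_strong_explicit(
        &l->word, &expected, 1,      /* 1 plays the role of `val` */
        memory_order_acquire,        /* ordering on success */
        memory_order_relaxed);       /* ordering on failure */
}

/* Release the lock by storing 0 back into the word. */
static void unlock(spinlock_sketch *l)
{
    atomic_store_explicit(&l->word, 0, memory_order_release);
}
```

One difference to be aware of: the asm version hands back the old value of the lock word (0 means the lock was taken), while this sketch returns a success flag. The compiler turns the same source into a lock cmpxchg on x86_64 and into the equivalent atomic sequence on aarch64, which is why the port looks feasible to me.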

Here is one of the comments on Assembly use in Ogre:

"USE_BUILTIN_FFS default: 0 (i.e., not used)
Causes malloc to use the builtin ffs() function to compute indices.
Some compilers may recognize and intrinsify ffs to be faster than the
supplied C version. Also, the case of x86 using gcc is special-cased
to an asm instruction, so is already as fast as it can be, and so
this setting has no effect. Similarly for Win32 under recent MS compilers.
(On most x86s, the asm version is only slightly faster than the C version.)"
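To show the trade-off that comment describes, here is a small sketch of a plain C "find first set" next to the compiler builtin. Neither function is copied from the malloc source: ffs_portable and ffs_builtin are names I picked, and __builtin_ffs assumes GCC or Clang.

```c
/* Portable C version: returns the 1-based position of the lowest set
 * bit in x, or 0 if x has no bits set. */
static int ffs_portable(unsigned int x)
{
    int index = 1;

    if (x == 0)
        return 0;

    /* Shift right until the lowest set bit reaches position 0. */
    while ((x & 1u) == 0) {
        x >>= 1;
        index++;
    }
    return index;
}

/* Builtin version (GCC/Clang): the compiler may lower this to a single
 * bit-scan instruction on x86, or to its counterpart on other CPUs. */
static int ffs_builtin(unsigned int x)
{
    return __builtin_ffs((int)x);
}
```

Both return the same answers, so the builtin (or hand-written asm) only buys speed, and per the comment above not much of it on most x86 chips.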

The other package, NSPR, seems to have only a small amount of assembly, used for atomics in a small number of files. It could probably be rewritten in C/C++, especially since the Netscape Portable Runtime is meant to be a platform-neutral API aimed at web and server-based applications (though it could be used for anything, I suppose). However, speed might play a bigger role for NSPR than it does for the CPUID function in Ogre. It was hard to tell whether the assembler was hand-written or taken from a library; my guess is library code. It could probably be built on aarch64 without too much trouble at the moment, even though the assembly code here is for x86. I didn't look at NSPR directly, as our group simply split the responsibility for the packages one apiece, and as such I do not have any code snippets for it.
