Ogre is an open source, multiplatform 3D rendering engine used to build 3D applications. I suspect it is most often used for game creation, which is where I first came across it.
Most of the assembly code was spread across a small number of source directories, all under OgreMain/src in the main Ogre tree. The assembler seemed to serve two purposes. One was CPUID: determining what type of CPU the code is running on, so it can decide which instruction set to use. The other seemed to be atomics and locking, used under specific circumstances that I couldn't 100% suss out from the code (partly down to my lack of experience reading real-world code on a regular basis). Most of the time, though, it still seemed to relate back to which set of CPU instructions was to be used. Oddly enough, comments in the code itself note that the assembler used for atomics is included for only slight performance gains over the C++ equivalents.
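To make the CPUID use a little more concrete, here is a minimal sketch of a runtime feature check. It is not Ogre's detection code: the function name cpu_has_sse2 is made up for illustration, and it assumes GCC or Clang on x86, where the <cpuid.h> header provides __get_cpuid. On any other compiler or architecture the sketch simply reports that the feature is absent.

```c
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
#include <cpuid.h>   /* GCC/Clang header wrapping the cpuid instruction */
#define HAVE_X86_CPUID 1
#endif

/* Hypothetical helper (not from Ogre): returns 1 if the CPU reports
   SSE2 support, 0 otherwise or when cpuid is unavailable. */
static int cpu_has_sse2(void)
{
#ifdef HAVE_X86_CPUID
    unsigned int eax, ebx, ecx, edx;

    /* Leaf 1 returns the basic feature flags; __get_cpuid returns 0
       if the requested leaf is not supported. */
    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx))
        return 0;

    /* SSE2 is reported in bit 26 of EDX. */
    return (int)((edx >> 26) & 1u);
#else
    return 0;   /* non-x86 build: no cpuid instruction to ask */
#endif
}
```

A caller would typically run a check like this once at startup and pick an SSE2 or a plain C++ code path based on the result.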
One example of assembler's use in Ogre (found in OgreMain/src/nedmalloc/malloc.c.h.ppc):
/* place args to cmpxchgl in locals to evade oddities in some gccs */
int cmp = 0;
int val = 1;
int ret;
__asm__ __volatile__ ("lock; cmpxchgl %1, %2"
: "=a" (ret)
: "r" (val), "m" (*(lp)), "0"(cmp)
: "memory", "cc");
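For comparison, the same compare-and-swap can be written without inline assembly. The following is a rough sketch using C11 <stdatomic.h>, not code from Ogre or nedmalloc. The function name try_lock_portable is hypothetical, and it assumes the lock word can be declared as an atomic_int. Like the asm above, it hands back the old value of the lock word, so 0 means the lock was free and has now been taken.

```c
#include <stdatomic.h>

/* Hypothetical helper (not from Ogre): one attempt to take a spin lock.
   Returns the previous value of *lp: 0 if we acquired the lock,
   nonzero if someone else already held it. */
static int try_lock_portable(atomic_int *lp)
{
    int expected = 0;   /* plays the role of `cmp` in the asm version */

    /* Atomically: if (*lp == expected) *lp = 1;
       otherwise expected is overwritten with the current *lp. */
    atomic_compare_exchange_strong(lp, &expected, 1);

    /* Either way, expected now holds the old value of *lp,
       which is what cmpxchgl leaves in EAX (`ret` above). */
    return expected;
}
```

On x86 a compiler will generally turn this into the same lock cmpxchg instruction, while on other architectures such as aarch64 it emits whatever that target uses for atomic compare-and-swap.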
It's not 100% apparent whether the assembler in Ogre was written specifically for it or taken from an existing library, though some of the in-line comments lead me to believe it was a little of both. From those same comments and the syntax used, the assembly code is meant for x86_64, and for most of it more architecture-agnostic versions (C++ specifically) exist, at a slight performance cost. If the team is putting those kinds of comments in their own source code, I have to imagine that yes, this version of Ogre could relatively easily be ported to and built for aarch64, with very little difference from the current x86_64 version.
Here is one of the comments on Assembly use in Ogre:
"USE_BUILTIN_FFS default: 0 (i.e., not used)
Causes malloc to use the builtin ffs() function to compute indices.
Some compilers may recognize and intrinsify ffs to be faster than the
supplied C version. Also, the case of x86 using gcc is special-cased
to an asm instruction, so is already as fast as it can be, and so
this setting has no effect. Similarly for Win32 under recent MS compilers.
(On most x86s, the asm version is only slightly faster than the C version.)"
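To illustrate what that comment is weighing up, here is a small sketch of a find-first-set written in plain C next to the compiler builtin. The function names are hypothetical and the loop is not the C version shipped with the allocator. It only assumes that GCC or Clang provides __builtin_ffs, which returns one plus the index of the lowest set bit, or 0 for an input of 0.

```c
/* Hypothetical helpers, not taken from Ogre or nedmalloc. */

/* Plain C: 1-based position of the lowest set bit, 0 if x is 0. */
static int ffs_portable(unsigned int x)
{
    int index = 1;

    if (x == 0)
        return 0;
    while ((x & 1u) == 0) {   /* shift until the lowest bit is set */
        x >>= 1;
        ++index;
    }
    return index;
}

/* Same contract, but lets GCC/Clang pick a single instruction
   where the target has one; falls back to the loop elsewhere. */
static int ffs_builtin(unsigned int x)
{
#if defined(__GNUC__)
    return __builtin_ffs((int)x);
#else
    return ffs_portable(x);
#endif
}
```

Both functions give the same answers, so swapping one for the other changes only speed, which fits the comment's point that the asm version is only slightly faster.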