And the tests really exercised VectorShear (first attempt had things
messed up when more than one shear value was non-zero). Also,
Mat4Decompose wasn't orthogonalizing the z axis row. Oops. Anyway,
Mat4Decompose is now known to work well, and the usage of its output is
understood :)
I got the idea from blender when I discovered by accident that quat * vect
produces the same result as quat * qvect * quat* and looked up the code to
check what was going on. While matrix/vector multiplication still beats the
pants off quaternion/vector multiplication, QuatMultVec is a slight
optimization over quat * qvect * quat* (17+,24* vs 24+,32*, plus no need to
to generate quat*).
Buffer underflow and though strcpy has always been safe there, change to
memmove. Had the added benefit of helping me create more test cases for
better coverage.