This gets my `m * p * ~m` code as optimal as possible, if my counting is
correct (this does not include the extra extends and add needed to merge
the values). Also, it might be possible to recombine some ops into a
vector op, but I'm happy with this.
That is, `x+x -> 2*x` (and similar for higher counts). Doesn't make much
difference for just 2, but it will make collecting scales easier and I
remember some testing showing that `2*x` is faster than `x+x` for
floating point.
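(To illustrate the collection angle with a made-up example: once repeated
terms are already scales, folding 2*x + 3*x into 5*x is just adding the
constant factors, whereas folding x + x + x means counting identical
operands across the tree.)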
Of course, motor-point keeps bouncing around numerically :/
This fixes the motor-point.r test (ie, the sub-type field selector works
on mono-group types now). Still need to sort out something for scalars
(but I imagine that can work only in an @algebra context).
This allows them to be matched with cancelling factors. My fancy zero
test is now just that: a fancy zero:
typedef @algebra(float(3,0,1)) PGA;
typedef PGA.group_mask(0xa) bivector_t;
typedef PGA.group_mask(0x1e) motor_t;
typedef PGA.tvec point_t;
typedef PGA.vec plane_t;

plane_t
apply_motor (motor_t m, point_t p)
{
    return (m * p * ~m).vec;
}
0000 nop there were plums...
0001 adjstk 0, 0
apply_motor:
motor.r:32:{
0002 with 2, 0, 1
motor.r:33: return (m * p * ~m).vec;
0003 return (<void>)
The motor-point.r test fails because it uses (m * p * ~m).tvec to get
the value, but the type system is slightly broken in that a mono-group
algebra type does not have a structure associated with it, and thus the
"missing" field results in 0. Yes, I spent too long chasing that one,
too.
I'd missed this in the previous commit, which was a good thing, really,
as it turns out this was the trigger of the bug that causes my fancy
zero test to become non-zero. It seems the bug is in either
component_sum or in the extend merging.
Or really, the implementers. This gets my fancy zero test down to just
unrecognized permutations of commutative multiplies and dot products
(with the multiplies above the dot products).
While this does "explode" the instruction count (I'll have to collect
like terms later), it does allow for many more opportunities for things
to cancel out to 0 (once (pseudo)commutativity is taken care of).
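(A made-up example of why distributing helps: (a + b)*c - a*c - b*c can't
cancel while the product stays factored, but after distribution it is
a*c + b*c - a*c - b*c, where every term has an equal and opposite partner
and the whole thing folds to 0.)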
That is, dot(scale(A,a),scale(B,b)) -> (a*b)*dot(A,B). Does the right
thing when only one side is a scale. No change to the instruction count
in my fancy zero, but it does open more opportunities when I distribute
products.
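(For example, once the scalars are hoisted out, dot(scale(A,2),B) and
dot(A,scale(B,3)) become 2*dot(A,B) and 3*dot(A,B), so the dag can share
the one dot(A,B) node between them. My example, not one from the test
case.)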
That is, scale(scale(A,b),c) becomes scale(A,b*c), thus giving the
expression dag more opportunities to find common sub-expressions. My
fancy zero test is down to 20 total instructions (including overhead, or
16 for the actual algebra).
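(For instance, scale(scale(A,b),c) and scale(scale(A,c),b) both flatten
to scale(A,b*c), modulo commutativity of the scalar multiply, so two
previously distinct trees become one dag node. Again, a made-up example
rather than one from the test.)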
While splitting up the scaled vector into scaled xyz and scaled w does
cost an extra instruction, it allows other optimizations to be applied.
For one, extends now get all the way to the top, and there are at most
two of them (in my test cases), so it's either break-even or even a
slight reduction in instruction count. However, in the initial
implementation I forgot to do the actual scaling: 12 instructions were
removed from my fancy zero case, but real tests failed :P It looks like
it's just distributivity and commutativity holding things back (eg,
omega*gamma*sigma - gamma*omega*sigma should be 0, but is not recognized
as such).
This fixes the motor test :) It turns out that every lead I had
previously was due to the disabling of that feature "breaking" dags
(such that expressions wouldn't be found), and that it was the dagged
multi-vector components getting linked by expr->next that made a mess of
things.
Or at least mostly so (there are a few casts). This doesn't fix the
motor bug, but I've wanted to do this for over twenty years and at least
I know what's not causing the bug. However, disabling fold_constants in
expr_algebra.c does "fix" things, so it's still a good place to look.
This doesn't fix the motor bug, but it doesn't make it worse. However,
it does simplify the trees quite a bit, so it should be easier to debug.
It seems the problem has something to do with fold_constants messing up
dagged aliases: in particular, const-folding multiplication by e0123 in
3d PGA, since fold_constants sees e0123 as 1.
I'm not yet sure what went wrong, but the introduction of dags broke
something in my set_transform function (perhaps the dual?); it's
something to do with the symbol being dagged (I guess because it's
required for everything else to dag). However, the strangest thing is
that the error shows up with 155a8cbcda, which is before dags had any
direct effect on the geometric algebra code. I have a sneaking suspicion
it's yet another convert_name issue.
They should be treated as such only when merging vector components. This
fixes a bug that doesn't actually exist (it's in experimental code),
where the sum of two 3-component vectors was getting lost.
For cross products: remove any a from a×(...+/-a...)
For dot products: remove any a×b from a•(...+/-a×b...) (or b×a)
This removed another 2 instructions :)
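The identities behind these rewrites are just a×a = 0 and a•(a×b) = 0. A
throwaway check in plain C (nothing to do with qfcc's internals, just the
math):

#include <stdio.h>

typedef struct { double x, y, z; } vec3;

/* basic vector helpers for the check */
static vec3 cross (vec3 a, vec3 b)
{
    return (vec3) { a.y*b.z - a.z*b.y, a.z*b.x - a.x*b.z, a.x*b.y - a.y*b.x };
}
static double dot (vec3 a, vec3 b) { return a.x*b.x + a.y*b.y + a.z*b.z; }
static vec3 add (vec3 a, vec3 b) { return (vec3) { a.x+b.x, a.y+b.y, a.z+b.z }; }

int main (void)
{
    vec3 a = { 1, 2, 3 }, b = { -4, 5, 0.5 }, c = { 2, -1, 7 };
    /* a × (b + a) == a × b, since a × a == 0 */
    vec3 l = cross (a, add (b, a)), r = cross (a, b);
    printf ("%g %g %g  vs  %g %g %g\n", l.x, l.y, l.z, r.x, r.y, r.z);
    /* a • (c + a×b) == a • c, since a • (a×b) == 0 */
    printf ("%g  vs  %g\n", dot (a, add (c, cross (a, b))), dot (a, c));
    return 0;
}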
They don't have much effect that I've noticed, but the expression dags
code does check for commutative expressions. The algebra code uses the
anticommutative flag for cross, wedge and subtract (unconditional at
this stage). Integer ops that are commutative are always treated as
commutative (or anticommutative). Floating point ops can be controlled
(they default to non-commutative), but there is currently no way to set
the options.
This takes advantage of the expression dag to detect when an expression
is on both sides of a cross product (which always results in 0). This
removes 3 instructions from my motor test (28 to go).
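Roughly how that detection works (a toy sketch, not qfcc's actual expr_t):
with a dag, "the same expression on both sides" is literal pointer
equality, so the whole test is one comparison.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* toy dag node; qfcc's real expression nodes are not this */
typedef struct node_s {
    const char *op;
    struct node_s *left, *right;
} node_t;

/* cross (e, e) is always 0; in a dag both operands of such an
   expression are the same node, so e->left == e->right suffices */
static node_t *fold_cross (node_t *e)
{
    if (!strcmp (e->op, "cross") && e->left == e->right) {
        node_t *zero = calloc (1, sizeof (node_t));
        zero->op = "0";
        return zero;
    }
    return e;
}

int main (void)
{
    node_t v = { .op = "v" };
    node_t c = { .op = "cross", .left = &v, .right = &v };
    printf ("%s\n", fold_cross (&c)->op);   /* prints "0" */
    return 0;
}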
Especially binary expressions. That expressions can now be reused is
what caused the need to make expression lists non-invasive: the reuse
resulted in loops in the lists. This doesn't directly affect code
generation at this stage but it will help with optimizing algebraic
expressions.
The dags are per sequence point (as per my reading of the C spec).
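For reference, a sketch of the general technique (my assumption about the
shape of it, not qfcc's actual data structures): the usual way to get this
reuse is to hash-cons, ie look up (op, left, right) before creating a
binary node and return the existing node if there is one.

#include <stdio.h>
#include <stdlib.h>

/* toy dag node plus a deliberately dumb linear lookup table */
typedef struct node_s {
    int op;
    struct node_s *left, *right;
} node_t;

static node_t *table[1024];
static int count;

/* reuse an existing node for (op, l, r) if one exists; identical
   sub-expressions thus collapse to a single node, which is what lets
   later passes compare whole expressions by pointer */
static node_t *new_binary (int op, node_t *l, node_t *r)
{
    for (int i = 0; i < count; i++) {
        if (table[i]->op == op && table[i]->left == l && table[i]->right == r)
            return table[i];
    }
    node_t *n = calloc (1, sizeof (node_t));
    n->op = op;
    n->left = l;
    n->right = r;
    return table[count++] = n;
}

int main (void)
{
    node_t a = {0}, b = {0};
    printf ("%s\n", new_binary ('+', &a, &b) == new_binary ('+', &a, &b)
            ? "shared" : "distinct");   /* prints "shared" */
    return 0;
}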
The removal of the e tag from expr_t necessitated making convert_name
return a new expression, which resulted in get_type no longer being
enough to both convert a name expression and get the type. This was just
another missed spot. With this, all of game-source except ctf builds.
Finally, that little `e.` is cleaned up. convert_name was a bit of a pain
(in that it relied on modifying the expression rather than returning a
new one, or rather that such behavior was relied on).
Sum expressions pull the negation through extend expressions, allowing
them to switch to subtraction when appropriate, and offset_cast reaches
past negation to check for extend expressions. This has eliminated all
negation-only instructions from the motor-point test, shaving off more
instructions (now 27, not including the return).
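(ie, something along the lines of a + extend(-b) becoming a - extend(b),
so the back end emits a subtract instead of a negate followed by an add.)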
I don't know why I didn't think to do it this way before, but simply
recursing into each operand for + or - expressions makes it much easier
to generate correct code. Fixes the motor-point test.
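As a rough sketch of that shape of recursion (toy code, not the actual
qfcc pass): walk the tree, recurse into the operands of + and - carrying
the sign down, and handle anything else as a leaf.

#include <stdio.h>

/* toy expression node; not what qfcc uses */
typedef struct expr_s {
    char op;                    /* '+', '-', or 'x' for a leaf */
    const char *name;           /* leaf name */
    struct expr_s *e1, *e2;
} expr_t;

/* recurse into nested +/- operands, carrying the sign down, instead of
   trying to match whole sum shapes at once */
static void emit_sum (expr_t *e, int sign)
{
    if (e->op == '+' || e->op == '-') {
        emit_sum (e->e1, sign);
        emit_sum (e->e2, e->op == '-' ? -sign : sign);
    } else {
        printf ("%c%s ", sign < 0 ? '-' : '+', e->name);
    }
}

int main (void)
{
    expr_t a = { 'x', "a" }, b = { 'x', "b" }, c = { 'x', "c" };
    expr_t bc = { '-', 0, &b, &c };     /* b - c */
    expr_t abc = { '+', 0, &a, &bc };   /* a + (b - c) */
    emit_sum (&abc, 1);                 /* prints "+a +b -c " */
    printf ("\n");
    return 0;
}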
That is, passing int constants through ... in Ruamoko progs is no longer
a warning (it still is for v6p and v6 progs). I got tired of getting the
warning for sizeof expressions when int through ... hasn't been a
problem even for most v6p progs, and was intended not to be a problem
for Ruamoko progs.