This removes all the special cases and thus it should be more robust. It
did show up some out-by-one (or a factor of two!) errors elsewhere in
the group mask calculations.
I'm uncertain about the names, but this makes it much easier to get at
specific subtypes (eg, PGA.tvec as a type) rather than having to know
the group mask.
This makes working with the plethora of types a little easier still. The
check for an algebra expression in field_expr needed to be moved up
because the struct code thought the algebra type was a normal vector.
And vis-versa.
I'm not sure what I was thinking, but I've decided that not being able
to cast the pseudo-scalar from float to double (for printf etc) was a
bug.
The merge_blocks function wasn't reporting whether it had done anything
so the thread/merge/dead blocks loop was bailing early. With this,
simple functions (ie, no control flow) are fully visible to the CSE
optimizer and it can get quite aggressive (removed 3 assignments and a
cross product from my barycenter test code).
Failing to promote ints to the algebra type results in a segfault in
assignment of a multi-vector due to the symbol pointer walking off the
end of the list of symbols.
And convert addition to subtraction when extend expressions are not
involved. This has taken my little test down to 56 instructions total
(21 for `l p ~l`), down from 74 (39).
This takes care of chained sums of extend expressions. Now `l p ~l` has
only four extend instructions which is expected for the code not
detecting the cross product that always produces 0.
This goes a ways towards minimizing extend expressions, even finding
zeros and thus eliminating whole branches of expressions, but shows the
need for better handling of chained sums of extends.
Any geometric algebra product of two negatives cancels out the negative,
and if the result is negative (because only one operand was negative),
the negation is migrated to above the operation. This resulted in
removing 2 instructions from one if my mini-tests (went from 74 to 78
with the addition/subtraction change, but this takes it back to 76
instructions).
Summed extend expressions are used for merging a sub-vector with a
scalar. Putting the vector first in the sum will simplify checks later
on (it really doesn't matter which is first so long as it's consistent).
Subtraction is implemented as adding a negative (with the plan of
optimizing it later). The idea is to give tree inspection and
manipulation a more consistent view without having to worry about
addition vs subtraction.
Negation is moved as high as possible in the expression, but is always
below an extend expression. The plane here is that the manipulation code
can bypass an alias-add-extend combo and see the negation.
This doesn't affect the generated code (aliases are free), but does
simplify the dag significantly, thus optimizing the compiler somewhat,
but also makes reading dags much easier and therefore optimizing the
debugging process.
Because the aliases were treated as live, every alias of a temp resulted
in an assignment, which proved to be quite significant (4-5 assignments
in some simple GA expressions). By using an alias node in the dag, the
unaliased temp can be marked live while the alias is treated as an
operation rather than an operand. Now my GA expressions have no
superfluous assignments (generally no assignments at all).
Simple k-vectors don't use structs for their layout since they're just
an array of scalars, but having the structs for group sets or full
multi-vectors makes the system alignment agnostic.
And geometric algebra vectors. This does break things a little in GA,
but it does bring qfcc's C closer to standard C in that sizeof respects
the alignment of the type (very important for arrays).
It's implemented as the Hodge dual, which is probably reasonable until
people complain. Both ⋆ and ! are supported, though the former is a
little hard to see in Consola.
That was surprisingly harder than expected due to recursion and a
not-so-good implementation in expr_negate (it went too high-level thus
resulting in multivec expressions getting to the code generator).
But only for scalar divisors. The simple method of AB†/(BB†) works only
if B is a versor and there's also the problem of left and right
division. Thanks to sudgy for making me stop and think before I actually
implemented anything (though he mentioned only that it doesn't work for
general mutli-vector divisors).
That was tedious. I can't say I'm looking forward to writing the tests
for 3d. And even though trivector . bivector and bivector . trivector
give the same answer, they're not really commutative when it comes to
the code.
Meaning vec3 is aligned to 4 components instead of 1. 2-component ops
use vec2 in the VM thus requiring alignment to boundaries of 2, but 4
seems better as it conforms with OpenGL and Vulkan (and, I imagine,
DirectX, but I doubt QF will ever use DirectX).
The singleton alias resulted in the adjusted swizzles being corrupted
when for the same def. Other than adding properly sized swizzles
(planned), the simplest solution is to (separately) allow alias that
stick out from from the def.
While the progs engine itself implements the instructions correctly, the
opcode specs (and thus qfcc) treated the results as 32-bit (which was,
really, a hidden fixme, it seems).
I didn't particularly like that solution due to the implied extra
bandwidth (probably should profile such sometime), but I think the
extend operations could be merged into simple assignments by the
optimizer at some stage (or further cleaned up when this stuff gets
moved to actual code gen where it should be).
Currently via only the group mask (which is really horrible to work
with: requires too much knowledge of implementation details, but does
the job for testing), but it got some basics working.
It turned out they were always using floats for the source type (meaning
doubles were broken), and not shifting the component in the final sizzle
code meaning all swizzles were ?xxx (neglecting minus or 0). I'd make
tests, but I plan on modifying the instruction set a little bit.
Also, correct the handling of scalars in dot and wedge products: it
turns out s.v and s^v both scale. However, it seems the CSE code loses
things sometimes.
This has shown the need for more instructions, such as a 2d wedge
product and narrower swizzles. Also, making dot product produce a vector
instead of a scalar was a big mistake (works nicely in C, but not so
well in Ruamoko).
The current code is pretty broken when it comes to vector types (losing
the vector and bogus errors among other issues). The whole thing needs a
rework or even just to be tossed in favor of better DAG processing.
I guess Hamish's suggestion made sense at the time, but I found that
with the current instructions, the reversed bivector wasn't so nice to
implement it would need a swizzle as well as the cross-product.
By default. Conversion of quake strings needs to be requested (which is
done by nq and qw clients and servers, as well as qfprogs via an
option). I got tired of seeing mangled source code in the disassembly.
I'm not sure if that was a thinko, typo, or something else, but judging
by the relevant commit message, the use of quaternion and vector was
intended only for advanced progs (v6p).
This makes working with them much easier, and the type system reflects
what's in the multi-vector. Unfortunately, that does mean that large
algebras will wind up having a LOT of types, but it allows for efficient
storage of sparse multi-vectors:
auto v = 4*(e1 + e032 + e123);
results in:
0005 0213 1:0008<00000008>4:void 0:0000<00000000>?:invalid
0:0044<00000044>4:void assign (<void>), v
0006 0213 1:000c<0000000c>4:void 0:0000<00000000>?:invalid
0:0048<00000048>4:void assign (<void>), {v + 4}
Where the two source vectors are:
44:1 0 .imm float:18e [4, 0, 0, 0]
48:1 0 .imm float:1aa [4, 0, 0, 4]
They just happen to be adjacent, but don't need to be.
Scaling now works for multi-vector expressions, and always subtracting
even when addition is wanted doesn't work too well. However, now there's
the problem of multi-vectors very quickly becoming full algebra vectors,
which means certain things need a rethink.
This gets only some very basics working:
* Algebra (multi-vector) types: eg @algebra(float(3,0,1)).
* Algebra scopes (using either the above or @algebra(TYPE_NAME) where
the above was used in a typedef.
* Basis blades (eg, e12) done via procedural symbols that evaluate to
suitable constants based on the basis group for the blade.
* Addition and subtraction of multi-vectors (only partially tested).
* Assignment of sub-algebra multi-vectors to full-algebra multi-vectors
(missing elements zeroed).
There's still much work to be done, but I thought it time to get
something into git.
If a symbol is not found in the table and a callback is provided, the
callback will be used to check for a valid procedural symbol before
moving on to the next table in the chain. This allows for both tight
scoping of the procedural symbols and caching.
Due to joys of pointers and the like, it's a bit of a bolt-on for now,
but it works nicely for basic math ops which is what I wanted, and the
code is generated from the expression.
Only · (dot product) and × (cross product for vector, commutator product
for geometric algebra) have been tested so far, but that involved
fighting with cpp to get it to not convert the · to \U000000b7, which
was rather annoying.
Fixing a load of issues related to autoconf and some small source-level issues to re-add clang support.
autoconf feature detection probably needs some addressing - partially as -Werror is applied late.
I never liked it, but with C2x coming out, it's best to handle bools
properly. I haven't gone through all the uses of int as bool (I'll leave
that for fixing when I encounter them), but this gets QF working with
both c2x (really, gnu2x because of raw strings).
The warning flag check worked too well: it enabled the warning and
autoconf's default main wanted the const attribute. The bug has been
floating around for a while, it seems.