Simple k-vectors don't use structs for their layout since they're just
an array of scalars, but having the structs for group sets or full
multi-vectors makes the system alignment agnostic.
And geometric algebra vectors. This does break things a little in GA,
but it does bring qfcc's C closer to standard C in that sizeof respects
the alignment of the type (very important for arrays).
It's implemented as the Hodge dual, which is probably reasonable until
people complain. Both ⋆ and ! are supported, though the former is a
little hard to see in Consola.
That was surprisingly harder than expected due to recursion and a
not-so-good implementation in expr_negate (it went too high-level thus
resulting in multivec expressions getting to the code generator).
But only for scalar divisors. The simple method of AB†/(BB†) works only
if B is a versor and there's also the problem of left and right
division. Thanks to sudgy for making me stop and think before I actually
implemented anything (though he mentioned only that it doesn't work for
general mutli-vector divisors).
That was tedious. I can't say I'm looking forward to writing the tests
for 3d. And even though trivector . bivector and bivector . trivector
give the same answer, they're not really commutative when it comes to
the code.
Meaning vec3 is aligned to 4 components instead of 1. 2-component ops
use vec2 in the VM thus requiring alignment to boundaries of 2, but 4
seems better as it conforms with OpenGL and Vulkan (and, I imagine,
DirectX, but I doubt QF will ever use DirectX).
The singleton alias resulted in the adjusted swizzles being corrupted
when for the same def. Other than adding properly sized swizzles
(planned), the simplest solution is to (separately) allow alias that
stick out from from the def.
While the progs engine itself implements the instructions correctly, the
opcode specs (and thus qfcc) treated the results as 32-bit (which was,
really, a hidden fixme, it seems).
I didn't particularly like that solution due to the implied extra
bandwidth (probably should profile such sometime), but I think the
extend operations could be merged into simple assignments by the
optimizer at some stage (or further cleaned up when this stuff gets
moved to actual code gen where it should be).
Currently via only the group mask (which is really horrible to work
with: requires too much knowledge of implementation details, but does
the job for testing), but it got some basics working.
It turned out they were always using floats for the source type (meaning
doubles were broken), and not shifting the component in the final sizzle
code meaning all swizzles were ?xxx (neglecting minus or 0). I'd make
tests, but I plan on modifying the instruction set a little bit.
Also, correct the handling of scalars in dot and wedge products: it
turns out s.v and s^v both scale. However, it seems the CSE code loses
things sometimes.
This has shown the need for more instructions, such as a 2d wedge
product and narrower swizzles. Also, making dot product produce a vector
instead of a scalar was a big mistake (works nicely in C, but not so
well in Ruamoko).
The current code is pretty broken when it comes to vector types (losing
the vector and bogus errors among other issues). The whole thing needs a
rework or even just to be tossed in favor of better DAG processing.
I guess Hamish's suggestion made sense at the time, but I found that
with the current instructions, the reversed bivector wasn't so nice to
implement it would need a swizzle as well as the cross-product.
By default. Conversion of quake strings needs to be requested (which is
done by nq and qw clients and servers, as well as qfprogs via an
option). I got tired of seeing mangled source code in the disassembly.
I'm not sure if that was a thinko, typo, or something else, but judging
by the relevant commit message, the use of quaternion and vector was
intended only for advanced progs (v6p).
This makes working with them much easier, and the type system reflects
what's in the multi-vector. Unfortunately, that does mean that large
algebras will wind up having a LOT of types, but it allows for efficient
storage of sparse multi-vectors:
auto v = 4*(e1 + e032 + e123);
results in:
0005 0213 1:0008<00000008>4:void 0:0000<00000000>?:invalid
0:0044<00000044>4:void assign (<void>), v
0006 0213 1:000c<0000000c>4:void 0:0000<00000000>?:invalid
0:0048<00000048>4:void assign (<void>), {v + 4}
Where the two source vectors are:
44:1 0 .imm float:18e [4, 0, 0, 0]
48:1 0 .imm float:1aa [4, 0, 0, 4]
They just happen to be adjacent, but don't need to be.
Scaling now works for multi-vector expressions, and always subtracting
even when addition is wanted doesn't work too well. However, now there's
the problem of multi-vectors very quickly becoming full algebra vectors,
which means certain things need a rethink.
This gets only some very basics working:
* Algebra (multi-vector) types: eg @algebra(float(3,0,1)).
* Algebra scopes (using either the above or @algebra(TYPE_NAME) where
the above was used in a typedef.
* Basis blades (eg, e12) done via procedural symbols that evaluate to
suitable constants based on the basis group for the blade.
* Addition and subtraction of multi-vectors (only partially tested).
* Assignment of sub-algebra multi-vectors to full-algebra multi-vectors
(missing elements zeroed).
There's still much work to be done, but I thought it time to get
something into git.
If a symbol is not found in the table and a callback is provided, the
callback will be used to check for a valid procedural symbol before
moving on to the next table in the chain. This allows for both tight
scoping of the procedural symbols and caching.
Due to joys of pointers and the like, it's a bit of a bolt-on for now,
but it works nicely for basic math ops which is what I wanted, and the
code is generated from the expression.
Only · (dot product) and × (cross product for vector, commutator product
for geometric algebra) have been tested so far, but that involved
fighting with cpp to get it to not convert the · to \U000000b7, which
was rather annoying.
Probably not a good idea for large maps, but handy for generating C
structs for small test maps. Does not include vertices or surfaces, just
the bsp tree itself for now.
Fixing a load of issues related to autoconf and some small source-level issues to re-add clang support.
autoconf feature detection probably needs some addressing - partially as -Werror is applied late.
The source tree is made read-only by `make distcheck`, so writing
temporary files to the source directory is a no-no (really, it's a bit
of a bug in qfcc, as per #51).
With the use of the full type for encoding type aliases, ptraliasenc's
simple check became invalid (it's purpose is to ensure the encoding
doesn't have "null" in it, not the exact encoding itself, but this is
good enough).
Two variables declared as arrays (same size) of different typedefs to
the same base type have their type encodings both pointing to the same
short alias.
From vkgen:
51d3 ty_array [4={int32_t>i}] 207f 0 4
51d9 ty_array [4=i] 1035 0 4
51df ty_alias {>[4=i]} 16 51d9 51e6
51e6 ty_array [4={uint32_t>i}] 2063 0 4
51ec ty_union {tag VkClearColorValue-} tag VkClearColorValue
4ca0 0 float32
51df 0 int32
51df 0 uint32
uint32 should use 51e6 and int32 should use 513d,
I never liked it, but with C2x coming out, it's best to handle bools
properly. I haven't gone through all the uses of int as bool (I'll leave
that for fixing when I encounter them), but this gets QF working with
both c2x (really, gnu2x because of raw strings).
The warning flag check worked too well: it enabled the warning and
autoconf's default main wanted the const attribute. The bug has been
floating around for a while, it seems.
This uses ud-chains for function statements (call/return) to force their
arguments to be live (in particular, indirect references via pointers)
this fixes the arraylife test.
The ud- and du-chains include known side-effects of the instructions and
thus depict a more accurate view of what operands an instruction uses or
defines. Fixes the arraylife2 test.
Like defs, a partial write should not define the whole temp. Thus, copy
the "don't visit main" behavior recently added to def_visit_all. Fixes
missing ud-chains for component-by-component assignments to temporary
vectors.
I'm not certain this is correct, but it seems to me that du-chains are
the same information as ud-chains, but from the defining statement's
point of view instead of that of the using statement.
As certain statements (in particular, function calls) can use additional
variables via pointer parameters, it's necessary to iterate ud-chain
building until the count stabilizes. This should make live variable
analysis much easier.
I think the current build_element_chain implementation does a reasonable
job, but I'm in the process of getting designated initializers working,
thus it will become important to ensure uninitialized members get
initialized.
I never liked the various hacks I had come up with for representing
resource handles in Ruamoko. Structs with an int were awkward to test,
pointers and ints could be modified, etc etc. The new @handle keyword (@
used to keep handle free for use) works just like struct, union and
enum in syntax, but creates an opaque type suitable for a 32-bit handle.
The backing type is a function so v6 progs can use it without (all the
necessary opcodes exist) and no modifications were needed for
type-checking in binary expressions, but only assignment and comparisons
are supported, and (of course) nil. Tested using cbuf_t and QFile: seems
to work as desired.
I had considered 64-bit handles, but really, if more than 4G resource
objects are needed, I'm not sure QF can handle the game. However, that
limit is per resource manager, not total.
This takes advantage of the ud-chains to follow the trail of pointer
assignments looking for an address. This gets array element assignments
surviving across blocks when the array itself is passed to a function.
It doesn't help when the address of the element is taken though. I think
that's a dags problem and probably needs du-chains. Also, the ud-chain
creation should probably be done in two passes so the newly found
information can be recorded.
Def and kill are still handled in flow_analyze_statement, but this makes
call meta data more consistent between v6 and ruamoko progs, allowing
the statement use chain to be used for call argument analysis. It even
found a bug in the extraction of param counts from the call instruction.
I had missed the flowvar clearing for auxiliary use/def/kill operands.
It's possible it wasn't necessary at the time since the operands were
added just for dealloc checking, but there's every reason it could
become necessary.
The first use will be pointer analysis for function arguments where the
argument points to an array to mark the array as live, but I'm sure
there'll be plenty of other uses.
A partial write to a def should not define the whole def, thus
def_visit_all's overlap parameter now has a flag that prevents a visit
to the main def when accessing the def from an alias def. This prevents
a lot of spurious kills and defines in flow analysis.
The array access code was loading the vector, modifying the element,
then forgetting to write the modified vector back to whence it came.
However, that would be rather sub-optimal, so now when the vector is
accessed by a pointer, the array code switches to field access to get at
the vector element thus avoiding the need to copy the whole vector.
Needed for proper analysis (ud-chains etc). Of course, it was then
necessary to remove the parameter defs from the uninitialized defs.
Also, plug a couple of memory leaks (forgot to free some temporary
sets).
That is, `array + offset`. This actually works around the bug
highlighted by arraylife.r (because the array is explicitly used), but
is not a proper solution, so that test still fails of course. However,
with this, it's no longer necessary to use `&array[index]` instead of
`array + index`.
I could never remember what any of the numbers meant. While define is
still a little fuzzy (they're (pseudo)statement numbers), at least now
I'll always know that the numbers are the define set. Also, having the
flow address of the variable helps with understanding the reaching defs
output.
It seems that the optimizer keeps array assignments live when passing
the array as a pointer, but not when passing the address of an element.
Found when testing the following code:
BasisBlade *pga_blades[16] = {
blades[1], blades[2], blades[3], blades[4],
blades[7], blades[6], blades[5], blades[0],
blades[8], blades[9], blades[10], blades[15],
blades[14], blades[13], blades[12], blades[11],
};
BasisGroup *pga_groups[4] = {
[BasisGroup new:4 basis:&pga_blades[ 0]],
[BasisGroup new:4 basis:&pga_blades[ 4]],
[BasisGroup new:4 basis:&pga_blades[ 8]],
[BasisGroup new:4 basis:&pga_blades[12]],
};
Only the first element of pga_blades is being assigned in the optimized
code, but everything is correct when not optimizing.