When I implemented the st_alias handing (d8a78fc849) I was
unsure if I needed to unalias aliased defs too, but it turns out to have
been necessary: this is what caused my 2d PGA dynamics test to blow up
strangely due to the GA vector loaded from an array into a local
variable getting the local var replaced by a temp but the var itself
being read later in the code (uninitialized variable due to incorrect
optimization... oops).
When sum_expr gets a null expression as one of its args, it simply
returns the other arg, but that arg needs to have the correct type
applied.
Handle zero (null result) cross product in bivector geometric product.
Clean up types in bivector * vector geometric product.
I'm not sure the regressive product is right (overall sign), but that's
actually partly a problem in the math itself (duals and the regressive
product still get poked at, so it may be just a matter of
interpretation).
I'm not sure anything other than == or != has much meaning on anything
but scalars and pseudo scalars, but all comparisons are supported as a
simple boolean test. Any missing components are assumed to be 0. If
nothing else, it makes unit tests easier to write.
It turns out the algebra types make expression dag creation much more
difficult resulting in missed optimizations (eg, recognizing `a × a`).
This fixes the dead cross products in `⋆(s.B × ⋆s.B)`
The switch to using expression dags instead of trees meant that the
statement generator could traverse sub-expressions multiple times. This
is inefficient but usually ok if there are no side effects. However,
side effects and branches (usually from ?:, due to labels) break: side
effects happen more than once, and labels get emitted multiple times
resulting in orphaned statement blocks (and, in the end, uninitialized
temporaries).
Just running through the list of expressions in a block expression
results in label expressions within the block getting printed by
expressions that reference them and thus don't receive the correct next
pointer and wind up pointing to themselves. Printing the labels first
ensures they have the correct next pointer. However, I suspect there are
other ways things will get tangled.
I'm surprised it took almost two years to discover that I had no
quaternion multiplications in any test code, but getting an ICE for a
quaternion-vector product, and the Hadamard product for
quaternion-quaternion was a bit of a nasty surprise.
This makes a slight improvement to the commutator product in that it
removes the expand statement, but there's still the problem of (a+a)/2.
However, at least now the product is correct and slightly less abysmal.
This takes advantage of evaluate_constexpr to do all the work. Necessary
for use of basis blade constants in algebra contexts (avoids an internal
error).
This fixes the upostop-- test by auto-casting implicit constants to
unsigned (and it gives a warning for signed-unsigned comparisons
otherwise). The generated code isn't quite the best, but the fix for
that is next.
Also clean up the resulting mess, though not properly. There are a few
bogus warnings, and the legit ones could do with a review.
It's actually good enough for building qfcc: it fails a couple of
obscure c23 preprocessor examples that can be ignored for now (the rest
of QF builds without any apparent issue).
The expansion is necessary for the final test in preproc-2.r, but breaks
preproc-1.r because the closing ')' is *not* visible to collect_args
(its assumption is incorrect). This needs reworking (and probably
rethinking) of the entire macro argument collection, but I need a little
break from the preprocessor (and it's good enough for *most* uses), so
I'm adding the code (disabled) in order to avoid losing it and my notes
about the problem.
That is, if anything other than '(' (even a macro/argument that expands
to '(')is seen while checking for a function-type macro, the expansion
fails. This gets preproc-1.r working properly.
It really affected only collect_args, but in fixing that I noticed I was
very inconsistent with scanner's type (should be yyscan_t or void *, but
not yyscan_t *).
There's no guarantee the source file is in a writable directory (in
fact, it is very definitely in a read-only directory when running
`make distcheck`). However, it is reasonable to assume the output file
is being written to a writable directory thus default the object file
directory to that of the output file, but still use the source file's
name for the object file name.
Fixes#51
This gets most of the second preprocessor test working, apparently just
some problems with macro arguments not getting expanded for ## (unless
there's more lurking, of course, which I know there is for __LINE__).
It just feels cleaner than unnecessarily copying token chains. It turns
out that the core problem was just order of operations in next_token:
moving the pending_macro code to after arg/macro detection seems to be
correct (even bare `G LPAREN() 0)` is *not* expanding `G`, as expected).
It turned out I had simply forgotten to ensure the token chains were
properly terminated (the struct copy would copy the next of the source
token and thus macro args always expanded to the last token of the
parent macro). And then I'd missed saving the token text when parsing
predefined macros. __VA_OPT__ is still a problem, but this work was for
making that a little easier.
I got tired of the way the separate token types for macro expansion and
the rest of the preprocessor parser were handled. This makes them a
little more unified. Macro expansion seems to be slightly broken again
in that min/max/bound mess up badly, and __VA_OPT__ does things in the
wrong order, but I wanted to get this in as a checkpoint.
__VA_ARGS__ seems to be working but __VA_OPT__ still needs a lot of work
for dealing with its expansions, but basic error checking and simple
expansions seem to work.
Macros now store their arguments and have a cursor pointing to the next
token to take from their expansion list. While not checked yet, this
will make avoiding recursive macro invocations much easier. More
importantly, it's a step closer to correct argument expansion (though
token pasting is currently broken).
This makes it much easier to keep track of end of file in a conditional
block (#if...#endif) as #include in non-suppressed code would result in
spurious eof errors otherwise. I'm a little concerned about correctness,
but everything seems to work and it should be right as suppressed
include directives do not change the state at all, and the suppressed is
its own flag not in the condition stack.
The op code needs to be set just before being passed to the qc parser so
it doesn't get lost in macro expansion.
And vector values need to not be processed when recording otherwise they
get lost.
I'm not sure what exactly caused vector literals to break, but bailing
out of the vector ops section on conversion to vector or quaternion
fixes game-source.
Allow 32-bit positive values without a warning and warn on conversion
issues for float. The whole conversion system needs cleaning up for
v6/v6p/ruamoko.
-D options weren't counting correctly so build_cpp_args was writing past
the end of the array allocated for command line arguments
parse_cpp_name had an out-by-one resulting in reading past the end of
the string.
The qfcc system include path was being set in the wrong place (not sure
why I thought that was right), and not respecting no_default_paths.
-M was generating preprocessor output when it should not have been,
resulting in corrupted dependency files.
I don't remember why I couldn't get this to work last time I tried, but
it went well this time, and made a significant difference to compiling
vulkan.r (from 1.1s to 0.15s unoptimized or 0.8s optimized).
This fixes the problem with // comments after the file in #include and
the core problem the complicated // handling tried to fix with
suppressed directives. Funny how it's always the simpler code that works
better :/
I'm undecided about @ in macro names, but treating @id as one token in
the body is necessary with the single-pass tokenizing. Fixes an infinite
macro expansion loop in vecaddr.r (`#define dot @dot`), but that's
really only a bandaid for *that* issue as there are plenty of other
cases where macros will loop.
It turns out that the recursive lexing was over-complicated as the
tokens for nested macros need to come from the expanded stream, not the
raw input stream.
cpp does this for us so the double-generation is redundant (and
currently wrong anyway with things like <built-in> and <command-line>
getting into the list of dependencies).
Or at least mostly so. The __QFCC__ define isn't visible, and it seems
undef might not be working properly (ruamoko/lib/types.r doesn't
compile). Of course, there's still the issue of whether it's compiling
correctly.
If ID gets to the preprocessor parser in expressions, the ID is not
defined because if it was defined, it would have been expanded. Thus,
all IDs are 0.
In addition to cleaning up the old flex line rules, this improves
handling of the '# num "file" flags' from cpp to at least parse the
additional flags (support for the system header flag might come later,
but I doubt the extern-c flag will have much meaning).
QuakePascal has lost its line directive handling (no errors, but dead
rules) for now. Eventually the lexers will be merged.
I had wanted to do this earlier but shied away from the large edit. Now
it became more necessary (and will become even more necessary when I get
to the glsl front-end).
Really, function-type macros expand too, but incorrectly as the
parameters are not parsed and thus not expanded, but this gets the basic
handling implemented, including # and ## processing.
Converting ID and char constants too early resulted in poor handling of
keywords and spurious diagnostics about multi-byte character constants,
particularly with -E (preprocess-only)
Mostly white space, but a little more consistency in handling and remove
the use of REJECT (which does actually seem to make a difference, but it
could be just noise).
As far as I can tell, the preprocessor numbers conform with C23 except
for a couple of extensions (both ' and _ work for digit separators, and
d/D work for explicit doubles (since qfcc current defaults to float
instead of double)). This massively cleaned up the numeric rules and
even took care of some UB in the vector parsing code (I'm not sure which
is more surprising: that I didn't see it at the time, or that it was
blindingly obvious now).
This will be used for unifying preprocessing and parsing, the idea being
that the tokens will be recorded for later expansion via macros, without
the need to retokenize.
While my modified version is needed to actually avoid warnings (vs
upstream git flex), the files still work with debian's flex (with no
warnings). I needed to update (and fix) flex so the lexer line numbers
would be correct.
The improved location tracking isn't used yet, but was fairly invasive
on the bison-flex api.
It also cleans up some of the ancient workarounds for bad flex cores.
It turns out I need to create my own cpp in order to handle glsl's
directives. I've decided to make a unified lexer, and continuation lines
seemed a good place to start.
This fixes the really odd bug of certain string values getting swapped
in vkgen when DEBUG_QF_MEMORY was defined in expr.c. It will also
prevent a lot of fun with floats in the future, I imagine.
It's now meant only for ALLOC. Interestingly, when DEBUG_QF_MEMORY is
defined in expr.c, something breaks badly with vkgen (no sniffles out of
valgrind, though), but everything is fine with it not defined. It seems
there may be some unpleasant UB going on somewhere.
I'm not sure why this showed up now (I guess just not enough large
immediate values), but this fixes a segfault in the algtypes test (the
mystery is why it showed up this late).
This gets my `m * p * ~m` code as optimal as possible if my counting is
correct (this does not include the extra extends and add needed to merge
the values). Also, there might be a possibility of recombining some ops
into a vector op, but I'm happy with this.
That is, `x+x -> 2*x` (and similar for higher counts). Doesn't make much
difference for just 2, but it will make collecting scales easier and I
remember some testing showing that `2*x` is faster than `x+x` for
floating point.
Of course, motor-point keeps bouncing around numerically :/
This fixes the motor-point.r test (ie, the sub-type field selector works
on mono-group types now). Still need to sort out something for scalars
(but I imagine that can work only in an @algebra context).
This allows them to be matched with cancelling factors. My fancy zero
test is now just that: a fancy zero:
typedef @algebra(float(3,0,1)) PGA;
typedef PGA.group_mask(0xa) bivector_t;
typedef PGA.group_mask(0x1e) motor_t;
typedef PGA.tvec point_t;
typedef PGA.vec plane_t;
plane_t
apply_motor (motor_t m, point_t p)
{
return (m * p * ~m).vec;
}
0000 nop there were plums...
0001 adjstk 0, 0
apply_motor:
motor.r:32:{
0002 with 2, 0, 1
motor.r:33: return (m * p * ~m).vec;
0003 return (<void>)
The motor-point.r test fails because it uses (m * p * ~m).tvec to get
the value but the type system is slightly broken in that a mono-group
algebra type does not have a structure associated with it and thus the
"missing" field results in 0. Yes, I spent too long chasing that one,
too.
I'd missed this in the previous commit, which was a good thing, really,
as it turns out this was the trigger of the bug that causes my fancy
zero test to become non-zero. It seems the bug is in either
component_sum or in the extend merging.
Or really, the implementers. This gets my fancy zero test down to just
unrecognized permutations of commutative multiplies and dot products
(with the multiplies above the dot products).
While this does "explode" the instruction count (I'll have to collect
like terms later), it does allow for many more opportunities for things
to cancel out to 0 (once (pseudo)commutativity is taken care of).
That is, dot(scale(A,a),scale(B,b)) -> (a*b)*dot(A,B). Does the right
thing when only one side is a scale. No change to the instruction count
in my fancy zero, but it does open more opportunities when I distribute
products.
That is, scale(scale(A,b),c) becomes scale(A,b*c), thus giving the
expression dag more opportunities to find common sub-expressions. My
fancy zero test is down to 20 total instructions (including overhead, or
16 for the actual algebra).
While splitting up the scaled vector into scaled xyz and scaled w does
cost an extra instruction, it allows for other optimizations to be
applied. For one, extends get all the way to the top now, and there are
at most two (in my test cases), thus either break-even or even a slight
reduction in instruction count. However, in the initial implementation,
I forgot to do the actual scaling, and 12 instructions were removed from
my fancy zero case, but real tests failed :P It looks like it's just
distributivity and commutativity holding things back (eg,
omega*gamma*sigma - gamma*omega*sigma: should be 0, but not recognized
as that).
This fixes the motor test :) It turns out that every lead I had
previously was due to the disabling of that feature "breaking" dags
(such that expressions wouldn't be found) and it was the dagged
multi-vector components getting linked by expr->next that made a mess of
things.
Or at least mostly so (there are a few casts). This doesn't fix the
motor bug, but I've wanted to do this for over twenty years and at least
I know what's not causing the bug. However, disabling fold_constants in
expr_algebra.c does "fix" things, so it's still a good place to look.
This doesn't fix the motor bug, but it doesn't make it worse. However,
it does simplify the trees quite a bit, so it should be easier to debug.
It seems the problem has something to do with fold_constants messing up
dagged aliases: in particular, const-folding multiplication by e0123 in
3d PGA as fold_constants sees it as 1.
I'm not yet sure what went wrong, but the introduction of dags broke
something in my set_transform function (perhaps the dual?), but it's
something to do with the symbol being dagged (I guess because it's
required for everything else to dag). However, the strangest thing is
the error shows up with 155a8cbcda which
is before dags had any direct effect on the geometric algebra code. I
have a sneaking suspicion it's yet another convert_name issue.