I guess I'd wanted to avoid supporting automatic types in type
functions, but that broke long and unsigned int: either no type, or just
int, depending on the decl style.
Unfortunately, it turns out that generic functions are over-grouped: all
functions with the same name get the last definition, so `mix` with a
float (which should get GLSLstd450FMix) gets the bool version instead
(SpvOpSelect).
When the number of supplied vectors matches the number of columns in the
matrix and all vectors have the same width as the number of rows in the
matrix, there's no need to expand the vectors into components only to be
gathered again.
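A minimal sketch of that check, assuming the constructor arguments are
described only by their vector widths. The function name and parameters
are hypothetical, not the compiler's actual code:

```c
#include <stdbool.h>

// True when the arguments are already the matrix's columns: one vector
// per column, each as wide as the matrix has rows. In that case the
// vectors can be used directly instead of being split into components.
static bool
args_are_columns (const int *widths, int count, int columns, int rows)
{
    if (count != columns) {
        return false;
    }
    for (int i = 0; i < count; i++) {
        if (widths[i] != rows) {
            return false;
        }
    }
    return true;
}
```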
SPIR-V requires that matrices are constructed from vectors rather than
individual components. While not optimal, iqm.vert's output now passes
spirv-val. There's probably still a lot wrong in the fine details.
I think I need to come up with a better way of defining GLSL builtins:
the strings are rather ugly (and I don't want to use qfcc system header
files).
iqm.vert now compiles, but doesn't pass validation yet (matrix bugs).
And, nicely, simplify it quite a bit. I'm not sure why I didn't think
of this approach before. While the ruamoko back-end doesn't support
matrices yet, the expressions are handled.
As a side effect, type checking on comparisons is "stricter" in that
more potentially bogus comparisons (eg, int-float) are caught, resulting
in a few warnings in ruamoko code and even finding a couple of bugs.
Using bit masks for valid source types for each destination type makes
the logic a lot easier to read. Still had to have some explicit checks
for enums and bools.
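A rough sketch of the bit mask idea. The type ids and the conversion
rules in the table are made up for illustration and are not the
compiler's actual enum or rules:

```c
#include <assert.h>
#include <stdbool.h>

// Hypothetical type ids; the real compiler's enum differs.
typedef enum {
    T_INT, T_UINT, T_LONG, T_FLOAT, T_DOUBLE, T_COUNT
} type_id;

#define BIT(t) (1u << (t))

// valid_src[dst] has a bit set for every source type accepted by dst.
// The entries are illustrative only.
static const unsigned valid_src[T_COUNT] = {
    [T_INT]    = BIT (T_INT),
    [T_UINT]   = BIT (T_INT) | BIT (T_UINT),
    [T_LONG]   = BIT (T_INT) | BIT (T_UINT) | BIT (T_LONG),
    [T_FLOAT]  = BIT (T_INT) | BIT (T_UINT) | BIT (T_FLOAT),
    [T_DOUBLE] = BIT (T_INT) | BIT (T_UINT) | BIT (T_LONG)
                 | BIT (T_FLOAT) | BIT (T_DOUBLE),
};

// One table lookup and one AND replace a chain of per-type checks.
static bool
can_convert (type_id dst, type_id src)
{
    assert (dst < T_COUNT && src < T_COUNT);
    return (valid_src[dst] & BIT (src)) != 0;
}
```

Special cases such as enums and bools would still need their own checks
before the table lookup.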
The code gen for return statements checks for out/inout parameters in
the current function and thus could result in some undesired behavior
when constants are evaluated within such a function.
The various indices were a little inconsistent making modifications
tricky. I discovered that signed left-shifts are considered UB if the
value overflows the non-sign bits (but unsigned left shifts are fine),
and signed right shifts are implementation dependent. Whee. However,
it's likely that signed right shifts can be relied upon, at least well
enough for unit tests. I imagine signed left, too, but I plan on
converting them to unsigned. Negative shift counts are also UB; that's
less of a worry, but it needs "fixing" too (ie, make unsigned). However,
later.
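A sketch of what converting the left shift to unsigned could look like
in C. The helper name is hypothetical. Note that the conversion back to
int32_t is implementation-defined before C23 (it wraps on the usual
two's-complement compilers), and a count of 32 or more is still UB:

```c
#include <assert.h>
#include <stdint.h>

// Left shift done on the unsigned representation, so bits shifted into
// or past the sign bit wrap modulo 2^32 instead of being UB. Taking the
// count as unsigned also rules out negative shift counts.
static int32_t
shl_i32 (int32_t value, uint32_t count)
{
    assert (count < 32);    // shifting by >= the width is still UB
    return (int32_t) ((uint32_t) value << count);
}
```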
I found a need to check for shifts separately (not sure it's the right
approach for that problem, though), and there are a few more math ops
than just +-*/.
I'm not sure what it's useful for, but GLSL has a function for it thus I
decided to add the instruction to the VM, so this is part of the
compiler side.
The name for VMMUL was outright wrong (outer), but both MVMUL and VMMUL
can be mul because of the type and width/columns specifiers. I think
OUTER can too, but I'll leave that for now.
I got a sync validation error on a scatter command (I think) thus the
setting was probably wrong. Most of the parameters are still what they
were, but I'll be able to tweak the barriers as necessary.
Unfortunately, it didn't help with the hang on fetching the light cull
query data when starting in fisheye mode (no hang when enabling fisheye
after startup). I'm not sure what's going on there other than the
queries aren't getting updated: the counts seem to be fine so maybe the
commands aren't running. I've probably got a tangled mess of
pseudo-parallel command buffers: I need to go through my system and
clean everything up.
The tricky bit was figuring out how to get `floor()` out of the
available instructions. It's handy that the comparison ops always
returned floats and didn't force the use of branches.
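One way this can work, as a sketch only: it assumes a truncating
float-to-int conversion is available and that comparisons yield 1.0 or
0.0, which may not match the instructions actually used. The names are
hypothetical:

```c
// Comparison that yields a float, modelling an instruction set whose
// compare ops produce 1.0 or 0.0 instead of driving a branch.
static float
cmp_gt (float a, float b)
{
    return a > b ? 1.0f : 0.0f;
}

// floor(x) for values that fit in an int: truncate toward zero, then
// subtract the comparison result to step down by one whenever the
// truncation rounded up (negative non-integers).
static float
floor_branchless (float x)
{
    float t = (float) (int) x;
    return t - cmp_gt (t, x);
}
```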
Now both width and columns must match for an instruction to be selected.
Found a few errors in my opcode specs, and some minor goofs in the type
system (really just overthinking things when I added matrices).
Only matrix-vector, vector-matrix and vector-vector outer products (no
more room), but that's enough to get decent performance out of
matrix-matrix and matrix-scalar (both of which can be done as a set of
matrix-vector or vector-scalar products).
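As an illustration of the decomposition, a matrix-matrix product can be
written as one matrix-vector product per column of the right-hand
matrix. This is plain C with hypothetical names and an assumed
column-major layout, not the VM's instructions:

```c
// Column-major 4x4: m.col[j][i] is row i of column j.
typedef struct { float col[4][4]; } mat4;

// r = m * v, standing in for a single matrix-vector instruction.
static void
mat4_mul_vec (const mat4 *m, const float v[4], float r[4])
{
    for (int i = 0; i < 4; i++) {
        r[i] = 0;
        for (int j = 0; j < 4; j++) {
            r[i] += m->col[j][i] * v[j];
        }
    }
}

// c = a * b, done as four matrix-vector products (c must not alias a
// or b).
static void
mat4_mul_mat (const mat4 *a, const mat4 *b, mat4 *c)
{
    for (int j = 0; j < 4; j++) {
        mat4_mul_vec (a, b->col[j], c->col[j]);
    }
}
```

Matrix-scalar follows the same pattern with one vector-scalar product
per column.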
Progs version bumped because I found that I'd put the swizzle and 2d
wedge ops in the wrong spot (compared to both intention and docs) and
rather than adjust the docs, I took advantage of the opportunity to get
a nicer layout for the wedge products (nestled into the spare slots left
by the 2x2 matrix ops, which seems fitting as the 2d wedge is the
determinant of a 2x2 matrix).
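For reference, a small sketch of why the two fit together: the 2d wedge
of two vectors is the same expression as the determinant of the 2x2
matrix that has those vectors as its columns. The types and names here
are hypothetical:

```c
typedef struct { float x, y; } vec2;

// 2d wedge product: the signed area spanned by a and b.
static float
wedge2d (vec2 a, vec2 b)
{
    return a.x * b.y - a.y * b.x;
}

// Determinant of the matrix with columns col0 and col1:
//   | col0.x col1.x |
//   | col0.y col1.y |
static float
det2x2 (vec2 col0, vec2 col1)
{
    return col0.x * col1.y - col1.x * col0.y;
}
```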
Implemented via specific overloads of the function.
It's not quite working correctly in that parameter names are taken from
the declaration instead of definition. However, this seems to be an old
bug that went unnoticed due to me almost always using the same parameter
names in declaration and definition.
Also, the code in get_function() is a horrible mess.
However, the basic idea turned out to be simpler than I thought (though
details of the implementation are indeed a little trickier): generic
functions are essentially prototype generators when implemented using
non-generic specialized overloads.