This cleans up dprograms_t, making it easier to read and see what chunks
are in it (I was surprised to see only 6, the explicit pairs made it
seem to have more).
While I think the reason the dags code moved an instruction before
adjstk and with was they shared a constant with that instruction (which
is a different bug), this ensures other instructions cannot get
reordered in front of adjstk and with, as doing so would cause any such
instructions to access incorrect data.
The goal was to get lea being used for locals in ruamoko progs because
lea takes the base registers into account while the constant pointer
defs used by v6p cannot. Pointer defs are still used for gobals as they
may be out of reach of 16-bit addressing.
address_expr() has been simplified in that it no longer takes an offset:
the vast majority of the callers never passed one, and the few that did
have been reworked to use other mechanisms. In particular,
offset_pointer_expr does the manipulations needed to add an offset
(unscaled by type size) to a pointer. High-level pointer offsets still
apply a scale, though.
Alias expressions now do a better job of hanling aliasing of aliases by
simply replacing the target type when possible.
It's possible I lost the child printing when creating the return
expressions, but dot diagrams are much more useful when they don't have
nodes with just pointer values.
The parameter defs are allocated from the parameter space using a
minimum alignment of 4, and varargs functions get a va_list struct in
place of the ...
An "args" expression is unconditionally injected into the call arguments
list at the place where ... is in the list, with arguments passed
through ... coming after the ...
Arguments get through to functions now, but there's problems with taking
the address of local variables: currently done using constant pointer
defs, which can't work for the base register addressing used in Ruamoko
progs.
With the update to test-bi's printf (and a hack to qfcc for lea),
triangle.r actually works, printing the expected results (but -1 instead
of 1 for equality, though that too is actually expected). qfcc will take
a bit longer because it seems there are some design issues in address
expressions (ambiguity, and a few other things) that have pretty much
always been there.
The aux use ops need to be counted and given nodes explicitly as they
may refer to defs that are not accessed by other statements other than
by aliases, and those aliases need to be marked live as well as the used
def.
This is part of the work for #26 (Record resource pointer with builtin
function data). Currently, the data pointer gets as far as the
per-instance VM function table (I don't feel like tackling the job of
converting all the builtin functions tonight). All the builtin modules
that register a resources data block pass that block on to
PR_RegisterBuiltins.
This will make it possible for the engine to set up their parameter
pointers when running Ruamoko progs. At this stage, it doesn't matter
*too* much, except for varargs functions, because no builtin yet takes
anything larger than a float quaternion, but it will be critical when
double or long vec3 and vec4 values are passed.
Storing a variable into a dereference pointer (*p = x) is not marking
the variable as used (due to a mistake while converting to Ruamoko
statement format) resulting in assignments to that variable being
dropped due to it being a dead assignment as the assignment to the
variable and the storing need to be in separate basic blocks (thus the
call in the test, though an if would have worked, I think) for the bug
to trigger.
The problem was a missed change when switching the internal statement
format to Ruamoko: I "used" the statement's operands directly rather
than the rotated ones when emitting v6p progs. Fixes a compile segfault
when NOT optimizing.
There was an out-by-one where attempting to run a program with only one
argument would result in the argument not being passed to the program
(two worked). This is actually the source of the error fixed in
9347e4f901 because test-harness.c was the
basis for qwaq's main.c
While all base registers can be used for any purpose at any time (this
is why the with instruction has hard-absolute modes: you can never get
permanently lost), qfcc currently uses the convention of register 0 for
globals and register 1 for stack locals (params, locals, function args).
The register used to access a def is stored in the def and that is used
to set the register bits in the instruction opcode.
The def code actually doesn't know anything about any conventions: it
assumes all defs are global for non-temp defs (the function code updates
the defs before emitting code) and the current function provides the
register to use for any temp defs allocated while emitting code.
Seems to work well, but debug is utterly messed up (not surprised, that
will be tricky).
Still need to get the base register index into the instructions, but I
think this is it for basic code generation. I should be able to start
testing Ruamoko properly fairly soon :)
Thanks to the use/def/kill lists attached to statements for pseudo-ops,
it turned out to be a lot easier to implement flow analysis (and thus
dags processing) than I expected. I suspect I should go back and make
the old call code use them too, and probably several other places, as
that will greatly simplify the edge setting.
The means that the actual call expression is not in the statement lint
of the enclosing block expression, but just its result, whether the call
is void or not. This actually simplifies several things, but most
importantly will make Ruamoko calls easier to implement.
The test is because I had some trouble with double-calls, and is how I
found the return-postop issue :P
Commit 76b3bedb72 broke more than just the
swap test, but at least I know I need to get an edge in the dag.
Currently, the following code is generated: return and add are reversed.
../tools/qfcc/test/return-postop.r:8: return counter++;
0001 store.i counter, .tmp0
0002 return .tmp0
0003 add.i .tmp0, (1), counter
However, I don't want to deal with it right now, so it's marked XFAIL.
Since Ruamoko now uses the stack for parameters and locals, parameters
need to come after locals in the address space (instead of before, as in
v6 progs). Thus use separate spaces for parameters and locals regardless
of the target, then stitch them together appropriately for the target.
The third space is used for allocating stack space for arguments to
called functions. It us not used for v6 progs, and comes before locals
in Ruamoko progs.
Other than the return value, and optimization (ice, not implemented)
calls in Ruamoko look like they'll work.
Thanks to me having done something right 20 years ago, that was pretty
easy :). The two boolean types aren't supported yet because I haven't
decided on just how to represent their types in qfcc.
This seems to be the most reasonable approach to allocating space for
function call parameters without using push and pop (or adding to the
stack pointer), though it's probably good even when using push and pop
to help keep things aligned.
My little test program now builds with the Ruamoko ISA :)
void cp (int *dst, int *src, int count)
{
while (count--) {
*dst++ = *src++;
}
}
Calls are broken (unimplemented), and non-void returns are not likely to
work either (only partially implemented).
Operand width is encoded in the instruction opcode, so the width needs
to be accounted for in order to select the correct instruction. With
this, my little test generates correct code for the ruamoko ISA (except
for return, still fails).
For the most part, it wasn't too bad as it's just a rotation of the
operands for some instructions (store, assign, branch), but dealing with
all the direct accesses to specific operands was a small pain. I am very
glad I made all those automated tests :)
This makes the v6p instruction table consistent with the ruamoko
instruction table, and clears up some of the ugliness with the load,
store, and assign instructions (. .= and = are now spelled out). I think
I'd still prefer an enum code (faster) but at least this is more
readable.
Missed this case in duplicate_type. Allows "short foo" and
"sizeof(short)" (even though qfcc and the engine have two ideas of the
size: I expect trouble later).