Ruamoko passes va_list (@args) through the ... parameter (as such), but
IMP uses ... to defeat parameter type and count checking and doesn't
want va_list. While possibly not the best solution, adding a no_va_list
flag to function types and skipping ex_args entirely does take care of
the problem without hard-coding anything specific to IMP.
The system currently just sets some bits in the type specifier (the
attribute list should probably be carried around with the specifier),
but it gets the job done for now, and at least gets things started.
This makes it much easier to check (and more robust to name changes),
allowing for effectively killing the node to which the variable being
addressed is attached. This fixes the incorrect address being used for
va_list, which is what caused double-alias to fail.
In order to not waste instructions, the Ruamoko ISA does not provide 1
and 2 component 64-bit load/store instructions since they can be
implemented using 2 and 4 component 32-bit instructions (load and store
are independent of the interpretation of the data). This fixes the
double test, and technically the double-alias test, but it fails due to
a problem with the optimizer causing lea to use the wrong reference for
the address. It also breaks the quaternion test due to what seems to be
a type error that may have been lurking for a while, further
investigation is needed there.
Since the call instruction in the Ruamoko ISA specifies the destination
of the return value of the called function, it is much like any
expression type instruction in that the def referenced by its c operand
is both defined and killed by the instruction. However, unlike other
instructions, it really has many pseudo-operands: the arguments placed
on the stack. The problem is that when one of the arguments is also the
destination of the return value, the dags code wants to use the stack
argument as it was the last use of the real argument. Thus, instead of
using the value of the child node for the result, use the value label
attached to the call node (there should be only one such label).
This fixes iterfunc, typedef, zerolinker and vkgen when optimizing. Now
all but the double tests and return postop tests pass (and the retun
postop test is not related to the Ruamoko ISA, so fails either way).
That is, updating a variable using a function that takes the same
variable, probably very common in iterators, thus the name. It happens
to be the first qfcc test specific to Ruamoko. It's really just the
typedef, zerolinker, and vkgen type encoding loop stripped down for ease
of debugging.
Of course, it fails :)
I really need to come up with a better way to get the result type into
the flow analyser. However, this fixes the aliasing ICE when optimizing
Ruamoko code that uses struct assignment.
It's currently only 4 (or even 3 for v6) words, but this fixes false
positives when checking for null pointers in Ruamoko progs due to
pr_return pointing to the return buffer and thus outside the progs
memory map resulting in an impossible to exceed value.
or 512kW (kilowatts? :P). Barely enough for vkgen to run (it runs out if
auto release is run during scan_types, probably due to fragmentation). I
imagine I need to look into better memory management schemes, especially
since I want to make zone allocations 64-byte aligned (instead of the
current 8). And it doesn't help that 16 words per allocation are
dedicated to the zone management.
Anyway, with this, vgken runs and produces sufficiently correct results
for the rest of QF to build, so long as qfcc is not optimizing.
Since Z_Malloc uses Z_TagMalloc to do the work, this ensures the check
is always run.
Also, add the check to Z_Realloc when it needs to adjust an existing
block.
Builtins that call progs with parameters now must always wrap the call
to PR_ExecuteProgram so that the data stack is properly preserved across
the call.
I need to do an audit of all the calls to PR_ExecuteProgram.
It doesn't do much good for dynamic progs memory because zone currently
aligns to 8 bytes (oops, forgot to fix that), but at least the stack and
globals are properly aligned.
The ABI for the Ruamoko ISA set currently puts va_list into the
parameter stream whenever a varargs function is called. Unfortunately,
IMP is declared with ... and thus cannot be used directly unless all
methods become varargs, but I think that would cause even more
headaches.
It turns out the return pointer still needs to be saved even when a
builtin sets up a chain call to progs, but rather than the pointer being
simply restored, it needs to be saved in the call stack exactly as if
the function was called directly by progs. This fixes the invalid self
issue quite thoroughly: parameter state seems to be correct across all
calls now.
I should set up an automated test now that I know and understand the
situation.
In Ruamoko ISA progs, the param pointers point to the stack and
generally must most be manipulated by builtins, and there is no need
anyway as Ruamoko doesn't have RCALL. Fixes the mangling of .super.
When calling a builtin, normally the return pointer needs to be
restored, but if the builtin changes the call depth (usually by
effecting "return foo()" as in support for objects, but possibly
setjmp/longjmp when they are implemented), then the return pointer must
not be restored. This gets vkgen past object allocation, but it dies
when trying to send messages to super. This appears to be a compiler
bug.
Many math instructions don't care about the difference between signed
and unsigned operands and are thus specified using int, but need to be
usable with uint. div is NOT mapped because there is a difference:
0x8000 / 2 (16-bit) is 0x4000 unsigned but 0xc000 signed, and 0x8000 /
0xfffe is 0 unsigned and 0x4000 signed. This means I'll need to add some
more instructions. Not sure what to do about % and %% though as that's a
lot of instructions (12).
Since the operand types sort out the difference between asr and shr, no
need to give them different opnames. Means qfcc doesn't need to worry
about which one it's searching for.
Yet another redundant addressing mode (since ptr + 0 can be used), so
replace it with a variable-indexed array (same as in v6p). Was forced
into noticing the problem when trying to compile Machine.r.
Thanks to the size of the type encoding being explicit in the encoding,
anything that tries to read the encodings without expecting the width
will simply skip over the width, as it is placed after the ev type in
the encoding.
Any code that needs to read both the old encodings and the new can check
the size of the basic encodings to see if the width field is present.
It's full of evil hacks, but has always been an evil hack relying on
undefined behavior. The weird shenanigans with local variables are
because Ruamoko doesn't copy the parameters like v6p does and thus v and
z are NOT adjacent as parameters. Worse, the padding is uninitialized
and thus should not be relied upon to be any particular value. Still
does a nice job of testing dot products, though.
With explicit operators, even. While they're a tad verbose, they're at
least unambiguous and most importantly have the right precedence (or at
least adjustable precedence if I got it wrong, but vector ops having
high precedence than scalar or component seems reasonable to me).
I don't remember why I did this originally, but it causes the dags code
to lose the offset temp alias when accessing fields on structural temps
(known to be the case for vectors (temp-component.r), and I seem to
remember having problems with structs).
While it specifically checks vectors, I'm pretty sure it applies to
structs, too. Also, it's a little redundant with vecaddr.r, but is much
more specific and far less evil in what it does (no horrible pointer
shenanigans): just something that is fairly common practice.
I abandoned the reason for doing it (adding a pile of vector types), but
I liked the cleanup. All the implementations are hand-written still, but
at least the boilerplate stuff is automated.
Since Ruamoko progs must use lea to get the address of a local variable,
add use/def/kill references to the move instruction in order to inform
flow analysis of the variable since it is otherwise lost via the
resulting pointer (not an issue when direct var reference move can be
used).
The test and digging for the def can probably do with being more
aggressive, but this did nicely as a proof of concept.