Since the call instruction in the Ruamoko ISA specifies the destination
of the return value of the called function, it is much like any
expression type instruction in that the def referenced by its c operand
is both defined and killed by the instruction. However, unlike other
instructions, it really has many pseudo-operands: the arguments placed
on the stack. The problem is that when one of the arguments is also the
destination of the return value, the dags code wants to use the stack
argument as it was the last use of the real argument. Thus, instead of
using the value of the child node for the result, use the value label
attached to the call node (there should be only one such label).
This fixes iterfunc, typedef, zerolinker and vkgen when optimizing. Now
all but the double tests and return postop tests pass (and the retun
postop test is not related to the Ruamoko ISA, so fails either way).
That is, updating a variable using a function that takes the same
variable, probably very common in iterators, thus the name. It happens
to be the first qfcc test specific to Ruamoko. It's really just the
typedef, zerolinker, and vkgen type encoding loop stripped down for ease
of debugging.
Of course, it fails :)
I really need to come up with a better way to get the result type into
the flow analyser. However, this fixes the aliasing ICE when optimizing
Ruamoko code that uses struct assignment.
Many math instructions don't care about the difference between signed
and unsigned operands and are thus specified using int, but need to be
usable with uint. div is NOT mapped because there is a difference:
0x8000 / 2 (16-bit) is 0x4000 unsigned but 0xc000 signed, and 0x8000 /
0xfffe is 0 unsigned and 0x4000 signed. This means I'll need to add some
more instructions. Not sure what to do about % and %% though as that's a
lot of instructions (12).
Thanks to the size of the type encoding being explicit in the encoding,
anything that tries to read the encodings without expecting the width
will simply skip over the width, as it is placed after the ev type in
the encoding.
Any code that needs to read both the old encodings and the new can check
the size of the basic encodings to see if the width field is present.
It's full of evil hacks, but has always been an evil hack relying on
undefined behavior. The weird shenanigans with local variables are
because Ruamoko doesn't copy the parameters like v6p does and thus v and
z are NOT adjacent as parameters. Worse, the padding is uninitialized
and thus should not be relied upon to be any particular value. Still
does a nice job of testing dot products, though.
With explicit operators, even. While they're a tad verbose, they're at
least unambiguous and most importantly have the right precedence (or at
least adjustable precedence if I got it wrong, but vector ops having
high precedence than scalar or component seems reasonable to me).
I don't remember why I did this originally, but it causes the dags code
to lose the offset temp alias when accessing fields on structural temps
(known to be the case for vectors (temp-component.r), and I seem to
remember having problems with structs).
While it specifically checks vectors, I'm pretty sure it applies to
structs, too. Also, it's a little redundant with vecaddr.r, but is much
more specific and far less evil in what it does (no horrible pointer
shenanigans): just something that is fairly common practice.
Since Ruamoko progs must use lea to get the address of a local variable,
add use/def/kill references to the move instruction in order to inform
flow analysis of the variable since it is otherwise lost via the
resulting pointer (not an issue when direct var reference move can be
used).
The test and digging for the def can probably do with being more
aggressive, but this did nicely as a proof of concept.
This code now reaches into one level of the expression tree and
rearranges the nodes to allow the constant folder to do its things, but
only for ints, and only when the folding is trivially correct (* and *,
+/- and +/-). There may be more opportunities, but these cover what I
needed for now and anything more will need code generation or smarter
tree manipulation as things are getting out of hand.
It now addressing_mode cleaning up store instructions to use ptr+offset
instead of lea;store ptr...
Entity.field addressing has been impelmented as well.
Move instructions still generate sub-optimal code in that they use an
add instruction instead of lea.
This allows the code handling simple pointer dereferences to recurse
along an alias chain that resulted from casting between different
pointer types (such chains could probably be eliminated by replacing the
type in the original pointer expression, but it wasn't worth it at this
stage).
Aliasing an alias expression to the same type as the original aliased
expression is a no-op, so drop the alias entirely in order to simplify
code generation.
Simply dereferencing a pointer does not need to go through array_expr
and thus collect a 0 offset that will only be constant-folded out again.
Really just a minor optimization in qfcc, but at one stage in today's
modification, it resulted in some unwanted aliasing chains.
While this does make the generated code a little worse, load is behaving
nicely), the two are at least consistent with each other and when I fix
one, I'll fix both. I missed this change the other day when I did the
address_expr cleanup. Yay near-duplicate code :P
This is what using new_ret_expr would result in, but new_ret_expr is no
longer used for referencing .return (except in pascal, but I haven't
gotten around to sorting that out) due to the recent changes for Ruamoko
progs. Fixes an ICE when compiling (with optimization) something like
the following (dir is a vector):
dir /= sqrt (dir * dir);
return dir * speed;
It turns out the sorting wasn't working properly and I've decided that
anything that actually needs the defs to be sorted by address (such as a
debugger searching for defs by address) can do the sorting itself. Fixes
a weird swapping of def names.
This is necessary to get statement disassembly working, and likely
debugging in general. locals is the total size of the stack frame and
thus reaches above the function-entry stack pointer, and params_start is
the local space relative start of the parameters. Thus, knowing the
function-entry stack pointer, the bottom of the locals space can be
found by subtracting params_start, and the top of the locals space by
adding (locals - params_start).
This gets all the sections of the progs file nicely aligned and the code
easier to read with the offset and size calculations not being spread
through the function. ivar-struct-return now works when compiled for
Ruamoko.
This cleans up dprograms_t, making it easier to read and see what chunks
are in it (I was surprised to see only 6, the explicit pairs made it
seem to have more).
While I think the reason the dags code moved an instruction before
adjstk and with was they shared a constant with that instruction (which
is a different bug), this ensures other instructions cannot get
reordered in front of adjstk and with, as doing so would cause any such
instructions to access incorrect data.
The goal was to get lea being used for locals in ruamoko progs because
lea takes the base registers into account while the constant pointer
defs used by v6p cannot. Pointer defs are still used for gobals as they
may be out of reach of 16-bit addressing.
address_expr() has been simplified in that it no longer takes an offset:
the vast majority of the callers never passed one, and the few that did
have been reworked to use other mechanisms. In particular,
offset_pointer_expr does the manipulations needed to add an offset
(unscaled by type size) to a pointer. High-level pointer offsets still
apply a scale, though.
Alias expressions now do a better job of hanling aliasing of aliases by
simply replacing the target type when possible.