I had forgotten that unsigned division was different from signed
division (rather silly of me). However, with some testing and analysis,
unsigned true modulo is not needed as it's not possible to have
negative inputs and thus it's the same as remainder.
Yet another redundant addressing mode (since ptr + 0 can be used), so
replace it with a variable-indexed array (same as in v6p). Was forced
into noticing the problem when trying to compile Machine.r.
This cleans up dprograms_t, making it easier to read and see what chunks
are in it (I was surprised to see only 6, the explicit pairs made it
seem to have more).
It turned out that address mode B was redundant as C with 0 offset
(immediate) was the same (except for the underlying C code of course,
but adding st->b is very cheap). This allowed B to be used for
entity.field for all transfer operations. Thus instructions 0-3 are now
free as load E became load B, and other than the specifics of format
codes for statement printing, transfers+lea are unified.
They are both gone, and pr_pointer_t is now pr_ptr_t (pointer may be a
little clearer than ptr, but ptr is consistent with things like intptr,
and keeps the type name short).
It currently fails for two reasons:
- call's mode selection is incorrect (never updated from when there was
only the one call instruction and the mode was encoded in operand c)
- return should automatically restore the stack pointer to the value it
had on entry to the function, thus allowing local values stored on
the stack to be safely returned.
ANY/ALL/NONE have been temporarily removed until I implement the HOPS
(horizontal operations) sub-instructions, which will all both 32-bit and
64-bit operands and several other operations (eg, horizontal add).
All the fancy addressing modes for the conditional branch instructions
have been permanently removed: I decided the gain was too little for the
cost (24 instructions vs 6). JUMP and CALL retain their addressing
modes, though.
Other instructions have been shuffled around a little to fill most of
the holes in the upper block of 256 instructions: just a single small
7-instruction hole.
Rearrangements in the actual engine are mostly just to keep the code
organized. The only real changes were the various IF statements and
dealing with the resulting changes in their addressing.
When creating the tests for lea, I noticed that B was yet another simple
assign, so I decided it was best to drop it and move E into its place
(freeing up another instruction).
Most useful for 64-bit values as only one instruction is needed to move
the data around rather than two, but could be slightly faster for 32-bit
as the addressing is simpler (needs profiling).
Does not include string concatenation because I don't feel like messing
with zone init, but all the other operators are tested (currently
failing due to bool convention)
It calculating only the size of the array (which was often 4 or 8
globals per element) proved to be a pain when I forgot to alter the size
for the new scale tests. Fixing the size calculation even found a bug in
the shiftop tests.
It seems casting from float/double to [unsigned] int/long when the value
doesn't fit is undefined (which would explain the inconsistent results).
Mentioning the possibility seems like a good idea should the results for
such casts change and cause the tests to fail.
They currently fail because for vector values, gcc casts the view, not
the value, so vec4 cast to ivec4 simply views the bits as int rather
than doing the actual conversion.
While working on the new opcode table, I decided a lot of the names were
not to my liking. Part of the problem was the earlier clash with the
v6p opcode names, but that has been resolved via the v6p tag.
* / % %% + -
As a bonus, includes partial tests for a few extra operators. Several
things are broken at this stage, but uncommitted code is already
working.
They even found a bug in the addressing mode functions :) (I'd forgotten
that I wanted signed offsets from the pointer and thus forgot to cast
st->b to short in order to get the sign extension)