Because type aliases need to be unaliased, the type pointers in the type
encodings need to be correct when it comes to linking defs and
functions. This fixes the linking errors in ruamoko/game.
I was very uncertain about the validity of messing with the old type
encoding that way, but adding the check to ensure the type has been
processed never fired, so it seems ok. And the comments help me a lot :)
When aliasing a type that already has aliases, the top node needs to be
replaced if it is unnamed, or the alias-free branch of the new node
needs to reach around to the alias-free branch of the existing node.
This fixes the bogus param counts in qwaq.
This fixes the typelinker test, but not the linking error in
ruamoko/game that it was supposed to represent. I guess there's
something more going on (maybe type encoding relocation issues).
Fixes#6
It turned out that the problem with @zero was caused by initial type
chaining occurring before the structures had been initialized and thus
the linker's @zero type encoding string was incorrect: {?=} instead of
{tag @zero-}, thus when the actual type encoding supplied by an object
file came along (with the correct encoding string), it wasn't found.
It is now "consistent" with the rest of the type building in that it
uses find_type(append_type(return, params)) like the C version, thus
allowing append_type to do its thing with type aliases. This fixes the
overload test.
The full_type branch of an alias splitter (alias with null name) needs
to mirror the clean side up to the type alias. It is causing problems
with functions, but that's expected because parameters complicate
things.
It's not connected up yet because I'm unsure of just where to put things
(it gets messy fast), but just being able to see the structure of
complex types is nice.
This eases type unaliasing on functions a little.
Still more to to go, but this fixes a really hair-pulling bug: linux's
heap randomiser was making the typedef test fail randomly whenever
typedef.qfo was compiled.
When a type is aliased, the alias has two type chains: the simple type
chain with all other aliases stripped, and the full type chain. There
are still plenty of bugs in it, but having the clean type chain takes
care of the major issue that was in the previous attempt as only the
head of the type-chain needs to be skipped for type comparison.
Most of the bugs are in finding the locations where the head needs to be
skipped.
All simple type checks are now done using is_* helper functions. This
will help hide the implementation details of the type system from the
rest of the compiler (especially the changes needed for type aliasing).
They take a pointer to a free-list used for hashlinks so the hashlink
pools can be per-thread. However, hash tables that are not updated are
always thread-safe, so this affects only updates. progs_t has been set
up such that it is easy for multiple progs within one thread can share
hashlinks.
and its usage. The parts of flow_analyze_statement that use it know
where the returned operand needs to go. Unfortunately, this breaks dags
pretty hard, but that's because dags needs to learn about the fancy
assignment-type statements.
This fixes the technically correct but horrible mess of temps and
addressing when dealing with ivars, and the resulting uninitialized
temps due to the non-constant pointers (do need statement level constant
folding, though).
This is part of what messed up float_val in the encoding for @params.
The other part is something in the linker type encoding merge code: it
may be too aggressive. It's also what messed up the size of @params.
That is, those created by operand_address. The dag code needs the
expression that is attached to the statement to have the correct
expression type in order to do the right thing with the operands and
aliasing, especially when generating temps. This fixes assignchain when
optimizing (all tests pass again).
This reverts commit c78d15b331.
While a block expression's result may be an l-value, block expressions
are not (and their results may not be), thus taking the address of one
is not really correct. It seems the only place that tries to do so is
the assignment code when dealing with structures.
This reverts commit b49d90e769.
I suspect this was a workaround for the mess in assignment chains.
However, it caused compile errors with the new implementation, and is
just bogus anyway.
While I still hate ".=", at least it's more hidden, and the new
implementation is a fair bit cleaner (hah, goto a label in an if (0) {}
block).
Most importantly, the expression tree code knows nothing about it. Now
just to figure out what broke func-epxr. A bit of whack-a-mole, but yay
for automated tests.
Doing it in the expression trees was a big mistake for a several
reasons. For one, expression trees are meant to be target-agnostic, so
they're the wrong place for selecting instruction types. Also, the move
and memset expressions broke "a = b = c;" type expression chains.
This fixes most things (including the assignchain test) with -Werror
turned off (some issues in flow analysis uncovered by the nil
migration: memset target not extracted).
Now convert_nil only assigns the nil expression a type, and nil makes
its way down to the statement emission code (where it belongs, really).
Breaks even more things :)
It's not possible to take the address of constants (at this stage) and
trying to use a move instruction with .zero as source would result in
the VM complaining about null pointer access when bounds checking is on.
Thus, don't convert a nil source expression until it is known to be
safe, and use memset when it is not.
This fixes the problem of using the return value of a function as an
element in a compound initializer. The cause of the problem is that
compound initializers were represented by block expressions, but
function calls are contained within block expressions, so def
initialization saw the block expression and thought it was a nested
compound initializer.
Technically, it was a bug in the nested element parsing code in that it
wasn't checking the result value of the block expression, but using a
whole new expression type makes things much cleaner and the work done
paves the way for labeled initializers and compound assignments.
Not that it really makes any difference for labels since they're
guaranteed unique, but it does remove the question of "why nva instead
of save_string?". Looking at history, save_string came after I changed
it from strdup (va()) to nva(), and then either didn't think to look for
nva or thought it wasn't worth changing.
Multi-line calls (especially messages) got rather confusing to read as
the lines jumped back and forth. Now the binding is better but the dags
code is reordering the parameters sometimes.
The server code is not yet ready for doubles, especially in its varargs
builtins: they expect only floats. When float promotion is enabled
(default for advanced code, disabled for traditional or v6only),
"@float_promoted@" is written to the prog's strings.
That was a fair bit trickier than I thought, but now .return and .paramN
are handled correctly, too, especially taking call instructions into
account (they can "kill" all 9 defs).
As expected, this does not fix the mangled pointer problem in
struct-init-param.r, but it does improve the ud-chains. There's still a
problem with .return, but it's handling in flow_analyze_statement is a
bit "special" :P.
Doing the same thing at the end of two branches of an if/else seems off.
And doing an associative(?) set operation every time through a loop is
wasteful.
This fixes the ICE when attempting to compile address-cast without
optimization (just realized why, too: the assignment was optimized out
of existence).
This the fixes the incorrect flow analysis caused by the def being seen
to have the wrong size (structure field of structure def seen through a
constant pointer). Fixes the ICE, but the pointer constant is broken
somewhere in dags, presumably.
This fixes the problem of using nil for two different compound types
within the one expression. The problem is all compound types have the
same low-level type (ev_invalid) and this caused the two different nils
to have the same type when taken back up to expression level.
While expression symbols worked for what they are, they weren't so good
for ivar access because every ivar of a class (and its super classes)
would be accessed at method scope creation, generating spurious access
errors if any were private. That is, when the access checks worked at
all.
The end goal was to fix erroneous non-constant initializer errors for
the following (ie, nested initializer blocks):
typedef struct { int x; int y; } Point;
typedef struct { int width; int height; } Extent;
typedef struct Rect_s { Point offset; Extent extent; } Rect;
Rect makeRect (int xpos, int ypos, int xlen, int ylen)
{
Rect rect = {{xpos, ypos}, {xlen, ylen}};
return rect;
}
However, it turned out that nested initializer blocks for local
variables did not work at all in that the relocations were lost because
fake defs were being created for the generated instructions.
Thus, instead of creating fake defs, simply record the offset relative
to the base def, the type, and the basic type initializer expression,
then generate instructions that all refer to the correct def but with a
relative offset.
Other than using the new element system, static initializers are largely
unaffected.
This is for adding methods to classes and protocols via their interface,
not for adding methods by adding protocols (they still get copied).
Slightly more memory efficient.
Copying methods is done when adding protocols to classes (the current
use for adding regular methods is an incorrect solution to a different
problem). However, when a method is added to a class, the type of its
self parameter is set to be a pointer to the class. Thus, not only does
the method need to be copied, the self parameter does too, otherwise
the self parameter of methods added via protocols will have their type
set to be a pointer to the last class seen adding the protocol.
That is, if, while compiling the implementation for class A, but the
interface for class B is comes after the interface for class A, and both
A and B add protocol P, then all methods in protocol P will have self
pointing to B rather than A.
@protocol P
-method;
@end
@interface A <P>
@end
@interface B <P>
@end
@implementation A
-method {} // self is B, not A!
@end
Duplicate methods in an interface (especially across protocols and
between protocols and the interface) are both harmless and even to be
expected. They certainly should not cause the compiler to demand
duplicate method implementations :)
This is actually a double issue: when a class implementing a protocol
used the protocol in @protocol(), not only would the protocol get
emitted as part of the class data specifying that the class conforms to
the protocol, a second instance would be emitted again when @protocol()
was used. On top of that, only the instance referenced by @protocol()
would be initialized. Now, both class emission and @protocol() get their
protocol def from the same place and thus only one, properly
initialized, protocol instance is emitted.
The problem was an erroneous assumption that the methods had to be
defined. Any class implementing a protocol must implement (and thus
define) the methods, but a protocol declaration cannot: it merely
declares the methods, and it's entirely possible for a module to see
only the protocol definition and not any classes implementing the
protocol.
Unlike gcc, qfcc requires foo to be defined, not just declared (I
suspect this is a bug in gcc, or even the ObjC spec), because allowing
forward declarations causes an empty (no methods) protocol to be
emitted, and then when the protocol is actually defined, one with
methods, resulting in two different versions of the same protocol, which
comments in the gnu objc runtime specifically state is a problem but is
not checked because it "never happens in practice" (found while
investigating gcc's behavior with @protocol and just what some of the
comments about static instance lists meant).
It proved to be too fragile in its current implementation. It broke
pointers to incomplete structs and switch enum checking, and getting it
to work for other things was overly invasive. I still want the encoding,
but need to come up with something more robust.a
Such declarations were being lost, thus in the following, the id field
never got added:
typedef struct qwaq_mevent_s {
int id;
int x, y, z;
int buttons;
} qwaq_mevent_t;
typedef is meant to create a simple renaming of a potentially complex
type, not create a new type. Keeping the parameter type alias info makes
the types effectively different when it comes to overloaded function
resolution, which is quite contrary to the goal. Does expose some
breakage elsewhere, though.
For technical reasons (programmer laziness), qfcc does not fix up local
def type encodings when writing the debug symbols file (type encoding
location not readily accessible).
The debug subsystem now uses the resources system to ensure it cleans
up, and its data is now semi-private. Unfortunately, PR_LoadDebug had to
remain public for qfprogs because using PR_RunLoadFuncs would cause
builtin resolution to complain.
Attempting to define a variable with an incomplete type is an error, and
results in a default size 1 of allocated, but I forgot to set default
alignment when implementing alignment.
The addition of xdef data has made qfo_to_progs unusable in qfprogs,
resulting in various invalid memory accesses. It always was an ugly hack
anyway, so this is the first step to proper qfo support in qfprogs.
I was originally going to put it in the debug syms file, but I realized
that the data persistence code would need access to both def type and
certainly correct def offsets for defs in far data.
This far better reflects the actual meaning. It is very likely that
ty_none is a holdover from long before there was full type encoding and
it meant that the union in qfcc's type_t had no data. This is still
true for basic types, but only if not a function, field or pointer type.
If the type was function, field or pointer, it was not true, so it was
misnamed pretty much from the start.
It was long wrong anyway as it checked past the end of the function's
parameters, which caused a segfault when calling varargs functions with
no formal parameters.
The encoding is 3:5 giving 3 bits for alignment (log2) and 5 bits for
size, with alignment in the 3 most significant bits. This keeps the
format backwards compatible as until doubles were added, all types were
aligned to 1 word which gets encoded as 0, and the size is unaffected.
This fixed the uninitialized temp warning in HUD.r. The problem was
caused by the flow analyzer not being able to detect that the struct
temp was being initialized by the move statement due to the address of
the temp being in a pointer temp. While it would be good to use a
constant pointer for the address of the struct temp or improving the
flow analyzer to track actual data, avoiding the temp in the first place
results in nicer code as it removes a move statement.
With this, cast address initializers work. I have to wonder if the alias
value short-circuit was legacy from long before the rewrite, as it was
quite trivial to handle in the back-end.
All functions are stored in the overload functions table, even those
that are never explicitly overloaded, but only explicitly overloaded
functions (those with @overload) use the type-qualified naming.
Only as scalars, I still need to think about what to do for vectors and
quaternions due to param size issues. Also, doubles are not yet
guaranteed to be correctly aligned.
I plan on adding doubles, and so it's necessary to ensure that attempts
to align doubles in local or far data spaces remain aligned after final
linking.
In order to keep enumerator type and enum type the same, the values need
to have their type set after the enum type is finalized, and then the
appropriate symbols created in the parent scope. This fixes the infinite
recursion when assigning an enum value to its own type.
This is where constant folding should have happened all along. While
unary_expr should fold constants too, it seems to already try to do so
and it's a bit much of a mess to clean up right now.
This is for modern code. Traditional code still treats initialized
globals as constant and nosave. This will make a bit of a mess of
modern code that expects traditional behavior.
While this does break automatic type promotion, it does stop
fold_constants recursing through complex expressions: only the top level
expression needs to be folded, and then only if both sides are actually
constant.
I don't remember what the goal was (stopped working on it eight months
ago), but some possibilities include:
- better handling of nil (have trouble with assigning into struts)
- automatic forward declarations ala C# and jai (I was watching vids
about jai at the time)
- something for pascal
- simply that the default symbol type should not be var (in which case,
goal accomplished)
After messing with SIMD stuff for a little, I think I now understand why
the industry went with xyzw instead of the mathematical wxyz. Anyway, this
will make for less pain in the future (assuming I got everything).
I've decided that setting pr.max_edicts and pr.zone_size as part of the
local progs initialization rather than in PR_LoadProgsFile makes more
sense. For one, it is unlikely for the limits to change every time progs is
reloaded. Also, they seem to be a property of the VM rather than the progs.
However, there is nothing stopping the caller from updating max_edicts and
zone_size every call.
While scan-build wasn't what I was looking for, it has proven useful
anyway: many of the sizeof errors were just noise, but a few were actual
bugs (allocating too much or too little memory).
This fixes a segfault when optimizing the empty-body test. The label was
getting moved, but the statement block to which it pointed was not updated
and thus it pointed to dead data.
Saw a discussion of such in #qc and that gcc implemented it. I realized it
would be pretty easy to detect and very useful (I've made such mistakes at
times).
It is now in its own file and uses table lookups to check for valid type
and operator combinations, and also the resulting type of the expression.
This probably breaks multiple function calls in the one expression.
This is a bit of a workaround to ensure the operands have their types
setup correctly. Really, binary_expr needs to handle expression types
properly.
This fixes the bogus error for comparing the result of pointer subtraction
with an integer.
Currently, they can represent either vectors or quaternions, and the
quaternions can be in either [s, v] form or [w, x, y, z] form.
Many things will not actual work yet as the vector expression needs to be
converted into the appropriate form for assigning the elements to the
components of the "vector" type.
It's sometimes more useful to have direct access to each individual
component of the imaginary part of the quaternion, and then for
consistency, alias w and s.
This is a nice feature found in fteqcc (also a bit of a challenge from
Spike). Getting bison to accept the new expression required rewriting the
state expression grammar, so this is mostly for the state expression. A
test to ensure the state expression doesn't break is included.
This goes towards complementing the "if not" logic extension. I need to
check if fteqcc supports "not" with "while" (the version I have access to
at the moment does not), and also whether it would be good to support
"not" with "for", and if so, what form the syntax should take.
It is syntactic sugar for if (!(foo)), but is useful for avoiding
inconsistencies between such things as if (string) and if (!string), even
though qcc can't parse if not (string). It also makes for easier to read
code when the logic in the condition is complex.
It turns out this is required for compatibility with qcc (and C, really).
Once string to boolean conversions are sorted out completely (not that
simple as qcc is inconsistent with if (string) vs if (!string)), Qgets can
be implemented :)
It looks like I had forgotten that the compare function is supposed to
return true/false (unlike memcmp's sorting ability). Also, avoid the
pointers in the value struct as they can change without notice.
Using enums in switches now works nicely, including warnings for unused
enum values.
Either I had gotten confused while writing the code and mixed up line and
offset, or I had changed offset to line at one stage but missed a place.
This fixes the segfault when compiling chewed-alias.r and return-ivar.r
Rather than prefixing free_ to the supplied name, suffix _freelist to the
supplied name. The biggest advantage of this is it allows the free-list to
be a structure member. It also cleans up the name-space a little.
type_obj_class is no longer a class, so its ivars are not stored in
type_obj_class.t.class->ivars but rather type_obj_class.t.symtab.
This fixes the segfault Spirit and Randy were experiencing.
In passing, correct the unneeded emission of meta class ivars for non-root
classes. This should make for much smaller progs that use classes.
MOVEP's opc itself is always known and used, whether it's a constant
pointer or variable doesn't matter. This fixes the lost pointer calculation
for va_list.list[j] = object_from_plist (item);
Dead nodes are those that generate unused values (unassigned leaf nodes,
expressions or destinationless move(p) nodes). The revoval is done by the
flow analysis code (via the dags code) so that any pre and post removal
flow analysis and manipulation may be done (eg, available expressions).
assign_expr mangles the destination expression for dereferenced
assignments into something that is invalid as an lvalue, so simply use
new_binary_expr with the same opcode.
It turns out expression trees are (mostly?) valid DAGs, so all edges being
constrained works, though the graphs get a little tall (but easier to read).
This fixes the infinite loop in if ((x = self.heat && x))
Really, I think I need to revisit the whole expression tree code. It's
proving to be rather fragile.
The source of the assignment is used as the value to test, and the
assignment itself is inserted into the boolean expressions's block. This
fixes the inernal error for "if ((x = 0))".
Normally, it will happen only as a follow-on error, but I can think of a
way to force it without other errors, so treating it as an internal error
is a bit harsh.
But reset current_symtab to its prior value when done. This fixes a
segfault caused by initializing the class system while parsing a struct
(eg, one of the members is of type id).