While scan-build wasn't what I was looking for, it has proven useful
anyway: many of the sizeof errors were just noise, but a few were actual
bugs (allocating too much or too little memory).
If the kahan triangle area method breaks, I did something wrong with qfcc's
handling of parentheses (ie, floating point math is not truly associative).
This fixes a segfault when optimizing the empty-body test. The label was
getting moved, but the statement block to which it pointed was not updated
and thus it pointed to dead data.
Saw a discussion of such in #qc and that gcc implemented it. I realized it
would be pretty easy to detect and very useful (I've made such mistakes at
times).
It is now in its own file and uses table lookups to check for valid type
and operator combinations, and also the resulting type of the expression.
This probably breaks multiple function calls in the one expression.
This is a bit of a workaround to ensure the operands have their types
setup correctly. Really, binary_expr needs to handle expression types
properly.
This fixes the bogus error for comparing the result of pointer subtraction
with an integer.
Currently, they can represent either vectors or quaternions, and the
quaternions can be in either [s, v] form or [w, x, y, z] form.
Many things will not actual work yet as the vector expression needs to be
converted into the appropriate form for assigning the elements to the
components of the "vector" type.
It's sometimes more useful to have direct access to each individual
component of the imaginary part of the quaternion, and then for
consistency, alias w and s.
This is a nice feature found in fteqcc (also a bit of a challenge from
Spike). Getting bison to accept the new expression required rewriting the
state expression grammar, so this is mostly for the state expression. A
test to ensure the state expression doesn't break is included.
I think it may have been for compatibility with a certain qcc variant (no
idea which one, though). While the shift/reduce conflict is fixable using
"%prec IFX" on the const:string rule, the colon breaks test?"a":"b".
Putting parentheses around "a" allows such a construct, requiring them
breaks comatibility with C. I think this feature just isn't worth that.
This goes towards complementing the "if not" logic extension. I need to
check if fteqcc supports "not" with "while" (the version I have access to
at the moment does not), and also whether it would be good to support
"not" with "for", and if so, what form the syntax should take.
It is syntactic sugar for if (!(foo)), but is useful for avoiding
inconsistencies between such things as if (string) and if (!string), even
though qcc can't parse if not (string). It also makes for easier to read
code when the logic in the condition is complex.
It turns out this is required for compatibility with qcc (and C, really).
Once string to boolean conversions are sorted out completely (not that
simple as qcc is inconsistent with if (string) vs if (!string)), Qgets can
be implemented :)
It looks like I had forgotten that the compare function is supposed to
return true/false (unlike memcmp's sorting ability). Also, avoid the
pointers in the value struct as they can change without notice.
Using enums in switches now works nicely, including warnings for unused
enum values.
Either I had gotten confused while writing the code and mixed up line and
offset, or I had changed offset to line at one stage but missed a place.
This fixes the segfault when compiling chewed-alias.r and return-ivar.r
Rather than prefixing free_ to the supplied name, suffix _freelist to the
supplied name. The biggest advantage of this is it allows the free-list to
be a structure member. It also cleans up the name-space a little.
type_obj_class is no longer a class, so its ivars are not stored in
type_obj_class.t.class->ivars but rather type_obj_class.t.symtab.
This fixes the segfault Spirit and Randy were experiencing.
In passing, correct the unneeded emission of meta class ivars for non-root
classes. This should make for much smaller progs that use classes.
MOVEP's opc itself is always known and used, whether it's a constant
pointer or variable doesn't matter. This fixes the lost pointer calculation
for va_list.list[j] = object_from_plist (item);
Dead nodes are those that generate unused values (unassigned leaf nodes,
expressions or destinationless move(p) nodes). The revoval is done by the
flow analysis code (via the dags code) so that any pre and post removal
flow analysis and manipulation may be done (eg, available expressions).
assign_expr mangles the destination expression for dereferenced
assignments into something that is invalid as an lvalue, so simply use
new_binary_expr with the same opcode.
It turns out expression trees are (mostly?) valid DAGs, so all edges being
constrained works, though the graphs get a little tall (but easier to read).
This fixes the infinite loop in if ((x = self.heat && x))
Really, I think I need to revisit the whole expression tree code. It's
proving to be rather fragile.
The source of the assignment is used as the value to test, and the
assignment itself is inserted into the boolean expressions's block. This
fixes the inernal error for "if ((x = 0))".
Normally, it will happen only as a follow-on error, but I can think of a
way to force it without other errors, so treating it as an internal error
is a bit harsh.
But reset current_symtab to its prior value when done. This fixes a
segfault caused by initializing the class system while parsing a struct
(eg, one of the members is of type id).
The keywords table was rather awkward to edit (and sometimes confusing).
Worse, because the hash table used to look up the keywords was initialized
only once, changing modes in the same execution of qfcc would not work
properly as keywords would not be added or removed as appropriate.
Now there are four categories of keywords:
o "core" Always available. They form the core of QuakeC except for two
extensions.
o "@" In extended and advanced modes, the preceeding @ is optional,
but tranditional mode requires the keywords to be preceeded by
an @. They are the C keywords that QuakeC did not use, but can
be implemented in v6 progs under certain circumstances.
o "QF" These keywords require the QuakeForge VM to be usable.
o "Obj" These keywords form Ruamoko/Objective-QuakeC and require both
advanced mode and the QuakeForge VM.
This fixes the segfault/null pointer access in sendv.r. While I wanted to
use the edge setting code to set the live bit, I didn't expect it to be
this easy. def_visit_all is proving to be worth every bit it consumes :)
It seems dag_set_live_vars still served a purpose after all, but I don't
feel like bringing it as I'd rather implement its param handing in
dagnode_set_edges. I've now got a test case for it, though the test
currently causes the VM to segfault (even with pr_boundscheck 2!).
If the final block ends in a conditional statement, appending return to the
block will hide the conditional statement from the flow analyzer. This may
cause the conditional statement's destination node be become unreachable
according to the analyzer and thus eliminated. The label for the branch
then loses its target sblock and thus the code generator will produce a
zero-distance jump resulting in an infinite loop.
Thus, if the final block ends in a conditional statement (or, for
completeness, a call statement), append a new empty block before adding the
return statement.
If MOVEP's destination is variable, then the actual destination isn't (at
this stage) knowable, so it can't be attached to the dagnode and thus must
be a child.
Getting the operands directly from the statement was missing the
destination operand of movep when movep's op_c was a constant pointer and
thus the flowvar wasn't being counted/created early enough. This led to a
segfault in the set code when attempting to add -1 to the set.
It turns out the recent dead-block code "broke" vector component access
from objects. The breakage is really highlighting a problem with temporary
operands and aliasing. The problem was hiding behind a basic-block split
that the recent dead-block work mended and thus exposed the bug.
type_id is implemented as a pointer to "struct obj_object" (ie, not really
a class), so the correct check is to ensure the type is:
1 a pointer
2 to a struct
3 using the same symbol table as type_obj_object
Empty structs are now (correctly) invalid. The hack of using an empty
struct to represent a handle returned from a builtin has been unnecessary
since opaque structs were implemented: now a pointer to an opaque struct
can be used. This is mostly safe as handles are aways negative and thus
attempting to dereference such a pointer should result in a VM error. It
will be even safer once const is implemented and the pointers can be made
constant (eg, typedef struct handle * const handle;)
void foo (int); is fine for a prototype (or, presumably, a qc function
variable), but not for an actual function body. This fixes the segmentation
fault when the parameter name is omitted.
This is needed to allow compile-time protocol conformance checks, though
nothing along those lines has been implemented yet.
id has been changed from TYPE to OBJECT, required to allow id <proto> to be
parsed. OBJECT uses symbol, allowing id to be redefined once suitable work
has been done on the parser.
It uses the new block merge code. Now forgotten return statements are
detected properly (naive dead block removal) and all unreachable code is
eliminated (flow analysis unreachable node removal).
This reverts commit 83ead0842f.
Note: does not compile.
It turns out basic dead block removal is needed for the "control reaches
end of non-void function" warning to work correctly.
Empty sblocks are removed (unless it's the only sblock), and blocks that
are split unnecessarily are merged.
This mostly fixes bogus "no return" warnings.
Unreachable nodes will cause the first elements of the array to remain
unwritten by df_search. This fixes the segfaults caused by unreachable
nodes (the reason they were an internal error before).
The current implementation probably needs more work, but for the case where
I needed it, it does the job.
grid.r💯 vector size = {range, range, 0};
0115 store.f range, size
0116 store.f range, [$2ac]
0117 store.f .zero, [$2ad]
After all that effort getting the class def initialized early enough for
type encodings to work, it proved to be a problem: just including a header
with an interface in it would cause linker errors if there was no
implementation available (even if the class is never used).
qfcc now does local common subexpression elimination. It seems to work, but
is optional (default off): use -O to enable. Also, uninitialized variable
detection is finally back :)
The progs engine now has very basic valgrind-like functionality for
checking pointer accesses. Enable with pr_boundscheck 2
Temps aren't supported yet :P
The alias defs themselves aren't killed (still want any assignments to
occur) but rather, their nodes are. Also, edges to the alias defs' nodes
are added to the assigning node. Fixes structlive.r :)
I got fed up with using "int" types, but the members being "integer"
(hold-over from before the int rename).
Also, correct the names of those types and @va_list (error reporting was
chopping off part of the name).
MOVE (static move) and MOVEP to a pointer constant know exactly where their
data is going, so treat them similarly to assignments: save their
distination operands (the addressed def for MOVEP) and mark them as
defined.
The live var flow analysis doesn't check for aliases. Rather than changing
it to check for aliases (which might break uninitialized var analysis, as
it uses "use" from the live var analysis), make dag_remove_dead_vars do the
check. Fixes the misplaced text in the menus.
Nifty: if you pass a struct via reference to a function, and a field of
that struct may be both set and not set (eg, set only in an if statement),
gcc will report that field assuming that fields that are never set will be
set by the function (my interpretation).
* taniwha ponders the flow analysis for that
Nifty: if you pass a struct via reference to a function, and a field of
that struct may be both set and not set (eg, set only in an if statement),
gcc will report that field assuming that fields that are never set will be
set by the function (my interpretation).
* taniwha ponders the flow analysis for that
At the statement level, all pointer types are the same, so just return the
op obtained from the sub-expression when the low-level type of the alias
expression matches the low-level type of the type of type sub-expression
operand.
With this, the alias of a value code can be removed (I always thought it
was wrong), which is what broke calling obj_msgSend_super (type &.super
param lost the &).
Now I have to deal with pointer values in the optimizer :/
When an alais def (or aliased def) is used, any overlapping aliases that
have previously been assigned need to be marked as live, and edges to the
aliases added to the new node. However, when assigned to, live-forcing
needs to be turned off.
This fixes the lost assignments to .super.
This fixes the bogus temps for "*to = *from++;", but qfcc ices due to the
operand types being lost. It seems alias operands need to be resurrected,
if only for code output by dags.
I forgot to add func->num_statements :P. Fixes the weirdness where only
some alias temps were being (bogusly) detected as uninitialized. Now they
all are.
When the naive uninitialized variable detection finds a node with possible
uses of uninitialized variables, the statements in the node are scanned one
at a time checking each usage and removing uninitialized definitions as
appropriate. vectest.r now compiles without warnings. As an added bonus,
accurate line number information is reported for uninitialized variables.
Unfortunately, there is still a problem with uninitialized temps in
switch.r, but that might just be poor handling of temp op aliases.
Only definitions for the def used in the current statement (whether an
alias or not) are suitable for killing. Doing otherwise defeats the purpose
of this work :P
Fixes the false negatives found in a modified quattest.r (commented out the
"tq.s = 0;" line).
Nicely, the use sets from live_variable analysis can be used too, though
there are some problems with the naive implementation. For:
vector foo (float x, float y, float z)
{
vector v;
v.x = x;
v.y = y;
v.z = z;
return v;
}
qfcc thinks v is uninitialized, but if "if (x) return nil;" (or any other
basic-block splitter) is put just before the return v; qfcc correctly
detects that v is initialized. The reason is that the inits are in the same
basic block as the return, and thus aren't affecting the reaching
definitions, which are stored per-block.
The naive implementation should be good for a fast-cull before doing a
per-statement check.
The exit dummy block is setup to provide dummy uses of global variables to
the live variable analysis doesn't miss global variables. Much cleaner than
the previous code :) There may be some issues with aliases, though.
The entry dummy block is setup to provide dummy definitions of local
variables so the reaching definitions analysis can be used to detect
uninitialized variables (not implemented yet). Fake statement numbers
(func->num_statements + X) are used to represent the definitions. Local
variables (ie, not temp ops) use their offsets (ie, the offset range they
cover) for X. Temp ops use their flowvar number + the size of the
function's defspace for X. flow_kill_aliases() should take care of temp op
aliasing, while the use of the actual offsets spanned by the variable's def
should take care of any wild aliasing so structures and unions should
become a non-issue.
The dummy nodes are for detectining uninitialized variables (entry dummy)
and making globals live at function exit (exit dummy). The reaching defs
and live vars code currently seg because neither node has had its sets
initialized.
Fixed aliases are those that will never change through the life of the
code. They are generated from structure accesses and thus what they alias
is always known.
Also move the ALLOC/FREE macros from qfcc.h to QF/alloc.h (needed to for
set.c).
Both modules are more generally useful than just for qfcc (eg, set
builtins for ruamoko).
Set of everything is implemented by inverting the meaning of bits in the
bitmap: 1 becomes non-member, 0 member. This means that set_size and
set_first/set_next become inverted and represent non-members as counting
members becomes impossible :)
Aliasing the jump table to an integer broke statement_get_targetlist with
the new alias def handling, and was really wrong anyway. I probably did
that due to being fed up with things and wanting to get qfcc working again
rather than spending time getting jumpb right.
With the need to handle aliasing in the optimizer, it has become apparent
that having the flow data attached to symbols is not nearly as useful as
having it attached to defs (which are views of the actual variables).
This also involves a bit of a cleanup of operand types: op_pointer and
op_alias are gone (this seems to greatly simplify the optimizer)
There is a bit of a problem with enums in switch statements, but this might
actually be a sign that something is not quite right in the switch code
(other than enums not being recognized as ints for jump table
optimization).
Turns out there was only one place to fix (for qc, anyway: I don't have
tests for qp yet). func-static now passes :)
Hmm, how to test for static var naming... (not implemented yet)
Simply "backed" and "virutal". Backed spaces have memory allocated to them
while virtual spaces do not. Virtual spaces are intended for local
variables and entity fields.
It turns out they're not getting allocated properly (they're put in the
function's locals defspace rather than near data), but fixing it proved a
little more problematic than expected, so the test is marked as XFAIL for
now to remind me to fix it.
With this, alias defs become singletons based on the def they alias and the
type and offset of the alias. Thus, the removal of the free_def call in
emit.c.
alias_def now always creates an offset def (though the usual case has an
offset of 0). The if the alias escapes the bounds of the base def, an
internal error will be generated.
It really doesn't seem wise to allow the compiler to do so as it would
overwrite unrelated defs. The only time such a thing is valid is the return
statement (silly vm design), and that's read-only.
Also remove the extern for current_storage as it belongs in shared.h.
I'm not satisfied with the documentation for initialize_def, but it will do
for now. I probably have to rewrite the thing as it's a bit of a beast.
With the intoduction of the statement type enum came a prefix clash. As
"st" makes sense for "statement type", I decided that "storage class"
should be "sc". Although there haven't been any problems as of yet, I
decided it would be a good idea to clean up the clash now. It also helps
avoid confusion (I was a bit surprised after working with st_assign etc to
be reminded of st_extern etc).
qfcc isn't meant to be long running, so I'm not super worried about memory
usage, but definitely lost memory blocks when compiling just a single
function seems a tad sloppy.
It doesn't quite work yet, but...
It has proven necessary to know what type .return has at any point in the
function. The segfault in ctf is caused by the return statement added to
the end of the void function messing with the expr pointer stored in the
daglabel for .return. While this is actually by design (though the
statement really should have a valid expr pointer rather than), it actually
highlights a bigger problem: there's no stable knowledge of the current
type of .return. This is not a problem in expression statements as the
dagnodes for expression statements store the desired types of all operands.
However, when assigning from .return to attached variables in a leaf node,
the type of .return is not stored anywhere but the expression last
accessing .return.
Now information like dags or live variables are dumped separately, and the
live variable information replaces the flow node in the diagram (like dags
have recently).
They really should have been in statements.[ch] in the first place
(actually, they sort of were: is_goto etc, so some redundant code has been
removed, too).
Modifying the existing alias chain proved to be a bad idea (in retrospect,
I should have known better:P). Instead, just walk down any existing alias
chain to the root operand and build a new alias from that.
The goto for the default expression is the source of the mis-counted label
users: the label was being counted by the goto, but the goto was never
being inserted into the code (only v6 progs or "difficult" types insert the
goto).
Such nodes are unreachable code (ie, dead blocks), but the dead block
removal code failed to remove them (current known cause: miscounted label
userrs). As such blocks cause problems for data flow analysis, ignoring
them is not a good idea. Thus make them an internal error.
vectors, quaternions and structs are a little tricky. I need to think about
how to get them working, but I also want qfcc to get through as much code
as possible.
The evil comment is not just "pragmas are bad, ok?", but switching between
advanced, extended and tradtitional modes when compiling truly is evil and
not guaranteed to work. However, I needed it to make building test cases
easier (it's mostly ok to go from advanced to extended or tradtional, but
going the other way will probably cause all sorts of fun).
In the process, opcode_init now copies the opcode table data rather than
modifying it.
After running across a question about lists of animation frames and states,
I decided giving qfcc the ability to generate such lists might be a nice
distraction from the optimizer :) Works for both progs.src and separate
compilation. No frame file is generated if no macros have been created.
It really should be impossible, but I'm not sure where the bug is yet
(though there are uninitialized variables that are false positives that
most definitely are initialized, might be related)
Pointing to aliases of the var causes all sorts of problems, but this time
it was causing the uninitialized variable detector to miss certain
parameters.
It is necessary to know if a def is a function parameter so it can be
treated as initialized by the flow analyzer. The support for the flag in
object files is, at this stage, purely for debugging purposes.
The structvar2 = structvar1 is implemented as a move expresion, which
address_expr didn't like. Return the address of the source. For indirect
move expressions, this is just the source expression itself.
Constant/label nodes should never be killed because they can (in theory)
never change. While constants /can/ change in the Quake VM, it's not worth
worrying about as there would be much more important things to worry about
(like 2+2 not giving 4).
Due to the hoops one would have to jump through, it is assumed that a
pointer or an offset from that pointer will never overwrite the pointer.
Having the source operand of a pointer assignment available to later
instrctions can make for more efficient code as the value does not need to
be dereferenced later. For this purpose, pointer dereference dag nodes now
store the source operand as their value, and dagnode_match will match x=a.b
with *(a+b)=y so long as both a and b are the same in both nodes. x and y
are irrelevant to the match. The resulting code will be the equivalent of:
*(a+b) = y;
x = y;
.return and .param_N are not classed as global variables for data flow
analysis. .return is taken care of by return statements, and .param_N by
call statements.
With this, the menus work up to attempting to load the menu plist.
Something is corrupting zmalloc's blocks.
Accessing the final statement of an sblock via tail doesn't work in an
empty sblock because tail points to sblock->statements and thus the cast is
invalid. This bug has be lurking for a long time, but for some reason the
cse stuff tickled it (thankfully!!!).
Function calls need to ensure .param_N actually get assigned, and so the
params must be seen as live by the dead variable removal code. However, it
is undesirable to modify the live vars data of the flow node, so make a
local copy.
With temp types changing and temps being reused within the one instruction,
the def type is no longer usable for selecting the opcode. However, the
operand types are stable and more correct.
The main void defs are .return and .param_N. If the source operand is void,
use the destination operand's type to alias the source operand rather than
the source operand's type to alias the destination's operand (the usual
case).
The dags code isn't the only place that creates temporary variables, so
count them as they go into a statement rather than when they're created.
This fixes the temp underflows.
Nicely, the need for dag_gencode to recurse seems to have been removed.
At least for a simple case, correct code is generated :)
switch.r:49: case 1: *to = *from++;
003b loadbi.i *(from + 0), .tmp10
003c add.i from, .imm, from
003d storep.i .tmp10, *to
A node that writes to a var must be evaluated after any node that reads
that var, so for any node reading var, add that node to the edges of the
node currently associated with the var (unless the node is a child of the
node reading the var).
It doesn't make any difference yet, but that's because I need to add extra
edges indicating iter-node dependencies. However, the sort does seem to
work for its limited input.
Not adding them while creating the dag completely broke the dag as
node(deadvar) always returned null. Code quality is back to where it was
before the dags rewrite.
While things are quite broken now (very incorrect code is being generated),
the dag is much easier to work with. The dag is now stored in an array of
nodes (the children pointers are still used for dagnode operands), and sets
are used for marking node parents, attached identifiers and (when done,
extra edges).
Instead of storing the generating statement in the dagnode, the generating
expression is stored in the daglabel. The daglabel's expression pointer is
updated each time the label is attached to a node. Now I know why debugging
optimized code can be... interesting.
It now seems to generate correct code for each node. However, node order is
still incorrect in places (foo++ is being generated as ++foo). quattest.r
actually executes and produces the right output :)
flow_analyze_statement uses the statement type to quickly determin which
operands are inputs and which are outputs. It takes (optional) sets for
used variables, defined variables and killed variables (only partially
working, but I don't actually use kill sets yet). It also takes an optional
array for storing the operands: index 0 is the output, 1-3 are the inputs.
flow_analyze_statement clears any given sets on entry.
Live variable analysis now uses the sets rather than individual vars. Much
cleaner code :).
Dags are completely broken.
The types are expression, assignment, pointer assignment (ie, write to a
dereferenced pointer), move (special case of pointer assignment), state,
function call/return, and flow control. With this classification, it will
be easier (less code:) to determine which operands are inputs and which are
outputs.
Using "=" was rather confusing, so changing it to "<CONV>" seems to be a
good idea. As the string is used only for selecting opcodes at compile
time, only qfcc is affected.
Using "=" was rather confusing, so changing it to "<CONV>" seems to be a
good idea. As the string is used only for selecting opcodes at compile
time, only qfcc is affected.
Surprisingly, I don't yet have to "throw one out", but things are still
problematic: rcall1 is getting two arguments, goto and return get lost,
rcall2 got an old temp rather than the value it was supposed to, but
progress :)
This allows temporary variables that are used in multiple nodes to remain
in the dag, but also will allow more freedom when generating code from the
dag.
The root nodes of the dag need to be evaluated in execution order as some
roots may depend on the results of earlier roots (but then, this might also
be related to the problem of function calls not specifying all of their
parameters to the dag).
An instruction that both reads and writes the same variable will read the
variable before writing to it, so the instruction uses the variable rather
than defines it (for live-variable purposes).
First, it turns out using daglabels wasn't such a workable plan (due to
labels being flushed every sblock). Instead, flowvars are used. Each actual
variable (whether normal or temp) has a pointer to the flowvar attached to
that variable.
For each variable, the statements that use or define the variable are
recorded in the appropriate set attached to each (flow)variable.
The flow graph nodes are now properly separated from the graph, and edge
information is stored in the graph struct. This actually made for much
cleaner code (partly thanks to the use of sets and set iterators).
Flow graph reduction has been (temporarily) ripped out as the entire
approach was wrong. There was also a bug in that I didn't really understand
the dragon book about selecting nodes and thus messed things up. The
depth-first search tree "fixed" the problem, but was really the wrong
solution (sledge hammer :P).
Also, now that I understand that dot's directed graphs must be acyclic, I
now have much better control over the graphs (back edges need to be
flipped).
It turns out dot does not like cyclic graphs (thus some of the weird
layouts), but fixing it by flipping back-edges requires proper recording of
edge info (I guess that's what T is for in the dragon book).
The reduction is performed itteratively until the graph is irreducible, but
such that each reduction wraps the previous graph. Unfortunately, due
depth-first searching not being implemented, graphs that should be reduced
(ie, those with natural loops).
set_first() now returns a pointer to a setstate_t struct that holds the
state necessary for scanning a set. set_next() will automatically delete
the state block when the end of the set is reached. set_delstate() is also
provided to allow early termination of the scan.
They're now dot_sblock.c and print_sblock. The new names both better
reflect their purpose and free up "flow" for outputting the real flow
analysis graphs.
Much of the data recently added to sblock_t has been moved to flownode_t.
No graph reduction is carried out yet, but the initial (innermost level)
graph has been built.
Dot interprets escape sequences in non-html string, and needs the quotes to
be escaped, so quote the result of operand_string. Unfortunately,
operand_string uses quote_string and quote_string returns a static pointer,
so some hoop-jumping is necessary.
It seems the dag creation algorithm doesn't like "a = a op a", so use
"b = a op a" instead. Since I plan on fixing temp leaks anyway, this won't
be a problem (also, few people even use qfcc's v6 float modulo :P).
Because of the way it is used, the data in the type encodings space needs
to always be correct (ie, relocated), even for partially linked object
files.
Rather, only that it is neither external nor local. The idea was to catch
myself swapping the arguments to resolve_external_def, but for some reason
I decided type encoding defs would not be global (save game reasons?).
Fixes the bogus redefined errors when entity fields are used.
Also, rename extern_defs and defined_defs to extern_data_defs and
defined_data_defs (more consistent with the other tables).
The problem was caused by add_relocs and process_loose_relocs adjusting the
reloc offset based on the reloc's space's base address. This is fine for
most relocs, but as relocs for the type space have already been adjusted by
process_type_space, those relocs must be left alone by add_relocs and
process_loose_relocs. As a bonus, the duplicate code has been refactored
into a separate function :)
Now each encoding is copied across def by def using memcpy, with the
expectation that any references to other types will be handled via the
reloc system. Unfortunately, it seems there's an off-by-4 (hmm, suspicious
number...) in the reloc offsets, but I'll look into that after I get some
sleep.
defspace_alloc_loc can cause a realloc which will break the work qfo space
data pointers, so wrap it with alloc_data, which updates the appropriate
pointers and sizes.
The field/data def handling has been moved into process_data_def and
process_field def. The code for handling external defs has been moved into
its own function (extern_def()),
In passing, rename add_space to add_data_space, since it is limited to
handling data spaces.
For now, no other change has been made, but I'll be able to split up
process_def for data def vs field def processing and add a function for
processing type encoding defs.
First, the class def needed to be created before the class type, then the
def space indices had to be set early, otherwise the relocs wound up with
space 0 instead of the correct space.
All internal structs now have "proper" names, and fit the naming convention
(eg, obj_module (like objective-c's types, but obj instead of objc). Some
redundant types got removed (holdovers from before proper struct tag
handling).
Also, it has proven to be unnecessary to build internal classes, so
make_class and make_class_struct are gone, too.
When encoding a type to a qfo file, the type's encoding string is written
and thus needs to be valid prior to actually doing the encoding. The
problem occurs mostly in self-referential structs (particularly, obj_class)
because the struct is being encoded prior to the pointer to the struct.
This is similar to the problem with infinite recursion when encoding types.
The problem is with structs with self-referential pointers (eg, struct foo
{struct foo *bar}). The solution is to copy the type data to a buffer and
mark the buffer as transfered before actually processing the type. Further
processing of the type is done via the buffer.
As id and Class do not point to real objects as such, trying to get the
class from their types doesn't work, so instead send the message to a
"null" class that skips the method checks.
Now the classes are built "properly" (using the same tools as the parser
itself), and the structs (obj_object, obj_class and obj_protocol) are built
separately, but using the class ivars.
Even just before, type_obj_object, type_obj_class and type_obj_protocol
were a bit bogus (still are), but now the arrays used to list their ivars
are correct. I plan to create the above mentioned types using
class_to_struct to do it properly.
Type names are cleaned up, as is the creation. Also, the class pointer in
the type encoding now gets emitted. However, Still need to actually create
_OBJ_CLASS_Class and fix the type encoding reloc handling in the linker.
With the previous commit, the structures were being created before a valid
source file name was available and thus qfcc would segfault when trying to
generate a tag. Now, the tags look better anyway :).
Now, for each compilation, or before linking, only InitData needs to be
called. Fixes the double chaining internal error when compiling and linking
in the same command.
This required throwing out the primary rules that snax did up to help me
with conflicts many years ago, but they were now getting in the way. Now
the productions from primary are merged in with unary_expr.
It turns out no code was being generated for x = *y. Ouch. I suspect I need
to take a better look at expr_deref at some time in the not too distant
future.
Conflicts:
tools/qfcc/source/statements.c
The base of the type encodings block is given by the .type_encodings def.
The block begins with a "null" type (4 words of 0), followed by the first
type encoding.
At some stage, I will need to add information for extended def information
(32 bit offset, type encoding, other?), but this is good for initial
testing.
Structures (especially hard-coded ones) can be really nasty as they can
refer to themselves. Avoid the recursion by setting the type_def field of
the type before doing the recursive encodings in the structure encoder.
The encodings of static types were getting corrupted because their defs
were not necessarily in the same places between compilations when compiling
multiple files.
The result of parse_params needs to be passed through find_type before
actually being used. I guess I'd missed this back when I got things working
for qc.
Since gnu bison and flex are required anyway, no harm in using their api
prefix options. Now, qfcc can compile both QC/Ruamoko and Pascal files
(Pascal is (currently?) NOT supported in progs.src mode), selecting the
language based on the extension: .r, .qc and .c select QC/Ruamoko, .pas and
.p select Pascal, while anything else is treated as an object file (as
before).
Return statements never flow to the next block (or any other block, for
that matter), so drawing arrows leaving them not only messes up dot's
graphs, but is quite missleading.
When mering if/goto (ie, if skipping a goto), the rest of the dead code
remover is used to delete the goto. That part of the code unuses the goto's
label. The if was getting the goto's label without the lable's used count
being incremented (the usaged temporarily increases by one). I have no idea
why the problem showed up randomly, but this seems to fix it (it fixes /a/
bug, anyway).
The naive implementation of the if/goto merging was letting the old target
of the if get dropped because the block would lose its label and thus be
judged unreachable because the preceeding goto block was still in the list.
Instead, when the if/goto are "merged", mark the goto block as unreachable,
the following block as reachable, and break out of the analysis loop to
force the removal of the goto block. Since the dead block removal function
loops until no action is taken, all other dead blocks will be removed.
The output can be controlled via --block-dot (not yet documented). The
files a named <sourcefile>.<function>.<stage>.dot. Currently, stage will be
one of "initial" (after expression to statement conversion), "thread"
(after jump threading), "dead" (after dead block removal), "final" (final
state before actual code emission).
Labels can be shared between multiple flow-control instructions, so use the
label's used counter to determine when to remove the label. This was
causing problems with the jump threading.
The common cause seems to be casting a cast (very common, and I'm not sure
just realiasing the expression would be right). It does't cause any harm
(particularly, it doesn't trigger alias def chains), so I won't worry about
it.
The actual bug might still be elsewhere, but at least now I know the alias
chains were coming from accessing .return and .param_N, which are unions
(not directly usable by the progs engine). Emitting a reference to a union
(or struct) would create an alias def, but an alias expression was created
in the expression tree to simplify return/param access. The double layer
(sometimes 3 or 4) alias isn't really neaded, so rather than layering the
aliases, just re-alias the alaised def.
It is inteded for flagging buggy conditions in the compiler, particularly
after having fixed the original bug (in case something comes back from the
dead).
v6 progs expects .zero to be only 1 word. The code actually tried to keep
vector out of .zero, but it seems I'd rearranged the structure defintion
without updating the code that kills the vector field. Problem spotted by
divVerent.
o All instances of LIBADD/LDADD have a corresponding DEPENDENCIES
specificatiion.
o libraries now use a lib_ldflags macro to keep things consistent
o duplication of source/lib names has been minimized (particularly in
the libraries; more work needs to be done for the executables)
o automake spec blocks have been organized (again, more work needs to be
done for the executables)
Despair has things locked down such that running qfcc during a build fails
due to lack of read access to /usr/local/lib. This is actually a good
thing as accidentally hitting old includes/libs (when a file gets deleted
in the tree) hides bugs. Thus, --no-default-paths to turn off default
search paths.
The special token __INFINITY__, like __FILE__ and friends, will expand to
a floating-point expression containing a value the C compiler considers
infinite. Obviously, this assumes that the system has relatively modern
float hardware -- but if it doesn't, having Ruamoko be able to represent
float infinity is the least of your problems. :)