I've decided that setting pr.max_edicts and pr.zone_size as part of the
local progs initialization rather than in PR_LoadProgsFile makes more
sense. For one, it is unlikely for the limits to change every time progs is
reloaded. Also, they seem to be a property of the VM rather than the progs.
However, there is nothing stopping the caller from updating max_edicts and
zone_size every call.
It seems gcc doesn't care if the & is present when calculating field
offsets, but it not being there bothered me very much and might as well use
our "standard" macro anyway.
While scan-build wasn't what I was looking for, it has proven useful
anyway: many of the sizeof errors were just noise, but a few were actual
bugs (allocating too much or too little memory).
If the kahan triangle area method breaks, I did something wrong with qfcc's
handling of parentheses (ie, floating point math is not truly associative).
This fixes a segfault when optimizing the empty-body test. The label was
getting moved, but the statement block to which it pointed was not updated
and thus it pointed to dead data.
It returns the rest of the line (minus // style comments) as the token. I
needed it in another project but this is my central repository for
script.py.
Saw a discussion of such in #qc and that gcc implemented it. I realized it
would be pretty easy to detect and very useful (I've made such mistakes at
times).
It is now in its own file and uses table lookups to check for valid type
and operator combinations, and also the resulting type of the expression.
This probably breaks multiple function calls in the one expression.
This is a bit of a workaround to ensure the operands have their types
setup correctly. Really, binary_expr needs to handle expression types
properly.
This fixes the bogus error for comparing the result of pointer subtraction
with an integer.
Currently, they can represent either vectors or quaternions, and the
quaternions can be in either [s, v] form or [w, x, y, z] form.
Many things will not actual work yet as the vector expression needs to be
converted into the appropriate form for assigning the elements to the
components of the "vector" type.
It's sometimes more useful to have direct access to each individual
component of the imaginary part of the quaternion, and then for
consistency, alias w and s.
This is a nice feature found in fteqcc (also a bit of a challenge from
Spike). Getting bison to accept the new expression required rewriting the
state expression grammar, so this is mostly for the state expression. A
test to ensure the state expression doesn't break is included.
I think it may have been for compatibility with a certain qcc variant (no
idea which one, though). While the shift/reduce conflict is fixable using
"%prec IFX" on the const:string rule, the colon breaks test?"a":"b".
Putting parentheses around "a" allows such a construct, requiring them
breaks comatibility with C. I think this feature just isn't worth that.
This goes towards complementing the "if not" logic extension. I need to
check if fteqcc supports "not" with "while" (the version I have access to
at the moment does not), and also whether it would be good to support
"not" with "for", and if so, what form the syntax should take.
It is syntactic sugar for if (!(foo)), but is useful for avoiding
inconsistencies between such things as if (string) and if (!string), even
though qcc can't parse if not (string). It also makes for easier to read
code when the logic in the condition is complex.
It turns out this is required for compatibility with qcc (and C, really).
Once string to boolean conversions are sorted out completely (not that
simple as qcc is inconsistent with if (string) vs if (!string)), Qgets can
be implemented :)
It looks like I had forgotten that the compare function is supposed to
return true/false (unlike memcmp's sorting ability). Also, avoid the
pointers in the value struct as they can change without notice.
Using enums in switches now works nicely, including warnings for unused
enum values.
Either I had gotten confused while writing the code and mixed up line and
offset, or I had changed offset to line at one stage but missed a place.
This fixes the segfault when compiling chewed-alias.r and return-ivar.r
For the most part, it's just refactoring the code so the plane creation and
testing are in separate functions, but there is one important difference:
the plane test now checks only the two points on either side of the point
used to create the plane.
Because the portal winding is guaranteed to be convex and planar, if both
points are on the plane, all points are, and if neither point is behind the
plane, no points are.a
This shaved about 5 seconds off the level 4 run using 4 threads (~198s to
~193s) and about 12s from the single threaded run (~682s to ~670s (hmm,
gained some time in recent changes)).
qsort is used to sort the queue by nummightsee. At ~4ms for 20k portals, I
think it's affordable. Using a queue rather than scanning the portal list
each time loses the dynamic sorting when mightsee gets updated, but it
seemed to shave off 4s anyway (~207s to ~203s (maybe, yay random times)).
Another step towards threaded base-vis.
This reverts commit 1ea79e8626.
Conflicts:
tools/qfvis/include/vis.h
tools/qfvis/source/flow.c
I've decided to do reentrant versions of the set allocators and I didn't
particularly like the invasiveness of allocating sets this way.
The old variable names were confusing ("target" winding comes from
"portal"?), and the comments were from when I really didn't understand
concepts like separating planes. While they weren't wrong, they were quite
inadequate and I want to write new ones.
This bypasses set_new, but completely removes the use of the global lock
from within RecursiveClusterFlow. This seems to give a small speedup: 203
seconds threaded.
This was testing an idea I had to remove the plane flips. It seems to have
been good for the initial plane orientation, but was a slight slowdown for
the pass-portal test. However, this makes the code a little easier to work
with for my idea on improving the algorithm itself.
Since the stack structure in the thread data is a linked list, move the
stack blocks off the program stack and into malloced memory. More
importantly, when the stack block is allocated, the mightsee working set is
allocated too, and as neither are freed, this greatly reduces contention
for the lock. Also, because the memory is kept, single threaded time for
gmsp3v2 dropped from 695s to 670s. Threaded is now about 207s (down from
350).
While using set operators was clearer, it was rather expensive (about 25s
for gmsp3v2). qfvis now completes the map in about 695s (single threaded).
About 15s faster than tyr for the same conditions (1 thread, level 4).
This is the second part of the separator search optimization from tyrutils
vis. With this, qfvis is getting close to tyrutils vis when
running single threaded (qfvis is suffering some nasty thread contention
and thus can't get below about 350 seconds with 4 threads). 808s vs 707s.
Interesting, it makes very little (maybe faster) difference to find all the
separators for levels 3 and 4. This might be due to the higher levels using
most of the planes to fully clip source away. Anyway, it makes the code a
little clearer (one function, one task).
I had forgotten to skip the refined tests when the sphere was entirely on
the relevant side of the plane. Now BasePortalVis for gmsp3v2 takes 11s on
my machine (it was 13 with the previous optimization and 15.9 before that).
Also, write some comments describing how BasePortalVis works.
Representing the side of the plane on which the sphere lies is much more
useful as more complicated tests can be done using just the one call.
-1: the sphere is entirely on the back side of the plane
0: the sphere is intersecting the plane
1: the sphere is entirely on the front side of the plane
It was supposed to be 2, but for some reason I neglected to set it when I
set up the options parsing. However, level 4 is the standard for production
maps, and it happens to be faster than level 2 (at least for gmsp3v2.bsp)
I think the reason I didn't think of that when I tried to improve qfvis's
performance many years ago is I just simply did not understand
ClipToSeparators. However, the difference caching the separators makes is
phenomenal. Before the change, single threaded qfvis would get stuck on one
particular portal for at least a day (I gave up waiting), but now even a
debug build will complete gmsp3v2.bsp in less than 12 minutes (4 threads on
my quad-core). And that's at level 2! Getting stuck for a day was at level
0.
Rather than prefixing free_ to the supplied name, suffix _freelist to the
supplied name. The biggest advantage of this is it allows the free-list to
be a structure member. It also cleans up the name-space a little.
While noticeably slower than the previous expanded set manipulation code,
this is much easier to read. I can worry about optimizing the set code when
I get qfvis behaving better.
I'd forgotten that ED_ConvertToPlist mangled light into light_lev and
single component angle values into a vector. This fixes much of the
breakage in qflight (but not the light levels)
This removes a lot of redundant code from qflight (though it does become
dependent of libQFgamecode *shrug*). The nice thing is qflight now uses the
exact same code to load entities as does the server.
Something is funny with Ubuntu such that -ldl needs to be specifically
added even though QFutil's .la specifies it. I don't know if it's a libtool
issue or not, but this does work.
More will probably be necessary, but this was sufficient to get prover to
the point where qfcc segged building qwaq (0.7.2).
type_obj_class is no longer a class, so its ivars are not stored in
type_obj_class.t.class->ivars but rather type_obj_class.t.symtab.
This fixes the segfault Spirit and Randy were experiencing.
In passing, correct the unneeded emission of meta class ivars for non-root
classes. This should make for much smaller progs that use classes.
MOVEP's opc itself is always known and used, whether it's a constant
pointer or variable doesn't matter. This fixes the lost pointer calculation
for va_list.list[j] = object_from_plist (item);
Dead nodes are those that generate unused values (unassigned leaf nodes,
expressions or destinationless move(p) nodes). The revoval is done by the
flow analysis code (via the dags code) so that any pre and post removal
flow analysis and manipulation may be done (eg, available expressions).
assign_expr mangles the destination expression for dereferenced
assignments into something that is invalid as an lvalue, so simply use
new_binary_expr with the same opcode.
It turns out expression trees are (mostly?) valid DAGs, so all edges being
constrained works, though the graphs get a little tall (but easier to read).
This fixes the infinite loop in if ((x = self.heat && x))
Really, I think I need to revisit the whole expression tree code. It's
proving to be rather fragile.
The source of the assignment is used as the value to test, and the
assignment itself is inserted into the boolean expressions's block. This
fixes the inernal error for "if ((x = 0))".