So far, alias model rendering is the only victim, but things are working,
even if only color map lookup and fog blending are broken out at this
stage.
I expect the effect naming scheme will go through some changes until I'm
happy with it.
Again, based on The OpenGL Shader Wrangler. The wrangling part is not used
yet, but the shader compiler has been modified to take the built up shader.
Just to keep things compiling, a wrapper has been temporarily created.
The idea comes from The OpenGL Shader Wrangler
(http://prideout.net/blog/?p=11). Text files are broken up into chunks via
lines beginning with -- (^-- in regex). The chunks are optionally named
with tags of the form: [0-9A-Za-z._]+. Unnamed chunks cannot be found.
Searching for chunks looks for the longest tag that matches the beginning
of the search tag (eg, a chunk named "Vertex" will be found with a search
tag of "Vertex.foo"). Note that '.' forms the units for the searc
("Vertex.foo" will not find "Vertex.f").
Unlike glsw, this implementation does not have the concept of effects keys
as that will be separate. Also, this implementation takes strings rather
than file names (thus is more generally useful).
The offset to compensate for st++ was missing.
Obviously, the code has never been tested. Found while looking at the
jump code and thinking about using 32-bit addresses for the jump tables.
set_bits_t is now 64 bits for x86_64 machines (in linux, anyway). This gave
qfvis a huge speed boost: from ~815s to ~720s.
Also, expose some of the set internals so custom set operators can be
created.
Now we can get tight (<1e-6 * radius_squared error) bounding spheres. More
importantly (for qfvis, anyway) very quickly: 1.7Mspheres/second for a 5
point cloud on my 2.33GHz Core 2 :)
It "works" for lines, triangles and tetrahedrons. For lines and triangles,
it gives the barycentric coordinates of the perpendicular projection of the
point onto to features. Only tetrahedrons are guaranteed to reproduce the
original point.
Rather than prefixing free_ to the supplied name, suffix _freelist to the
supplied name. The biggest advantage of this is it allows the free-list to
be a structure member. It also cleans up the name-space a little.
I'd forgotten that ED_ConvertToPlist mangled light into light_lev and
single component angle values into a vector. This fixes much of the
breakage in qflight (but not the light levels)
Sys_LongTime returns time in microseconds as a 64-bit int. Sys_DoubleTime
uses Sys_LongTime, converts to double and offsets 0 time by 4G (2**32).
This gives us consistent sub-microsecond precision for a very long time.
See http://randomascii.wordpress.com/2012/02/13/dont-store-that-in-a-float/
Once and for all: remove the default and move the Sys_Error outside the
switch (changing appropriate breaks to returns). Now gcc will let me know
when I forget to update the switch statements.
I'd missed a set of bit->lightnum conversions that resulting in lightnum
becoming much greater than 128 and thus trashing memory when the surface
was marked.
This give much better control over individual joystick axes. They now have
per-axis pre and post amplification, the linear controller mappings are
more intuitive, and axes can now bet setup as buttons using thresholds.
Many thanks to Johnny on Flame for his work on the "user interface".
Johnny's number->J_AXISn mapping is preserved, but I had intended for any
key to be supported (J_AXISn was just to ensure free keys were available).
This gives both methods (and some range checking on the axis button
number).
First, this completely smashes joystick input: it will not work (though it
doesn't crash). This is because there is, as of yet, no means to configure
the system.
Each joystick axis has:
- per-axis amplification (both pre and post).
- per-axis offset (offset applied after pre-amp but before post amp)
- selectable destination:
- linear delta: position and angles (as before)
- axis button: if the value crosses the threshold, the given key is
pressed or released as appropriate.
The axis amplification still uses joy_amp and joy_pre_amp (and
in_amp/in_pre_amp), but now also has the per-axis settings.
The per-axis offset is most useful for axis buttons. For example, the xbox
360 controller triggers are analong but go "all the way to negative on 0
state". Offsetting the input keeps axis button thresholds simple.
Amplification and offset is applied before anything is done with the axis
value. The formula is:
joy_amp * in_amp * axis-amp *
(offset + value * joy_pre_amp * in_pre_amp * axis-pre_amp)
Axis button thresholds are very simple: if the sign of the value is the
same as the sign of the threshold and abs(value) >= abs(threshold), the
button is pressed. While multiple thresholds and keys can be placed on an
axis, only one can be pressed at a time. The threshold furthest from 0
wins.
The ~ gets expanded to CSIDL_LOCAL_APPDATA, $HOME, $USERPROFILE or just
".", whichever succeeds first. The usual location will be:
"C:\windows\profiles\<user>\Local Settings\Application Data".
"." is now the fallback for *nix systems too.
The seed is currently 0xdeadbeef, but I intend on fixing that soon. Now the
particle velocities and origins use fully independent bits (though a big
chunk is wasted right now).
This gives QF a consistent qualilty PRNG on all platforms. The
implementation is slightly different from the standard, but gives the same
results for the same speed (details in mersenne.c).
This is a quick fix until I get a random number generator into QF.
Mingw's RAND_MAX is only 0x7fff and so the (((rnd >> 10) & 63) - 31.5) / 63.0
used for the z component of origin and velocity would never go positive.
For now, change the 10 to 9 (reusing another bit from Y). I plan on
implementing a full 32-bit PRNG in QF so we always have a reliable
generator.
It was pointed out by Blub\w (gmqcc) that OP_MUL_FV and friends were buggy
when the operands overlapped (eg, x = x.x * x) as the result would become
'x.x*x.x x.y*x.x*x.x x.z*x.x*x.x' (note the x.x squared for y and z). On
testing, sure enough the bug was present (and is a nice demonstration that
QF's VM does NOT have strict-aliasing bugs). As a very nice benefit: the
code produced by the fixes is actually faster than the broken version :).
The ruamoko code used for testing:
void (string fmt, ...) printf = #0;
vector foo (vector x)
{
x = x * x.x;
return x;
}
vector bar (vector x)
{
x = x.x * x;
return x;
}
int main ()
{
vector x = '2 3 4';
vector y = foo (x);
vector z = bar (x);
printf ("x=%v y=%v z=%v 2*x=%v\n", x, y, z, 2*x);
return 0;
}
Now the user can create and destroy IMTs at will, though currently
destroying IMTs is currently all or nothing (imt_drop_all).
An IMT is created via imt_create which takes the keydest name (key_game
etc), the name of the IMT (must be unique for all IMTs) and optionally the
name of the IMT to which the key binding search will fall back if there is
no binding in the current IMT, but must be already defined and on the same
keydest. This means that IMTs now have user determined fallback paths. The
requirements for the fallback IMT prevent loops and other weird behaviour.
Actual key binding via in_bind is unaffected. This is why the IMT name must
be unique across all IMTs.
The "imt" command works with the key_game keydest, but imt_keydest is
provided for specifying the active IMT for a specific keydest.
At startup, default IMTs are setup to emulate the previous static IMTs so
old configs will continue to work (mostly). New config files will be
written with commands to drop all of the current IMTs and build new ones,
with the bindings and active IMT set as well.
This fixes the status bar refresh issues in sw. The problem was that with
two viddef's hanging around, things got a little confused and recalc_refdef
wasn't getting into the renderer.
in_clear <imt>... where each argument to in_clear is an imt identifier. If
any identifiers are incorrect, the incorrect ones will be displayed and no
tables will be cleared. All or nothing.
It seems that SDL_SetColors causes a page flip, so VID_SetPalette only
queue a palette change (by checking for the need to change and storing the
requested palete) and VID_Update now checks for a queued palette change and
updates SDL's palette if required. This fixes the flickering console in sw
-sdl introduced by the cshift/centerprint change.
Need to up the precision by one due to the difference between g and e, but
much prettier. Might need to rename that function :P I wish I'd thought to
check if g would work, but thanks to divVerent for the suggestion.
It seems the code expected octal escapes to always start with 0. This is
not the case. Also, octal escapes are limited to 3 digits (and hex to 2).
This fixes the garbled bold text in ITS.
It was properly cleared after drawing water chains and sky chains, but I
had missed normal surfaces. It took the use of the same texture for both
normal surfaces and water surfaces to trigger the bug. Thanks go to Simon
'Sock' O'Callaghan and his In The Shadows mod.
vidmode is starting to show its age. Modern X doesn't need a config file,
and when one is not available, the list of available resolutions is quite
strange. Time to look into randr support.
The bsp2 header is not necessarily correct (or even present), but the bsp29
header is: it was setup via set_bsp32_write. This fixes the bsp corruption
when vising a map (and, I expect, any problems with qfbsp on a big-endian
machine).
Since the hull depth needs to be set for the hull to be useful, it makes
sense to move the code into the same place that allocates new hulls (to me,
anyway).
It seems they were written before quaternion support was added and were not
updated to take into account the variable size of parameters. Now at least
Object's -error: works.
Normally, the order doesn't matter, but when tracing code, it becomes very
difficult to tell where the trace ends and the dump begins. Printing the
message first puts the message between the trace and the dump: much easier
:)
qfcc now does local common subexpression elimination. It seems to work, but
is optional (default off): use -O to enable. Also, uninitialized variable
detection is finally back :)
The progs engine now has very basic valgrind-like functionality for
checking pointer accesses. Enable with pr_boundscheck 2
Getting everything right with an enum proved to be too difficult if not
impossible. Also use better tests for equivalence and intersection.
Many more tests have been added. All pass :)
Also move the ALLOC/FREE macros from qfcc.h to QF/alloc.h (needed to for
set.c).
Both modules are more generally useful than just for qfcc (eg, set
builtins for ruamoko).
Aliasing the jump table to an integer broke statement_get_targetlist with
the new alias def handling, and was really wrong anyway. I probably did
that due to being fed up with things and wanting to get qfcc working again
rather than spending time getting jumpb right.
The depth limits in the gl and glsl renderers and in the trace code really
bothered me, but then the fix hit me: at load-time, recurse the trees
normally and record the depth in the appropriate place. The node stacks can
then be allocated as necessary (I chose to add a paranoia buffer of 2, but
I expect the maximum depth will rarely be used).
While accessing short foo[2][4]; as foo[0][0..7] should work in theory, who
knows what gcc does with it when optimizing. I don't know if this will fix
johnnyonflame's bsp loading problem, but no point in having rhinodemonic
code hanging around.
This necessitated disabling the id2 padding, but it's only commented out
incase there's more growth. Now the (compiler) error in -addObjectNoRetain
is caught ealier.
Using "=" was rather confusing, so changing it to "<CONV>" seems to be a
good idea. As the string is used only for selecting opcodes at compile
time, only qfcc is affected.
Using "=" was rather confusing, so changing it to "<CONV>" seems to be a
good idea. As the string is used only for selecting opcodes at compile
time, only qfcc is affected.
Going by "standard" Objective-C, retainCount really doesn't belong in
Object itself. The way GNUStep does it is to stash retainCount in memory
just below the object by allocating extra bytes for the count and returning
a pointer just beyond those extra bytes. Now Ruamoko does the same. This
fixes the inconsistencies in structure layouts for Protocol and class
structs between qfcc generated (internal) structs and user visible structs.
The attached patch (against quakeforge git) changes the [con]width,
[con]height, and most importantly the rowbytes members of viddef_t
from unsigned to signed int, like in q2. This allows for a properly
negative vid.rowbytes which may be needed in, e.g. a DIB sections
windows driver if needed. Along with it, I changed a few places
where unsigned int is used along with comparisons against the relevant
vid.* members.
One thing I am not 100% sure is the signedness requirements of
d_zrowbytes and d_zwidth: q2 has them as unsigned but I am not sure
whether that is because they are needed as unsigned or it was just an
oversight of the id developers. They do look like they should be OK
as signed int to me, though: comments?
==
Note from Bill Currie: I had to do some extra changes as many
signed/unsigned comparisons were somehow missed.
This fixes the horribly different results between optimized and unoptimized
qfbsp (there is still a difference of 1 brushface). Unfortunately, it also
severely limits the maximum size of a map.
It turns out the expected orientation of the sky cube is exactly that of
Blender's default cube looked at from the front view (num-1) and the front
face being the nearest face. This put's Marcher's sun nicely in the view
when exiting the cave.
Rearrange the sky_suffix and sky_coords arrays and remove the sky_target
array such that the faces can be loaded using
GL_TEXTURE_CUBE_MAP_POSITIVE_X + i (apparently certain drivers break if
the faces aren't loaded in the correct order).
Also, the nomalization of the direction vector in the fragment isn't
necessary.
Need to subtract the size of the bsp_t/bsp29_t struct. Now old and new
qfbsp produce identical bsps (so long as they're both unoptimized, or
(probably) both optimized).
All of the nastiness is hidden in bspfile.c (including the old bsp29
specific data types). However, the conversions between bsp29 and bsp2 are
implemented but not yet hooked up properly. This commit just gets the data
structures in place and the obvious changes necessary to the rest of the
engine to get it to compile, plus a few obvious "make it work" changes.
Certain versions of qcc (fteqcc comes to mind :P) strip the names from
builtin functions. This breaks saved games that happen to have a builtin
function in a saved function variable. The earlier builtin name
reconstruction patch happened to fix the writing of save games for such
progs, and this one fixes the reading.
The setup had been lost at some stage, thus shadows were always directly
under the entity. Unlike the original quake shadow code, the vector is
correctly transformed into the entity's space.
I finally found the cause of Despair's gl shadows non-rendering+segfault...
the shadow code expected triangle fans and strips but was getting simple
triangles. Oops.
Nothing in the main program currently uses Key_Progs_Init, so the object
file wasn't getting pulled into the link. However, it's quite necessary for
the client console plugin :/
That is, the descriptors loaded from the progs file. Some compilers (eg,
fteqcc :P) strip builtin names from the progs, which makes debugging
difficult.
LordHavoc had made lighting positive for sw32, but I had done something in
the plugin code that broke that (probably something to do with the
colormap loading). Going back to id's original code fixes the issue.
This reverts commit e170f4ee75.
It turns out I messed up something in the patch. I noticed the problem
while playing digs04.bsp: many sub-model surfaces, particularly those with
animated textures, were not being transformed correctly. As this patch did
not make a large performance difference, it's probably better to just
revert it. I might revisit it later.
Since the backtile is loaded into a scrap and used as a subtexture, we
can't use GL's texture wrapping, thus do the wrapping ourselves. There are
some minor issues with the wrong part of the scrap being drawn: need to
investigate where the bug is (vrect, make_quad, etc).
Rather than checking the raw edict count in the entities file against the
progs' max_edicts, check the allocated entity's number. This allows loading
of sophisticated maps (eg, digs04) that prune many of their entities.
In the end, it turns out this is the correct fix for the gl seg on
overkill, because build_skin will correctly use the fully setup player skin
if the glskin doesn't have a texture associated with it.
It turns out glsl, sw and sw32 weren't getting any benefit from R_CullBox
because the frustum wasn't setup :P. Get another 8% out of bigass1
(174->184fps). bigass1 now runs 2x as fast as it did before I started this
optimisation run :)
This severely reduces the calles to BindTexture, and more importantly,
glUseProgram, EnableVertexAttribArray etc. The biggest changes are:
o icons and text are all in the one giant texture
o icons and text are mixed in the one queue
This gave ~9% speedup for bigass1 (159->174fps).
It seems recent(?) 64-bit strcpy implementations of strcpy don't work
properly for overlapping regions even when moving down. Quite the
surprise, as I thought that would always work. *shrug*
I didn't like the way client/server code was poking around at the
implementation. Instead, provide a couple of accessor functions for the
same information.
There will normally be only one unnamed field (if any), and it's always the
null field. This will put an eventual end to the "'' is not a field"
messages.