The main goal was to get visframe out of mnode_t to make it thread-safe
(each thread can have its own visframe array), but moving the plane info
into mnode_t made for better data access patters when traversing the bsp
tree as the plane is right there with the child indices. Nicely, the
size of mnode_t is the same as before (64 bytes due to alignment), with
4 bytes wasted.
Performance-wise, there seems to be very little difference. Maybe
slightly slower.
The unfortunate thing about the change is the plane distance is negated,
possibly leading to some confusion, particularly since the box and
sphere culling functions were affected. However, this is so point-plane
distance calculations can be done with a single 4d dot product.
This gives a rather significant speed boost to timedemo demo1: from
about 2300-2360fps up to 2520-2600fps, at least when using
multi-texture.
Since it was necessary for testing the scrap, gl got the ability to set
the console background texture, too.
I doubt the calls were ever actually made in a normal map due to the
node actually being a node when breaking out of the loop, but when I
experimented with an empty world model (no nodes, one infinite empty
leaf) I found that visit_leaf was getting called twice instead of once.
This is an extremely extensive patch as it hits every cvar, and every
usage of the cvars. Cvars no longer store the value they control,
instead, they use a cexpr value object to reference the value and
specify the value's type (currently, a null type is used for strings).
Non-string cvars are passed through cexpr, allowing expressions in the
cvars' settings. Also, cvars have returned to an enhanced version of the
original (id quake) registration scheme.
As a minor benefit, relevant code having direct access to the
cvar-controlled variables is probably a slight optimization as it
removed a pointer dereference, and the variables can be located for data
locality.
The static cvar descriptors are made private as an additional safety
layer, though there's nothing stopping external modification via
Cvar_FindVar (which is needed for adding listeners).
While not used yet (partly due to working out the design), cvars can
have a validation function.
Registering a cvar allows a primary listener (and its data) to be
specified: it will always be called first when the cvar is modified. The
combination of proper listeners and direct access to the controlled
variable greatly simplifies the more complex cvar interactions as much
less null checking is required, and there's no need for one cvar's
callback to call another's.
nq-x11 is known to work at least well enough for the demos. More testing
will come.
Still work with gcc, of course, and I still need to fix them properly,
but now they're actually slightly easier to find as they all have vec_t
and FIXME on the same line.
Move r_pcurrentvertbase into the sw renderer, cleaning up gl's use of
(not really needed there). Not ready to move r_bsp into the main bin yet
as there are linking issues since only the low-level code references any
of its symbols.
While the scheme of using our own allocated did work just fine, fisheye
rendering uses glGenTextures which caused a texture id clash and thus
invalid operations (the cube map texture happened to be the same as the
console background texture). Sure, I could have just "fixed" the fisheye
init code, but this brings gl closer in line with glsl (which makes
extensive use of glGenTextures and glDeleteTextures). This doesn't fix
any texture leaks gl has (plenty, I imagine), but it's a step in the
right direction.
Finally. I never liked it (felt bad adding it in the first place), and
it has caused confusion with function and global variable names, but it
did let me get the render plugins working.
This moves the common camera setup code out of the individual drivers,
and completely removes vup/vright/vpn from the non-software renderers.
This has highlighted the craziness around AngleVectors with it putting
+X forward, -Y right and +Z up. The main issue with this is it requires
a 90 degree pre-rotation about the Z axis to get the camera pointing in
the right direction, and that's for the native sw renderer (vulkan needs
a 90 degree pre-rotation about X, and gl and glsl need to invert an
axis, too), though at least it's just a matrix swizzle and vector
negation. However, it does mean the camera matrices can't be used
directly.
Also rename vpn to vfwd (still abbreviated, but fwd is much clearer in
meaning (to me, at least) than pn (plane normal, I guess, but which
way?)).
So far, in gl and glsl, but viewposition is much clearer than r_origin
(despite being the same thing), and modelorg is just confusing (I think
it's the view position relative to the current model).
GL still has its own functions for enabling and disabling fog while
rendering, but GLSL doesn't need such (thanks to the shaders), nor will
vulkan (and the software renderers don't support fog).
This is a step towards high-level unification of the renderers, as far
as possible keeping only actual low-level implementation details in the
individual renderers (some higher level stuff, eg shadows, is expected
to be per-renderer as some things are just not feasible to implement in
all renderers). However, the idea is to move the high-level
functionality into scene rendering.
Drawing sky chains first ensures that sky surfaces correctly block parts
of the map that should not be visible (by writing the correct depth to
the depth buffer when doing box or dome skies). Writing brush models
first means that the models (ammo boxes etc) could be visible when they
should not be.
While there's currently only the one still, this will allow the entities
to be multiply queued for multi-pass rendering (eg, shadows). As the
avoidance of putting an entity in the same queue more than once relies
on the entity id, all entities now come from the scene (which is stored
in cl_world in the client code for nq and qw), thus the extensive
changes in the clients.
While I doubt the difference is all that significant, this should speed
up entity rendering because it cuts out a lot of branching, and
eliminates scanning the same list multiple times only to not do anything
for large chunks of the list.
For now, the functions check for a null hunk pointer and use the global
hunk (initialized via Memory_Init) if necessary. However, Hunk_Init is
available (and used by Memory_Init) to create a hunk from any arbitrary
memory block. So long as that block is 64-byte aligned, allocations
within the hunk will remain 64-byte aligned.
Its sole purpose was to pass the newly allocated instsurf when chaining
an instance model (ammo box, etc) surface, but using expresion
statements removes the need for such shenanigans, and even makes
msurface_t that little bit smaller (though a separate array would be
much better for cache coherence).
More importantly, the relevant code is actually easier to understand: I
spent way too long working out what tinst was for and why it was never
cleared.
This is the first step towards component-based entities.
There's still some transform-related stuff in the struct that needs to
be moved, but it's all entirely client related (rather than renderer)
and will probably go into a "client" component. Also, the current
components are directly included structs rather than references as I
didn't want to deal with the object management at this stage.
As part of the process (because transforms use simd) this also starts
the process of moving QF to using simd for vectors and matrices. There's
now a mess of simd and sisd code mixed together, but it works
surprisingly well together.
This is a big step towards a cleaner api. The struct reference in
model_t really should be a pointer, but bsp submodel(?) loading messed
that up, though that's just a matter of taking more care in the loading
code. It seems sensible to make that a separate step.
This cleans up texture_t and possibly even improves locality of
reference when running through texture chains (not profiled, and not
actually the goal).
I'd missed a set of bit->lightnum conversions that resulting in lightnum
becoming much greater than 128 and thus trashing memory when the surface
was marked.
The depth limits in the gl and glsl renderers and in the trace code really
bothered me, but then the fix hit me: at load-time, recurse the trees
normally and record the depth in the appropriate place. The node stacks can
then be allocated as necessary (I chose to add a paranoia buffer of 2, but
I expect the maximum depth will rarely be used).
Where possible, symbols have been made static, prefixed with gl_/GL_ or
moved into the code shared by all renderers. This will make doing plugins
easier but done now for link testing.
Still nothing being rendered: still in the process of building the display
lists, but I'm making good progress. Get this into git before something
goes wrong :)
The links are now in "instance surfaces". For non-instanced models (world,
doors, plats etc (ie, world and its sub-models)), there will be one
instance surface per model surface. However, for instanced models (ammo
boxes etc), there will be many, dynamically allocated (not yet
implemented). This commit gets the static instance surfaces working.
This has several benifits:
o The silly issue with alias model pitches being backwards is kept out
of the renderer (it's a quakec thing: entites do their pitch
backwards, but originally, only alias models were rotated. Hipnotic
did brush entity rotations in the correct direction).
o Angle to frame vector conversions are done only when the entity's
angles vector changes, rather than every frame. This avoids a lot of
unnecessary trig function calls.
o Once transformed, an entity's frame vectors are always available.
However, the vectors are left handed rather than right handed (ie,
forward/left/up instead of forward/right/up): just a matter of
watching the sign. This avoids even more trig calls (flag models in
qw).
o This paves the way for merging brush entity surface rendering with the
world model surface rendering (the actual goal of this patch).
o This also paves the way for using quaternions to represent entity
orientation, as that would be a protocol change.