The cvar setup for particles is a bit wonky in that the arrays get
initialized using the default max particle count but never updated.
Though things could be improved some more, this solution works (and has
been more or less copied to gl, but I couldn't reproduce the crash
there, or even the valgrind error).
The code dealing with state is a bit of a mess, but everything is
working nicely. Get around 400fps when all 6 faces need to be rendered
(no surprise: it should be about 1/6 of that for normal rendering). The
messy state handling code did not come as a surprise as I suspected
there were various mistakes in my scene rendering "recipe", and fisheye
highlighted them nicely (I'm sure getting this stuff working in Vulkan
will highlight even more issues).
Finally, after a decade :P Looks pretty good, too, and is (almost)
properly scaled to the resolution (almost because the effect is a little
squashed, but I think the sw renderer does the same).
The GLSL compiler requires any #version lines to be the first (real)
line of the program, even #line causes an error, so if the first line of
the chunk starts with #version, insert the #line directive as the second
line.
Again, gl/vulkan not working yet (on the assumption that sw would be
trickier).
Fisheye overrides water warp because updating the projection map every
frame is far too expensive.
I've added a post-process pass to the interface in order to hide the
implementation details, but I'm not sure I'm happy about how the
multi-pass rendering for cube maps is handled (or having the frame
buffers as exposed as they are), but mainly because Vulkan will make
implementation interesting.
For now, OpenGL and Vulkan renderers are broken as I focused on getting
the software renderer working (which was quite tricky to get right).
This fixes a couple of issues: the segfault when warping the screen (due
to the scene rendering move invalidating the warp buffer), and warp
always having 320x200 resolution. There's still the problem of the
effect being too subtle at high resolution, but that's just a matter of
updating the tables and tweaking the code in D_WarpScreen.
Another issue is the Draw functions should probably write directly to
the main frame buffer or even one passed in as a parameter. This would
remove the need for binding the main buffer at the beginning and end of
the frame.
This used to be handled by R_RenderView (encompassing all of the
rendering) before the scene rendering was moved out to r_screen. This
fixes the stuck time in 32-bit nq-win.
Its guts have been moved to D_Init temporarily while I work on the
frame buffer design. This is actually a big part of that work as it
moves most of the frame buffer creation into the one place, making it
easier to ensure I get all the sub-buffers and caches created.
With what I have planned for frame buffers etc, GL 3.0 will be needed
even for the fixed-function GL renderer, and then I might even take the
GLSL renderer to 4.6 (dunno yet). This means that wgl will need to be
updated too, and I've found the info I need for that, but it's a bit
much to take on just yet.
I think the widespread use of recalc_refdef (and force_fullscreen) was
the result of a rushed merge of the renderer and video code (I do seem
to remember sprinkling them around). This cleans the two out of the
client code.
This avoids the possibility of a singularity (and thus the temptation to
use Sys_Error). While the rendering is rubbish, 0 degrees is allowed
because values less than 1 should be allowed, but where does one stop?
170 is the maximum in order to avoid any issues with (near) parallel or
inverted frustum planes (or other fun things) in the low level code.
Other than the view model (undecided on the approach) this has
R_RenderView pretty much pulled out of the low level renderers. With
this, I'll be able to focus on scene handling for a bit then getting
shadows and fisheye working (again for fisheye).
r_screen isn't really the right place, but it gets the scene rendering
out of the low-level renderers and will make it easier to sort out
later, and hopefully easier to figure out a good design for vulkan.
gl_overbright_f shouldn't need to run through any entity queues to
update the light maps as only the world model has light maps, and
hitting the world model should hit all its sub-models.
The change to using separate per-model-type entity queues resulted in
the lighting vector used for alias and iqm models being in an ephemeral
location (in the shared setup_lighting function's stack frame). This
resulted in the model rendering code getting a garbage vector due to it
being overwritten by another stack frame. What I don't get is why the
garbage varied from run to run for the same demo (demo2, the first scrag
behind the start door showed the bad lighting nicely), which made
tracking down the offending commit (and thus the code) rather
troublesome, though once I found it, it was a bit of a face-palm moment.
Move r_pcurrentvertbase into the sw renderer, cleaning up gl's use of
(not really needed there). Not ready to move r_bsp into the main bin yet
as there are linking issues since only the low-level code references any
of its symbols.
The code is really part of scene (not a typo wrt r_screen: that is
misnamed as such, or at least SCR_UpdateScreen needs to be split into
screen (2d overlay, really) and scene updates).
This breaks fisheye rendering as the fisheye code calls the actual scene
render code multiple times, but the fisheye code is called by said scene
render code via a diversion. The fisheye needs to be moved out to the
high level scene render, but that will takes some extra work for frame
buffer setup.
The two aren't compatible (but warping might be doable in the fisheye
code). The whole frame setup code needs a rework, and really, even the
buffer handling.
It being on the stack was a bad idea as R_RenderWorld returns before the
scans are rendered and thus the entity pointer winds up pointing to
abandoned stack space.
While the scheme of using our own allocated did work just fine, fisheye
rendering uses glGenTextures which caused a texture id clash and thus
invalid operations (the cube map texture happened to be the same as the
console background texture). Sure, I could have just "fixed" the fisheye
init code, but this brings gl closer in line with glsl (which makes
extensive use of glGenTextures and glDeleteTextures). This doesn't fix
any texture leaks gl has (plenty, I imagine), but it's a step in the
right direction.
Only for gl and sw at the moment (want to merge things further before I
do anything for glsl or vulkan). However, with with I've learned getting
gl and sw to work, glsl and vulkan will be trivial.
R_RecursiveLightUpdate has been obsolete for a very long time, and
R_Mirror is just wrong (needs envmaps etc, wonder if it can be done in
the fixed function code using skyclip?)
Finally. I never liked it (felt bad adding it in the first place), and
it has caused confusion with function and global variable names, but it
did let me get the render plugins working.
They're still slightly confusing, but the situation itself is confusing,
but the comments should be a little more helpful now as they are more
explicit about the orientation of the matrices and just which axis
points where.
This moves the common camera setup code out of the individual drivers,
and completely removes vup/vright/vpn from the non-software renderers.
This has highlighted the craziness around AngleVectors with it putting
+X forward, -Y right and +Z up. The main issue with this is it requires
a 90 degree pre-rotation about the Z axis to get the camera pointing in
the right direction, and that's for the native sw renderer (vulkan needs
a 90 degree pre-rotation about X, and gl and glsl need to invert an
axis, too), though at least it's just a matrix swizzle and vector
negation. However, it does mean the camera matrices can't be used
directly.
Also rename vpn to vfwd (still abbreviated, but fwd is much clearer in
meaning (to me, at least) than pn (plane normal, I guess, but which
way?)).
I'd been considering it for a while, but in the end, all the issues it
presented made me decide it wasn't worth merging and was never really
worth keeping: it was a neat proof of concept but of little actual use,
especially now everyone either has an OK GPU or would want to stick to
8-bit rendering anyway (sorry L-Havoc).
However, both it and my merge work are preserved in git history :)
16 and 32 bit rendering are disabled at the moment because there's a
weird segfault I need to fix, but the 8-bit dynamic lights are doing
weird things (for x11, too) when updating the light maps.
I got tired of having to maintain two separate software renderers, but
didn't want to just nuke sw32, so its core changes are merged into sw.
Alias model rendering is broken, but I know exactly what's wrong and how
to fix it, just need to take care due to asm.
So far, in gl and glsl, but viewposition is much clearer than r_origin
(despite being the same thing), and modelorg is just confusing (I think
it's the view position relative to the current model).
GL still has its own functions for enabling and disabling fog while
rendering, but GLSL doesn't need such (thanks to the shaders), nor will
vulkan (and the software renderers don't support fog).
This is a step towards high-level unification of the renderers, as far
as possible keeping only actual low-level implementation details in the
individual renderers (some higher level stuff, eg shadows, is expected
to be per-renderer as some things are just not feasible to implement in
all renderers). However, the idea is to move the high-level
functionality into scene rendering.
As qwaq doesn't yet do any 3d rendering, it doesn't use efrags and thus
wasn't pulling in the object file, but the various renderers were trying
to access it. And I thought plugin builds were more difficult (I had
forgotten).
Only CaptureBGR is per-renderer as the rest of the screenshot code uses
it to do the actual capture (which is target dependent). Vulkan is
currently broken due to capture being an asynchronous process and the
rest of the code expecting capture to be synchronous (also, bgr vs rgb).
The best thing is all renderers now write the same format (currently
png).
I'm not sure what the author of that code was thinking (maybe trying to
do 4 pixels at a time?), but the resulting code still did only one.
Better to remove all the casts, use the right pointer type, and keep the
code clear.
Drawing sky chains first ensures that sky surfaces correctly block parts
of the map that should not be visible (by writing the correct depth to
the depth buffer when doing box or dome skies). Writing brush models
first means that the models (ammo boxes etc) could be visible when they
should not be.
While there's currently only the one still, this will allow the entities
to be multiply queued for multi-pass rendering (eg, shadows). As the
avoidance of putting an entity in the same queue more than once relies
on the entity id, all entities now come from the scene (which is stored
in cl_world in the client code for nq and qw), thus the extensive
changes in the clients.
The root transform of each hierarchy can be extracted from the first
transform of the list in the hierarchy, so no information is lost. The
main reason for the change is I discovered (obvious in hindsight) that
deleting root transforms was O(n) due to keeping them in an array, thus
the use of a linked list (I don't expect a hierarchy to be in more than
one such list), and I didn't want the transforms to be in a linked list.
GL and GLSL were drawing the view model after particles instead of
before. For GL, this is likely due to avoiding fog affecting the view
model (which I think is not the right thing to do), and GLSL due to
copying GL (because I had no idea at the time). This makes the two
renderers consistent with the software renderers, and might even speed
things up a little as that's one less set of blends to do when the
particles are covered by the view model (I don't expect much
difference).
While I doubt the difference is all that significant, this should speed
up entity rendering because it cuts out a lot of branching, and
eliminates scanning the same list multiple times only to not do anything
for large chunks of the list.
Since transforms now know the scene to which they belong, and they know
when they are root and when not, getting the transform code to manage
the scene roots is the best way to keep the list of root transforms
consistent.
It turns out cam_controls is for pointing the player model in the
direction of movement rather than controlling the camera (I should add
proper camera controls).
I finally spent the time to work out what it was trying to say. Still
not sure it's clear, but what is clear is that there was probably some
disagreement at Id about the orientation of the world.
They no longer spin like crazy. I don't know how, but I must have broken
something over the years as I'm sure Seth had the code working (and I
seem to remember seeing it working). In the process, clean up a lot of
the angle mess.
It's a lot easier to read (and see the difference between modes 2 and 3)
with all the ifs removed, and the state is properly is chasestate_t now
(though not handled properly on level reset etc).
The more advanced modes are rather broken (continuous spinning), but
they may have been for a while. The bulk of the various changes were due
to renaming viewstate's origin and angles to make their meaning more
explicit.
They've been near-identical for years, now they're only one. It proved
necessary to start merging the HUD code which for now is just a few cvar
declarations (not even init), but that should be a separate set of
commits.
The actual view and projection matrices are now consistent with vulkan,
with the vulkan-gl disparity moved into adjustment matrices. The goal is
to allow the same camera data and code to be used in all renderers. The
extra matrix multiplication shouldn't be too expensive as it occurs only
when the field of view (not often, under user control) or near and far
clip distances (very rarely) change.
It holds the data for a basic 3d camera (transform, fov, near and far
clip). Not used yet as there is much work to be done in cleaning up the
client code.
Handling of view angles is a little hacky at the moment, but this gets
the chase camera code and most of the common input code into one place,
which will make cleaning up the camera code much easier.
While both matrices had positive determinants in the first place, I find
the projection matrix easier to understand without all the negatives,
and having quake-x/vulkan-z positively parallel in the z-up matrix makes
that a lot easier to think about.
Regardless of whether the sky is spinning or not, the matrix needs to be
updated with the current origin in order to get the direction vector
right in the shader. Also, it's in the update that the required x-y
plane rotation gets in so the skies move in the correct direction.
This actually has at least two benefits: the transform id is managed by
the scene and thus does not need separate management by the Ruamoko
wrapper functions, and better memory handling of the transform objects.
Another benefit that isn't realized yet is that this is a step towards
breaking the renderers free of quake and quakeworld: although the
clients don't actually use the scene yet, it will be a good place to
store the rendering information (functions to run, etc).
I've run into a bit of an issue with transform management (really, just
need to make them owned by the scene, but that means creating a scene
for quake and quakeworld).
This is the bulk of the work for recording the resource pointer with
with builtin data. I don't know how much of a difference it makes for
most things, but it's probably pretty big for qwaq-curses due to the
very high number of calls to the curses builtins.
Closes#26
The zone memory block header is 64 bytes, so allocating a single 8 byte
selector is rather wasteful. Instead, allocate selectors in large chunks
(currently 64) and divvy them out as needed. Significantly reduces
memory pressure in large Ruamoko progs.
These add legacy support for basic float bitops (& | ^ ~). Avoiding the
instructions would require tot only the source to be converted, but also
the servers (as they do access those fields), and this seemed to be too
much.
It's not enforced a this stage, and it would be easy enough to handle,
but it turns out all the standard quake and quakeworld progs never used
... for the print functions: the behavior of PF_VarString was
undocumented and so... tough :P.
I had forgotten that unsigned division was different from signed
division (rather silly of me). However, with some testing and analysis,
unsigned true modulo is not needed as it's not possible to have
negative inputs and thus it's the same as remainder.
It now takes the function name to print in error message (passed on to
PR_Sprintf) and the argument number of the format string. The variable
arguments (in ...) are assumed to be immediately after the format
argument.
This loads the current return pointer into the specified register. No
offset is used (should make that an error, but for now any offset is
simply ignored). This is part of the fix for getting obj_msg_sendv to
work with return values.
With the return buffer in progs_t, it could not be addressed by the
progs on 64-bit machines (this was intentional, actually), but in order
to get obj_msg_sendv working properly, I needed a way to "bounce" the
return address of a calling function to the called function. The
cleanest solution I could think of was to add a mode to the with
instruction allowing the return pointer to be loaded into a register and
then calling the function with a 0 offset for the return value but using
the relevant register (next few commits). Testing promptly segfaulted
due to the 64-bit offset not fitting into a 32-bit value.
This gets message forwarding apparently working, though something isn't
quite right as qwaq-app doesn't update properly when I try to step
through the program, but that could be an error elsewhere.
The plan is to use the types to extract the number of parameters for a
selector when it is necessary to know the count. However, it'll probably
become useful for something else alter (these things seem to always do
so).
This takes care of the problems with PR_RESET_PARAMS (which has recently
become just a wrapper for PR_SetupParams) changing the stack and causing
PR_CallFunction to save the wrong stack pointer. Message forwarding is
currently broken for Ruamoko ISA progs, but that is due to not having a
valid pr_argc. However, I do have a plan involving extracting the
parameter count from the selector, but that's something for a later
commit. Everything else seems to be ok (my little game is working
nicely).
rua_obj was skipped because that looks to be a bit more work and should
be a separate commit.
This is to avoid the stack getting mangled when calling progs functions
with parameters.
I suppose having one builtin call another was a neat idea at the time,
and really could have been fixed by simply wrapping the calls with
push/pop frame, but this is probably faster.
obj_msg_sendv needs to push the parameters onto the stack for Ruamoko
progs, but this causes problems because PR_CallFunction winds up
recording the wrong stack pointer for progs functions, and nothing
restores the stack for builtins. The handling is basically the same as
for the return pointer.
It's a bit disconcerting seeing a builtin in the top 10 when builtins
are counted by call while progs functions are counted by instruction.
Also, show the total profile after the function top-10 list.
pr_argc cannot be used in Ruamoko progs because nothing sets it. This
fixes the parse errors and resulting segfault when trying to parse the
Vulkan pipeline config.
It's currently only 4 (or even 3 for v6) words, but this fixes false
positives when checking for null pointers in Ruamoko progs due to
pr_return pointing to the return buffer and thus outside the progs
memory map resulting in an impossible to exceed value.
Since Z_Malloc uses Z_TagMalloc to do the work, this ensures the check
is always run.
Also, add the check to Z_Realloc when it needs to adjust an existing
block.
Builtins that call progs with parameters now must always wrap the call
to PR_ExecuteProgram so that the data stack is properly preserved across
the call.
I need to do an audit of all the calls to PR_ExecuteProgram.
It turns out the return pointer still needs to be saved even when a
builtin sets up a chain call to progs, but rather than the pointer being
simply restored, it needs to be saved in the call stack exactly as if
the function was called directly by progs. This fixes the invalid self
issue quite thoroughly: parameter state seems to be correct across all
calls now.
I should set up an automated test now that I know and understand the
situation.
In Ruamoko ISA progs, the param pointers point to the stack and
generally must most be manipulated by builtins, and there is no need
anyway as Ruamoko doesn't have RCALL. Fixes the mangling of .super.
When calling a builtin, normally the return pointer needs to be
restored, but if the builtin changes the call depth (usually by
effecting "return foo()" as in support for objects, but possibly
setjmp/longjmp when they are implemented), then the return pointer must
not be restored. This gets vkgen past object allocation, but it dies
when trying to send messages to super. This appears to be a compiler
bug.
Since the operand types sort out the difference between asr and shr, no
need to give them different opnames. Means qfcc doesn't need to worry
about which one it's searching for.
Yet another redundant addressing mode (since ptr + 0 can be used), so
replace it with a variable-indexed array (same as in v6p). Was forced
into noticing the problem when trying to compile Machine.r.