Of course, it's not as correct as glsl or sw due to using polygons and
uvs rather than a fragment shader (not that such is out of the question
since GL 3.0 is requested, but I don't feel like getting shaders going
just for a couple of post-processing effects in an obsolete renderer).
While it's not where I want it to be, it at least now no longer messes
with frame buffer binding or the view ports. This involved switching
around buffers in D_WarpScreen so that the main buffer could be bound
before post-processing.
The code dealing with state is a bit of a mess, but everything is
working nicely. Get around 400fps when all 6 faces need to be rendered
(no surprise: it should be about 1/6 of that for normal rendering). The
messy state handling code did not come as a surprise as I suspected
there were various mistakes in my scene rendering "recipe", and fisheye
highlighted them nicely (I'm sure getting this stuff working in Vulkan
will highlight even more issues).
Finally, after a decade :P Looks pretty good, too, and is (almost)
properly scaled to the resolution (almost because the effect is a little
squashed, but I think the sw renderer does the same).
Again, gl/vulkan not working yet (on the assumption that sw would be
trickier).
Fisheye overrides water warp because updating the projection map every
frame is far too expensive.
I've added a post-process pass to the interface in order to hide the
implementation details, but I'm not sure I'm happy about how the
multi-pass rendering for cube maps is handled (or having the frame
buffers as exposed as they are), but mainly because Vulkan will make
implementation interesting.
For now, OpenGL and Vulkan renderers are broken as I focused on getting
the software renderer working (which was quite tricky to get right).
This fixes a couple of issues: the segfault when warping the screen (due
to the scene rendering move invalidating the warp buffer), and warp
always having 320x200 resolution. There's still the problem of the
effect being too subtle at high resolution, but that's just a matter of
updating the tables and tweaking the code in D_WarpScreen.
Another issue is the Draw functions should probably write directly to
the main frame buffer or even one passed in as a parameter. This would
remove the need for binding the main buffer at the beginning and end of
the frame.
I think the widespread use of recalc_refdef (and force_fullscreen) was
the result of a rushed merge of the renderer and video code (I do seem
to remember sprinkling them around). This cleans the two out of the
client code.
Other than the view model (undecided on the approach) this has
R_RenderView pretty much pulled out of the low level renderers. With
this, I'll be able to focus on scene handling for a bit then getting
shadows and fisheye working (again for fisheye).
r_screen isn't really the right place, but it gets the scene rendering
out of the low-level renderers and will make it easier to sort out
later, and hopefully easier to figure out a good design for vulkan.
Move r_pcurrentvertbase into the sw renderer, cleaning up gl's use of
(not really needed there). Not ready to move r_bsp into the main bin yet
as there are linking issues since only the low-level code references any
of its symbols.
While the scheme of using our own allocated did work just fine, fisheye
rendering uses glGenTextures which caused a texture id clash and thus
invalid operations (the cube map texture happened to be the same as the
console background texture). Sure, I could have just "fixed" the fisheye
init code, but this brings gl closer in line with glsl (which makes
extensive use of glGenTextures and glDeleteTextures). This doesn't fix
any texture leaks gl has (plenty, I imagine), but it's a step in the
right direction.
Finally. I never liked it (felt bad adding it in the first place), and
it has caused confusion with function and global variable names, but it
did let me get the render plugins working.
This moves the common camera setup code out of the individual drivers,
and completely removes vup/vright/vpn from the non-software renderers.
This has highlighted the craziness around AngleVectors with it putting
+X forward, -Y right and +Z up. The main issue with this is it requires
a 90 degree pre-rotation about the Z axis to get the camera pointing in
the right direction, and that's for the native sw renderer (vulkan needs
a 90 degree pre-rotation about X, and gl and glsl need to invert an
axis, too), though at least it's just a matrix swizzle and vector
negation. However, it does mean the camera matrices can't be used
directly.
Also rename vpn to vfwd (still abbreviated, but fwd is much clearer in
meaning (to me, at least) than pn (plane normal, I guess, but which
way?)).
So far, in gl and glsl, but viewposition is much clearer than r_origin
(despite being the same thing), and modelorg is just confusing (I think
it's the view position relative to the current model).
GL still has its own functions for enabling and disabling fog while
rendering, but GLSL doesn't need such (thanks to the shaders), nor will
vulkan (and the software renderers don't support fog).
This is a step towards high-level unification of the renderers, as far
as possible keeping only actual low-level implementation details in the
individual renderers (some higher level stuff, eg shadows, is expected
to be per-renderer as some things are just not feasible to implement in
all renderers). However, the idea is to move the high-level
functionality into scene rendering.
Only CaptureBGR is per-renderer as the rest of the screenshot code uses
it to do the actual capture (which is target dependent). Vulkan is
currently broken due to capture being an asynchronous process and the
rest of the code expecting capture to be synchronous (also, bgr vs rgb).
The best thing is all renderers now write the same format (currently
png).
While there's currently only the one still, this will allow the entities
to be multiply queued for multi-pass rendering (eg, shadows). As the
avoidance of putting an entity in the same queue more than once relies
on the entity id, all entities now come from the scene (which is stored
in cl_world in the client code for nq and qw), thus the extensive
changes in the clients.
The root transform of each hierarchy can be extracted from the first
transform of the list in the hierarchy, so no information is lost. The
main reason for the change is I discovered (obvious in hindsight) that
deleting root transforms was O(n) due to keeping them in an array, thus
the use of a linked list (I don't expect a hierarchy to be in more than
one such list), and I didn't want the transforms to be in a linked list.
While I doubt the difference is all that significant, this should speed
up entity rendering because it cuts out a lot of branching, and
eliminates scanning the same list multiple times only to not do anything
for large chunks of the list.
It holds the data for a basic 3d camera (transform, fov, near and far
clip). Not used yet as there is much work to be done in cleaning up the
client code.
Regardless of whether the sky is spinning or not, the matrix needs to be
updated with the current origin in order to get the direction vector
right in the shader. Also, it's in the update that the required x-y
plane rotation gets in so the skies move in the correct direction.
This actually has at least two benefits: the transform id is managed by
the scene and thus does not need separate management by the Ruamoko
wrapper functions, and better memory handling of the transform objects.
Another benefit that isn't realized yet is that this is a step towards
breaking the renderers free of quake and quakeworld: although the
clients don't actually use the scene yet, it will be a good place to
store the rendering information (functions to run, etc).
This is the bulk of the work for recording the resource pointer with
with builtin data. I don't know how much of a difference it makes for
most things, but it's probably pretty big for qwaq-curses due to the
very high number of calls to the curses builtins.
Closes#26
It's not enforced a this stage, and it would be easy enough to handle,
but it turns out all the standard quake and quakeworld progs never used
... for the print functions: the behavior of PF_VarString was
undocumented and so... tough :P.
With the return buffer in progs_t, it could not be addressed by the
progs on 64-bit machines (this was intentional, actually), but in order
to get obj_msg_sendv working properly, I needed a way to "bounce" the
return address of a calling function to the called function. The
cleanest solution I could think of was to add a mode to the with
instruction allowing the return pointer to be loaded into a register and
then calling the function with a 0 offset for the return value but using
the relevant register (next few commits). Testing promptly segfaulted
due to the 64-bit offset not fitting into a 32-bit value.
The plan is to use the types to extract the number of parameters for a
selector when it is necessary to know the count. However, it'll probably
become useful for something else alter (these things seem to always do
so).
It's currently only 4 (or even 3 for v6) words, but this fixes false
positives when checking for null pointers in Ruamoko progs due to
pr_return pointing to the return buffer and thus outside the progs
memory map resulting in an impossible to exceed value.
Thanks to the size of the type encoding being explicit in the encoding,
anything that tries to read the encodings without expecting the width
will simply skip over the width, as it is placed after the ev type in
the encoding.
Any code that needs to read both the old encodings and the new can check
the size of the basic encodings to see if the width field is present.
I abandoned the reason for doing it (adding a pile of vector types), but
I liked the cleanup. All the implementations are hand-written still, but
at least the boilerplate stuff is automated.
This cleans up dprograms_t, making it easier to read and see what chunks
are in it (I was surprised to see only 6, the explicit pairs made it
seem to have more).
PR_SetupParams is new and sets up the parameter pointers so older code
that expects only up to 8 parameter will work with both v6p and Ruamoko
progs without having to check what progs are running. PR_SetupParams is
useful even when Ruamoko progs are expected as it reserves the required
space (respecting alignment) on the stack and returns a pointer to the
top (bottom? confusing) of the stack. PR_PushFrame and PR_PopFrame
need to be used around PR_SetupParams, regardless of using temp strings,
to avoid a stack leak (need to do an audit).
This is part of the work for #26 (Record resource pointer with builtin
function data). Currently, the data pointer gets as far as the
per-instance VM function table (I don't feel like tackling the job of
converting all the builtin functions tonight). All the builtin modules
that register a resources data block pass that block on to
PR_RegisterBuiltins.
The builtin and progs function data is overlaid so the extra data
doesn't cause too much memory to be used (it's actually 8 bytes smaller
now). The plan is to pre-compute the offsets based on the parameter
size and alignment data.
This will make it possible for the engine to set up their parameter
pointers when running Ruamoko progs. At this stage, it doesn't matter
*too* much, except for varargs functions, because no builtin yet takes
anything larger than a float quaternion, but it will be critical when
double or long vec3 and vec4 values are passed.
Just 32-bit rounding to next higher power of two, and base 2 logarithm.
Most importantly, they are suitable for use in initializers as they are
constant in, constant out.
I found the docs in PR_ExecuteProgram and PR_CallFunction to be a little
confusing, so making it explicit that PR_ExecuteProgram calls
PR_CallFunction and that PR_CallFunction should be called only in a
builtin seemed like a good idea.
And fix an incorrect definition for RETURN_QUAT.
Prefixed MAX_STACK_DEPTH and LOCALSTACK_SIZE (and LOCALSTACK_SIZE got an
extra _).
The rest is just edits to documentation comments.
Due to how OP_RETURN works, a destination is required for any function
returning data, but the caller may not have allocated any space for the
value. Thus the VM maintains a buffer into which the data can be put and
ignored. It also makes a good place for return values when the engine
calls Ruamoko code as trusting progs code with return sizes seems like a
recipe for disaster, especially if the return location is on the C
stack.
And provide a table for such for qfcc and the like. With this, using
pr_double_t (for example) in C will cause the double value to always be
8-byte aligned and thus structures shared between gcc and qfcc will be
consistent (with a little fuss to take care of the warts).
And other related fields so integer is now int (and uinteger is uint). I
really don't know why I went with integer in the first place, but this
will make using macros easier for dealing with types.
They are both gone, and pr_pointer_t is now pr_ptr_t (pointer may be a
little clearer than ptr, but ptr is consistent with things like intptr,
and keeps the type name short).
This required delaying the setting of the return pointer by call until
after the current pointer had been saved, and thus passing the desired
pointer into PR_CallFunction (which does have some advantages for C
functions calling progs functions, but some dangers too (should ensure a
128 byte (32 word) buffer when calling untrusted code (which is any,
really)).
This fixes the issue of the data stack not being restored properly
because the returning function needs to return a value from its local
variables (stored on the stack) and accessing stack data below the stack
pointer is a bad idea (sure, no interrupts yet, but who knows...).
I don't know why they were ever signed (oversight at id and just
propagated?). Anyway, this resulted in "unsigned" spreading a bit, but
all to reasonable places.
This has been a long-held wishlist item, really, and I thought I might
as well take the opportunity to add the instructions. The double
versions of STATE require both the nextthink field and time global to be
double (but they're not resolved properly yet: marked with
"FIXME double time" comments).
Also, the frame number for double time state is integer rather than
float.
While it doesn't cover the addressing modes, it does match the bit
pattern used in the Ruamoko instruction set. It will make selecting
branch instructions easier (especially for Ruamoko).
In some cases, gcc-11 does a good enough job translating normal looking
C expressions so use just those, but other times need to dig around for
an appropriate intrinsic.
Also, now need to disable psapi warnings when compiling for anything
less than avx.
And partial implementations in qfcc (most places will generate an
internal error (not implemented) or segfault, but some low-hanging fruit
has already been implemented).
As I expect to be tweaking things for a while, it's part of the build
process. This will make it a lot easier to adjust mnemonics and argument
formats (tweaking the old table was a pain when conventions changed).
It's not quite done as it still needs arg widths and types.
While working on the new opcode table, I decided a lot of the names were
not to my liking. Part of the problem was the earlier clash with the
v6p opcode names, but that has been resolved via the v6p tag.