While it's not where I want it to be, it at least now no longer messes
with frame buffer binding or the view ports. This involved switching
around buffers in D_WarpScreen so that the main buffer could be bound
before post-processing.
The cvar setup for particles is a bit wonky in that the arrays get
initialized using the default max particle count but never updated.
Though things could be improved some more, this solution works (and has
been more or less copied to gl, but I couldn't reproduce the crash
there, or even the valgrind error).
The code dealing with state is a bit of a mess, but everything is
working nicely. Get around 400fps when all 6 faces need to be rendered
(no surprise: it should be about 1/6 of that for normal rendering). The
messy state handling code did not come as a surprise as I suspected
there were various mistakes in my scene rendering "recipe", and fisheye
highlighted them nicely (I'm sure getting this stuff working in Vulkan
will highlight even more issues).
Finally, after a decade :P Looks pretty good, too, and is (almost)
properly scaled to the resolution (almost because the effect is a little
squashed, but I think the sw renderer does the same).
The GLSL compiler requires any #version lines to be the first (real)
line of the program, even #line causes an error, so if the first line of
the chunk starts with #version, insert the #line directive as the second
line.
Again, gl/vulkan not working yet (on the assumption that sw would be
trickier).
Fisheye overrides water warp because updating the projection map every
frame is far too expensive.
I've added a post-process pass to the interface in order to hide the
implementation details, but I'm not sure I'm happy about how the
multi-pass rendering for cube maps is handled (or having the frame
buffers as exposed as they are), but mainly because Vulkan will make
implementation interesting.
For now, OpenGL and Vulkan renderers are broken as I focused on getting
the software renderer working (which was quite tricky to get right).
This fixes a couple of issues: the segfault when warping the screen (due
to the scene rendering move invalidating the warp buffer), and warp
always having 320x200 resolution. There's still the problem of the
effect being too subtle at high resolution, but that's just a matter of
updating the tables and tweaking the code in D_WarpScreen.
Another issue is the Draw functions should probably write directly to
the main frame buffer or even one passed in as a parameter. This would
remove the need for binding the main buffer at the beginning and end of
the frame.
This used to be handled by R_RenderView (encompassing all of the
rendering) before the scene rendering was moved out to r_screen. This
fixes the stuck time in 32-bit nq-win.
Its guts have been moved to D_Init temporarily while I work on the
frame buffer design. This is actually a big part of that work as it
moves most of the frame buffer creation into the one place, making it
easier to ensure I get all the sub-buffers and caches created.
I think the widespread use of recalc_refdef (and force_fullscreen) was
the result of a rushed merge of the renderer and video code (I do seem
to remember sprinkling them around). This cleans the two out of the
client code.
This avoids the possibility of a singularity (and thus the temptation to
use Sys_Error). While the rendering is rubbish, 0 degrees is allowed
because values less than 1 should be allowed, but where does one stop?
170 is the maximum in order to avoid any issues with (near) parallel or
inverted frustum planes (or other fun things) in the low level code.
Other than the view model (undecided on the approach) this has
R_RenderView pretty much pulled out of the low level renderers. With
this, I'll be able to focus on scene handling for a bit then getting
shadows and fisheye working (again for fisheye).
r_screen isn't really the right place, but it gets the scene rendering
out of the low-level renderers and will make it easier to sort out
later, and hopefully easier to figure out a good design for vulkan.
gl_overbright_f shouldn't need to run through any entity queues to
update the light maps as only the world model has light maps, and
hitting the world model should hit all its sub-models.
The change to using separate per-model-type entity queues resulted in
the lighting vector used for alias and iqm models being in an ephemeral
location (in the shared setup_lighting function's stack frame). This
resulted in the model rendering code getting a garbage vector due to it
being overwritten by another stack frame. What I don't get is why the
garbage varied from run to run for the same demo (demo2, the first scrag
behind the start door showed the bad lighting nicely), which made
tracking down the offending commit (and thus the code) rather
troublesome, though once I found it, it was a bit of a face-palm moment.
Move r_pcurrentvertbase into the sw renderer, cleaning up gl's use of
(not really needed there). Not ready to move r_bsp into the main bin yet
as there are linking issues since only the low-level code references any
of its symbols.
The code is really part of scene (not a typo wrt r_screen: that is
misnamed as such, or at least SCR_UpdateScreen needs to be split into
screen (2d overlay, really) and scene updates).
This breaks fisheye rendering as the fisheye code calls the actual scene
render code multiple times, but the fisheye code is called by said scene
render code via a diversion. The fisheye needs to be moved out to the
high level scene render, but that will takes some extra work for frame
buffer setup.
The two aren't compatible (but warping might be doable in the fisheye
code). The whole frame setup code needs a rework, and really, even the
buffer handling.
It being on the stack was a bad idea as R_RenderWorld returns before the
scans are rendered and thus the entity pointer winds up pointing to
abandoned stack space.
While the scheme of using our own allocated did work just fine, fisheye
rendering uses glGenTextures which caused a texture id clash and thus
invalid operations (the cube map texture happened to be the same as the
console background texture). Sure, I could have just "fixed" the fisheye
init code, but this brings gl closer in line with glsl (which makes
extensive use of glGenTextures and glDeleteTextures). This doesn't fix
any texture leaks gl has (plenty, I imagine), but it's a step in the
right direction.
Only for gl and sw at the moment (want to merge things further before I
do anything for glsl or vulkan). However, with with I've learned getting
gl and sw to work, glsl and vulkan will be trivial.
R_RecursiveLightUpdate has been obsolete for a very long time, and
R_Mirror is just wrong (needs envmaps etc, wonder if it can be done in
the fixed function code using skyclip?)
Finally. I never liked it (felt bad adding it in the first place), and
it has caused confusion with function and global variable names, but it
did let me get the render plugins working.
This moves the common camera setup code out of the individual drivers,
and completely removes vup/vright/vpn from the non-software renderers.
This has highlighted the craziness around AngleVectors with it putting
+X forward, -Y right and +Z up. The main issue with this is it requires
a 90 degree pre-rotation about the Z axis to get the camera pointing in
the right direction, and that's for the native sw renderer (vulkan needs
a 90 degree pre-rotation about X, and gl and glsl need to invert an
axis, too), though at least it's just a matrix swizzle and vector
negation. However, it does mean the camera matrices can't be used
directly.
Also rename vpn to vfwd (still abbreviated, but fwd is much clearer in
meaning (to me, at least) than pn (plane normal, I guess, but which
way?)).
I'd been considering it for a while, but in the end, all the issues it
presented made me decide it wasn't worth merging and was never really
worth keeping: it was a neat proof of concept but of little actual use,
especially now everyone either has an OK GPU or would want to stick to
8-bit rendering anyway (sorry L-Havoc).
However, both it and my merge work are preserved in git history :)
I got tired of having to maintain two separate software renderers, but
didn't want to just nuke sw32, so its core changes are merged into sw.
Alias model rendering is broken, but I know exactly what's wrong and how
to fix it, just need to take care due to asm.
So far, in gl and glsl, but viewposition is much clearer than r_origin
(despite being the same thing), and modelorg is just confusing (I think
it's the view position relative to the current model).
GL still has its own functions for enabling and disabling fog while
rendering, but GLSL doesn't need such (thanks to the shaders), nor will
vulkan (and the software renderers don't support fog).
This is a step towards high-level unification of the renderers, as far
as possible keeping only actual low-level implementation details in the
individual renderers (some higher level stuff, eg shadows, is expected
to be per-renderer as some things are just not feasible to implement in
all renderers). However, the idea is to move the high-level
functionality into scene rendering.
As qwaq doesn't yet do any 3d rendering, it doesn't use efrags and thus
wasn't pulling in the object file, but the various renderers were trying
to access it. And I thought plugin builds were more difficult (I had
forgotten).
Only CaptureBGR is per-renderer as the rest of the screenshot code uses
it to do the actual capture (which is target dependent). Vulkan is
currently broken due to capture being an asynchronous process and the
rest of the code expecting capture to be synchronous (also, bgr vs rgb).
The best thing is all renderers now write the same format (currently
png).
I'm not sure what the author of that code was thinking (maybe trying to
do 4 pixels at a time?), but the resulting code still did only one.
Better to remove all the casts, use the right pointer type, and keep the
code clear.
Drawing sky chains first ensures that sky surfaces correctly block parts
of the map that should not be visible (by writing the correct depth to
the depth buffer when doing box or dome skies). Writing brush models
first means that the models (ammo boxes etc) could be visible when they
should not be.
While there's currently only the one still, this will allow the entities
to be multiply queued for multi-pass rendering (eg, shadows). As the
avoidance of putting an entity in the same queue more than once relies
on the entity id, all entities now come from the scene (which is stored
in cl_world in the client code for nq and qw), thus the extensive
changes in the clients.
GL and GLSL were drawing the view model after particles instead of
before. For GL, this is likely due to avoiding fog affecting the view
model (which I think is not the right thing to do), and GLSL due to
copying GL (because I had no idea at the time). This makes the two
renderers consistent with the software renderers, and might even speed
things up a little as that's one less set of blends to do when the
particles are covered by the view model (I don't expect much
difference).
While I doubt the difference is all that significant, this should speed
up entity rendering because it cuts out a lot of branching, and
eliminates scanning the same list multiple times only to not do anything
for large chunks of the list.
The actual view and projection matrices are now consistent with vulkan,
with the vulkan-gl disparity moved into adjustment matrices. The goal is
to allow the same camera data and code to be used in all renderers. The
extra matrix multiplication shouldn't be too expensive as it occurs only
when the field of view (not often, under user control) or near and far
clip distances (very rarely) change.
While both matrices had positive determinants in the first place, I find
the projection matrix easier to understand without all the negatives,
and having quake-x/vulkan-z positively parallel in the z-up matrix makes
that a lot easier to think about.
Regardless of whether the sky is spinning or not, the matrix needs to be
updated with the current origin in order to get the direction vector
right in the shader. Also, it's in the update that the required x-y
plane rotation gets in so the skies move in the correct direction.
This is the bulk of the work for recording the resource pointer with
with builtin data. I don't know how much of a difference it makes for
most things, but it's probably pretty big for qwaq-curses due to the
very high number of calls to the curses builtins.
Closes#26
This is part of the work for #26 (Record resource pointer with builtin
function data). Currently, the data pointer gets as far as the
per-instance VM function table (I don't feel like tackling the job of
converting all the builtin functions tonight). All the builtin modules
that register a resources data block pass that block on to
PR_RegisterBuiltins.
This will make it possible for the engine to set up their parameter
pointers when running Ruamoko progs. At this stage, it doesn't matter
*too* much, except for varargs functions, because no builtin yet takes
anything larger than a float quaternion, but it will be critical when
double or long vec3 and vec4 values are passed.
long is ignored for double, and v6p progs are stuck with 32 bits for
longs (don't feel like extending v6p any further), but the basics are
there for Ruamoko.
short is ignored for ints because the minimum size is 32, and signed is
just noise for ints anyway (and no chars, so...).
unsigned, however, is finally implemented properly (or at least seems to
be working correctly: tests pass after getting things compiling again,
and lt.u is used where it should be :)
And other related fields so integer is now int (and uinteger is uint). I
really don't know why I went with integer in the first place, but this
will make using macros easier for dealing with types.
Forgetting to invoke [super dealloc] in a derived class's -dealloc
method has caused me to waste far too much time chasing down the
resulting memory leaks and crashes. This is actually the main focus of
issue #24, but I want to take care of multiple paths before I consider
the issue to be done.
However, as a bonus, four cases were found :)
This takes care of the global variables to a point (there is still the
global struct shared between the non-vulkan renderers), but it also
takes care of glsl's points-only rendering.
After yesterday's crazy marathon editing all the particles files, and
starting to do another big change to them today, I realized that I
really do need to merge them down. All the actual spawning is now in the
client library (though particle insertion will need to be moved). GLSL
particle rendering is semi-broken in that it now does only points (until
I come up with a way to select between points and quads (probably a
context object, which I need anyway for Vulkan)).
This may seem a little contradictory, but it's due to the difference
between a high level (engine) render pass and a Vulkan render pass
object (and quite likely a poor choice in names for the high level
object). This is necessary for supporting compute shader dispatches as
they cannot be submitted inside a Vulkan render pass.
This has the advantage of getting entity_t out of the particle system,
and much easier to read math. Also, it served as a nice test for my
particle physics shaders (implemented the ideas in C). There's a lot of
code that needs merging down: all but the actual drawing can be merged.
There's some weirdness with color ramps, but I'll look into that later.
They should increment by one for each pic, not 4 (I think some fluff
remaining from copying glsl's draw code).
I noticed the problem when I saw large gaps of 0s in the vertex data in
renderdoc.
This gets the crosshair working in Vulkan (next commit) and fixes issues
with changing the palette (though I've never seen a different palette
for quate, there's still the change from "all black" to an actual
palette).
This means color, emission, and translucent. Fixes the HOM issues on my
VersaPro (but halves the frame-rate... definitely need to bring back the
forward renderer as an option).
This gets the pipelines loaded (and unloaded on shutdown). Probably the
easy part :P. Still need to sort out the command buffers,
synchronization, and particle generation (and probably a bunch else
that's not coming to mind).
This needed changing Vulkan_CreatePipeline to
Vulkan_CreateGraphicsPipeline for consistency (and parsing the
difference from a plist seemed... not worth thinking about).
It turned out the bindless approach wouldn't work too well for my design
of the sprite objects, but I don't think that's a big issue at this
stage (and it seems bindless is causing problems for brush/alias
rendering via renderdoc and on my versa pro). However, I have figured
out how to make effective use of descriptor sets (finally :P).
The actual normal still needs checking, but the sprites are currently
unlit so not an issue at this stage.
I'm not at all sure what I was thinking when I designed it, but I
certainly designed it wrong (to the point of being fairly useless). It
turns out memory requirements are already aligned in size (so just
multiplying is fine), and what I really wanted was to get the next
offset aligned to the given requirements.
This adds the shaders and the pipeline specs. I'm not sure that the
deferred rendering side of the render pass is appropriate, but I thought
I'd give it a go, since quake sprites are really cutoff rather than
translucent.
With the switch to multi-layer textures for brush models, the bsp and
alias texture descriptor sets became identical and thus the definitions
shareable. However, due to complications I don't want to address yet,
they're still separately identified, but I should be able to use the
texture set for most, if not all, pipelines.
The vertices and frame images are loaded into the one memory object,
with the vertices first followed by the images.
The vertices are 2D xy+uv sets meant to be applied to the model
transform frame, and are pre-computed for the sprite size (this part
does support sprites with varying frame image sizes).
The frame images are loaded into one image with each frame on its own
layer. This will cause some problems if any sprites with varying frame
image sizes are found, but the three sprites in quake are all uniform
size.
As much as it can be since the texture data is interleaved with the
model data in the files (I guess not that bad a design for 25 years ago
with the tight memory constraints), but this paves the way for
supporting sprites in Vulkan.
As the sw renderer's implementation was the closest to id's, it was used
as the model (thus a fair bit of cleanup is still needed). This fixes
some incorrect implementations in glsl and gl.
I'd forgotten (when doing the original brush texture loader) that
turbulent surfaces were unlit and thus always full-bright, then never
wrote the turb shader to take care of it. The best solution seems to be
to just mix the two colors in the shader as it will allow turb surfaces
to be lit in the future (probably with severely limited light counts due
to being a forward renderer).
This gets the alias pipeline in line with the bsp pipeline, and thus
everything is about as functional as it was before the rework (minus
dealing with large texture sets).
I guess it's not quite bindless as the texture index is a push constant,
but it seems to work well (and I may have fixed some full-bright issues
by accident, though I suspect that's just my imagination, but they do
look good).
This should fix the horrid frame rate dependent behavior of the view
model.
They are also in their own descriptor set so they can be easily shared
between pipelines. This has been verified to work for Draw.
BSP textures are now two-layered with the albedo and emission in the two
layers rather than two separate images. While this does increase memory
usage for the textures themselves (most do not have fullbright pixels),
it cuts down on image and image view handles (and shader resources).
Smashing everything in the process :P (need to work on the C side).
However, while bindless is supposedly good for performance, the biggest
gain this will bring is portability: the texture counts are
automatically limited to what the hardware can handle, and the reliance
on push descriptors is removed (though they were nice and did help get
things up and running).
I had forgotten that the parameters are in reverse order, and even if I
had remembered, I forgot to reset offset before the second loop.
Pre-decrementing offset takes care of both issues at once.
My VersaPro doesn't support more than 32 per-stage samplers (lavapipe).
This is a small part of getting Vulkan to run on lavapipe and even in
itself is rather incomplete.
Fixes the warning about parse_fixed_array not being used (oops, the
problem with partial commits), but more importantly, gives access to
things like maxDescriptorSetSamplers.
This will make property list expressions easier to work with. The
library is rather limited right now (trig, dot, min/max/bound) but even
just min adds a lot of functionality.
I want to support reading VkPhysicalDeviceLimits but it has some arrays.
While I don't need to parse them (VkPhysicalDeviceLimits should be
treated as read-only), I do need to be able to access them in property
list expressions, and vkgen generates the cexpr type descriptors too.
However, I will probably want to parse arrays some time in the future.
This ensures that unused parser blocks do not get emitted. In the
testing of the upcoming support for fixed arrays, the blend color
constants were being double emitted (both as custom and normal parser)
due to being an array. gcc did not like that (what with all those
warning flags).
Multiple render passes are needed for supporting shadow mapping, and
this is a huge step towards breaking the Vulkan render free of Quake,
and hopefully will lead the way for breaking the GL renderers free as
well.
This is actually a better solution to the renderer directly accessing
client code than provided by 7e078c7f9c.
Essentially, V_RenderView should not have been calling R_RenderView, and
CL_UpdateScreen should have been calling V_RenderView directly. The
issue was that the renderers expected the world entity model to be valid
at all times. Now, R_RenderView checks the world entity model's validity
and immediately bails if it is not, and R_ClearState (which is called
whenever the client disconnects and thus no longer has a world to
render) clears the world entity model. Thus R_RenderView can (and is)
now called unconditionally from within the renderer, simplifying
renderer-specific variants.