While every possible subsystem needs an initialization call, all that
does is add the actual initialization task to the render graph system.
This allows the render graph to be fully configurable, initializing only
those subsystems that the graph needs.
Scripted initialization is still separated from startup as render graph
creation needs various resources (eg, attachments) defined before
creating render and compute passes, but all those need to be created
before the subsystems can actually start up.
This makes a possible improvement to e1m3, only barely affects ad_tears,
but makes about 30% difference to gmsp3v2 (21fps to 27, and from 3300
leafs to 2700).
The results of the occlusion queries give the lights that don't have a
visible hull, but unfortunately that includes any lights which the
camera is inside, but simple distance checks sort that out (with a
fudge-factor for the icosahedron vertices (1.583 (3(2+p)/(2+3p), p is
golden ratio)).
My efforts (especially the collect zone (what was I thinking)) got
tracy's knickers in a twist resulting in vanishing zones in the server.
It looks like there are some synchronisation issues between cpu and gpu,
but I'm not *too* worried about it at this stage.
The info isn't used yet, but this shows that vulkan's occlusion queries
are at least somewhat useful. However, the technique isn't perfect:
infinite radius lights (1/r and 1/r^2) are difficult to cull, and all
lights can poke through thin enough walls, and then lights containing
the camera get culled incorrectly (will need a separate test). Still, it
looks like it will help once everything is tied together.
This doesn't make much of a difference on the GPU, but it drastically
cuts down CPU usage, especially for ad_tears: shadow map drawing is down
from 16.3ms to 3.7ms thanks to no having to run the alias model queues
as often.
Batching shadow map rendering needs be able to reference matrices for
multiple lights in a single batch, but the only input is the view index,
so use that to look up the matrix index rather than using it to index
the matrices directly (modulo the base index that's still there).
I had missed that upping max lights to 2048 meant that up to 12288
matrices are needed for all the possible lights. This made it so the
light type could not be encoded in id_data, but the shaders never used
it anyway. This leaves one bit free.
While QFV_PacketScatterBuffer works on only one destination buffer, it
turns out it's still useful for scattering to multiple buffers, just
with multiple calls. This makes it pretty easy to combine multiple
buffer updates into a single staging buffer packet, resulting in
reducing lighting's packet use from up to 7 to just one, drastically
reducing the pressure on the stating buffer packet pool, and thus
reducing the chances of QFV_PacketAcquire stalling.
This seems excessive, but gmsp3v2 map has 1399 lights. Worse, it has a
lot of different light sizes that go up by small increments (generally
around 10) resulting in 33 shadow map images (1 too many). Quantizing
the sizes to 32 drops this nicely to 20, and reduces memory consumption
slightly too (image buffer overhead, I guess).
This covers only the rendering of the shadow maps (actual use still
needs to be implemented). Working with orthographic projection matrices
is surprisingly difficult, partly because creating one includes the
translations needed to get the scene into the view (and depth range),
which means care needs to be taken with the view (camera) matrix in
order to avoid double-translating depending on just how the orthographic
matrix is set up (if it's set up to focus on the origin, then the camera
matrix will need translation, otherwise the camera matrix needs to avoid
translation).
This also fixes the segfault in the previous commit.
Dynamic light shadow sizes are fixed, but can be controlled via the
dynlight_size cvar (defaults to 250).
This takes care of the type punning issue by each pass using the correct
sampler type with the correct view types bound. Also, point light and
spot light shadow maps are now guaranteed to be separated (it was just
luck that they were before) and spot light maps may be significantly
smaller as their cone angle is taken into account. Lighting is quite
borked, but at least the engine is running again.
This gets everything but the actual shadow map bindings working: the
validation layers don't like my type punning (which may well be the
right thing) and specialization constants don't help (yet, anyway) but I
want to get things into git.
It turns out bsp faces are still back-face culled despite the null point
being on the front of every possible plane... or really, because it's on
the front of every possible plane: sometimes the back face is the front
face, and this breaks the face selection code (a separate traversal
function will be needed for non-culling rendering).
Despite that, other than having to deal with different pipelines,
getting the model renderers working went better than expected.
This involved rewriting the descriptor update code too, but that now
feels cleaner.
The matrices are loaded into a storage buffer as it can get quite big at
6 matrices per light (and the current max lights is 768).
The parameter will be passed on to the pipeline tasks in their task
context, allowing for communication between the subsystem calling
QFV_RunRenderPass and the pipeline tasks (for the case of lighting,
passing the current matrix base index).
Gotta be sure :)
With the new system mostly up and running (just bsp rendering and
descriptor sets/layout handling to go, and they're independent of the
old render pass system), the old system can finally be cleared out.
This is the minimum maximum count for sampled images, and with layered
shadow maps (with a minimum of 2048 layers supported), that's really way
more than enough.
Another step towards moving all resource creation into the one place.
The motivation for doing the change was getting my test scene to work
with only ambient lights or no lights at all.
There are some issues with the light renderers getting mangled, and only
the first light is even touched (just begin and end of render pass), but
this gets a lot of the framework into place.
For now, at least (I have some ideas to possibly reduce the numbers and
also to avoid the need for actual limits). I've seen gmsp3v2 use over
500 lights at once (it has over 1300), and I spent too long figuring out
that weird light behavior was due to the limit being hit and lights
getting dropped (and even longer figuring out that more weird behavior
was due to the lack of shadows and the world being too bright in the
first place).
The parsing of light data from maps is now in the client library, and
basic light management is in scene. Putting the light loading code into
the Vulkan renderer was a mistake I've wanted to correct for a while.
The client code still needs a bit of cleanup, but the basics are working
nicely.
This leaves only the one conditional in the shader code, that being the
distance check. It doesn't seem to make any noticeable difference to
performance, but other than explosion sprites being blue, lighting
quality seems to have improved. However, I really need to get shadows
working: marcher is just silly-bright without them, and light levels
changing as I move around is a bit disconcerting (but reasonable as
those lights' leaf nodes go in and out of visibility).
Multiple render passes are needed for supporting shadow mapping, and
this is a huge step towards breaking the Vulkan render free of Quake,
and hopefully will lead the way for breaking the GL renderers free as
well.
Modern maps can have many more leafs (eg, ad_tears has 98983 leafs).
Using set_t makes dynamic leaf counts easy to support and the code much
easier to read (though set_is_member and the iterators are a little
slower). The main thing to watch out for is the novis set and the set
returned by Mod_LeafPVS never shrink, and may have excess elements (ie,
indicate that nonexistent leafs are visible).
Any sun (a directional light) is in the outside node, which due to not
having its own PVS data is visible to all nodes, but that's a tad
excessive. However, any leaf node with sky surfaces will potentially see
any suns, and leaf nodes with no sky surfaces will see the sun only if
they can see a leaf that does have sky surfaces. This can be quite
expensive to calculate (already known to be moderately expensive for
just the camera leaf node (singular!) when checking for in-map lights)
Standard quake has just linear, but the modding community added inverse,
inverse-square (raw and offset (1/(r^2+1)), infinite (sun), and
ambient (minlight). Other than the lack of shadows, marcher now looks
really good.
This gets the shaders needed for creating shadow maps, and the changes
to the lighting pipeline for binding the shadow maps, but no generation
or reading is done yet. It feels like parts of various systems are
getting a little big for their britches and I need to do an audit of
various things.
That was... easier than expected. A little more tedious that I would
have liked, but my scripting system isn't perfect (I suspect it's best
suited as the output of a code generator), and the C side could do with
a little more automation.
Other than dealing with shader data alignment issues, that went well :).
Nicely, the implementation gets the explicit scaling out of the shader,
and allows for a directional flag.
Still "some" more to go: a pile to do with transforms and temporary
entities, and a nasty one with host_cbuf. There's also all the static
block-alloc lists :/
Light styles and shadows aren't implemented yet.
The map's entities are used to create the lights, and the PVS used to
determine which lights might be visible (ie, the surfaces they light).
That could do with some more improvements (eg, checking if a leaf is
outside a spotlight's cone), but the concept seems to work.
It's not used yet as work needs to be done to better support generic
entities, but this is the next step to real-time lighting (though, to be
honest, I expect it will be too slow to be usable).
Static lights are yet to come (so the screen is black most of the time),
but dynamic lights work very nicely (and look very good) despite the
falloff being incorrect.