While every possible subsystem needs an initialization call, all that
does is add the actual initialization task to the render graph system.
This allows the render graph to be fully configurable, initializing only
those subsystems that the graph needs.
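A minimal sketch of the pattern (names here are hypothetical, not the
actual QuakeForge API): the subsystem's init call does nothing but
register a task, and the render graph decides whether it ever runs.

    /* hypothetical names for illustration; not the actual QuakeForge API */
    typedef void (*rg_task_func) (void *data);

    typedef struct rg_task_s {
        const char   *name;
        rg_task_func  func;
        void         *data;
    } rg_task_t;

    /* registration only: the task runs later, and only if the loaded
       render graph actually uses the subsystem */
    void render_graph_add_init_task (const rg_task_t *task);

    static void
    particles_init_task (void *data)
    {
        (void) data;    /* the real setup (buffers, pipelines, ...) goes here */
    }

    void
    particles_init (void *ctx)
    {
        render_graph_add_init_task (&(rg_task_t) {
            .name = "particles_init",
            .func = particles_init_task,
            .data = ctx,
        });
    }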
Scripted initialization is still separated from startup: render graph
creation needs various resources (eg, attachments) defined before the
render and compute passes are created, and all of those need to exist
before the subsystems can actually start up.
And clean up the resulting errors. While some were tricky, there weren't
all that many: just some attachment issues and the multi-stage image
copy for scraps.
Fixing scraps required a barrier between copies. It might be overkill,
but a transfer_dst to transfer_dst image barrier worked.
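Roughly what that barrier looks like (a sketch assuming
<vulkan/vulkan.h>; the image handle, subresource range, and exact
access masks are placeholders rather than the actual commit's values):

    #include <vulkan/vulkan.h>

    static void
    scrap_copy_barrier (VkCommandBuffer cmd, VkImage scrap_image)
    {
        /* transfer_dst -> transfer_dst: order the second copy stage
           after the first one's writes */
        VkImageMemoryBarrier barrier = {
            .sType = VK_STRUCTURE_TYPE_IMAGE_MEMORY_BARRIER,
            .srcAccessMask = VK_ACCESS_TRANSFER_WRITE_BIT,
            .dstAccessMask = VK_ACCESS_TRANSFER_READ_BIT
                             | VK_ACCESS_TRANSFER_WRITE_BIT,
            .oldLayout = VK_IMAGE_LAYOUT_TRANSFER_DST_OPTIMAL,
            .newLayout = VK_IMAGE_LAYOUT_TRANSFER_DST_OPTIMAL,
            .srcQueueFamilyIndex = VK_QUEUE_FAMILY_IGNORED,
            .dstQueueFamilyIndex = VK_QUEUE_FAMILY_IGNORED,
            .image = scrap_image,
            .subresourceRange = {
                .aspectMask = VK_IMAGE_ASPECT_COLOR_BIT,
                .levelCount = 1,
                .layerCount = 1,
            },
        };
        vkCmdPipelineBarrier (cmd, VK_PIPELINE_STAGE_TRANSFER_BIT,
                              VK_PIPELINE_STAGE_TRANSFER_BIT, 0,
                              0, NULL, 0, NULL, 1, &barrier);
    }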
Fixing attachments was a bit trickier (a rough sketch follows the list):
- depth needed early and late fragment tests to be treated as one stage
- all attachments that were read later needed storeOp = none (using the
extension)
- and then finalLayout needed to be correct to avoid ghost transitions
- as well, for some reason the deferred gbuffer subpass needed a depth
dependency on the translucent pass even though neither one writes to
the depth attachment (possibly a validation bug, needs more
investigation).
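A rough sketch of those settings, assuming VK_EXT_load_store_op_none is
the extension in question (formats, layouts, and subpass indices are
placeholders):

    /* attachment that is only read later: no store, and a finalLayout
       that matches the later read so no ghost transition is inserted */
    VkAttachmentDescription color = {
        .format = VK_FORMAT_R8G8B8A8_UNORM,             /* placeholder */
        .samples = VK_SAMPLE_COUNT_1_BIT,
        .loadOp = VK_ATTACHMENT_LOAD_OP_LOAD,
        .storeOp = VK_ATTACHMENT_STORE_OP_NONE_EXT,
        .initialLayout = VK_IMAGE_LAYOUT_SHADER_READ_ONLY_OPTIMAL,
        .finalLayout = VK_IMAGE_LAYOUT_SHADER_READ_ONLY_OPTIMAL,
    };

    /* depth: early and late fragment tests treated as one stage */
    VkSubpassDependency depth_dep = {
        .srcSubpass = 0,                                /* placeholder */
        .dstSubpass = 1,
        .srcStageMask = VK_PIPELINE_STAGE_EARLY_FRAGMENT_TESTS_BIT
                        | VK_PIPELINE_STAGE_LATE_FRAGMENT_TESTS_BIT,
        .dstStageMask = VK_PIPELINE_STAGE_EARLY_FRAGMENT_TESTS_BIT
                        | VK_PIPELINE_STAGE_LATE_FRAGMENT_TESTS_BIT,
        .srcAccessMask = VK_ACCESS_DEPTH_STENCIL_ATTACHMENT_WRITE_BIT,
        .dstAccessMask = VK_ACCESS_DEPTH_STENCIL_ATTACHMENT_READ_BIT
                         | VK_ACCESS_DEPTH_STENCIL_ATTACHMENT_WRITE_BIT,
        .dependencyFlags = VK_DEPENDENCY_BY_REGION_BIT,
    };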
I think this has been one of the biggest roadblocks to breaking free of
quake, so having dual render paths and thus the different new scene load
sequence has proven to be unexpectedly helpful. There's a lot more to be
done to make the render graph actually usable by anyone but me, but just
making scene load configurable frees up a lot. I think there needs to be
renderer startup/shutdown configuration too, but this seems to be enough
for now.
Some of them were actual leaks, but tracking memory should be a lot
easier now. However, there's a lot of room for optimization of
allocations (eg, recycling of hierarchies). There is now 1 active
allocation (according to tracy) when nq exits: Qgetline's string buffer
(I think an API change is in order).
And make it callable directly (needed to be able to submit the command
buffer separately from the main commands), though this does mess with
tracy a little.
This relies on my fork of tracy: https://github.com/taniwha/tracy
on the wip-c-vulkan branch. Everything is still rather flaky though.
This necessitated the jump to vulkan 1.2 as a requirement.
Tracy is a frame profiler: https://github.com/wolfpld/tracy
This uses Tracy's C API to instrument the code (already added in several
places). It turns out there is something very weird with the fence
behavior between the staging buffers and render commands as the
inter-frame delay occurs in a very strange place (in the draw code's
packet acquisition rather than the fence waiter that's there for that
purpose). I suspect some tangled dependencies.
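For reference, the zone instrumentation with the Tracy C API looks
roughly like this (zone placement here is just an example):

    #include "tracy/TracyC.h"   /* needs TRACY_ENABLE defined at build time */

    static void
    draw_frame (void)
    {
        TracyCZoneN (zone, "draw_frame", 1);
        /* ... acquire packet, build and submit command buffers ... */
        TracyCZoneEnd (zone);
    }

    static void
    run_frame (void)
    {
        draw_frame ();
        TracyCFrameMark;        /* marks the frame boundary for the profiler */
    }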
The parameter will be passed on to the pipeline tasks in their task
context, allowing for communication between the subsystem calling
QFV_RunRenderPass and the pipeline tasks (for the case of lighting,
passing the current matrix base index).
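In miniature, the idea is something like the following (hypothetical
types and names; not the actual QFV_RunRenderPass signature):

    #include <stdint.h>

    typedef struct task_ctx_s {
        void *data;             /* the parameter passed to the run call */
    } task_ctx_t;

    typedef void (*pipeline_task_t) (task_ctx_t *taskctx);

    static void
    run_render_pass (pipeline_task_t *tasks, int num_tasks, void *data)
    {
        task_ctx_t taskctx = { .data = data };
        for (int i = 0; i < num_tasks; i++) {
            tasks[i] (&taskctx);
        }
    }

    /* a lighting task can then pull its matrix base index back out */
    static void
    lighting_draw (task_ctx_t *taskctx)
    {
        uint32_t matrix_base = *(uint32_t *) taskctx->data;
        (void) matrix_base;     /* used to index the matrix buffer */
    }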
If a step has process tasks, any render or compute
pipelines/renderpasses are **not** run automatically: the idea is the
process tasks need to run the relevant pipelines in a custom manner but
still need the objects to be created.
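That rule, sketched with hypothetical types and function names:

    typedef struct step_s {
        int num_process_tasks;
    } step_t;

    void run_process_tasks (step_t *step);
    void run_render_passes (step_t *step);
    void run_compute_passes (step_t *step);

    void
    run_step (step_t *step)
    {
        if (step->num_process_tasks) {
            /* process tasks take over and run the pipelines themselves;
               the pipeline/renderpass objects still get created */
            run_process_tasks (step);
        } else {
            /* no process tasks: render and compute passes run automatically */
            run_render_passes (step);
            run_compute_passes (step);
        }
    }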
The biggest change was splitting up the job resources into
per-render-pass resources, allowing individual render passes to
reallocate their resources without affecting any others. After that, it
was just getting translucency and capture working after a window resize.
Samplers have no direct relation to render passes or pipelines, so they
do not necessarily belong in the same config file. This makes all the
old config files obsolete, along with quite a bit of support code in
vkparse.c.
This fixes the insta-death of particles. Interestingly, other than
particles (due to the ring of buffers not being used correctly),
everything else worked nicely, so I guess 1-frame rendering got tested.
The particles die instantly due to curFrame not updating (next commit),
but otherwise work nicely; in particular, sync is better (many thanks to
Darian for his help with understanding sync scope).
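The ring in question is the usual per-frame buffer set selected by
curFrame; a simplified sketch (hypothetical structure; the real frame
count and frame advance live elsewhere in the engine):

    #include <stdint.h>
    #include <vulkan/vulkan.h>

    #define NUM_FRAMES 2        /* frames in flight (illustrative) */

    typedef struct particle_frame_s {
        VkBuffer        states; /* per-frame particle state buffer */
        VkCommandBuffer cmd;
    } particle_frame_t;

    static particle_frame_t frames[NUM_FRAMES];
    static uint32_t         curFrame;

    static particle_frame_t *
    current_particle_frame (void)
    {
        /* each in-flight frame must use its own slot; if curFrame never
           advances (the bug), still-in-flight buffers get stomped */
        return &frames[curFrame];
    }

    static void
    advance_frame (void)
    {
        curFrame = (curFrame + 1) % NUM_FRAMES; /* normally once per frame */
    }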
The output system's update_input takes a parameter specifying the
render step from which to get that step's output view, and updates its
descriptors as necessary.
With this, the full render job is working for alias models (minus a few
glitches).
Many thanks to Peter and Darian for clearing up my misunderstanding of
how vkResetCommandPool works. The manager creates command buffers from
the command pool on an as-needed basis (when the queue of available
buffers is empty), and keeps track of those buffers in a queue. When the
pool is reset, the queues (one each for primary and secondary command
buffers) are reset such that the tracked buffers are available again.
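A stripped-down version of that logic (real Vulkan calls, but the
bookkeeping structures are simplified placeholders, and error handling
is omitted):

    #include <stdlib.h>
    #include <vulkan/vulkan.h>

    typedef struct cmd_queue_s {
        VkCommandBuffer *buffers;   /* all buffers allocated from the pool */
        uint32_t         count;     /* number of tracked buffers */
        uint32_t         avail;     /* how many are currently available */
    } cmd_queue_t;

    typedef struct cmd_manager_s {
        VkDevice      device;
        VkCommandPool pool;
        cmd_queue_t   primary;      /* one queue each for primary and */
        cmd_queue_t   secondary;    /* secondary command buffers */
    } cmd_manager_t;

    static VkCommandBuffer
    acquire_cmd (cmd_manager_t *mgr, cmd_queue_t *queue,
                 VkCommandBufferLevel level)
    {
        if (!queue->avail) {
            /* queue empty: allocate a new buffer and track it */
            VkCommandBufferAllocateInfo info = {
                .sType = VK_STRUCTURE_TYPE_COMMAND_BUFFER_ALLOCATE_INFO,
                .commandPool = mgr->pool,
                .level = level,
                .commandBufferCount = 1,
            };
            VkCommandBuffer cmd;
            vkAllocateCommandBuffers (mgr->device, &info, &cmd);
            queue->buffers = realloc (queue->buffers,
                                      (queue->count + 1)
                                      * sizeof (VkCommandBuffer));
            queue->buffers[queue->count++] = cmd;
            queue->avail = 1;
        }
        /* hand out the next available tracked buffer */
        return queue->buffers[queue->count - queue->avail--];
    }

    static void
    reset_cmd_pool (cmd_manager_t *mgr)
    {
        /* resetting the pool resets every buffer allocated from it,
           so all tracked buffers become available again */
        vkResetCommandPool (mgr->device, mgr->pool, 0);
        mgr->primary.avail = mgr->primary.count;
        mgr->secondary.avail = mgr->secondary.count;
    }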
Imageless framebuffers would probably be easier and cleaner, but this
takes care of the validation error attempting to present the second
frame (because rendering was being done to the first frame's swapchain
image instead of the second frame's).
The new render system now passes validation for the first frame (but
no drawing is done by the various subsystems yet). Something is wrong
with how swap chain semaphores are handled, thus the second frame fails.
Frame buffer attachments can now be defined externally, with
"$swapchain" supported for now (in which case, the swap chain defines
the size of the frame buffer).
Also, render pass render areas and pipeline viewport and scissor rects
are updated when necessary.
I don't like the current name (update_framebuffer), but if the
referenced render pass doesn't have a framebuffer, one is created. The
renderpass is referenced via the active renderpass of the named render
step. Unfortunately, this has uncovered a bug in the setup of renderpass
objects: main.deferred has output's renderpass, and main.deferred_cube
and output have bogus renderpass objects.
The creation of the render jobs doesn't really belong with the running
of those jobs. This makes the code a little easier to navigate as it was
too easy to get lost between load-time and run-time code.
This is with the new render job scheme. I very much doubt it actually
works (can't start testing until everything passes, and it's disabled
for the moment (define in vid_render_vulkan.c)), but it's helping iron
out what more is needed in the render system.
Really, a bit more than a stub as the basic code is there, but nothing
works properly yet due to missing resources (especially descriptor sets
and pools), and the frame buffer creation is still disabled.
The step dependencies are not handled yet as threading isn't used at
this stage, but since I'll require dependencies to always come earlier,
this shouldn't cause a problem.
The jobs will become the core of the renderer, with each job step being
one of a render pass, compute pass, or processor (CPU-only) task. The
steps support dependencies, which will allow for threading the system in
the future.
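A rough sketch of the shape this suggests (hypothetical and heavily
simplified; not the actual structures):

    #include <stdint.h>

    typedef enum {
        step_render,            /* graphics render pass */
        step_compute,           /* compute pass */
        step_process,           /* CPU-only processor task */
    } step_type_t;

    typedef struct job_step_s {
        const char   *name;
        step_type_t   type;
        /* steps that must run before this one: used for ordering now,
           and for threading the job later */
        const char  **dependencies;
        uint32_t      num_dependencies;
    } job_step_t;

    typedef struct job_s {
        job_step_t *steps;
        uint32_t    num_steps;
    } job_t;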
Currently, just the structures, parse support, and prototype job
specification (render.plist) have been implemented. No conversion to
working data is done yet, and many things, in particular resources, will
need to be reworked, but this gets the basic design in.