This let me keep clearValue's simple default rgba float interpretation,
but also have full control over access to the float32, int32 and uint32
fields.
This is necessary because fisheye rendering draws the scene up to 6
times per frame, which results in many of the limits being hit
prematurely, but updating r_framecount that often breaks dynamic lights.
Really? More to clean up before (vulkan) bsp rendering is thread-safe?
However, R_MarkLeaves was pretty close: just oldviewleaf and
visframecount, but that's still too much. Also, the reliance on
r_refdef.worldmodel irked me.
While there will be some GPU resources to sort out for multi-pass bsp
processing, I think this is the last piece required before shadow passes
can be implemented.
They were an interesting idea and might be useful in the future, but
they don't work as well as I had hoped for quake's maps due to the
overlapping light volumes causing contention while doing the additive
blends in the frame buffer. The cause was made obvious when testing in
the marcher map: most of its over 400 lights have infinite radius thus
require full screen passes: all those passes fighting for the frame
buffer did very nasty things to performance. However, light splats might be
useful for many small, non-overlapping light volumes, thus the code is
being kept (and I like the cleanups that came with it).
Move things around a bit so I can restore the previous behavior of doing
all lights in a single full screen pass but keep the code improvements
from trying to do splatted lighting.
The old system used just "views", but I had at some time decided that I
might want to support specifying buffers and buffer views, but forgot to
change the name in vkparse.c.
Samplers have no direct relation to render passes or pipelines, so
should not necessarily be in the same config file. This makes all the
old config files obsolete, and quite a bit of support code in vkparse.c.
This gets screenshots working again. As the implementation is now a
(trivial) state machine, the pause when grabbing a screenshot is
significantly reduced (it can be reduced even further by doing the png
compression in a separate thread).
The new system seems to work quite nicely with brush models, which was
the intent, but it's nice to see. Hopefully, it works well when it comes
to shadows. There's still water warp and screen shots to fix, and
fisheye to get working, as well.
Gotta be sure :)
With the new system mostly up and running (just bsp rendering and
descriptor sets/layout handling to go, and they're independent of the
old render pass system), the old system can finally be cleared out.
This fixes the insta-death of particles. Interestingly, other than
particles (due to the ring of buffers not being used correctly),
everything else worked nicely, so I guess 1-frame rendering got tested.
The particles die instantly due to curFrame not updating (next commit),
but otherwise work nicely, especially sync is better (many thanks to
Darian for his help with understanding sync scope).
bsp_draw_queue isn't the right place, but it's just place-holder code to
help get the rest of the renderers up and running before I tackle bsp
rendering. Fixes the segfault in demo1 when the zombies get gibbed,
resulting in zombie entities.
This was necessary to get the 2d elements drawn after the fence had been
fired (thus indicating descriptors could be updated) but before actual
rendering of the 2d elements (which is how it was done before the switch
to the new system).
It turns out there was a bug in the old iqm push constants spec (I still
need to figure out how to use layouts in the new system so I can
completely delete the old).
The output system's update_input takes a parameter specifying the render
step from which it is to get the output view of that step and updates
its descriptors as necessary.
With this, the full render job is working for alias models (minus a few
glitches).
When creating a new command buffer and appending it to a queue, the
active buffer count needs to be incremented too otherwise the new
command buffer will be accidentally reused prematurely. Not noticed
earlier because only one buffer was being created.
Many thanks to Peter and Darian for clearing up my misunderstanding of
how vkResetCommandPool works. The manager creates command buffers from
the command pool on an as-needed basis (when the queue of available
buffers is empty), and keeps track of those buffers in a queue. When the
pool is reset, the queues (one each for primary and secondary command
buffers) are reset such that the tracked buffers are available again.
Imageless framebuffers would probably be easier and cleaner, but this
takes care of the validation error attempting to present the second
frame (because rendering was being done to the first frame's swapchain
image instead of the second frame's).
Command buffer pools can't be reset until the commands have all been
executed. Having per-frame pools makes keeping track of pool lifetime
fairly easy.
Interleaving Vulkan objects with stucts containing vec4f_t results in
the vectors becoming unaligned when there is an odd number of objects in
a set, thus producing a segfault. Putting all the structs first prevents
any such issue.
The new render system now passes validation for the first frame (but
no drawing is done by the various subsystems yet). Something is wrong
with how swap chain semaphores are handled thus the second frame fails.
Frame buffer attachments can now be defined externally, with
"$swapchain" supported for now (in which case, the swap chain defines
the size of the frame buffer).
Also, render pass render areas and pipeline viewport and scissor rects
are updated when necessary.
I don't like the current name (update_framebuffer), but if the
referenced render pass doesn't have a framebuffer, one is created. The
renderpass is referenced via the active renderpass of the named render
step. Unfortunately, this has uncovered a bug in the setup of renderpass
objects: main.deferred has output's renderpass, and main.deferred_cube
and output have bogus renderpass objects.
Being able to specify the types in the push constant ranges makes it a
lot easier to get the specification correct. I never did like having to
do the offsets and sizes by hand as it was quite error prone. Right now,
float, int, uint, vec3, vec4 and mat4 are supported, and adheres to
layout std430.
This allows the likes of:
qfv_pushconstantrangeinfo_s = {
.name = qfv_pushconstantrangeinfo_t;
.type = (QFDictionary);
.dictionary = {
.parse = {
type = (labeledarray, qfv_pushconstantinfo_t, name);
size = num_pushconstants;
values = pushconstants;
};
stageFlags = $name.auto;
};
stageFlags = auto;
};
Leading to:
pushConstants = {
vertex = { Model = mat4; blend = float; };
fragment = { colors = uint; base_color = vec4; fog = vec4; };
};
Where the label of the labeled array (which pushConstants is) is
actually an enum flag and the dictionary value is another labeled array.
The up-coming changes to push constant handling has qfv_float etc type
enum values and using "float" instead of "qfv_float" is highly desirable
as the names match the glsl type names.
The creation of the render jobs doesn't really belong with the running
of those jobs. This makes the code a little easier to navigate as it was
too easy to lost between load-time and run-time code.
This is with the new render job scheme. I very much doubt it actually
works (can't start testing until everything passes, and it's disabled
for the moment (define in vid_render_vulkan.c)), but it's helping iron
out what more is needed in the render system.
I never liked it, but with C2x coming out, it's best to handle bools
properly. I haven't gone through all the uses of int as bool (I'll leave
that for fixing when I encounter them), but this gets QF working with
both c2x (really, gnu2x because of raw strings).
I never liked the various hacks I had come up with for representing
resource handles in Ruamoko. Structs with an int were awkward to test,
pointers and ints could be modified, etc etc. The new @handle keyword (@
used to keep handle free for use) works just like struct, union and
enum in syntax, but creates an opaque type suitable for a 32-bit handle.
The backing type is a function so v6 progs can use it without (all the
necessary opcodes exist) and no modifications were needed for
type-checking in binary expressions, but only assignment and comparisons
are supported, and (of course) nil. Tested using cbuf_t and QFile: seems
to work as desired.
I had considered 64-bit handles, but really, if more than 4G resource
objects are needed, I'm not sure QF can handle the game. However, that
limit is per resource manager, not total.
Really, a bit more than stub as the basic code is there, but nothing
works properly yet due to missing resources (especially descriptor sets
and pools), and the frame buffer creation is still disabled.
The step dependencies are not handled yet as threading isn't used at
this stage, but since I'll require dependencies to always come earlier,
this shouldn't cause a problem.
I had somehow missed vkfieldignore in a consistency pass, or just messed
up its initialization (and thus deallocation) resulting in a double-free
of the strings.
This fixes a Sys_Error when loading the level for the first demo (and
probably many other times). It was mod_numknown getting set to 0 that
triggered the issue, but that seems to be necessary for the other
renderers. I think the whole model loading and caching system needs an
overhaul as this doesn't feel quite right due to removing part of the
advantage of caching the model data.
While the previous cleanup took care of the C side, it turns out vkgen
was leaking property list items all over the place, but they were
cleaned up by the shutdown code.
The jobs will become the core of the renderer, with each job step being
one of a render pass, compute pass, or processor (CPU-only) task. The
steps support dependencies, which will allow for threading the system in
the future.
Currently, just the structures, parse support, and prototype job
specification (render.plist) have been implemented. No conversion to
working data is done yet, and many things, in particular resources, will
need to be reworked, but this gets the basic design in.
Flushing memory requires nonCoherentAtomSize alignment, but this can
cause the flush range to go out of bounds of an improperly sized buffer.
However, only host-visible (and probably really only cached, but all
three covered) needs flushing, so no rounding up is done for
device-local memory.
It should have been this way all along, and it seems I thought they were
when I did rua_gui.c as it already freed its resource block, which would
have been a double free (oops). Fixes an invalid write when shutting
down progs in qwaq-cmd (relevant change not committed).