This is an extremely extensive patch as it hits every cvar, and every
usage of the cvars. Cvars no longer store the value they control;
instead, they use a cexpr value object to reference the value and
specify the value's type (currently, a null type is used for strings).
Non-string cvars are passed through cexpr, allowing expressions in the
cvars' settings. Also, cvars have returned to an enhanced version of the
original (id quake) registration scheme.
As a minor benefit, giving relevant code direct access to the
cvar-controlled variables is probably a slight optimization as it
removes a pointer dereference, and the variables can be placed for
better data locality.
The static cvar descriptors are made private as an additional safety
layer, though there's nothing stopping external modification via
Cvar_FindVar (which is needed for adding listeners).
While not used yet (partly due to working out the design), cvars can
have a validation function.
Registering a cvar allows a primary listener (and its data) to be
specified: it will always be called first when the cvar is modified. The
combination of proper listeners and direct access to the controlled
variable greatly simplifies the more complex cvar interactions as much
less null checking is required, and there's no need for one cvar's
callback to call another's.
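As a rough illustration of the scheme described above, the following is a
minimal sketch, not the actual implementation. Every name in it
(sketch_cvar_t, sketch_cvar_set_float, and so on) is hypothetical, and the
real descriptors go through cexpr rather than a bare type tag. It only shows
the idea of a descriptor that points at the controlled variable and calls a
primary listener first when the value changes.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names throughout: this is NOT the real cvar API. */
typedef enum { sk_int, sk_float } sketch_type_t;

typedef struct sketch_cvar_s sketch_cvar_t;
typedef void (*sketch_listener_t) (void *data, const sketch_cvar_t *cvar);

struct sketch_cvar_s {
    const char *name;
    sketch_type_t type;            /* type of the controlled variable */
    void       *value;             /* points at the variable, not a copy */
    sketch_listener_t listener;    /* primary listener, called first */
    void       *listener_data;
};

/* The controlled variable lives with the code that uses it, so that code
 * reads it directly without going through the descriptor. */
static float sketch_gravity = 800;
static int   sketch_gravity_changes;

static void
sketch_gravity_f (void *data, const sketch_cvar_t *cvar)
{
    (void) cvar;
    ++*(int *) data;               /* count modifications */
}

/* The descriptor itself is static (private to this file). */
static sketch_cvar_t sketch_gravity_cvar = {
    .name = "sketch_gravity",
    .type = sk_float,
    .value = &sketch_gravity,
    .listener = sketch_gravity_f,
    .listener_data = &sketch_gravity_changes,
};

static void
sketch_cvar_set_float (sketch_cvar_t *cvar, float val)
{
    assert (cvar && cvar->type == sk_float);
    *(float *) cvar->value = val;  /* write through to the variable */
    if (cvar->listener) {
        cvar->listener (cvar->listener_data, cvar);
    }
}
```

Because the listener receives its own data pointer, it does not need to
look up or null-check any other cvar to do its job.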
nq-x11 is known to work at least well enough for the demos. More testing
will come.
The prefix gives more context to the error messages, making the system a
lot easier to use (it was especially helpful when getting my cvar revamp
into shape).
Based on the flags type used in vkparse (difference is the lack of
support for plists). Having this will make supporting named flags in
cvars much easier (though setting up the enum type is a bit of a chore).
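To give a feel for what named flags involve, here is a minimal sketch of
turning a string such as "archive|latch" into a bit mask. It is not the
cexpr flags type: the table, the flag names, and sketch_parse_flags are
all made up for illustration, and there is no whitespace handling or
expression support.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical flag table: names and values are illustrative only. */
typedef struct {
    const char *name;
    unsigned    value;
} sketch_flag_t;

static const sketch_flag_t sketch_flags[] = {
    {"archive", 1u << 0},
    {"rom",     1u << 1},
    {"latch",   1u << 2},
};

/* Parse '|'-separated flag names into a mask.
 * Returns 1 on success (mask stored in *result), 0 on an unknown name. */
static int
sketch_parse_flags (const char *str, unsigned *result)
{
    size_t      count = sizeof (sketch_flags) / sizeof (sketch_flags[0]);
    unsigned    mask = 0;

    assert (str && result);
    while (*str) {
        size_t      len = strcspn (str, "|");   /* length of this name */
        int         found = 0;

        for (size_t i = 0; i < count; i++) {
            if (strlen (sketch_flags[i].name) == len
                && strncmp (str, sketch_flags[i].name, len) == 0) {
                mask |= sketch_flags[i].value;
                found = 1;
                break;
            }
        }
        if (!found) {
            return 0;
        }
        str += len;
        if (*str == '|') {
            str++;                              /* skip the separator */
        }
    }
    *result = mask;
    return 1;
}
```

The chore mentioned above corresponds to the table: each enum type needs
its own list of names and values.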
This allows for easy (and safe) printing of cexpr values where the type
supports it. Types that don't support printing would be due to being too
complex or possibly write-only (eg, password strings, when strings are
supported directly).
Surprisingly, only two, but they were caught by the different value
fields being used, thus the cvar was checked in multiple places. I
imagine that's not really all that common, so there may be some
inconsistencies between default value and use.
This is progress towards #23. There are still some references to
host_time and host_client (via nq's server.h), and a lot of references
to sv and svs, but this is definitely a step in the right direction.
This allows a single render pass description to be used for both
on-screen and off-screen targets. While Vulkan does allow a VkRenderPass
to be used with any compatible frame buffer, and vkparse caches a
VkRenderPass created from the same description, this allows the same
description to be used for a compatible off-screen target without any
dependence on the swapchain. However, there is a problem in the caching
when it comes to targeting outputs with different formats.
As I had suspected, it's due to a synchronization problem between the
scrap and drawing. There's actually a double problem in that data
uploaded to the scrap isn't flushed until the first frame is rendered,
causing a quick init-shutdown sequence to take at least five seconds due
to the staging buffer waiting (and timing out) on a stuck fence.
Rendering just one frame "fixes" the problem (draw was one of the
earliest subsystems to get going in vulkan).
Since it is updated every frame, it needs to be as fast as possible for
the cpu code. This seems to make a difference of about 10us (~130 ->
~120) when testing in marcher. Not a huge change, but the timing
calculation was wrapped around the entire base world pass, so there was
a fair bit of overhead from bsp traversal etc.
The improved allocation overheads have been implemented for gl and sw,
and glsl no longer uses malloc. Using array textures will have to wait
as the current texture loading code doesn't support them.
Really, this won't make all that much difference because alias models
with more than one skin are quite rare, and those with animated skin
groups are even rarer. However, for those models that do have more than
one skin, it will allow for reduced allocation overheads and, where
supported (glsl, vulkan, maybe gl), loading all the skins into an array
texture (since all skins are the same size, though external skins may
vary). That's not implemented yet, though: this just wraps the old
one-skin-at-a-time code.
While looking at the deferred attachment images with a view to using a
template, I noticed that the opaque attachment was using 8-bit color. The
problem is, it's meant to be HDRI with the compose pass crunching it
down to LDRI. Switching to 16-bit float does seem to have made a subtle
difference (hey, it's still quake data, not much HDRI in there).
That certainly makes it nicer to work with large sets, and shows one way
to be careful with allocated resources: don't allocate them in the
inherited data and use a template that needs a few things filled in to
be valid. Also, it seems that overriding values in sub-structures "just
works" :)
It simply parses the referenced plist dictionary (via @inherit =
plist.path;) into the current data block, then allows the data to be
overwritten by the current plist dictionary. This may be a bit iffy for
any allocated resources, so some care must be taken, but it seems to
work nicely.
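The parse-then-override order can be sketched as follows. This is only
an illustration under simplifying assumptions: the "dictionary" is a flat
array of integer items, and sketch_image_t, sketch_parse_dict, and
sketch_parse_inherited are hypothetical names, not vkparse functions.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-ins for a plist dictionary and a parsed data block. */
typedef struct {
    const char *key;
    int         value;
} sketch_item_t;

typedef struct {
    int         format;
    int         samples;
    int         width;
    int         height;
} sketch_image_t;

/* Parse a dictionary into img, touching only the fields that are present.
 * Returns 1 on success, 0 on an unknown key. */
static int
sketch_parse_dict (sketch_image_t *img, const sketch_item_t *items, int count)
{
    assert (img);
    for (int i = 0; i < count; i++) {
        if (!strcmp (items[i].key, "format")) {
            img->format = items[i].value;
        } else if (!strcmp (items[i].key, "samples")) {
            img->samples = items[i].value;
        } else if (!strcmp (items[i].key, "width")) {
            img->width = items[i].value;
        } else if (!strcmp (items[i].key, "height")) {
            img->height = items[i].value;
        } else {
            return 0;
        }
    }
    return 1;
}

/* Inherit: parse the referenced dictionary first, then let the current
 * dictionary overwrite whatever it specifies. */
static int
sketch_parse_inherited (sketch_image_t *img,
                        const sketch_item_t *base, int base_count,
                        const sketch_item_t *cur, int cur_count)
{
    return sketch_parse_dict (img, base, base_count)
        && sketch_parse_dict (img, cur, cur_count);
}
```

With plain values, overwriting is harmless. If a field held an allocated
resource, the second parse would overwrite it without freeing it, which
is the care referred to above.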
This makes much more sense as they are intimately tied to the frame
buffer on which a render pass is working. Now, just the window width
and height are stored in vulkan_ctx_t. As a side benefit,
QFV_CreateSwapchain no longer references viddef (now just palette and
conview in vulkan_draw.c to go).
While I have trouble imagining it making that much performance
difference going from 4 verts to 3 for a whopping 2 polygons, or even
from 2 triangles to 1 for each poly, using only indices for the vertices
does remove a lot of code, and better yet, some memory and buffer
allocations... always a good thing.
That said, I guess freeing up a GPU thread for something else could make
a difference.
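For reference, one common way to get positions from nothing but the
vertex index is the single oversized triangle trick. The sketch below
assumes the polygons in question are screen-covering quads, which the
text above does not state outright. It is written as plain C for
illustration (sketch_fullscreen_vertex is a hypothetical name), whereas
the real computation would live in a vertex shader.

```c
#include <assert.h>

typedef struct {
    float       x, y;
} sketch_vec2_t;

/* Map vertex index 0..2 to the corners of a triangle that covers the
 * whole [-1, 1] clip-space square, so no vertex buffer is needed. */
static sketch_vec2_t
sketch_fullscreen_vertex (unsigned index)
{
    assert (index < 3);
    float       u = (float) ((index << 1) & 2);    /* 0, 2, 0 */
    float       v = (float) (index & 2);           /* 0, 0, 2 */
    sketch_vec2_t pos = { u * 2 - 1, v * 2 - 1 };
    return pos;
}
```

The three results are (-1, -1), (3, -1) and (-1, 3). The parts of the
triangle outside the square are clipped, leaving exactly the screen
covered.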
I think I had gotten lucky with captures not being corrupt due to them
being much bigger than all but the L3 cache (and then they're over 1/2
the size), so the memory was being automatically invalidated by other
activity. Don't want to trust such luck, though.
It makes a significant difference to level load times (approximately
halves them for demo1 and demo2). Nicely, it turns out I had implemented
the rest of the staging buffer code (in particular, flushing) correctly
in that it seems there's no corruption of any of the data.