I got a sync validation error on a scatter command (I think) thus the
setting was probably wrong. Most of the parameters are still what they
were, but I'll be able to tweak the barriers as necessary.
Unfortunately, it didn't help with the hang on fetching the light cull
query data when starting in fisheye mode (no hang when enabling fisheye
after startup). I'm not sure what's going on there other than the
queries aren't getting updated: the counts seem to be fine so maybe the
commands aren't running. I've probably got a tangled mess of
pseudo-parallel command buffers: I need to go through my system and
clean everything up.
Only matrix-vector, vector-matrix and vector-vector outer products (no
more room), but that's enough to get decent performance out of
matrix-matrix and matrix-scalar (both of which can be done as a set of
matrix-vector or vertex-scalar products).
Progs version bumped because I found that I'd put the swizzle and 2d
wedge ops in the wrong spot (compared to both intention and docs) and
rather than adjust the docs, I took advantage of the opportunity to get
a nicer layout for the wedge products (nestled into the spare slots left
by the 2x2 matrix ops, which seems fitting as the 2d wedge is the
determinant of a 2x2 matrix).
Instead of forcing it via unnecessary alignment. I didn't feel like
using aligned malloc to silence ubsan when really the alignment was just
to force the size to be aligned.
It seemed like a good idea since vulkan and gltf resources use JSON.
However, plist and json parsing and writing are separate: there's no
auto-detection for parsing, and the appropriate function must be used
for writing (though reading one then writing the other will work, but
may result in some information loss, or even invalid json (binary)).
Escape characters aren't handled quite to spec yet (eg, no \uxxxx).
The tests are pretty lame, but they're taken from rfc7159 and round-trip
correctly (which is a surprise for the fp numbers).
I don't remember why I thought it was a good idea to respect embedded
nul characters, but doing so makes appendstr O(N) instead of O(1) (or
O(N^2) instead of O(N) for multiple appends of n chars (N = sum(n)).
Also, this makes appendstr consistent with dasprintf.
I've decided that I want reserve to mean only allocate backing memory,
not modify the size of the string, but I didn't want to rework much code
in the process. I might eventually get right of the open functions, but
I wouldn't be surprised if that's another decade or two in the future.
Builtins calling other functions that call back into progs can get their
parameter pointers messed up resulting in all sorts of errors. Thus wrap
all callbacks to progs in PR_SaveParams/PR_RestoreParams.
Also, ditch PR_RESET_PARAMS in favor of using PR_SetupParams and move
setting pr_argc into PR_SetupParams.
Thanks to validation layers showing command buffer debug regions, it was
pretty easy to find the offending buffers. Did need to modify
QFV_PacketCopyBuffer to take a source barrier as well as the destination
barrier, but this is probably for the best.
Now all pipelines and any tasks that have a command buffer attached get
a region using their names (tasks use the function name). I don't know
when it happened, or if I failed to notice last time, but (sync)
validation layers now include the debug region for command buffers: very
nice.
It doesn't really affect anything since it's just the parameter name in
an empty function-type macro, but "event" instead of "type" isn't
particularly conducive to self-documenting code.
And point the return pointer at the return buffer. And, of course,
restore it. This fixes a really subtle (ie, difficult to find) bug
caused by the recent optimization improvements in qfcc: the optimizer
had decided to set the return value of a message call to the parameter
for the next call, but because the message was to the receiver class for
the first time, the class's +initialize was called. The +initialize
method returned self, which of course when into the parameter for the
*next* call, but the first call hadn't been made, so its parameter got
corrupted.
The particle renderer uses the palette texture in the vertex shader, so
updating the palette needs the vertex shader stage included in the
barrier, but I imagine not all texture updates will need it, so add a
parameter to Vulkan_UpdateTex to select inclusion.
Handle type encodings aren't actually compatible with basic type
encodings as their width is always one and thus the tag field collides
with the basic type encoding's width field.
It turns out the parameter pointer save/restore I had done for detoured
functions is required for all nested calls. However, I had actually
completely forgotten about it. I updated the docs for that section.
With this, it is a little easier to make qwaq independent of quake. The
default dirconf is still meant for quake, and fs_dirconf can still be
used to override the configuration.
While every possible subsystem needs an initialization call, all that
does is add the actual initialization task to the render graph system.
This allows the render graph to be fully configurable, initializing only
those subsystems that the graph needs.
Scripted initialization is still separated from startup as render graph
creation needs various resources (eg, attachments) defined before
creating render and compute passes, but all those need to be created
before the subsystems can actually start up.
Finally. However, it has effect only when no render config is provided.
When a config is provided, things will break currently as nothing is
done yet, but getting a config in will take some work in qwaq and also
the render graph system as I want to make the startup functions
configurable.
The config is a pre-parsed property list. Currently unsupported by
anything but Vulkan (but only a warning is given, not a hard error at
this stage), and Vulkan doesn't use it yet.
And clean up the resulting errors. While some were tricky, there weren't
all that many: just some attachment issues and the multi-stage image
copy for scraps.
Fixing scraps required a barrier between copies. It might be overkill,
but a transfer_dst to transfer_dst image barrier worked.
Fixing attachments was a bit trickier:
- depth needed early and late fragment tests to be treated as one stage
- all attachments that were read later needed storeOp = none (using the
extension)
- and then finalLayout needed to be correct to avoid ghost transitions
- as well, for some reason the deffered gbuffer subpass needed a depth
dependency on the translucent pass even though neither one writes to
the depth attachment (possibly a validation bug, needs more
investigation).
It's not perfect (double fog on translucent surfaces, the
scatter/absorption isn't right, and no local lighting on the fog
itself), but it at least seems to look ok.
I think has been one of the biggest roadblocks to breaking free of
quake, so having dual render paths and thus the different new scene load
sequence has proven to be unexpected helpful. There's a lot more to be
done to make the render graph actually usable by anyone but me, but just
making scene load configurable frees up a lot. I think there needs to be
renderer startup/shutdown configuration too, but this seems to be enough
for now.
The lightmaps aren't updated at all yet, so everything is static.
Figuring out how lightmap data gets to the gpu was a chore thanks to the
spaghetti in the bsp data, and then I'd forgotten that I was
pre-expanding the light data to rgb so wound up with weird lightmaps,
but without water or particles, demo1 is getting 5000fps at 800x450, and
it seems to be CPU limited.