This takes care of rockets and lava balls casting shadows when they
shouldn't (rockets more because the shadow doesn't look that nice, lava
balls because they glow and thus shouldn't cast shadows). Same for
flames, though the small torches lost their cool sconce shadows (need to
split up the model into flame and sconce parts and mark each
separately).
The use of a static set makes Mod_LeafPVS not thread safe and also means
that the set is not usable with the set iterators after going to a
smaller map from a larger map.
This also fixes the segfault in the previous commit.
Dynamic light shadow sizes are fixed, but can be controlled via the
dynlight_size cvar (defaults to 250).
While the insertion of dlights into the BSP might wind up being overly
expensive, the automatic management of the component pool cleans up the
various loops in the renderers.
Unfortunately, (current bug) lights on entities cause the entity to
disappear due to how the entity queue system works, and the doubled
efrag chain causes crashes when changing maps, meaning lights should be
on their own entities, not additional components on entities with
visible models.
Also, the vulkan renderer segfaults on dlights (fix incoming, along with
shadows for dlights).
This takes care of the type punning issue by each pass using the correct
sampler type with the correct view types bound. Also, point light and
spot light shadow maps are now guaranteed to be separated (it was just
luck that they were before) and spot light maps may be significantly
smaller as their cone angle is taken into account. Lighting is quite
borked, but at least the engine is running again.
I guess it's kind of UB, but it's handy for images that will be
conditionally written by the GPU but need to be in shader-read-only for
draw calls and the validation layers can't tell that the layers won't be
used.
This gets everything but the actual shadow map bindings working: the
validation layers don't like my type punning (which may well be the
right thing) and specialization constants don't help (yet, anyway) but I
want to get things into git.
It turns out bsp faces are still back-face culled despite the null point
being on the front of every possible plane... or really, because it's on
the front of every possible plane: sometimes the back face is the front
face, and this breaks the face selection code (a separate traversal
function will be needed for non-culling rendering).
Despite that, other than having to deal with different pipelines,
getting the model renderers working went better than expected.
This involved rewriting the descriptor update code too, but that now
feels cleaner.
The matrices are loaded into a storage buffer as it can get quite big at
6 matrices per light (and the current max lights is 768).
The parameter will be passed on to the pipeline tasks in their task
context, allowing for communication between the subsystem calling
QFV_RunRenderPass and the pipeline tasks (for the case of lighting,
passing the current matrix base index).
They're now qfv_* and shared within the vulkan renderer. qfv_z_up cannot
be shared across renderers as they have their own ideas for the world
frame. qfv_box_rotations currently can't be shared across renderers
because if the Y-axis flip and the way it's handled, but sharing should
be achievable by modifying the other renderers to handle the sides
correctly (glsl and gl need to do lookups for the side enums, sw just
needs to be shown which way is up).
This eliminates the O(N^2) (N = map leaf count) operation of finding
visible lights and will later allow for finer culling of the lights as
they can be tested against the leaf volume (which they currently are
not as this was just getting things going). However, this has severely
hurt ad_tears' performance (I suspect due to the extreme number of
leafs), but the speed seems to be very steady. Hopefully, reconstructing
the vis clusters will help (I imagine it will help in many places, not
just lights).
The grid calculations are modified from those of Inigo Quilez
(https://iquilezles.org/articles/filterableprocedurals/), but give very
nice results: when thin enough, the lines fade out nicely instead of
producing crazy moire patterns. Though currently disabled, the default
planes are the xy, yz and zx planes with colored axes.
The idea is the UI system can call into the renderer without knowing
anything about the renderer, and the renderer can do what it pleases to
create UI elements in the correct context (passes as a parameter).
Panels can be anchored to a widget in another hierarchy, allowing for
things like cascading menus. They can also be extended via referencing
them by name, allowing for subsystems to add items to an already panel
(eg, extending a menu).
SCR_UpdateScreen_Legacy now takes only the screen functions pointer (it
didn't need camera or realtime), and the camera sizzle code has been
moved into one place to make cleaning it up easier (when I get around to
auditing AngleVectors etc).
The intent is to use them for menus, tooltips and anything else along
those lines, but windows was a good starting point (and puts a border
along the top of the window too).
Canvas draw order is sorted by group then order within the group. As a
fallback, the canvas entity id is used to keep the sort stable, but
that's only as stable as the ids themselves (if the canvases are
destroyed and recreated, the ids may switch around).
This has use when the order of components in the pool affects draw order
(or has other significance), especially at the subpool level. I plan to
use it for fixing overlapping windows in imui.
Shaped text is cached using all the shaping parameters as well as the
text itself as a key. This makes text shaping a non-issue for imui when
the text is stable, taking my simple test from 120fps to 1000fps
(optimized build).
As I had long suspected, building large hierarchies is fiendishly
expensive (at least O(N^2)). However, this is because the hierarchies
are structured such that adding high-level nodes results in a lot of
copying due to the flattened (breadth-first) layout (which does make for
excellent breadth-first performance when working with a hierarchy).
Using tree mode allows adding new nodes to be O(1) (I guess O(N) for the
size of the sub-tree being added, but that's not supported yet) and
costs only an additional 8 bytes per node. Switching from flat mode to
tree mode is very cheap as only the additional tree-related indices need
to be fixed up (they're almost entirely ignored in flat mode). Switching
from tree to flat mode is a little more expensive as the entire tree
needs to be copied, but it seems to be an O(N) (size of the tree).
With this, building the style editor window went from about 25% to about
5% (and most of that is realloc!), with a 1.3% cost for switching from
tree mode to flat mode.
There's still a lot of work to do (supporting removal and tree inserts).
By default, horizontal and vertical layouts expand to fill their parent
in their on-axis direction (horizontally for horizontal layouts), but
fit to their child views in their off-axis.
Flexible space views take advantage of auto-expansion, pushing sibling
views such that the grandparent view is filled on the parent view's
on-axis, and the parent view is filled by the space in the parent view's
off-axis. Flexible views currently have a background fill, allowing them
to provide background filling of the overall view with minimal overdraw
(ancestor views don't need to have any fill at all).
The biggest change was splitting up the job resources into
per-render-pass resources, allowing individual render passes to
reallocate their resources without affecting any others. After that, it
was just getting translucency and capture working after a window resize.
This takes care of element order stability. It did need reworking the
mouse tracking code (including adding an active flag to views), but now
buttons seem to work correctly.
TextContent seems redundant at this stage since a text view is always
sized to its content, and PercentOfParent doesn't work yet. Pixels
definitely works and Null seems to work in that it does no sizing or
positioning. Vertical layout is supported but not yet tested, similar
for ChildrenSum, but I can have two buttons side by side.
It does almost nothing (just puts a non-function button on the screen),
but it will help develop the IMUI code and, of course, come to help with
debugging in general.
Both passage and simple text are supported, but only simple text has
been tested at this stage. However, as passage text was taken directly
from rua_gui.c and formed the basis for simple text rendering, I expect
it's at least close to working.
The same underlying mechanism is used for both simple text strings and
passages, but without the intervening hierarchy of paragraphs etc.
Results in only the one view for a simple text string.
I'm not sure I like fontconfig (docs are...), but it is pretty standard,
and I was able to find some reasonable examples on stackexchange
(https://stackoverflow.com/questions/10542832/how-to-use-fontconfig-to-get-font-list-c-c).
Currently, only the one font is handled, no font sets for fall backs
etc. It's meant for the debug UI I'm working on, so that shouldn't be a
big deal.
It's usually desirable to hide the cursor when playing quake, but when
using the console, or in various other states, being able to see the
cursor can be quite important.
It's currently very simplistic (visible, not visible), but it gets
things started for making QF more usable in a windowed environment (not
having a visible cursor was fine in DOS, or when full screen, but not
when windowed (and not actively playing).
While there will be some GPU resources to sort out for multi-pass bsp
processing, I think this is the last piece required before shadow passes
can be implemented.
Samplers have no direct relation to render passes or pipelines, so
should not necessarily be in the same config file. This makes all the
old config files obsolete, and quite a bit of support code in vkparse.c.
This gets screenshots working again. As the implementation is now a
(trivial) state machine, the pause when grabbing a screenshot is
significantly reduced (it can be reduced even further by doing the png
compression in a separate thread).
The new system seems to work quite nicely with brush models, which was
the intent, but it's nice to see. Hopefully, it works well when it comes
to shadows. There's still water warp and screen shots to fix, and
fisheye to get working, as well.
Gotta be sure :)
With the new system mostly up and running (just bsp rendering and
descriptor sets/layout handling to go, and they're independent of the
old render pass system), the old system can finally be cleared out.
The particles die instantly due to curFrame not updating (next commit),
but otherwise work nicely, especially sync is better (many thanks to
Darian for his help with understanding sync scope).
This was necessary to get the 2d elements drawn after the fence had been
fired (thus indicating descriptors could be updated) but before actual
rendering of the 2d elements (which is how it was done before the switch
to the new system).
It turns out there was a bug in the old iqm push constants spec (I still
need to figure out how to use layouts in the new system so I can
completely delete the old).
The output system's update_input takes a parameter specifying the render
step from which it is to get the output view of that step and updates
its descriptors as necessary.
With this, the full render job is working for alias models (minus a few
glitches).
Many thanks to Peter and Darian for clearing up my misunderstanding of
how vkResetCommandPool works. The manager creates command buffers from
the command pool on an as-needed basis (when the queue of available
buffers is empty), and keeps track of those buffers in a queue. When the
pool is reset, the queues (one each for primary and secondary command
buffers) are reset such that the tracked buffers are available again.
Imageless framebuffers would probably be easier and cleaner, but this
takes care of the validation error attempting to present the second
frame (because rendering was being done to the first frame's swapchain
image instead of the second frame's).
The new render system now passes validation for the first frame (but
no drawing is done by the various subsystems yet). Something is wrong
with how swap chain semaphores are handled thus the second frame fails.
Frame buffer attachments can now be defined externally, with
"$swapchain" supported for now (in which case, the swap chain defines
the size of the frame buffer).
Also, render pass render areas and pipeline viewport and scissor rects
are updated when necessary.
I don't like the current name (update_framebuffer), but if the
referenced render pass doesn't have a framebuffer, one is created. The
renderpass is referenced via the active renderpass of the named render
step. Unfortunately, this has uncovered a bug in the setup of renderpass
objects: main.deferred has output's renderpass, and main.deferred_cube
and output have bogus renderpass objects.
The string type is useful for passing around strings (the only thing
that they can do, currently), particularly as arguments to functions.
The voidptr type is (currently) never generated by the core cexpr
system, but is useful for storing pointers via cexpr (probably a bit of
a hack, but it seems to work well in my current use).
Being able to specify the types in the push constant ranges makes it a
lot easier to get the specification correct. I never did like having to
do the offsets and sizes by hand as it was quite error prone. Right now,
float, int, uint, vec3, vec4 and mat4 are supported, and adheres to
layout std430.
This is with the new render job scheme. I very much doubt it actually
works (can't start testing until everything passes, and it's disabled
for the moment (define in vid_render_vulkan.c)), but it's helping iron
out what more is needed in the render system.
I never liked it, but with C2x coming out, it's best to handle bools
properly. I haven't gone through all the uses of int as bool (I'll leave
that for fixing when I encounter them), but this gets QF working with
both c2x (really, gnu2x because of raw strings).
The warning flag check worked too well: it enabled the warning and
autoconf's default main wanted the const attribute. The bug has been
floating around for a while, it seems.
set_while checks the iterator's current element membership and skips to
the first element with different membership. ie, if the current element
is in the set, then set_while returns the next element *not* in the set,
but if the current is not in the set, then set_while returns the next
element that *is* in the set. Rather handy for dealing with clusters of
set elements.
I never liked the various hacks I had come up with for representing
resource handles in Ruamoko. Structs with an int were awkward to test,
pointers and ints could be modified, etc etc. The new @handle keyword (@
used to keep handle free for use) works just like struct, union and
enum in syntax, but creates an opaque type suitable for a 32-bit handle.
The backing type is a function so v6 progs can use it without (all the
necessary opcodes exist) and no modifications were needed for
type-checking in binary expressions, but only assignment and comparisons
are supported, and (of course) nil. Tested using cbuf_t and QFile: seems
to work as desired.
I had considered 64-bit handles, but really, if more than 4G resource
objects are needed, I'm not sure QF can handle the game. However, that
limit is per resource manager, not total.
When using SET_STATIC_INIT, the set size needs to be the same as what
set_new() would create for the same number of bits, otherwise the set
will possibly get resized incorrectly (which is bad news when the array
was allocated using alloca). While this is really a symptom of
set_bits_t not getting the right size, getting weird segfaults is not a
good way to diagnose the problem, and set_bits_t being the wrong size is
just a minor pessimism.
Really, a bit more than stub as the basic code is there, but nothing
works properly yet due to missing resources (especially descriptor sets
and pools), and the frame buffer creation is still disabled.
The step dependencies are not handled yet as threading isn't used at
this stage, but since I'll require dependencies to always come earlier,
this shouldn't cause a problem.
Requiring top-level {} or () for (usually) hand-written files is awkward
and even a little error prone, and certainly ugly at times. With this,
loaders that expect a particular format can specify the format a little
more directly.
The jobs will become the core of the renderer, with each job step being
one of a render pass, compute pass, or processor (CPU-only) task. The
steps support dependencies, which will allow for threading the system in
the future.
Currently, just the structures, parse support, and prototype job
specification (render.plist) have been implemented. No conversion to
working data is done yet, and many things, in particular resources, will
need to be reworked, but this gets the basic design in.
The hierarchy leak was particularly troublesome to fix, but now the
hierarchies get updated (and freed) automatically just by removing the
hierarchy reference component from the entity. I suspect there will be
issues with entities that are on multiple hierarchies, but I'll sort
that out later.
I'm not 100% sure this is the best fix for the issue, but the way the
cbuf interpreter stack works (especially in the console code) meant that
the stack was built in the order opposite to how it could be safely
deleted with the existing function. Yeah, more leaks :P
Finally, hash links can be freed when the hash context is no longer
relevant. The context is created automatically when needed, and the
owner can delete the context when its done with the relevant hash
tables.
Render passes and subpasses are now mostly initialized, just command
buffers and frame buffer related info to go (including view/scissor for
pipelines).
Not only does this quieten the validation layers, it ensures that all
the object handles are named and where they need to be. Also fixes only
one pipeline being created instead of the 15 or so.
The render passes seem to be created successfully, but pipelines fail
due to not having layout set, resulting in a segfault (bug in validation
layers?).
I don't remember why I kept the abbreviated configs for images and image
views, but it because such that I need to be able to specify them
completely. In addition, image views support external images.
The rest was just cleaning up after the changes to qfv_resobj_t.
.dictionary can ask for standard parsing via a .parse key (value is
ignored currently).
Fields can use $auto to use standard parsing for that field.
If either is used, the plist field descriptors are written.
They're currently just stubs, but this gets the render info loading
working without any errors. The next step is to connect up pipelines and
create the image resources, then implementing the task functions will
have meaning.
This gets an empty (no tasks or pipelines connected) render context
initialized and available for other subsystems to register their task
functions. Nothing is using it yet, but the test parse of rp_main_def
fails gracefully (needs those tasks).
This just sets up the memory block and cexpr descriptors for the
parameters, parameter parsing is separate (and next). The parameters are
aligned to their size.
A bunch of missed struct members, incorrect parse types, and some logic
errors in the parse setup. Still not working due to problems with
vectors from plist string references and some other errors, but getting
there.
There's still a lot of work to do, but the basics are in. The spec will
be parsed into info structs that can then be further processed to
generate all the actual structs, generally making things a little less
timing dependent (eg, image view info refers to its image by name).
The new render pass and subpass structs have their names mangled for now
until I can switch over to the new system.
While the old system did get things going, it felt clunky to set up,
especially when it came to variations on render passes (eg, flat vs
cube-mapped). Also, much of it felt inside-out, especially the
separation of pipelines and render passes: having to specify the render
pass and subpass in the pipeline spec made the spec feel overly coupled
to the render pass setup. While this is the case in Vulkan, it is not
reflected properly in the pipeline spec. The new system will adjust the
render pass and subpass parameters of the pipeline spec as needed,
making the pipeline specs more reusable, and hopefully less error prone
as the pipelines are directly referenced by the subpasses that use them.
In addition, subpass dependencies should be much easier to set up as
only the dependent subpass specifies the dependency and the subpass
source dependency is mentioned by name. Frame buffer attachments also
get a similar treatment.
The new spec "format" isn't quite finalized (needs to meet the enemy
known as parsing) but it feels like a good starting place.
There are some missing parts from this commit as these are the fairly
clean changes. Missing is building a separate set of pipelines for the
new render pass (might be able to get away from that), OIT heads texture
is flat rather than an array, view matrices aren't set up, and the
fisheye renderer isn't hooked up to the output pass (code exists but is
messy). However, with the missing parts included, testing shows things
mostly working: the cube map is rendered correctly even though it's not
displayed correctly (incorrect view). This has definitely proven to be a
good test for Vulkan's multiview feature (very nice).
I ran into the need to get at the label of labeled array element and the
best way seemed to be by setting the name field of the plfield_t item
passed to the parser function, and then found that PL_ParseSymtab
already does this. I then decided passing the array index would also be
good, and the offset field made sense.
Canvas_SortComponentPool now takes the raw canvas component id as it is
specialized to the canvas subpools.
Canvas_SetLen resizes the root view and then updates the hierarchy for
every canvas in the system.
Canvas_InitSys sets up the component system with the systems it needs
(canvas, view, text). This is required to ensure view_href is just past
the canvas components as it is needed for retrieving the actual canvas
component (and thus sub-pool range ids) from arbitrary views in the
canvas.
Entities are fetched with the correct offset from the pool entities.
This will make it easy for client code to set up data needed by the
console before the console initializes. It already separates console
cvar setup and initialization, which has generally been a good thing.
This is a bit of a hack to allow me to work on vulkan's screen update
"pipeline" without having to mess with the other renderers, since it
turns out they're (currently) fundamentally incompatible.
The pic is scaled to fill the specified rect (then clipped to the
screen (effectively)). Done just for the console background for now, but
it will be used for slice-pics as well.
Not implemented for vulkan yet as I'm still thinking about the
descriptor management needed for the instanced rendering.
Making the conback rendering conditional gave an approximately 3% speed
boost to glsl with the GL stub (~12200fps to ~12550fps), for either
conback render method.
The wording might seem a little odd, but cl_screen is really the full 2D
client HUD while the console is completely independent of the client and
shouldn't know that the client even exists. Ideally, the resize events
would be handled by the canvas system, to which end this is a small
step.
This is the beginning of supporting 2d rendering in 3d space. The idea
is that a canvas can be in 2d orthographic space (not attached to any
entity with a 3d transform), or in 3d perspective space (attached to an
entity with a 3d transform, either as a child of the camera, or of some
object in 3d space).
It will replace the current HUD code when it's working.
I found I needed the subrange start as well as the end, but I liked that
the subpools themselves used only the end of the range, so switching to
just a unint32_t for the value and adding a function to return a tuple
made sense. I had kept the struct because I thought I might want to
store additional information (eg, the entity "owning" the subpool), but
found that I didn't need such information as the systems using subpools
that way would have access to the entity by other means.
Interestingly, the change found a bug in subpool creation: I really
don't know why things worked before, but they work better now :)
Subpools are for grouping components by some criterion. Any component
that has a rangeid callback will be grouped with other components that
return the same render id. Note that the ordering of components within a
group will be affected by adding a component into a group that comes
before that group (or removing a component).
Component pools can have multiple groups, added and removed dynamically,
but removing a group should (currently) be done only when empty.
While "set" is a tad strong (there's just the one component for now), I
had missed the changes when adding ECS systems. Fixes the segfault at
the end of demo1 (ie, when any center text is printed).
Instead of creating new entities for the text views. This approximately
halves the number of entities required to display flowed text, but also
tests the ability to have an entity in multiple hierarchies (the goal of
the ECS component and system changes).
The system struct bundles the registry and component base together,
making it easier to reuse systems in multiple registries, or really,
easier to separate one set of ECS system components from those of other
systems.