The new render system now passes validation for the first frame (but
no drawing is done by the various subsystems yet). Something is wrong
with how swap chain semaphores are handled, thus the second frame fails.
Frame buffer attachments can now be defined externally, with
"$swapchain" supported for now (in which case, the swap chain defines
the size of the frame buffer).
Also, render pass render areas and pipeline viewport and scissor rects
are updated when necessary.
I don't like the current name (update_framebuffer), but if the
referenced render pass doesn't have a framebuffer, one is created. The
renderpass is referenced via the active renderpass of the named render
step. Unfortunately, this has uncovered a bug in the setup of renderpass
objects: main.deferred has output's renderpass, and main.deferred_cube
and output have bogus renderpass objects.
The string type is useful for passing around strings (the only thing
that they can do, currently), particularly as arguments to functions.
The voidptr type is (currently) never generated by the core cexpr
system, but is useful for storing pointers via cexpr (probably a bit of
a hack, but it seems to work well in my current use).
Being able to specify the types in the push constant ranges makes it a
lot easier to get the specification correct. I never did like having to
do the offsets and sizes by hand as it was quite error prone. Right now,
float, int, uint, vec3, vec4 and mat4 are supported, and the layout
adheres to std430.
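As a rough sketch of what deriving the offsets from the types involves (the enum, table and function names here are hypothetical, not the actual QF code), each member is rounded up to its std430 base alignment and then advanced by its size:

```c
#include <stddef.h>
#include <stdint.h>

// Hypothetical type enum; the names mirror the GLSL types listed above.
typedef enum { pc_float, pc_int, pc_uint, pc_vec3, pc_vec4, pc_mat4 } pc_type_t;

// std430 size and base alignment in bytes for each supported type.
static const struct { uint32_t size, align; } pc_layout[] = {
    [pc_float] = { 4, 4 },
    [pc_int]   = { 4, 4 },
    [pc_uint]  = { 4, 4 },
    [pc_vec3]  = { 12, 16 },
    [pc_vec4]  = { 16, 16 },
    [pc_mat4]  = { 64, 16 },
};

// Assign an offset to each member, starting at `offset`, and return the
// offset just past the last member.
uint32_t
pc_assign_offsets (const pc_type_t *types, size_t count, uint32_t offset,
                   uint32_t *offsets)
{
    for (size_t i = 0; i < count; i++) {
        uint32_t align = pc_layout[types[i]].align;
        offset = (offset + align - 1) & ~(align - 1);   // round up
        offsets[i] = offset;
        offset += pc_layout[types[i]].size;
    }
    return offset;
}
```

Passing the returned offset as the start of the next range lets consecutive ranges share the one push constant block without overlapping.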
This allows the likes of:
    qfv_pushconstantrangeinfo_s = {
        .name = qfv_pushconstantrangeinfo_t;
        .type = (QFDictionary);
        .dictionary = {
            .parse = {
                type = (labeledarray, qfv_pushconstantinfo_t, name);
                size = num_pushconstants;
                values = pushconstants;
            };
            stageFlags = $name.auto;
        };
        stageFlags = auto;
    };
Leading to:
    pushConstants = {
        vertex = { Model = mat4; blend = float; };
        fragment = { colors = uint; base_color = vec4; fog = vec4; };
    };
Where the label of the labeled array (which pushConstants is) is
actually an enum flag and the dictionary value is another labeled array.
The upcoming changes to push constant handling have qfv_float etc type
enum values, and using "float" instead of "qfv_float" is highly desirable
as the names match the glsl type names.
The creation of the render jobs doesn't really belong with the running
of those jobs. This makes the code a little easier to navigate as it was
too easy to get lost between load-time and run-time code.
This is with the new render job scheme. I very much doubt it actually
works (can't start testing until everything passes, and it's disabled
for the moment (define in vid_render_vulkan.c)), but it's helping iron
out what more is needed in the render system.
I never liked it, but with C2x coming out, it's best to handle bools
properly. I haven't gone through all the uses of int as bool (I'll leave
that for fixing when I encounter them), but this gets QF working with
both c2x (really, gnu2x because of raw strings).
Wrap the strtod, strtof, strtol, strtoul functions, supporting the end
pointer as well (if not nil, the int offset of the end pointer relative
to the string start is returned).
Also, str_unmutable creates a return string from a mutable string
(copying it).
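A minimal sketch of the end pointer handling on the C side (parse_long is a hypothetical name, not the actual builtin): the end pointer from strtol is converted to an int offset from the start of the string, and only written when the caller asked for it.

```c
#include <stdlib.h>

// Hypothetical wrapper: parse a long from str and, if end_offset is not
// NULL, store the offset of the first unparsed character relative to the
// start of str.
long
parse_long (const char *str, int *end_offset, int base)
{
    char       *end;
    long        value = strtol (str, &end, base);

    if (end_offset) {
        *end_offset = (int) (end - str);
    }
    return value;
}
```

An offset of 0 means nothing was parsed, which is the same test as comparing the end pointer with the string start in plain C.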
Meaning some leaks have been plugged, and some useful functions added:
loading a file (avoids polluting progs memory), setting the single
character lexeme string, and getting the line number.
Segfaulting when trying to produce an error message doesn't help get the
message out. Sure, `obj_error (nil...)` is a bit of an abuse, but it
shouldn't segfault the engine.
I don't know what I was thinking when I checked for 0 count for resizing
the set. Attempting to add/remove 0 elements results in adding/removing
4G elements. Oops.
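The actual set code isn't reproduced here, but one plausible way for a zero count to turn into 4G elements is unsigned wrap-around when computing the last affected index. A sketch with hypothetical helpers:

```c
#include <stdint.h>

// Hypothetical helper: index of the last element touched when `count`
// elements starting at `first` are added or removed. With an unsigned
// count of 0, count - 1 wraps to 0xffffffff instead of going negative.
uint32_t
last_index_unchecked (uint32_t first, uint32_t count)
{
    return first + count - 1;
}

// Guarded version: returns 0 without touching *last when there is
// nothing to do, so a zero count never reaches the arithmetic.
int
last_index (uint32_t first, uint32_t count, uint32_t *last)
{
    if (!count) {
        return 0;
    }
    *last = first + count - 1;
    return 1;
}
```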
set_while checks the iterator's current element membership and skips to
the first element with different membership. ie, if the current element
is in the set, then set_while returns the next element *not* in the set,
but if the current is not in the set, then set_while returns the next
element that *is* in the set. Rather handy for dealing with clusters of
set elements.
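To illustrate the behavior on a plain bit array (this is a hypothetical stand-in, not QF's set_t or its iterator), the scan starts just past the current element and stops at the first element whose membership differs:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

// Hypothetical fixed-size bit set, not QF's set_t.
static bool
bit_is_member (const uint32_t *bits, size_t element)
{
    return (bits[element / 32] >> (element % 32)) & 1;
}

// Starting at `current` (which must be less than `size`), return the first
// element whose membership differs from that of `current`, or `size` if the
// run extends to the end.
size_t
bits_while (const uint32_t *bits, size_t size, size_t current)
{
    bool        member = bit_is_member (bits, current);
    size_t      element = current + 1;

    while (element < size && bit_is_member (bits, element) == member) {
        element++;
    }
    return element;
}
```

Calling it repeatedly with its own result walks the boundaries of each cluster in turn: start of a cluster, one past its end, start of the next, and so on.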
I never liked the various hacks I had come up with for representing
resource handles in Ruamoko. Structs with an int were awkward to test,
pointers and ints could be modified, etc etc. The new @handle keyword (@
used to keep handle free for use) works just like struct, union and
enum in syntax, but creates an opaque type suitable for a 32-bit handle.
The backing type is a function so v6 progs can use it without (all the
necessary opcodes exist) and no modifications were needed for
type-checking in binary expressions, but only assignment and comparisons
are supported, and (of course) nil. Tested using cbuf_t and QFile: seems
to work as desired.
I had considered 64-bit handles, but really, if more than 4G resource
objects are needed, I'm not sure QF can handle the game. However, that
limit is per resource manager, not total.
Removed a bogus dependency from libQFecs, and fixed the order of ui
libraries. This takes care of some first-time make install issues.
Libtool needs the libraries to be specified in dependency order.
Carrying on as if the missing font had been loaded leads to way too many
issues for it to be a good thing (not that that really needs to be
said). Fixes the segfaults in my test scene.
Really, a bit more than stub as the basic code is there, but nothing
works properly yet due to missing resources (especially descriptor sets
and pools), and the frame buffer creation is still disabled.
The step dependencies are not handled yet as threading isn't used at
this stage, but since I'll require dependencies to always come earlier,
this shouldn't cause a problem.
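A small sketch of why the "dependencies always come earlier" rule is enough for now (step_t and the function name are hypothetical): if every dependency index is lower than the index of the step that names it, running the steps in array order satisfies all dependencies.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

// Hypothetical step description: each step lists the indices of the steps
// it depends on.
typedef struct {
    const uint32_t *dependencies;
    size_t      num_dependencies;
} step_t;

// Return true if every dependency refers to an earlier step. When this
// holds, running the steps in array order satisfies all dependencies.
bool
steps_in_dependency_order (const step_t *steps, size_t num_steps)
{
    for (size_t i = 0; i < num_steps; i++) {
        for (size_t j = 0; j < steps[i].num_dependencies; j++) {
            if (steps[i].dependencies[j] >= i) {
                return false;
            }
        }
    }
    return true;
}
```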
I always suspected the overflow conversions were UB, but with gcc doing
different things on arm, I thought it was about time to abandon those
particular tests. What I was not expecting was for the return value of
strcmp to be "UB" (in that there's no guarantee of the exact value, just
< = > 0). Fortunately, nothing actually relies on the value of the op
other than the tests, so modify the test to make the behavior well
defined.
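One way to make such a test well defined, sketched here with a hypothetical helper rather than the change actually made to the test, is to compare only the sign of the result:

```c
#include <string.h>

// Collapse strcmp's result to exactly -1, 0 or 1. The C standard only
// specifies the sign of the result, not its magnitude.
int
strcmp_sign (const char *a, const char *b)
{
    int         cmp = strcmp (a, b);

    return (cmp > 0) - (cmp < 0);
}
```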
I had somehow missed vkfieldignore in a consistency pass, or just messed
up its initialization (and thus deallocation) resulting in a double-free
of the strings.
This fixes a Sys_Error when loading the level for the first demo (and
probably many other times). It was mod_numknown getting set to 0 that
triggered the issue, but that seems to be necessary for the other
renderers. I think the whole model loading and caching system needs an
overhaul as this doesn't feel quite right due to removing part of the
advantage of caching the model data.
While the previous cleanup took care of the C side, it turns out vkgen
was leaking property list items all over the place, but they were
cleaned up by the shutdown code.
Requiring top-level {} or () for (usually) hand-written files is awkward
and even a little error prone, and certainly ugly at times. With this,
loaders that expect a particular format can specify the format a little
more directly.
The jobs will become the core of the renderer, with each job step being
one of a render pass, compute pass, or processor (CPU-only) task. The
steps support dependencies, which will allow for threading the system in
the future.
Currently, just the structures, parse support, and prototype job
specification (render.plist) have been implemented. No conversion to
working data is done yet, and many things, in particular resources, will
need to be reworked, but this gets the basic design in.
I had looked into doing reference counting on the strings, but didn't
like the implementation. However, it did make for better string handling
in the property list parser.
Flushing memory requires nonCoherentAtomSize alignment, but this can
cause the flush range to go out of bounds of an improperly sized buffer.
However, only host-visible (and probably really only cached, but all
three covered) needs flushing, so no rounding up is done for
device-local memory.
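The arithmetic can be sketched without any Vulkan headers (the names are hypothetical, and `atom` stands in for nonCoherentAtomSize, assumed to be a power of two): the flush range is widened to atom boundaries, so the allocation behind it has to be padded to a multiple of the atom size for the widened range to stay in bounds.

```c
#include <stdbool.h>
#include <stdint.h>

// Hypothetical flush range in bytes.
typedef struct { uint64_t offset, size; } flush_range_t;

// Round an allocation size up to a multiple of `atom`, but only for
// host-visible memory; device-local memory is never flushed.
uint64_t
padded_size (uint64_t size, uint64_t atom, bool host_visible)
{
    if (!host_visible) {
        return size;
    }
    return (size + atom - 1) & ~(atom - 1);
}

// Widen [offset, offset + size) so both ends land on atom boundaries.
flush_range_t
align_flush_range (uint64_t offset, uint64_t size, uint64_t atom)
{
    uint64_t    start = offset & ~(atom - 1);                     // round down
    uint64_t    end = (offset + size + atom - 1) & ~(atom - 1);   // round up

    return (flush_range_t) { start, end - start };
}
```

With a 64-byte atom, flushing 20 bytes at offset 70 of a 100-byte buffer widens to bytes 64 through 127, which only fits because the buffer was padded to 128.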
I'm not sure just what was going on other than *other* components were
getting double-removed when the hierarchy reference component was
removed when the entity was being deleted. This might even prevent
issues with removing the hierarchy from an entity that's not being
deleted as the pre-invalidation prevents the removal from deleting the
entity.
It turns out that the fixes for other problems related to removing
hierarchy reference components also fixed the problem that moving the
entity invalidation had fixed, and invalidating the entity late somehow
broke the sprite renderer (at least in glsl).
The hierarchy leak was particularly troublesome to fix, but now the
hierarchies get updated (and freed) automatically just by removing the
hierarchy reference component from the entity. I suspect there will be
issues with entities that are on multiple hierarchies, but I'll sort
that out later.
It turns out that the bsearch bug was hiding incorrect handling of
indices in the subpool beyond the last tracked subpool. In which case, a
correctly working bsearch correctly fails to find the range, but the
search can be skipped entirely.
And rename _bsearch to QF_bsearch_r since that's far less confusing.
Also, update the test to make it possible for valgrind to detect the
out-by-one. The problem was found when trying to remove components from
an entity when using subpools.
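A sketch of the lookup with the early out (find_range is a hypothetical stand-in, not QF_bsearch_r or the ECS code): the ranges are assumed to be stored as ascending exclusive end indices, and an index at or past the last end skips the search entirely.

```c
#include <stddef.h>
#include <stdint.h>

// Hypothetical lookup: `ends` holds the exclusive end index of each
// subpool range in ascending order. Returns the range containing `index`,
// or `count` when the index lies beyond the last tracked range, in which
// case the search is skipped entirely.
size_t
find_range (const uint32_t *ends, size_t count, uint32_t index)
{
    if (!count || index >= ends[count - 1]) {
        return count;
    }
    size_t      low = 0, high = count - 1;
    // invariant: ends[high] > index
    while (low < high) {
        size_t      mid = low + (high - low) / 2;
        if (ends[mid] > index) {
            high = mid;
        } else {
            low = mid + 1;
        }
    }
    return low;
}
```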
I'm not 100% sure this is the best fix for the issue, but the way the
cbuf interpreter stack works (especially in the console code) meant that
the stack was built in the order opposite to how it could be safely
deleted with the existing function. Yeah, more leaks :P
Some of them, especially in rua_obj, were quite legitimate and even a
problem for thread-safety (rua_cmd is currently not thread-safe, but it
needs a lock, which I don't feel like doing at this stage).
Uncovered by the memory leak cleanup: the nodes were all being "linked"
to the first node, those nodes in between the first and last were
getting lost.
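For reference, a sketch of linking that keeps every node (node_t and link_nodes are hypothetical, not the code that was fixed): each node is attached through a tail pointer that advances, rather than through the first node.

```c
#include <stddef.h>

// Hypothetical node type.
typedef struct node_s {
    struct node_s *next;
    int         value;
} node_t;

// Chain an array of nodes in order. Each node is linked from the previous
// node rather than from the first, so none are dropped.
node_t *
link_nodes (node_t *nodes, size_t count)
{
    node_t     *head = NULL;
    node_t    **tail = &head;

    for (size_t i = 0; i < count; i++) {
        nodes[i].next = NULL;
        *tail = &nodes[i];
        tail = &nodes[i].next;
    }
    return head;
}
```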
This was mainly for the shutdown functions, thus allowing Sys_Shutdown
(and Sys_RegisterShutdown) to be per-thread, but it seemed like a good
idea to make everything per-thread.
Finally, hash links can be freed when the hash context is no longer
relevant. The context is created automatically when needed, and the
owner can delete the context when it's done with the relevant hash
tables.
It should have been this way all along, and it seems I thought they were
when I did rua_gui.c as it already freed its resource block, which would
have been a double free (oops). Fixes an invalid write when shutting
down progs in qwaq-cmd (relevant change not committed).
I tried out -std=c2x (doesn't work due to typeof (gcc bug?)) and
-std=gnu2x (still no #embed) and found that only regex.c was a problem
(nice), and now it no longer is a problem.
Render passes and subpasses are now mostly initialized, just command
buffers and frame buffer related info to go (including view/scissor for
pipelines).
Due to a typo in the list of extra property list items to add to the
symbol table (corrected), subsequent symbols were pointing to the wrong
memory address.
Not only does this quieten the validation layers, it ensures that all
the object handles are named and where they need to be. Also fixes only
one pipeline being created instead of the 15 or so.
It turns out labeled arrays don't work if structs aren't declared in the
right order (no idea what that is, though) as the struct might not have
been processed when the labeled array field is initialized. Thus, do a
pre-processing pass to set up any parse data prior to writing the
tables.
Most importantly, this cleans up creation of self-referencing symbol
tables from property lists, but adds in C-defined symbols as well. While
QFV_ParseRenderInfo is currently the only function that uses it, it
might be helpful in the future, especially as I clean up the other parse
support code.
The render passes seem to be created successfully, but pipelines fail
due to not having layout set, resulting in a segfault (bug in validation
layers?).
I don't remember why I kept the abbreviated configs for images and image
views, but it became such that I need to be able to specify them
completely. In addition, image views support external images.
The rest was just cleaning up after the changes to qfv_resobj_t.
Vulkan requires color blend state only for color attachments (it is
ignored otherwise), but it shouldn't be necessary to actually specify
the blend state: it should instead default to something reasonable.
Unfortunately, colorWriteMask affects the output even if blending is
disabled, so it must be initialized to something reasonable (r|g|b|a)
for when the default is used.
.dictionary can ask for standard parsing via a .parse key (value is
ignored currently).
Fields can use $auto to use standard parsing for that field.
If either is used, the plist field descriptors are written.
They're currently just stubs, but this gets the render info loading
working without any errors. The next step is to connect up pipelines and
create the image resources, then implementing the task functions will
have meaning.
This gets an empty (no tasks or pipelines connected) render context
initialized and available for other subsystems to register their task
functions. Nothing is using it yet, but the test parse of rp_main_def
fails gracefully (needs those tasks).
This just sets up the memory block and cexpr descriptors for the
parameters, parameter parsing is separate (and next). The parameters are
aligned to their size.
Needed to add the render passes plitem to the cexpr symbol table, too.
All that remains is to figure out how to deal with multiview (or really
@next) and get task parsing working.
A bunch of missed struct members, incorrect parse types, and some logic
errors in the parse setup. Still not working due to problems with
vectors from plist string references and some other errors, but getting
there.
This is most useful when parsing a labeled array where the key/value
pairs go into a simple array:
    key = value;
going to:
    struct foo {
        const char *key;
        enumtype value;
    };
This treats dictionary items as arrays ordered by key creation (ie, the
order of the key/value pairs in the dictionary is preserved). The label
is written to the specified field when parsing the struct. Both actual
arrays and single element "arrays" are supported.
This allows having sections in a spec used for things like `properties`
that have no corresponding fields in the actual struct: the field is
ignored when parsing and no cexpr field symbol is emitted.
There's still a lot of work to do, but the basics are in. The spec will
be parsed into info structs that can then be further processed to
generate all the actual structs, generally making things a little less
timing dependent (eg, image view info refers to its image by name).
The new render pass and subpass structs have their names mangled for now
until I can switch over to the new system.
Ruamoko currently doesn't support `const`, so that's not relevant, but
recognizing `char *` (via a hack to work around what looks like a bug
with type aliasing) allows strings to be handled without having to use a
custom parser. Things are still a little clunky for custom parsers, but
this seems to be a good start.
Using the typedef name makes using structs declared as
    typedef struct foo_s { ... } foo_t;
easier and cleaner. Sure, I could have written the "struct foo_s" for
the output name, but I'm much more likely to look for foo_t than foo_s
when checking the generated code.
While the old system did get things going, it felt clunky to set up,
especially when it came to variations on render passes (eg, flat vs
cube-mapped). Also, much of it felt inside-out, especially the
separation of pipelines and render passes: having to specify the render
pass and subpass in the pipeline spec made the spec feel overly coupled
to the render pass setup. While this is the case in Vulkan, it is not
reflected properly in the pipeline spec. The new system will adjust the
render pass and subpass parameters of the pipeline spec as needed,
making the pipeline specs more reusable, and hopefully less error prone
as the pipelines are directly referenced by the subpasses that use them.
In addition, subpass dependencies should be much easier to set up as
only the dependent subpass specifies the dependency and the subpass
source dependency is mentioned by name. Frame buffer attachments also
get a similar treatment.
The new spec "format" isn't quite finalized (needs to meet the enemy
known as parsing) but it feels like a good starting place.
I suspect this is a hold-over from before the bsp thread safety changes,
but with the nicely separated queues, it's easy to pass the sky surfaces
through the depth pass as well as the translucency pass (I think the
reason for that is lighting). This prevents bits of world being seen
through sky surfaces when the sky isn't fully opaque (like skysheet due
to the shortcuts in the shader).
Partial because frame buffer creation isn't handled yet (using six
layers), but using a layer-capable view and shaders doesn't cause
problems (other than maybe slightly slower code).
It turns out that my laptop doesn't do multiview properly (or I've
misconfigured something, later), but the biggest issue I had on my
desktop seems to be that I had the push constants wrong: fov in aspect,
time in fov, and I had degrees instead of radians (half angle) anyway.
There are some missing parts from this commit as these are the fairly
clean changes. Missing is building a separate set of pipelines for the
new render pass (might be able to get away from that), OIT heads texture
is flat rather than an array, view matrices aren't set up, and the
fisheye renderer isn't hooked up to the output pass (code exists but is
messy). However, with the missing parts included, testing shows things
mostly working: the cube map is rendered correctly even though it's not
displayed correctly (incorrect view). This has definitely proven to be a
good test for Vulkan's multiview feature (very nice).
While the cexpr parser itself doesn't support void functions, they have
their uses when used with the system, and mixing them into the list of
function overloads shouldn't break non-void functions.
At least with a push-parser, by the time the parser has figured out it
has an identifier, the lexer has forgotten the token, thus the annoying
and uninformative "undefined identifier " error messages. Since
identifiers should always have a value (and functions need a function
type), setting up a dummy symbol with just the identifier name
duplicated seems to do the trick. It is a bit wasteful of memory if
there are a lot of such errors between cmem resets, though.
I ran into the need to get at the label of labeled array element and the
best way seemed to be by setting the name field of the plfield_t item
passed to the parser function, and then found that PL_ParseSymtab
already does this. I then decided passing the array index would also be
good, and the offset field made sense.
Some of the queues don't get fully initialized, but rather than go
through everything making sure they do, it's just easier to zero the
whole lot at the beginning.
When bubbling a component past an empty range, there's no need for any
actual movement other than adjusting the range itself, and doing so
corrupts the sparse/dense array relationship. Fixes a segfault when
hiding the deathmatch overlay (that resulted from the change to using
canvases).
Canvas_SortComponentPool now takes the raw canvas component id as it is
specialized to the canvas subpools.
Canvas_SetLen resizes the root view and then updates the hierarchy for
every canvas in the system.
Canvas_InitSys sets up the component system with the systems it needs
(canvas, view, text). This is required to ensure view_href is just past
the canvas components as it is needed for retrieving the actual canvas
component (and thus sub-pool range ids) from arbitrary views in the
canvas.
Entities are fetched with the correct offset from the pool entities.
This will make it easy for client code to set up data needed by the
console before the console initializes. It already separates console
cvar setup and initialization, which has generally been a good thing.
The flashing pink around the Q menu cursor was caused by vulkan command
buffer writes and draw queue population being out of phase, which was
fixed by the recent screen update changes (specifically,
42441e87d4).
Rather important for debugging 2d stuff (draw's lines are 2d-only).
Other than translucent console, this gets the vulkan draw api back to
full operation.
This needed either more font ids to be supported, or small lump pics (up
to 32 x 32) to be loaded into the atlas. I went with both. The menus
don't use Draw_TextBox, but quakeworld's netgraph does.
This makes use of slice rendering to achieve the effective scaling, but
the slice data is created only when needed so pics that never use slices
don't waste 16 vertices.
The goal is to get vulkan relying on the "renderpass" abstraction, but
this gets vulkan up and running again, and even fixes the rendering
issues (in the end, getting canvas working wasn't required, but is still
planned).
This is a bit of a hack to allow me to work on vulkan's screen update
"pipeline" without having to mess with the other renderers, since it
turns out they're (currently) fundamentally incompatible.
When a pic needs dynamic vertices (eg, for sub-pics), a descriptor set
is allocated and updated if one has not been created for the pic. This
is done each frame: the descriptor sets are recycled (there currently is
rarely a need for more than a small handful of dynamic descriptors, so
64 should be plenty for now).
Unfortunately, due to the order of operations issue between draw items
getting queued and submitting commands to vulkan (the cause of the pics
not rendering correctly per 8fff71ed4b),
the validation layers complain (correctly) about the command buffers
being executed with updated descriptor sets. Getting the canvas system
up and running will fix that.
The pic is scaled to fill the specified rect (then clipped to the
screen (effectively)). Done just for the console background for now, but
it will be used for slice-pics as well.
Not implemented for vulkan yet as I'm still thinking about the
descriptor management needed for the instanced rendering.
Making the conback rendering conditional gave an approximately 3% speed
boost to glsl with the GL stub (~12200fps to ~12550fps), for either
conback render method.
The wording might seem a little odd, but cl_screen is really the full 2D
client HUD while the console is completely independent of the client and
shouldn't know that the client even exists. Ideally, the resize events
would be handled by the canvas system, to which end this is a small
step.
This fixes the broken dynamic lighting in fisheye rendering. It does
mean that frustum culling of lit surfaces needed to be removed, but if
not doing frustum culling on lit surfaces was good enough for a P90,
it's probably good enough for an i7-6850K.
They are usually larger images (eg, the main menu graphic) and thus make
a mess of the atlas (thus, making them separate means a smaller atlas
can be used). All sorts of things are in a mess (descriptor management,
subpic rendering not supported, wrong alpha value for the transparent
pixel), but this gets the basic loader going.
This just takes advantage of the dynamic verts for doing subpics. It's
not really the most optimal code as it has to write both the vertices
(64 bytes per quad) and the instances (24 bytes per quad), but that's
still better than the old 128 bytes per quad (and having a single
pipeline is nice).
The problem was that I had mixed up the purpose of the per-frame vertex
buffers and used them for the core quad data when they were meant for
subpic and the like, and forgotten about the static vertex buffer.
This gets at least conchars working (pic in general not tested yet).
Any performance gains will be utterly swamped by the deferred renderer,
but it will allow better control of quad render order by any client
code (and should be slightly better for simpler renderers when I get
support for them working).
Right now, plenty is broken (much of the higher level draw functions are
disabled, and pics don't render correctly), but this gets at least the
basics in so I'm not bouncing diffs around as much.
It turns out the slice pipeline is compatible with the glyph pipeline in
that its vertex attribute data is a superset (just the addition of the
offset attributes). While the queues have yet to be merged, this will
eventually get glyphs, sliced sprites, and general (static) quads into
the one pipeline. Although this is slightly slower for glyph rendering
(due to the need to pass an extra 8 bytes per glyph), this should be
faster for quad rendering (when done) as it will be 24 bytes per quad
instead of 32 bytes per vertex (ie, 128 bytes per quad), but this does
serve as a proof of concept for doing quads, glyphs and sprites in the
one pipeline.
The main reason I had created it in the first place was I hadn't thought of
using image view swizzles to handle coverage-alpha textures (for
monochrome glyphs), and for whatever reason also had the texture in a
different binding slot to the twod fragment shader. With both issues out
of the way, there's no reason to have an almost identical (just some
naming) shader just for glyphs.
With an eye towards merging the 2d pipelines as much as possible, I
found that the glyph and basic 2d quad texture descriptors were in
different slots for no reason I can think of. Having them in the same
slot would mean I could use the same fragment shader for all 2d
pipelines (though the plan is to get it down to two: (sliced) quads and
lines).
I hadn't noticed the problem until playing with early fragment tests for
the sprite fragment shaders, but passing data that expects triangle
strips to a pipeline that expects triangle lists doesn't work too well
when drawing quads.
This is the beginning of supporting 2d rendering in 3d space. The idea
is that a canvas can be in 2d orthographic space (not attached to any
entity with a 3d transform), or in 3d perspective space (attached to an
entity with a 3d transform, either as a child of the camera, or of some
object in 3d space).
It will replace the current HUD code when it's working.
I found I needed the subrange start as well as the end, but I liked that
the subpools themselves used only the end of the range, so switching to
just a uint32_t for the value and adding a function to return a tuple
made sense. I had kept the struct because I thought I might want to
store additional information (eg, the entity "owning" the subpool), but
found that I didn't need such information as the systems using subpools
that way would have access to the entity by other means.
Interestingly, the change found a bug in subpool creation: I really
don't know why things worked before, but they work better now :)
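A sketch of recovering the full range when only the ends are stored (range_t and subpool_range are hypothetical names, not the ECS API): the start of a range is the end of the one before it, or 0 for the first.

```c
#include <stdint.h>

// Hypothetical tuple type for a subpool's range.
typedef struct { uint32_t start, end; } range_t;

// Only the end of each range is stored; the start of a range is the end
// of the one before it (or 0 for the first range).
range_t
subpool_range (const uint32_t *ends, uint32_t id)
{
    return (range_t) { id ? ends[id - 1] : 0, ends[id] };
}
```

An empty subpool simply comes out with start equal to end.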
Subpools are for grouping components by some criterion. Any component
that has a rangeid callback will be grouped with other components that
return the same render id. Note that the ordering of components within a
group will be affected by adding a component into a group that comes
before that group (or removing a component).
Component pools can have multiple groups, added and removed dynamically,
but removing a group should (currently) be done only when empty.
While "set" is a tad strong (there's just the one component for now), I
had missed the changes when adding ECS systems. Fixes the segfault at
the end of demo1 (ie, when any center text is printed).
Instead of creating new entities for the text views. This approximately
halves the number of entities required to display flowed text, but also
tests the ability to have an entity in multiple hierarchies (the goal of
the ECS component and system changes).
Marking them as cached means that they'll be "uncached" instead of
destroyed when freed, which would not be a particularly good thing. I
have no memory as to how I found this as I found the change in my git
stash.
While this does require an extra call after registering components, it
allows for multiple component sets (ie, sub-systems) to be registered
before the component pools are created. The base id for the registered
component set is returned so it can be passed to the subsystem as
needed.
There's now a main ecs.h file that includes the sub-system headers,
removing the need to explicitly include several header files, and the
sub-systems are less cluttered.
This means that the component id used for hierarchy references must be
passed to Hierarchy_New and Hierarchy_Copy, but does allow an entity to
have more than one hierarchy, which is useful for canvases (hierarchies
of views) in the 3d world (the canvas root would have a 3d hierarchy
reference and a 2d (view) hierarchy reference).
It seems that the mouse escaping the barriers requires some combination
of hitting two at once, and holding your mouth just right (something
about sliding the mouse up and down one barrier near the other).
However, sending the mouse back to the center of the screen when it
touches a barrier makes such sliding impossible.
This seems to fix #38
I obviously need a better way to test legacy code because the fix for
unsigned-int behavior with clang broke mouse warping when using
XGrabPointer instead of XInput2's XIGrabEnter.
The separation now uses height above (right of) the base line, and depth
below (left of) the base line. This puts the text exactly where I want
it, but there's still the problem of uneven line spacing caused by
descenders and ascenders. However, I suspect that's more up to the
text/font handling code to get the boxes right (maybe set spaces to have
the right dimensions?).
The main problem was the confusion about the coordinates within a single
glyph, and thus the glyph's position within the view's box. With this,
flowed text works fairly well except for some issues with spacing
between lines (which I think is due to the flow code not having been
tested with offset boxes).
While Draw_Glyph does draw only one glyph at a time, it doesn't shape
the text every time, so is a major win for performance (especially
coupled with pre-shaped text).
Font cannot be overridden yet, but script attributes (language, script
type, direction) and features can be set at all three levels in a
passage. Attributes on the root level act as defaults for the paragraph
and word levels, and paragraph attributes act as defaults for the word
level.
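The defaulting can be pictured as a simple fallback chain. In this sketch (text_attr_t and the functions are hypothetical, and features are left out for brevity), a NULL field means "not set at this level":

```c
#include <stddef.h>

// Hypothetical attribute block; NULL means "not set at this level".
typedef struct {
    const char *language;
    const char *script;
    const char *direction;
} text_attr_t;

// Pick the most specific value that is set.
static const char *
pick (const char *word, const char *para, const char *root)
{
    return word ? word : para ? para : root;
}

// Resolve the attributes for a word: word settings override paragraph
// settings, which in turn override the root defaults.
text_attr_t
resolve_attributes (const text_attr_t *root, const text_attr_t *para,
                    const text_attr_t *word)
{
    return (text_attr_t) {
        .language  = pick (word->language, para->language, root->language),
        .script    = pick (word->script, para->script, root->script),
        .direction = pick (word->direction, para->direction, root->direction),
    };
}
```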
Passage_Delete needs to check if the hierarchy is valid as no text may
have been added, which results in a null pointer for the hierarchy.
Text shaping needs to set language etc every time it resets the buffer.
This causes some problems with linking if libQFgui is linked with
libQFrenderer (which is necessary in the long run), but it seems
everything gets away with it for now (which, tbh, I don't like).
And add a function to process a passage into a set of views with glyphs.
The views can be flowed: they have flow gravity and their sizes set to
contain all the glyphs within each view (nominally, words). Nothing is
tested yet, and font rendering is currently broken completely.
Font and text handling is very much part of user interface and at least
partially independent of rendering, but does fit in better with GUI than
general UI (ie, both graphics and text mode), thus libQFgui as well as
libQFui are built in the ui directory.
The existing font related builtins have been moved into the ruamoko
client library.
I had done the loader for the GPU renderers, so the CPU renderer didn't
draw the characters transparently. Fixes the pink block in my ruamoko
test scene (due to the notify text area).
While it doesn't really make any difference to the texture upload (8-bit
is 8-bit), and the sampler is in control of the interpretation, this
makes vulkan more consistent with the specification of the glyph
texture.
In theory, it supports all the non-palette formats, but only luminance
and alpha (tex_l and tex_a) have been tested. Fixes the rather broken
glyph rendering.
World scale can only be approximate if non-uniform scales and
non-orthogonal rotations are involved, but it is still useful
information sometimes.
However, the calculation is expensive (it needs a square root) and the
result is used only by the legacy-GL alias model renderer, so remove
world scale as a component and instead calculate it on an as-needed
basis rather than for every transform.
Thanks to the 3d frame buffer output being separate from the swap chain,
it's possible to have a different frame buffer size from the window
size, allowing for a smaller buffer and thus my laptop can cope (mostly)
with the vulkan renderer.
The escape was actually harmless as the buffers would not be read due to
the particle count being 0 (thus why the buffers were at the end of the
staging buffer: no space was allocated for them, only for the system
buffer, but their offsets were just past the system buffer). However,
the validation layers quite rightly did not like that. Thus, the two
buffers are pointed to the system buffer so all three descriptors are
always valid.
Where too far is 1024 units as that is the maximum supported, or the
radius. The change to using unsigned for the distances meant the simple
checks missed the effective max dist going negative, thus leading to a
segfault.
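A minimal sketch of the failure mode, assuming the limit is the smaller
of the radius and 1024 (in_range, remaining_dist and MAX_DIST are
hypothetical names, not the actual code): with unsigned operands the
subtraction wraps instead of going negative, so the comparison has to
come first.

```c
#include <stdbool.h>

#define MAX_DIST 1024u  /* maximum supported distance, per the text */

/* True when something at distance dist is within reach of a source with
   the given radius. Assumes the limit is min (radius, MAX_DIST). */
bool
in_range (unsigned dist, unsigned radius)
{
    unsigned maxdist = radius < MAX_DIST ? radius : MAX_DIST;
    /* Compare before subtracting: maxdist - dist wraps to a huge value
       when dist > maxdist, so a "remaining > 0" check always passes. */
    return dist <= maxdist;
}

/* Remaining reach, clamped at zero instead of wrapping around. */
unsigned
remaining_dist (unsigned dist, unsigned radius)
{
    unsigned maxdist = radius < MAX_DIST ? radius : MAX_DIST;
    return dist < maxdist ? maxdist - dist : 0;
}
```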
I had debated putting the blending in the compose subpass or a separate
pass and originally went with the separate pass, but it turns out that
removing the separate pass gains 1-3% (5-15/545 fps in a timedemo of
demo1).
viewstate's time is from cl.time which is not what's used to set
last_servermessage (that uses realtime). After careful investigation, I
found that cl.time is not at all suitable and that the original id code
used realtime (I think it was just me being lazy when I merged the
code). Fixes the stuck net icon.
quake changes rocket and grenade models to explosion models, but
quakeworld does not. This resulted in nq drawing two explosion sprites
instead of one. Separating the types allows nq to skip adding a sprite
for the explosion.
It's a bit flaky for particles, especially at higher frame rates, but
that's due to supporting only 64 overlapping pixels. A reasonable
solution is probably switching to a priority heap for the "sort" and
upping the limit.
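For reference, a heap-based sort of fragments by depth could look
something like the following. This is a CPU-side sketch in C, not shader
code, and frag_t, sift_down and frag_sort are hypothetical names; it
only illustrates the technique being considered.

```c
#include <assert.h>

/* Hypothetical fragment record: depth plus a packed color. */
typedef struct {
    float    depth;
    unsigned color;
} frag_t;

/* Restore the max-heap property for the subtree rooted at start,
   considering only elements in [0, end). */
static void
sift_down (frag_t *f, int start, int end)
{
    int root = start;
    while (2 * root + 1 < end) {
        int child = 2 * root + 1;
        if (child + 1 < end && f[child].depth < f[child + 1].depth) {
            child++;    /* use the deeper of the two children */
        }
        if (f[root].depth >= f[child].depth) {
            return;
        }
        frag_t tmp = f[root];
        f[root] = f[child];
        f[child] = tmp;
        root = child;
    }
}

/* Sort fragments nearest first (ascending depth) using heap sort. */
void
frag_sort (frag_t *f, int count)
{
    assert (count >= 0);
    for (int i = count / 2 - 1; i >= 0; i--) {
        sift_down (f, i, count);        /* build the max-heap */
    }
    for (int end = count - 1; end > 0; end--) {
        frag_t tmp = f[0];              /* move current maximum to the end */
        f[0] = f[end];
        f[end] = tmp;
        sift_down (f, 0, end);
    }
}
```

Heap sort runs in place with O(n log n) comparisons, which matters more
as the per-pixel limit goes up.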
This required making the texture set accessible to the vertex shader
(instead of using a dedicated palette set), which I don't particularly
like, but I don't feel like dealing with the texture code's hard-coded
use of the texture set. QF style particles need something mostly for the
smoke puffs as they expect a texture.
It doesn't want to work on my nvidia (or more recent sid?) and doesn't
seem to be necessary. The problem may be multiple event sets before the
first wait, but investigation can wait for now.
This is probably the biggest reason I had problems with particles not
updating correctly: the descriptors were generally not pointing to
where the data actually was in the staging buffer.
I don't yet know whether they actually work (not rendering yet), but the
system isn't locking up, and shutdown is clean, so at least resources
are handled correctly.
Although it works as intended (tested via hacking), it's not hooked up
as the current frame buffer handling in r_screen is not readily
compatible with how vulkan output is handled. This will need some
thought to get working.
This splits up render pass creation so that the creation of the various
resources can be tailored to the needs of the actual render pass
sub-system. In addition, it gets window resizing mostly working (just
some problems with incorrect rendering).
If the result object type pointer is null, then the parsed result type
and value pointers are written directly to the result object rather than
testing the parsed result type against the object type and copying the
parsed result value data to the location of the object value. It is then
up to the caller to check the type and copy the value data.
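Roughly, the two paths look like this. The types here (ex_type_t,
ex_value_t) and store_result are simplified, hypothetical stand-ins for
illustration, not the actual cexpr structures:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the expression system's type and value. */
typedef struct {
    const char *name;
    size_t      size;
} ex_type_t;

typedef struct {
    ex_type_t *type;
    void      *value;
} ex_value_t;

/* Store a parsed result into result.
   Returns 1 on success, 0 on a type mismatch. */
int
store_result (ex_value_t *result, const ex_value_t *parsed)
{
    assert (result && parsed && parsed->type);
    if (!result->type) {
        /* Untyped result: hand back the parsed type and value pointers.
           The caller checks the type and copies the data itself. */
        result->type = parsed->type;
        result->value = parsed->value;
        return 1;
    }
    if (result->type != parsed->type) {
        return 0;
    }
    /* Typed result: copy the value data into the caller's storage. */
    memcpy (result->value, parsed->value, parsed->type->size);
    return 1;
}
```

Note that in the untyped case the returned value pointer refers to the
parser's storage, so the caller should copy the data before it is
reused.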
It turns out the semaphore used for vkAcquireNextImageKHR may be left in
a signaled state for VK_ERROR_OUT_OF_DATE_KHR. While it seems to be
possible to clear the semaphore using an empty queue submission,
destroying and recreating the semaphore works well.
Still have problems with the frame buffer after window resize, though.
Swap chain acquisition is part of final output handling. However, as the
correct frame buffers are required for the render passes, the
acquisition needs to be performed during the preoutput render pass.
Window resize is still broken, but this is a big step towards fixing it.
This is the minimum maximum count for sampled images, and with layered
shadow maps (with a minimum of 2048 layers supported), that's really way
more than enough.
I guess nvidia gives a non-srgb format as the first in the list, but my
laptop gives an srgb format first, thus the unexpected difference in
rendering brightness. Hard-coding BGRA isn't any better, but it will do
for now.
Things are a bit of a mess with interdependence between sub-module
initialization and render pass initialization, and window resizing is
broken, but the main render pass rendering to an image that is then
post-processed (currently just blitted) is working. This will make it
possible to implement fisheye and water warp (and other effects, of
course).
When working, this will handle the output to the swap-chain images and
any final post-processing effects (gamma correction, screen scaling,
etc). However, currently the screen is just black because the image
for getting the main render pass output isn't hooked up yet.
Now each (high level) render pass can have its own frame buffer. The
current goal is to get the final output render pass to just transfer the
composed output to the swap chain image, potentially with scaling (my
laptop might be able to cope).
While the HUD and status bar don't cut out a lot of screen (normally),
they might start to make a difference when I get transparency working
properly. The main thing is this is a step towards pulling the 2d
rendering into another render pass so the main deferred pass is
world-only.
Using swizzles in an image view allows the same shader to be used with
different image "types" (eg, color vs coverage).
Of course, this needed to abandon QFV_CreateImageView, but that is
likely for the best.
It turns out that nearest filtering doesn't need any offsets to avoid
texel leaks so long as the screen isn't also offset. With this, the 2d
rendering looks good at any scale (minus the inherent blockiness).
It seemed like a good idea at the time, but it exacerbates pixel leakage
in atlas textures that have no border pixels (even in nearest sampling
modes).
The rest of the system won't add one automatically (since entity
creation no longer does), but the alias and iqm rendering code expect
there to be one. Fixes a segfault when starting a scene (demo etc).
There's no API yet as I need to look into the handling of qpic_t before
I can get any of this into the other renderers (or even vulkan, for that
matter).
However, the current design for slice rendering is based on glyphs (ie,
using instances and vertex pulling), with 3 strips of 3 quads, 16 verts,
and 26 indices (2 reset). Hacky testing seems to work, but real tests
need the API.
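The counts work out as 3 strips * 8 indices + 2 restarts = 26. A sketch
of generating such an index list, assuming a row-major 4x4 vertex grid
and 16-bit indices with 0xffff as the restart value (the names and the
vertex ordering are assumptions, not the actual layout):

```c
#include <assert.h>
#include <stdint.h>

#define SLICE_VERTS   16        /* 4x4 grid of vertices */
#define SLICE_INDICES 26        /* 3 strips * 8 indices + 2 restarts */
#define RESTART_INDEX 0xffffu   /* primitive restart for 16-bit indices */

/* Fill indices with three triangle strips covering a 4x4 vertex grid
   (vertex = row * 4 + col). Returns the number of indices written. */
int
build_slice_indices (uint16_t indices[SLICE_INDICES])
{
    int n = 0;
    for (int row = 0; row < 3; row++) {
        if (row > 0) {
            indices[n++] = RESTART_INDEX;   /* begin a new strip */
        }
        for (int col = 0; col < 4; col++) {
            indices[n++] = (uint16_t) (row * 4 + col);        /* top */
            indices[n++] = (uint16_t) ((row + 1) * 4 + col);  /* bottom */
        }
    }
    assert (n == SLICE_INDICES);
    return n;
}
```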
I don't know why it didn't happen during the demo loop, but going from
the start map to e1m1 caused a segfault due to the efrags for a lava
ball getting double freed (however, I do think it might be because the
ball passed through at least two leafs, while entities in the demos did
not). The double free was because SCR_NewScene (indirectly) freed all
the efrags without removing them from entities, and then the client code
deleting the entities caused the visibility components to get deleted
and thus the efrags freed a second time. Using ECS_RemoveEntities on the
visibility component ensures the entities don't have a visibility
component to remove when they are later deleted.
While simple component pools can be cleared simply by zeroing their
counts, ones that have a delete function need that function to be called
for all the components in the pool otherwise leaks can happen.
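In sketch form (comp_pool_t, pool_clear and the named_t example
component are hypothetical simplifications, not the actual ECS
structures), clearing a pool then looks like:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical, simplified component pool. */
typedef struct {
    void    *data;                  /* packed component storage */
    size_t   size;                  /* size of one component in bytes */
    uint32_t count;                 /* number of live components */
    void   (*destroy) (void *comp); /* optional per-component cleanup */
} comp_pool_t;

/* Example component that owns heap memory and so needs cleanup. */
typedef struct {
    char *name;
} named_t;

void
named_destroy (void *comp)
{
    named_t *n = comp;
    free (n->name);
    n->name = NULL;
}

/* Clear the pool, running the delete function on every component first
   so that anything the components own is released. */
void
pool_clear (comp_pool_t *pool)
{
    assert (pool);
    if (pool->destroy) {
        uint8_t *comp = pool->data;
        for (uint32_t i = 0; i < pool->count; i++) {
            pool->destroy (comp + i * pool->size);
        }
    }
    pool->count = 0;    /* simple pools only need this */
}
```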