Using set iterators can be quite a lot faster for sparse sets due to the
function call overhead in testing each element. Times for ad_tears
dropped from about 1200us to about 670us (hard to say due to there being
only 3 data points and a lot of noise in the time).
If a step has process tasks, any render or compute
pipelines/renderpasses are **not** run automatically: the idea is the
process tasks need to run the relevant pipelines in a custom manner but
needs the objects to be created.
The recent light changes highlighted that the renderer does not own the
efrags (segfault in qwaq when shutting down my test scene). After
digging through the history of efrag clearing, it turns out that the
renderer never owned them, I just didn't understand the concept of
scenes at the time that I moved efrags into the renderer.
This makes a pretty significant difference: 8-25% for demo1, demo2,
demo3 and bigass1. Many thanks to Nick Alger
(https://stackoverflow.com/questions/5122228/box-to-sphere-collision).
It took me a moment to recognize that it's the Minkowski difference (and
maybe GJK?). Of course, I adapted it for simd.
This eliminates the O(N^2) (N = map leaf count) operation of finding
visible lights and will later allow for finer culling of the lights as
they can be tested against the leaf volume (which they currently are
not as this was just getting things going). However, this has severely
hurt ad_tears' performance (I suspect due to the extreme number of
leafs), but the speed seems to be very steady. Hopefully, reconstructing
the vis clusters will help (I imagine it will help in many places, not
just lights).
I guess I wasn't sure how to find all the allocated entities from within
the registry, but it turned out to be trivial. This takes care of leaked
static entities (and, in a later commit, leaked light entities, which is
how I found the problem).
The grid calculations are modified from those of Inigo Quilez
(https://iquilezles.org/articles/filterableprocedurals/), but give very
nice results: when thin enough, the lines fade out nicely instead of
producing crazy moire patterns. Though currently disabled, the default
planes are the xy, yz and zx planes with colored axes.
Based on the article
(https://developer.nvidia.com/content/depth-precision-visualized), this
should give nice precision behavior, and removes the need to worry about
large maps getting clipped. If I'm doing my math correctly, despite
being reversed, near precision is still crazy high. And (thanks to the
reversed depth) about a quarter of a unit (for near clip of 4) out at 1M
unit distance.
This seems to be more for legacy X11 (ie, without fixes etc), but
fullscreen really shouldn't affect grabbing directly (rather, it should
be up to the client whether grabbing (and thus warping) is enabled at
all.
con_linewidt starts out as 0, which leads to bad results for the initial
widths of input lines and later calculations. However, really, they
probably shouldn't be using size_t for the width, but this is a nice
quick fix.
Due to doing most of my testing using the demos, I hadn't noticed the
double-draw until flying around with the debug camera (and it showed as
a weird shimmer behind the sky layers).
Panels can be anchored to a widget in another hierarchy, allowing for
things like cascading menus. They can also be extended via referencing
them by name, allowing for subsystems to add items to an already panel
(eg, extending a menu).
This prevent the layout system from repositioning the text view and thus
breaking text-shaping. Now Tengwar Telcontar looks much more balanced in
the widgets.
The lights debug is from the light splat experiment (this is why I kept
the code), and the bsp debug is based on that. Both currently disabled
for now until I get UI controls in.
SCR_UpdateScreen_Legacy now takes only the screen functions pointer (it
didn't need camera or realtime), and the camera sizzle code has been
moved into one place to make cleaning it up easier (when I get around to
auditing AngleVectors etc).
Skipping the root view (widget) sort of made sense before windows became
separate canvases as there was only the one hierarchy, but doing so
prevented windows (panels) from fitting themselves to their children.
However, now I need to think of a good way of specifying a minimum size
for panels.
WIdgets can't possibly be active when the entire UI is hidden, and
resetting the active widget when hiding the UI helps when the state gets
broken due to widget id conflicts.
The intent is to use them for menus, tooltips and anything else along
those lines, but windows was a good starting point (and puts a border
along the top of the window too).
It's not perfect as the first N expanding children get grown by 1 pixel
regarless of weight, but it's much better than leaving a (possibly quite
large) gap at the edge of the layout.
I'm not sure this is what I want, especially in the long run, but it
does make simple windows much easier to create (and not look broken due
to being specified too small).
Canvas draw order is sorted by group then order within the group. As a
fallback, the canvas entity id is used to keep the sort stable, but
that's only as stable as the ids themselves (if the canvases are
destroyed and recreated, the ids may switch around).
This has use when the order of components in the pool affects draw order
(or has other significance), especially at the subpool level. I plan to
use it for fixing overlapping windows in imui.
Shaped text is cached using all the shaping parameters as well as the
text itself as a key. This makes text shaping a non-issue for imui when
the text is stable, taking my simple test from 120fps to 1000fps
(optimized build).
As I had long suspected, building large hierarchies is fiendishly
expensive (at least O(N^2)). However, this is because the hierarchies
are structured such that adding high-level nodes results in a lot of
copying due to the flattened (breadth-first) layout (which does make for
excellent breadth-first performance when working with a hierarchy).
Using tree mode allows adding new nodes to be O(1) (I guess O(N) for the
size of the sub-tree being added, but that's not supported yet) and
costs only an additional 8 bytes per node. Switching from flat mode to
tree mode is very cheap as only the additional tree-related indices need
to be fixed up (they're almost entirely ignored in flat mode). Switching
from tree to flat mode is a little more expensive as the entire tree
needs to be copied, but it seems to be an O(N) (size of the tree).
With this, building the style editor window went from about 25% to about
5% (and most of that is realloc!), with a 1.3% cost for switching from
tree mode to flat mode.
There's still a lot of work to do (supporting removal and tree inserts).
There's still the problem with unused variables when building for
windows because of vulkan debug stuff, but this fixes the important
errors. It actually still works (at least under wine).
By default, horizontal and vertical layouts expand to fill their parent
in their on-axis direction (horizontally for horizontal layouts), but
fit to their child views in their off-axis.
Flexible space views take advantage of auto-expansion, pushing sibling
views such that the grandparent view is filled on the parent view's
on-axis, and the parent view is filled by the space in the parent view's
off-axis. Flexible views currently have a background fill, allowing them
to provide background filling of the overall view with minimal overdraw
(ancestor views don't need to have any fill at all).
Removing a hierarchy from an entity can result in a large number of
component removes in the same pool, thus changing index of the lest
element of the pool. This *seems* to fix the memory corruption I've been
experiencing with the debug UI.
The biggest change was splitting up the job resources into
per-render-pass resources, allowing individual render passes to
reallocate their resources without affecting any others. After that, it
was just getting translucency and capture working after a window resize.
I had forgotten that Hash_NewTable checked the hashctx parameter, so
calling Hash_NewTable in the struct initializer meant the hasctx was
uninitialized.
This takes care of element order stability. It did need reworking the
mouse tracking code (including adding an active flag to views), but now
buttons seem to work correctly.
It looks horrible due to the lack of lighting etc, but it's good enough
for basic testing, especially of my render job design (that passed with
flying colors).
It's there for a reason :P. Fixes most of the really bad behavior after
disabling some widgets (re-layout isn't working at all, though, and
adding the widgets back again puts them in the wrong place).
Using label + key_offset in both imui_state_getkey and the call to
Hash_Find greatly simplifies the logic of using the correct key. Fixes
an ever-growing set of buttons when using separators (hmm, I think this
means pruning isn't working correctly).
TextContent seems redundant at this stage since a text view is always
sized to its content, and PercentOfParent doesn't work yet. Pixels
definitely works and Null seems to work in that it does no sizing or
positioning. Vertical layout is supported but not yet tested, similar
for ChildrenSum, but I can have two buttons side by side.