This was mainly for the shutdown functions, thus allowing Sys_Shutdown
(and Sys_RegisterShutdown) to be per-thread, but it seemed like a good
idea to make everything per-thread.
Finally, hash links can be freed when the hash context is no longer
relevant. The context is created automatically when needed, and the
owner can delete the context when it's done with the relevant hash
tables.
While the cexpr parser itself doesn't support void functions, they have
their uses when used with the system, and mixing them into the list of
function overloads shouldn't break non-void functions.
At least with a push-parser, by the time the parser has figured out it
has an identifier, the lexer has forgotten the token, thus the annoying
and uninformative "undefined identifier " error messages. Since
identifiers should always have a value (and functions need a function
type), setting up a dummy symbol with just the identifier name
duplicated seems to do the trick. It is a bit wasteful of memory if
there are a lot of such errors between cmem resets, though.
I ran into the need to get at the label of a labeled array element and
the best way seemed to be by setting the name field of the plfield_t
item passed to the parser function, and then found that PL_ParseSymtab
already does this. I then decided passing the array index would also be
good, and the offset field made sense.
If the result object type pointer is null, then the parsed result type
and value pointers are written directly to the result object rather than
testing the parsed result type against the object type and copying the
parsed result value data to the location of the object value. It is then
up to the caller to check the type and copy the value data.
This fixes maplist showing only those maps in the user directory.
However, no checking is done for duplicate files due to earlier search
paths overriding later paths.
While chatting about utf-8, I noticed that QF doesn't ensure the input
sequences are the shortest possible encodings. It turns out that the
check is easy in that only the second byte needs to be checked if the
first byte's data bits are 0, and the second byte must have a data value
larger than that representable by the next lower leading byte.
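A minimal sketch of such a shortest-form check, not the actual QF code:
the function name is hypothetical, and treating the two-byte case by
looking at the lead byte alone is my own framing of the rule above.

```c
#include <stdbool.h>
#include <stddef.h>

// Hypothetical helper: returns true if the multi-byte UTF-8 sequence
// starting at s is an overlong (non-shortest) encoding. Only the first
// two bytes are ever inspected.
static bool
utf8_is_overlong_sketch (const unsigned char *s, size_t len)
{
	if (len < 2) {
		return false;
	}
	unsigned char b0 = s[0];
	unsigned char b1 = s[1];
	if ((b0 & 0xe0) == 0xc0) {
		// 2 bytes: lead data of 0 or 1 means the value is below 0x80,
		// which fits in a single byte
		return (b0 & 0x1e) == 0;
	}
	if ((b0 & 0xf0) == 0xe0) {
		// 3 bytes: lead data 0 and second data < 0x20 means the value
		// is below 0x800, which fits in 2 bytes
		return b0 == 0xe0 && (b1 & 0x3f) < 0x20;
	}
	if ((b0 & 0xf8) == 0xf0) {
		// 4 bytes: lead data 0 and second data < 0x10 means the value
		// is below 0x10000, which fits in 3 bytes
		return b0 == 0xf0 && (b1 & 0x3f) < 0x10;
	}
	return false;
}
```

Continuation-byte validation and range checks are left out on purpose;
the sketch covers only the shortest-encoding test.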
I must have forgotten about the SYS_DeveloperID_... enum values when I
wrote that code, because relying on the line number is not really for
the best.
Just head and tail are atomic, but it seems to work nicely (at least on
intel). I actually had more trouble with gcc (due to accidentally
testing lock-free with the wrong ring buffer... oops, but yup, gcc will
happily optimize your loop to spin really really fast). Also served as a
nice test for C11 threading.
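For reference, a minimal single-producer/single-consumer sketch with
only head and tail atomic, using C11 <stdatomic.h>. All names and the
capacity are hypothetical; this is not the QF ring buffer itself.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define RB_SKETCH_SIZE 8	// hypothetical; one slot is always kept empty

typedef struct {
	int           data[RB_SKETCH_SIZE];
	atomic_size_t head;		// written only by the producer
	atomic_size_t tail;		// written only by the consumer
} ringbuf_sketch_t;

static void
rb_init_sketch (ringbuf_sketch_t *rb)
{
	atomic_init (&rb->head, 0);
	atomic_init (&rb->tail, 0);
}

// Producer side: returns false if the buffer is full.
static bool
rb_write_sketch (ringbuf_sketch_t *rb, int value)
{
	size_t head = atomic_load_explicit (&rb->head, memory_order_relaxed);
	size_t next = (head + 1) % RB_SKETCH_SIZE;
	if (next == atomic_load_explicit (&rb->tail, memory_order_acquire)) {
		return false;
	}
	rb->data[head] = value;
	// publish the slot only after the data has been written
	atomic_store_explicit (&rb->head, next, memory_order_release);
	return true;
}

// Consumer side: returns false if the buffer is empty.
static bool
rb_read_sketch (ringbuf_sketch_t *rb, int *value)
{
	size_t tail = atomic_load_explicit (&rb->tail, memory_order_relaxed);
	if (tail == atomic_load_explicit (&rb->head, memory_order_acquire)) {
		return false;
	}
	*value = rb->data[tail];
	// release the slot only after the data has been read
	atomic_store_explicit (&rb->tail, (tail + 1) % RB_SKETCH_SIZE,
						   memory_order_release);
	return true;
}
```

With plain (non-atomic) head and tail, the compiler is allowed to hoist
the load out of a wait loop, which is the kind of really fast spin
mentioned above.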
Because the calculation didn't take the hunk header size (which is not
included in the hunk size) into account, the conversion to MB was one
short and thus the rounding up to the next 8 MB boundary was giving the
current total hunk size (ie, the already given size). Most confusing to
a user ("But I already asked for 128MB!").
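The arithmetic can be sketched as below. The function name and the
exact rounding expression are assumptions made for illustration; only
the "header not counted, so one MB short" part comes from the actual
bug.

```c
#include <stddef.h>

#define MB_SKETCH ((size_t) 1 << 20)

// Hypothetical: suggest the next 8 MB boundary above the current total.
// Passing header_size = 0 models the old behaviour, where the header
// was left out and the truncating conversion to MB came up one short.
static size_t
suggest_hunk_mb_sketch (size_t hunk_size, size_t header_size)
{
	size_t      mb = (hunk_size + header_size) / MB_SKETCH;
	return (mb + 8) & ~(size_t) 7;
}
```

For a 128 MB hunk with a (made up) 64-byte header, the usable size is
128 MB minus 64 bytes: without the header that truncates to 127 MB and
rounds up to 128, with the header it is 128 MB and rounds up to 136.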
It turns out that copying just "unknown" is a significant performance
hit when doing over 100M allocations. Making Hunk_RawAlloc the core and
initializing the name field with a single 0 shaved about a second off
`qfvis gmsp3v2.bsp` (from about 39s to about 38s).
My reason for using Hunk_HighAlloc for allocating cache blocks was to
lock them down so they were safe for the sound mixer to access when
running in a real-time thread. However, I had never tested under tight
memory constraints, which proved that the design (or maybe just
implementation) just wasn't robust. However, now that sounds are loaded
into a completely separate region, it's safe to put the cache back to
its original behaviour (still with 64-byte alignment and such, of
course). This will even allow the high hunk to be used again, though it
effectively was anyway with Hunk_TempAlloc.
Getting the tag is possibly useful in general and definitely in
debugging. Setting, I'm not so sure as it should be done when allocated,
but that's not always possible.
Also, correct the return type of z_block_size, though it affected only
Z_Print. While an allocation larger than 4GB is... big for zone, the
blocks do support it, so printing should too.
Since Ruamoko got vector types, zone's 8-byte alignment was no longer
sufficient due to hardware-enforced alignment requirements of the
underlying vector operations.
Fixes #28.
And use it for Ruamoko object reference counts.
I need reference counts for dealing with block sound buffers since they
can be shared by many channels. I figured I'd take care of Ruamoko's
reference count location at the same time.
Fixes #27.
The tests fail as they exercise how the cache *SHOULD* work rather than
how it does now.
The tests do currently pass for the pending work I've done on the cache
system, but while working on it, I remembered why I reworked cache
allocation...
The essential problem is that sounds are loaded into the cache, which is
fine for synchronous output targets, but has proven to be a minefield
for asynchronous output targets (JACK, ALSA).
The reason for the minefield is the hunk takes priority over the cache,
and is free to move cache blocks around, and *even dispose of them
entirely* in order to satisfy memory allocations from either end of the
hunk. Doing this in an entirely single-threaded process (as DOS Quake
was) is perfectly safe, as the users of the cache just reload the
pointer each time, and bail if it's null (meaning the block has been
freed), or even cause the data to be reloaded if possible (I'm a little
fuzzy on the details for that as I didn't write that code). However, in
multi-threaded code, especially real-time (JACK, possibly ALSA), it's a
recipe for disaster. While the 4cab5b90e6 commit was a (mostly)
successful attempt to mitigate the problem by allocating the cache
blocks from the high-hunk (thus minimizing any movement caused by
low-hunk allocations), it resulted in cache allocations and regular
high-hunk allocations somehow getting intertwined: while investigating
just how much memory ad_tears
needs (somewhere between 192MB and 256MB), I got "trashed sentinel"
errors and upon investigation, I found what looks very suspiciously like
audio data written across a hunk control block.
I've decided that the cache allocation *algorithm* should be reverted to
how it was originally designed by Id (details will remain "modern"), but
while working on the tests, I remembered why I had done the changes in
the first place (above story). Thus the work on reverting the cache
allocation can't go in until I get sound memory management independent
of the cache. The tests are going in now so I have a constant reminder :)
And make Sys_MaskPrintf take the developer enum rather than just a raw
int.
It was actually getting some nasty hunk corruption errors when under
memory pressure that made it clear the sound system needs some work.
The main goal was to get visframe out of mnode_t to make it thread-safe
(each thread can have its own visframe array), but moving the plane info
into mnode_t made for better data access patterns when traversing the bsp
tree as the plane is right there with the child indices. Nicely, the
size of mnode_t is the same as before (64 bytes due to alignment), with
4 bytes wasted.
Performance-wise, there seems to be very little difference. Maybe
slightly slower.
The unfortunate thing about the change is the plane distance is negated,
possibly leading to some confusion, particularly since the box and
sphere culling functions were affected. However, this is so point-plane
distance calculations can be done with a single 4d dot product.
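A small sketch of why the negation helps, assuming the plane is kept as
(nx, ny, nz, -d) and the point is extended with w = 1. The names are
hypothetical and plain float arrays stand in for the real vector types.

```c
// Plain 4-component dot product.
static float
dot4_sketch (const float a[4], const float b[4])
{
	return a[0] * b[0] + a[1] * b[1] + a[2] * b[2] + a[3] * b[3];
}

// Signed distance from point to plane: n.p - d, computed as one 4d dot
// product because the plane's fourth component already holds -d.
static float
plane_dist_sketch (const float plane[4], const float point[3])
{
	const float p[4] = { point[0], point[1], point[2], 1.0f };
	return dot4_sketch (plane, p);
}
```

With the traditional positive d, the same calculation would need a 3d
dot product followed by a separate subtraction.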
This ensures that the plugin's shutdown function won't get called twice
in the event of an error in the plugin's unload sequence triggering a
second Sys_Shutdown, especially if the plugin is being unloaded as a
part of another sub-system's shutdown sequence (which is probably in
itself a design mistake, need to look into that).
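One way to get that guarantee, sketched with hypothetical names (this
is not necessarily how the plugin code does it): unlink the entry from
the shutdown list before calling it, so a nested shutdown never sees it
again.

```c
#include <stddef.h>

typedef struct shutdown_sketch_s {
	struct shutdown_sketch_s *next;
	void      (*func) (void *data);
	void       *data;
} shutdown_sketch_t;

static shutdown_sketch_t *shutdown_list_sketch;

static void
run_shutdown_sketch (void)
{
	while (shutdown_list_sketch) {
		shutdown_sketch_t *entry = shutdown_list_sketch;
		// unlink first: a re-entrant call starts from the next entry
		shutdown_list_sketch = entry->next;
		entry->func (entry->data);
	}
}

// Example handler: counts its calls and simulates an error during
// unload that triggers a second shutdown.
static void
failing_unload_sketch (void *data)
{
	++*(int *) data;
	run_shutdown_sketch ();
}
```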
They're really redundant, and removing the next pointer makes for a
slightly smaller cvar struct. Cvar_Select was added to allow finding
lists of matching cvars.
The tab-completion and config saving code was reworked to use the hash
table DO functions. Comments removed since the code was completely
rewritten, but still many thanks to EvilTypeGuy and Fett.
Hash_Select returns a list of elements that match a given criterion
(select callback returning non-0).
Hash_ForEach simply calls a function for every element.
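The callback pattern looks roughly like the sketch below. It walks a
plain array instead of a hash table, and every name and signature in it
is a hypothetical stand-in rather than the actual Hash_Select or
Hash_ForEach interface.

```c
#include <stddef.h>
#include <string.h>

typedef int (*select_sketch_t) (void *element, void *data);
typedef void (*action_sketch_t) (void *element, void *data);

// Copies every element for which select returns non-0 into out (which
// must have room for count pointers) and returns the number of matches.
static size_t
select_sketch (void **elements, size_t count, select_sketch_t select,
			   void *data, void **out)
{
	size_t      n = 0;
	for (size_t i = 0; i < count; i++) {
		if (select (elements[i], data)) {
			out[n++] = elements[i];
		}
	}
	return n;
}

// Calls action once for every element.
static void
for_each_sketch (void **elements, size_t count, action_sketch_t action,
				 void *data)
{
	for (size_t i = 0; i < count; i++) {
		action (elements[i], data);
	}
}

// Example select callback: matches names starting with a prefix, as a
// tab-completion style lookup might.
static int
has_prefix_sketch (void *element, void *data)
{
	const char *name = element;
	const char *prefix = data;
	return strncmp (name, prefix, strlen (prefix)) == 0;
}

// Example action callback: counts the elements it is called for.
static void
count_sketch (void *element, void *data)
{
	(void) element;
	++*(size_t *) data;
}
```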
Other parts of quakefs treat an empty path as an error, so fs_sharepath
and fs_userpath must never be empty or they will effectively be
rejected. While the user explicitly setting them to empty strings is one
way for them to become empty, another is QFS_CompressPath compressing
'.' to an empty path, which makes it rather difficult to set up the
traditional quake directory tree (ie, operate from the current
directory).
This is an extremely extensive patch as it hits every cvar, and every
usage of the cvars. Cvars no longer store the value they control;
instead, they use a cexpr value object to reference the value and
specify the value's type (currently, a null type is used for strings).
Non-string cvars are passed through cexpr, allowing expressions in the
cvars' settings. Also, cvars have returned to an enhanced version of the
original (id quake) registration scheme.
As a minor benefit, relevant code having direct access to the
cvar-controlled variables is probably a slight optimization as it
removes a pointer dereference, and the variables can be located for data
locality.
The static cvar descriptors are made private as an additional safety
layer, though there's nothing stopping external modification via
Cvar_FindVar (which is needed for adding listeners).
While not used yet (partly due to working out the design), cvars can
have a validation function.
Registering a cvar allows a primary listener (and its data) to be
specified: it will always be called first when the cvar is modified. The
combination of proper listeners and direct access to the controlled
variable greatly simplifies the more complex cvar interactions as much
less null checking is required, and there's no need for one cvar's
callback to call another's.
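A rough sketch of the idea, with a plain int pointer standing in for
the cexpr value object. None of the names, fields, or limits below are
the real cvar API; they are made up to show the descriptor referencing
an external variable and the primary listener being called first.

```c
#include <stddef.h>

#define MAX_LISTENERS_SKETCH 4

typedef struct cvar_sketch_s cvar_sketch_t;
typedef void (*cvar_listener_sketch_t) (void *data,
										const cvar_sketch_t *cvar);

struct cvar_sketch_s {
	const char *name;
	int        *value;		// references the controlled variable
	// slot 0 holds the primary listener given at registration
	cvar_listener_sketch_t listeners[MAX_LISTENERS_SKETCH];
	void       *listener_data[MAX_LISTENERS_SKETCH];
	size_t      num_listeners;
};

// Writes the controlled variable directly, then notifies listeners in
// order, so the primary listener always runs first.
static void
cvar_set_sketch (cvar_sketch_t *cvar, int value)
{
	*cvar->value = value;
	for (size_t i = 0; i < cvar->num_listeners; i++) {
		cvar->listeners[i] (cvar->listener_data[i], cvar);
	}
}

// Example listener: records the order in which listeners were called by
// appending the tag passed as its data.
static int    call_order_sketch[MAX_LISTENERS_SKETCH];
static size_t call_count_sketch;

static void
record_listener_sketch (void *data, const cvar_sketch_t *cvar)
{
	(void) cvar;
	call_order_sketch[call_count_sketch++] = *(int *) data;
}
```

Code that cares about the setting reads the controlled variable itself,
with no descriptor lookup or pointer chase.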
nq-x11 is known to work at least well enough for the demos. More testing
will come.
The prefix gives more context to the error messages, making the system a
lot easier to use (it was especially helpful when getting my cvar revamp
into shape).
Based on the flags type used in vkparse (difference is the lack of
support for plists). Having this will make supporting named flags in
cvars much easier (though setting up the enum type is a bit of a chore).
This allows for easy (and safe) printing of cexpr values where the type
supports it. Types that don't support printing would be due to being too
complex or possibly write-only (eg, password strings, when strings are
supported directly).