It turns out to be possible to get a barrier event at the same time as a
configure notify event (which rebuilds the barriers), and trying to
release the pointer at such a time results in a bad barrier error and
program crash. Thus check the event barrier against the currently
existing barriers before attempting to release the pointer.
This does mean that a better mechanism for sequencing window
repositioning and barrier creation may be required.
This should be a much friendlier way of "grabbing" input, though I
suspect that using raw keyboard events will result in a keyboard grab,
which is part of the reason for wanting a friendly grab.
There does seem to be a problem with the mouse sneaking out of the
top-right and bottom-left corners. I currently suspect a bug in the X
server, but further investigation is needed.
This is needed for getting window position info into in_x11 without
exposing more globals, and is likely to be useful for other things,
especially as it doubles as a resize event when that's eventually
supported.
This is necessary in focus-follows-mouse environments (at least for
openbox, but it wouldn't surprise me if most other WMs behave the same
way) because the WMs don't set focus when the pointer is grabbed (which
XInput does before the WM sees the enter event). This is especially
important when the window is fullscreen on a multi-monitor setup as
there is no border to *maybe* catch the mouse before it enters the
window.
Right now, only raw pointer motion and button events are handled, and
the mouse escapes the window, and there are some issues with focus in
focus-follows-mouse environments. However, this should be a much nicer
setup than DGA.
The current limit is still 32. Dealing with it properly will take some
rather advanced messing with XInput, and will be necessary assuming
non-XInput support is continued.
There's now IN_X11_Preinit, IN_X11_Postinit (both for want of better
names), and in_x11_init. The first two are for taking care of
initialization that needs to be done before window creation and between
window creation and mapping (ie, are very specific to X11 stuff) while
in_x11_init takes care of the setup for the input system. This proved
necessary in my XInput experimentation: a passive enter grab takes
effect only when the pointer enters the window, thus setting up the grab
with the pointer already in the window has no effect until the pointer
leaves the window and returns.
This was always a horrible hack just to get the screen centered on the
window back when we were doing fullscreen badly. With my experiments
with XInput, it has proven to be a liability (I'd forgotten it was even
there until it started imposing a 2s delay to QF's startup).
Input driver can now have an optional init_cvars function. This allows
them to create all their cvars before the actual init pass thus avoiding
some initialization order interdependency issues (in this case, fixing a
segfault when starting x11 clients fullscreen due to the in_dga cvar not
existing yet).
Well... it could be done better, but this works for now assuming it's in
/usr/include (and it's correct for mxe builts). Does need proper
autoconfiscation, though.
Seems to work nicely for keyboard (though key bindings are not
cross-platform). Mouse not tested yet, and I expect there are problems
with it for absolute inputs (yay mouse warp :P).
I didn't notice that uint is defined somewhere on Linux... until I tried
compiling for windows (not defined). Use a define to keep the cast
function naming nice.
Mouse axis and button names are handled internally (and thus
case-insensitive).
Key names are handled by X11. Case-sensitivity is currently determined
by Xlib.
keyhelp provides the input name if it is known, and in_bind tries to use
the provided input name if not a number. Case sensitivity for name
lookups is dependent on the input driver.
Reset the blocks completely when loading configs and fix a leftover from
when I thought I'd expose the block numbers to bindings but then changed
my mind to simply track the base binding.
The cooked inputs (ie_key, ie_mouse) are intended for UI interaction, so
generally should have priority over the raw events, which are intended
for game interaction.
There's now an internal event handler for taking care of device addition
and removal, and a public event handler for dealing with device input
events in various contexts In particular, so the clients can check for
the escape key.
While the console command line is quite good for setting everything up,
the devices being bound do need to be present when the commands are
executed (due to needing extra data provided by the devices). Thus
property lists that store the extra data (button and axis counts, device
names/ids, connection names, etc) seems to be the best solution.
Recipes themselves still use float, but using double in the cexpr values
allows bare floating point numbers (which parse as double) to be used,
making the bind command line a little more user-friendly.
The mouse bound to movement axes works (though signs are all over the
place, so movement direction is a little off), and binding F10 (key 68)
to quit works :)
Each axis binding has its own recipe (meaning the same input axis can be
interpreted differently for each binding)
Recipes are specified with field=value pairs after the axis name.
Valid fields are minzone, maxzone, deadzone, curve and scale, with
deadzone doubling as a balanced/unbalanced flag.
The default recipe has no zones, is balanced, and curve and scale are 1.
Hot-plug support is done via "connections" (not sure I'm happy with the
name) that provide a user specifiable name to input devices. The
connections record the device name (eg, "6d spacemouse") and id (usually
usb path for evdev devices, but may be the device unique id if
available) and whether automatic reconnection should match just the
device name or both device name and id (prevents problems with changing
the device connected to the one usb port).
Unnecessary enum removed, and the imt block struct moved to imt.c
(doesn't need to be public). Also, remove device name from the imt block
(and thus the parameter to the functions) as it turns out not to be
needed.
in_bind is only partially implemented (waiting on imt), but device
listing, device naming, and input identification are working. The event
handling system made for a fairly clean implementation for input
identification thanks to the focused event handling.
This has smashed the keydest handling for many things, and bindings, but
seems to be a good start with the new input system: the console in
qw-client-x11 is usable (keyboard-only).
The button and axis values have been removed from the knum_t enum as
mouse events are separate from key events, and other button and axis
inputs will be handled separately.
keys.c has been disabled in the build as it is obsolute (thus much of
the breakage).
I'm undecided on how to handle application focus (probably gain/lose
events), and the destination handler has been a stub for a while. One less
dependency on the "old" key handling code.
I'm undecided if the pasted text should be sent as a string rather than
individual key events, but this will do the job for now as it gets me
closer to being able to test everything.
It seems that under certain circumstances (window managers?), select is not
reliable for getting key events, so use of select has been disabled until I
figure out what's going on and how to fix it.
For the mouse in x11, I'm not sure which is more cooked: deltas or
window-relative coordinates, but I don't imagine that really matters too
much. However, keyboard and mouse events suitable for 2D user interfaces
are sent at the same time as the more game oriented button and axis events.
The x11 keyboard and mouse devices are really core input devices rather
than x11 input devices in that keyboard and mouse will be present on most
systems and thus not specific to the main user interface (x11, windows,
etc).
It turns out that calling Sys_Shutdown in the signal handler can cause
lockups due to the signal occurring at unsafe times. Fortunately, this is
just the IO related signals (INT, HUP, TERM, QUIT) as the others are
usually caused by actual errors and should not occur in system code thus
timing should not be an issue. However, care will need to be taken when it
comes to handling SIGINT or similar for breaking runaway progs code when
that time comes.
Now nothing works at all ;) However, that's only because the binding
system is incomplete: the X11 input events are getting through to the
binding system, so now it's just a matter of getting that to work.
Input Mapping Tables are still at the core as they are a good concept,
however they include both axis and button mappings, and the size is not
hard-coded, but dependent on the known devices. Not much actually works
yet (nq segfaults when a key is pressed).
kbutton_t is now in_button_t and has been moved to input.h. Also, a
button registration function has been added to take care of +button and
-button command creation and, eventually, direct binding of "physical"
buttons to logical buttons. "Physical" buttons are those coming in from
the OS (keyboard, mouse, joystick...), logical buttons are what the code
looks at for button state.
Additionally, the button edge detection code has been cleaned up such
that it no longer uses magic numbers, and the conversion to a float is
cleaner. Interestingly, I found that the handling is extremely
frame-rate dependent (eg, +forward will accelerate the player to full
speed much faster at 72fps than it does at 20fps). This may be a factor
in why gamers are frame rate obsessed: other games doing the same thing
would certainly feel different under varying frame rates.
For drivers that support it. Polling is still supported and forces the
select timeout to 0 if any driver requires polling. For now, the default
timeout when all drivers use select is 10ms.
Removing the device from the devices list after closing the device
could cause the device to be double-freed if something went wrong in the
device removal callback resulting in system shutdown which would then
close all open devices.
The device is removed from the list before the callback is called.
There's still a small opportunity for such in a multi-threaded
environment, but that would take device removal occurring at the same
time as the input system is shut down. Probably the responsibility of
the threaded environment rather than inputlib.
I had forgotten that _size was the number of rows in the map, not the
number of objects (1024 objects per row). This fixes the missed device
removal messages. And probably a slew of other bugs I'd yet to encounter
:P
This includes device add and remove events, and axis and buttons for
evdev. Will need to sort out X11 input later, but next is getting qwaq
responding.
While QF doesn't currently use nanoseconds, having access to a clock
that is not affected by setting system time is nice, and as a bonus, can
handle suspends should the need arise.
The common input code (input outer loop and event handling) has been
moved into libQFinput, and modified to have the concept of input drivers
that are registered by the appropriate system-level code (x11, win,
etc).
As well, my evdev input library code (with hotplug support) has been
added, but is not yet fully functional. However, the idea is that it
will be available on all systems that support evdev (Linux, and from
what I've read, FreeBSD).
At the low level, only unions can cause a set to grow. Of course, things
get interesting at the higher level when infinite (inverted) sets are
mixed in.
Instead of printing every representable member of an infinite set (ie,
up to element 63 in a set that can hold 64 elements), only those
elements up to one after the last non-member are listed. For example,
{...} - {2 3} -> {0 1 4 ...}
This makes reading (and testing!) infinite sets much easier.
Most of the set ops were always endian-agnostic since they were simply
operating on multiple bits in parallel, but individual element
add/remove/test was very endian-dependent. For the most part, this
didn't matter, but it does matter very much when loading external data
into a set or writing the data out (eg, for PVS).
Attempting to vis ad_tears drags a few lurking bugs out of
SmallestEnclosingBall_vf: poor calculation of 2-point affine space, poor
handling of duplicate points and dropped support points, poor
calculation of the new center (related to duplicate points), and
insufficient iterations for large point sets. qfvis (modified for
cluster spheres) now loads ad_tears.
As per usual, fp math finds a way to confound any epsilon test. So
rather than relying entirely on test_support_points, check the distance
from the sphere center to the affine point and break out of the loop if
the distance is small enough (< 1% of the current radius). This allows
qfvis to load ad_tears without hacks.
Scaling the checks by 1e-6 was a little too tight for very small
triangles, but 1e-5 seems to work well. This fixes SEB getting stuck for
a ridiculously small (for quake) triangle in ad_tears (probably resulted
from some bad math in qfbsp when generating the portal file from the
bsp).
For now, the functions check for a null hunk pointer and use the global
hunk (initialized via Memory_Init) if necessary. However, Hunk_Init is
available (and used by Memory_Init) to create a hunk from any arbitrary
memory block. So long as that block is 64-byte aligned, allocations
within the hunk will remain 64-byte aligned.
I need to write some automated tests for this, and reading of course,
but 1 and two byte outputs look correct. Kind of sad it took sixteen
years to get around to attempting to use the code :(
Mod_DecompressVis_set (via Mod_LeafPVS_set) can be used to recycle pvs
sets, but the set may have been set to everything at some stage, which
is implemented by inverting the set (making the set infinite) and having
1-bits remove elements from the set. This is most definitely not wanted
for pvs :)
Currently undecided what to do about Mod_DecompressVis_mix, thus the
fixme.
Fixes the flickering lights in any map where the camera is out of the
map for a single frame (eg, start.bsp, The Catacombs (hipnotic, hip2m3)).
I knew counting bits individually was slow, but it never really mattered
until now. However, I didn't expect such a dramatic boost just by going
to mapping bytes to bit counts. 16-bit words would be faster still, but
the 64kB lookup table would probably start hurting cache performance,
and 32-bit words (4GB table) definitely would ruin the cache. The
universe isn't big enough for 64-bits :)
The fact that numleafs did not include leaf 0 actually caused in many
places due to never being sure whether to add 1. Hopefully this fixes
some of the confusion. (and that comment in sv_init didn't last long :P)
After seeing set_size and thinking it redundant (thought it returned the
capacity of the set until I checked), I realized set_count would be a
much better name (set_count (node->successors) in qfcc does make much
more sense).
Modern maps can have many more leafs (eg, ad_tears has 98983 leafs).
Using set_t makes dynamic leaf counts easy to support and the code much
easier to read (though set_is_member and the iterators are a little
slower). The main thing to watch out for is the novis set and the set
returned by Mod_LeafPVS never shrink, and may have excess elements (ie,
indicate that nonexistent leafs are visible).
Having set_expand exposed is useful for loading data into a set.
However, it turns out there was a bug in its size calculation in that
when the requested set size was a multiple of SET_BITS (and greater than
the current set size), the new set size one be SET_BITS larger than
requested. There's now some tests for this :)
-999999 seems to be a hold-over from the software renderer passed
through both gl renderers. I guess it didn't matter in the gl renderers
due to various draw hacks, but it made quite a difference in vulkan.
Fixes the view model covering the hud.
Quake just looked wrong without the view model. I can't say I like the
way the depth range is hacked, but it was necessary because the view
model needs to be processed along with the rest of the alias models
(didn't feel like adding more command buffers, which I imagine would be
expensive with the pipeline switching).
When setting local rotation/scale/transform, need to cache the rotation and
scale, otherwise they can't be fetched easily later on (position is easy as
it's just the fourth column of the matrix).
The recent changes to key handling broke using escape to get out of the
console (escape would toggle between console and menu). Thus take care
of the menu (escape) part of the coupling FIXME by implementing a
callback for the escape key (and removing key_togglemenu) and sorting
out the escape key handling in console. Seems to work nicely
This fixes a bug when loading bsp29 files that resulted in leaf nodes
having bogus bounding boxes if any coordinates were negative (and thus
dynamic lights, and probably all sorts of other things) being broken.
And it took me only 9 years to notice :P
Without shadows, this is quite the cheat, but noclip is a cheat anyway,
so probably not that big a deal. It does, however, make noclip usable
for debugging.
Since vulkan supports 32-bit indexes, there's no need for the
shenanigans the EGL-based glsl renderer had to go through to render bsp
models (maps often had quite a bit more than 65536 vertices), though the
reduced GPU memory requirements of 16-bit indices does have its
advantages.
Any sun (a directional light) is in the outside node, which due to not
having its own PVS data is visible to all nodes, but that's a tad
excessive. However, any leaf node with sky surfaces will potentially see
any suns, and leaf nodes with no sky surfaces will see the sun only if
they can see a leaf that does have sky surfaces. This can be quite
expensive to calculate (already known to be moderately expensive for
just the camera leaf node (singular!) when checking for in-map lights)
Getting close to understanding (again) how it all works. I only just
barely understood when I got vulkan's renderer running, but I really
need to understand for when I modify things for shadows. The main thing
hurdle was tinst, but that was dealt with in the previous commit, and
now it's just sorting out the mess of elechains and elementss.
Its sole purpose was to pass the newly allocated instsurf when chaining
an instance model (ammo box, etc) surface, but using expresion
statements removes the need for such shenanigans, and even makes
msurface_t that little bit smaller (though a separate array would be
much better for cache coherence).
More importantly, the relevant code is actually easier to understand: I
spent way too long working out what tinst was for and why it was never
cleared.
This reduces the overhead needed to manage the memory blocks as the
blocks are guaranteed to be page-aligned. Also, the superblock is now
alllocated from within one of the memory blocks it manages. While this
does slightly reduce the available cachelines within the first block (by
one or two depending on 32 vs 64 bit pointers), it removes the need for
an extra memory allocation (probably via malloc) for the superblock.
The renderer's LineGraph now takes a height parameter, and netgraph now
uses cl_* cvars instead of r_* (which never really made sense),
including it's own height cvar (the render graphs still use
r_graphheight).
The uptime display had not been updated for the offset Sys_DoubleTime,
so add Sys_DoubleTimeBase to make it easy to use Sys_DoubleTime as
uptime.
Line up the layout of the client list was not consistent for drop and
qport.
The render plugins have made a bit of a mess of getting at the data and
thus it's a tad confusing how to get at it in different places. Really
needs a proper cleanup :(
conwidth and conheight have been moved into vid.conview (probably change
the name at some time), and scr_vrect has been replaced by a view as
well. This makes it much easier to create 2d elements that follow the
screen size (taking advantage of a view's gravity) which will, in the
end, make changing the window size easier.
One moves and resizes the view in one operation as a bit of an
optimization as moving and resizing both update any child views, and
this does only one update.
The other sets the gravity and updates any child views as their
absolute positions would change as well as the updated view's absolute
position.
It now processes 4 pixels at a time and uses a bit mask instead of a
conditional to set 3 of the 4 pixels to black. On top of the 4:1 pixel
processing and avoiding inner-loop conditional jumps, gcc unrolls the
loop, so Draw_FadeScreen itself is more than 4x as fast as it was. The
end result is about 5% (3fps) speedup to timedemo demo1 on my 900MHz
EEE Pc when nq has been hacked to always draw the fade-screen.
qwaq-curses has its place, but its use for running vkgen was really a
placeholder because I didn't feel like sorting out the different
initialization requirements at the time. qwaq-cmd has the (currently
unnecessary) threading power of qwaq-curses, but doesn't include any UI
stuff and thus doesn't need curses. The work also paves the way for
qwaq-x11 to become a proper engine (though sorting out its init will be
taken care of later).
Fixes#15.
This refactors (as such) keys.c so that it no longer depends on console
or gib, and pulls keys out of video targets. The eventual plan is to
move all high-level general input handling into libQFinput, and probably
low-level (eg, /dev/input handling for joysticks etc on Linux).
Fixes#8
I had forgotten to test with shared libs and it turns out jack and alsa
were directly accessing symbols in the renderer (and in jack's case,
linking in a duplicate of the renderer).
Fixes#16.
The JACK Audio Connection Kit support is now just an output target
rather than a full duplicate of the renderer (in pull mode). This is
what I wanted to to back when I first added jack support, but I needed
to get the renderer working asynchronously without affecting any of the
other outputs.
Fixes#16.
on_update is for pull-model outpput targets to do periodic synchronous
checks (eg, checking that the connection to the actual output device is
still alive and reviving it if necessary)
Output plugins can use either a push model (synchronous) or a pull
model (asynchronous). The ALSA plugin now uses the pull model. This
paves the way for making jack output a simple output plugin rather than
the combined render/output plugin it currently is (for #16) as now
snd_dma works with both models.
This gets the alsa target working nicely for mmapped outout. I'm not
certain, but I think it will even deal with NPOT buffer sizes (I copied
the code from libasound's sample pcm.c, thus the uncertainty).
Non-mmapped output isn't supported yet, but the alsa target now works
nicely for pull rendering.
However, some work still needs to be done for recovery failure: either
disable the sound system, or restart the driver entirely (preferable).
This brings the alsa driver in line with the jack render (progress
towards #16), but breaks most of the other drivers (for now: one step at
a time). The idea is that once the pull model is working for at least
one other target, the jack renderer can become just another target like
it should have been in the first place (but I needed to get the pull
model working first, then forgot about it).
Correct state checking is not done yet, but testsound does produce what
seems to be fairly good sound when it starts up correctly (part of the
state checking (or lack thereof), I imagine).
This failed with errors such as:
from ./include/QF/simd/vec4d.h:32,
from libs/util/simd.c:37:
./include/QF/simd/vec4d.h: In function ‘qmuld’:
/usr/lib/gcc/x86_64-pc-linux-gnu/10.3.0/include/avx2intrin.h:1049:1: error: inlining failed in call to ‘always_inline’ ‘_mm256_permute4x64_pd’: target specific option mismatch
1049 | _mm256_permute4x64_pd (__m256d __X, const int __M)
and rename the variable since it's not the size of the frame (may be
from the very early days of ALSA development, and I suspect the
terminology changed a bit).
The calculation was including the bits per sample, which makes no sense
as the period size determines the number of samples in a submission
chunk (and thus latency). For now, set it to around 5.5ms (will probably
need a cvar).
If Sys_Shutdown gets called twice, particularly if a shutdown callback
hangs and the program is killed with INT or QUIT, shutdown_list would be
in an invalid state. Thus, get the required data (function pointer and
data pointer) from the list element, then unlink the element before
calling the function. This ensures that a reinvocation of Sys_Shutdown
continues from the next callback or ends cleanly. Fixes a segfault when
killing testsound while using the oss output (it hangs on shutdown).
Fixes#12
However, this is a bit of a band-aid in that the code for global defs
seems redundant (there is very similar code a little above that is
always executed) and the code for field defs should probably be executed
unconditionally: I suspect the problem fixed by
d5454faeb7 still shows with game coded
compiled with recent versions of the compiler, I just haven't tested
any.
Support for finding the first address associated with a source line was
added to the engine, returning 0 if not found.
A temporary breakpoint is set and the progs allowed to run free.
However, better handling of temporary breakpoitns is needed as currently
a "permanent" breakpoint will be cleared without clearing the temporary
breakpoing if the permanent breakpoing is hit while execut-to-cursor is
running.
For now, just bsearch (normal and fuzzy), qsort, and prefixsum (not in
C's stdlib that I know of, but I think having native implementations of
float and int prefix sums will be useful.
Fuzzy bsearch is useful for finding an entry in a prefix sum array
(value is >= ele[0], < ele[1]), and the reentrant version is good when
data needs to be passed to the compare function. Adapted from the code
used in pr_resolve.
A bit of a mess for optimized vs unoptimized, but the tests acknowledge
the differences in precision while checking that the code produces the
right results allowing for that precision.
It seems that i686 code generation is all over the place reguarding sse2
vs fp, with the resulting differences in carried precision. I'm not sure
I'm happy with the situation, but at least it's being tested to a
certain extent. Not sure if this broke basic (no sse) i686 tests.