I might need to do similar for other formats, but i ran into the problem
of the texture type being tex_palette instead of the expected tex_rgba
when pre-(no-)loading a tga image resulting in Vulkan not liking my
attempt at generating mipmaps.
Having to refigure out what values are going into the vectors got old
very fast. The comments don't help with verifying the values, but at
least I can tell at a glance where 2(xy - wz) goes and thus determine
the "orientation" of the matrix.
Defs and symbols benefit from swizzling as that's one instruction vs 2-3
for loading a scalar into a vector component by component. Constants are
ok because the result gets converted to a vector constant.
qfcc is putting two temps in the same location due to
defspace_alloc_aligned_loc returning the same address when there was a
hole caused by an earlier aligned alloc: specifically, a size-3 hole and
a size-2 allocation with alignment-2.
The destination operand must be a full four component vector, but the
source can be smaller and small sources do not need to be aligned: the
offset of the source operand and the swizzle indices are adjusted. The
adjustments are done during final statement emission in order to avoid
confusing the data flow analyser (and that's when def offsets are known).
This reverts commit 2904c619c1.
In order to support swizzle operations, I need to be able to alias defs
to larger types (eg, float to vec4), but alias_def rightly won't allow
this. However, as the plan is to do this in the final steps before
emitting the instruction, I plan on creating an alias to a float then
adjusting the type in the alias, but to do so without extra shenanigans,
I need alias_def to allow aliases to the same type. As a fringe benefit,
it makes the code agree with the comment in def.h :P
This allows the fuzzy bsearch used to find a def by address to work
properly (ie, find the actual def instead of giving some other def +
offset). Makes for a much more readable instruction stream.
The scene id is in the lower 32-bits for all objects (upper 32-bits are
0 for actual scene objects) and entity/transform ids are in the upper
32-bits. Saves having to pass around a second parameter in progs code.
This came up when investigating an internal error from the line above.
It turned out the error was correct (problem with converting scalars to
vectors), but the break was not.
Currently, only vector/vec3 and quaternion/vec4 can be printed anyway,
but I plan on making explicit format strings for the types, so there
should be no need to promote any vector types (and really, any hidden
promotion is a bit of a pain, but standards...).
While the code would handle int vector types, there aren't any such
instructions, and the expression code shouldn't generate them, but all
float (32 and 64 bit) vector types do have a dot product instruction, so
check width rather than just vector/quaternion.
This fixes an error that's been lurking for over two years (since I made
parameters unlimited internally). The problem was the array was being
allocated on the stack and a simple struct copy was used to store type
type, resulting in a dangling pointer onto the stack. I'm surprised it
didn't cause more problems.
This allows all the tests to build and pass. I'll need to add tests to
ensure warnings happen when they should and that all vec operations are
correct (ouch, that'll be a lot of work), but vectors and quaternions
are working again.
Vector expressions no longer auto-widen due to the new vector types (I
might add such later, but for now this lets the tests try to build
(minus actual fixes in qfcc)).
With this, all vector widths and types are supported: 2, 3, 4 and int,
uint, long, ulong, float and double, along with support for suffixes to
make the type explicit: '1 2'd specifies a dvec2 constant, while '1 2 3'u
is a uivec3 constant. Default types are double (dvec2, dvec3, dvec4) for
literals with float-type components, and int (ivec2...) for those with
integer-type components.
Having three very similar sets of code for outputting values (just for
debug purposes even) got to be a tad annoying. Now there's only one, and
in the right place, too (with the other value code).
I'd created new_value_expr some time ago, but never used it...
Also, replace convert_* with cast_expr to the appropriate type (removes
a pile of value check and create code).
Use with quaternions and vectors is a little broken in that
vec4/quaternion and vec3/vector are not the same types (by design) and
thus a cast is needed (not what I want, though). However, creating
vectors (that happen to be int due to int constants) does seem to be
working nicely otherwise.
Nicely, I was able to reuse the generated conversion code used by the
progs engine to do the work in qfcc, just needed appropriate definitions
for the operand macros, and to set up the conversion code. Helped
greatly by the new value load/store functions.
pr_type_t now contains only the one "value" field, and all the access
macros now use their PACKED variant for base access, making access to
larger types more consistent with the smaller types.
The goal is to have somewhere to put entities so they can be rendered. I
suspect I could have used a bsp tree to partition space for frustum
culling, but it's probably not worth it at this stage (but shouldn't
require any changes to the engine: just the model).
Vulkan doesn't appreciate the empty buffers that result from the model
not having any textures or surfaces that can be rendered (rightfully so,
for such a bare-metal api).
I doubt the calls were ever actually made in a normal map due to the
node actually being a node when breaking out of the loop, but when I
experimented with an empty world model (no nodes, one infinite empty
leaf) I found that visit_leaf was getting called twice instead of once.
Since it is updated every frame, it needs to be as fast as possible for
the cpu code. This seems to make a difference of about 10us (~130 ->
~120) when testing in marcher. Not a huge change, but the timing
calculation was wrapped around the entire base world pass, so there was
a fair bit of overhead from bsp traversal etc.
It makes a significant difference to level load times (approximately
halves them for demo1 and demo2). Nicely, it turns out I had implemented
the rest of the staging buffer code (in particular, flushing) correctly
in that it seems there's no corruption any of the data.
They're really redundant, and removing the next pointer makes for a
slightly smaller cvar struct. Cvar_Select was added to allow finding
lists of matching cvars.
The tab-completion and config saving code was reworked to use the hash
table DO functions. Comments removed since the code was completely
rewritten, but still many thanks to EvilTypeGuy and Fett.
Hash_Select returns a list of elements that match a given criterion
(select callback returning non-0).
Hash_ForEach simply calls a function for every element.
And use it for hud_scoreboard_gravity. Putting the enum def in view made
the most sense as view does own the base type and the enum is likely to
be by useful for other settings.