quakeforge

mirror of https://git.code.sf.net/p/quake/quakeforge synced 2024-11-13 08:27:39 +00:00

Author	SHA1	Message	Date
Bill Currie	7a6ca0ebcb	[simd] Use portable swizzles gcc and clang have rather different swizzle builtins, but both do a nice job of optimizing the intuitive initializer swizzle (I think gcc 8(?) didn't do such a good job thus my use of __builtin_shuffle).	2022-03-31 02:25:33 +09:00
Bill Currie	23613ca985	[simd] Get the new functions working on older hardware In some cases, gcc-11 does a good enough job translating normal looking C expressions so use just those, but other times need to dig around for an appropriate intrinsic. Also, now need to disable psapi warnings when compiling for anything less than avx.	2022-01-07 11:48:28 +09:00
Bill Currie	80c5e2c3f6	[simd] Remove requirements for AVX2 for vec4d It seems gcc-11 does a pretty good job of emulating the instructions (it no longer requires avx2 for 256-bit wide vectors).	2022-01-06 18:06:56 +09:00
Bill Currie	63db48bf42	[simd] Add integral loadvec3 versions that set w to 1 Always setting w to 0 made it impossible to use the resulting 4d vectors in division-based operations as they would result in divide-by-zero and thus an unavoidable exception (CPUs don't like integer div-by-zero). I'll probably add similar for float and double, but they're not as critical as they'll just give inf or nan. This also increases my doubts about the value of keeping 3d vector operations.	2022-01-04 18:23:32 +09:00
Bill Currie	97034d9dde	[simd] Add 2d vector types For int, long, float and double. I've been meaning to add them for a while, and they're part of the new Ruamoko instruction set (which is progressing nicely).	2022-01-02 00:57:55 +09:00
D G Turner	b799d48ccb	[simd] fix build when avx2 is not available, but avx is. This failed with errors such as: from ./include/QF/simd/vec4d.h:32, from libs/util/simd.c:37: ./include/QF/simd/vec4d.h: In function ‘qmuld’: /usr/lib/gcc/x86_64-pc-linux-gnu/10.3.0/include/avx2intrin.h:1049:1: error: inlining failed in call to ‘always_inline’ ‘_mm256_permute4x64_pd’: target specific option mismatch 1049 | _mm256_permute4x64_pd (__m256d __X, const int __M)	2021-06-23 01:10:42 +01:00
Bill Currie	778c07e91f	[util] Get vectors working for non-SSE archs GCC does a fairly nice job of producing code for vector types when the hardware doesn't support SIMD, but it seems to break certain math optimization rules due to excess precision (?). Still, it works well enough for the core engine, but may not be well suited to the tools. However, so far, only qfvis uses vector types (and it's not tested yet), and tools should probably be used on suitable machines anyway (not forced, of course).	2021-06-01 18:53:53 +09:00
Bill Currie	9ac4cdc6bd	[simd] Fix more portability issues I had missed vec4d.h because it's mostly unused at this stage.	2021-04-02 23:25:14 +09:00
Bill Currie	8309e1852a	[simd] Fix some portability issues Use [u]int64_t instead of long, and fix some incorrect attribute usage (I had misread the gcc docs at the time).	2021-03-27 20:04:10 +09:00
Bill Currie	9039c6975a	[util] Clean up some missed vsqrt changes	2021-01-05 08:35:53 +09:00
Bill Currie	015cee7b6f	[util] Add vector-quaternion shortcut functions Care needs to be taken to ensure the right function is used with the right arguments, but with these, the need to use qconj(d|f) for a one-off inverse rotation is removed.	2021-01-02 10:44:45 +09:00
Bill Currie	7bf90e5f4a	[util] Sort out implementation issues for simd	2021-01-02 09:55:59 +09:00
Bill Currie	1ddd57b09e	[util] Add qconj, vtrunc, vceil and vfloor functions I had forgotten these rather critical functions. Both double and float versions are included.	2020-12-30 18:20:11 +09:00
Bill Currie	09a10f80e1	[util] Add basic SIMD implemented vector functions They take advantage of gcc's vector_size attribute and so only cross, dot, qmul, qvmul and qrot (create rotation quaternion from two vectors) are needed at this stage as basic (per-component) math is supported natively by gcc. The provided functions work on horizontal (array-of-structs) data, ie a vec4d_t or vec4f_t represents a single vector, or traditional vector layout. Vertical layout (struct-of-arrays) does not need any special functions as the regular math can be used to operate on four vectors at a time. Functions are provided for loading a vec4 from a vec3 (4th element set to 0) and storing a vec4 into a vec3 (discarding the 4th element). With this, QF will require AVX2 support (needed for vec4d_t). Without support for doubles, SSE is possible, but may not be worthwhile for horizontal data. Fused-multiply-add is NOT used because it alters the results between unoptimized and optimized code, resulting in -mfma really meaning -mfast-math-anyway. I really do not want to have to debug issues that occur only in optimized code.	2020-12-30 18:20:11 +09:00

14 commits
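
Several of the commits above rely on gcc's vector_size attribute, the "intuitive initializer swizzle", and a loadvec3 variant that sets w to 1 so integer division stays safe. The following is a minimal sketch of those three ideas, not QuakeForge's actual code: every `sketch_` name is hypothetical and invented for illustration, and the real types and functions live under include/QF/simd/. It assumes gcc or clang, since the vector extensions are not standard C.

```c
#include <stdint.h>

/* Hypothetical stand-ins for QuakeForge's vec4f_t / vec4i_t. */
typedef float   sketch_vec4f __attribute__ ((vector_size (16)));
typedef int32_t sketch_vec4i __attribute__ ((vector_size (16)));

/* Portable swizzle written as a plain initializer: (x y z w) -> (y z x w).
 * Both gcc and clang accept element subscripts on vector types. */
static inline sketch_vec4f
sketch_yzxw (sketch_vec4f v)
{
	return (sketch_vec4f) { v[1], v[2], v[0], v[3] };
}

/* Cross product of two horizontal (one vector per vec4) operands.
 * Per-component multiply and subtract come from the compiler; only the
 * swizzles need spelling out. The w component of the result is
 * a.w * b.w - a.w * b.w, which is 0 for finite inputs. */
static inline sketch_vec4f
sketch_cross (sketch_vec4f a, sketch_vec4f b)
{
	sketch_vec4f c = a * sketch_yzxw (b) - sketch_yzxw (a) * b;
	return sketch_yzxw (c);
}

/* Load a 3-component integer vector, setting w to 1 rather than 0 so the
 * result can be used as a divisor without a divide-by-zero trap in w. */
static inline sketch_vec4i
sketch_loadvec3i1 (const int32_t v[3])
{
	return (sketch_vec4i) { v[0], v[1], v[2], 1 };
}
```

With these definitions, `sketch_cross` of (1, 0, 0, 0) and (0, 1, 0, 0) gives (0, 0, 1, 0), and dividing `sketch_loadvec3i1` of {6, 8, 10} by `sketch_loadvec3i1` of {2, 4, 5} gives (3, 2, 2, 1) with no trap in the w lane. On targets without SIMD hardware the compiler is expected to fall back to scalar code, as the non-SSE commit describes.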