turns out that __builtin_alloca_with_align() might releases the
allocated memory at the end of the block it was allocated in, instead
of the end of the function (which is the behavior of regular alloca()
and __builtin_alloca()): "The lifetime of the allocated object ends at
the end of the block in which the function was called. The allocated
storage is released no later than just before the calling function
returns to its caller, but may be released at the end of the block in
which the function was called."
https://gcc.gnu.org/onlinedocs/gcc/Other-Builtins.html#index-_005f_005fbuiltin_005falloca_005fwith_005falign
Clang also supports __builtin_alloca_with_align(), but always releases
the memory at the end of the function.
And it seems that newer GCC versions also tend to release it at the
end of the function, but GCC 4.7.2 (that I use for the official Linux
release binaries) didn't, and that caused weird graphical glitches.
But as I don't want to rely on newer GCC versions behaving like this
(and the documentation explicitly says that it *may* be released at
the end of the block, but will definitely be released at the end of
the function), I removed all usage of __builtin_alloca_with_align().
(Side-Note: GCC only started documenting the behavior of
__builtin_alloca and __builtin_alloca_with_align at all with GCC 6.1.0)
On Windows, ID_INLINE does __forceinline, which is bad if the function
calls alloca() and is called in a loop..
So use just __inline there so the compiler can choose not to inline
(if called in a loop).
This didn't cause actual stack overflows as far as I know, but it could
(and MSVC warns about it).
(This includes "Fix ID_MAYBE_INLINE on non-Windows platforms")