I have no idea why they were even in there, as they intentionally circumvented all GC related features - they declared themselves fixed if prone to getting collected, they all used OF_YesReallyDelete when destroying themselves and they never used any of the object creation or RTTI features, aside from a single assert in V_Init2.
Essentially they were a drag on the system and OF_YesReallyDelete was effectively added just to deal with the canvases which were DObjects but not supposed to behave like them in the first place.
This is necessary because the hardware accelerated renderers will hide the problem, but with pure software rendering to a locked hardware surface, like DirectDraw can result in a crash.
Note that ANY mod that gets caught in this did something wrong!
This one was particularly nasty because Windows also defines a DWORD, but in Windows it is an unsigned long, not an unsigned int so changing types caused type conflicts and not all could be removed.
Those referring to the Windows type have to be kept, fortunately they are mostly in the Win32 directory, with a handful of exceptions elsewhere.
- check and use WGL_EXT_swap_control_tear extension. The above change makes the system always wait for a full vsync with a wglSwapInterval of 1, so it now uses the official extension that enables adaptive vsync. Hopefully this also works on the cards where the old setup did not.
After doing some profiling it was very obvious that this has better performance than client arrays. Persistent buffers are still better, though, especially for handling dynamic lights.
GLEW has two major problems:
- it always includes everything, there is no way to restrict the header to a specific GL version
- it is mostly broken with a core profile and only works if all sanity checks get switched off.
There was one issue preventing the previous 2.0 betas from running under GL 3.x: The lack of persistently mapped buffers.
For the dynamic light buffer today's changes take care of that problem.
For the vertex buffer there is no good workaround but we can use immediate mode render calls instead which have been reinstated.
To handle the current setup, the engine first tries to get a core profile context and checks for presence of GL 4.4 or the GL_ARB_buffer_storage extension.
If this fails the context is deleted again and a compatibility context retrieved which is then used for 'old style' rendering which does work on older GL versions.
This new version does not support GL 3.2 or lower, meaning that Intel GMA 3000 or lower is not supported. The reason for this is that the engine uses a few GL 3.3 features which are not present in the latest Intel driver.
In general the Intel GMA 3000 is far too weak, though, to run the demanding shader of GZDoom 2.x, so this is no real loss. Performance would be far from satisfying.
A command line option '-gl3' exists to force the fallback render path. On my Geforce 550Ti there's approx. 10% performance loss on this path.
Sadly, anything else makes no sense.
All the recently made changes live or die, depending on this extension's presence.
Without it, there are major performance issues with the buffer uploads. All of the traditional buffer upload methods are without exception horrendously slow, especially in the context of a Doom engine where frequent small updates are required.
It could be solved with a complete restructuring of the engine, of course, but that's hardly worth the effort, considering it's only for legacy hardware whose market share will inevitably shrink considerably over the next years.
And even then, under the best circumstances I'd still get the same performance as the old immediate mode renderer in GZDoom 1.x and still couldn't implement the additions I'd like to make.
So, since I need to keep GZDoom 1.x around anyway for older GL 2.x hardware, it may as well serve for 3.x hardware, too. It's certainly less work than constantly trying to find workarounds for the older hardware's limitations that cost more time than working on future-proofing the engine.
This new, trimmed down 4.x renderer runs on a core profile configuration and uses persistently mapped buffers for nearly everything that is getting transferred to the GPU. (The global uniforms are still being used as such but they'll be phased out after the first beta release.
- we need to check all GL versions when trying to get a context because some drivers only give us the version we request, leaving out newer features that are not exposed via extension.
- added some status info about uniform blocks.
- reactivate alpha testing per fixed function pipeline
- use the 'modern' way to define clip planes (GL_CLIP_DISTANCE). This is far more portable than the old glClipPlane method and a lot more robust than checking this in the fragment shader.