- Ported vlinetallasm4 to AMD64 assembly. Even with the increased number of
  registers AMD64 provides, this routine still needs to be written as
  self-modifying code for maximum performance. The additional registers do allow
  for further optimization over the x86 version by allowing all four pixels
  to be in flight at the same time. The end result is that AMD64 ASM is about
  2.18 times faster than AMD64 C and about 1.06 times faster than x86 ASM.
  (For further comparison, AMD64 C and x86 C are practically the same for
  this function.) Should I port any more assembly to AMD64, mvlineasm4 is the
  most likely candidate, but it's not used enough at this point to bother.
  Also, this may or may not work with Linux at the moment, since it doesn't
  have the eh_handler metadata. Win64 is easier, since I just need to
  structure the function prologue and epilogue properly and use some
  assembler directives/macros to automatically generate the metadata. And
  that brings up another point: You need YASM to assemble the AMD64 code,
  because NASM doesn't support the Win64 metadata directives.
- Added an SSE version of DoBlending. This is strictly C intrinsics.
  VC++ still throws around unnecessary register moves. GCC seems to be
  pretty close to optimal, requiring only about 2 cycles/color. They're
  both faster than my hand-written MMX routine, so I don't need to feel
  bad about not hand-optimizing this for x64 builds.
- Removed an extra instruction from DoBlending_MMX, transposed two
  instructions, and unrolled it once, shaving off about 80 cycles from the
  time required to blend 256 palette entries. Why? Because I tried writing
  a C version of the routine using compiler intrinsics and was appalled by
  all the extra movq's VC++ added to the code. GCC was better, but still
  generated extra instructions. I only wanted a C version because I can't
  use inline assembly with VC++'s x64 compiler, and x64 assembly is a bit
  of a pain. (It's a pain because Linux and Windows have different calling
  conventions, and you need to maintain extra metadata for functions.) So,
  the assembly version stays and the C version stays out.
- Removed all the pixel doubling r_detail modes, since the one platform they
  were intended to assist (486) actually sees very little benefit from them.
- Rewrote CheckMMX in C and renamed it to CheckCPU.
- Fixed: CPUID function 0x80000005 is specified to return detailed L1 cache
  information only for AMD processors, so we must not use it on other
  architectures, or we end up overwriting the L1 cache line size with 0 or
  some other number we don't actually understand.
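
For reference, the blend that the DoBlending entries describe can be sketched in C++ as follows. This is an illustration only, not the code added in this commit: the names BlendScalar and BlendSketch are hypothetical, the arithmetic is inferred from the comments in DoBlending_MMX, and it assumes 0 <= a <= 256 and color components in 0..255. The SSE2 path is used only when the compiler advertises SSE2; otherwise the scalar loop handles everything.

```cpp
#include <cstdint>

#if defined(__SSE2__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 2)
#include <emmintrin.h>
#define BLEND_SKETCH_SSE2 1
#endif

// Scalar reference: out = (src * (256 - a) + to * a) >> 8 per channel.
// The top byte of each output color is left at 0, as in the MMX routine.
void BlendScalar(const uint32_t *from, uint32_t *to, int count,
                 int r, int g, int b, int a)
{
    const int ia = 256 - a;
    for (int i = 0; i < count; ++i)
    {
        const uint32_t c = from[i];
        const uint32_t nr = (((c >> 16) & 0xFF) * ia + r * a) >> 8;
        const uint32_t ng = (((c >> 8) & 0xFF) * ia + g * a) >> 8;
        const uint32_t nb = ((c & 0xFF) * ia + b * a) >> 8;
        to[i] = (nr << 16) | (ng << 8) | nb;
    }
}

// Hypothetical intrinsics version: four colors per iteration, scalar tail.
void BlendSketch(const uint32_t *from, uint32_t *to, int count,
                 int r, int g, int b, int a)
{
    int i = 0;
#ifdef BLEND_SKETCH_SSE2
    // Products can reach 65280, so keep only the 16-bit pattern.
    const short ia = static_cast<short>(256 - a);
    const short ra = static_cast<short>(static_cast<uint16_t>(r * a));
    const short ga = static_cast<short>(static_cast<uint16_t>(g * a));
    const short ba = static_cast<short>(static_cast<uint16_t>(b * a));
    const __m128i zero = _mm_setzero_si128();
    // Word order, low to high, within each color: B, G, R, unused.
    const __m128i inv = _mm_set_epi16(0, ia, ia, ia, 0, ia, ia, ia);
    const __m128i add = _mm_set_epi16(0, ra, ga, ba, 0, ra, ga, ba);
    for (; i + 4 <= count; i += 4)
    {
        const __m128i c = _mm_loadu_si128(reinterpret_cast<const __m128i *>(from + i));
        __m128i lo = _mm_unpacklo_epi8(c, zero);   // colors 0 and 1 as words
        __m128i hi = _mm_unpackhi_epi8(c, zero);   // colors 2 and 3 as words
        lo = _mm_srli_epi16(_mm_adds_epu16(_mm_mullo_epi16(lo, inv), add), 8);
        hi = _mm_srli_epi16(_mm_adds_epu16(_mm_mullo_epi16(hi, inv), add), 8);
        _mm_storeu_si128(reinterpret_cast<__m128i *>(to + i), _mm_packus_epi16(lo, hi));
    }
#endif
    BlendScalar(from + i, to + i, count - i, r, g, b, a);
}
```

Because (256 - a) + a is always 256, the 16-bit sums cannot overflow, which is why the saturating add never actually saturates here.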


SVN r1134 (trunk)
Randy Heit 2008-08-09 03:13:43 +00:00
parent 14e94b86e2
commit dda5ddd3c2
37 changed files with 1158 additions and 1337 deletions


@ -1,3 +1,20 @@
August 8, 2008
- Ported vlinetallasm4 to AMD64 assembly. Even with the increased number of
registers AMD64 provides, this routine still needs to be written as self-
modifying code for maximum performance. The additional registers do allow
for further optimization over the x86 version by allowing all four pixels
to be in flight at the same time. The end result is that AMD64 ASM is about
2.18 times faster than AMD64 C and about 1.06 times faster than x86 ASM.
(For further comparison, AMD64 C and x86 C are practically the same for
this function.) Should I port any more assembly to AMD64, mvlineasm4 is the
most likely candidate, but it's not used enough at this point to bother.
Also, this may or may not work with Linux at the moment, since it doesn't
have the eh_handler metadata. Win64 is easier, since I just need to
structure the function prologue and epilogue properly and use some
assembler directives/macros to automatically generate the metadata. And
that brings up another point: You need YASM to assemble the AMD64 code,
because NASM doesn't support the Win64 metadata directives.
August 8, 2008 (Changes by Graf Zahl)
- Replaced the ActorInfo definitions of several internal classes with DECORATE definitions
- Converted teleport fog and destinations to DECORATE.
@ -14,6 +31,23 @@ August 8, 2008 (Changes by Graf Zahl)
- Added aWeaponGiver class to generalize the standing AssaultGun.
- converted a_Strifeweapons.cpp to DECORATE, except for the Sigil.
August 7, 2008
- Added an SSE version of DoBlending. This is strictly C intrinsics.
VC++ still throws around unnecessary register moves. GCC seems to be
pretty close to optimal, requiring only about 2 cycles/color. They're
both faster than my hand-written MMX routine, so I don't need to feel
bad about not hand-optimizing this for x64 builds.
- Removed an extra instruction from DoBlending_MMX, transposed two
instructions, and unrolled it once, shaving off about 80 cycles from the
time required to blend 256 palette entries. Why? Because I tried writing
a C version of the routine using compiler intrinsics and was appalled by
all the extra movq's VC++ added to the code. GCC was better, but still
generated extra instructions. I only wanted a C version because I can't
use inline assembly with VC++'s x64 compiler, and x64 assembly is a bit
of a pain. (It's a pain because Linux and Windows have different calling
conventions, and you need to maintain extra metadata for functions.) So,
the assembly version stays and the C version stays out.
August 7, 2008 (Changes by Graf Zahl)
- Converted the rest of a_strifestuff.cpp to DECORATE.
- Fixed: AStalker::CheckMeleeRange did not perform all checks of AActor::CheckMeleeRange.
@ -39,6 +73,13 @@ August 7, 2008 (SBARINfO update)
- Fixed: Various bugs I noticed in the fullscreenoffsets code.
August 6, 2008
- Removed all the pixel doubling r_detail modes, since the one platform they
were intended to assist (486) actually sees very little benefit from them.
- Rewrote CheckMMX in C and renamed it to CheckCPU.
- Fixed: CPUID function 0x80000005 is specified to return detailed L1 cache
information only for AMD processors, so we must not use it on other
architectures, or we end up overwriting the L1 cache line size with 0 or
some other number we don't actually understand.
- The x87 precision control is now explicitly set for double precision, since
GCC defaults to extended precision instead, unlike Visual C++.
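
The CPUID fix in the August 6 entry amounts to gating the 0x80000005 query on the vendor string. A minimal sketch of that check, assuming the caller has already executed CPUID and passes in the raw register values (IsAuthenticAMD and PickL1LineSize are hypothetical names, not functions from the new CheckCPU, which is not shown here):

```cpp
#include <cstdint>

// CPUID leaf 0 returns the vendor string in EBX, EDX, ECX.
// "AuthenticAMD" is 'Auth' / 'enti' / 'cAMD', the same constants the
// old CheckMMX assembly compared against.
bool IsAuthenticAMD(uint32_t ebx, uint32_t edx, uint32_t ecx)
{
    return ebx == 0x68747541u && edx == 0x69746e65u && ecx == 0x444d4163u;
}

// Only trust leaf 0x80000005 on AMD parts that report it; otherwise keep
// the value already determined by other means (the fallback).
// On AMD, ECX[7:0] of that leaf is the L1 data cache line size in bytes.
uint8_t PickL1LineSize(bool isAMD, uint32_t maxExtendedLeaf,
                       uint32_t ecxOfLeaf80000005, uint8_t fallback)
{
    if (!isAMD || maxExtendedLeaf < 0x80000005u)
    {
        return fallback;
    }
    return static_cast<uint8_t>(ecxOfLeaf80000005 & 0xFF);
}
```

Keeping the register reads outside these helpers makes the decision logic easy to exercise on any machine.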


@ -173,11 +173,24 @@ endif( FMOD_LIBRARY )
if( NOT NO_ASM )
find_program( NASM_PATH NAMES ${NASM_NAMES} )
find_program( YASM_PATH yasm )
if( YASM_PATH )
set( ASSEMBLER ${YASM_PATH} )
else( YASM_PATH )
if( X64 )
message( STATUS "Could not find YASM. Disabling assembly code." )
set( NO_ASM ON )
else( X64 )
if( NOT NASM_PATH )
message( STATUS "Could not find YASM or NASM. Disabling assembly code." )
set( NO_ASM ON )
else( NOT NASM_PATH )
set( ASSEMBLER ${NASM_PATH} )
endif( NOT NASM_PATH )
endif( X64 )
endif( YASM_PATH )
if( NOT NASM_PATH )
message( STATUS "Could not find NASM. Disabling assembly code." )
set( NO_ASM ON )
else( NOT NASM_PATH )
# I think the only reason there was a version requirement was because the
# executable name for Windows changed from 0.x to 2.0, right? This is
# how to do it in case I need to do something similar later.
@ -188,7 +201,6 @@ if( NOT NO_ASM )
# if( NOT NASM_VER LESS 2 )
# message( SEND_ERROR "NASM version should be 2 or later. (Installed version is ${NASM_VER}.)" )
# endif( NOT NASM_VER LESS 2 )
endif( NOT NASM_PATH )
endif( NOT NO_ASM )
if( NOT NO_ASM )
@ -201,22 +213,31 @@ if( NOT NO_ASM )
# Tell CMake how to assemble our files
if( UNIX )
set( NASM_OUTPUT_EXTENSION .o )
set( NASM_FLAGS -f elf -DM_TARGET_LINUX )
set( ASM_OUTPUT_EXTENSION .o )
if( X64 )
set( ASM_FLAGS -f elf64 -DM_TARGET_LINUX )
else( X64 )
set( ASM_FLAGS -f elf -DM_TARGET_LINUX )
endif( X64 )
else( UNIX )
set( NASM_OUTPUT_EXTENSION .obj )
set( NASM_FLAGS -f win32 -DWIN32 )
set( ASM_OUTPUT_EXTENSION .obj )
if( X64 )
set( ASM_FLAGS -f win64 -DWIN32 -DWIN64 )
else( X64 )
set( ASM_FLAGS -f win32 -DWIN32 )
endif( X64 )
endif( UNIX )
if( WIN32 )
set( FIXRTEXT fixrtext )
endif( WIN32 )
message( STATUS "Selected assembler: ${ASSEMBLER}" )
MACRO( ADD_ASM_FILE infile )
set( ASM_OUTPUT_${infile} "${CMAKE_CURRENT_BINARY_DIR}/CMakeFiles/zdoom.dir/${infile}${NASM_OUTPUT_EXTENSION}" )
set( ASM_OUTPUT_${infile} "${CMAKE_CURRENT_BINARY_DIR}/CMakeFiles/zdoom.dir/${infile}${ASM_OUTPUT_EXTENSION}" )
if( WIN32 )
set( FIXRTEXT_${infile} COMMAND ${FIXRTEXT} "${ASM_OUTPUT_${infile}}" )
endif( WIN32 )
add_custom_command( OUTPUT ${ASM_OUTPUT_${infile}}
COMMAND ${NASM_PATH} ${NASM_FLAGS} -i${CMAKE_CURRENT_SOURCE_DIR}/ -o"${ASM_OUTPUT_${infile}}" "${CMAKE_CURRENT_SOURCE_DIR}/${infile}"
COMMAND ${ASSEMBLER} ${ASM_FLAGS} -i${CMAKE_CURRENT_SOURCE_DIR}/ -o"${ASM_OUTPUT_${infile}}" "${CMAKE_CURRENT_SOURCE_DIR}/${infile}"
${FIXRTEXT_${infile}}
DEPENDS ${infile} ${FIXRTEXT} )
set( ASM_SOURCES ${ASM_SOURCES} "${ASM_OUTPUT_${infile}}" )
@ -320,14 +341,18 @@ else( WIN32 )
endif( WIN32 )
if( NOT NO_ASM )
ADD_ASM_FILE( a.nas )
ADD_ASM_FILE( misc.nas )
ADD_ASM_FILE( tmap.nas )
ADD_ASM_FILE( tmap2.nas )
ADD_ASM_FILE( tmap3.nas )
if( X64 )
ADD_ASM_FILE( asm_x86_64/tmap3.asm )
else( X64 )
ADD_ASM_FILE( asm_ia32/a.asm )
ADD_ASM_FILE( asm_ia32/misc.asm )
ADD_ASM_FILE( asm_ia32/tmap.asm )
ADD_ASM_FILE( asm_ia32/tmap2.asm )
ADD_ASM_FILE( asm_ia32/tmap3.asm )
endif( X64 )
if( WIN32 )
if( NOT X64 )
ADD_ASM_FILE( win32/wrappers.nas )
ADD_ASM_FILE( win32/wrappers.asm )
endif( NOT X64 )
endif( WIN32 )
endif( NOT NO_ASM )
@ -482,6 +507,7 @@ add_executable( zdoom WIN32
v_video.cpp
w_wad.cpp
wi_stuff.cpp
x86.cpp
zstrformat.cpp
zstring.cpp
g_doom/a_arachnotron.cpp
@ -705,6 +731,9 @@ if( CMAKE_COMPILER_IS_GNUCXX )
# Compile this one file with SSE2 support.
set_source_files_properties( nodebuild_classify_sse2.cpp PROPERTIES COMPILE_FLAGS "-msse2 -mfpmath=sse" )
# Need to enable intrinsics for this file.
set_source_files_properties( x86.cpp PROPERTIES COMPILE_FLAGS "-msse2 -mmmx" )
endif( CMAKE_COMPILER_IS_GNUCXX )
if( MSVC )


@ -1766,8 +1766,8 @@ void AM_Drawer ()
{
f_x = viewwindowx;
f_y = viewwindowy;
f_w = realviewwidth;
f_h = realviewheight;
f_w = viewwidth;
f_h = viewheight;
f_p = screen->GetPitch ();
}
AM_activateNewScale();

src/asm_ia32/misc.asm

@ -0,0 +1,200 @@
;*
;* misc.nas
;* Miscellaneous assembly functions
;*
;*---------------------------------------------------------------------------
;* Copyright 1998-2006 Randy Heit
;* All rights reserved.
;*
;* Redistribution and use in source and binary forms, with or without
;* modification, are permitted provided that the following conditions
;* are met:
;*
;* 1. Redistributions of source code must retain the above copyright
;* notice, this list of conditions and the following disclaimer.
;* 2. Redistributions in binary form must reproduce the above copyright
;* notice, this list of conditions and the following disclaimer in the
;* documentation and/or other materials provided with the distribution.
;* 3. The name of the author may not be used to endorse or promote products
;* derived from this software without specific prior written permission.
;*
;* THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR
;* IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
;* OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
;* IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT,
;* INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
;* NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
;* DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
;* THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
;* (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF
;* THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
;*---------------------------------------------------------------------------
;*
BITS 32
%ifndef M_TARGET_LINUX
%define DoBlending_MMX _DoBlending_MMX
%define BestColor_MMX _BestColor_MMX
%endif
%ifdef M_TARGET_WATCOM
SEGMENT DATA PUBLIC ALIGN=16 CLASS=DATA USE32
SEGMENT DATA
%else
SECTION .data
%endif
Blending256:
dd 0x01000100,0x00000100
%ifdef M_TARGET_WATCOM
SEGMENT CODE PUBLIC ALIGN=16 CLASS=CODE USE32
SEGMENT CODE
%else
SECTION .text
%endif
;-----------------------------------------------------------
;
; DoBlending_MMX
;
; MMX version of DoBlending
;
; (DWORD *from, DWORD *to, count, tor, tog, tob, toa)
;-----------------------------------------------------------
GLOBAL DoBlending_MMX
DoBlending_MMX:
pxor mm0,mm0 ; mm0 = 0
mov eax,[esp+4*4]
shl eax,16
mov edx,[esp+4*5]
shl edx,8
or eax,[esp+4*6]
or eax,edx
mov ecx,[esp+4*3] ; ecx = count
movd mm1,eax ; mm1 = 00000000 00RRGGBB
mov eax,[esp+4*7]
shl eax,16
mov edx,[esp+4*7]
shl edx,8
or eax,[esp+4*7]
or eax,edx
mov edx,[esp+4*2] ; edx = dest
movd mm6,eax ; mm6 = 00000000 00AAAAAA
punpcklbw mm1,mm0 ; mm1 = 000000RR 00GG00BB
movq mm7,[Blending256]
punpcklbw mm6,mm0 ; mm6 = 000000AA 00AA00AA
mov eax,[esp+4*1] ; eax = source
pmullw mm1,mm6 ; mm1 = 000000RR 00GG00BB (multiplied by alpha)
psubusw mm7,mm6 ; mm7 = 000000aa 00aa00aa (one minus alpha)
nop ; Does this actually pair on a Pentium?
; Do four colors per iteration: Count must be a multiple of four.
.loop movq mm2,[eax] ; mm2 = 00r2g2b2 00r1g1b1
add eax,8
movq mm3,mm2 ; mm3 = 00r2g2b2 00r1g1b1
punpcklbw mm2,mm0 ; mm2 = 000000r1 00g100b1
punpckhbw mm3,mm0 ; mm3 = 000000r2 00g200b2
pmullw mm2,mm7 ; mm2 = 0000r1rr g1ggb1bb
add edx,8
pmullw mm3,mm7 ; mm3 = 0000r2rr g2ggb2bb
sub ecx,2
paddusw mm2,mm1
psrlw mm2,8
paddusw mm3,mm1
psrlw mm3,8
packuswb mm2,mm3 ; mm2 = 00r2g2b2 00r1g1b1
movq [edx-8],mm2
movq mm2,[eax] ; mm2 = 00r2g2b2 00r1g1b1
add eax,8
movq mm3,mm2 ; mm3 = 00r2g2b2 00r1g1b1
punpcklbw mm2,mm0 ; mm2 = 000000r1 00g100b1
punpckhbw mm3,mm0 ; mm3 = 000000r2 00g200b2
pmullw mm2,mm7 ; mm2 = 0000r1rr g1ggb1bb
add edx,8
pmullw mm3,mm7 ; mm3 = 0000r2rr g2ggb2bb
sub ecx,2
paddusw mm2,mm1
psrlw mm2,8
paddusw mm3,mm1
psrlw mm3,8
packuswb mm2,mm3 ; mm2 = 00r2g2b2 00r1g1b1
movq [edx-8],mm2
jnz .loop
emms
ret
;-----------------------------------------------------------
;
; BestColor_MMX
;
; Picks the closest matching color from a palette
;
; Passed FFRRGGBB and palette array in same format
; FF is the index of the first palette entry to consider
;
;-----------------------------------------------------------
GLOBAL BestColor_MMX
GLOBAL @BestColor_MMX@8
BestColor_MMX:
mov ecx,[esp+4]
mov edx,[esp+8]
@BestColor_MMX@8:
pxor mm0,mm0
movd mm1,ecx ; mm1 = color searching for
mov eax,257*257+257*257+257*257 ;eax = bestdist
push ebx
punpcklbw mm1,mm0
mov ebx,ecx ; ebx = best color
shr ecx,24 ; ecx = count
and ebx,0xffffff
push esi
push ebp
.loop movd mm2,[edx+ecx*4] ; mm2 = color considering now
inc ecx
punpcklbw mm2,mm0
movq mm3,mm1
psubsw mm3,mm2
pmullw mm3,mm3 ; mm3 = color distance squared
movd ebp,mm3 ; add the three components
psrlq mm3,32 ; into ebp to get the real
mov esi,ebp ; (squared) distance
shr esi,16
and ebp,0xffff
add ebp,esi
movd esi,mm3
add ebp,esi
jz .perf ; found a perfect match
cmp eax,ebp
jb .skip
mov eax,ebp
lea ebx,[ecx-1]
.skip cmp ecx,256
jne .loop
mov eax,ebx
pop ebp
pop esi
pop ebx
emms
ret
.perf lea eax,[ecx-1]
pop ebp
pop esi
pop ebx
emms
ret
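
As a reading aid for BestColor_MMX above, here is a scalar C++ sketch of the same search. It is an approximation under stated assumptions: BestColorScalar is a hypothetical name, the color is passed as separate components rather than packed FFRRGGBB, and the top byte of each palette entry is ignored. The tie-breaking (a later entry with an equal distance replaces the earlier one) and the early exit on a perfect match follow the assembly.

```cpp
#include <cstdint>

// Returns the index of the palette entry closest to (r, g, b), measured by
// the sum of squared component differences, searching [first, num).
int BestColorScalar(const uint32_t *palette, int r, int g, int b,
                    int first = 0, int num = 256)
{
    int best = first;
    int bestdist = 257 * 257 + 257 * 257 + 257 * 257;   // larger than any real distance
    for (int i = first; i < num; ++i)
    {
        const int dr = r - static_cast<int>((palette[i] >> 16) & 0xFF);
        const int dg = g - static_cast<int>((palette[i] >> 8) & 0xFF);
        const int db = b - static_cast<int>(palette[i] & 0xFF);
        const int dist = dr * dr + dg * dg + db * db;
        if (dist == 0)
        {
            return i;           // perfect match, stop searching
        }
        if (dist <= bestdist)   // mirrors "cmp eax,ebp / jb .skip"
        {
            bestdist = dist;
            best = i;
        }
    }
    return best;
}
```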


@ -51,7 +51,7 @@ FUZZTABLE equ 50
%define fuzzpos _fuzzpos
%define fuzzoffset _fuzzoffset
%define NormalLight _NormalLight
%define realviewheight _realviewheight
%define viewheight _viewheight
%define fuzzviewheight _fuzzviewheight
%define CPU _CPU
@ -103,7 +103,7 @@ EXTERN centery
EXTERN fuzzpos
EXTERN fuzzoffset
EXTERN NormalLight
EXTERN realviewheight
EXTERN viewheight
EXTERN fuzzviewheight
EXTERN CPU

src/asm_x86_64/tmap3.asm

@ -0,0 +1,182 @@
%include "valgrind.inc"
BITS 64
DEFAULT REL
%ifnidn __OUTPUT_FORMAT__,win64
%macro PROC_FRAME 1
%1:
%endmacro
%macro rex_push_reg 1
push %1
%endmacro
%macro push_reg 1
push %1
%endmacro
%macro alloc_stack 1
sub rsp,%1
%endmacro
%define parm1lo dil
%else
%define parm1lo cl
%endif
SECTION .data
EXTERN vplce
EXTERN vince
EXTERN palookupoffse
EXTERN bufplce
EXTERN dc_count
EXTERN dc_dest
EXTERN dc_pitch
SECTION .text
ALIGN 16
GLOBAL ASM_PatchPitch
ASM_PatchPitch:
mov ecx, [dc_pitch]
mov [pm+3], ecx
mov [vltpitch+3], ecx
selfmod pm, vltpitch+6
ret
ALIGN 16
GLOBAL setupvlinetallasm
setupvlinetallasm:
mov [shifter1+2], parm1lo
mov [shifter2+2], parm1lo
mov [shifter3+2], parm1lo
mov [shifter4+2], parm1lo
selfmod shifter1, shifter4+3
ret
%ifidn __OUTPUT_FORMAT__,win64
; Yasm can't do progbits alloc exec for win64?
; Hmm, looks like it's automatic. No worries, then.
SECTION .rtext write ;progbits alloc exec
%else
SECTION .rtext progbits alloc exec write
%endif
ALIGN 16
GLOBAL vlinetallasm4
PROC_FRAME vlinetallasm4
rex_push_reg rbx
push_reg rdi
push_reg r15
push_reg r14
push_reg r13
push_reg r12
push_reg rbp
push_reg rsi
alloc_stack 8 ; Stack must be 16-byte aligned
END_PROLOGUE
; rax = bufplce base address
; rbx =
; rcx = offset from rdi/count (negative)
; edx/rdx = scratch
; rdi = bottom of columns to write to
; r8d-r11d = column offsets
; r12-r15 = palookupoffse[0] - palookupoffse[3]
mov ecx, [dc_count]
mov rdi, [dc_dest]
test ecx, ecx
jle vltepilog ; count must be positive
mov rax, [bufplce]
mov r8, [bufplce+8]
sub r8, rax
mov r9, [bufplce+16]
sub r9, rax
mov r10, [bufplce+24]
sub r10, rax
mov [source2+4], r8d
mov [source3+4], r9d
mov [source4+4], r10d
pm: imul rcx, 320
mov r12, [palookupoffse]
mov r13, [palookupoffse+8]
mov r14, [palookupoffse+16]
mov r15, [palookupoffse+24]
mov r8d, [vince]
mov r9d, [vince+4]
mov r10d, [vince+8]
mov r11d, [vince+12]
mov [step1+3], r8d
mov [step2+3], r9d
mov [step3+3], r10d
mov [step4+3], r11d
add rdi, rcx
neg rcx
mov r8d, [vplce]
mov r9d, [vplce+4]
mov r10d, [vplce+8]
mov r11d, [vplce+12]
selfmod loopit, vltepilog
jmp loopit
ALIGN 16
loopit:
mov edx, r8d
shifter1: shr edx, 24
step1: add r8d, 0x88888888
movzx rdx, BYTE [rax+rdx]
mov ebx, r9d
mov dl, [r12+rdx]
shifter2: shr ebx, 24
step2: add r9d, 0x88888888
source2: movzx ebx, BYTE [rax+rbx+0x88888888]
mov ebp, r10d
mov bl, [r13+rbx]
shifter3: shr ebp, 24
step3: add r10d, 0x88888888
source3: movzx ebp, BYTE [rax+rbp+0x88888888]
mov esi, r11d
mov bpl, BYTE [r14+rbp]
shifter4: shr esi, 24
step4: add r11d, 0x88888888
source4: movzx esi, BYTE [rax+rsi+0x88888888]
mov [rdi+rcx], dl
mov [rdi+rcx+1], bl
mov sil, BYTE [r15+rsi]
mov [rdi+rcx+2], bpl
mov [rdi+rcx+3], sil
vltpitch: add rcx, 320
jl loopit
mov [vplce], r8d
mov [vplce+4], r9d
mov [vplce+8], r10d
mov [vplce+12], r11d
vltepilog:
add rsp, 8
pop rsi
pop rbp
pop r12
pop r13
pop r14
pop r15
pop rdi
pop rbx
ret
ENDPROC_FRAME
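
What the inner loop of vlinetallasm4 computes can be written as plain C++. This is a sketch for reading the assembly, not the project's C fallback: VLineState and VLineTall4_Reference are hypothetical names that bundle the globals the assembly reads (vplce, vince, bufplce, palookupoffse), and the shift count that setupvlinetallasm patches into the code is passed as an ordinary parameter, assumed to be in 0..31.

```cpp
#include <cstddef>
#include <cstdint>

struct VLineState
{
    uint32_t vplce[4];                  // current texture position per column
    uint32_t vince[4];                  // per-row step per column
    const uint8_t *bufplce[4];          // texture column per output column
    const uint8_t *palookupoffse[4];    // light/colormap table per column
};

// Draws four adjacent one-pixel-wide columns, count rows tall.
void VLineTall4_Reference(VLineState &s, uint8_t *dest, int count,
                          ptrdiff_t pitch, int bits)
{
    for (; count > 0; --count, dest += pitch)
    {
        for (int i = 0; i < 4; ++i)
        {
            // Top bits of the position select the texel (the "shr" in the loop).
            const uint8_t texel = s.bufplce[i][s.vplce[i] >> bits];
            dest[i] = s.palookupoffse[i][texel];
            s.vplce[i] += s.vince[i];   // wraps modulo 2^32, like the 32-bit add
        }
    }
}
```

The assembly reaches the same result with the steps, shift counts, and source offsets written directly into the instruction stream as immediates, which frees registers so all four pixels can be in flight at once.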


@ -207,7 +207,7 @@ void CT_Drawer (void)
int screen_height = con_scaletext > 1? SCREENHEIGHT/2 : SCREENHEIGHT;
int st_y = con_scaletext > 1? ST_Y/2 : ST_Y;
y += ((SCREENHEIGHT == realviewheight && viewactive) || gamestate != GS_LEVEL) ? screen_height : st_y;
y += ((SCREENHEIGHT == viewheight && viewactive) || gamestate != GS_LEVEL) ? screen_height : st_y;
promptwidth = SmallFont->StringWidth (prompt) * scalex;
x = screen->Font->GetCharWidth ('_') * scalex * 2 + promptwidth;


@ -582,7 +582,7 @@ void D_Display ()
StatusBar->BlendView (blend);
}
screen->SetBlendingRect(viewwindowx, viewwindowy,
viewwindowx + realviewwidth, viewwindowy + realviewheight);
viewwindowx + viewwidth, viewwindowy + viewheight);
P_CheckPlayerSprites();
screen->RenderView(&players[consoleplayer]);
if ((hw2d = screen->Begin2D(viewactive)))
@ -593,8 +593,11 @@ void D_Display ()
}
if (automapactive)
{
int saved_ST_Y=ST_Y;
if (hud_althud && realviewheight == SCREENHEIGHT) ST_Y=realviewheight;
int saved_ST_Y = ST_Y;
if (hud_althud && viewheight == SCREENHEIGHT)
{
ST_Y = viewheight;
}
AM_Drawer ();
ST_Y = saved_ST_Y;
}
@ -603,13 +606,13 @@ void D_Display ()
R_RefreshViewBorder ();
}
if (hud_althud && realviewheight == SCREENHEIGHT)
if (hud_althud && viewheight == SCREENHEIGHT)
{
if (DrawFSHUD || automapactive) DrawHUD();
StatusBar->DrawTopStuff (HUD_None);
}
else
if (realviewheight == SCREENHEIGHT && viewactive)
if (viewheight == SCREENHEIGHT && viewactive)
{
StatusBar->Draw (DrawFSHUD ? HUD_Fullscreen : HUD_None);
StatusBar->DrawTopStuff (DrawFSHUD ? HUD_Fullscreen : HUD_None);
@ -2085,7 +2088,10 @@ void D_DoomMain (void)
_FPU_SETCW(cw);
}
#elif defined(_PC_53)
_control87(_PC_53, _MCW_PC);
// On the x64 architecture, changing the floating point precision is not supported.
#ifndef _WIN64
int cfp = _control87(_PC_53, _MCW_PC);
#endif
#endif
PClass::StaticInit ();


@ -132,10 +132,6 @@ extern int viewwindowy;
extern "C" int viewheight;
extern "C" int viewwidth;
extern "C" int halfviewwidth; // [RH] Half view width, for plane drawing
extern "C" int realviewwidth; // [RH] Physical width of view window
extern "C" int realviewheight; // [RH] Physical height of view window
extern "C" int detailxshift; // [RH] X shift for horizontal detail level
extern "C" int detailyshift; // [RH] Y shift for vertical detail level


@ -43,24 +43,62 @@
// Since this file is included by everything, it seems an appropriate place
// to check the NOASM/USEASM macros.
#if (!defined(_M_IX86) && !defined(__i386__)) || defined(__APPLE__)
// The assembly code requires an x86 processor.
// And needs to be tweaked for Mach-O before enabled on Macs.
#if defined(__APPLE__)
// The assembly code needs to be tweaked for Mach-O before enabled on Macs.
#ifndef NOASM
#define NOASM
#endif
#endif
// There are three assembly-related macros:
//
// NOASM - Assembly code is disabled
// X86_ASM - Using ia32 assembly code
// X64_ASM - Using amd64 assembly code
//
// Note that these relate only to using the pure assembly code. Inline
// assembly may still be used without respect to these macros, as
// deemed appropriate.
#ifndef NOASM
#ifndef USEASM
#define USEASM 1
// Select the appropriate type of assembly code to use.
#if defined(_M_IX86) || defined(__i386__)
#define X86_ASM
#ifdef X64_ASM
#undef X64_ASM
#endif
#elif defined(_M_X64) || defined(__amd64__)
#define X64_ASM
#ifdef X86_ASM
#undef X86_ASM
#endif
#else
#ifdef USEASM
#undef USEASM
#define NOASM
#endif
#endif
#ifdef NOASM
// Ensure no assembly macros are defined if NOASM is defined.
#ifdef X86_ASM
#undef X86_ASM
#endif
#ifdef X64_ASM
#undef X64_ASM
#endif
#endif
#if defined(_MSC_VER) || defined(__WATCOMC__)
#define STACK_ARGS __cdecl
#else


@ -842,7 +842,7 @@ private:
{
AWeaponHolder *hold = static_cast<AWeaponHolder*>(inv);
if (hold->PieceWeapon->TypeName == FourthWeaponNames[FourthWeaponClass])
if (hold->PieceWeapon->TypeName == FourthWeaponNames[(int)FourthWeaponClass])
{
// Weapon Pieces
if (oldpieces != hold->PieceMask)
@ -883,7 +883,7 @@ private:
}
if (oldpieces != 0)
{
DrawImage (ClassImages[FourthWeaponClass][imgWEAPONSLOT], 190, 0);
DrawImage (ClassImages[(int)FourthWeaponClass][imgWEAPONSLOT], 190, 0);
oldpieces = 0;
}
}


@ -203,9 +203,9 @@ void DSBarInfo::Draw (EHudState state)
if(SBarInfoScript->completeBorder) //Fill the statusbar with the border before we draw.
{
FTexture *b = TexMan[gameinfo.border->b];
R_DrawBorder(viewwindowx, viewwindowy + realviewheight + b->GetHeight(), viewwindowx + realviewwidth, SCREENHEIGHT);
R_DrawBorder(viewwindowx, viewwindowy + viewheight + b->GetHeight(), viewwindowx + viewwidth, SCREENHEIGHT);
if(screenblocks == 10)
screen->FlatFill(viewwindowx, viewwindowy + realviewheight, viewwindowx + realviewwidth, viewwindowy + realviewheight + b->GetHeight(), b, true);
screen->FlatFill(viewwindowx, viewwindowy + viewheight, viewwindowx + viewwidth, viewwindowy + viewheight + b->GetHeight(), b, true);
}
if(SBarInfoScript->automapbar && automapactive)
{


@ -1067,8 +1067,8 @@ void DBaseStatusBar::DrawCrosshair ()
}
screen->DrawTexture (CrosshairImage,
realviewwidth / 2 + viewwindowx,
realviewheight / 2 + viewwindowy,
viewwidth / 2 + viewwindowx,
viewheight / 2 + viewwindowy,
DTA_DestWidth, w,
DTA_DestHeight, h,
DTA_AlphaChannel, true,


@ -450,7 +450,6 @@ static void StartScoreboardMenu (void);
static void InitCrosshairsList();
EXTERN_CVAR (Bool, st_scale)
EXTERN_CVAR (Int, r_detail)
EXTERN_CVAR (Bool, r_stretchsky)
EXTERN_CVAR (Int, r_columnmethod)
EXTERN_CVAR (Bool, r_drawfuzz)
@ -464,14 +463,6 @@ EXTERN_CVAR (Int, screenblocks)
static TArray<valuestring_t> Crosshairs;
static value_t DetailModes[] =
{
{ 0.0, "Normal" },
{ 1.0, "Double Horizontally" },
{ 2.0, "Double Vertically" },
{ 3.0, "Double Horiz and Vert" }
};
static value_t ColumnMethods[] = {
{ 0.0, "Original" },
{ 1.0, "Optimized" }
@ -517,7 +508,6 @@ static menuitem_t VideoItems[] = {
{ slider, "Brightness", {&Gamma}, {1.0}, {3.0}, {0.1}, {NULL} },
{ discretes,"Crosshair", {&crosshair}, {8.0}, {0.0}, {0.0}, {NULL} },
{ discrete, "Column render mode", {&r_columnmethod}, {2.0}, {0.0}, {0.0}, {ColumnMethods} },
{ discrete, "Detail mode", {&r_detail}, {4.0}, {0.0}, {0.0}, {DetailModes} },
{ discrete, "Stretch short skies", {&r_stretchsky}, {2.0}, {0.0}, {0.0}, {OnOff} },
{ discrete, "Stretch status bar", {&st_scale}, {2.0}, {0.0}, {0.0}, {OnOff} },
{ discrete, "Alternative HUD", {&hud_althud}, {2.0}, {0.0}, {0.0}, {OnOff} },


@ -120,7 +120,7 @@ inline int BigLong (int x)
| ((((unsigned int)x)<<8) & 0xff0000)
| (((unsigned int)x)<<24));
}
#endif // USEASM
#endif
#endif // WORDS_BIGENDIAN


@ -1,537 +0,0 @@
;*
;* misc.nas
;* Miscellaneous assembly functions
;*
;*---------------------------------------------------------------------------
;* Copyright 1998-2006 Randy Heit
;* All rights reserved.
;*
;* Redistribution and use in source and binary forms, with or without
;* modification, are permitted provided that the following conditions
;* are met:
;*
;* 1. Redistributions of source code must retain the above copyright
;* notice, this list of conditions and the following disclaimer.
;* 2. Redistributions in binary form must reproduce the above copyright
;* notice, this list of conditions and the following disclaimer in the
;* documentation and/or other materials provided with the distribution.
;* 3. The name of the author may not be used to endorse or promote products
;* derived from this software without specific prior written permission.
;*
;* THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR
;* IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
;* OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
;* IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT,
;* INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
;* NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
;* DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
;* THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
;* (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF
;* THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
;*---------------------------------------------------------------------------
;*
BITS 32
%ifndef M_TARGET_LINUX
%define CheckMMX _CheckMMX
%define EndMMX _EndMMX
%define DoBlending_MMX _DoBlending_MMX
%define BestColor_MMX _BestColor_MMX
%define DoubleHoriz_MMX _DoubleHoriz_MMX
%define DoubleHorizVert_MMX _DoubleHorizVert_MMX
%define DoubleVert_ASM _DoubleVert_ASM
%endif
%ifdef M_TARGET_WATCOM
SEGMENT DATA PUBLIC ALIGN=16 CLASS=DATA USE32
SEGMENT DATA
%else
SECTION .data
%endif
Blending256:
dd 0x01000100,0x00000100
%ifdef M_TARGET_WATCOM
SEGMENT CODE PUBLIC ALIGN=16 CLASS=CODE USE32
SEGMENT CODE
%else
SECTION .text
%endif
;-----------------------------------------------------------
;
; CheckMMX
;
; Checks for the presence of MMX instructions on the
; current processor. This code is adapted from the samples
; in AMD's document entitled "AMD-K6™ MMX Processor
; Multimedia Extensions." Also fills in the vendor
; information string.
;
;-----------------------------------------------------------
GLOBAL CheckMMX
; void CheckMMX (struct CPUInfo *)
CheckMMX:
xor eax,eax
mov ecx,92/4
push ebx
push edi
mov edi,[esp+12]
rep stosd
sub edi,92
mov [edi+88],byte 32; Assume a 32-byte cache line
pushfd ; save EFLAGS
pop eax ; store EFLAGS in EAX
mov ebx,eax ; save in EBX for later testing
xor eax,0x00200000 ; toggle bit 21
push eax ; put to stack
popfd ; save changed EAX to EFLAGS
pushfd ; push EFLAGS to TOS
pop eax ; store EFLAGS in EAX
cmp eax,ebx ; see if bit 21 has changed
jz near .noid ; if no change, then no CPUID
; Get vendor ID
xor eax,eax
CPUID
mov [edi],ebx
mov [edi+4],edx
mov [edi+8],ecx
cmp ebx,0x68747541 ; 'htuA'
jne .notamd
cmp edx,0x69746e65 ; 'itne'
jne .notamd
cmp ecx,0x444d4163 ; 'DMAc'
jne .notamd
inc byte [edi+87]
.notamd:
; Get features flags and other info
mov eax,1
CPUID
mov [edi+68],ebx ; Store brand index and other stuff
mov [edi+72],ecx ; Store extended feature flags
mov [edi+76],edx ; Store feature flags
test edx,(1<<19) ; If CLFLUSH instruction is supported,
jz .noclf
shl bh,3 ; get the real cache line size.
mov [edi+88],bh
.noclf mov bl,al ; Extract stepping
and bl,0x0F
mov [edi+64],bl
mov bl,ah ; Extract processor type
shr bl,4 ; (Valid for Intel only)
and bl,0x03
mov [edi+67],bl
shr al,4 ; Extract model and family
and ah,0x0F ; model in al and family in ah
cmp ah,15
jne .noex
mov ebx,eax ; Add extended model and family
shr ebx,12
and bl,0xF0
add ah,bh
or al,bl
.noex mov [edi+65],al
mov [edi+66],ah
; Check for processor brand string
mov eax,0x80000000
CPUID
cmp eax,0x80000001
je .feat2
jb near .noid
cmp eax,0x80000004
jb .feat2
cmp eax,0x80000005
jb .brand
; Get data L1 cache info
mov eax,0x80000005
CPUID
mov [edi+88],ecx
; Get processor brand string
.brand mov eax,0x80000002
CPUID
mov [edi+16],eax
mov [edi+20],ebx
mov [edi+24],ecx
mov [edi+28],edx
mov eax,0x80000003
CPUID
mov [edi+32],eax
mov [edi+36],ebx
mov [edi+40],ecx
mov [edi+44],edx
mov eax,0x80000004
CPUID
mov [edi+48],eax
mov [edi+52],ebx
mov [edi+56],ecx
mov [edi+60],edx
; Get AMD-specific feature flags
.feat2 cmp byte [edi+87],0
jz .noid
mov eax,0x80000001
CPUID
mov [edi+80],edx
mov bl,al ; Extract stepping
and bl,0x0F
mov [edi+84],bl
shr al,4 ; Extract model and family
and ah,0x0F ; model in al and family in ah
cmp ah,15
jne .noex2
mov ebx,eax ; Add extended model and family
shr ebx,12
and bl,0xF0
add ah,bh
or al,bl
.noex2 mov [edi+85],al
mov [edi+86],ah
.noid pop edi
pop ebx
ret
;-----------------------------------------------------------
;
; EndMMX
;
; Signal the end of MMX code for compilers that can't
; do inline assembly. Currently unused.
;
;-----------------------------------------------------------
GLOBAL EndMMX
EndMMX:
emms
ret
;-----------------------------------------------------------
;
; DoBlending_MMX
;
; MMX version of DoBlending
;
; (DWORD *from, DWORD *to, count, tor, tog, tob, toa)
;-----------------------------------------------------------
GLOBAL DoBlending_MMX
DoBlending_MMX:
pxor mm0,mm0 ; mm0 = 0
mov eax,[esp+4*4]
shl eax,16
mov edx,[esp+4*5]
shl edx,8
or eax,[esp+4*6]
or eax,edx
mov ecx,[esp+4*3] ; ecx = count
movd mm1,eax ; mm1 = 00000000 00RRGGBB
mov eax,[esp+4*7]
shl eax,16
mov edx,[esp+4*7]
shl edx,8
or eax,[esp+4*7]
or eax,edx
mov edx,[esp+4*2] ; edx = dest
movd mm6,eax ; mm6 = 00000000 00AAAAAA
punpcklbw mm1,mm0 ; mm1 = 000000RR 00GG00BB
movq mm7,[Blending256]
punpcklbw mm6,mm0 ; mm6 = 000000AA 00AA00AA
mov eax,[esp+4*1] ; eax = source
pmullw mm1,mm6 ; mm1 = 000000RR 00GG00BB (multiplied by alpha)
psubusw mm7,mm6 ; mm7 = 000000aa 00aa00aa (one minus alpha)
nop ; Does this actually pair on a Pentium?
; Do two colors per iteration: Count must be even.
.loop movq mm2,[eax] ; mm2 = 00r2g2b2 00r1g1b1
add eax,8
movq mm3,mm2 ; mm3 = 00r2g2b2 00r1g1b1
punpcklbw mm2,mm0 ; mm2 = 000000r1 00g100b1
movq mm4,mm1
punpckhbw mm3,mm0 ; mm3 = 000000r2 00g200b2
pmullw mm2,mm7 ; mm2 = 0000r1rr g1ggb1bb
add edx,8
pmullw mm3,mm7 ; mm3 = 0000r2rr g2ggb2bb
sub ecx,2
paddusw mm2,mm1
paddusw mm3,mm1
psrlw mm2,8
psrlw mm3,8
packuswb mm2,mm3 ; mm2 = 00r2g2b2 00r1g1b1
movq [edx-8],mm2
jnz .loop
emms
ret
;-----------------------------------------------------------
;
; BestColor_MMX
;
; Picks the closest matching color from a palette
;
; Passed FFRRGGBB and palette array in same format
; FF is the index of the first palette entry to consider
;
;-----------------------------------------------------------
GLOBAL BestColor_MMX
GLOBAL @BestColor_MMX@8
BestColor_MMX:
mov ecx,[esp+4]
mov edx,[esp+8]
@BestColor_MMX@8:
pxor mm0,mm0
movd mm1,ecx ; mm1 = color searching for
mov eax,257*257+257*257+257*257 ;eax = bestdist
push ebx
punpcklbw mm1,mm0
mov ebx,ecx ; ebx = best color
shr ecx,24 ; ecx = count
and ebx,0xffffff
push esi
push ebp
.loop movd mm2,[edx+ecx*4] ; mm2 = color considering now
inc ecx
punpcklbw mm2,mm0
movq mm3,mm1
psubsw mm3,mm2
pmullw mm3,mm3 ; mm3 = color distance squared
movd ebp,mm3 ; add the three components
psrlq mm3,32 ; into ebp to get the real
mov esi,ebp ; (squared) distance
shr esi,16
and ebp,0xffff
add ebp,esi
movd esi,mm3
add ebp,esi
jz .perf ; found a perfect match
cmp eax,ebp
jb .skip
mov eax,ebp
lea ebx,[ecx-1]
.skip cmp ecx,256
jne .loop
mov eax,ebx
pop ebp
pop esi
pop ebx
emms
ret
.perf lea eax,[ecx-1]
pop ebp
pop esi
pop ebx
emms
ret
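For comparison, here is a plain C++ sketch of a closest-color search by squared RGB distance. It is not a translation of BestColor_MMX: `ClosestColor` and its `first`/`count` parameters are hypothetical, and the packed FFRRGGBB argument is replaced by an explicit start index. It does keep three details of the loop above: the initial best distance of 3*257*257, the early exit on a perfect match, and the `jb .skip` comparison, which lets a later entry replace an earlier one at equal distance.

```cpp
#include <cassert>
#include <cstdint>

// Returns the index in [first, first + count) of the palette entry
// closest to rgb. Colors are assumed to be 0x00RRGGBB.
inline int ClosestColor(uint32_t rgb, const uint32_t *pal, int first, int count)
{
    assert(pal != nullptr && first >= 0 && count > 0);
    const int r = (rgb >> 16) & 0xFF;
    const int g = (rgb >> 8) & 0xFF;
    const int b = rgb & 0xFF;

    int best = first;
    int bestdist = 257 * 257 * 3;   // larger than any real distance (3*255*255)
    for (int i = first; i < first + count; ++i)
    {
        const int dr = r - static_cast<int>((pal[i] >> 16) & 0xFF);
        const int dg = g - static_cast<int>((pal[i] >> 8) & 0xFF);
        const int db = b - static_cast<int>(pal[i] & 0xFF);
        const int dist = dr * dr + dg * dg + db * db;
        if (dist == 0)
        {
            return i;               // perfect match, stop searching
        }
        if (dist <= bestdist)       // ties go to the later entry
        {
            bestdist = dist;
            best = i;
        }
    }
    return best;
}
```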
;-----------------------------------------------------------
;
; DoubleHoriz_MMX
;
; Stretches an image horizontally using MMX instructions.
; The source image is assumed to occupy the right half
; of the destination image.
;
; height of source
; width of source
; dest pointer (at end of row)
; pitch
;
;-----------------------------------------------------------
GLOBAL DoubleHoriz_MMX
DoubleHoriz_MMX:
mov edx,[esp+8] ; edx = width
push edi
neg edx ; make edx negative so we can count up
mov edi,[esp+16] ; edi = dest pointer
sar edx,2 ; and make edx count groups of 4 pixels
push ebp
mov ebp,edx ; ebp = # of columns remaining in this row
push ebx
mov ebx,[esp+28] ; ebx = pitch
mov ecx,[esp+16] ; ecx = # of rows remaining
.loop movq mm0,[edi+ebp*4]
.loop2 movq mm1,mm0
punpcklbw mm0,mm0 ; double left 4 pixels
movq mm2,[edi+ebp*4+8]
punpckhbw mm1,mm1 ; double right 4 pixels
movq [edi+ebp*8],mm0 ; write left pixels
movq mm0,mm2
movq [edi+ebp*8+8],mm1 ; write right pixels
add ebp,2 ; increment counter
jnz .loop2 ; repeat until done with this row
add edi,ebx ; move edi to next row
dec ecx ; decrease row counter
mov ebp,edx ; prep ebp for next row
jnz .loop ; repeat until every row is done
emms
pop ebx
pop ebp
pop edi
ret
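The in-place stretch that DoubleHoriz_MMX performs can be sketched in scalar C++ as below. This is a sketch under assumptions: `DoubleHoriz` is a hypothetical name, and it takes a pointer to the start of each destination row, whereas the assembly is passed a pointer to the end of the row and counts a negative index up to zero. The point it illustrates is why the source may sit in the right half of the same row: at step x the write positions 2x and 2x+1 never pass the read position width+x, so no source pixel is overwritten before it is read.

```cpp
#include <cassert>
#include <cstdint>

// Doubles every pixel of each row horizontally, in place.
// Each destination row is 2*width bytes wide and the source pixels
// are assumed to occupy its right half.
inline void DoubleHoriz(int height, int width, uint8_t *rows, int pitch)
{
    assert(rows != nullptr && height >= 0 && width >= 0 && pitch >= 2 * width);
    for (int y = 0; y < height; ++y, rows += pitch)
    {
        const uint8_t *src = rows + width;   // right half of this row
        for (int x = 0; x < width; ++x)
        {
            const uint8_t c = src[x];        // read before the row is written
            rows[2 * x] = c;
            rows[2 * x + 1] = c;
        }
    }
}
```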
;-----------------------------------------------------------
;
; DoubleHorizVert_MMX
;
; Stretches an image horizontally and vertically using
; MMX instructions. The source image is assumed to occupy
; the right half of the destination image and to leave
; every other line unused for expansion.
;
; height of source
; width of source
; dest pointer (at end of row)
; pitch
;
;-----------------------------------------------------------
GLOBAL DoubleHorizVert_MMX
DoubleHorizVert_MMX:
mov edx,[esp+8] ; edx = width
push edi
neg edx ; make edx negative so we can count up
mov edi,[esp+16] ; edi = dest pointer
sar edx,2 ; and make edx count groups of 4 pixels
push ebp
mov ebp,edx ; ebp = # of columns remaining in this row
push ebx
mov ebx,[esp+28] ; ebx = pitch
mov ecx,[esp+16] ; ecx = # of rows remaining
push esi
lea esi,[edi+ebx]
.loop movq mm0,[edi+ebp*4] ; get 8 pixels
movq mm1,mm0
punpcklbw mm0,mm0 ; double left 4
punpckhbw mm1,mm1 ; double right 4
add ebp,2 ; increment counter
movq [edi+ebp*8-16],mm0 ; write them back out
movq [edi+ebp*8-8],mm1
movq [esi+ebp*8-16],mm0
movq [esi+ebp*8-8],mm1
jnz .loop ; repeat until done with this row
lea edi,[edi+ebx*2] ; move edi and esi to next row
lea esi,[esi+ebx*2]
dec ecx ; decrease row counter
mov ebp,edx ; prep ebp for next row
jnz .loop ; repeat until every row is done
emms
pop esi
pop ebx
pop ebp
pop edi
ret
;-----------------------------------------------------------
;
; DoubleVert_ASM
;
; Stretches an image vertically using regular x86
; instructions. The source image should be interleaved.
;
; height of source
; width of source
; source/dest pointer
; pitch
;
;-----------------------------------------------------------
GLOBAL DoubleVert_ASM
DoubleVert_ASM:
mov edx,[esp+16] ; edx = pitch
mov eax,[esp+4] ; eax = # of rows left
push esi
mov esi,[esp+16]
push edi
lea edi,[esi+edx]
shl edx,1 ; edx = pitch*2
mov ecx,[esp+16]
sub edx,ecx ; edx = dist from end of one line to start of next
shr ecx,2
.loop rep movsd
mov ecx,[esp+16]
add esi,edx
add edi,edx
shr ecx,2
dec eax
jnz .loop
pop edi
pop esi
ret


@ -794,7 +794,7 @@ AInventory *AActor::FindInventory (FName type)
AInventory *AActor::GiveInventoryType (const PClass *type)
{
AInventory *item;
AInventory *item = NULL;
if (type != NULL)
{


@ -69,19 +69,6 @@ int scaledviewwidth;
int viewwindowx;
int viewwindowy;
extern "C" {
int realviewwidth; // [RH] Physical width of view window
int realviewheight; // [RH] Physical height of view window
int detailxshift; // [RH] X shift for horizontal detail level
int detailyshift; // [RH] Y shift for vertical detail level
}
#ifdef USEASM
extern "C" void STACK_ARGS DoubleHoriz_MMX (int height, int width, BYTE *dest, int pitch);
extern "C" void STACK_ARGS DoubleHorizVert_MMX (int height, int width, BYTE *dest, int pitch);
extern "C" void STACK_ARGS DoubleVert_ASM (int height, int width, BYTE *dest, int pitch);
#endif
// [RH] Pointers to the different column drawers.
// These get changed depending on the current
// screen depth and asm/no asm.
@ -130,8 +117,6 @@ const BYTE* bufplce[4];
int dccount;
}
cycle_t DetailDoubleCycles;
int dc_fillcolor;
BYTE *dc_translation;
BYTE shadetables[NUMCOLORMAPS*16*256];
@ -161,7 +146,7 @@ EXTERN_CVAR (Int, r_columnmethod)
/* */
/************************************/
#ifndef USEASM
#ifndef X86_ASM
//
// A column is a vertical slice/span from a wall texture that,
// given the DOOM style restrictions on the view orientation,
@ -212,7 +197,7 @@ void R_DrawColumnP_C (void)
} while (--count);
}
}
#endif // USEASM
#endif
// [RH] Just fills a column with a color
void R_FillColumnP (void)
@ -404,7 +389,7 @@ void R_InitFuzzTable (int fuzzoff)
}
}
#ifndef USEASM
#ifndef X86_ASM
//
// Creates a fuzzy image by copying pixels from adjacent ones above and below.
// Used with an all black colormap, this could create the SHADOW effect,
@ -480,7 +465,7 @@ void R_DrawFuzzColumnP_C (void)
fuzzpos = fuzz;
}
}
#endif // USEASM
#endif
//
// R_DrawTranlucentColumn
@ -976,7 +961,7 @@ int dscount;
//
// Draws the actual span.
#if !defined(USEASM)
#ifndef X86_ASM
void R_DrawSpanP_C (void)
{
dsfixed_t xfrac;
@ -1256,14 +1241,21 @@ void R_FillSpan (void)
// wallscan stuff, in C
#ifndef USEASM
#ifndef X86_ASM
static DWORD STACK_ARGS vlinec1 ();
static void STACK_ARGS vlinec4 ();
static int vlinebits;
DWORD (STACK_ARGS *dovline1)() = vlinec1;
DWORD (STACK_ARGS *doprevline1)() = vlinec1;
#ifdef X64_ASM
extern "C" static void vlinetallasm4();
#define dovline4 vlinetallasm4
extern "C" void setupvlinetallasm (int);
#else
static void STACK_ARGS vlinec4 ();
void (STACK_ARGS *dovline4)() = vlinec4;
#endif
static DWORD STACK_ARGS mvlinec1();
static void STACK_ARGS mvlinec4();
@ -1281,8 +1273,8 @@ DWORD STACK_ARGS prevlineasm1 ();
DWORD STACK_ARGS vlinetallasm1 ();
DWORD STACK_ARGS prevlinetallasm1 ();
void STACK_ARGS vlineasm4 ();
void STACK_ARGS vlinetallasm4 ();
void STACK_ARGS vlinetallasmathlon4 ();
void STACK_ARGS vlinetallasm4 ();
void STACK_ARGS setupvlineasm (int);
void STACK_ARGS setupvlinetallasm (int);
@ -1301,7 +1293,7 @@ void (STACK_ARGS *domvline4)() = mvlineasm4;
void setupvline (int fracbits)
{
#ifdef USEASM
#ifdef X86_ASM
if (CPU.Family <= 5)
{
if (fracbits >= 24)
@ -1329,10 +1321,13 @@ void setupvline (int fracbits)
}
#else
vlinebits = fracbits;
#ifdef X64_ASM
setupvlinetallasm(fracbits);
#endif
#endif
}
#ifndef USEASM
#if !defined(X86_ASM)
DWORD STACK_ARGS vlinec1 ()
{
DWORD fracstep = dc_iscale;
@ -1374,7 +1369,7 @@ void STACK_ARGS vlinec4 ()
void setupmvline (int fracbits)
{
#if defined(USEASM)
#if defined(X86_ASM)
setupmvlineasm (fracbits);
domvline1 = mvlineasm1;
domvline4 = mvlineasm4;
@ -1383,7 +1378,7 @@ void setupmvline (int fracbits)
#endif
}
#ifndef USEASM
#if !defined(X86_ASM)
DWORD STACK_ARGS mvlinec1 ()
{
DWORD fracstep = dc_iscale;
@ -1863,17 +1858,17 @@ void R_DrawViewBorder (void)
SB_state = screen->GetPageCount ();
}
if (realviewwidth == SCREENWIDTH)
if (viewwidth == SCREENWIDTH)
{
return;
}
R_DrawBorder (0, 0, SCREENWIDTH, viewwindowy);
R_DrawBorder (0, viewwindowy, viewwindowx, realviewheight + viewwindowy);
R_DrawBorder (viewwindowx + realviewwidth, viewwindowy, SCREENWIDTH, realviewheight + viewwindowy);
R_DrawBorder (0, viewwindowy + realviewheight, SCREENWIDTH, ST_Y);
R_DrawBorder (0, viewwindowy, viewwindowx, viewheight + viewwindowy);
R_DrawBorder (viewwindowx + viewwidth, viewwindowy, SCREENWIDTH, viewheight + viewwindowy);
R_DrawBorder (0, viewwindowy + viewheight, SCREENWIDTH, ST_Y);
M_DrawFrame (viewwindowx, viewwindowy, realviewwidth, realviewheight);
M_DrawFrame (viewwindowx, viewwindowy, viewwidth, viewheight);
V_MarkRect (0, 0, SCREENWIDTH, ST_Y);
}
@ -1893,7 +1888,7 @@ void R_DrawTopBorder ()
FTexture *p;
int offset;
if (realviewwidth == SCREENWIDTH)
if (viewwidth == SCREENWIDTH)
return;
offset = gameinfo.border->offset;
@ -1901,135 +1896,34 @@ void R_DrawTopBorder ()
if (viewwindowy < 34)
{
R_DrawBorder (0, 0, viewwindowx, 34);
R_DrawBorder (viewwindowx, 0, viewwindowx+realviewwidth, viewwindowy);
R_DrawBorder (viewwindowx+realviewwidth, 0, SCREENWIDTH, 34);
R_DrawBorder (viewwindowx, 0, viewwindowx + viewwidth, viewwindowy);
R_DrawBorder (viewwindowx + viewwidth, 0, SCREENWIDTH, 34);
p = TexMan(gameinfo.border->t);
screen->FlatFill(viewwindowx, viewwindowy - p->GetHeight(),
viewwindowx + realviewwidth, viewwindowy, p, true);
viewwindowx + viewwidth, viewwindowy, p, true);
p = TexMan(gameinfo.border->l);
screen->FlatFill(viewwindowx - p->GetWidth(), viewwindowy,
viewwindowx, 35, p, true);
p = TexMan(gameinfo.border->r);
screen->FlatFill(viewwindowx + realviewwidth, viewwindowy,
viewwindowx + realviewwidth + p->GetWidth(), 35, p, true);
screen->FlatFill(viewwindowx + viewwidth, viewwindowy,
viewwindowx + viewwidth + p->GetWidth(), 35, p, true);
p = TexMan(gameinfo.border->tl);
screen->DrawTexture (p, viewwindowx-offset, viewwindowy - offset, TAG_DONE);
screen->DrawTexture (p, viewwindowx - offset, viewwindowy - offset, TAG_DONE);
p = TexMan(gameinfo.border->tr);
screen->DrawTexture (p, viewwindowx+realviewwidth, viewwindowy - offset, TAG_DONE);
screen->DrawTexture (p, viewwindowx + viewwidth, viewwindowy - offset, TAG_DONE);
}
else
{
R_DrawBorder (0, 0, SCREENWIDTH, 34);
}
}
// [RH] Double pixels in the view window horizontally
// and/or vertically (or not at all).
void R_DetailDouble ()
{
if (!viewactive) return;
DetailDoubleCycles = 0;
clock (DetailDoubleCycles);
switch ((detailxshift << 1) | detailyshift)
{
case 1: // y-double
#ifdef USEASM
DoubleVert_ASM (viewheight, viewwidth, dc_destorg, RenderTarget->GetPitch());
#else
{
int rowsize = realviewwidth;
int pitch = RenderTarget->GetPitch();
int y;
BYTE *line;
line = dc_destorg;
for (y = viewheight; y != 0; --y, line += pitch<<1)
{
memcpy (line+pitch, line, rowsize);
}
}
#endif
break;
case 2: // x-double
#ifdef USEASM
if (CPU.bMMX && (viewwidth&15)==0)
{
DoubleHoriz_MMX (viewheight, viewwidth, dc_destorg+viewwidth, RenderTarget->GetPitch());
}
else
#endif
{
int rowsize = viewwidth;
int pitch = RenderTarget->GetPitch();
int y,x;
BYTE *linefrom, *lineto;
linefrom = dc_destorg;
for (y = viewheight; y != 0; --y, linefrom += pitch)
{
lineto = linefrom - viewwidth;
for (x = 0; x < rowsize; ++x)
{
BYTE c = linefrom[x];
lineto[x*2] = c;
lineto[x*2+1] = c;
}
}
}
break;
case 3: // x- and y-double
#ifdef USEASM
if (CPU.bMMX && (viewwidth&15)==0 && 0)
{
DoubleHorizVert_MMX (viewheight, viewwidth, dc_destorg+viewwidth, RenderTarget->GetPitch());
}
else
#endif
{
int rowsize = viewwidth;
int realpitch = RenderTarget->GetPitch();
int pitch = realpitch << 1;
int y,x;
BYTE *linefrom, *lineto;
linefrom = dc_destorg;
for (y = viewheight; y != 0; --y, linefrom += pitch)
{
lineto = linefrom - viewwidth;
for (x = 0; x < rowsize; ++x)
{
BYTE c = linefrom[x];
lineto[x*2] = c;
lineto[x*2+1] = c;
lineto[x*2+realpitch] = c;
lineto[x*2+realpitch+1] = c;
}
}
}
break;
}
unclock (DetailDoubleCycles);
}
ADD_STAT(detail)
{
FString out;
out.Format ("doubling = %04.1f ms", (double)DetailDoubleCycles * 1000 * SecondsPerCycle);
return out;
}
// [RH] Initialize the column drawer pointers
void R_InitColumnDrawers ()
{
#ifdef USEASM
#ifdef X86_ASM
R_DrawColumn = R_DrawColumnP_ASM;
R_DrawColumnHoriz = R_DrawColumnHorizP_ASM;
R_DrawFuzzColumn = R_DrawFuzzColumnP_ASM;


@ -67,7 +67,12 @@ extern void (*R_DrawColumn)(void);
extern DWORD (STACK_ARGS *dovline1) ();
extern DWORD (STACK_ARGS *doprevline1) ();
#ifdef X64_ASM
#define dovline4 vlinetallasm4
extern "C" void vlinetallasm4();
#else
extern void (STACK_ARGS *dovline4) ();
#endif
extern void setupvline (int);
extern DWORD (STACK_ARGS *domvline1) ();
@ -151,7 +156,7 @@ void STACK_ARGS rt_addclamp4cols_asm (int sx, int yl, int yh);
extern void (STACK_ARGS *rt_map4cols)(int sx, int yl, int yh);
#ifdef USEASM
#ifdef X86_ASM
#define rt_copy1col rt_copy1col_asm
#define rt_copy4cols rt_copy4cols_asm
#define rt_map1col rt_map1col_asm
@ -175,7 +180,19 @@ void rt_initcols (void);
void R_DrawFogBoundary (int x1, int x2, short *uclip, short *dclip);
#ifndef USEASM
#ifdef X86_ASM
extern "C" void R_DrawColumnP_Unrolled (void);
extern "C" void R_DrawColumnHorizP_ASM (void);
extern "C" void R_DrawColumnP_ASM (void);
extern "C" void R_DrawFuzzColumnP_ASM (void);
void R_DrawTranslatedColumnP_C (void);
void R_DrawShadedColumnP_C (void);
extern "C" void R_DrawSpanP_ASM (void);
extern "C" void R_DrawSpanMaskedP_ASM (void);
#else
void R_DrawColumnHorizP_C (void);
void R_DrawColumnP_C (void);
void R_DrawFuzzColumnP_C (void);
@ -184,18 +201,6 @@ void R_DrawShadedColumnP_C (void);
void R_DrawSpanP_C (void);
void R_DrawSpanMaskedP_C (void);
#else /* USEASM */
extern "C" void R_DrawColumnP_Unrolled (void);
extern "C" void R_DrawColumnHorizP_ASM (void);
extern "C" void R_DrawColumnP_ASM (void);
extern "C" void R_DrawFuzzColumnP_ASM (void);
void R_DrawTranslatedColumnP_C (void);
void R_DrawShadedColumnP_C (void);
extern "C" void R_DrawSpanP_ASM (void);
extern "C" void R_DrawSpanMaskedP_ASM (void);
#endif
void R_DrawSpanTranslucentP_C (void);
@ -232,10 +237,6 @@ extern FDynamicColormap ShadeFakeColormap[16];
extern BYTE identitymap[256];
extern BYTE *dc_translation;
// [RH] Double view pixels by detail mode
void R_DetailDouble (void);
// If the view size is not full screen, draws a border around it.
void R_DrawViewBorder (void);


@ -59,13 +59,13 @@ unsigned int dc_tspans[4][MAXHEIGHT];
unsigned int *dc_ctspan[4];
unsigned int *horizspan[4];
#ifdef USEASM
#ifdef X86_ASM
extern "C" void R_SetupShadedCol();
extern "C" void R_SetupAddCol();
extern "C" void R_SetupAddClampCol();
#endif
#ifndef USEASM
#ifndef X86_ASM
// Copies one span at hx to the screen at sx.
void rt_copy1col_c (int hx, int sx, int yl, int yh)
{
@ -218,7 +218,7 @@ void STACK_ARGS rt_map4cols_c (int sx, int yl, int yh)
dest += pitch*2;
} while (--count);
}
#endif /* !USEASM */
#endif
void rt_Translate1col(const BYTE *translation, int hx, int yl, int yh)
{
@ -850,7 +850,7 @@ void rt_draw4cols (int sx)
dc_ctspan[x][1] = screen->GetHeight();
}
#ifdef USEASM
#ifdef X86_ASM
// Setup assembly routines for changed colormaps or other parameters.
if (hcolfunc_post4 == rt_shaded4cols)
{


@ -191,7 +191,7 @@ bool foggy; // [RH] ignore extralight and fullbright?
int r_actualextralight;
bool setsizeneeded;
int setblocks, setdetail = -1;
int setblocks;
fixed_t freelookviewheight;
@ -516,8 +516,8 @@ void R_SetVisibility (float vis)
else
r_WallVisibility = r_BaseVisibility;
r_WallVisibility = FixedMul (Scale (InvZtoScale, SCREENWIDTH*(BaseRatioSizes[WidescreenRatio][1]<<detailyshift),
(viewwidth<<detailxshift)*SCREENHEIGHT*3), FixedMul (r_WallVisibility, FocalTangent));
r_WallVisibility = FixedMul (Scale (InvZtoScale, SCREENWIDTH*BaseRatioSizes[WidescreenRatio][1],
viewwidth*SCREENHEIGHT*3), FixedMul (r_WallVisibility, FocalTangent));
// Prevent overflow on floors/ceilings. Note that the calculation of
// MaxVisForFloor means that planes less than two units from the player's
@ -562,48 +562,6 @@ void R_SetViewSize (int blocks)
setblocks = blocks;
}
//==========================================================================
//
// CVAR r_detail
//
// Selects a pixel doubling mode
//
//==========================================================================
CUSTOM_CVAR (Int, r_detail, 0, CVAR_ARCHIVE|CVAR_GLOBALCONFIG)
{
static bool badrecovery = false;
if (badrecovery)
{
badrecovery = false;
return;
}
if (self < 0 || self > 3)
{
Printf ("Bad detail mode. (Use 0-3)\n");
badrecovery = true;
self = (detailyshift << 1) | detailxshift;
return;
}
setdetail = self;
setsizeneeded = true;
}
//==========================================================================
//
// R_SetDetail
//
//==========================================================================
void R_SetDetail (int detail)
{
detailxshift = detail & 1;
detailyshift = (detail >> 1) & 1;
}
//==========================================================================
//
// R_SetWindow
@ -616,19 +574,19 @@ void R_SetWindow (int windowSize, int fullWidth, int fullHeight, int stHeight)
if (windowSize >= 11)
{
realviewwidth = fullWidth;
freelookviewheight = realviewheight = fullHeight;
viewwidth = fullWidth;
freelookviewheight = viewheight = fullHeight;
}
else if (windowSize == 10)
{
realviewwidth = fullWidth;
realviewheight = stHeight;
viewwidth = fullWidth;
viewheight = stHeight;
freelookviewheight = fullHeight;
}
else
{
realviewwidth = ((setblocks*fullWidth)/10) & (~15);
realviewheight = ((setblocks*stHeight)/10)&~7;
viewwidth = ((setblocks*fullWidth)/10) & (~15);
viewheight = ((setblocks*stHeight)/10)&~7;
freelookviewheight = ((setblocks*fullHeight)/10)&~7;
}
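The rounding in the last branch of this hunk can be tried in isolation with the sketch below. `ViewSize` and `ScaledViewSize` are hypothetical names; only the two expressions are taken from the hunk. The view is scaled in tenths of the full size, then the width is rounded down to a multiple of 16 and the height to a multiple of 8 by masking off the low bits.

```cpp
#include <cassert>

// Hypothetical result type for the sketch.
struct ViewSize
{
    int width;
    int height;
};

// blocks is the view size in tenths of the full width / status bar height.
inline ViewSize ScaledViewSize(int blocks, int fullWidth, int stHeight)
{
    assert(blocks >= 0 && fullWidth >= 0 && stHeight >= 0);
    ViewSize v;
    v.width  = ((blocks * fullWidth) / 10) & ~15;  // round down to a multiple of 16
    v.height = ((blocks * stHeight) / 10) & ~7;    // round down to a multiple of 8
    return v;
}
```

With blocks = 3, a full width of 1024 and a status bar height of 736, the scaled values 307 and 220 are rounded down to 304 and 216.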
@ -637,10 +595,7 @@ void R_SetWindow (int windowSize, int fullWidth, int fullHeight, int stHeight)
DrawFSHUD = (windowSize == 11);
viewwidth = realviewwidth >> detailxshift;
viewheight = realviewheight >> detailyshift;
fuzzviewheight = viewheight - 2; // Maximum row the fuzzer can draw to
freelookviewheight >>= detailyshift;
halfviewwidth = (viewwidth >> 1) - 1;
if (!bRenderingToCanvas)
@ -659,8 +614,8 @@ void R_SetWindow (int windowSize, int fullWidth, int fullHeight, int stHeight)
centerxfrac = centerx<<FRACBITS;
centeryfrac = centery<<FRACBITS;
virtwidth = fullWidth >> detailxshift;
virtheight = fullHeight >> detailyshift;
virtwidth = fullWidth;
virtheight = fullHeight;
if (WidescreenRatio & 4)
{
virtheight = virtheight * BaseRatioSizes[WidescreenRatio][3] / 48;
@ -692,8 +647,8 @@ void R_SetWindow (int windowSize, int fullWidth, int fullHeight, int stHeight)
R_InitTextureMapping ();
MaxVisForWall = FixedMul (Scale (InvZtoScale, SCREENWIDTH*(r_Yaspect<<detailyshift),
(viewwidth<<detailxshift)*SCREENHEIGHT), FocalTangent);
MaxVisForWall = FixedMul (Scale (InvZtoScale, SCREENWIDTH*r_Yaspect,
viewwidth*SCREENHEIGHT), FocalTangent);
MaxVisForWall = FixedDiv (0x7fff0000, MaxVisForWall);
MaxVisForFloor = Scale (FixedDiv (0x7fff0000, viewheight<<(FRACBITS-2)), FocalLengthY, 160*FRACUNIT);
@ -712,20 +667,13 @@ void R_ExecuteSetViewSize ()
setsizeneeded = false;
BorderNeedRefresh = screen->GetPageCount ();
if (setdetail >= 0)
{
R_SetDetail (setdetail);
setdetail = -1;
}
R_SetWindow (setblocks, SCREENWIDTH, SCREENHEIGHT, ST_Y);
// Handle resize, e.g. smaller view windows with border and/or status bar.
viewwindowx = (screen->GetWidth() - (viewwidth<<detailxshift))>>1;
viewwindowx = (screen->GetWidth() - viewwidth) >> 1;
// Same with base row offset.
viewwindowy = ((viewwidth<<detailxshift) == screen->GetWidth()) ?
0 : (ST_Y-(viewheight<<detailyshift)) >> 1;
viewwindowy = (viewwidth == screen->GetWidth()) ? 0 : (ST_Y - viewheight) >> 1;
}
//==========================================================================
@ -762,7 +710,7 @@ CUSTOM_CVAR (Int, r_columnmethod, 1, CVAR_ARCHIVE|CVAR_GLOBALCONFIG)
}
else
{ // Trigger the change
r_detail.Callback ();
setsizeneeded = true;
}
}
@ -1434,29 +1382,20 @@ void R_EnterMirror (drawseg_t *ds, int depth)
//
//==========================================================================
void R_SetupBuffer (bool inview)
void R_SetupBuffer ()
{
static BYTE *lastbuff = NULL;
int pitch = RenderTarget->GetPitch();
BYTE *lineptr = RenderTarget->GetBuffer() + viewwindowy*pitch + viewwindowx;
if (inview)
{
pitch <<= detailyshift;
}
if (detailxshift)
{
lineptr += viewwidth;
}
if (dc_pitch != pitch || lineptr != lastbuff)
{
if (dc_pitch != pitch)
{
dc_pitch = pitch;
R_InitFuzzTable (pitch);
#ifdef USEASM
#if defined(X86_ASM) || defined(X64_ASM)
ASM_PatchPitch ();
#endif
}
@ -1478,7 +1417,7 @@ void R_RenderActorView (AActor *actor, bool dontmaplines)
{
WallCycles = PlaneCycles = MaskedCycles = WallScanCycles = 0;
R_SetupBuffer (true);
R_SetupBuffer ();
R_SetupFrame (actor);
// Clear buffers.
@ -1569,17 +1508,8 @@ void R_RenderActorView (AActor *actor, bool dontmaplines)
}
}
WallMirrors.Clear ();
interpolator.RestoreInterpolations ();
// If there is vertical doubling, and the view window is not an even height,
// draw a black line at the bottom of the view window.
if (detailyshift && viewwindowy == 0 && (realviewheight & 1))
{
screen->Clear (0, realviewheight-1, realviewwidth, realviewheight, 0, 0);
}
R_SetupBuffer (false);
R_SetupBuffer ();
}
//==========================================================================
@ -1593,16 +1523,12 @@ void R_RenderActorView (AActor *actor, bool dontmaplines)
void R_RenderViewToCanvas (AActor *actor, DCanvas *canvas,
int x, int y, int width, int height, bool dontmaplines)
{
const int saveddetail = detailxshift | (detailyshift << 1);
const bool savedviewactive = viewactive;
detailxshift = detailyshift = 0;
realviewwidth = viewwidth = width;
viewwidth = width;
RenderTarget = canvas;
bRenderingToCanvas = true;
R_SetDetail (0);
R_SetWindow (12, width, height, height);
viewwindowx = x;
viewwindowy = y;
@ -1612,10 +1538,9 @@ void R_RenderViewToCanvas (AActor *actor, DCanvas *canvas,
RenderTarget = screen;
bRenderingToCanvas = false;
R_SetDetail (saveddetail);
R_ExecuteSetViewSize ();
screen->Lock (true);
R_SetupBuffer (false);
R_SetupBuffer ();
screen->Unlock ();
viewactive = savedviewactive;
}


@ -128,10 +128,6 @@ extern int fixedlightlev;
extern lighttable_t* fixedcolormap;
// [RH] New detail modes
extern "C" int detailxshift;
extern "C" int detailyshift;
//
// Function pointers to switch refresh/drawing functions.
// Used to select shadow mode etc.
@ -190,7 +186,7 @@ void R_SetViewAngle ();
// Called by G_Drawer.
void R_RenderActorView (AActor *actor, bool dontmaplines = false);
void R_RefreshViewBorder ();
void R_SetupBuffer (bool inview);
void R_SetupBuffer ();
void R_RenderViewToCanvas (AActor *actor, DCanvas *canvas, int x, int y, int width, int height, bool dontmaplines = false);


@ -134,7 +134,7 @@ static fixed_t xscale, yscale;
static DWORD xstepscale, ystepscale;
static DWORD basexfrac, baseyfrac;
#ifdef USEASM
#ifdef X86_ASM
extern "C" void R_SetSpanSource_ASM (const BYTE *flat);
extern "C" void STACK_ARGS R_SetSpanSize_ASM (int xbits, int ybits);
extern "C" void R_SetSpanColormap_ASM (BYTE *colormap);
@ -210,7 +210,7 @@ void R_MapPlane (int y, int x1)
FixedMul (GlobVis, abs (centeryfrac - (y << FRACBITS))), planeshade) << COLORMAPSHIFT);
}
#ifdef USEASM
#ifdef X86_ASM
if (ds_colormap != ds_curcolormap)
R_SetSpanColormap_ASM (ds_colormap);
#endif
@ -469,7 +469,7 @@ void R_ClearPlanes (bool fullclear)
// [RH] clip ceiling to console bottom
clearbufshort (ceilingclip, viewwidth,
!screen->Accel2D && ConBottom > viewwindowy && !bRenderingToCanvas
? ((ConBottom - viewwindowy) >> detailyshift) : 0);
? (ConBottom - viewwindowy) : 0);
lastopening = 0;
}
@ -988,7 +988,7 @@ void R_DrawSinglePlane (visplane_t *pl, fixed_t alpha, bool masked)
}
pl->xscale = MulScale16 (pl->xscale, tex->xScale);
pl->yscale = MulScale16 (pl->yscale, tex->yScale);
#ifdef USEASM
#ifdef X86_ASM
R_SetSpanSize_ASM (ds_xbits, ds_ybits);
#endif
ds_source = tex->GetPixels ();
@ -1344,7 +1344,7 @@ void R_DrawSkyPlane (visplane_t *pl)
void R_DrawNormalPlane (visplane_t *pl, fixed_t alpha, bool masked)
{
#ifdef USEASM
#ifdef X86_ASM
if (ds_source != ds_cursource)
{
R_SetSpanSource_ASM (ds_source);
@ -1550,7 +1550,7 @@ void R_DrawTiltedPlane (visplane_t *pl, fixed_t alpha, bool masked)
}
}
#if defined(USEASM)
#if defined(X86_ASM)
if (ds_source != ds_curtiltedsource)
R_SetTiltedSpanSource_ASM (ds_source);
R_MapVisPlane (pl, R_DrawTiltedPlane_ASM);


@ -57,7 +57,6 @@ CUSTOM_CVAR (Bool, r_stretchsky, true, CVAR_ARCHIVE)
R_InitSkyMap ();
}
extern "C" int detailxshift, detailyshift;
extern fixed_t freelookviewheight;
//==========================================================================
@ -107,8 +106,8 @@ void R_InitSkyMap ()
if (viewwidth && viewheight)
{
skyiscale = (r_Yaspect*FRACUNIT) / (((freelookviewheight<<detailxshift) * viewwidth) / (viewwidth<<detailxshift));
skyscale = ((((freelookviewheight<<detailxshift) * viewwidth) / (viewwidth<<detailxshift)) << FRACBITS) /
skyiscale = (r_Yaspect*FRACUNIT) / ((freelookviewheight * viewwidth) / viewwidth);
skyscale = (((freelookviewheight * viewwidth) / viewwidth) << FRACBITS) /
(r_Yaspect);
skyiscale = Scale (skyiscale, FieldOfView, 2048);


@ -33,9 +33,7 @@
//
extern "C" int viewwidth;
extern "C" int realviewwidth;
extern "C" int viewheight;
extern "C" int realviewheight;
// Sprite....
extern int firstspritelump;


@ -1583,7 +1583,7 @@ void R_DrawPSprite (pspdef_t* psp, int pspnum, AActor *owner, fixed_t sx, fixed_
if (camera->player && (RenderTarget != screen ||
realviewheight == RenderTarget->GetHeight() ||
viewheight == RenderTarget->GetHeight() ||
(RenderTarget->GetWidth() > 320 && !st_scale)))
{ // Adjust PSprite for fullscreen views
AWeapon *weapon = NULL;
@ -1593,7 +1593,7 @@ void R_DrawPSprite (pspdef_t* psp, int pspnum, AActor *owner, fixed_t sx, fixed_
}
if (pspnum <= ps_flash && weapon != NULL && weapon->YAdjust != 0)
{
if (RenderTarget != screen || realviewheight == RenderTarget->GetHeight())
if (RenderTarget != screen || viewheight == RenderTarget->GetHeight())
{
vis->texturemid -= weapon->YAdjust;
}
@ -2502,7 +2502,7 @@ void R_DrawParticle (vissprite_t *vis)
fg = fg2rgb[color];
}
spacing = (RenderTarget->GetPitch()<<detailyshift) - countbase;
spacing = RenderTarget->GetPitch() - countbase;
dest = ylookup[yl] + x1 + dc_destorg;
do


@ -63,7 +63,7 @@
EXTERN_CVAR (String, language)
#ifdef USEASM
#if defined(X86_ASM) || defined(X64_ASM)
extern "C" void STACK_ARGS CheckMMX (CPUInfo *cpu);
#endif
@ -182,7 +182,7 @@ void SetLanguageIDs ()
//
void I_Init (void)
{
#ifndef USEASM
#if !defined(X86_ASM) && !defined(X64_ASM)
memset (&CPU, 0, sizeof(CPU));
#else
CheckMMX (&CPU);


@ -100,14 +100,11 @@ CUSTOM_CVAR (Float, Gamma, 1.f, CVAR_ARCHIVE|CVAR_GLOBALCONFIG)
/* Palette management stuff */
/****************************/
extern "C"
{
BYTE BestColor_MMX (DWORD rgb, const DWORD *pal);
}
extern "C" BYTE BestColor_MMX (DWORD rgb, const DWORD *pal);
int BestColor (const uint32 *pal_in, int r, int g, int b, int first, int num)
{
#ifdef USEASM
#ifdef X86_ASM
if (CPU.bMMX)
{
int pre = 256 - num - first;
@ -120,9 +117,10 @@ int BestColor (const uint32 *pal_in, int r, int g, int b, int first, int num)
for (int color = first; color < num; color++)
{
int dist = (r-pal[color].r)*(r-pal[color].r)+
(g-pal[color].g)*(g-pal[color].g)+
(b-pal[color].b)*(b-pal[color].b);
int x = r - pal[color].r;
int y = g - pal[color].g;
int z = b - pal[color].b;
int dist = x*x + y*y + z*z;
if (dist < bestdist)
{
if (dist == 0)
@ -454,10 +452,8 @@ void InitPalette ()
}
extern "C"
{
void STACK_ARGS DoBlending_MMX (const PalEntry *from, PalEntry *to, int count, int r, int g, int b, int a);
}
extern "C" void STACK_ARGS DoBlending_MMX (const PalEntry *from, PalEntry *to, int count, int r, int g, int b, int a);
extern void DoBlending_SSE2 (const PalEntry *from, PalEntry *to, int count, int r, int g, int b, int a);
void DoBlending (const PalEntry *from, PalEntry *to, int count, int r, int g, int b, int a)
{
@ -478,29 +474,51 @@ void DoBlending (const PalEntry *from, PalEntry *to, int count, int r, int g, in
to[i] = t;
}
}
#ifdef USEASM
else if (CPU.bMMX && !(count & 1))
else if (CPU.bSSE2)
{
DoBlending_MMX (from, to, count, r, g, b, a);
}
#endif
else
{
int i, ia;
ia = 256 - a;
r *= a;
g *= a;
b *= a;
for (i = count; i > 0; i--, to++, from++)
if (count >= 4)
{
to->r = (r + from->r*ia) >> 8;
to->g = (g + from->g*ia) >> 8;
to->b = (b + from->b*ia) >> 8;
int not3count = count & ~3;
DoBlending_SSE2 (from, to, not3count, r, g, b, a);
count &= 3;
if (count <= 0)
{
return;
}
from += not3count;
to += not3count;
}
}
#ifdef X86_ASM
else if (CPU.bMMX)
{
if (count >= 4)
{
int not3count = count & ~3;
DoBlending_MMX (from, to, not3count, r, g, b, a);
count &= 3;
if (count <= 0)
{
return;
}
from += not3count;
to += not3count;
}
}
#endif
int i, ia;
ia = 256 - a;
r *= a;
g *= a;
b *= a;
for (i = count; i > 0; i--, to++, from++)
{
to->r = (r + from->r * ia) >> 8;
to->g = (g + from->g * ia) >> 8;
to->b = (b + from->b * ia) >> 8;
}
}
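The way the new DoBlending hands a multiple of four entries to a vector routine and finishes the remainder with the scalar loop can be sketched on its own as follows. All names here (`Color`, `BlendScalar`, `BlendFour`, `Blend`) are hypothetical. `BlendFour` is only a stand-in that enforces the multiple-of-four restriction and reuses the scalar arithmetic; it is not the SSE2 or MMX code. The sketch assumes r, g and b are in 0..255 and a is in 0..256.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for PalEntry.
struct Color
{
    uint8_t b, g, r, a;
};

// Same arithmetic as the scalar loop: out = (c*a + from*(256 - a)) >> 8.
inline void BlendScalar(const Color *from, Color *to, int count,
                        int r, int g, int b, int a)
{
    assert(a >= 0 && a <= 256);
    const int ia = 256 - a;
    r *= a;
    g *= a;
    b *= a;
    for (int i = 0; i < count; ++i)
    {
        to[i].r = static_cast<uint8_t>((r + from[i].r * ia) >> 8);
        to[i].g = static_cast<uint8_t>((g + from[i].g * ia) >> 8);
        to[i].b = static_cast<uint8_t>((b + from[i].b * ia) >> 8);
    }
}

// Stand-in for a routine that only accepts a multiple of four entries.
inline void BlendFour(const Color *from, Color *to, int count,
                      int r, int g, int b, int a)
{
    assert((count & 3) == 0);
    BlendScalar(from, to, count, r, g, b, a);
}

// Bulk part goes to the restricted routine, the 0..3 leftovers to the loop.
inline void Blend(const Color *from, Color *to, int count,
                  int r, int g, int b, int a)
{
    if (count >= 4)
    {
        const int bulk = count & ~3;     // largest multiple of 4 <= count
        BlendFour(from, to, bulk, r, g, b, a);
        count &= 3;                      // entries still to do
        from += bulk;
        to += bulk;
    }
    BlendScalar(from, to, count, r, g, b, a);
}
```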
void V_SetBlend (int blendr, int blendg, int blendb, int blenda)


@ -1192,7 +1192,6 @@ void DFrameBuffer::PrecacheTexture(FTexture *tex, int cache)
void DFrameBuffer::RenderView(player_t *player)
{
R_RenderActorView (player->mo);
R_DetailDouble (); // [RH] Apply detail mode expansion
// [RH] Let cameras draw onto textures that were visible this frame.
FCanvasTextureInfo::UpdateAll ();
}
@ -1317,7 +1316,7 @@ bool V_DoModeSetup (int width, int height, int bits)
RenderTarget = screen;
screen->Lock (true);
R_SetupBuffer (false);
R_SetupBuffer ();
screen->Unlock ();
M_RefreshModesList ();


@ -458,7 +458,7 @@ FString V_GetColorStringByName (const char *name);
// Tries to get color by name, then by string
int V_GetColor (const DWORD *palette, const char *str);
#ifdef USEASM
#if defined(X86_ASM) || defined(X64_ASM)
extern "C" void ASM_PatchPitch (void);
#endif


@ -67,9 +67,7 @@
EXTERN_CVAR (String, language)
#ifdef USEASM
extern "C" void STACK_ARGS CheckMMX (CPUInfo *cpu);
#endif
extern void CheckCPUID(CPUInfo *cpu);
extern "C"
{
@ -344,12 +342,10 @@ void SetLanguageIDs ()
//
// I_Init
//
void I_Init (void)
{
#ifndef USEASM
memset (&CPU, 0, sizeof(CPU));
#else
CheckMMX (&CPU);
CheckCPUID(&CPU);
CalculateCPUSpeed ();
// Why does Intel right-justify this string?
@ -367,7 +363,6 @@ void I_Init (void)
}
}
#endif
if (CPU.VendorID[0])
{
Printf ("CPU Vendor ID: %s\n", CPU.VendorID);
@ -396,7 +391,6 @@ void I_Init (void)
Printf ("\n");
}
// Use a timer event if possible
NewTicArrived = CreateEvent (NULL, FALSE, FALSE, NULL);
if (NewTicArrived)
@ -484,7 +478,7 @@ void CalculateCPUSpeed ()
Printf ("Can't determine CPU speed, so pretending.\n");
}
Printf ("CPU Speed: %f MHz\n", CyclesPerSecond / 1e6);
Printf ("CPU Speed: %.0f MHz\n", CyclesPerSecond / 1e6);
}
//


@ -52,23 +52,23 @@ extern os_t OSPlatform;
struct CPUInfo // 92 bytes
{
char VendorID[16];
char CPUString[48];
char VendorID[16]; // 0
char CPUString[48]; // 16
BYTE Stepping;
BYTE Model;
BYTE Family;
BYTE Type;
BYTE Stepping; // 64
BYTE Model; // 65
BYTE Family; // 66
BYTE Type; // 67
BYTE BrandIndex;
BYTE CLFlush;
BYTE CPUCount;
BYTE APICID;
BYTE BrandIndex; // 68
BYTE CLFlush; // 69
BYTE CPUCount; // 70
BYTE APICID; // 71
DWORD bSSE3:1;
DWORD bSSE3:1; // 72
DWORD DontCare1:31;
DWORD bFPU:1;
DWORD bFPU:1; // 76
DWORD bVME:1;
DWORD bDE:1;
DWORD bPSE:1;
@ -76,7 +76,7 @@ struct CPUInfo // 92 bytes
DWORD bMSR:1;
DWORD bPAE:1;
DWORD bMCE:1;
DWORD bCX8:1;
DWORD bCX8:1; // 77
DWORD bAPIC:1;
DWORD bReserved1:1;
DWORD bSEP:1;
@ -84,7 +84,7 @@ struct CPUInfo // 92 bytes
DWORD bPGE:1;
DWORD bMCA:1;
DWORD bCMOV:1;
DWORD bPAT:1;
DWORD bPAT:1; // 78
DWORD bPSE36:1;
DWORD bPSN:1;
DWORD bCFLUSH:1;
@ -92,7 +92,7 @@ struct CPUInfo // 92 bytes
DWORD bDS:1;
DWORD bACPI:1;
DWORD bMMX:1;
DWORD bFXSR:1;
DWORD bFXSR:1; // 79
DWORD bSSE:1;
DWORD bSSE2:1;
DWORD bSS:1;
@ -101,22 +101,22 @@ struct CPUInfo // 92 bytes
DWORD bReserved3:1;
DWORD bPBE:1;
DWORD DontCare2:22;
DWORD DontCare2:22; // 80
DWORD bMMXPlus:1; // AMD's MMX extensions
DWORD bMMXAgain:1; // Just a copy of bMMX above
DWORD DontCare3:6;
DWORD b3DNowPlus:1;
DWORD b3DNow:1;
BYTE AMDStepping;
BYTE AMDModel;
BYTE AMDFamily;
BYTE bIsAMD;
BYTE AMDStepping; // 84
BYTE AMDModel; // 85
BYTE AMDFamily; // 86
BYTE bIsAMD; // 87
BYTE DataL1LineSize;
BYTE DataL1LinesPerTag;
BYTE DataL1Associativity;
BYTE DataL1SizeKB;
BYTE DataL1LineSize; // 88
BYTE DataL1LinesPerTag; // 89
BYTE DataL1Associativity;//90
BYTE DataL1SizeKB; // 91
};
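The offset comments added to CPUInfo matter because the assembly addresses the structure by raw displacement, for example `mov [edi+84],bl` for AMDStepping. One way to keep such comments honest is to let the compiler check them. The sketch below is an assumption-laden illustration, not the project's code: `CPUInfoLayout` is a hypothetical mirror of the structure in which each DWORD of bit-fields is collapsed into one `uint32_t`, because `offsetof` cannot be applied to bit-field members. It assumes the common ABI where `uint32_t` is 4 bytes with 4-byte alignment.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical mirror of CPUInfo; the three bit-field DWORDs at
// offsets 72, 76 and 80 are represented by FeatureFlags[0..2].
struct CPUInfoLayout
{
    char VendorID[16];                                  // 0
    char CPUString[48];                                 // 16
    uint8_t Stepping, Model, Family, Type;              // 64..67
    uint8_t BrandIndex, CLFlush, CPUCount, APICID;      // 68..71
    uint32_t FeatureFlags[3];                           // 72, 76, 80
    uint8_t AMDStepping, AMDModel, AMDFamily, bIsAMD;   // 84..87
    uint8_t DataL1LineSize, DataL1LinesPerTag,
            DataL1Associativity, DataL1SizeKB;          // 88..91
};

// These fail at compile time if the layout drifts away from the
// displacements the assembly uses.
static_assert(offsetof(CPUInfoLayout, Stepping) == 64, "Stepping moved");
static_assert(offsetof(CPUInfoLayout, FeatureFlags) == 72, "flags moved");
static_assert(offsetof(CPUInfoLayout, AMDStepping) == 84, "AMDStepping moved");
static_assert(offsetof(CPUInfoLayout, AMDModel) == 85, "AMDModel moved");
static_assert(offsetof(CPUInfoLayout, AMDFamily) == 86, "AMDFamily moved");
static_assert(sizeof(CPUInfoLayout) == 92, "CPUInfo is expected to be 92 bytes");
```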
