Watcom C/C++ User's Guide
The Watcom C/C++ compiler contains many options for controlling the code to
be produced.  No single set of compiler options will produce the absolute
fastest execution times for all possible applications.  With that said, we
list below the compiler options that we think will give the best execution
times for most applications.  You may have to experiment with different
options to see which combination generates the fastest code for your
particular application.

The recommended options for generating the fastest 16-bit Intel code are:

Pentium Pro
    /oneatx /oh /oi+ /ei /zp8 /6 /fpi87 /fp6

Pentium
    /oneatx /oh /oi+ /ei /zp8 /5 /fpi87 /fp5

486
    /oneatx /oh /oi+ /ei /zp8 /4 /fpi87 /fp3

386
    /oneatx /oh /oi+ /ei /zp8 /3 /fpi87 /fp3

286
    /oneatx /oh /oi+ /ei /zp8 /2 /fpi87 /fp2

186
    /oneatx /oh /oi+ /ei /zp8 /1 /fpi87

8086
    /oneatx /oh /oi+ /ei /zp8 /0 /fpi87

The recommended options for generating the fastest 32-bit Intel code are:

Pentium Pro
    /oneatx /oh /oi+ /ei /zp8 /6 /fp6

Pentium
    /oneatx /oh /oi+ /ei /zp8 /5 /fp5

486
    /oneatx /oh /oi+ /ei /zp8 /4 /fp3

386
    /oneatx /oh /oi+ /ei /zp8 /3 /fp3

The "oi+" option is for C++ only.  Under some circumstances, the "ob" and
"ol+" optimizations may also give better performance with 32-bit Intel code.

Option "on" causes the compiler to replace floating-point divisions with
multiplications by the reciprocal.  This generates faster code
(multiplication is faster than division), but the result may not be the same
because the reciprocal may not be exactly representable.
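
To see why the result may differ, the following sketch compares a true
division with the multiply-by-reciprocal form.  The function names are
hypothetical, and the code only imitates the transformation by hand; it is
not the compiler's actual output.

```c
#include <stdio.h>

/* Ordinary floating-point division, as written in the source. */
double divide_direct(double x, double d)
{
    return x / d;
}

/* Hand-written imitation of the "on" transformation:
   multiply by the reciprocal instead of dividing. */
double divide_by_reciprocal(double x, double d)
{
    double r = 1.0 / d;   /* the reciprocal is rounded here ...   */
    return x * r;         /* ... and the product is rounded again */
}

/* Print both results so that any difference is visible. */
void show_division(double x, double d)
{
    printf("%g / %g: direct=%.17g reciprocal=%.17g\n",
           x, d, divide_direct(x, d), divide_by_reciprocal(x, d));
}
```

With IEEE 754 double precision, show_division(3.0, 10.0) prints two
different values because 1/10 is not exactly representable, whereas
show_division(3.0, 2.0) prints identical values because 1/2 is.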

Option "oe" causes small user-written functions to be expanded in-line
rather than generating a call to the function.  Expanding functions in-line
can expose further optimizations that could not otherwise be detected if a
call were generated to the function.
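
A minimal sketch of the kind of function that benefits from in-line
expansion follows.  The names are hypothetical, and whether a given call is
actually expanded depends on the compiler's own size limits.

```c
/* A small user-written function: a candidate for in-line expansion. */
static int scale(int value, int factor)
{
    return value * factor;
}

/* Once the call is expanded in-line, the optimizer can see that factor
   is the constant 8 and may, for example, reduce the multiply to a
   shift.  A real call to scale() would hide that opportunity. */
int scale_by_eight(int value)
{
    return scale(value, 8);
}
```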

Option "oa" causes the compiler to relax alias checking: the optimizer
assumes that global variables are not referenced indirectly through
pointers.
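
As an illustration of the kind of code that relaxed alias checking can
affect, consider a global variable that is also modified through a pointer.
The sketch assumes that "oa" lets the optimizer treat global variables as
not being referenced indirectly through pointers; the names are
hypothetical.

```c
int total = 0;   /* global variable */

/* Stores to the global directly and then through a pointer.
   If p points at total, full alias checking makes the compiler reload
   total after the store through p, so the function returns 5.
   Under the relaxed assumption the compiler could keep the value 1 in
   a register and return that instead. */
int store_and_read(int *p)
{
    total = 1;
    *p = 5;
    return total;
}
```

Code that calls store_and_read(&total) relies on the reload and is
therefore the sort of code to review before enabling "oa".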

Option "ot" must be specified to cause the code generator to select code
sequences which are faster without any regard to the size of the code.  The
default is to select code sequences which strike a balance between size and
speed.

Option "ox" is equivalent to "obiklmr" together with "s".  It causes the
compiler/code generator to do branch prediction ("ob"), expand intrinsic
functions in-line ("oi"), enable control flow prologues and epilogues
("ok"), perform loop optimizations ("ol"), generate 387 instructions in-line
for math functions such as sin, cos, sqrt ("om"), reorder instructions to
avoid pipeline stalls ("or"), and omit stack overflow checking ("s").
Option "or" is very important for generating fast code for the Pentium and
Pentium Pro processors.

Option "oh" causes the compiler to attempt repeated optimizations (which can
result in longer compiles but better-optimized code).

Option "oi+" causes the C++ compiler to expand intrinsic functions in-line
(just like "oi") but also sets the inline_depth to its maximum (255).  By
default, inline_depth is 3.  The inline_depth can also be changed by using
the C++ inline_depth pragma.

Option "ei" causes the compiler to allocate at least an "int" for all
enumerated types.
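
The effect can be observed with sizeof.  In the sketch below (hypothetical
names), every enumerator fits in a single byte, and the C language leaves
the choice of the underlying integer type to the implementation, so the
reported size shows what the compiler actually allocated.

```c
#include <stddef.h>

enum color { RED, GREEN, BLUE };   /* all values fit in a single byte */

/* Storage the compiler allocates for the enumerated type.
   With "ei" this is expected to be at least sizeof(int). */
size_t color_size(void)
{
    return sizeof(enum color);
}
```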

Option "zp8" causes all data to be aligned on 8-byte boundaries.  The
default is "zp2" for the 16-bit compiler and "zp8" for the 32-bit compiler.
If, for example, "zp1" packing were specified, all data would be packed.
This reduces the amount of data memory required but costs extra clock
cycles whenever data that is not on an appropriate boundary is accessed.
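
The trade-off can be sketched at source level with two structures that
differ only in packing.  The structure and function names are hypothetical,
and the use of "#pragma pack(push, 1)" as a stand-in for "zp1" is an
assumption; check that your compiler version accepts this form.

```c
#include <stddef.h>

/* Default packing: the compiler may insert padding after tag so that
   value starts on a boundary suited to a double. */
struct natural_rec {
    char   tag;
    double value;
};

/* Same members with 1-byte packing (assumed counterpart of "zp1"):
   no padding, so value starts at offset 1 and is misaligned. */
#pragma pack(push, 1)
struct packed_rec {
    char   tag;
    double value;
};
#pragma pack(pop)

/* Offsets of the double member under each packing. */
size_t natural_value_offset(void)
{
    return offsetof(struct natural_rec, value);
}

size_t packed_value_offset(void)
{
    return offsetof(struct packed_rec, value);
}
```

The packed structure is smaller, but every access to its value member
touches a misaligned double.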

Options "0", "1", "2", "3", "4", "5" and "6" emit Intel code sequences
optimized for processor-specific instruction set features and timings.  For
16-bit Intel applications, the use of these options may limit the range of
systems on which the application will run but there are execution
performance improvements.

Options "fp2", "fp3", "fp5" and "fp6" emit Intel floating-point operations
targeted at specific features of the math coprocessor in the Intel series.
For 16-bit Intel applications, the use of these options may limit the range
of systems on which the application will run but there are execution
performance improvements.

Option "fpi87" causes in-line Intel 80x87 numeric data processor
instructions to be generated into the object code for floating-point
operations.  Floating-point instruction emulation is omitted in order to
obtain the best floating-point performance in 16-bit Intel applications, so
a math coprocessor must be present when the application runs.

For 32-bit Intel applications, the use of the "fp5" option will give good
performance on the Intel Pentium but less than optimal performance on the
386 and 486.  The use of the "5" option will give good performance on the
Pentium and minimal, if any, impact on the 386 and 486.  Thus, the following
set of options gives good overall performance for the 386, 486 and Pentium
processors.


     /oneatx /oh /oi+ /ei /zp8 /5 /fp3
