Recommended compile flags by CPU model

Finding the CPU

To identify the model of the CPU, take a look inside /proc/cpuinfo for the “cpu family” and “model” numbers like so:

user $grep -m1 -A3 "vendor_id" /proc/cpuinfo

Once this information is found, match the CPU to one listed on this page to find the suggested “safe” CFLAGS.

Below is a list of CFLAGS which are to be considered “safe” for the given processors. These are the settings that should be used, especially when unsure which CFLAGS the processor needs.
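
The lookup described above can be scripted. The following is a minimal sketch, not part of the original page: cpu_summary is a hypothetical helper name, and it assumes the x86-style vendor_id, cpu family and model keys shown in the listings on this page.

```shell
# cpu_summary is a hypothetical helper (not from the original page).
# It prints "vendor_id cpu_family model" for the first CPU listed in a
# cpuinfo-style file; the file argument defaults to /proc/cpuinfo.
cpu_summary() {
    awk -F': *' '
        { key = $1; sub(/[ \t]+$/, "", key) }   # strip padding after the key
        key == "vendor_id"  && vendor == "" { vendor = $2 }
        key == "cpu family" && family == "" { family = $2 }
        key == "model"      && model  == "" { model  = $2 }
        END { print vendor, family, model }
    ' "${1:-/proc/cpuinfo}"
}
```

Called without an argument, cpu_summary reads /proc/cpuinfo. The three values it prints are the ones to compare against the listings below.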

Find CPU-specific options

First, create two empty files and set the language to English so that the sed pattern used below matches:

user $touch native.cc march.cc
user $LANG="en"

Compile the first file:

user $gcc -fverbose-asm -march=native native.cc -S

Find the -march value that gcc selected:

user $grep march native.s

Options passed: -D_GNU_SOURCE native.cc -march=core-avx-i -mcx16 -msahf

Now compile a second file:

user $gcc -fverbose-asm -march=core-avx-i march.cc -S

Now strip the headers from the *.s files for easy comparison:

user $sed -i 1,/options\ enabled/d march.s
user $sed -i 1,/options\ enabled/d native.s

If LANG was not set on a localized system, both files will now be empty.

Compare both *.s files:

user $diff march.s native.s

If the output is empty, you have found your -march= value; use it. Otherwise the diff looks like this:

20,25c20,23
< # -maccumulate-outgoing-args -maes -malign-stringops -mavx
< # -mavx256-split-unaligned-load -mavx256-split-unaligned-store -mcx16
< # -mf16c -mfancy-math-387 -mfp-ret-in-387 -mfsgsbase -mfxsr -mglibc
< # -mieee-fp -mlong-double-80 -mmmx -mpclmul -mpopcnt -mpush-args -mrdrnd
< # -mred-zone -msahf -msse -msse2 -msse3 -msse4 -msse4.1 -msse4.2 -mssse3
< # -mtls-direct-seg-refs -mxsave -mxsaveopt
---
> # -maccumulate-outgoing-args -malign-stringops -mcx16 -mfancy-math-387
> # -mfp-ret-in-387 -mfsgsbase -mfxsr -mglibc -mieee-fp -mlong-double-80
> # -mmmx -mpclmul -mpopcnt -mpush-args -mred-zone -msahf -msse -msse2 -msse3
> # -msse4 -msse4.1 -msse4.2 -mssse3 -mtls-direct-seg-refs

Now work out which switches -march=core-avx-i enables that -march=native does not: -maes, -mavx and a few more. Rebuild march.cc with these two switches disabled:

user $gcc -fverbose-asm -march=core-avx-i -mno-aes -mno-avx march.cc -S
user $sed -i 1,/options\ enabled/d march.s
user $diff march.s native.s

< # -mmmx -mpclmul -mpopcnt -mpush-args -mrdrnd -mred-zone -msahf -msse
< # -msse2 -msse3 -msse4 -msse4.1 -msse4.2 -mssse3 -mtls-direct-seg-refs
---
> # -mmmx -mpclmul -mpopcnt -mpush-args -mred-zone -msahf -msse -msse2 -msse3
> # -msse4 -msse4.1 -msse4.2 -mssse3 -mtls-direct-seg-refs

One more switch needs to be disabled: -mrdrnd

Finally set CFLAGS:

FILE /etc/portage/make.conf
CFLAGS="-march=core-avx-i -mno-avx -mno-aes -mno-rdrnd -O2 -pipe"
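
Reading the diff by eye gets tedious when many switches differ. The sketch below is not from the original page: flags_only_in_first is a hypothetical helper that lists the -m switches present in one trimmed *.s file but absent from the other. It assumes both files were already trimmed with the sed command shown earlier.

```shell
# flags_only_in_first is a hypothetical helper (not from the original page).
# Usage: flags_only_in_first FILE1 FILE2
# Prints every -m switch that appears in FILE1 but not in FILE2, one per line.
# The body runs in a subshell so the temporary variables do not leak.
flags_only_in_first() (
    a=$(mktemp) b=$(mktemp)
    # Put each whitespace-separated token on its own line, keep the -m switches.
    tr -s ' \t' '\n' < "$1" | grep '^-m' | sort -u > "$a"
    tr -s ' \t' '\n' < "$2" | grep '^-m' | sort -u > "$b"
    # comm -23 keeps the lines that are unique to the first sorted file.
    comm -23 "$a" "$b"
    rm -f "$a" "$b"
)
```

For the files above this would be called as flags_only_in_first march.s native.s. In the walkthrough, disabling -maes and -mavx also removed several other switches from the diff, so it is worth re-running the comparison after each change rather than negating every listed switch at once.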

x86/amd64

Intel

Haswell

Core i3/i5/i7 & Xeon E3/E5/E7 *V3
vendor_id	: GenuineIntel
cpu family	: 6
model		: 60
model name	: Intel(R) Xeon(R) CPU E3-1271 v3 @ 3.60GHz
…
model           : 60
model name      : Intel(R) Core(TM) i7-4770 CPU @ 3.40GHz
FILE /etc/portage/make.conf
CHOST="x86_64-pc-linux-gnu"
CFLAGS="-march=core-avx2 -O2 -pipe"
CXXFLAGS="${CFLAGS}"
Note
Support for -march=core-avx2 was introduced with GCC 4.7. If you have an earlier version of GCC, use -march=native or find the CPU-specific options manually.
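
The version requirement in the note can be checked in a script. This is a minimal sketch, not part of the original page: supports_core_avx2 is a hypothetical helper that takes a version string such as the one printed by gcc -dumpversion and succeeds when it is 4.7 or later.

```shell
# supports_core_avx2 is a hypothetical helper (not from the original page).
# Usage: supports_core_avx2 VERSION    e.g. supports_core_avx2 4.7.3
# Exit status 0 means VERSION is at least 4.7.
supports_core_avx2() (
    major=${1%%.*}      # text before the first dot
    rest=${1#*.}        # text after the first dot (whole string if no dot)
    minor=${rest%%.*}
    [ "$major" -gt 4 ] || { [ "$major" -eq 4 ] && [ "$minor" -ge 7 ]; }
)
```

A typical call would be supports_core_avx2 "$(gcc -dumpversion)" && echo "core-avx2 is available".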

Ivy Bridge

Core i3/i5/i7 & Xeon E3/E5/E7 *V2
vendor_id       : GenuineIntel
cpu family      : 6
model           : 58
model name      : Intel(R) Core(TM) i7-3610QM CPU @ 2.30GHz 
FILE /etc/portage/make.conf
CHOST="x86_64-pc-linux-gnu"
CFLAGS="-march=core-avx-i -O2 -pipe"
CXXFLAGS="${CFLAGS}"
Pentium
vendor_id	: GenuineIntel
cpu family	: 6
model		: 58
model name	: Intel(R) Pentium(R) CPU G2020 @ 2.90GHz
FILE /etc/portage/make.conf
CHOST="x86_64-pc-linux-gnu"
CFLAGS="-march=core-avx-i -mno-avx -mno-aes -mno-rdrnd -O2 -pipe"
CXXFLAGS="${CFLAGS}"

Sandy Bridge

Core i3/i5/i7 & Xeon E3/E5/E7
vendor_id	: GenuineIntel
cpu family	: 6
…
model		: 42
model name	: Intel(R) Core(TM) i5-2400 CPU @ 3.10GHz
…
model		: 45
model name	: Intel(R) Core(TM) i7-3930K CPU @ 3.20GHz
… 
model		: 42
model name	: Intel(R) Xeon(R) CPU E31245 @ 3.30GHz
… 
model           : 45
model name      : Intel(R) Xeon(R) CPU E5-2407 0 @ 2.20GHz
FILE /etc/portage/make.conf
CHOST="x86_64-pc-linux-gnu"
CFLAGS="-march=corei7-avx -O2 -pipe"
CXXFLAGS="${CFLAGS}"
Pentium
vendor_id	: GenuineIntel
cpu family	: 6
model		: 42
model name	: Intel(R) Pentium(R) CPU B960 @ 2.20GHz
FILE /etc/portage/make.conf
CHOST="x86_64-pc-linux-gnu"
CFLAGS="-march=corei7-avx -mno-avx -mno-aes -mno-rdrnd -O2 -pipe"
CXXFLAGS="${CFLAGS}"

Nehalem/Westmere

Core i3/i5/i7
vendor_id	: GenuineIntel
cpu family	: 6
model		: 42
model name	: Intel(R) Core(TM) i5-2400 CPU @ 3.10GHz
FILE /etc/portage/make.conf
CHOST="x86_64-pc-linux-gnu"
CFLAGS="-march=corei7 -O2 -pipe"
CXXFLAGS="${CFLAGS}"

Intel Core

vendor_id       : GenuineIntel
cpu family      : 6
…
model		: 15
model name	: Intel(R) Core(TM)2 Duo CPU     T7500  @ 2.20GHz
…
model           : 15
model name      : Intel(R) Xeon(R) CPU            3040  @ 1.86GHz
FILE /etc/portage/make.conf
CHOST="x86_64-pc-linux-gnu"
CFLAGS="-march=core2 -O2 -pipe"
CXXFLAGS="${CFLAGS}"

Older microarchitecture

Pentium M (Dothan)
vendor_id	: GenuineIntel
cpu family	: 6
model		: 13
model name	: Intel(R) Pentium(R) M processor 2.13GHz
FILE /etc/portage/make.conf
CHOST="i686-pc-linux-gnu"
CFLAGS="-O2 -march=pentium-m -pipe -fomit-frame-pointer"
CXXFLAGS="${CFLAGS}"
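
The Intel listings above can be collected into a small lookup. This is a sketch only: intel_march_for_model is a hypothetical helper that covers just the family 6 model numbers quoted on this page, with model 42 mapped to corei7-avx following the Sandy Bridge listings. It cannot tell a Pentium from a Core part with the same model number, so the extra -mno-avx -mno-aes -mno-rdrnd switches shown for the Pentium entries still have to be added by hand.

```shell
# intel_march_for_model is a hypothetical helper (not from the original page).
# Usage: intel_march_for_model MODEL   (the "model" number, cpu family 6)
# Prints the -march value used in the listings above, or fails for models
# that this page does not cover.
intel_march_for_model() {
    case "$1" in
        60)    echo core-avx2 ;;    # Haswell
        58)    echo core-avx-i ;;   # Ivy Bridge
        42|45) echo corei7-avx ;;   # Sandy Bridge
        15)    echo core2 ;;        # Intel Core
        13)    echo pentium-m ;;    # Pentium M (Dothan)
        *)     return 1 ;;
    esac
}
```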

AMD

A4/A6/A8-XXXX / XXXXM

vendor_id	: AuthenticAMD
cpu family	: 18
model		: 1
model name	: AMD A8-3500M APU with Radeon(tm) HD Graphics
FILE /etc/portage/make.conf
CHOST="x86_64-pc-linux-gnu"
CFLAGS="-O2 -march=amdfam10 -mcx16 -mpopcnt -pipe"
CXXFLAGS="${CFLAGS}"

FX-XXXX

vendor_id	: AuthenticAMD
cpu family	: 21
model		: 1
model name	: AMD FX(tm)-8150 Eight-Core Processor

Be sure to check the model number reported on your system: the -march flag should be bdverX, where X is the model number.

Due to the FPU design of the Bulldozer architecture, -mprefer-avx128 gives better FPU performance at the cost of precision.

The -mvzeroupper switch greatly reduces the performance penalty of mixing SSE and AVX code. It does little for most applications, but can give a significant improvement for packages such as media-video/ffmpeg, net-misc/bfgminer or some applications in games-emulation that have sections written in assembly.

FILE /etc/portage/make.conf
CHOST="x86_64-pc-linux-gnu"
CFLAGS="-O2 -march=bdver1 -mprefer-avx128 -mvzeroupper -pipe"
CXXFLAGS="${CFLAGS}"
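
The bdverX rule stated above can be written as a tiny helper. This is a sketch under that rule's assumption, not a general detection routine: bdver_march is a hypothetical name, and it accepts only 1 through 4 because bdver1 to bdver4 are the values GCC documents.

```shell
# bdver_march is a hypothetical helper (not from the original page).
# Usage: bdver_march MODEL   (the "model" number of a cpu family 21 CPU)
# Applies the rule of thumb from the text: -march=bdverX with X = model.
bdver_march() {
    case "$1" in
        1|2|3|4) echo "-march=bdver$1" ;;
        *)       return 1 ;;   # no matching bdverX value
    esac
}
```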

Geode LX

vendor_id	: AuthenticAMD
cpu family	: 5
model		: 10
model name	: Geode(TM) Integrated Processor by AMD PCS
FILE /etc/portage/make.conf
CHOST="i486-pc-linux-gnu"
CFLAGS="-Os -pipe -march=geode -mmmx -m3dnow -fno-align-jumps -fno-align-functions -fno-align-labels -fno-align-loops -fomit-frame-pointer"
CXXFLAGS="${CFLAGS}"

arm

Note
To identify the ARM core of the SoC on your board, the Wikipedia articles “List of ARM microarchitectures” and “List of applications of ARM cores” may help.

Cortex-A

ARMv7-A/Cortex-A9 MPCore

with optional VFPv3 FPU
processor       : 0
model name      : ARMv7 Processor rev 0 (v7l)
BogoMIPS        : 2.00
Features        : half thumb fastmult vfp edsp vfpv3 vfpv3d16 tls 
CPU implementer : 0x41
CPU architecture: 7
CPU variant     : 0x1
CPU part        : 0xc09
CPU revision    : 0

processor       : 1
model name      : ARMv7 Processor rev 0 (v7l)
BogoMIPS        : 2.00
Features        : half thumb fastmult vfp edsp vfpv3 vfpv3d16 tls 
CPU implementer : 0x41
CPU architecture: 7
CPU variant     : 0x1
CPU part        : 0xc09
CPU revision    : 0

Hardware        : NVIDIA Tegra SoC (Flattened Device Tree)
Revision        : 0000
Serial          : 0000000000000000
FILE /etc/portage/make.conf
CHOST="armv7a-hardfloat-linux-gnueabi"
CFLAGS="-O2 -march=armv7-a -mtune=cortex-a9 -mfpu=vfpv3-d16 -mfloat-abi=hard -pipe -fomit-frame-pointer"
CXXFLAGS="${CFLAGS}"
Note
This ARM core (equipped with the optional vfpv3d16 FPU but missing the NEON extension) is used in the Toshiba AC100/Dynabook AZ/Compal Paz00 Board.
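
Whether a core has NEON or only the vfpv3d16 FPU can be read from the Features line. The helper below is a minimal sketch, not part of the original page; has_cpu_feature is a hypothetical name, and it assumes the ARM-style Features line shown in the listing above.

```shell
# has_cpu_feature is a hypothetical helper (not from the original page).
# Usage: has_cpu_feature NAME [FILE]   (FILE defaults to /proc/cpuinfo)
# Succeeds when NAME appears as a whole word on the first Features line.
has_cpu_feature() {
    grep -m1 '^Features' "${2:-/proc/cpuinfo}" |
        tr -s ' \t' '\n' |      # one token per line
        grep -qxF "$1"          # exact, literal match of the whole token
}
```

On the board described above, has_cpu_feature vfpv3d16 would succeed and has_cpu_feature neon would fail, which matches the -mfpu=vfpv3-d16 setting.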

ARM11

ARMv6/ARM1176JZF-S

processor	: 0
model name	: ARMv6-compatible processor rev 7 (v6l)
BogoMIPS	: 697.95
Features	: half thumb fastmult vfp edsp java tls 
CPU implementer	: 0x41
CPU architecture: 7
CPU variant	: 0x0
CPU part	: 0xb76
CPU revision	: 7

Hardware	: BCM2835
Revision	: 0000
Serial		: 000000000XXXXXXX
FILE /etc/portage/make.conf
CHOST="armv6j-hardfloat-linux-gnueabi"
CFLAGS="-O2 -pipe -mcpu=arm1176jzf-s -mfpu=vfp -mfloat-abi=hard"
CXXFLAGS="${CFLAGS}"
Note
This ARM core is used in the first generation of the Raspberry Pi.

ARMv6/ARM1136JF-S

Processor       : ARMv6-compatible processor rev 5 (v6l)
BogoMIPS        : 791.34
Features        : swp half thumb fastmult vfp edsp java 
CPU implementer : 0x41
CPU architecture: 6TEJ
CPU variant     : 0x1
CPU part        : 0xb36
CPU revision    : 5

Hardware        : IMAPX200
Revision        : 0000
Serial          : 0000000000000000
FILE /etc/portage/make.conf
CHOST="armv6j-hardfloat-linux-gnueabi"
CFLAGS="-Os -march=armv6j -mcpu=arm1136jf-s -mfpu=vfp -mfloat-abi=hard -pipe -fomit-frame-pointer"
CXXFLAGS="${CFLAGS}"
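
The ARM listings above differ in their CPU part value, which makes it a convenient lookup key. The following sketch is not from the original page: arm_cpu_part and arm_core_for_part are hypothetical helpers, and the table covers only the three part numbers quoted in this section.

```shell
# Hypothetical helpers (not from the original page).

# arm_cpu_part [FILE] prints the first "CPU part" value in a cpuinfo-style
# file; FILE defaults to /proc/cpuinfo.
arm_cpu_part() {
    awk -F': *' '/^CPU part/ { print $2; exit }' "${1:-/proc/cpuinfo}"
}

# arm_core_for_part PART prints the core name used with -mtune/-mcpu in the
# listings above, or fails for part numbers this page does not cover.
arm_core_for_part() {
    case "$1" in
        0xc09) echo cortex-a9 ;;
        0xb76) echo arm1176jzf-s ;;
        0xb36) echo arm1136jf-s ;;
        *)     return 1 ;;
    esac
}
```

The two can be combined as arm_core_for_part "$(arm_cpu_part)".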

ppc/ppc64

Note
-march=native almost never works on PowerPC.

POWER8

processor       : 0
cpu             : POWER8E (raw), altivec supported
clock           : 3026.000000MHz
revision        : 2.1 (pvr 004b 0201)

timebase        : 512000000
platform        : pSeries
model           : IBM pSeries (emulated by qemu)
machine         : CHRP IBM pSeries (emulated by qemu)
Note
Currently Gentoo does not support POWER8, but safe CFLAGS for it would look like the following.
FILE /etc/portage/make.conf
CHOST="powerpc64le-linux-gnu"
CFLAGS="-mcpu=power8 -mtune=power8 -O2 -pipe -mabi=elfv2 -mabi=altivec -maltivec"
CXXFLAGS="${CFLAGS}"

Cell

processor	: 0
cpu		: Cell Broadband Engine, altivec supported
clock		: 3192.000000MHz
revision	: 5.1 (pvr 0070 0501)

processor	: 1
cpu		: Cell Broadband Engine, altivec supported
clock		: 3192.000000MHz
revision	: 5.1 (pvr 0070 0501)

timebase	: 79800000
platform	: PS3
model		: SonyPS3
FILE /etc/portage/make.conf
CHOST="powerpc-unknown-linux-gnu"
CFLAGS="-mcpu=cell -mtune=cell -O2 -pipe -mabi=altivec -maltivec"
CXXFLAGS="${CFLAGS}"
Note
GCC’s -mspe and -mabi=spe options do not target PS3 systems or the IBM Cell. Instead, those options are meant for Freescale e500 cores. More info: https://lists.debian.org/debian-devel/2011/06/msg00592.html and https://wiki.debian.org/PowerPCSPEPort

G4

PPC 7447A

processor	: 0
cpu		: 7447A, altivec supported
clock		: 1666.666000MHz
revision	: 1.5 (pvr 8003 0105)
bogomips	: 33.28
timebase	: 8320000
platform	: PowerMac
model		: PowerBook5,9
machine		: PowerBook5,9
motherboard	: PowerBook5,9 MacRISC3 Power Macintosh 
detected as	: 287 (PowerBook G4 17")
pmac flags	: 00000018
L2 cache	: 512K unified
pmac-generation	: NewWorld
FILE /etc/portage/make.conf
CHOST="powerpc-unknown-linux-gnu"
CFLAGS="-mcpu=7450 -mtune=7450 -O2 -maltivec -mabi=altivec -fno-strict-aliasing -pipe"
CXXFLAGS="${CFLAGS}"

G3 (PPC 7XX)

processor       : 0
cpu             : 740/750
clock           : 400.000000MHz
revision        : 131.0 (pvr 0008 8300)
bogomips        : 49.93
timebase        : 24966218
platform        : PowerMac
model           : PowerBook3,1
machine         : PowerBook3,1
motherboard     : PowerBook3,1 MacRISC2 MacRISC Power Macintosh
detected as     : 70 (PowerBook Pismo)
pmac flags      : 0000001f
L2 cache        : 1024K unified
pmac-generation : NewWorld
FILE /etc/portage/make.conf
CHOST="powerpc-unknown-linux-gnu"
CFLAGS="-mcpu=750 -Os -pipe -fno-strict-aliasing"
CXXFLAGS="${CFLAGS}"
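
Since -march=native is of little use here, the -mcpu value has to be picked from the cpu line by hand. The lookup below is a sketch, not part of the original page: ppc_mcpu_for is a hypothetical helper that covers only the four cpu strings quoted in this section.

```shell
# ppc_mcpu_for is a hypothetical helper (not from the original page).
# Usage: ppc_mcpu_for "CPU STRING"   (the value of the "cpu" line)
# Prints the -mcpu value used in the listings above, or fails for CPUs
# that this page does not cover.
ppc_mcpu_for() {
    case "$1" in
        POWER8*)                  echo power8 ;;
        "Cell Broadband Engine"*) echo cell ;;
        7447A*)                   echo 7450 ;;
        740/750*)                 echo 750 ;;
        *)                        return 1 ;;
    esac
}
```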

m68k

Not in the list?

Try the following command to see whether it reports something useful. The output will look something like the example below; note the -march value it reports. This tip comes from the Distcc wiki page[1].

root #gcc -march=native -E -v - </dev/null 2>&1 | grep cc1
 /mnt/livecd/usr/x86_64-pc-linux-gnu/gcc-bin/4.7.3/../../../libexec/gcc/x86_64-pc-linux-gnu/4.7.3/cc1 -E -quiet -v -iprefix /mnt/livecd/usr/x86_64-pc-linux-gnu/gcc-bin/4.7.3/../../../lib/gcc/x86_64-pc-linux-gnu/4.7.3/ - 
-march=corei7-avx -mcx16 -msahf -mno-movbe -maes -mpclmul -mpopcnt -mno-abm -mno-lwp -mno-fma -mno-fma4 -mno-xop -mno-bmi -mno-bmi2 -mno-tbm -mavx -mno-avx2 -msse4.2 -msse4.1 -mno-lzcnt -mno-rdrnd -mno-f16c -mno-fsgsbase --param l1-cache-size=32 --param l1-cache-line-size=64 --param l2-cache-size=10240 -mtune=corei7-avx
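
To pull just the -march value out of that output, a small filter is enough. This is a minimal sketch, not part of the original page; march_from_cc1 is a hypothetical helper that reads the cc1 line on standard input.

```shell
# march_from_cc1 is a hypothetical helper (not from the original page).
# Reads text on stdin and prints the value of the first -march= switch found.
march_from_cc1() {
    tr -s ' \t' '\n' |              # one token per line
        sed -n 's/^-march=//p' |    # keep only the -march value
        head -n 1
}
```

It can be appended to the command above: gcc -march=native -E -v - </dev/null 2>&1 | grep cc1 | march_from_cc1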

서진우

Clunix (a supercomputing company) / Executive Director (CTO) / Certified Information Systems Auditor / operator of the Syszone blog
