TPC-E Benchmark Analysis by CPU Type

As an experiment, I decided to import the current official TPC-E results spreadsheet into a SQL Server 2008 R2 database, so I could easily query and do some basic analysis against the results. I did a little data cleansing so that the data would always use the same terms for CPU model, SQL Server version, etc. I also added a few columns that are not in the original official results spreadsheet, such as the MemorySize, SpindleCount, and OriginalDatabaseSize, and added that data to each row in the table, using the information from the Executive Summary for each submitted TPC-E result. Luckily, there were only 41 TPC-E results at the time this was written.
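The cleansing step is not shown in this post, so the following is only a minimal sketch of the kind of normalization it describes, written in Python rather than T-SQL. The function name `normalize_cpu_label` and the sample raw strings are hypothetical; the actual spreadsheet values and the rules applied to them may well have been different.

```python
import re


def normalize_cpu_label(raw: str) -> str:
    """Return a canonical CPU label (hypothetical cleansing rules)."""
    # Drop trademark decorations such as (R), (TM), the registered sign, etc.
    text = re.sub(r"\((?:R|TM)\)|®|™", "", raw, flags=re.IGNORECASE)
    # Drop the filler word "Processor" wherever it appears.
    text = re.sub(r"\bprocessor\b", "", text, flags=re.IGNORECASE)
    # Collapse runs of whitespace and trim the ends.
    return re.sub(r"\s+", " ", text).strip()


# Hypothetical raw variants that should all map to the same label.
samples = [
    "Intel(R) Xeon(R)  Processor X5680",
    " Intel Xeon X5680 ",
    "Intel Xeon Processor X5680",
]
cleaned = {normalize_cpu_label(s) for s in samples}
print(cleaned)
```

Once every row uses the same spelling for a given CPU model, grouping and ranking by CPU type becomes a plain equality comparison.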

After that effort, I was ready to write a few queries to see if anything interesting might reveal itself from the actual raw data. Since SQL Server is licensed by physical socket (when you buy a processor license), I thought it would be interesting to rank the results by TpsE per Socket, by simply dividing the TpsE score by the number of sockets. This gives you a pretty good idea of which processor gives you the most “bang for the buck” on the TPC-E benchmark, assuming the rest of the system was properly optimized (which is probably a pretty safe assumption given the time and cost of doing an official TPC-E submission).
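The ranking itself is a one-line division. The original analysis was run in SQL Server 2008 R2, and the actual query is not shown here, so the sketch below uses Python's built-in `sqlite3` module with a few rows taken from the results table in this post. The table name `TpcEResults` and its column names are assumptions for illustration, not the real schema.

```python
import sqlite3

# A handful of rows copied from the results table in this post:
# (CPU type, sockets, TpsE). Table and column names are illustrative only.
rows = [
    ("Intel Xeon X5680", 2, 1110.1),
    ("Intel Xeon X7560", 4, 2046.96),
    ("AMD Opteron 6176 SE", 2, 887.38),
    ("Intel Xeon X7460", 4, 729.65),
    ("Intel Itanium 9150N", 32, 1126.49),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE TpcEResults (CpuType TEXT, Sockets INTEGER, TpsE REAL)"
)
conn.executemany("INSERT INTO TpcEResults VALUES (?, ?, ?)", rows)

# TpsE is stored as a floating-point value, so the division is not truncated.
ranked = conn.execute(
    """
    SELECT CpuType, Sockets, TpsE, TpsE / Sockets AS TpsEPerSocket
    FROM TpcEResults
    ORDER BY TpsEPerSocket DESC
    """
).fetchall()

for cpu, sockets, tpse, per_socket in ranked:
    print(f"{cpu:<22} {sockets:>3} {tpse:>9.2f} {per_socket:>8.2f}")
```

Note that the 32 socket Itanium system has a higher raw TpsE score than the two socket X5680 system, yet it lands at the bottom once the score is divided by the socket count.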

Looking at the results this way, the two systems at the top of the list both use the Intel Xeon X5680 processor. After that, the next seven systems use the Intel Xeon X7560 processor, and they show pretty good scaling as you go from four sockets to eight sockets (i.e. the eight socket scores are pretty close to double the four socket scores for that processor), which shows the effectiveness of NUMA. Next in line is a two socket AMD Opteron 6176 SE system, which does about 10% better than a trio of Intel Xeon X5570 systems that are about a year older. The next four systems are a mix of Intel “Nehalem” and AMD “Magny Cours” based systems that conclude the top tier of results in this query. The lower limit of this “top tier” is the four socket AMD Opteron 6176 SE system, at 350.035 TpsE per socket (shown in green in the original table).
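To put a number on “pretty close to double”, here is a small sketch that compares the 3800 TpsE eight socket X7560 results against each four socket X7560 score from the table. The helper name `scaling_efficiency` is my own, and perfect linear scaling (double the sockets, double the score) is used as the baseline by assumption.

```python
def scaling_efficiency(small_tpse, small_sockets, large_tpse, large_sockets):
    """Fraction of ideal linear scaling achieved by the larger system."""
    ideal = small_tpse * (large_sockets / small_sockets)
    return large_tpse / ideal


# Four socket Intel Xeon X7560 TpsE scores from the table.
four_socket_scores = [2046.96, 2022.64, 2022.64, 2001.12, 1933.96]
# The two top eight socket X7560 results both report 3800 TpsE.
eight_socket_score = 3800.0

efficiencies = [
    scaling_efficiency(score, 4, eight_socket_score, 8)
    for score in four_socket_scores
]

for score, eff in zip(four_socket_scores, efficiencies):
    print(f"4 sockets at {score:.2f} -> 8 sockets at 3800.00: {eff:.1%} of linear")
```

Depending on which four socket result is used as the baseline, the eight socket systems reach roughly 93% to 98% of ideal linear scaling.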

The interesting thing is how much of a drop we see with the next several systems that use the older, six core Intel Xeon X7460 (which was no slouch in its day), showing a drop of nearly 50% in the TpsE per socket score compared to the lowest system in the top tier. Even more interesting is the difference between the four socket Intel Xeon X7460 systems and the four socket Intel Xeon X7560 systems, which shows what a huge improvement the Intel “Nehalem” is compared to the older Intel “Dunnington”. You can also see that the newer two socket Intel systems (X5680 and X5570) do much better than the older four socket X7460 systems, in both absolute and per socket terms. The sixteen socket Intel Xeon X7460 systems in the list do quite poorly, showing the weakness of the old shared front-side bus architecture in older Intel Xeon processors.
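The size of that drop is easy to check from the per socket scores in the table. This is just a sketch of the arithmetic; the variable names are mine.

```python
def percent_change(old, new):
    """Signed percentage change going from old to new."""
    return (new - old) / old * 100.0


# TpsE per socket values taken from the table in this post.
top_tier_floor = 350.035    # four socket AMD Opteron 6176 SE
best_x7460_4s = 182.41      # best four socket Intel Xeon X7460
best_x7560_4s = 511.74      # best four socket Intel Xeon X7560

drop = percent_change(top_tier_floor, best_x7460_4s)
generation_ratio = best_x7560_4s / best_x7460_4s

print(f"Top tier floor -> best 4 socket X7460: {drop:.1f}%")
print(f"Best 4 socket X7560 vs best 4 socket X7460: {generation_ratio:.2f}x")
```

The first figure comes out just under a 48% drop, and the second shows the best four socket X7560 result at roughly 2.8 times the per socket score of the best four socket X7460 result.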

| CPU Type | Sockets | Cores | Threads | TpsE | TpsE Per Socket |
|---|---|---|---|---|---|
| Intel Xeon X5680 | 2 | 12 | 24 | 1110.1 | 555.05 |
| Intel Xeon X5680 | 2 | 12 | 24 | 1074.14 | 537.07 |
| Intel Xeon X7560 | 4 | 32 | 64 | 2046.96 | 511.74 |
| Intel Xeon X7560 | 4 | 32 | 64 | 2022.64 | 505.66 |
| Intel Xeon X7560 | 4 | 32 | 64 | 2022.64 | 505.66 |
| Intel Xeon X7560 | 4 | 32 | 64 | 2001.12 | 500.28 |
| Intel Xeon X7560 | 4 | 32 | 64 | 1933.96 | 483.49 |
| Intel Xeon X7560 | 8 | 64 | 128 | 3800 | 475 |
| Intel Xeon X7560 | 8 | 64 | 128 | 3800 | 475 |
| AMD Opteron 6176 SE | 2 | 24 | 24 | 887.38 | 443.69 |
| Intel Xeon X5570 | 2 | 8 | 16 | 817.15 | 408.57 |
| Intel Xeon X5570 | 2 | 8 | 16 | 800 | 400 |
| Intel Xeon X5570 | 2 | 8 | 16 | 798 | 399 |
| Intel Xeon X7560 | 8 | 64 | 128 | 3141.76 | 392.72 |
| Intel Xeon X5570 | 2 | 8 | 16 | 766.47 | 383.23 |
| AMD Opteron 6174 | 4 | 48 | 48 | 1464.12 | 366.03 |
| AMD Opteron 6176 SE | 4 | 48 | 48 | 1400.14 | 350.035 |
| Intel Xeon X7460 | 4 | 24 | 24 | 729.65 | 182.41 |
| Intel Xeon X7460 | 4 | 24 | 24 | 721.4 | 180.35 |
| Intel Xeon X7460 | 4 | 24 | 24 | 695.24 | 173.81 |
| Intel Xeon X7460 | 4 | 24 | 24 | 671.35 | 167.83 |
| AMD Opteron 8384 | 4 | 16 | 16 | 635.43 | 158.85 |
| Intel Xeon X5460 | 2 | 8 | 8 | 317.45 | 158.72 |
| Intel Xeon X5460 | 2 | 8 | 8 | 295.27 | 147.63 |
| Intel Xeon X7460 | 8 | 48 | 48 | 1165.56 | 145.69 |
| Intel Xeon X5355 | 1 | 4 | 4 | 144.88 | 144.88 |
| Intel Xeon X5460 | 2 | 8 | 8 | 268 | 134 |
| Intel Xeon X7460 | 16 | 96 | 96 | 2012.77 | 125.79 |
| Intel Xeon X7350 | 4 | 16 | 16 | 492.34 | 123.08 |
| Intel Xeon X7350 | 4 | 16 | 16 | 479.51 | 119.87 |
| Intel Xeon X7460 | 12 | 64 | 64 | 1400 | 116.66 |
| Intel Xeon X7350 | 4 | 16 | 16 | 451.29 | 112.82 |
| Intel Xeon X7350 | 4 | 16 | 16 | 419.8 | 104.95 |
| Intel Xeon X7350 | 8 | 32 | 32 | 804 | 100.5 |
| Intel Xeon X7460 | 16 | 64 | 64 | 1568.22 | 98.01 |
| Intel Xeon X7460 | 16 | 64 | 64 | 1493.42 | 93.33 |
| Intel Xeon 5160 | 2 | 4 | 4 | 169.59 | 84.79 |
| Intel Xeon X7350 | 16 | 64 | 64 | 1250 | 78.12 |
| Intel Xeon 7140 | 4 | 8 | 16 | 220 | 55 |
| Intel Xeon 7140 | 16 | 32 | 64 | 660.85 | 41.30 |
| Intel Itanium 9150N | 32 | 64 | 64 | 1126.49 | 35.20 |

My takeaway from this relatively simple analysis is that any system with processors older than the Intel “Nehalem” or the AMD “Magny Cours” (Intel 55xx, Intel 75xx or AMD 61xx) will be pretty severely handicapped compared to a system with a “modern” processor. This is especially evident with older Intel four socket systems that use Xeon 74xx or older processors, which are easily eclipsed by two socket systems with Intel Xeon X5570 or X5680 processors. This analysis also shows how well the 32nm Intel Xeon X5680 “Westmere-EP” processor does on the TPC-E workload, which, in my mind, also shows that it is an excellent processor for OLTP workloads in general.
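As a quick check on the “easily eclipsed” point, the sketch below groups the raw TpsE scores from the table by CPU type and socket count and compares the weakest two socket X5570 result against the strongest four socket X7460 result. As before, it uses Python's built-in `sqlite3` module rather than SQL Server, and the `TpcEResults` table and column names are assumptions for illustration.

```python
import sqlite3

# Raw TpsE scores from the table for the three CPU types being compared.
results = [
    ("Intel Xeon X5680", 2, 1110.1),
    ("Intel Xeon X5680", 2, 1074.14),
    ("Intel Xeon X5570", 2, 817.15),
    ("Intel Xeon X5570", 2, 800.0),
    ("Intel Xeon X5570", 2, 798.0),
    ("Intel Xeon X5570", 2, 766.47),
    ("Intel Xeon X7460", 4, 729.65),
    ("Intel Xeon X7460", 4, 721.4),
    ("Intel Xeon X7460", 4, 695.24),
    ("Intel Xeon X7460", 4, 671.35),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE TpcEResults (CpuType TEXT, Sockets INTEGER, TpsE REAL)"
)
conn.executemany("INSERT INTO TpcEResults VALUES (?, ?, ?)", results)

# Lowest and highest absolute TpsE per (CPU type, socket count) group.
summary = {
    (cpu, sockets): (lowest, highest)
    for cpu, sockets, lowest, highest in conn.execute(
        """
        SELECT CpuType, Sockets, MIN(TpsE), MAX(TpsE)
        FROM TpcEResults
        GROUP BY CpuType, Sockets
        """
    )
}

slowest_x5570 = summary[("Intel Xeon X5570", 2)][0]
fastest_x7460 = summary[("Intel Xeon X7460", 4)][1]
print(f"Slowest 2 socket X5570: {slowest_x5570:.2f} TpsE")
print(f"Fastest 4 socket X7460: {fastest_x7460:.2f} TpsE")
print("Two socket X5570 always ahead:", slowest_x5570 > fastest_x7460)
```

Even the slowest two socket X5570 submission posts a higher absolute TpsE score than the fastest four socket X7460 submission, with half the sockets to license.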