Benchmarking Ampere’s ARM CPU in Google Cloud Platform

While creating an instance today I noticed GCP offers ARM-based CPUs made by Ampere, a company based in Santa Clara with a large office in Portland. The cost runs about $30/mo for a single CPU with 4 GB RAM. That's a bit pricier than a comparable N1, but slightly less than a comparable T2D, which is the ultra-fast AMD EPYC Milan platform.

Since I mostly run basic Debian packages and Python scripts, CPU platform really isn't an issue, so I was curious to run a quick bake-off using a basic 16-thread sysbench test to mimic a light to moderate load. Here are the results:
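For anyone wanting to repeat this, here is a minimal Python sketch that builds a sysbench CPU command and pulls the "events per second" figure out of the output. The exact command line used for these runs isn't recorded here, so the flags are an assumption: sysbench 1.0 option names with 16 threads, plus a prime limit of 10000 and a 10 second run, which match the output shown for each instance. The helper names `sysbench_cpu_command` and `parse_events_per_second` are made up for illustration.

```python
import re
import shlex


def sysbench_cpu_command(threads=16, max_prime=10000, seconds=10):
    """Build a sysbench 1.0-style CPU test command (assumed flags, see note above)."""
    return [
        "sysbench", "cpu",
        f"--threads={threads}",
        f"--cpu-max-prime={max_prime}",
        f"--time={seconds}",
        "run",
    ]


def parse_events_per_second(output):
    """Extract the 'events per second' value from sysbench CPU output."""
    match = re.search(r"events per second:\s*([0-9.]+)", output)
    if match is None:
        raise ValueError("no 'events per second' line found")
    return float(match.group(1))


# Sample text copied from the t2a-standard-1 result.
sample = """CPU speed:
    events per second:  3438.95
"""

print(shlex.join(sysbench_cpu_command()))
print(parse_events_per_second(sample))
```

On a real instance, the command list could be passed to `subprocess.run(..., capture_output=True, text=True)` and the captured stdout fed to the parser.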

t2a-standard-1

These are based on Ampere Altra and cost $29/mo in us-central1

CPU speed:
    events per second:  3438.95

General statistics:
    total time:                          10.0024s
    total number of events:              34401

Latency (ms):
         min:                                    0.28
         avg:                                    4.63
         max:                                   80.31
         95th percentile:                       59.99
         sum:                               159394.13

Threads fairness:
    events (avg/stddev):           2150.0625/4.94
    execution time (avg/stddev):   9.9621/0.03

t2d-standard-1

These are based on the new 3rd gen AMD Milan platform and cost $32/mo in us-central1

CPU speed:
    events per second:  3672.67

General statistics:
    total time:                          10.0027s
    total number of events:              36738

Latency (ms):
         min:                                    0.27
         avg:                                    4.34
         max:                                  100.28
         95th percentile:                       59.99
         sum:                               159498.26

Threads fairness:
    events (avg/stddev):           2296.1250/3.24
    execution time (avg/stddev):   9.9686/0.02

n1-standard-1

These are based on the older Intel Skylake platform and cost $25/mo in us-central1

Prime numbers limit: 10000

Initializing worker threads...

Threads started!

CPU speed:
    events per second:   913.60

General statistics:
    total time:                          10.0072s
    total number of events:              9144

Latency (ms):
         min:                                    1.08
         avg:                                   17.45
         max:                                   89.10
         95th percentile:                       61.08
         sum:                               159544.06

Threads fairness:
    events (avg/stddev):           571.5000/1.00
    execution time (avg/stddev):   9.9715/0.03

n2d-custom-2-4096

These are based on 2nd generation AMD EPYC Rome and cost $44/mo in us-central1

CPU speed:
    events per second:  1623.41

General statistics:
    total time:                          10.0046s
    total number of events:              16243

Latency (ms):
         min:                                    0.89
         avg:                                    9.82
         max:                                   97.24
         95th percentile:                       29.19
         sum:                               159485.50

Threads fairness:
    events (avg/stddev):           1015.1875/3.13
    execution time (avg/stddev):   9.9678/0.02

n2-custom-2-4096

These are based on Intel Cascade Lake and cost $50/mo in us-central1

CPU speed:
    events per second:  1942.56

General statistics:
    total time:                          10.0036s
    total number of events:              19435

Latency (ms):
         min:                                    1.01
         avg:                                    8.21
         max:                                   57.04
         95th percentile:                       29.19
         sum:                               159499.92

Threads fairness:
    events (avg/stddev):           1214.6875/8.62
    execution time (avg/stddev):   9.9687/0.02

e2-medium

The CPU platform for these depends on availability. They have 1-2 shared CPU cores and cost $25/mo in us-central1

CPU speed:
    events per second:  1620.67

General statistics:
    total time:                          10.0055s
    total number of events:              16217

Latency (ms):
         min:                                    0.85
         avg:                                    9.84
         max:                                   65.18
         95th percentile:                       29.19
         sum:                               159647.07

Threads fairness:
    events (avg/stddev):           1013.5625/3.43
    execution time (avg/stddev):   9.9779/0.02

Summary

Ampere’s ARM CPUs offered slightly lower performance than the latest goodies from AMD. They did, however, win on bang for the buck thanks to costing $3 less per month to run ($29 vs. $32 in us-central1).
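The bang-for-buck comparison can be worked out directly from the numbers above. This is a rough Python sketch that divides each instance's events per second by its listed monthly price in us-central1. The metric (events per second per dollar per month) is just one simple way to express it, and the two custom types have 2 vCPUs, so this is not a strict like-for-like comparison.

```python
# (events per second, monthly cost in USD) as reported in the results above.
results = {
    "t2a-standard-1": (3438.95, 29),
    "t2d-standard-1": (3672.67, 32),
    "n1-standard-1": (913.60, 25),
    "n2d-custom-2-4096": (1623.41, 44),
    "n2-custom-2-4096": (1942.56, 50),
    "e2-medium": (1620.67, 25),
}


def events_per_dollar(events_per_second, monthly_cost):
    """Events per second delivered for each dollar of monthly cost."""
    return events_per_second / monthly_cost


# Rank instance types from best to worst value.
ranked = sorted(
    ((name, events_per_dollar(eps, cost)) for name, (eps, cost) in results.items()),
    key=lambda item: item[1],
    reverse=True,
)

for name, value in ranked:
    print(f"{name:20s} {value:7.1f}")
```

With these figures the T2A comes out at roughly 118.6 and the T2D at roughly 114.8, so the gap between the two is small.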

But the key takeaway is that both platforms completely blow away the older CPU platforms from Intel. Here are some nice little charts visualizing the numbers.
