Some more benchmarks

January 5th, 2013

CPU benchmarks

Here are some Geekbench scores (32-bit tryout version) for some of our servers and, for comparison, a number of cloud virtual machines from Amazon EC2, Azure, ElasticHosts and Linode, plus our own virtual servers.

| Server | Geekbench score |
|---|---|
| Dual hex-core 2GHz Sandy Bridge (Debian) (E5-2630L) | 18265 |
| Hex-core 2GHz Sandy Bridge (Debian) (E5-2630L) | 11435 |
| Quad-core 2.3GHz Ivy Bridge (Ubuntu) (i7-3615QM) | 12105 |
| Quad-core 2.0GHz Sandy Bridge (Debian) (i7-2635QM) | 9135 |
| Dual-core 2.3GHz Sandy Bridge (Debian) (i5-2415M) | 6856 |
| Dual-core 2.66GHz Core 2 Duo (Debian) (P8800) | 3719 |
| Dual-core 1.83GHz Core 2 Duo (Debian) (T5600) | 2547 |
| Toshiba Z930 laptop (Ivy Bridge i7-3667U) | 6873 |
| Amazon EC2 t1.micro instance (Ubuntu) (E5430, 1 virtual core) | 2550 |
| Amazon EC2 c1.xlarge instance (Ubuntu) (E5506, 8 virtual cores) | 7830 |
| Amazon EC2 hi1.4xlarge instance (Ubuntu) (E5620, 16 virtual cores) | 10849 |
| Azure Small (1-core AMD Opteron 4171 HE @ 2.09GHz / 1.75GB) | 2510 |
| Azure Extra Large (8-core AMD Opteron 4171 HE @ 2.09GHz / 14GB) | 7471 |
| ElasticHosts ‘2000MHz’ single-core VM (Opteron 6128) | 2163 |
| ElasticHosts ‘20000MHz’ eight-core VM (Opteron 6128) | 6942 |
| Linode 512MB VDS (L5520, 4 virtual cores) | 4469 |
| Mythic Beasts 1GB VDS (L5630, 1 virtual core) | 2966 |
| Mythic Beasts 64GB VDS (L5630, 4 virtual cores) | 4166 |

The method here is pretty simple: take the default OS install, copy the Geekbench 32-bit tryout edition onto the machine, run it and record the result.

It’s important to remember that Geekbench performs a mixture of tests, some of which don’t parallelise, so a server with a fast core will score higher than one with lots of slower cores. As a result the Sandy Bridge and Ivy Bridge machines score very highly, because those CPUs boost the clock speed of a single core when the other cores are idle.

Disk benchmarks

We have several disk subsystems available: single disk, dual-disk mirrored software RAID, dual-disk mirrored hardware RAID, 8-disk hardware RAID arrays and a PCI-E SSD accelerator card.

Read-only benchmarks

The benchmark here is carried out with iops, a small Python script that performs random reads.
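For reference, here’s a minimal sketch of that kind of test. It is not the actual iops script: the device path, block size and run time are illustrative, and for meaningful numbers the target needs to be much bigger than RAM (or opened with O_DIRECT) so the page cache doesn’t serve the reads.

```python
import os, random, time

def random_read_iops(path, block_size=4096, duration=10):
    """Read random block-aligned blocks from path and return reads per second."""
    fd = os.open(path, os.O_RDONLY)
    try:
        size = os.lseek(fd, 0, os.SEEK_END)       # size of the file or block device
        blocks = size // block_size
        ops = 0
        deadline = time.time() + duration
        while time.time() < deadline:
            # Seek to a random block-aligned offset and read one block
            os.lseek(fd, random.randrange(blocks) * block_size, os.SEEK_SET)
            os.read(fd, block_size)
            ops += 1
        return ops / float(duration)
    finally:
        os.close(fd)

if __name__ == "__main__":
    rate = random_read_iops("/dev/sda")           # hypothetical target device
    print("%.1f IOPS, %.1f kB/sec" % (rate, rate * 4096 / 1000.0))
```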

4kB reads

| IO Subsystem | IOPS | Data rate |
|---|---|---|
| Single SATA disk | 60.5 | 242kB/sec |
| Mirrored SATA disk | 149 | 597kB/sec |
| Hardware RAID 1 SATA disk | 160.2 | 640kB/sec |
| Hardware RAID 10 SATA 6-disk | 349 | 1.4MB/sec |
| Hardware RAID 10 4-disk Intel 520 SSD | 21426 | 83.7MB/sec |
| Hardware RAID 0 6-disk SAS 15krpm | 104 | 416kB/sec |
| Intel 910 SSD | 28811 | 112MB/sec |
| Apple 256GB SATA SSD | 21943 | 85.7MB/sec |
| Intel 710 300GB SSD RAID 1 hardware BBU | 24714 | 96.5MB/sec |
| Amazon micro instance (EBS) | 557 | 2.2MB/sec |
| Amazon c1.xlarge instance (local) | 1746 | 6.8MB/sec |
| Amazon c1.xlarge instance xvda (local) | 325 | 1.2MB/sec |
| Amazon m1.xlarge EBS-optimised, 2000 IOPS EBS | 69 | 277kB/sec |
| Amazon hi1.4xlarge software RAID on 2x1TB SSD | 22674 | 88.6MB/sec |
| Azure Small (sda) | 73.3 | 293kB/sec |
| Azure Small (sdb) | 16010 | 62.5MB/sec |
| Azure Extra Large (sda) | 86.4 | 345kB/sec |
| Azure Extra Large (sdb) | 10136 | 39.6MB/sec |
| ElasticHosts disk storage | 54.1 | 216.6kB/sec |
| ElasticHosts SSD storage | 437 | 1.7MB/sec |
| Mythic Beasts 1GB VDS | 65.3 | 261kB/sec |
| Linode 512MB VDS | 475 | 1.9MB/sec |

1MB reads

| IO Subsystem | IOPS | Data rate |
|---|---|---|
| Single SATA disk | n/a | n/a |
| Mirrored SATA disk | 48.7 | 48.7MB/sec |
| Hardware RAID 1 SATA disk | 24.9 | 24.9MB/sec |
| Hardware RAID 10 SATA disk | 23.2 | 23.2MB/sec |
| Intel 910 SSD | 525 | 524MB/sec |
| Apple 256GB SATA SSD | 477 | 477MB/sec |
| Intel 710 300GB SSD RAID 1 hardware BBU | 215 | 215MB/sec |
| Hardware RAID 10 4-disk Intel 520 SSD | 734 | 734MB/sec |
| Hardware RAID 0 6-disk SAS 15krpm | 24 | 24MB/sec |
| Amazon micro instance (EBS) | 71 | 71MB/sec |
| Amazon c1.xlarge instance xvdb (local) | 335 | 335MB/sec |
| Amazon c1.xlarge instance xvda (local) | 81.4 | 114MB/sec |
| Amazon m1.xlarge EBS-optimised, 2000 IOPS EBS | 24 | 24MB/sec |
| Amazon hi1.4xlarge software RAID on 2x1TB SSD | 888 | 888MB/sec |
| Azure Small (sda) | n/a | n/a |
| Azure Small (sdb) | | |
| Azure Extra Large (sda) | n/a | n/a |
| Azure Extra Large (sdb) | 1817 | 1.8GB/sec |
| ElasticHosts disk storage | n/a | n/a |
| ElasticHosts SSD storage | 49.6 | 49.6MB/sec |
| Mythic Beasts 1GB VDS | 44.7 | 44.7MB/sec |
| Linode 512MB VDS | 28 | 28MB/sec |

It’s worth noting that with 64MB reads the Intel 910 delivers 1.2GB/sec and the hi1.4xlarge instance 1.1GB/sec (curiously, the Amazon machine was quicker with 16MB blocks). At the smaller block sizes the machine appears to be bottlenecked on CPU rather than on the PCI-E accelerator card. The RAID 10 array had a stripe size of 256kB, so a 1MB read requires a seek on every disk; hence performance is similar to that of a single disk, because the limitation is seek time rather than transfer time. There’s a reasonable argument that a more sensible setup is RAID 1 pairs with LVM striping on top, giving much larger stripe sizes than the controller natively supports.
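As a back-of-the-envelope check of that stripe arithmetic, the tiny sketch below (purely illustrative, modelling chunks laid out round-robin across the mirrored pairs) shows why a 1MB read touches every pair of the 6-disk array while a 4kB read touches only one:

```python
def pairs_touched(read_size, stripe_size, num_pairs):
    """How many mirrored pairs a single contiguous read has to seek on."""
    chunks = -(-read_size // stripe_size)   # ceiling division: stripe chunks spanned
    return min(chunks, num_pairs)           # chunks are spread round-robin across pairs

KB = 1024
print(pairs_touched(1024 * KB, 256 * KB, 3))   # -> 3: a 1MB read makes every pair seek
print(pairs_touched(4 * KB, 256 * KB, 3))      # -> 1: a 4kB read hits a single pair
```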

We’re not sure why the SAS array benchmarks so slowly; it is an old machine (five years old), but it is set up for performance rather than reliability.

Write-only benchmarks

I went back to rate.c, a synchronous disk benchmarking tool we wrote when investigating and improving UML disk performance back in 2006. What I did was generate a 2GB file, run random-sized synchronous writes into it and then read out the performance for 4kB and 1MB block sizes. The reason for a 2GB file is that our Linode instance runs a 32-bit OS, and rate.c does all its benchmarking into a single file, which is therefore limited to 2GB.
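rate.c itself isn’t reproduced here, but a simplified Python sketch of the same kind of test gives the idea. It uses fixed 4kB and 1MB blocks rather than rate.c’s random-sized writes, and the file name and run time are illustrative.

```python
import os, random, time

FILE_SIZE = 2 * 1024 ** 3                      # 2GB test file, as described above

def sync_write_iops(path, block_size, duration=10):
    """Issue random block-aligned synchronous writes and return writes per second."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        os.ftruncate(fd, FILE_SIZE)            # create the 2GB file (sparse; a real run would prefill it)
        buf = os.urandom(block_size)
        blocks = FILE_SIZE // block_size
        ops = 0
        deadline = time.time() + duration
        while time.time() < deadline:
            os.lseek(fd, random.randrange(blocks) * block_size, os.SEEK_SET)
            os.write(fd, buf)
            os.fsync(fd)                       # synchronous: wait for the disk (or BBU cache) to acknowledge
            ops += 1
        return ops / float(duration)
    finally:
        os.close(fd)

if __name__ == "__main__":
    for bs in (4 * 1024, 1024 * 1024):         # 4kB and 1MB, matching the table below
        print("%7d-byte writes: %.0f IOPS" % (bs, sync_write_iops("writetest.dat", bs)))
```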

Write performance

| IO Subsystem | IOPS at 4kB | IOPS at 1MB |
|---|---|---|
| Software RAID 1 | 84 | 31 |
| Linode 512MB VM | 39 | 25 |
| Mythic Beasts 1GB VM | 116 | 119 |
| Mythic Beasts 1GB VM | 331 | 91 |
| Mythic Beasts 1GB VM | 425 | 134 |
| 2x2TB RAID 1 pair with BBU | 746 | 54 |
| 6x2TB RAID 10 with BBU | 995 | 99 |
| 400GB Intel 910 SSD | 2148 | 379 |
| 256GB Apple SATA SSD | 453 | 96 |
| 2x300GB Intel 710 SSD RAID 1 pair with BBU | 3933 | 194 |
| Hardware RAID 10 with 4x Intel 520 SSD | 3113 | 623 |
| Hardware RAID 0 with 6x 15krpm SAS | 2924 | 264 |
| Amazon EC2 micro, EBS | 78 | 23 |
| Amazon EC2 m1.xlarge, EBS | 275 | 24 |
| Amazon EC2 m1.xlarge, EBS provisioned with 600 IOPS | 577 | 35 |
| Amazon EC2 m1.xlarge, instance storage | 953 | 45 |
| Amazon EC2 m1.xlarge, EBS-optimised, EBS | 246 | 27 |
| Amazon EC2 m1.xlarge, EBS-optimised, EBS with 2000 IOPS | 670 | 42 |
| Amazon EC2 hi1.4xlarge, software RAID on 2x1TB SSD | 2935 | 494 |
| Azure Small (sda) | 24.5 | 5.8 |
| Azure Small (sdb) | 14 | 11 |
| Azure Extra Large (sda) | 34 | 6 |
| Azure Extra Large (sdb) | 6.1 | 5.1 |
| ElasticHosts disk storage | 12.8 | 7.7 |
| ElasticHosts SSD storage | 585 | 50 |

I think there’s a reasonable argument that these figures read high for small writes on the BBU controllers (including those behind the VMs and the Linode VM). It’s entirely possible that the controllers manage to cache the vast majority of writes in RAM, and that the performance wouldn’t be sustained over the longer term.

Real-world test

We presented these results to one of our customers, who has a moderately large database (150GB). Nightly, they take a database backup, post-process it, then reimport it into another database server in order to do some statistical processing on it. The bottleneck in their process is the database import. We borrowed their database, and this is the timing data for a PostgreSQL restore; the restore file is read from the same media the database is written to.

| Server | Time for import |
|---|---|
| Hex-core 2.0GHz Sandy Bridge, 128GB RAM, 2TB SATA hardware RAID 1 with BBU | 2h 35m 24s |
| Hex-core 2.0GHz Sandy Bridge, 128GB RAM, 400GB Intel 910 SSD | 1h 45m 8s |
| Hex-core 2.0GHz Sandy Bridge, 128GB RAM, 2x300GB Intel 710 SSD hardware RAID 1 with BBU | 2h 0m 33s |
| Quad-core 2.3GHz Ivy Bridge, 4GB RAM, 1TB SATA software RAID 1 | 4h 16m 14s |
| Quad-core 2.3GHz Ivy Bridge, 16GB RAM, 1TB SATA software RAID 1 | 3h 38m 3s |
| Quad-core 2.3GHz Ivy Bridge, 16GB RAM, 256GB SATA SSD | 1h 54m 38s |
| Quad-core E3-1260L 2.4GHz Ivy Bridge, 32GB RAM, 4x Intel 520 SSD hardware RAID 10 with BBU | 1h 29m 33s |
| Hex-core E5450 3GHz, 24GB RAM, 6x15krpm SAS hardware RAID 0 with BBU | 1h 58m |
| Amazon EC2 m1.xlarge with 200GB of 600 IOPS EBS | 5h 55m 36s |
| Amazon EC2 m1.xlarge with 200GB of 2000 IOPS EBS | 4h 53m 45s |
| Amazon EC2 hi1.4xlarge with 2x1TB RAID 1 SSD | 2h 9m 27s |
| Azure Extra Large, sdb (ephemeral storage) | 6h 18m 29s |
| ElasticHosts 4000MHz / 4GB / 200GB hard disk | 5h 57m 39s |
| ElasticHosts 20000MHz / 32GB / 200GB SSD | 3h 16m 55s |
| KVM virtual machine (8GB / 8 cores) running on a 16GB 2.3GHz Ivy Bridge server, software RAID 1 with unsafe caching | 4h 10m 30s |

The Postgres import is mostly single-threaded: usually the servers sit at 100% CPU on one core with the others idle, with only occasional bursts of parallelism. Consequently the CPU is usually bursting to 2.5GHz (Sandy Bridge) or 3.3GHz (Ivy Bridge). The Ivy Bridge RAID 1 machine is actually a Mac Mini. In many ways this is an application perfectly suited to ‘the cloud’, because you’d want to spin up a fast VM, import the database, then start querying it. It’s important to note that the estimated lifetime of the Intel 520 RAID 10 array under this workload is six months; the performance gain there over the Intel 910 SSD is entirely due to faster single-threaded performance on the CPU.

Bias

Whilst I’ve tried to be impartial, these results are obviously biased. When Mythic Beasts choose hardware for our dedicated and virtual server platforms we deliberately seek out the servers that we think offer the best value, so to some extent our servers have been chosen because historically they’ve performed well at the type of benchmarks we test with. There’s also publication bias: if the results had said emphatically that our servers were slow and overpriced, we’d have fixed our offering and then rewritten the article based on the newer, faster servers we now had.

Notes

The real-world test covers two scenarios: the delay in getting a test copy of the database up for querying, for which temporary storage may be fine; and, in the event of something going hideously wrong, a measure of how long your site would be down until the database comes back up again, in which case persistent storage is terrifically important.

I plan to add an m2.xlarge + EBS instance and a hi1.4xlarge instance. I originally didn’t include the hi1.4xlarge because it doesn’t have EBS-optimised volumes for persistent storage. I might also add some test Mythic Beasts VMs with both safe and unsafe storage (i.e. caching all writes in the host RAM and ignoring sync calls), which is a cheap and easy way to get the equivalent of instance storage with a huge performance benefit. I excluded the Linode VM from the final test as it’s too small.