Make sure to check out our approach and disclaimer at the bottom.
This test shows the performance over time of a saturating workload of random operations on the database. It is designed to emulate a typical real-world workload that exhibits more reads than writes. These results are limited in performance by the SSD subsystem. Steady-state performance is 890,000 operations per second, while initial burst performance is roughly 1,450,000 operations per second.
This test shows the performance over time of a saturating workload of random operations on the database. It is designed to stress the overall performance of the storage, networking, and transaction processing subsystems. Steady-state performance is 235,000 operations per second, while initial burst performance is roughly 380,000 operations per second.
This test shows the performance over time of a saturating workload of random reads on a set of commonly accessed keys. This test emulates a system with a database too large to fit in memory, but that can be effectively cached. Performance is about half as fast as a pure memcached cluster at 3,750,000 reads per second, but provides absolute consistency with the database. (It takes about 10 seconds to warm the cache.)
This test shows the read throughput achieved by random reads of various-sized ranges of key-value pairs. This is designed to test the performance of FoundationDB when used for streaming or analytical applications, as well as for retrieving a single logical element, which may be encoded as a contiguous range of sub-keys. At a range size of only 63 keys (approximately 4 kB of data returned), FoundationDB is already largely limited by the gigabit networking subsystem, reading 26 million keys per second (about 1,800 MB/s).
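As a back-of-envelope sanity check on those numbers (a rough calculation, not part of the test itself): 63 keys returning about 4 kB implies roughly 65 bytes per key-value pair, and 26 million keys per second at that size works out to about 1.7 GB/s of payload, in the same ballpark as the stated ~1,800 MB/s (which presumably also counts keys and protocol overhead).

```python
# Rough check of the range-read numbers above.
# Assumes ~4096 bytes returned per 63-key range, as stated in the text.
bytes_per_pair = 4096 / 63               # ≈ 65 bytes per key-value pair
keys_per_sec = 26_000_000                # stated aggregate read rate
payload_mb_per_sec = keys_per_sec * bytes_per_pair / 1e6

print(round(bytes_per_pair), round(payload_mb_per_sec))  # → 65 1690
```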
This test is designed to show the write performance of FoundationDB when manipulating keys that are adjacent. In real workloads, updates are often made to keys that are close to each other (in sorted order). FoundationDB can coalesce these operations to improve efficiency and achieve 1,500,000 writes per second.
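The coalescing idea can be illustrated with a small sketch (purely illustrative Python, not FoundationDB's actual implementation): sort incoming updates, then group keys that are adjacent in key order into runs that can be applied together as one batch.

```python
def coalesce(keys, max_gap=1):
    """Group integer keys into runs whose neighbors are at most
    max_gap apart in sorted order, so each run can be written
    out as a single batch instead of one write per key."""
    runs = []
    for k in sorted(keys):
        if runs and k - runs[-1][-1] <= max_gap:
            runs[-1].append(k)     # extend the current run
        else:
            runs.append([k])       # start a new run
    return runs

print(coalesce([5, 1, 2, 3, 9, 10]))  # → [[1, 2, 3], [5], [9, 10]]
```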
This test shows the speed over time of loading an empty database with 100 million small key-values. It is designed to exercise data distribution algorithms and write bandwidth. Note that this bulk load is done using standard API transactions with strong ACID properties, not a separate optimized path. For large datasets, FoundationDB sustains about 85 MB/s, loading almost 5 billion key-values per hour. It is limited by the performance of the SSDs.
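Those two figures are mutually consistent (a rough check, assuming an average pair size of about 70 bytes: a 16-byte key plus a random 8-100 byte value averaging ~54 bytes, per the dataset description below):

```python
mb_per_sec = 85                          # stated sustained load rate
avg_pair_bytes = 16 + (8 + 100) / 2      # key + mean value size ≈ 70 bytes
pairs_per_hour = mb_per_sec * 1e6 * 3600 / avg_pair_bytes
print(f"{pairs_per_hour / 1e9:.1f} billion pairs/hour")  # → 4.4 billion pairs/hour
```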
This test shows the relative performance of a typical 90/10 read/write workload over a variety of cluster size configurations. The dataset is sized proportionally to the cluster. The results for the 24-machine cluster are faster than those above because the dataset used for this test is smaller per-machine than the canonical dataset. Near-linear scaling is exhibited.
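A 90/10 read/write mix like the one used in this test can be generated in a few lines. This is an illustrative sketch with assumed parameter names, not the actual test harness:

```python
import random

def op_stream(n, read_fraction=0.9, keyspace=2_000_000_000, seed=42):
    """Yield (op, key) pairs: uniformly random keys, with the
    requested fraction of reads vs. writes (illustrative only)."""
    rng = random.Random(seed)
    for _ in range(n):
        op = "read" if rng.random() < read_fraction else "write"
        yield op, rng.randrange(keyspace)

ops = list(op_stream(10_000))
reads = sum(op == "read" for op, _ in ops)
print(reads / len(ops))  # close to 0.9 for any seed
```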
FoundationDB works hard to achieve low latencies even at moderate or high utilization. This test measures individual key read latencies (in ms) by percent system load. Sub-millisecond read latencies are achieved over a broad range of system loads.
FoundationDB works hard to achieve low latencies even at moderate or high utilization. This test measures the latency to durably commit a cross-node transaction (in ms) by percent system load. Commit latencies of about 10 milliseconds are achieved over a broad range of system loads, with complete durability.
FoundationDB works hard to achieve low latencies even at moderate or high utilization. This test measures the mean latency (in ms) of a blend of two types of cross-node transactions: 20% doing 5 reads and 5 writes, and 80% doing 10 reads. The latency is shown as a function of percent system load. A mean cross-node transaction latency of about 10 milliseconds is achieved over a broad range of system loads, with complete durability.
We took performance seriously right from the start of FoundationDB's design process. Some distributed systems are "scalable" to many nodes, but each node is slowed by the weight of the distributed programming techniques. We've developed new data structures, memory management tools, communications models, and even our own programming language in the service of not just scalability, but also raw speed per node. When you combine our raw speed with scalability, we deliver awesome performance with modest hardware.
We measure 50+ different key performance metrics every night during a 10-hour test. Above are performance details for a few of the most important metrics.
This is a tricky subject, as we all know how easy performance numbers are to sway one way or another. High numbers can be achieved by testing with small datasets that fit entirely in RAM, workloads with few writes, reduced replication factors, durability guarantees disabled, lots of consecutive changes to single keys, datasets that have just been freshly loaded, or, for ACID systems, transactions that conveniently never involve data elements stored on different nodes. Likewise, artificially low numbers usually come from tests with too few outstanding requests, so that the parallelism needed to get the system really humming is never achieved, or from insufficient load generated by the test clients.
We don't use any of these tricks in the numbers posted above.
24-machine Ethernet cluster
E3-1240v1 CPU, 16GB RAM, 2x 200GB SATA SSD
~2.0kW peak power consumption in 9U of rack space
2 billion key-value pairs with 16-byte keys and random 8-100 byte values. The dataset on disk is 3.5 times as large as the page cache. Data is stored with triple replication.
*Unless otherwise noted
All operations are fully ACID transactional at the highest possible level of isolation (serializable) and durability (commit means flushed to disk in three places).
The database has been "aged" by running many hours of random mutations to avoid unrealistically clean data structures.
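Putting the cluster and dataset numbers together (a back-of-envelope estimate, taking the mean of the 8-100 byte value range): 2 billion pairs at roughly 70 bytes each is about 140 GB of logical data, or about 420 GB on disk with triple replication, which at 3.5x the page cache implies roughly 120 GB of cache across the cluster.

```python
pairs = 2_000_000_000
avg_pair_bytes = 16 + (8 + 100) / 2        # 16-byte key + mean value ≈ 70 bytes
logical_gb = pairs * avg_pair_bytes / 1e9  # logical dataset size
on_disk_gb = logical_gb * 3                # triple replication
implied_cache_gb = on_disk_gb / 3.5        # "dataset is 3.5x the page cache"
print(round(logical_gb), round(on_disk_gb), round(implied_cache_gb))  # → 140 420 120
```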