DistributedCC Experiments

Results on March 26th

Cluster Stats:

4 c6i.4xlarge EC2 instances, all in the same cluster placement group
16 Xeon Platinum vCPUs per instance
32 GiB of RAM per instance

EBS (Disk) Stats:

80 GiB General Purpose SSD (gp2) volume rated at 240 IOPS
Reads about 15 million graph updates/s from binary graph streams
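The 240 IOPS rating matches the usual gp2 baseline rule. The sketch below assumes that rule is 3 IOPS per GiB with a floor of 100 and a cap of 16,000 IOPS; the function name is illustrative and not part of any AWS tooling.

```python
def gp2_baseline_iops(size_gib: int) -> int:
    """Baseline IOPS for a gp2 volume, assuming 3 IOPS/GiB, min 100, max 16000."""
    return max(100, min(16000, 3 * size_gib))

volume_gib = 80  # volume size from the EBS stats above
print(f"{volume_gib} GiB gp2 volume: {gp2_baseline_iops(volume_gib)} baseline IOPS")
```

For the 80 GiB volume used here this gives 240, in line with the stated rating.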

Networking Performance

Latency from main to each worker

Using ping, round-trip latency measured between 0.096 ms and 0.157 ms.

Throughput

Measured with iperf

To install:

sudo amazon-linux-extras install -y epel
sudo yum install -y iperf

Results

Run iperf -s on the worker nodes, then iperf -c <worker_addr> on the main node. With default iperf TCP options, network bandwidth from main to worker measured 9.03-9.09 Gbits/s.
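To compare the link with disk and ingestion figures, it can help to express the bandwidth in bytes per second. This is a minimal sketch that assumes the reported figure counts 10^9 bits per second; the helper name is made up for illustration.

```python
def gbits_to_gbytes_per_s(gbits_per_s: float) -> float:
    """Convert a rate in Gbits/s (10**9 bits) to GB/s (10**9 bytes)."""
    return gbits_per_s / 8.0

low, high = 9.03, 9.09  # measured main-to-worker bandwidth range
print(f"{gbits_to_gbytes_per_s(low):.2f} to {gbits_to_gbytes_per_s(high):.2f} GB/s")
```

Under that assumption the link carries a little over 1.1 GB/s in each measurement.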

DistributedStreamingCC: Kron16, cold file cache, WorkerCluster::num_batches=512

Used command sync; echo 3 | sudo tee -a /proc/sys/vm/drop_caches to clear the file cache

machines, worker_proc	1, 16	2, 16	4, 16	2, 32	4, 48
ingestion (million/s)	1.867	2.014	2.570	3.818	5.411
CC algorithm time (s)	0.43	0.14	0.14	0.14	0.14
memory usage (main)	7.70 GiB	7.70 GiB	7.70 GiB	8.84 GiB	9.8 GiB
memory usage (worker)	112 MiB	148 MiB	148 MiB	148 MiB	138 MiB
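The scaling in the table above can be summarized as speedup over the single-machine run and as throughput per worker process. The sketch below is only an illustration: it assumes worker_proc is the total number of worker processes, and the function names are hypothetical.

```python
# Ingestion rates (million updates/s) from the cold-cache Kron16 table,
# keyed by (machines, worker processes).
ingestion = {
    (1, 16): 1.867,
    (2, 16): 2.014,
    (4, 16): 2.570,
    (2, 32): 3.818,
    (4, 48): 5.411,
}

baseline = ingestion[(1, 16)]

def speedup(config):
    """Ingestion rate of `config` relative to the 1-machine, 16-worker run."""
    return ingestion[config] / baseline

def per_worker_rate(config):
    """Million updates/s per worker process (assumes the count is a total)."""
    _, workers = config
    return ingestion[config] / workers

for config in ingestion:
    machines, workers = config
    print(f"{machines} machines, {workers} workers: "
          f"speedup {speedup(config):.2f}x, "
          f"{per_worker_rate(config):.3f} M updates/s per worker")
```

With these numbers the 4-machine, 48-worker run ingests roughly 2.9 times as fast as the baseline with three times as many worker processes.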

DistributedStreamingCC: Kron16, pre-populated file cache, WorkerCluster::num_batches=512

Used command cat /mnt/ssd1/kron_16_stream_binary > /dev/null to pre-populate the file cache

machines, worker_proc	1, 16	2, 16	4, 16	2, 32	4, 48
ingestion (million/s)	1.869	2.019	2.584	3.829	5.463
CC algorithm time (s)	0.45	0.14	0.14	0.14	0.14
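One way to quantify how much the warm cache helps is the percent change in ingestion rate between the cold-cache and pre-populated Kron16 runs. This is a small sketch using the values from the two tables; the helper name is illustrative.

```python
configs = ["1, 16", "2, 16", "4, 16", "2, 32", "4, 48"]  # machines, worker_proc
cold = [1.867, 2.014, 2.570, 3.818, 5.411]  # million updates/s, cold cache
warm = [1.869, 2.019, 2.584, 3.829, 5.463]  # million updates/s, pre-populated

def percent_change(before: float, after: float) -> float:
    """Relative change from `before` to `after`, in percent."""
    return 100.0 * (after - before) / before

changes = [percent_change(c, w) for c, w in zip(cold, warm)]
for label, change in zip(configs, changes):
    print(f"{label}: {change:+.2f}%")
```

Every configuration improves by less than 1%, which suggests that reading the stream from disk is not the limiting factor on Kron16.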

Kron17 results

4 machines, 48 workers, num_batches=512

Kron17 dataset	cold cache	pre-pop
ingestion (million/s)	5.550	5.593
CC algorithm time (s)	0.31	0.43
memory usage (main)	18.5 GiB	N/A
memory usage (worker)	167 MiB	N/A

Pre-populating has less effect than it otherwise might because the entire file does not fit in RAM, much less the file together with the sketches.
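The memory argument can be checked with simple arithmetic: whatever RAM the main process uses is not available to the file cache. The sketch below uses the 32 GiB instance size and the 18.5 GiB main-node usage reported above; it ignores the operating system and other processes, and the function names are hypothetical.

```python
GIB = 1024 ** 3

def page_cache_headroom(total_ram_bytes, resident_bytes):
    """RAM left for the file cache once the process's own memory is counted."""
    return max(0, total_ram_bytes - resident_bytes)

def fits_in_cache(file_bytes, total_ram_bytes, resident_bytes):
    """True if a file of `file_bytes` could stay fully cached next to the process."""
    return file_bytes <= page_cache_headroom(total_ram_bytes, resident_bytes)

total_ram = 32 * GIB      # RAM per instance, from the cluster stats
main_usage = 18.5 * GIB   # main-node memory usage in the Kron17 cold-cache run
headroom_gib = page_cache_headroom(total_ram, main_usage) / GIB
print(f"headroom for the file cache: {headroom_gib:.1f} GiB")
```

Calling fits_in_cache with the actual size of the stream file (for example from os.path.getsize) would show whether the file could remain cached alongside the sketches; by this estimate, any file larger than about 13.5 GiB could not.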
