HuggingFace Buckets Benchmarks

Xet-backed Buckets vs plain S3 • m5dn.24xlarge (96 vCPUs, 384GB RAM, 100 Gbps) in us-east-1 • huggingface_hub 1.7.1 + hf-xet 1.4.2

Upload Throughput S3 vs Buckets

Raw upload speed with unique random files per run, so dedup cannot help. S3 uses a direct PUT; Buckets use CDC chunking + dedup.

| Size | Run | S3 | Bucket (CLI) |
|---|---|---|---|
| 100MB | 0 | 1.20s (83.6 MB/s) | 6.44s (15.5 MB/s) |
| 100MB | 1 | 1.72s (58.3 MB/s) | 2.28s (43.9 MB/s) |
| 100MB | 2 | 1.16s (86.1 MB/s) | 2.48s (40.3 MB/s) |
| 1GB | 0 | 3.90s (262.8 MB/s) | 6.70s (152.9 MB/s) |
| 1GB | 1 | 3.60s (284.2 MB/s) | 7.10s (144.2 MB/s) |
| 1GB | 2 | 3.15s (325.2 MB/s) | 6.69s (153.0 MB/s) |
| 5GB | 0 | 12.93s (395.8 MB/s) | 17.31s (295.8 MB/s) |
| 5GB | 1 | 13.11s (390.4 MB/s) | 16.70s (306.6 MB/s) |
| 5GB | 2 | 12.08s (423.7 MB/s) | 17.26s (296.7 MB/s) |
| 10GB | 0 | 23.29s (439.7 MB/s) | 33.88s (302.3 MB/s) |
| 10GB | 1 | 25.44s (402.5 MB/s) | 29.31s (349.3 MB/s) |
| 10GB | 2 | 26.01s (393.8 MB/s) | 30.11s (340.1 MB/s) |
| 50GB | 0 | 120.60s (424.5 MB/s) | 129.31s (396.0 MB/s) |
| 50GB | 1 | 122.50s (418.0 MB/s) | 127.90s (400.3 MB/s) |
| 50GB | 2 | 121.80s (420.3 MB/s) | 129.54s (395.2 MB/s) |
| 100GB | 0 | 249.20s (410.9 MB/s) | 252.90s (404.9 MB/s) |
| 100GB | 1 | 242.30s (422.6 MB/s) | 254.50s (402.4 MB/s) |
| 100GB | 2 | 246.50s (415.4 MB/s) | 249.75s (410.0 MB/s) |
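
The MB/s figures in the table can be approximately reproduced from the size and the elapsed time. The sketch below assumes 1 GB is counted as 1024 MB, which is an inference from the published numbers rather than something the page states; the function name is illustrative.

```python
def throughput_mb_s(size_gb: float, seconds: float, mb_per_gb: int = 1024) -> float:
    """Convert a transfer of `size_gb` completed in `seconds` to MB/s.

    mb_per_gb=1024 is an assumption inferred from the table values.
    """
    return size_gb * mb_per_gb / seconds


# S3, 100GB, run 0 (249.20s) and Bucket, 100GB, run 0 (252.90s)
print(round(throughput_mb_s(100, 249.20), 1))
print(round(throughput_mb_s(100, 252.90), 1))
```

Small differences against the table (for example at 5GB) are expected, because the published times are rounded to two decimals.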

Download Throughput S3 vs Buckets

CDN cold = first download (cache miss, origin fetch). Warm = subsequent downloads from edge.

| Size | Run | S3 | Bucket | CDN |
|---|---|---|---|---|
| 100MB | 0-2 | 1.14s, 1.03s, 1.01s (avg 94.6 MB/s) | - | n/a |
| 100MB | cold | - | 1.54s (65.0 MB/s) | cold |
| 100MB | warm 0-2 | - | 1.31s, 0.51s, 0.67s (avg 140.7 MB/s) | warm |
| 1GB | 0-2 | 3.30s, 3.15s, 3.47s (avg 310.5 MB/s) | - | n/a |
| 1GB | cold | - | 3.92s (261.1 MB/s) | cold |
| 1GB | warm 0-2 | - | 3.67s, 2.11s, 1.91s (avg 433.9 MB/s) | warm |
| 5GB | 0-2 | 12.95s, 12.19s, 13.46s (avg 398.6 MB/s) | - | n/a |
| 5GB | cold | - | 7.51s (681.9 MB/s) | cold |
| 5GB | warm 0-2 | - | 7.72s, 4.34s, 4.46s (avg 996.7 MB/s) | warm |
| 10GB | 0-2 | 24.62s, 25.10s, 24.85s (avg 412.0 MB/s) | - | n/a |
| 10GB | cold | - | 13.13s (779.9 MB/s) | cold |
| 10GB | warm 0-2 | - | 13.53s, 8.71s, 7.72s (avg 1086.4 MB/s) | warm |
| 50GB | 0-2 | 121.5s, 123.8s, 122.1s (avg 418.1 MB/s) | - | n/a |
| 50GB | cold | - | 46.12s (1110.2 MB/s) | cold |
| 50GB | warm 0-2 | - | 54.68s, 46.56s, 45.32s (avg 1055.3 MB/s) | warm |
| 100GB | 0-2 | 246.2s, 250.1s, 243.8s (avg 415.1 MB/s) | - | n/a |
| 100GB | cold | - | 93.92s (1090.3 MB/s) | cold |
| 100GB | warm 0-2 | - | 107.57s, 88.10s, 81.33s (avg 1124.4 MB/s) | warm |

Deduplication S3 vs Buckets

500MB base file, then upload variants with N% of bytes modified. S3 always re-uploads the full file.

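| Scenario | Changed | S3 transferred | Bucket transferred | Saved |
|---|---|---|---|---|
| Base upload | 100% | 500 MB | 500 MB | - |
| 1% modified | 1% | 500 MB | 5.5 MB | 99% |
| 5% modified | 5% | 500 MB | 27.5 MB | 95% |
| 10% modified | 10% | 500 MB | 55 MB | 89% |
| 50% modified | 50% | 500 MB | 275 MB | 45% |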
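
The "Saved" column follows from the two transfer columns. A minimal sketch of that arithmetic, using the values from the table (the function name is illustrative):

```python
def saved_pct(full_mb: float, sent_mb: float) -> float:
    """Share of a full re-upload that dedup avoided, in percent."""
    return (1 - sent_mb / full_mb) * 100


BASE_MB = 500
# (percent of bytes modified, MB the Bucket actually transferred)
rows = [(1, 5.5), (5, 27.5), (10, 55.0), (50, 275.0)]

for changed, sent in rows:
    changed_mb = BASE_MB * changed / 100
    print(
        f"{changed}% modified: {saved_pct(BASE_MB, sent):.1f}% saved, "
        f"{sent / changed_mb:.2f}x the modified bytes sent"
    )
```

In every row the Bucket transfers 1.1x the number of modified bytes, which is consistent with whole chunks being re-sent when only part of a chunk changes.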

Incremental Update S3 vs Buckets

500MB file, 2% modified then re-uploaded.

| Scenario | S3 | Bucket |
|---|---|---|
| Initial upload | 2.14s | 4.87s |
| Update (2% changed) | 2.32s (full) | 1.08s (deduped) |

Directory Sync S3 vs Buckets

100 files x 1MB, sync then modify 20% of files and re-sync.

| Scenario | S3 | Bucket |
|---|---|---|
| Initial sync | 2.08s | 5.48s |
| Incremental (20% changed) | 1.07s | 5.20s |

CDN Cache Performance Buckets

Cold vs warm CDN download. Warm = xorbs served from CloudFront edge.

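| Size | Cold (MB/s) | Warm avg (MB/s) | Warm peak (MB/s) | Speedup |
|---|---|---|---|---|
| 100MB | 65.0 | 140.7 | 195.9 | 2.2x |
| 1GB | 261.1 | 433.9 | 536.2 | 1.7x |
| 5GB | 681.9 | 996.7 | 1180.2 | 1.5x |
| 10GB | 779.9 | 1086.4 | 1326.7 | 1.4x |
| 50GB | 1110.2 | 1055.3 | 1129.6 | 1.0x |
| 100GB | 1090.3 | 1124.4 | 1259.0 | 1.0x |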
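
The warm average, warm peak, and speedup columns appear to be derived from the individual warm run times in the download results. The sketch below assumes the warm average is the mean of the per-run throughputs (not size divided by mean time) and that speedup is warm average divided by cold throughput; both are inferences from the numbers, and the function name is illustrative.

```python
def warm_stats(size_mb: float, cold_mb_s: float, warm_times_s: list[float]):
    """Return (warm avg MB/s, warm peak MB/s, speedup over cold)."""
    rates = [size_mb / t for t in warm_times_s]  # throughput of each warm run
    avg = sum(rates) / len(rates)
    return avg, max(rates), avg / cold_mb_s


# 100MB row: cold 65.0 MB/s, warm runs 1.31s, 0.51s, 0.67s
avg, peak, speedup = warm_stats(100, 65.0, [1.31, 0.51, 0.67])
print(f"warm avg {avg:.1f} MB/s, peak {peak:.1f} MB/s, speedup {speedup:.1f}x")
```

The results land within a fraction of a MB/s of the table; the remaining gap is likely due to the published run times being rounded.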

Dedup Efficiency Buckets

Bytes actually transferred as a function of file change percentage. CDC chunks are ~64KB.
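
To illustrate why content-defined chunking keeps the transferred bytes close to the modified bytes, here is a toy sketch. It is not the hf-xet algorithm: the gear table, the boundary mask, and the min/max sizes are all illustrative assumptions, and the chunks average roughly 1KB rather than ~64KB so the demo runs quickly in pure Python.

```python
import hashlib
import random

# Hypothetical gear table: one pseudo-random 32-bit value per byte value.
_rng = random.Random(0)
GEAR = [_rng.getrandbits(32) for _ in range(256)]


def cdc_chunks(data: bytes, avg_bits: int = 10, min_size: int = 256, max_size: int = 8192):
    """Split data at boundaries chosen by a rolling gear hash of the content."""
    mask = (1 << avg_bits) - 1
    chunks, start, h = [], 0, 0
    for i, byte in enumerate(data):
        h = ((h << 1) + GEAR[byte]) & 0xFFFFFFFF
        size = i - start + 1
        # Cut when the low bits of the hash are zero, or the chunk is too big.
        if (size >= min_size and (h & mask) == 0) or size >= max_size:
            chunks.append(data[start:i + 1])
            start, h = i + 1, 0
    if start < len(data):
        chunks.append(data[start:])
    return chunks


def new_bytes(old: bytes, new: bytes) -> int:
    """Bytes in chunks of `new` whose hash is not already known from `old`."""
    known = {hashlib.sha256(c).digest() for c in cdc_chunks(old)}
    return sum(len(c) for c in cdc_chunks(new)
               if hashlib.sha256(c).digest() not in known)


base = bytes(random.Random(1).getrandbits(8) for _ in range(200_000))
# Insert 16 bytes: every later byte shifts, which would defeat fixed-size blocks.
edited = base[:150_000] + b"x" * 16 + base[150_000:]
print(new_bytes(base, edited), "of", len(edited), "bytes would be re-sent")
```

Because boundaries depend on the content rather than on byte offsets, chunks after the insertion line up again with chunks already stored, so only the chunks touching the edit need to be sent.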

Rust Direct Upload xet-data

Upload using xet-data crate directly (no Python, no SHA256). Unique file per run. CLI now nearly matches Rust on large files thanks to hf-xet 1.4.2.

| Size | Run | CLI Python (hub 1.7.1) | Rust direct | Speedup |
|---|---|---|---|---|
| 100MB | avg | 3.73s (33.2 MB/s) | 2.28s (43.9 MB/s) | 1.3x |
| 1GB | avg | 6.83s (150.0 MB/s) | 6.29s (162.8 MB/s) | 1.1x |
| 5GB | avg | 17.09s (299.7 MB/s) | 15.97s (320.8 MB/s) | 1.1x |

Upload Warm-up Effect Buckets

Upload throughput per run with unique random files. Run 0 includes cold start overhead (JIT, connection setup).