S3 implementation comparison
2024-05-09
One question: which S3 implementation should I pilfer for reuse?
I have no desire to write S3 from scratch: the protocol is simple, but it has a lot of small details that cannot be ignored, because users always want maximum compatibility with Amazon S3.
At the time of writing this post, the following options were found: Minio, SeaweedFS, Ceph (RadosGW), Zenko CloudServer, OpenStack Swift, and Deuxfleurs Garage.
The comparison focuses on the S3 frontend, the external part of the server, because the storage layer will be replaced with our own (Vitastor) anyway.
The S3 tests from Ceph were run against each candidate, but only in the simplest configuration, without setting up the additional integrations that some of the tests may require.
Let’s try to look at the features of each implementation!
Minio
- Good compatibility with S3.
- Ceph tests:
311 failed, 321 passed, 97 skipped, 80 deselected, 1 xfailed, 4 warnings, 52 errors in 1025.25s (0:17:05)
- Feature-wise: no Bucket LoggingConfiguration, Website, UserPolicy, PostObject (HTML forms), BucketACL, CORS, STS. Lifecycle seems to be implemented, but through its own policies. Should work with additional configuration: SSE-KMS, which seems to require additional software, and S3Select, which is disabled by default. A number of tests also hit minor issues.
- Slightly shitty code, 437 files with 242000 lines of code in one directory without subdirectories (!).
- The storage layer is somewhat separated (although not perfectly), see type ObjectLayer. But most of the code seems to assume that it works over an FS. For example, listings scan the FS hierarchy…
- Go is both a pro and a con - on the one hand, it requires a libvitastor-go binding, on the other hand, it provides multithreading out of the box.
- AGPL license. Not a huge deal, but not compatible with my VNPL.
SeaweedFS
- Compatibility with S3 is definitely worse than both Minio and Ceph.
- Ceph tests:
176 failed, 56 passed, 37 skipped, 80 deselected, 1 xfailed, 4 warnings, 510 errors in 554.13s (0:09:14)
- Feature-wise: no versioning (which is the main reason for failing these 510 tests with ERROR), Bucket LoggingConfiguration, Website, PostObject, BucketPolicy, BucketACL, CORS, STS, SSE-C, SSE-KMS, S3Select, Lifecycle. Should work with additional configuration: UserPolicy — it seems to be launched via --iam on a separate port.
- Oddly enough, it also emulates S3 over FS, the only difference is that in the case of SeaweedFS this FS is internal (Filer).
- Also written in Go.
- Apache License.
Ceph (RadosGW)
- Best S3 compatibility.
- Ceph tests:
69 failed, 576 passed, 140 skipped, 40 deselected, 1 xfailed, 4 warnings, 34 errors in 879.16s (0:14:39)
- C++ - on the one hand, it does not require bindings; on the other hand, it is most likely still hard to develop, because idiomatic C++ in Ceph is quite heavy.
- The storage layer is separated in some way, but not simple at all; the standard RADOS storage layer takes up as many as 57,000 lines.
- LGPL-2.1 or LGPL-3.0 license.
Zenko CloudServer
A truly unexpected option.
- Surprisingly good compatibility with S3 - worse than Ceph, but better than Minio.
- Ceph tests:
253 failed, 382 passed, 97 skipped, 80 deselected, 1 xfailed, 4 warnings, 47 errors in 220.78s (0:03:40)
- Feature-wise:
  - No Bucket LoggingConfiguration, SSE-C, UserPolicy, PostObject, S3Select, STS.
  - Should work with additional configuration: Website, IAM (BucketACL, BucketPolicy), SSE-KMS (some tests for it even passed).
  - Lifecycle is available through a separate component, Backbeat, but the implementation is rather strange: it requires Kafka and Zookeeper and relies on reading the object metadata log from Kafka, forwarded there via Kafka Connect from the MongoDB OpLog. In other words, the implementation is specific to the metadata storage backend. The file backend also supports uploading the log to Kafka, but this is apparently used only for testing during development.
  - Quotas: sort of supported, but through a separate service, “SCUBA” (Scality Consumption Utilization and Billing API), which is closed-source and not available. Moreover, it seems that SCUBA counts new/changed objects itself, because cloudserver doesn’t have this logic anywhere in the code.
  - Storage Classes are not supported: the only class, STANDARD, is hardcoded. There is support for multiple storage “locations”, but only in the form of Location Constraints for buckets. Storage Class support is very easy to add, though, literally in a couple of lines.
- Zenko is not suitable for “free” use without Vitastor 😈, because both of Scality’s “okayish” storage layer implementations are internal and not published at all.
- Zenko is the second implementation, after Ceph, that is not tied to the FS abstraction, which is harmful for S3. The tests were run with the FS backend though, which is why some of the listing tests failed.
- NodeJS: on the one hand, development is simple; on the other, it also requires a binding and “girls were swimming in the lake, node_modules found” (700 megabytes of node_modules). But this can probably be fixed by bundling it with webpack into one large JS file (typically this allows you to pound everything into a “binary” of 5 megabytes or so). Also, the code is written without async/await, using crutches like async.waterfall(); in all fairness, all of this should have been rewritten long ago.
- Apache License.
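To illustrate why the FS abstraction fits S3 poorly, here is a minimal sketch of S3-style listing over a flat, sorted key space. It is not taken from any of the implementations above; `list_objects` and the sample keys are hypothetical, and pagination (markers, max-keys) is left out. The point is that S3 keys are just strings: `a/b` and `a/b/c` can both be objects, which a file system cannot represent, because `a/b` would have to be a file and a directory at once.

```python
from bisect import bisect_left


def list_objects(keys, prefix="", delimiter=""):
    """Emulate S3 prefix/delimiter listing over a sorted list of keys.

    Returns (contents, common_prefixes), similar in spirit to the
    Contents and CommonPrefixes parts of an S3 listing response.
    """
    contents, common_prefixes = [], []
    # Jump straight to the first key >= prefix; no directory tree is walked.
    for key in keys[bisect_left(keys, prefix):]:
        if not key.startswith(prefix):
            break  # keys are sorted, so no later key can match
        rest = key[len(prefix):]
        cut = rest.find(delimiter) if delimiter else -1
        if cut < 0:
            contents.append(key)
        else:
            # Everything up to and including the delimiter is rolled up.
            rolled = prefix + rest[:cut + len(delimiter)]
            if not common_prefixes or common_prefixes[-1] != rolled:
                common_prefixes.append(rolled)
    return contents, common_prefixes


# "a", "a/b" and "a/b/c" coexist as objects; "x//y" has an empty path segment.
keys = sorted(["a", "a/b", "a/b/c", "a/d", "x//y"])

print(list_objects(keys, prefix="", delimiter="/"))
print(list_objects(keys, prefix="a/", delimiter="/"))
```

With these keys, the top-level listing returns the object `a` alongside the rolled-up prefixes `a/` and `x/`, and listing under `a/` returns `a/b` both as an object and as the prefix `a/b/`. A backend that maps keys to files and directories has to special-case all of this, which is plausibly where FS-backed listing tests start to fail.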
OpenStack Swift
- Ceph tests weren’t run, but the compatibility with S3 is poor, probably similar to SeaweedFS - no versioning, Lifecycle, Policy, Website. At the same time, it supports its own Swift API (not S3), which nobody needs.
- What’s interesting is that there is also a fork of Swift from OpenIO, apparently more compatible - it has at least versioning and Lifecycle, not present in original Swift.
- Python (eww). Probably 100500 million lines of code — all of OpenStack is written by “strong programmers”. I don’t even want to test it, because I don’t want to write anything in Python anyway.
- Apache License.
Deuxfleurs Garage
Another dark horse. Definitely not an option; I’m just including it for completeness.
- The system itself is something like Elliptics in Rust from the French. Why Elliptics? It also has DHT…
- S3 compatibility is weak, although in some places it is slightly better than Swift. It has no object versioning or SSE-KMS (but it has SSE-C), no Lifecycle, no Object Lock, no Tagging, no BucketNotification, no geo-replication, no BucketPolicy or BucketACL (they claim to have their own replacement).
- The development language is Rust. The storage layer is not separated at all. It is tightly coupled to the rest of the system. :-)
Summary
Ceph > Zenko CloudServer > Minio > SeaweedFS.
According to ceph s3-tests: 576 > 382 > 321 > 56.
I have no desire to suffer through sawing RADOS off RadosGW, so we’ll try Zenko!
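The ranking above can be reproduced from the pytest summary lines quoted in each section. The sketch below is only an illustration: the helper names `parse_summary` and `pass_share` are made up, and "share of executed tests" is defined here as passed / (passed + failed + errors), which is my own choice of metric. Note also that the Ceph run deselected 40 tests while the others deselected 80, so the runs are not strictly like-for-like.

```python
import re

# Summary lines copied verbatim from the sections above.
SUMMARIES = {
    "Minio": "311 failed, 321 passed, 97 skipped, 80 deselected, 1 xfailed, 4 warnings, 52 errors in 1025.25s (0:17:05)",
    "SeaweedFS": "176 failed, 56 passed, 37 skipped, 80 deselected, 1 xfailed, 4 warnings, 510 errors in 554.13s (0:09:14)",
    "Ceph (RadosGW)": "69 failed, 576 passed, 140 skipped, 40 deselected, 1 xfailed, 4 warnings, 34 errors in 879.16s (0:14:39)",
    "Zenko CloudServer": "253 failed, 382 passed, 97 skipped, 80 deselected, 1 xfailed, 4 warnings, 47 errors in 220.78s (0:03:40)",
}


def parse_summary(line):
    """Turn a pytest summary line into a dict like {"failed": 311, ...}."""
    counts_part = line.split(" in ")[0]  # drop the trailing run time
    return {name: int(num)
            for num, name in re.findall(r"(\d+) ([a-z]+)", counts_part)}


def pass_share(counts):
    """Share of executed tests (passed + failed + errors) that passed."""
    executed = counts["passed"] + counts["failed"] + counts["errors"]
    return counts["passed"] / executed


results = {name: parse_summary(line) for name, line in SUMMARIES.items()}
# Rank by the absolute number of passed tests, as in the summary.
ranking = sorted(results, key=lambda name: results[name]["passed"], reverse=True)

for name in ranking:
    counts = results[name]
    print(f"{name}: {counts['passed']} passed, "
          f"{pass_share(counts):.0%} of executed tests")
```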