Behavior & Limits
Guarantees, limits, and platform-specific behavior for Alien Storage.
Alien's Storage binding is built on the Apache Arrow object_store crate — the same library used by DataFusion, Delta Lake, and the wider Arrow ecosystem.
Guarantees
On cloud platforms (AWS, GCP, Azure), Alien provisions and manages the storage backing service. These guarantees apply:
Atomic Writes. A put() call either succeeds completely or fails completely. No partial writes are visible to other readers.
Strong Read-After-Write. After a successful put(), any subsequent get() returns the new data immediately. All three cloud providers guarantee this.
Strong List Consistency. After a put() or delete(), list() reflects the change immediately.
Durability. AWS S3 and GCP Cloud Storage: 99.999999999% (11 nines). Azure Blob Storage: at least 99.9999999999% (12 nines) with zone-redundant storage.
Conditional Operations. put() with ifNotExists is atomic — if two concurrent callers create the same key, exactly one succeeds. Backed by object_store's PutMode::Create.
Conditional Updates (ETag). Compare-and-swap via ETags — update an object only if it hasn't changed since you last read it.
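The two conditional modes above can be illustrated with a toy in-memory store. This is a minimal sketch of the semantics only: InMemoryStore, PreconditionFailed, increment, and the if_not_exists / if_match parameter names are hypothetical and are not the Alien or object_store API.

```python
import itertools
import threading


class PreconditionFailed(Exception):
    """Raised when a conditional write does not meet its precondition."""


class InMemoryStore:
    """Toy stand-in for a bucket (hypothetical; not the Alien API)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._objects = {}  # key -> (etag, data)
        self._versions = itertools.count(1)

    def get(self, key):
        with self._lock:
            return self._objects[key]  # (etag, data)

    def put(self, key, data, if_not_exists=False, if_match=None):
        with self._lock:
            current = self._objects.get(key)
            # Create-only mode: fail if the key is already present.
            if if_not_exists and current is not None:
                raise PreconditionFailed(f"{key} already exists")
            # Compare-and-swap mode: fail if the stored ETag differs.
            if if_match is not None and (current is None or current[0] != if_match):
                raise PreconditionFailed(f"{key} changed since it was read")
            etag = f"v{next(self._versions)}"
            self._objects[key] = (etag, data)
            return etag


def increment(store, key, max_attempts=10):
    """Read-modify-write loop: retry until the ETag-guarded put succeeds."""
    for _ in range(max_attempts):
        etag, data = store.get(key)
        try:
            store.put(key, str(int(data) + 1), if_match=etag)
            return
        except PreconditionFailed:
            continue  # another writer got in first; re-read and retry
    raise RuntimeError("too much contention")
```

The retry loop is the usual companion to compare-and-swap: a failed precondition means the object changed since it was read, so the caller re-reads and reapplies its change rather than overwriting someone else's update.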
Limits
| Limit | Value |
|---|---|
| Max object size | 5 TB (multipart required above 5 GB) |
| Max key (path) length | 1,024 bytes (UTF-8) |
| Path segments | Max 255 bytes each |
| Path charset | UTF-8, no control chars, no ./.. segments, no leading/trailing / |
| Multipart minimum part | 5 MiB (except last part) |
Alien does not impose rate limits — backend rate limits apply (see platform details).
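The key rules in the limits table can be checked on the client before a request is sent. The following is a rough sketch that covers only the rules listed above; validate_key is a hypothetical helper, not part of Alien, and the backend may enforce additional rules (for example around empty segments) that are not modeled here.

```python
import unicodedata

MAX_KEY_BYTES = 1024      # max key length from the limits table
MAX_SEGMENT_BYTES = 255   # max length of each '/'-separated segment


def validate_key(key: str) -> list:
    """Return a list of rule violations; an empty list means the key passes."""
    problems = []
    if len(key.encode("utf-8")) > MAX_KEY_BYTES:
        problems.append("key exceeds 1,024 bytes")
    if key.startswith("/") or key.endswith("/"):
        problems.append("leading or trailing '/'")
    # Unicode category "Cc" covers the C0/C1 control characters.
    if any(unicodedata.category(ch) == "Cc" for ch in key):
        problems.append("contains control characters")
    for segment in key.split("/"):
        if segment in (".", ".."):
            problems.append(f"'{segment}' segment not allowed")
        elif len(segment.encode("utf-8")) > MAX_SEGMENT_BYTES:
            problems.append("segment exceeds 255 bytes")
    return problems
```

Lengths are measured in encoded UTF-8 bytes rather than characters, so a key made of multi-byte characters reaches the limit sooner than its character count suggests.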
Operation Semantics
These come from the object_store crate and hold across all backends:
List ordering. list() does not guarantee ordering. Sort on the client if needed.
List prefix matching. Segment-based: prefix "foo/bar" matches "foo/bar/x.json" but not "foo/bar_baz/x.json".
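Segment-based matching and client-side sorting can be sketched as follows. matches_prefix is a hypothetical helper written for illustration and is not how the backend implements listing. It treats a key exactly equal to the prefix as a match, which is an assumption the text above does not confirm.

```python
def matches_prefix(key: str, prefix: str) -> bool:
    """True if the prefix's segments equal the key's leading segments."""
    prefix_parts = [p for p in prefix.split("/") if p]
    key_parts = key.split("/")
    # Compare whole segments, so "foo/bar" never matches "foo/bar_baz/...".
    return key_parts[: len(prefix_parts)] == prefix_parts


keys = ["foo/bar_baz/x.json", "foo/bar/x.json", "foo/bar/a.json", "foo/other.json"]

# list() order is not guaranteed, so sort on the client when order matters.
listed = sorted(k for k in keys if matches_prefix(k, "foo/bar"))
```

A plain string startswith check would wrongly include "foo/bar_baz/x.json"; comparing whole segments is what excludes it.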
Copy and rename. copy() is atomic when the backend supports it. rename() is copy + delete — not atomic.
Multipart uploads. Parts can be uploaded concurrently and in any order. The object is finalized atomically on complete(). Abandoned uploads are cleaned up by the backend.
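To make the part rules concrete, here is a small sketch that splits an object into parts that respect the 5 MiB minimum from the limits table, then reassembles them regardless of arrival order. plan_parts and assemble are hypothetical helpers, the 8 MiB default part size is an arbitrary choice, and the real finalization happens on the backend rather than in client code.

```python
MIN_PART_BYTES = 5 * 1024 * 1024  # 5 MiB minimum (last part may be smaller)


def plan_parts(total_bytes: int, part_bytes: int = 8 * 1024 * 1024) -> list:
    """Return (offset, length) pairs; only the last part may be short."""
    if part_bytes < MIN_PART_BYTES:
        raise ValueError("part size is below the 5 MiB minimum")
    if total_bytes <= 0:
        raise ValueError("total size must be positive")
    return [
        (offset, min(part_bytes, total_bytes - offset))
        for offset in range(0, total_bytes, part_bytes)
    ]


def assemble(parts: dict) -> bytes:
    """Join parts by part index, regardless of the order they arrived in."""
    return b"".join(parts[index] for index in sorted(parts))
```

Because each part is identified by its index rather than its arrival time, uploads can run in parallel and finish in any order without affecting the final object.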
Platform Details
AWS (S3)
- Request rates auto-scale per prefix. No ramp-up required.
- Baseline: 5,500 GET/HEAD + 3,500 PUT/DELETE per second per prefix.
- Versioning and lifecycle rules fully supported via stack configuration.
GCP (Cloud Storage)
- Baseline: ~5,000 reads/sec + ~1,000 writes/sec per bucket.
- Must ramp up gradually (double every 20 minutes) for sustained high load.
- IAM/ACL changes are eventually consistent (~1 minute).
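As a rough worked example of the ramp-up rule, the time needed to reach a target rate can be estimated by counting doublings from the baseline. ramp_up_minutes is a hypothetical helper; it assumes the rate can double exactly once per 20-minute interval, which simplifies real backend behavior.

```python
def ramp_up_minutes(baseline_rps: float, target_rps: float,
                    doubling_minutes: int = 20) -> int:
    """Minutes of gradual ramp-up needed to grow from baseline to target."""
    if baseline_rps <= 0:
        raise ValueError("baseline must be positive")
    minutes = 0
    rate = baseline_rps
    # Each interval doubles the sustainable rate until the target is covered.
    while rate < target_rps:
        rate *= 2
        minutes += doubling_minutes
    return minutes
```

Under these assumptions, growing from the ~1,000 writes/sec baseline to 8,000 writes/sec takes three doublings, or about 60 minutes; a 10,000 writes/sec target needs a fourth doubling, or about 80 minutes.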
Azure (Blob Storage)
- Per-blob limit: 500 requests/sec, 60 MB/s throughput.
- Versioning not supported per container. The versioning stack option is ignored with a warning.
- Lifecycle rules not supported per container. The lifecycle stack option is ignored.
- Requires a Storage Account (provisioned automatically by Alien).
Local
- Filesystem-backed directory. Durability depends on OS fsync behavior.
- Deleting a non-existent key returns a not-found error (unlike cloud platforms where it's a no-op).
Triggers
| Platform | Storage Triggers |
|---|---|
| AWS | S3 event notifications |
| GCP | Cloud Storage notifications |
| Azure | Dapr blob storage binding |
| Local | LocalTriggerService |
Design Decisions
Built on object_store. Battle-tested foundation rather than a custom abstraction. Same library used by DataFusion and Delta Lake.
No Alien-level rate limiting. Cloud provider native rate limits apply. AWS auto-scales, GCP requires ramp-up — we document the difference rather than hiding it.
Permissive key charset. Unlike KV, Storage allows any valid UTF-8 key. Matches developer expectations for file paths.