AWS S3
General Overview
- Object storage service for scalable, durable data storage.
- Designed for 99.999999999% (11 9s) durability; availability ranges from 99.5% to 99.99% depending on the storage class.
- Unlimited storage; pay for usage (storage, requests, data transfer).
- Buckets live in a single Region; global access is available through Multi-Region Access Points. Integrates with other AWS services (EC2, Lambda, etc.).
- Data Model:
- Bucket – top-level container (unique name, global namespace)
- Object – file + metadata
- Key – full path to object within a bucket
- General purpose buckets have no real directories: the namespace is flat, and "folders" are only shared key prefixes.
- Maximum object size is 5 TB. A single PUT can upload at most 5 GB; larger objects must use multipart upload.
- Objects carry key-value metadata and can also have key-value tags (useful for security and lifecycle rules).
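To illustrate the flat key namespace, here is a minimal sketch of how a delimiter-based listing can make prefixes look like folders. The function `list_level` and the sample keys are hypothetical; this is a local model, not the S3 API.

```python
def list_level(keys, prefix="", delimiter="/"):
    """Split keys under `prefix` into direct objects and 'folder-like' common prefixes."""
    objects, common_prefixes = [], set()
    for key in keys:
        if not key.startswith(prefix):
            continue
        rest = key[len(prefix):]
        if delimiter in rest:
            # Everything up to the next delimiter is reported once, like a folder
            common_prefixes.add(prefix + rest.split(delimiter, 1)[0] + delimiter)
        else:
            objects.append(key)
    return objects, sorted(common_prefixes)


# Hypothetical keys stored in one bucket
keys = ["logs/2024/app.log", "logs/2025/app.log", "logs/readme.txt", "index.html"]
objects, folders = list_level(keys, prefix="logs/")
```

Here `objects` holds only `logs/readme.txt`, while `folders` holds the two year prefixes, even though no directory object exists anywhere.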
Storage Classes
| Class | Use Case | Durability | Availability | Retrieval Time | Minimum Storage Duration | Retrieval Fee |
|---|---|---|---|---|---|---|
| S3 Standard | Frequent access | 11 9s | 99.99% | Instant | None | No |
| S3 Intelligent-Tiering | Unknown access patterns | 11 9s | 99.9–99.99% | Instant | None | No |
| S3 Standard-IA | Infrequent access | 11 9s | 99.9% | Instant | 30 days | Yes * |
| S3 One Zone-IA | Non-critical infrequent data | 11 9s | 99.5% | Instant | 30 days | Yes |
| S3 Glacier Instant Retrieval | Rarely accessed, quick retrieval | 11 9s | 99.9% | ms | 90 days | Yes |
| S3 Glacier Flexible Retrieval | Archive w/ minutes–hours access | 11 9s | 99.99% | minutes–hours | 90 days | Yes |
| S3 Glacier Deep Archive | Long-term cold storage | 11 9s | 99.99% | hours (12h typical) | 180 days | Yes |
* : Retrieval is priced per GB.
Glacier Retrieval Options
| Tier | Flexible Retrieval | Deep Archive |
|---|---|---|
| Expedited | 1-5 minutes | N/A |
| Standard | 3-5 hours | 12 hours |
| Bulk | 5-12 hours | 48 hours |
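The tier table can be turned into a small validation step before requesting a restore. This is a sketch under assumptions: `build_restore_request` is a hypothetical helper, and the returned dictionary follows the `RestoreRequest` shape that boto3's `restore_object` accepts.

```python
# Retrieval windows copied from the table above, keyed by S3 storage class name
RETRIEVAL_WINDOWS = {
    "GLACIER": {"Expedited": "1-5 minutes", "Standard": "3-5 hours", "Bulk": "5-12 hours"},
    "DEEP_ARCHIVE": {"Standard": "12 hours", "Bulk": "48 hours"},
}


def build_restore_request(storage_class, tier, days):
    """Return a restore request body, rejecting tiers the class does not offer."""
    tiers = RETRIEVAL_WINDOWS[storage_class]
    if tier not in tiers:
        raise ValueError(f"{tier} retrieval is not available for {storage_class}")
    # `Days` is how long the temporary restored copy stays available
    return {"Days": days, "GlacierJobParameters": {"Tier": tier}}


request = build_restore_request("DEEP_ARCHIVE", "Bulk", days=7)
```

With boto3 the result could be passed as `s3.restore_object(Bucket=..., Key=..., RestoreRequest=request)`.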
Versioning
Lifecycle rule actions
Transition rule actions
- (R1) Transition current versions of objects between storage classes.
- Storage class transitions (Target storage class).
- Days after object creation.
- (R2) Transition noncurrent versions of objects between storage classes.
- Storage class transitions.
- Days after objects become noncurrent.
- Number of newer versions to retain.
Deletion/Expiration rule actions
- (R3) Expire current versions of objects.
- Days after object creation
- (R4) Permanently delete noncurrent versions of objects.
- Days after objects become noncurrent
- Number of newer versions to retain - Optional
- (R5) Delete expired object delete markers or incomplete multipart uploads.
- Delete expired object delete markers
- Delete incomplete multipart uploads
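The five rule actions (R1 to R5) map onto a lifecycle configuration roughly as sketched below. The rule IDs, the `logs/` prefix, and the day counts are illustrative assumptions; the dictionary follows the structure boto3 expects for `put_bucket_lifecycle_configuration`.

```python
lifecycle_configuration = {
    "Rules": [
        {
            "ID": "tier-and-expire-logs",          # hypothetical rule name
            "Status": "Enabled",
            "Filter": {"Prefix": "logs/"},         # hypothetical prefix
            # (R1) transition current versions, days counted from creation
            "Transitions": [
                {"Days": 30, "StorageClass": "STANDARD_IA"},
                {"Days": 90, "StorageClass": "GLACIER"},
            ],
            # (R2) transition noncurrent versions, keeping 3 newer versions untouched
            "NoncurrentVersionTransitions": [
                {"NoncurrentDays": 30, "StorageClass": "STANDARD_IA", "NewerNoncurrentVersions": 3}
            ],
            # (R3) expire current versions
            "Expiration": {"Days": 365},
            # (R4) permanently delete noncurrent versions
            "NoncurrentVersionExpiration": {"NoncurrentDays": 90, "NewerNoncurrentVersions": 3},
        },
        {
            "ID": "cleanup",                       # hypothetical rule name
            "Status": "Enabled",
            "Filter": {"Prefix": ""},              # whole bucket
            # (R5) remove expired delete markers and stale multipart uploads
            "Expiration": {"ExpiredObjectDeleteMarker": True},
            "AbortIncompleteMultipartUpload": {"DaysAfterInitiation": 7},
        },
    ]
}
```

The delete-marker cleanup sits in its own rule because `ExpiredObjectDeleteMarker` cannot be combined with `Days` inside the same `Expiration` element.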
Object deletion in a versioned bucket.
- Delete an object with `Show versions` off -> Soft Delete -> a delete marker is created and becomes the current version, shadowing all other versions.
- Delete an object with `Show versions` on -> Permanent Delete for the chosen version -> if the current version is deleted, the latest noncurrent version becomes current.
- No promotion is supported. If an old version is wanted, it should be copied over the latest version to create a new version with the content of the old one.
- The lifecycle expiration action (R3) creates a delete marker, which becomes the current version.
- The expiration rule (R3) applies only to actual object versions, not to delete markers.
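The deletion behaviour described above can be traced with a toy in-memory model of a single key. `VersionedKey` is a hypothetical class written for illustration; it mimics the semantics only and makes no S3 calls.

```python
class VersionedKey:
    """Toy model of one key in a versioning-enabled bucket (not the real S3 API)."""

    def __init__(self):
        self.versions = []   # oldest first; the last entry is the current version
        self._counter = 0

    def _new_id(self):
        self._counter += 1
        return f"v{self._counter}"

    def put(self, body):
        vid = self._new_id()
        self.versions.append({"id": vid, "body": body, "delete_marker": False})
        return vid

    def delete(self, version_id=None):
        if version_id is None:
            # Soft delete: a delete marker becomes the current version
            vid = self._new_id()
            self.versions.append({"id": vid, "body": None, "delete_marker": True})
            return vid
        # Permanent delete of exactly one version (or one delete marker)
        self.versions = [v for v in self.versions if v["id"] != version_id]
        return version_id

    def get(self, version_id=None):
        if version_id is not None:
            for v in self.versions:
                if v["id"] == version_id:
                    return v["body"]
            return None
        if not self.versions or self.versions[-1]["delete_marker"]:
            return None      # S3 would answer 404 here
        return self.versions[-1]["body"]


obj = VersionedKey()
v1 = obj.put("draft")
v2 = obj.put("final")
marker = obj.delete()        # soft delete
hidden = obj.get()           # the marker shadows both versions
obj.delete(marker)           # removing the marker exposes "final" again
restored = obj.get()
obj.put(obj.get(v1))         # "promote" v1 by copying it as a new version
```

The last line shows the copy-over approach: the old content becomes current only because a new version with the same body is written.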
Replication
Cross-Region Replication (CRR) vs Same-Region Replication (SRR)
| Feature | Details |
|---|---|
| Prerequisites | Versioning enabled on both source and destination |
| Replication scope | All objects, prefix, or tags |
| What's replicated | New objects after enabling, metadata, ACLs, tags |
| Not replicated | Existing objects (need S3 Batch Replication), lifecycle actions, objects in Glacier/Deep Archive |
| Delete behavior | Delete markers can be replicated (optional), version deletes not replicated |
| Replication Time Control (RTC) | 99.99% within 15 minutes (SLA) |
| Batch Replication | Replicate existing objects, failed replications |
Two-way replication
- Enable bidirectional replication between buckets
- Prevents replication loops automatically
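As a rough sketch of how the options in the table fit together, the helper below builds a replication configuration in the shape boto3's `put_bucket_replication` accepts. The helper name, the role ARN, and the bucket ARN are placeholders, not values from this document.

```python
def replication_rule(rule_id, dest_bucket_arn, prefix="", priority=1,
                     replicate_delete_markers=False, rtc=False):
    """Build one replication rule; versioning must already be on for both buckets."""
    rule = {
        "ID": rule_id,
        "Priority": priority,
        "Status": "Enabled",
        "Filter": {"Prefix": prefix},   # empty prefix = all objects
        "DeleteMarkerReplication": {
            "Status": "Enabled" if replicate_delete_markers else "Disabled"
        },
        "Destination": {"Bucket": dest_bucket_arn},
    }
    if rtc:
        # Replication Time Control: 15-minute target, with metrics enabled alongside
        rule["Destination"]["ReplicationTime"] = {"Status": "Enabled", "Time": {"Minutes": 15}}
        rule["Destination"]["Metrics"] = {"Status": "Enabled", "EventThreshold": {"Minutes": 15}}
    return rule


replication_configuration = {
    "Role": "arn:aws:iam::111122223333:role/example-replication-role",  # placeholder
    "Rules": [
        replication_rule("replicate-reports", "arn:aws:s3:::example-destination-bucket",
                         prefix="reports/", replicate_delete_markers=True, rtc=True)
    ],
}
```

For two-way replication, a mirrored configuration would be applied on the other bucket with source and destination swapped.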
Security
Encryption at Rest
| Type | Key Management | Notes |
|---|---|---|
| SSE-S3 | AWS managed (AES-256) | No impact |
| SSE-KMS | AWS KMS keys | KMS API limits apply |
| SSE-C | Customer-provided keys | Customer manages keys |
| Client-side | Encrypt before upload | Customer responsibility |
- Bucket default encryption: Applied to new objects without specified encryption
- Enforce encryption: Use bucket policy to deny unencrypted uploads
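One common way to enforce encryption is a bucket policy that denies `s3:PutObject` unless the request names the required server-side encryption. The sketch below assumes SSE-KMS is the required mode and uses a placeholder bucket name; `deny_unencrypted_uploads` is a hypothetical helper.

```python
import json


def deny_unencrypted_uploads(bucket, required="aws:kms"):
    """Bucket policy denying uploads whose encryption header is not `required`."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyUnencryptedUploads",
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:PutObject",
                "Resource": f"arn:aws:s3:::{bucket}/*",
                "Condition": {
                    "StringNotEquals": {"s3:x-amz-server-side-encryption": required}
                },
            }
        ],
    }


policy_json = json.dumps(deny_unencrypted_uploads("example-bucket"), indent=2)
```

The JSON string could then be attached with `s3.put_bucket_policy(Bucket="example-bucket", Policy=policy_json)`. A policy like this also rejects uploads that omit the header and rely on default encryption, so the two mechanisms should be chosen together.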
Encryption in Transit
- SSL/TLS (HTTPS) endpoints available
- Enforce with bucket policy: `aws:SecureTransport` condition
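A minimal sketch of an HTTPS-only bucket policy built on the `aws:SecureTransport` condition key follows. The bucket name is a placeholder and `deny_insecure_transport` is a hypothetical helper.

```python
import json


def deny_insecure_transport(bucket):
    """Bucket policy that denies every S3 action sent over plain HTTP."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyInsecureTransport",
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:*",
                # Cover both bucket-level and object-level operations
                "Resource": [f"arn:aws:s3:::{bucket}", f"arn:aws:s3:::{bucket}/*"],
                "Condition": {"Bool": {"aws:SecureTransport": "false"}},
            }
        ],
    }


https_only_policy = json.dumps(deny_insecure_transport("example-bucket"), indent=2)
```

Because the statement is an explicit deny, it wins over any allow granted elsewhere.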
Access Control
Priority order: Explicit DENY → Explicit ALLOW → Implicit DENY
| Method | Scope | Use Case |
|---|---|---|
| IAM Policies | User/role level | Control who can access S3 |
| Bucket Policies | Bucket level | Cross-account, public access, IP restrictions |
| ACLs (legacy) | Bucket/object level | Simple permissions (avoid for new implementations) |
| Access Points | Subset of bucket | Simplify permissions for shared datasets |
| Presigned URLs | Object level | Temporary access without credentials |
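The priority order (explicit deny, then explicit allow, then implicit deny) can be sketched as a tiny evaluator. This is a deliberate simplification: it ignores principals, conditions, and cross-account rules, and it matches wildcards with `fnmatch`, case-sensitively, which is only an approximation of real policy matching. All names here are hypothetical.

```python
from fnmatch import fnmatchcase


def _matches(patterns, value):
    """True if `value` matches a pattern or any pattern in a list (`*` wildcards)."""
    if isinstance(patterns, str):
        patterns = [patterns]
    return any(fnmatchcase(value, p) for p in patterns)


def evaluate(statements, action, resource):
    """Return 'ExplicitDeny', 'Allow', or 'ImplicitDeny' for one request."""
    allowed = False
    for stmt in statements:
        if _matches(stmt["Action"], action) and _matches(stmt["Resource"], resource):
            if stmt["Effect"] == "Deny":
                return "ExplicitDeny"   # a matching deny always wins
            allowed = True
    return "Allow" if allowed else "ImplicitDeny"


# Statements gathered from IAM and bucket policies (placeholder bucket name)
statements = [
    {"Effect": "Allow", "Action": "s3:GetObject", "Resource": "arn:aws:s3:::example-bucket/*"},
    {"Effect": "Deny", "Action": "s3:*", "Resource": "arn:aws:s3:::example-bucket/secret/*"},
]

public_read = evaluate(statements, "s3:GetObject", "arn:aws:s3:::example-bucket/public/a.txt")
secret_read = evaluate(statements, "s3:GetObject", "arn:aws:s3:::example-bucket/secret/a.txt")
public_write = evaluate(statements, "s3:PutObject", "arn:aws:s3:::example-bucket/public/a.txt")
```

The three results cover each branch: an allowed read, a read blocked by the deny statement, and a write that no statement mentions.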
Block Public Access (BPA)
- Four settings: Block public ACLs, Ignore public ACLs, Block public policies, Restrict public buckets
- Applied at account or bucket level
- Overrides bucket policies and ACLs
S3 Access Points
- Named network endpoints with dedicated policies
- Each access point has its own DNS name
- Supports VPC-only access
- Simplifies managing access for shared datasets
- Can restrict to specific VPC/VPCE
Event Notifications
Destinations: SNS, SQS, Lambda, EventBridge
Events:
- Object created (PUT, POST, COPY, CompleteMultipartUpload)
- Object deleted, restored
- Replication events
- Lifecycle events
- Intelligent-Tiering changes
EventBridge advantages:
- Advanced filtering (JSON rules)
- Multiple destinations
- Archive, replay events
- 18+ AWS service targets
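As an example of the JSON-rule filtering EventBridge offers, the sketch below builds an event pattern that matches object-created events for one bucket and key prefix. The bucket name and prefix are placeholders, and `object_created_pattern` is a hypothetical helper; the field names follow the S3 event format delivered to EventBridge.

```python
import json


def object_created_pattern(bucket, key_prefix):
    """EventBridge event pattern for new objects under `key_prefix` in `bucket`."""
    return {
        "source": ["aws.s3"],
        "detail-type": ["Object Created"],
        "detail": {
            "bucket": {"name": [bucket]},
            "object": {"key": [{"prefix": key_prefix}]},
        },
    }


event_pattern = json.dumps(object_created_pattern("example-bucket", "uploads/"))
```

The string could be used as the `EventPattern` argument of the EventBridge `put_rule` call, provided EventBridge delivery is turned on for the bucket.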
S3 Directory Buckets
- New bucket type optimized for high performance
- Used with S3 Express One Zone storage class
- Single-digit millisecond latency
- Up to 100GB/s throughput per bucket
- Consistent hashing for predictable performance
- Different naming: `bucket-name--azid--x-s3`
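The naming pattern can be built and taken apart with plain string handling. The helpers and the sample Availability Zone ID `usw2-az1` below are illustrative assumptions, and the check is a loose structural one, not AWS's full naming validation.

```python
SUFFIX = "x-s3"


def directory_bucket_name(base_name, az_id):
    """Compose a directory bucket name as base-name--azid--x-s3."""
    return f"{base_name}--{az_id}--{SUFFIX}"


def parse_directory_bucket_name(name):
    """Split a directory bucket name into (base_name, az_id); raise if malformed."""
    parts = name.rsplit("--", 2)
    if len(parts) != 3 or parts[2] != SUFFIX or not parts[0] or not parts[1]:
        raise ValueError(f"not a directory bucket name: {name!r}")
    return parts[0], parts[1]


name = directory_bucket_name("my-data", "usw2-az1")
base_name, az_id = parse_directory_bucket_name(name)
```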
Performance
Multipart Upload
- Required for objects > 5GB
- Recommended for objects > 100MB
- Parts: 1 to 10,000 parts, each 5 MB to 5 GB (the last part may be smaller than 5 MB)
- Benefits: Parallel uploads, pause/resume, start before knowing final size
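The part limits above determine how an object can be split. The sketch below picks a part size that respects them; `plan_parts` is a hypothetical helper, the 100 MiB default is an arbitrary choice, and sizes are treated as binary units (MiB/GiB).

```python
import math

MIB = 1024 ** 2
GIB = 1024 ** 3
MIN_PART = 5 * MIB      # minimum size for every part except the last
MAX_PART = 5 * GIB      # maximum size of a single part
MAX_PARTS = 10_000      # maximum number of parts per upload


def plan_parts(object_size, preferred_part_size=100 * MIB):
    """Return (part_size, part_count, last_part_size) for a multipart upload."""
    if object_size <= 0:
        raise ValueError("object_size must be positive")
    # Grow the part size if the preferred one would need more than 10,000 parts
    part_size = max(preferred_part_size, MIN_PART, math.ceil(object_size / MAX_PARTS))
    if part_size > MAX_PART:
        raise ValueError("object is too large for a multipart upload")
    part_count = math.ceil(object_size / part_size)
    last_part_size = object_size - (part_count - 1) * part_size
    return part_size, part_count, last_part_size


part_size, part_count, last_part_size = plan_parts(1 * GIB)
```

For a 1 GiB object with the default preference this gives 11 parts, the last one being 24 MiB.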
Transfer Acceleration
- Uses CloudFront edge locations
- URL: `bucket-name.s3-accelerate.amazonaws.com`
- Up to 50-500% faster for global users
- Additional cost per GB
- Test speed: AWS provides comparison tool
Performance Baseline
- 3,500 PUT/COPY/POST/DELETE requests per second per prefix
- 5,500 GET/HEAD requests per second per prefix
- No limit on prefixes per bucket
- Spread objects across prefixes for higher throughput
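Since the limits apply per prefix, a back-of-the-envelope calculation shows how many prefixes a target request rate needs. `prefixes_needed` is a hypothetical helper and the target rates are made-up examples.

```python
import math

WRITE_LIMIT_PER_PREFIX = 3_500   # PUT/COPY/POST/DELETE requests per second
READ_LIMIT_PER_PREFIX = 5_500    # GET/HEAD requests per second


def prefixes_needed(target_rps, per_prefix_limit):
    """Smallest number of prefixes whose combined baseline covers `target_rps`."""
    return max(1, math.ceil(target_rps / per_prefix_limit))


read_prefixes = prefixes_needed(22_000, READ_LIMIT_PER_PREFIX)     # 22,000 / 5,500
write_prefixes = prefixes_needed(10_000, WRITE_LIMIT_PER_PREFIX)   # 10,000 / 3,500, rounded up
```

This assumes requests are spread evenly across the prefixes; a skewed key distribution would need more.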
Byte-Range Fetches
- Request specific byte ranges of object
- Parallelize downloads
- Resilient to network failures (retry smaller range)
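To parallelize a download, the object can be cut into HTTP `Range` header values that are fetched independently. The sketch below only computes the ranges; `byte_ranges` is a hypothetical helper and the sizes are tiny on purpose.

```python
def byte_ranges(object_size, chunk_size):
    """Return inclusive HTTP Range header values covering the whole object."""
    return [
        f"bytes={start}-{min(start + chunk_size, object_size) - 1}"
        for start in range(0, object_size, chunk_size)
    ]


ranges = byte_ranges(object_size=10, chunk_size=4)
# ranges == ["bytes=0-3", "bytes=4-7", "bytes=8-9"]
```

Each value could be passed as the `Range` argument of boto3's `get_object`, and a failed chunk can be retried without refetching the others.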
S3 Select & Glacier Select
- Retrieve subset of data using SQL
- Filter at S3 side (up to 400% faster, 80% cheaper)
- Works with CSV, JSON, Parquet
- Supports compressed input (GZIP, BZIP2) for CSV and JSON objects
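A sketch of what an S3 Select request can look like for a GZIP-compressed CSV object follows. The bucket, key, and column names are invented for illustration; the argument names follow boto3's `select_object_content`.

```python
def csv_select_request(bucket, key, expression, compression="GZIP"):
    """Arguments for an S3 Select query over a CSV object with a header row."""
    return {
        "Bucket": bucket,
        "Key": key,
        "ExpressionType": "SQL",
        "Expression": expression,
        "InputSerialization": {
            "CSV": {"FileHeaderInfo": "USE"},   # use the header row for column names
            "CompressionType": compression,
        },
        "OutputSerialization": {"JSON": {}},    # return matching rows as JSON records
    }


select_request = csv_select_request(
    "example-bucket",
    "data/customers.csv.gz",
    "SELECT s.name, s.city FROM S3Object s WHERE s.city = 'Berlin'",
)
```

With boto3 this could be sent as `s3.select_object_content(**select_request)`, so only the filtered rows leave S3.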