💾

Storage - S3, EBS, EFS

S3, EBS, EFS & Storage Gateway

⏱️ Estimated reading time: 25 minutes

Amazon S3 - Simple Storage Service

S3 is an object storage service offering scalability, data availability, security, and performance.

Key Concepts:
- Bucket: Object container with globally unique name
- Object: Stored file (up to 5TB per object)
- Key: Unique object name within bucket
- Region: Each bucket is created in a specific region

Features:
- Durability of 99.999999999% (11 nines)
- Availability of 99.99% (Standard)
- Object versioning
- Encryption at rest and in transit
- Granular access control

🎯 Key Points

  • ✓ S3 is object storage, not a file system
  • ✓ Bucket names must be globally unique
  • ✓ Large objects should use multipart upload (recommended above 100 MB, required above 5 GB)
  • ✓ Strong read-after-write consistency since Dec 2020
  • ✓ S3 has no total storage limit
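To make the multipart guidance concrete, here is a minimal sketch in POSIX shell that estimates how many parts an upload needs. It assumes the documented S3 multipart limits (at most 10,000 parts, each at least 5 MiB except the last). The function names `part_count` and `min_part_size` and the 8 MiB example part size are illustrative choices, not something the service prescribes.

```shell
#!/bin/sh
# S3 multipart limits, hard-coded here as assumptions taken from the service limits.
MAX_PARTS=10000
MIN_PART_BYTES=$((5 * 1024 * 1024))   # 5 MiB minimum for every part except the last

# part_count SIZE_BYTES PART_BYTES -> number of parts (ceiling division)
part_count() {
  echo $(( ($1 + $2 - 1) / $2 ))
}

# min_part_size SIZE_BYTES -> smallest part size that keeps the upload within MAX_PARTS
min_part_size() {
  needed=$(( ($1 + $MAX_PARTS - 1) / $MAX_PARTS ))
  if [ "$needed" -lt "$MIN_PART_BYTES" ]; then
    needed=$MIN_PART_BYTES
  fi
  echo "$needed"
}

GIB=$((1024 * 1024 * 1024))
part_count $((1 * GIB)) $((8 * 1024 * 1024))   # 1 GiB in 8 MiB parts -> prints 128
min_part_size $((100 * GIB))                   # 100 GiB object -> prints 10737419
```

The second call shows why very large objects need larger parts: at 5 MiB per part, a 100 GiB object would exceed the 10,000-part ceiling.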

💻 Basic S3 operations

# Create bucket
aws s3 mb s3://my-unique-bucket-12345

# Upload file
aws s3 cp file.txt s3://my-unique-bucket-12345/

# List objects
aws s3 ls s3://my-unique-bucket-12345/

# Sync directory
aws s3 sync ./my-folder s3://my-unique-bucket-12345/folder/

# Enable versioning
aws s3api put-bucket-versioning \
  --bucket my-unique-bucket-12345 \
  --versioning-configuration Status=Enabled

S3 Storage Classes

S3 offers different storage classes to optimize costs based on access patterns:

S3 Standard:
- General purpose, frequent access
- Low latency, high throughput
- Durability 11 nines, availability 99.99%

S3 Intelligent-Tiering:
- Automatically moves objects between tiers based on access
- No retrieval charges, small monitoring fee
- Access tiers: Frequent, Infrequent, and Archive Instant Access (automatic), plus optional Archive Access and Deep Archive Access (opt-in)

S3 Standard-IA (Infrequent Access):
- Less frequent access but fast when needed
- Lower storage cost, retrieval charge
- Min storage duration 30 days, min billable object size 128KB

S3 One Zone-IA:
- Like Standard-IA but in single AZ
- 20% cheaper than Standard-IA
- Availability 99.5%

S3 Glacier Flexible Retrieval (formerly S3 Glacier):
- Long-term archive, minutes to hours retrieval
- Retrieval: Expedited (1-5 min), Standard (3-5 hrs), Bulk (5-12 hrs)

S3 Glacier Deep Archive:
- Lowest cost storage
- Retrieval: Standard (within 12 hrs), Bulk (within 48 hrs)
- Min 180 days retention
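The minimum-duration rules above affect cost when objects are deleted early. The sketch below is a simplified illustration: it treats early deletion as being billed for the full minimum duration. The 30-day and 180-day values come from the list above; the 30-day value for One Zone-IA and the 90-day value for the `GLACIER` class are assumptions based on the AWS pricing documentation. The function names are hypothetical.

```shell
#!/bin/sh
# min_storage_days CLASS -> minimum storage duration in days (0 = no minimum)
# Class names follow the S3 API storage class identifiers.
min_storage_days() {
  case "$1" in
    STANDARD_IA|ONEZONE_IA) echo 30 ;;
    GLACIER)                echo 90 ;;   # assumption: not stated in the text above
    DEEP_ARCHIVE)           echo 180 ;;
    *)                      echo 0 ;;
  esac
}

# billable_days CLASS DAYS_STORED -> days charged if the object is deleted after DAYS_STORED
billable_days() {
  min=$(min_storage_days "$1")
  if [ "$2" -lt "$min" ]; then
    echo "$min"
  else
    echo "$2"
  fi
}

billable_days STANDARD_IA 10     # prints 30
billable_days DEEP_ARCHIVE 200   # prints 200
```

An object kept for 10 days in Standard-IA is still charged as if it had been stored for 30, which is why short-lived data usually belongs in S3 Standard.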

🎯 Key Points

  • ✓ Lifecycle policies automate transitions between classes
  • ✓ Standard: frequent access, IA: occasional, Glacier: archive
  • ✓ Intelligent-Tiering automates tiering when access patterns are unknown; lifecycle policies are still needed for expiration
  • ✓ One Zone-IA data can be lost if its AZ is destroyed
  • ✓ Glacier has retrieval charges based on speed
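A lifecycle policy that automates these transitions might look like the sketch below. The bucket name is borrowed from the earlier examples; the rule ID, the `logs/` prefix, and the day thresholds are hypothetical values chosen for illustration. The script only prints the AWS CLI command unless `APPLY=1` is set, so it can be run safely without credentials.

```shell
#!/bin/sh
BUCKET="my-unique-bucket-12345"   # example bucket name, replace with your own
RULES_FILE="$(mktemp)"

# Lifecycle rules: Standard -> Standard-IA -> Glacier -> Deep Archive, then expire.
cat > "$RULES_FILE" <<'EOF'
{
  "Rules": [
    {
      "ID": "archive-logs",
      "Status": "Enabled",
      "Filter": { "Prefix": "logs/" },
      "Transitions": [
        { "Days": 30,  "StorageClass": "STANDARD_IA" },
        { "Days": 90,  "StorageClass": "GLACIER" },
        { "Days": 365, "StorageClass": "DEEP_ARCHIVE" }
      ],
      "Expiration": { "Days": 2555 }
    }
  ]
}
EOF

# run: print the command unless APPLY=1 is set.
run() {
  if [ "${APPLY:-0}" = "1" ]; then "$@"; else echo "+ $*"; fi
}

run aws s3api put-bucket-lifecycle-configuration \
  --bucket "$BUCKET" \
  --lifecycle-configuration "file://$RULES_FILE"
```

Each transition day count is measured from object creation, so the thresholds must increase from one class to the next.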

S3 Security

S3 offers multiple security layers to protect your data:

Access Control:
- IAM Policies: Control which users/roles can access S3
- Bucket Policies: JSON policies at bucket level (cross-account, public)
- ACLs: Legacy access control (not recommended)
- S3 Access Points: Simplify access for specific applications

Encryption:
- At rest (Server-Side):
  - SSE-S3: AWS manages keys (AES-256)
  - SSE-KMS: AWS KMS manages keys (audit, rotation)
  - SSE-C: Client provides keys
- In transit: HTTPS/TLS
- Client-Side: Client encrypts before upload

Additional Features:
- Versioning: Protects against accidental deletion
- MFA Delete: Requires MFA to permanently delete object versions or change the versioning state
- Object Lock: WORM (write-once-read-many), compliance
- Pre-signed URLs: Temporary access to private objects

🎯 Key Points

  • ✓ By default all buckets and objects are private
  • ✓ Block Public Access should stay enabled unless public access is required (best practice)
  • ✓ Bucket policies allow cross-account access
  • ✓ SSE-KMS enables audit of key access
  • ✓ Pre-signed URLs expire after configured time

💻 Bucket Policy for public read access

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "PublicReadGetObject",
      "Effect": "Allow",
      "Principal": "*",
      "Action": "s3:GetObject",
      "Resource": "arn:aws:s3:::my-public-bucket/*"
    }
  ]
}

💻 S3 security configuration

# Generate pre-signed URL (valid 1 hour)
aws s3 presign s3://my-bucket/private-file.pdf --expires-in 3600

# Enable default encryption
aws s3api put-bucket-encryption \
  --bucket my-bucket \
  --server-side-encryption-configuration '{
    "Rules": [{
      "ApplyServerSideEncryptionByDefault": {
        "SSEAlgorithm": "AES256"
      }
    }]
  }'
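Two further hardening steps mentioned in this section, Block Public Access and SSE-KMS default encryption, can be sketched the same way. The KMS alias `alias/my-s3-key` is a hypothetical name, and the `run` helper is an illustrative dry-run wrapper that prints each command unless `APPLY=1` is set.

```shell
#!/bin/sh
BUCKET="my-bucket"           # example bucket name, replace with your own
KMS_KEY="alias/my-s3-key"    # hypothetical KMS key alias

# run: print the command unless APPLY=1 is set.
run() {
  if [ "${APPLY:-0}" = "1" ]; then "$@"; else echo "+ $*"; fi
}

# Enable all four Block Public Access settings on the bucket.
run aws s3api put-public-access-block \
  --bucket "$BUCKET" \
  --public-access-block-configuration \
  "BlockPublicAcls=true,IgnorePublicAcls=true,BlockPublicPolicy=true,RestrictPublicBuckets=true"

# Default encryption with SSE-KMS; the bucket key reduces the number of KMS requests.
ENC_CONFIG=$(printf '{"Rules":[{"ApplyServerSideEncryptionByDefault":{"SSEAlgorithm":"aws:kms","KMSMasterKeyID":"%s"},"BucketKeyEnabled":true}]}' "$KMS_KEY")

run aws s3api put-bucket-encryption \
  --bucket "$BUCKET" \
  --server-side-encryption-configuration "$ENC_CONFIG"
```

Note that with `BlockPublicPolicy=true`, S3 rejects a bucket policy that grants public access, so a public-read policy only works on buckets where that setting is turned off.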

Amazon EBS - Elastic Block Store

EBS provides persistent block storage volumes for EC2 instances.

EBS Volume Types:

SSD-backed:
- gp3 (General Purpose): Price/performance balance, 3,000 IOPS baseline, up to 16,000 IOPS
- gp2: Previous generation, baseline IOPS scales with size (3 IOPS/GiB, minimum 100, maximum 16,000)
- io2 Block Express: Maximum performance, up to 256,000 IOPS, 99.999% durability
- io1: Previous high IOPS generation, up to 64,000 IOPS

HDD-backed:
- st1 (Throughput Optimized): Big data, data warehouses, log processing. Not boot volume
- sc1 (Cold HDD): Infrequently accessed data. Most economical. Not boot volume
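The difference between gp2 and gp3 baselines is easy to see with a little arithmetic. The sketch below computes the gp2 baseline from volume size using the 3 IOPS per GiB rule; the 100 IOPS floor and the 16,000 IOPS cap are assumptions taken from the EBS documentation, and the function name is hypothetical.

```shell
#!/bin/sh
# gp2_baseline_iops SIZE_GIB -> baseline IOPS for a gp2 volume of that size
gp2_baseline_iops() {
  iops=$(( $1 * 3 ))             # 3 IOPS per GiB
  if [ "$iops" -lt 100 ]; then   # floor for small volumes
    iops=100
  fi
  if [ "$iops" -gt 16000 ]; then # cap for large volumes
    iops=16000
  fi
  echo "$iops"
}

gp2_baseline_iops 20     # prints 100
gp2_baseline_iops 500    # prints 1500
gp2_baseline_iops 1000   # prints 3000
gp2_baseline_iops 6000   # prints 16000
```

A gp2 volume needs about 1,000 GiB to reach the 3,000 IOPS that gp3 provides at any size, which is one reason gp3 is usually the better choice for small volumes.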

Features:
- Tied to a single AZ
- Can be detached and attached to another instance (same AZ)
- Snapshots allow copying data to other AZs and regions
- Size, type, and performance can be changed on a live volume

🎯 Key Points

  • ✓ gp3 is the default recommendation for most workloads
  • ✓ io2 for databases with extreme IOPS requirements
  • ✓ HDD volumes (st1, sc1) cannot be boot volumes
  • ✓ EBS Multi-Attach (io1/io2 only) allows up to 16 Nitro instances in the same AZ
  • ✓ Delete on Termination: root=true, additional=false by default

EBS Snapshots

Snapshots are incremental backups of EBS volumes stored in S3.

Features:
- Incremental: Only changed blocks are copied
- Storage: S3 (AWS-managed, not visible)
- Region: Snapshots belong to region, but can be copied
- Restore: You can create volumes from snapshots in any AZ of region

Best Practices:
- Not necessary to unmount volume, but recommended for consistency
- First snapshot is full, subsequent are incremental
- Deleting snapshot only removes data not used by other snapshots
- Data Lifecycle Manager (DLM) automates creation/retention/deletion

Advanced Features:
- Fast Snapshot Restore (FSR): Eliminates initialization latency
- Snapshot Archive: Lower-cost tier (up to 75% cheaper), restore takes 24-72 hrs
- Recycle Bin: Recover accidentally deleted snapshots

🎯 Key Points

  • ✓ Snapshots can be copied between regions (DR, migration)
  • ✓ You can create AMIs from root volume snapshots
  • ✓ FSR has additional cost but eliminates warming period
  • ✓ DLM can create automatic snapshots on schedule
  • ✓ Recycle Bin retains deleted snapshots from 1 day to 1 year

💻 EBS snapshot management

# Create snapshot
aws ec2 create-snapshot \
  --volume-id vol-0123456789abcdef0 \
  --description "Daily production DB backup"

# Copy snapshot to another region
aws ec2 copy-snapshot \
  --source-region us-east-1 \
  --source-snapshot-id snap-0123456789abcdef0 \
  --region eu-west-1 \
  --description "DR backup in EU"

# Create volume from snapshot
aws ec2 create-volume \
  --snapshot-id snap-0123456789abcdef0 \
  --availability-zone us-east-1a \
  --volume-type gp3

Amazon EFS - Elastic File System

EFS is a fully managed, scalable NFS file system that can be mounted by multiple EC2 instances simultaneously.

Features:
- Multi-AZ: Data automatically replicated across multiple AZ
- Auto-scaling: Grows and shrinks automatically based on use
- Compatibility: NFSv4.0 and NFSv4.1 protocols, Linux compatible
- Performance: Two modes: General Purpose (low latency) and Max I/O (higher aggregate throughput, higher latency)
- Throughput: Elastic (scales automatically with workload), Bursting (scales with size), or Provisioned (fixed)

Storage Classes:
- Standard: Frequent access, multiple AZ
- Infrequent Access (IA): Lower cost, access charge
- One Zone: Data in single AZ (more economical)
- One Zone-IA: Combination of One Zone + IA (minimum cost)

Use Cases:
- Content management and web serving
- Data analytics and shared processing
- Shared home directories
- Shared development environments

🎯 Key Points

  • ✓ EFS is supported only on Linux clients (not Windows)
  • ✓ More expensive per GB than EBS but offers multi-instance sharing
  • ✓ Lifecycle policies move files to IA automatically
  • ✓ EFS Access Points simplify permission management
  • ✓ Encryption at rest with KMS, encryption in transit with TLS

💻 Mount EFS on Linux

# Mount EFS on EC2 instance
sudo mkdir -p /mnt/efs
sudo mount -t nfs4 -o nfsvers=4.1,rsize=1048576,wsize=1048576,hard,timeo=600,retrans=2 \
  fs-0123456789abcdef.efs.us-east-1.amazonaws.com:/ /mnt/efs

# Add to /etc/fstab for auto-mount
echo "fs-0123456789abcdef.efs.us-east-1.amazonaws.com:/ /mnt/efs nfs4 defaults,_netdev 0 0" \
  | sudo tee -a /etc/fstab
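The lifecycle policy and in-transit encryption mentioned in the key points can be sketched as follows. This assumes the `amazon-efs-utils` package is installed, since it provides the `efs` mount type; the 30-day threshold is an illustrative choice. Commands are only printed unless `APPLY=1` is set.

```shell
#!/bin/sh
FS_ID="fs-0123456789abcdef"   # example file system ID
MOUNT_POINT="/mnt/efs"

# run: print the command unless APPLY=1 is set.
run() {
  if [ "${APPLY:-0}" = "1" ]; then "$@"; else echo "+ $*"; fi
}

# Move files that have not been accessed for 30 days to the IA storage class.
run aws efs put-lifecycle-configuration \
  --file-system-id "$FS_ID" \
  --lifecycle-policies TransitionToIA=AFTER_30_DAYS

# Mount through the EFS mount helper with TLS, instead of plain NFS.
run sudo mount -t efs -o tls "$FS_ID:/" "$MOUNT_POINT"

# Matching /etc/fstab entry for the mount helper.
FSTAB_LINE="$FS_ID:/ $MOUNT_POINT efs _netdev,tls 0 0"
echo "$FSTAB_LINE"
```

The mount helper takes the file system ID directly and sets up the TLS tunnel itself, so the long NFS option string is not needed in this variant.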