Ephemeral Storage Limits in AWS Lambda
Serverless geospatial processing demands predictable I/O boundaries, especially when ingesting, transforming, or exporting large raster and vector datasets. The /tmp directory in AWS Lambda serves as the primary scratch space for temporary file operations, intermediate GDAL caches, and staging outputs before persistence to S3 or other durable storage. Understanding Ephemeral Storage Limits in AWS Lambda is critical for Cloud GIS engineers, Python backend developers, and platform architects designing resilient spatial data pipelines.
Historically capped at 512 MB, Lambda’s ephemeral storage can now be provisioned up to 10 GB, directly impacting how geospatial workloads handle multi-band GeoTIFFs, point cloud conversions, and vector tile generation. Proper configuration prevents runtime failures, optimizes cold start behavior, and aligns with broader Serverless Geospatial Architecture & Platform Limits design principles.
Understanding the /tmp Boundary in Serverless GIS
Lambda mounts an NVMe-backed block volume at /tmp for every execution environment. This storage operates under strict isolation and lifecycle rules:
- Environment-Scoped Isolation: No cross-function or cross-account sharing. Each concurrent execution environment receives its own isolated volume.
- Ephemeral Lifecycle: Data is not guaranteed to survive between invocations. During warm starts, however, the /tmp contents persist until the execution environment is reclaimed by AWS.
- Independent Provisioning: Since March 2022, ephemeral storage is decoupled from function memory. You can allocate 512 MB to 10,240 MB regardless of the RAM configuration.
Geospatial libraries like GDAL, rasterio, and geopandas heavily utilize /tmp for virtual raster (VRT) staging, tile extraction buffers, write-ahead logs, and GDAL cache directories. When processing high-resolution orthomosaics, LiDAR derivatives, or large GeoPackage exports, default limits quickly become bottlenecks. The official AWS Lambda quotas and limits documentation outlines the exact provisioning syntax, concurrency boundaries, and billing implications for expanded /tmp allocations.
Architecture and Provisioning Boundaries
Ephemeral storage in Lambda is billed per GB-second, meaning over-provisioning directly impacts operational costs. Conversely, under-provisioning triggers "No space left on device" errors (OSError: [Errno 28] in Python) that terminate spatial transformations mid-stream. The allocation model is straightforward but requires deliberate infrastructure-as-code (IaC) configuration.
Because compute and storage scale independently, teams must balance memory allocation with /tmp sizing based on workload characteristics. Memory dictates CPU share and network bandwidth, while /tmp dictates scratch capacity. For a detailed breakdown of how RAM provisioning influences raster processing throughput, consult Memory and CPU Allocation for Raster Workloads.
Configuration Methods
AWS Management Console: Navigate to Configuration → General configuration → Edit → Ephemeral storage. Adjust the slider between 512 MB and 10,240 MB.
AWS SAM (template.yaml):
Resources:
  GeoTransformFunction:
    Type: AWS::Serverless::Function
    Properties:
      Runtime: python3.10
      EphemeralStorage:
        Size: 4096 # MB
      MemorySize: 2048
      Timeout: 300
Terraform (aws_lambda_function):
resource "aws_lambda_function" "spatial_processor" {
  function_name = "spatial-processor"
  runtime       = "python3.10"
  memory_size   = 2048
  timeout       = 300

  ephemeral_storage {
    size = 4096 # MB
  }
}
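For a function that is already deployed, the same setting can also be changed through the Lambda API. The following is a minimal sketch using boto3's `update_function_configuration`, assuming the function name from the Terraform example and credentials that allow `lambda:UpdateFunctionConfiguration`. The helper names `ephemeral_storage_update` and `apply_update` are hypothetical, not part of any library.

```python
def ephemeral_storage_update(function_name: str, size_mb: int) -> dict:
    """Build keyword arguments for Lambda's UpdateFunctionConfiguration call."""
    # Lambda accepts ephemeral storage sizes from 512 MB to 10,240 MB.
    if not 512 <= size_mb <= 10240:
        raise ValueError("ephemeral storage must be between 512 and 10240 MB")
    return {"FunctionName": function_name, "EphemeralStorage": {"Size": size_mb}}


def apply_update(function_name: str, size_mb: int) -> dict:
    """Send the update; needs boto3 and AWS credentials at call time."""
    import boto3  # imported lazily so the builder above has no dependencies

    client = boto3.client("lambda")
    return client.update_function_configuration(
        **ephemeral_storage_update(function_name, size_mb)
    )
```

Calling `apply_update("spatial-processor", 4096)` would request the same 4096 MB allocation as the templates above. Keeping the argument builder separate makes the range check testable without touching AWS.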
Step-by-Step Configuration Workflow
Deploying a geospatial Lambda with optimized /tmp boundaries requires a repeatable workflow that accounts for package size, environment variables, and runtime initialization.
- Baseline Workload Assessment: Profile your largest expected input. A 2 GB compressed GeoTIFF may require 3–4 GB of /tmp for decompression, VRT generation, and block caching.
- Define IaC Parameters: Set EphemeralStorage to 1.5× your peak scratch requirement. Reserve headroom for GDAL’s internal temp files.
- Configure Environment Variables: Direct library temp directories explicitly to /tmp to prevent fallback to restricted paths.
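import os

# Create the scratch directory before GDAL needs it
os.makedirs("/tmp/gdal_cache", exist_ok=True)
os.environ["CPL_TMPDIR"] = "/tmp/gdal_cache"
os.environ["GDAL_CACHEMAX"] = "512"  # MB
os.environ["TMPDIR"] = "/tmp"  # read by Python's tempfile module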
- Package Optimization: Use Lambda Layers or container images to isolate heavy geospatial binaries. Large deployment packages increase cold start latency, which compounds when /tmp initialization overlaps with runtime bootstrapping. For strategies on minimizing initialization overhead, review Cold Start Mapping for Python GDAL.
- Validate with Synthetic Payloads: Run load tests using representative raster sizes. Monitor /tmp utilization via CloudWatch custom metrics or embedded shutil.disk_usage("/tmp") logging.
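The 1.5× sizing rule from the steps above can be expressed as a small helper. This is a sketch under the assumption that peak scratch usage has already been measured in MB; the function name `recommended_ephemeral_mb` is hypothetical.

```python
import math

MIN_MB = 512     # Lambda's default and minimum /tmp size
MAX_MB = 10240   # Lambda's maximum /tmp size


def recommended_ephemeral_mb(peak_scratch_mb: float, headroom: float = 1.5) -> int:
    """Return peak usage times headroom, rounded up and checked against Lambda's range."""
    if peak_scratch_mb < 0:
        raise ValueError("peak_scratch_mb must be non-negative")
    target = math.ceil(peak_scratch_mb * headroom)
    if target > MAX_MB:
        # The workload does not fit in /tmp even at the maximum allocation.
        raise ValueError(f"{target} MB exceeds the {MAX_MB} MB ephemeral storage limit")
    return max(MIN_MB, target)
```

For the 2 GB GeoTIFF example, a measured peak of 4096 MB gives `recommended_ephemeral_mb(4096)`, which evaluates to 6144. Raising an error above 10,240 MB, rather than clamping, makes it obvious when a workload needs to stream or be split instead.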
Optimizing Geospatial I/O and Cache Behavior
Raw /tmp allocation is only half the equation. How geospatial libraries consume that space determines pipeline stability. GDAL and rasterio default to aggressive caching and temporary file creation that can exhaust even 10 GB allocations if left unconfigured.
GDAL Cache and Temp Directories
GDAL uses /tmp for block caching, VRT intermediates, and format-specific scratch files. You can control this behavior via configuration options:
- GDAL_CACHEMAX: Limits the in-memory block cache (MB). Setting this too high competes with the function's own memory allocation.
- CPL_TMPDIR: Redirects CPL/GDAL temporary files. Point it to /tmp/gdal_cache and ensure directory creation at runtime.
- GDAL_DISABLE_READDIR_ON_OPEN: Set to EMPTY_DIR to prevent unnecessary directory listings when a dataset is opened.
For a complete reference of tunable parameters, consult the official GDAL Configuration Options documentation.
Python tempfile and Rasterio Integration
Python’s tempfile module defaults to the system temp directory, which maps to /tmp in Lambda. However, rasterio and fiona may spawn additional temporary files during write operations. Explicitly configure your Python environment to use /tmp and implement deterministic cleanup:
import os
import tempfile

import rasterio


def process_raster(input_uri, output_uri):
    # Force Python tempfile to /tmp and make sure the GDAL scratch dir exists
    tempfile.tempdir = "/tmp"
    os.makedirs("/tmp/gdal_cache", exist_ok=True)

    with rasterio.open(input_uri) as src:
        profile = src.profile.copy()
        profile.update(dtype=rasterio.float32, count=1)

        # delete=False keeps the path valid after the handle is closed
        with tempfile.NamedTemporaryFile(suffix=".tif", dir="/tmp", delete=False) as tmp:
            tmp_path = tmp.name

        try:
            with rasterio.open(tmp_path, "w", **profile) as dst:
                # Process band 1 block by block to bound memory use
                for _, window in src.block_windows(1):
                    data = src.read(1, window=window).astype("float32")
                    dst.write(data, 1, window=window)
            # Upload to S3, then remove
            # s3.upload_file(tmp_path, bucket, key)
        finally:
            # Always remove the scratch file, even if processing fails
            os.unlink(tmp_path)
When extracting specific tiles or performing on-the-fly clipping, /tmp usage scales non-linearly with resolution and band count. For advanced patterns on bounding scratch space during high-throughput extraction, see Managing /tmp Storage Limits for GeoTIFF Extraction.
Monitoring, Failure Modes, and Cost Implications
Proactive monitoring prevents cascading failures in automated spatial pipelines. Lambda does not emit /tmp utilization metrics by default, so teams must instrument custom logging or use CloudWatch Embedded Metric Format (EMF).
Key Metrics to Track
- No space left on device (OSError: [Errno 28]): Runtime error indicating /tmp overflow.
- InitDuration & Duration: Correlate cold starts with /tmp provisioning size.
- Custom tmp_usage_mb: Log shutil.disk_usage("/tmp").used at function start, mid-process, and exit.
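A custom tmp_usage_mb metric can be emitted without an API call by printing a CloudWatch Embedded Metric Format record to stdout. The sketch below assumes a hypothetical namespace `GeoPipeline` and a hypothetical helper name `tmp_usage_emf`; the dimension names are illustrative choices.

```python
import json
import shutil
import time


def tmp_usage_emf(stage: str, function_name: str, path: str = "/tmp",
                  namespace: str = "GeoPipeline") -> str:
    """Return an EMF-formatted JSON line reporting used scratch space in MB."""
    usage = shutil.disk_usage(path)
    record = {
        "_aws": {
            "Timestamp": int(time.time() * 1000),  # milliseconds since epoch
            "CloudWatchMetrics": [{
                "Namespace": namespace,  # assumed namespace, pick your own
                "Dimensions": [["FunctionName", "Stage"]],
                "Metrics": [{"Name": "tmp_usage_mb", "Unit": "Megabytes"}],
            }],
        },
        "FunctionName": function_name,
        "Stage": stage,
        "tmp_usage_mb": round(usage.used / (1024 * 1024), 1),
    }
    return json.dumps(record)
```

Inside a handler, `print(tmp_usage_emf("start", context.function_name))` at function start, mid-process, and exit would produce three data points per invocation, which CloudWatch extracts from the log stream.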
Common Failure Modes
- Silent Cache Overflow: GDAL writes beyond CPL_TMPDIR when GDAL_CACHEMAX is unset, causing untracked /tmp growth.
- Warm Start Accumulation: Reused execution environments retain /tmp data. Subsequent invocations may inherit leftover files if cleanup logic is missing.
- Permission Denials: Attempting to write to /var/task (read-only) instead of /tmp triggers OSError: [Errno 30] Read-only file system.
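One way to guard against warm start accumulation is to clear leftover scratch files at the top of the handler. The following is a minimal sketch using only the standard library; `purge_scratch` and its `keep` parameter are hypothetical names, and it assumes nothing else in the environment depends on files left in /tmp.

```python
import os
import shutil


def purge_scratch(root: str = "/tmp", keep: tuple = ()) -> int:
    """Remove every entry under root except names in keep; return how many were removed."""
    # Snapshot the listing first so deletion does not disturb iteration.
    with os.scandir(root) as it:
        entries = list(it)

    removed = 0
    for entry in entries:
        if entry.name in keep:
            continue
        try:
            if entry.is_dir(follow_symlinks=False):
                shutil.rmtree(entry.path)
            else:
                os.unlink(entry.path)
        except OSError:
            continue  # skip entries that vanish or cannot be deleted
        removed += 1
    return removed
```

Calling `purge_scratch(keep=("gdal_cache",))` before processing would drop stale rasters from a previous invocation while preserving a cache directory that is deliberately reused.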
Cost Modeling
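Ephemeral storage pricing is calculated per GB-second of configured storage, and the first 512 MB is included at no extra charge. At the US East (N. Virginia) rate of roughly $0.0000000309 per GB-second (verify the current figure for your Region), a function configured with 10 GB that runs for 30 seconds pays for 9.5 GB × 30 s, or about $0.0000088 per invocation. While negligible for infrequent runs, high-concurrency pipelines processing 4K rasters can accumulate measurable monthly overhead. Once production data is available, right-size allocations using historical peak usage plus a 20% buffer.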
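The per-GB-second arithmetic can be sketched as a short function. Both the default rate and the 512 MB free allowance are assumptions based on published US East pricing at the time of writing; treat them as placeholders and substitute the current figures for your Region. The name `ephemeral_storage_cost` is hypothetical.

```python
FREE_MB = 512  # assumed: storage included at no extra charge


def ephemeral_storage_cost(configured_mb: int, duration_s: float,
                           invocations: int = 1,
                           rate_per_gb_s: float = 0.0000000309) -> float:
    """Estimate the ephemeral storage charge in USD.

    rate_per_gb_s is an assumed price, not a value read from AWS.
    """
    billable_gb = max(0, configured_mb - FREE_MB) / 1024
    return billable_gb * duration_s * invocations * rate_per_gb_s
```

With these assumptions, `ephemeral_storage_cost(10240, 30, invocations=1_000_000)` comes to roughly $8.81, which shows how a per-invocation cost measured in millionths of a dollar becomes visible at pipeline scale.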
Production Best Practices for Spatial Pipelines
- Stream Over Stage: Whenever possible, use rasterio’s block_windows() or fiona generators to process data in chunks rather than loading entire datasets into /tmp.
- Idempotent Cleanup: Wrap all /tmp operations in try/finally blocks. AWS does not guarantee environment recycling; assume /tmp may contain stale data on warm starts.
- Security Boundaries: Restrict IAM policies to least-privilege S3 paths. /tmp is isolated, and AWS documents it as encrypted at rest with an AWS-managed key; even so, avoid writing sensitive credentials or unredacted PII to scratch space.
- Container Image Optimization: When using Lambda container images, set WORKDIR /tmp only if your entrypoint explicitly manages scratch space. Otherwise, keep /var/task clean to reduce cold start overhead.
- Fallback Strategies: Implement graceful degradation. If /tmp utilization exceeds 85%, abort non-critical transformations and queue payloads for a Step Functions retry with increased storage allocation.
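The 85% fallback rule can be sketched as a guard that raises a dedicated exception, which a caller or a Step Functions retry policy could then act on. The names `ScratchSpaceLow`, `tmp_utilization`, and `check_scratch` are hypothetical, and how the retry is wired up is left to the surrounding pipeline.

```python
import shutil


class ScratchSpaceLow(RuntimeError):
    """Raised when scratch utilization crosses the configured threshold."""


def tmp_utilization(path: str = "/tmp") -> float:
    """Return the used fraction (0.0 to 1.0) of the filesystem holding path."""
    usage = shutil.disk_usage(path)
    return usage.used / usage.total


def check_scratch(path: str = "/tmp", threshold: float = 0.85,
                  usage_fn=tmp_utilization) -> float:
    """Return current utilization, or raise ScratchSpaceLow above the threshold."""
    fraction = usage_fn(path)
    if fraction > threshold:
        raise ScratchSpaceLow(
            f"{path} is {fraction:.0%} full (limit {threshold:.0%})"
        )
    return fraction
```

Calling `check_scratch()` between processing stages lets a handler stop before a write fails midway. The `usage_fn` parameter exists so the threshold logic can be exercised in tests without filling a disk.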
By treating ephemeral storage as a first-class architectural constraint rather than an afterthought, teams can build serverless GIS pipelines that scale predictably, avoid disk-bound failures, and maintain cost efficiency across variable geospatial workloads.