Ephemeral Storage Limits in AWS Lambda

Serverless geospatial processing demands predictable I/O boundaries, especially when ingesting, transforming, or exporting large raster and vector datasets. The /tmp directory in AWS Lambda serves as the primary scratch space for temporary file operations, intermediate GDAL caches, and staging outputs before persistence to S3 or other durable storage. Understanding Ephemeral Storage Limits in AWS Lambda is critical for Cloud GIS engineers, Python backend developers, and platform architects designing resilient spatial data pipelines.

Historically capped at 512 MB, Lambda’s ephemeral storage can now be provisioned up to 10 GB, directly impacting how geospatial workloads handle multi-band GeoTIFFs, point cloud conversions, and vector tile generation. Proper configuration prevents runtime failures, optimizes cold start behavior, and aligns with broader Serverless Geospatial Architecture & Platform Limits design principles.

Understanding the /tmp Boundary in Serverless GIS

Lambda provisions a dedicated scratch volume at /tmp for each execution environment. This storage operates under strict isolation and lifecycle rules:

  • Environment-Scoped Isolation: No cross-function or cross-account sharing. Each concurrent execution environment receives its own isolated volume.
  • Ephemeral Lifecycle: Data is not guaranteed to survive between invocations. A warm start reuses the execution environment, so /tmp contents persist until AWS reclaims that environment; a cold start always begins with an empty /tmp.
  • Independent Provisioning: Since March 2022, ephemeral storage is configured independently of function memory. You can allocate 512 MB to 10,240 MB regardless of the RAM configuration.

Geospatial libraries like GDAL, rasterio, and geopandas heavily utilize /tmp for virtual raster (VRT) staging, tile extraction buffers, write-ahead logs, and GDAL cache directories. When processing high-resolution orthomosaics, LiDAR derivatives, or large GeoPackage exports, default limits quickly become bottlenecks. The official AWS Lambda quotas and limits documentation outlines the exact provisioning syntax, concurrency boundaries, and billing implications for expanded /tmp allocations.
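
To make the lifecycle rules above concrete, here is a minimal sketch that reports whether a call landed on a fresh or a reused execution environment and how much data is already in the scratch directory. The names `_COLD_START`, `scratch_bytes`, and `describe_environment` are hypothetical helpers, not part of any AWS or GDAL API. The sketch assumes that module-level state survives between warm invocations.

```python
import os

# Module-level state survives between invocations in a reused (warm) environment.
_COLD_START = True


def scratch_bytes(root="/tmp"):
    """Sum the sizes of files under root, skipping files that vanish mid-walk."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            try:
                total += os.lstat(os.path.join(dirpath, name)).st_size
            except OSError:
                continue
    return total


def describe_environment(root="/tmp"):
    """Return whether this is the first call in the environment and the leftover bytes."""
    global _COLD_START
    was_cold = _COLD_START
    _COLD_START = False
    return {"cold_start": was_cold, "leftover_bytes": scratch_bytes(root)}
```

Calling `describe_environment()` at the top of a handler and logging the result is one way to see how often warm environments hand over leftover files.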

Architecture and Provisioning Boundaries

Lambda bills ephemeral storage per GB-second, so over-provisioning directly raises operating costs. Conversely, under-provisioning surfaces as OSError: [Errno 28] No space left on device, which terminates spatial transformations mid-stream. The allocation model is straightforward but requires deliberate infrastructure-as-code (IaC) configuration.

Because compute and storage scale independently, teams must balance memory allocation with /tmp sizing based on workload characteristics. Memory dictates CPU share and network bandwidth, while /tmp dictates scratch capacity. For a detailed breakdown of how RAM provisioning influences raster processing throughput, consult Memory and CPU Allocation for Raster Workloads.

Configuration Methods

AWS Management Console: Navigate to Configuration → General configuration → Edit → Ephemeral storage. Adjust the slider between 512 MB and 10,240 MB.

AWS SAM (template.yaml):

yaml
Resources:
  GeoTransformFunction:
    Type: AWS::Serverless::Function
    Properties:
      Runtime: python3.10
      EphemeralStorage:
        Size: 4096 # MB
      MemorySize: 2048
      Timeout: 300

Terraform (aws_lambda_function):

hcl
resource "aws_lambda_function" "spatial_processor" {
  function_name = "spatial-processor"
  runtime       = "python3.10"
  memory_size   = 2048
  timeout       = 300
  
  ephemeral_storage {
    size = 4096 # MB
  }
}
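
For teams that adjust storage from scripts rather than templates, the same setting can be changed through the AWS SDK. The sketch below assumes boto3 is installed and credentials are configured. `ephemeral_storage_update` and `apply_update` are hypothetical helper names; the only SDK call used is `update_function_configuration` with its `EphemeralStorage` parameter.

```python
MIN_EPHEMERAL_MB = 512
MAX_EPHEMERAL_MB = 10240


def ephemeral_storage_update(function_name, size_mb):
    """Build keyword arguments for update_function_configuration, validating the size."""
    if not isinstance(size_mb, int) or not MIN_EPHEMERAL_MB <= size_mb <= MAX_EPHEMERAL_MB:
        raise ValueError(
            f"size_mb must be an integer between {MIN_EPHEMERAL_MB} and {MAX_EPHEMERAL_MB}"
        )
    return {"FunctionName": function_name, "EphemeralStorage": {"Size": size_mb}}


def apply_update(function_name, size_mb):
    """Send the update; needs boto3 and AWS credentials, so nothing runs on import."""
    import boto3  # imported lazily so the validation helper works without the SDK

    client = boto3.client("lambda")
    return client.update_function_configuration(
        **ephemeral_storage_update(function_name, size_mb)
    )
```

Validating the range locally gives a clear error before any API request is made.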

Step-by-Step Configuration Workflow

Deploying a geospatial Lambda with optimized /tmp boundaries requires a repeatable workflow that accounts for package size, environment variables, and runtime initialization.

  1. Baseline Workload Assessment: Profile your largest expected input. A 2 GB compressed GeoTIFF may require 3–4 GB of /tmp for decompression, VRT generation, and block caching.
  2. Define IaC Parameters: Set EphemeralStorage to 1.5× your peak scratch requirement. Reserve headroom for GDAL’s internal temp files.
  3. Configure Environment Variables: Direct library temp directories explicitly to /tmp to prevent fallback to restricted paths.
python
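  import os
  os.makedirs("/tmp/gdal_cache", exist_ok=True)  # make sure the directory exists before GDAL uses it
  os.environ["CPL_TMPDIR"] = "/tmp/gdal_cache"
  os.environ["GDAL_CACHEMAX"] = "512"  # MB of in-memory block cache
  os.environ["TMPDIR"] = "/tmp"  # read by Python's tempfile module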
  4. Package Optimization: Use Lambda Layers or container images to isolate heavy geospatial binaries. Large deployment packages increase cold start latency, which compounds when /tmp initialization overlaps with runtime bootstrapping. For strategies on minimizing initialization overhead, review Cold Start Mapping for Python GDAL.
  5. Validate with Synthetic Payloads: Run load tests using representative raster sizes. Monitor /tmp utilization via CloudWatch custom metrics or embedded shutil.disk_usage("/tmp") logging.
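
The sizing rule from step 2 can be sketched as a small helper. This is an illustration under stated assumptions: `recommend_ephemeral_mb` is a hypothetical name, the 1.5× factor comes from the workflow above, and the peak scratch figure is something you measure yourself during profiling.

```python
import math

MIN_EPHEMERAL_MB = 512
MAX_EPHEMERAL_MB = 10240


def recommend_ephemeral_mb(peak_scratch_mb, headroom=1.5):
    """Scale the measured peak by the headroom factor and fit it to Lambda's range."""
    if peak_scratch_mb < 0:
        raise ValueError("peak_scratch_mb must be non-negative")
    wanted = math.ceil(peak_scratch_mb * headroom)
    if wanted > MAX_EPHEMERAL_MB:
        # The workload does not fit in /tmp even at the maximum allocation.
        raise ValueError(f"{wanted} MB exceeds the {MAX_EPHEMERAL_MB} MB maximum")
    return max(MIN_EPHEMERAL_MB, wanted)
```

For example, a measured peak of 3,000 MB yields a recommendation of 4,500 MB, while a peak of 8,000 MB raises an error, which signals that the job should be chunked or streamed instead.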

Optimizing Geospatial I/O and Cache Behavior

Raw /tmp allocation is only half the equation. How geospatial libraries consume that space determines pipeline stability. GDAL and rasterio default to aggressive caching and temporary file creation that can exhaust even 10 GB allocations if left unconfigured.

GDAL Cache and Temp Directories

GDAL keeps its block cache in memory and uses the temp directory for VRT intermediates and format-specific scratch files. You can control both behaviors via configuration options:

  • GDAL_CACHEMAX: Caps the in-memory block cache (MB). The cache competes with the function's memory allocation, so setting it too high risks out-of-memory termination rather than disk spills.
  • CPL_TMPDIR: Redirects CPL/GDAL temporary files. Point it to /tmp/gdal_cache and create that directory at runtime.
  • GDAL_DISABLE_READDIR_ON_OPEN: Set to EMPTY_DIR to stop GDAL from scanning the dataset's directory for sidecar files on open, which avoids unnecessary listing requests, especially against object storage.

For a complete reference of tunable parameters, consult the official GDAL Configuration Options documentation.
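
One way to apply these options per invocation is a context manager that creates a private scratch directory, points GDAL at it through environment variables, and removes it afterwards. This is a sketch, not GDAL's own API: `gdal_scratch` is a hypothetical helper, and it assumes the variables are set before GDAL opens any dataset in the process.

```python
import contextlib
import os
import shutil
import tempfile


@contextlib.contextmanager
def gdal_scratch(base="/tmp", cache_mb=512):
    """Create a private scratch dir, point GDAL at it, and clean up on exit."""
    scratch = tempfile.mkdtemp(prefix="gdal_cache_", dir=base)
    overrides = {"CPL_TMPDIR": scratch, "GDAL_CACHEMAX": str(cache_mb)}
    previous = {key: os.environ.get(key) for key in overrides}
    os.environ.update(overrides)
    try:
        yield scratch
    finally:
        # Restore the earlier values so a reused environment starts clean.
        for key, value in previous.items():
            if value is None:
                os.environ.pop(key, None)
            else:
                os.environ[key] = value
        shutil.rmtree(scratch, ignore_errors=True)
```

Wrapping the body of a handler in `with gdal_scratch():` keeps GDAL's temporary files in one place that is removed even when processing raises.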

Python tempfile and Rasterio Integration

Python’s tempfile module defaults to the system temp directory, which maps to /tmp in Lambda. However, rasterio and fiona may spawn additional temporary files during write operations. Explicitly configure your Python environment to use /tmp and implement deterministic cleanup:

python
import os
import tempfile

import rasterio

def process_raster(input_uri, output_uri):
    # Force Python tempfile to /tmp and make sure the GDAL scratch dir exists
    tempfile.tempdir = "/tmp"
    os.makedirs("/tmp/gdal_cache", exist_ok=True)

    # delete=False keeps the file after the handle closes so rasterio can reopen it by name
    with tempfile.NamedTemporaryFile(suffix=".tif", dir="/tmp", delete=False) as tmp:
        tmp_path = tmp.name

    try:
        with rasterio.open(input_uri) as src:
            profile = src.profile.copy()
            profile.update(dtype="float32", count=1)

            with rasterio.open(tmp_path, "w", **profile) as dst:
                # Process band 1 block by block to keep memory bounded
                for _, window in src.block_windows(1):
                    data = src.read(1, window=window).astype("float32")
                    dst.write(data, 1, window=window)

        # Upload the result to output_uri on S3 here
        # s3.upload_file(tmp_path, bucket, key)
    finally:
        # Always remove the scratch file, even if processing or upload fails
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)

When extracting specific tiles or performing on-the-fly clipping, /tmp usage scales non-linearly with resolution and band count. For advanced patterns on bounding scratch space during high-throughput extraction, see Managing /tmp Storage Limits for GeoTIFF Extraction.

Monitoring, Failure Modes, and Cost Implications

Proactive monitoring prevents cascading failures in automated spatial pipelines. Lambda does not emit /tmp utilization metrics by default, so teams must instrument custom logging or use CloudWatch Embedded Metric Format (EMF).

Key Metrics to Track

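  • OSError: [Errno 28] No space left on device: Runtime error indicating /tmp overflow. Lambda exposes no dedicated metric for it, so count occurrences with a log metric filter.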
  • InitDuration & Duration: Correlate cold starts with /tmp provisioning size.
  • Custom tmp_usage_mb: Log shutil.disk_usage("/tmp").used at function start, mid-process, and exit.
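
The custom tmp_usage_mb metric can be emitted through the Embedded Metric Format by printing a structured JSON line. The sketch below is one possible shape. The namespace `GeoPipeline`, the `FunctionName` dimension value, and the helper names are assumptions chosen for illustration.

```python
import json
import shutil
import time


def tmp_usage_emf(path="/tmp", namespace="GeoPipeline", function_name="spatial-processor"):
    """Build an Embedded Metric Format record with the used space of path in MB."""
    usage = shutil.disk_usage(path)
    return {
        "_aws": {
            "Timestamp": int(time.time() * 1000),  # milliseconds since the epoch
            "CloudWatchMetrics": [
                {
                    "Namespace": namespace,
                    "Dimensions": [["FunctionName"]],
                    "Metrics": [{"Name": "tmp_usage_mb", "Unit": "Megabytes"}],
                }
            ],
        },
        "FunctionName": function_name,
        "tmp_usage_mb": round(usage.used / (1024 * 1024), 2),
    }


def log_tmp_usage(**kwargs):
    """Print one EMF record as a single JSON line on stdout."""
    print(json.dumps(tmp_usage_emf(**kwargs)))
```

Calling `log_tmp_usage()` at function start, mid-process, and exit gives three data points per invocation without any extra API calls.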

Common Failure Modes

  1. Silent Temp File Sprawl: When CPL_TMPDIR is unset or points to a missing directory, GDAL falls back to other temp locations, causing untracked /tmp growth or write failures outside /tmp.
  2. Warm Start Accumulation: Reused execution environments retain /tmp data. Subsequent invocations may inherit leftover files if cleanup logic is missing.
  3. Permission Denials: Attempting to write to /var/task (read-only) instead of /tmp triggers OSError: [Errno 30] Read-only file system.
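
Warm start accumulation can be countered by clearing a dedicated scratch directory at the start of every invocation. The following is a minimal sketch that assumes all scratch files live under a single directory; `purge_scratch` is a hypothetical helper name.

```python
import os
import shutil


def purge_scratch(root="/tmp/gdal_cache"):
    """Delete everything under root, recreate it empty, and return the entries removed."""
    removed = 0
    if os.path.isdir(root):
        with os.scandir(root) as it:
            entries = list(it)  # snapshot first so deletion does not disturb iteration
        for entry in entries:
            if entry.is_dir(follow_symlinks=False):
                shutil.rmtree(entry.path, ignore_errors=True)
            else:
                os.unlink(entry.path)
            removed += 1
    os.makedirs(root, exist_ok=True)
    return removed
```

Restricting deletion to one known directory avoids removing files that other libraries or extensions may have placed elsewhere in /tmp.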

Cost Modeling

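Ephemeral storage pricing is calculated per GB-second on the storage configured beyond the 512 MB included with every function. The charge for one invocation is the billable gigabytes multiplied by the billed duration and the per-GB-second rate for your Region. A 10 GB allocation running for 30 seconds therefore accrues 285 GB-seconds (9.5 GB × 30 s); at published rates (roughly $0.00000003 per GB-second in us-east-1 at the time of writing), that is under a thousandth of a cent. While negligible for infrequent runs, high-concurrency pipelines processing 4K rasters can accumulate measurable monthly overhead. Right-size allocations using historical peak usage plus a 20% buffer.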
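
The cost arithmetic can be written as a short function. This sketch assumes the first 512 MB carry no additional charge and takes the per-GB-second rate as an input, because rates differ by Region and change over time; `ephemeral_storage_cost` is a hypothetical helper name.

```python
FREE_MB = 512  # included with every function at no additional storage charge


def ephemeral_storage_cost(size_mb, duration_s, invocations, rate_per_gb_second):
    """Estimate the extra storage charge for a number of identical invocations."""
    billable_gb = max(0, size_mb - FREE_MB) / 1024
    return billable_gb * duration_s * invocations * rate_per_gb_second
```

Passing the rate from the current AWS Lambda pricing page, together with monthly invocation counts, turns the per-invocation figure into a monthly estimate that can be compared across candidate allocation sizes.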

Production Best Practices for Spatial Pipelines

  1. Stream Over Stage: Whenever possible, use rasterio’s block_windows() or fiona generators to process data in chunks rather than loading entire datasets into /tmp.
  2. Idempotent Cleanup: Wrap all /tmp operations in try/finally blocks. AWS does not guarantee environment recycling; assume /tmp may contain stale data on warm starts.
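  3. Security Boundaries: Restrict IAM policies to least-privilege S3 paths. /tmp is isolated and encrypted at rest with an AWS-managed key, but its contents can outlive an invocation on warm starts; avoid writing sensitive credentials or unredacted PII to scratch space, or delete them explicitly.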
  4. Container Image Optimization: When using Lambda container images, set WORKDIR /tmp only if your entrypoint explicitly manages scratch space. Otherwise, keep /var/task clean to reduce cold start overhead.
  5. Fallback Strategies: Implement graceful degradation. If /tmp utilization exceeds 85%, abort non-critical transformations and queue payloads for a Step Functions retry with increased storage allocation.
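
The 85% guard from the fallback strategy can be sketched as a check that raises a dedicated exception, which the handler can catch to abort and requeue the work. `ScratchSpaceLow`, `utilization`, and `ensure_headroom` are hypothetical names introduced for this illustration.

```python
import shutil


class ScratchSpaceLow(RuntimeError):
    """Raised when scratch utilization crosses the configured threshold."""


def utilization(used, total):
    """Return the used fraction, treating an empty volume size as full."""
    return used / total if total else 1.0


def ensure_headroom(path="/tmp", threshold=0.85):
    """Return the current utilization of path, or raise if it exceeds threshold."""
    usage = shutil.disk_usage(path)
    ratio = utilization(usage.used, usage.total)
    if ratio > threshold:
        raise ScratchSpaceLow(f"{path} is {ratio:.0%} full (limit {threshold:.0%})")
    return ratio
```

Calling `ensure_headroom()` between processing stages lets a pipeline stop early with a clear signal instead of failing mid-write with a full disk.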

By treating ephemeral storage as a first-class architectural constraint rather than an afterthought, teams can build serverless GIS pipelines that scale predictably, avoid disk-bound failures, and maintain cost efficiency across variable geospatial workloads.