Earth Engine Operations Guide

Last Updated: 2025-10-22
Status: Production-Ready

Overview

The Earth Engine adapter provides access to 100+ Google Earth Engine assets through the env-agents framework. This guide covers operational best practices, common challenges, and proven solutions for running production workloads.

Key Operational Challenges:

Hanging queries due to slow API responses
Out-of-range temporal queries causing null errors
Quota management and rate limiting
Optimizing query efficiency for large-scale pipelines

Production Performance:

4,789 clusters processed in ~2.7 hours
60-second timeout protection prevents indefinite hangs
Automatic temporal fallback for out-of-range dates
83% reduction in API calls through geometry optimization

Timeout Handling

Problem: Hanging Queries

Earth Engine's .getInfo() makes blocking HTTP requests through C extension code that can hang indefinitely when servers are slow or experiencing issues.

Symptoms:

SRTM: 2%|▍ | 72/4789 [04:22<∞, ???] (hangs forever - no progress, no error)

Root Cause:

Not quota-related (can restart immediately)
Blocking C extension HTTP calls cannot be interrupted by signal-based timeouts
21-minute gaps observed between successful queries
Consistent hangs after 70-90 queries

Solution: Threading-Based Timeout

Why signal.alarm() fails:

# DOES NOT WORK - signal can't interrupt C extension code
signal.alarm(60)
img.reduceRegion(...).getInfo()  # Will still hang

Working solution using threading:

import threading

def run_with_timeout(func, args=(), kwargs=None, timeout_sec=60):
    """Run a function with timeout using threading."""
    kwargs = kwargs or {}  # kwargs=None would break the **kwargs unpacking below
    result = [None]
    exception = [None]

    def target():
        try:
            result[0] = func(*args, **kwargs)
        except Exception as e:
            exception[0] = e

    thread = threading.Thread(target=target, daemon=True)
    thread.start()
    thread.join(timeout=timeout_sec)

    if thread.is_alive():
        # The hung thread keeps running in the background; as a daemon it does not block exit
        raise TimeoutError(f"Earth Engine query exceeded {timeout_sec}s timeout")

    if exception[0] is not None:
        raise exception[0]

    return result[0]

Usage:

# Wrap the blocking call
def get_stats():
    return img.reduceRegion(
        reducer=ee.Reducer.mean(),
        geometry=region,
        scale=scale,
        bestEffort=True
    ).getInfo()

stats = run_with_timeout(get_stats, timeout_sec=60)

Why this works:

Daemon threads allow process to abandon hung HTTP calls
thread.join(timeout) returns control after timeout
Hung threads don't block process exit
Can retry after timeout with backoff

Retry Configuration

When a timeout occurs, retry after a backoff delay. The current configuration uses a fixed 60-second wait rather than an exponentially growing one:

# In acquire_environmental_data.py (excerpt from inside the per-cluster retry loop)
max_retries = 3
backoff_seconds = 60

if any(keyword in error_msg.lower() for keyword in ['quota', 'rate limit',
                                                     'too many requests',
                                                     'user rate limit exceeded',
                                                     'timeout']):
    logger.warning(f"Transient error for {service_name} cluster {cluster_id}, "
                   f"attempt {attempt+1}/{max_retries}. "
                   f"Retrying after {backoff_seconds}s...")
    time.sleep(backoff_seconds)
    continue  # Retry

Behavior:

Query times out after 60s
Wait 60s (backoff)
Retry (usually succeeds)
After 3 failures, mark as "error" and move on
Can retry later with --clear SRTM --clear-status failed
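
As an illustration of the behavior listed above, the retry loop could be factored into a small helper. This is a minimal sketch, not the code in acquire_environmental_data.py: the names is_transient and call_with_retries are hypothetical, the keyword list is a subset of the one shown earlier, and the sleep function is injectable only so the helper can be tested without waiting.

```python
import time

# Subset of the transient-error keywords used in this guide
TRANSIENT_KEYWORDS = ("quota", "rate limit", "too many requests", "timeout")


def is_transient(exc):
    """Return True if the exception message looks like a retryable failure."""
    message = str(exc).lower()
    return any(keyword in message for keyword in TRANSIENT_KEYWORDS)


def call_with_retries(func, max_retries=3, backoff_seconds=60, sleep=time.sleep):
    """Call func(), retrying transient failures up to max_retries attempts in total."""
    for attempt in range(1, max_retries + 1):
        try:
            return func()
        except Exception as exc:
            # Re-raise non-transient errors, and the last transient one
            if not is_transient(exc) or attempt == max_retries:
                raise
            sleep(backoff_seconds)  # fixed backoff, matching the 60s wait above
```

A caller would typically pass a zero-argument function that wraps the timeout-protected query, then mark the cluster as "error" if the final exception propagates.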

Implementation Location

File: env_agents/adapters/earth_engine/production_adapter.py

All .getInfo() calls are wrapped with timeout protection:

Data queries: reduceRegion(...).getInfo()
Asset type detection: Cached, one-time check
ImageCollection metadata: Band names and time series data

Optimization Best Practices

Minimize .getInfo() Calls

Each .getInfo() is a round-trip HTTP request. Eliminate unnecessary calls.

Before (6 calls per query):

# 1 call for data
stats = img.reduceRegion(...).getInfo()

# 5 calls for geometry WKT!
geom_wkt = (f"POLYGON(({region.coordinates().getInfo()[0][0][0]} "
            f"{region.coordinates().getInfo()[0][0][1]}, "
            f"{region.coordinates().getInfo()[0][1][0]} "
            f"{region.coordinates().getInfo()[0][1][1]}, ...")

After (1 call per query):

# Pass bbox through from _fetch_rows to _query_image
minlon, minlat, maxlon, maxlat = bbox

# 1 call for data
stats = img.reduceRegion(...).getInfo()

# Zero additional calls - construct WKT from bbox we already have
wkt = (f"POLYGON(({minlon} {minlat}, {maxlon} {minlat}, "
       f"{maxlon} {maxlat}, {minlon} {maxlat}, {minlon} {minlat}))")

Impact: 83% reduction in API calls

Use Correct Geometries

Don't impose artificial buffers that lose spatial information.

Before (wrong):

# Always use 1km buffer, regardless of actual cluster extent
tight_minlat = center_lat - 0.005  # ~500m at equator
tight_maxlat = center_lat + 0.005
tight_minlon = center_lon - 0.005
tight_maxlon = center_lon + 0.005

After (correct):

# Use actual cluster bbox from DBSCAN clustering
if minlat == maxlat and minlon == maxlon:
    # Single point - add small buffer for environmental context (~500m)
    minlat = center_lat - 0.005
    maxlat = center_lat + 0.005
    minlon = center_lon - 0.005
    maxlon = center_lon + 0.005
# Both cases return a bbox: buffered for single points, actual extent for multi-point clusters
return Geometry(type="bbox", coordinates=[minlon, minlat, maxlon, maxlat])
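
The same rule can be written as a standalone function that derives the bbox directly from the cluster's points. This is a sketch under assumptions: cluster_bbox is a hypothetical name, points are (lon, lat) tuples, and the result is a plain tuple rather than the adapter's Geometry object.

```python
def cluster_bbox(points, buffer_deg=0.005):
    """Return (minlon, minlat, maxlon, maxlat) for a list of (lon, lat) points."""
    lons = [lon for lon, _ in points]
    lats = [lat for _, lat in points]
    minlon, maxlon = min(lons), max(lons)
    minlat, maxlat = min(lats), max(lats)

    if minlon == maxlon and minlat == maxlat:
        # Single point (or coincident points): pad by buffer_deg on every side
        return (minlon - buffer_deg, minlat - buffer_deg,
                maxlon + buffer_deg, maxlat + buffer_deg)

    # Multi-point cluster: keep the actual extent
    return (minlon, minlat, maxlon, maxlat)
```

The default of 0.005 degrees mirrors the roughly 500 m buffer used above; near the poles the same value in longitude covers a shorter distance.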

Cluster Distribution:

4,240 (89%): Single points → get 500m buffer
290 (6%): <1km multi-point clusters
233 (5%): 1-5km clusters
24 (<1%): 5-11km clusters
2 (<1%): >11km clusters (up to 44km extent)

Efficient Querying Patterns

Best Practices:

Cache metadata that doesn't change
- Asset types (Image vs ImageCollection)
- Band names
- Units and scale factors

Pass computed values through method chains

# Don't recompute bbox in every method
def _fetch_rows(self, spec):
    bbox = self._compute_bbox(spec)
    return self._query_image(bbox)  # Pass it through

Use bestEffort=True for reduceRegion

stats = img.reduceRegion(
    reducer=ee.Reducer.mean(),
    geometry=region,
    scale=scale,
    bestEffort=True  # Allows automatic scale adjustment
).getInfo()

Filter before processing

# Filter ImageCollection early
ic = ee.ImageCollection(asset_id) \
    .filterDate(start_date, end_date) \
    .filterBounds(region)

# Check size before processing
if ic.size().getInfo() == 0:
    pass  # Handle the empty collection here (for example, apply temporal fallback)
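
For the first practice in the list above (caching metadata that does not change), one possible shape is a small cache keyed by asset ID. This is a sketch only: MetadataCache and the loader callable are hypothetical names, and in the adapter the loader would be whatever function performs the actual metadata lookup.

```python
class MetadataCache:
    """Cache per-asset metadata so each asset is looked up at most once."""

    def __init__(self, loader):
        self._loader = loader  # hypothetical callable: asset_id -> metadata dict
        self._cache = {}

    def get(self, asset_id):
        # Only call the (potentially slow) loader on a cache miss
        if asset_id not in self._cache:
            self._cache[asset_id] = self._loader(asset_id)
        return self._cache[asset_id]
```

Because asset type, band names, and scale factors are stable for a given asset, repeated queries against the same asset then cost no extra round trips after the first.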

Review Checklist

When reviewing Earth Engine adapter code:

Are all blocking .getInfo() calls wrapped with timeouts?
Can we reduce the number of API calls per query?
Are we using correct spatial geometries (not artificial buffers)?
Do we have retry logic for transient failures?
Is there proper error handling and logging?
Are we caching expensive metadata lookups?
Is bestEffort=True used for reduceRegion?

Temporal Fallback Strategy

Problem: Out-of-Range Dates

Some Earth Engine ImageCollections have finite temporal coverage:

MODIS_LANDCOVER (MODIS/006/MCD12Q1): 2000-2019 (ends at 2019)
GOOGLE_EMBEDDINGS (GOOGLE/SATELLITE_EMBEDDING/V1/ANNUAL): 2017-present (sparse coverage)

Error when requesting out-of-range dates:

ic = ee.ImageCollection(asset_id).filterDate("2021-01-01", "2021-12-31")
count = ic.size().getInfo()  # Returns 0
first = ic.first()  # Returns null
bands = first.bandNames().getInfo()  # ERROR: "Parameter 'image' is required and may not be null"

Previous Behavior:

100% failure rate for MODIS_LANDCOVER (all 4,789 clusters)
Intermittent failures for GOOGLE_EMBEDDINGS at sparse locations

Solution: Automatic Fallback with Metadata

The adapter automatically falls back to closest available data and annotates observations with complete metadata.

Algorithm:

Try filtering with requested date range
If ic.size().getInfo() == 0, trigger fallback
Get available temporal range at this location
Select fallback strategy:
- Requested date after dataset end → Use most recent year
- Requested date before dataset start → Use oldest year
- Requested range overlaps but no data → Use full available range
Re-filter with fallback dates
Annotate all observations with fallback metadata

Implementation

Located in env_agents/adapters/earth_engine/production_adapter.py:283-454:

def _query_image_collection(self, region, bbox, center_lat, center_lon,
                            start_date: str, end_date: str) -> List[Dict]:
    """Query ImageCollection asset with automatic temporal fallback"""

    # Store original requested dates
    requested_start = start_date
    requested_end = end_date
    fallback_applied = False
    fallback_reason = None

    # Try requested range
    ic = ee.ImageCollection(self.asset_id).filterDate(start_date, end_date).filterBounds(region)
    count = ic.size().getInfo()

    # Fallback if empty
    if count == 0:
        logger.info(f"No data in requested range {start_date} to {end_date}, checking available range...")

        # Get actual available range
        full_collection = ee.ImageCollection(self.asset_id).filterBounds(region)
        dates = full_collection.aggregate_array('system:time_start').getInfo()

        if not dates:
            return []  # No data at this location at all

        # Convert timestamps to dates
        from datetime import datetime
        available_dates = [datetime.utcfromtimestamp(ts/1000).strftime('%Y-%m-%d')
                          for ts in dates]
        available_start = min(available_dates)
        available_end = max(available_dates)

        # Select fallback strategy
        if requested_start > available_end:
            # Requested date too late → use most recent year
            end_year = available_end[:4]
            start_date = f"{end_year}-01-01"
            end_date = f"{end_year}-12-31"
            fallback_reason = f"requested_date_{requested_start}_after_dataset_end_{available_end}"
        elif requested_end < available_start:
            # Requested date too early → use oldest year
            start_year = available_start[:4]
            start_date = f"{start_year}-01-01"
            end_date = f"{start_year}-12-31"
            fallback_reason = f"requested_date_{requested_end}_before_dataset_start_{available_start}"
        else:
            # Overlap but no data → use full range
            start_date = available_start
            end_date = available_end
            fallback_reason = f"no_data_in_overlap_using_full_range_{available_start}_to_{available_end}"

        fallback_applied = True
        logger.info(f"Temporal fallback applied: {fallback_reason}")
        logger.info(f"Using date range: {start_date} to {end_date}")

        # Re-filter with fallback dates
        ic = ee.ImageCollection(self.asset_id).filterDate(start_date, end_date).filterBounds(region)
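
The strategy selection in the excerpt above does not depend on Earth Engine itself, so it can be isolated as a pure function for unit testing. The sketch below assumes that refactoring; select_fallback_range is a hypothetical name, and all dates are ISO 'YYYY-MM-DD' strings, which compare correctly as text.

```python
def select_fallback_range(requested_start, requested_end,
                          available_start, available_end):
    """Return (start_date, end_date, reason) for a request that matched no images."""
    if requested_start > available_end:
        # Requested range is after the dataset ends: use the most recent year
        year = available_end[:4]
        reason = (f"requested_date_{requested_start}"
                  f"_after_dataset_end_{available_end}")
        return f"{year}-01-01", f"{year}-12-31", reason

    if requested_end < available_start:
        # Requested range is before the dataset starts: use the oldest year
        year = available_start[:4]
        reason = (f"requested_date_{requested_end}"
                  f"_before_dataset_start_{available_start}")
        return f"{year}-01-01", f"{year}-12-31", reason

    # Ranges overlap but nothing matched: use the full available range
    reason = (f"no_data_in_overlap_using_full_range_"
              f"{available_start}_to_{available_end}")
    return available_start, available_end, reason
```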

Metadata Annotation

All observations returned after fallback include these attributes:

Field	Type	Description	Example
`requested_date_range`	string	Original requested dates	`"2021-01-01_to_2021-12-31"`
`actual_date_range`	string	Dates actually used for query	`"2019-01-01_to_2019-12-31"`
`temporal_fallback_applied`	boolean	Whether fallback was triggered	`true` or `false`
`temporal_fallback_reason`	string	Explanation of why fallback occurred	`"requested_date_2021-01-01_after_dataset_end_2019-12-31"`

Example observation with fallback:

{
  "observation_id": "ee_MODIS_006_MCD12Q1_2019-01-01_LC_Prop1",
  "dataset": "EARTH_ENGINE",
  "time": "2019-01-01",
  "variable": "ee:LC_Prop1",
  "value": 7.583912611717975,
  "latitude": 37.8,
  "longitude": -122.4,
  "attributes": {
    "asset_id": "MODIS/006/MCD12Q1",
    "scale_m": 500,
    "requested_date_range": "2021-01-01_to_2021-12-31",
    "actual_date_range": "2019-01-01_to_2019-12-31",
    "temporal_fallback_applied": true,
    "temporal_fallback_reason": "requested_date_2021-01-01_after_dataset_end_2019-12-31"
  }
}
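
A sketch of how the four attributes might be attached to an observation is shown below. The helper name annotate_fallback is hypothetical, and setting temporal_fallback_reason to None when no fallback occurred is an assumption rather than documented adapter behavior.

```python
def annotate_fallback(observation, requested, actual, reason=None):
    """Return a copy of observation with temporal fallback metadata attached.

    requested and actual are (start_date, end_date) tuples of ISO date strings.
    """
    attributes = dict(observation.get("attributes", {}))
    attributes.update({
        "requested_date_range": f"{requested[0]}_to_{requested[1]}",
        "actual_date_range": f"{actual[0]}_to_{actual[1]}",
        "temporal_fallback_applied": reason is not None,
        "temporal_fallback_reason": reason,  # assumption: None when no fallback
    })
    # Build a new dict so the caller's observation is left unmodified
    return {**observation, "attributes": attributes}
```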

Configuration Updates

For datasets with known temporal limits:

# scripts/acquire_environmental_data.py
"MODIS_LANDCOVER": {
    "asset_id": "MODIS/006/MCD12Q1",
    "time_range": ("2019-01-01", "2019-12-31"),  # Use most recent available year
    # Adapter will automatically fall back if needed
}

Downstream Analysis

Filter observations by fallback status:

import pandas as pd

# Load observations
df = pd.read_parquet("observations.parquet")

# Identify fallback observations
df['fallback'] = df['attributes'].apply(
    lambda x: x.get('temporal_fallback_applied', False)
)

# Filter to only requested dates
df_requested = df[~df['fallback']]

# Analyze fallback observations separately
df_fallback = df[df['fallback']]
print(f"Fallback rate: {len(df_fallback) / len(df) * 100:.1f}%")

Test Results

Test Script: notebooks/test_temporal_fallback.py

Test	Requested	Actual	Fallback	Result
MODIS_LANDCOVER	2021	2019	✅ Applied	13 obs from 2019
GOOGLE_EMBEDDINGS	2021	2021	❌ Not needed	64 obs from 2021
MODIS_NDVI	2025	2025	❌ Not needed	192 obs from 2025

Production Impact:

Before: 100% failure rate for MODIS_LANDCOVER
After: 0% failure rate with automatic 2019 fallback

Quota Management

Rate Limiting

Earth Engine has per-user quotas for:

Concurrent requests
Requests per second
Total daily computation

Implemented Strategy:

# Sequential processing with rate limiting
for cluster in clusters:
    try:
        result = adapter.fetch(spec)
        time.sleep(2.0)  # 2-second delay between queries
    except QuotaError:
        time.sleep(60)  # 60-second backoff on quota
        continue

Configuration:

Base delay: 2 seconds between queries
Quota backoff: 60 seconds after quota error
Max retries: 3 attempts per cluster
Timeout: 60 seconds per attempt
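
The base delay in the configuration above could also be enforced with a small helper that sleeps only for the time still remaining since the previous query, instead of a fixed sleep after every call. This is a sketch, not the pipeline's implementation: MinIntervalLimiter is a hypothetical name, and the clock and sleep parameters are injectable only for testing.

```python
import time


class MinIntervalLimiter:
    """Ensure at least min_interval seconds pass between consecutive calls."""

    def __init__(self, min_interval=2.0, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous call, None before the first

    def wait(self):
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now
```

Calling limiter.wait() immediately before each query keeps the spacing at two seconds while avoiding an extra delay after queries that already took longer than that.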

Retry Strategies

Sequential vs Parallel Execution:

# Sequential (Recommended for Earth Engine)
for spec in specs:
    try:
        result = router.fetch(dataset="EARTH_ENGINE", spec=spec)
    except TimeoutError:
        time.sleep(60)
        result = router.fetch(dataset="EARTH_ENGINE", spec=spec)

# Parallel (Use with caution)
from concurrent.futures import ThreadPoolExecutor

# Limit workers to avoid quota
with ThreadPoolExecutor(max_workers=2) as executor:
    results = executor.map(fetch_with_retry, specs)

Recommendation: Use sequential processing for large workloads (>100 queries) to avoid quota issues.

Error Keywords for Retry

TRANSIENT_ERRORS = [
    'quota',
    'rate limit',
    'too many requests',
    'user rate limit exceeded',
    'timeout',
    'server error',
    '503',
    '429'
]

if any(keyword in error_msg.lower() for keyword in TRANSIENT_ERRORS):
    # Retry with backoff
    time.sleep(backoff_seconds)
    continue

Troubleshooting

Issue: Query Hangs Indefinitely

Symptoms:

No progress for >5 minutes
No error messages
Can restart immediately (not quota)

Solution:

Ensure timeout protection is enabled (see Timeout Handling section)
Check timeout is set appropriately (60s recommended)
Verify threading-based timeout is used (not signal-based)

Diagnosis:

# Add logging around .getInfo() calls
logger.info(f"Starting Earth Engine query at {time.time()}")
result = run_with_timeout(get_stats, timeout_sec=60)
logger.info(f"Completed Earth Engine query at {time.time()}")

Issue: "Parameter 'image' is required" Error

Symptoms:

Error: Image.bandNames: Parameter 'image' is required and may not be null

Root Cause:

Requested date range outside dataset's temporal coverage
ImageCollection filter returns empty collection

Solution:

Enable temporal fallback (see Temporal Fallback Strategy section)
Verify date ranges align with dataset availability
Check Earth Engine Data Catalog for temporal coverage

Diagnosis:

# Check if collection is empty before processing
ic = ee.ImageCollection(asset_id).filterDate(start, end).filterBounds(region)
count = ic.size().getInfo()

if count == 0:
    logger.warning(f"Empty collection for {asset_id} at {start} to {end}")
    # Apply fallback

Issue: Quota Exceeded

Symptoms:

EEException: User memory limit exceeded
EEException: Too many concurrent requests

Solution:

Increase delay between queries
Reduce concurrent requests (use sequential processing)
Reduce spatial extent (smaller bboxes)
Use bestEffort=True for reduceRegion

Configuration:

# Increase delay
time.sleep(5.0)  # 5 seconds instead of 2

# Use a coarser scale (larger pixel size) for faster queries
stats = img.reduceRegion(
    reducer=ee.Reducer.mean(),
    geometry=region,
    scale=1000,  # 1km instead of 500m
    bestEffort=True
).getInfo()

Issue: Wrong Spatial Context

Symptoms:

Single-point clusters appear correct
Multi-point clusters only return data for single point
Missing spatial variability

Root Cause:

Using uniform 1km buffer for all clusters
Ignoring actual cluster extents from DBSCAN

Solution:

Use actual cluster bounding boxes (see Optimization Best Practices)
Add buffer only for single-point clusters
Verify bbox computation in get_cluster_geometry()

Diagnosis:

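# Check cluster extent
import math

mid_lat = (minlat + maxlat) / 2
print(f"Cluster {cluster_id}:")
print(f"  Points: {num_points}")
print(f"  Extent: {minlon:.4f},{minlat:.4f} to {maxlon:.4f},{maxlat:.4f}")
# A degree of longitude shrinks with latitude, so scale the width by cos(latitude)
print(f"  Width: {(maxlon - minlon) * 111 * math.cos(math.radians(mid_lat)):.1f} km")
print(f"  Height: {(maxlat - minlat) * 111:.1f} km")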

Issue: Excessive API Calls

Symptoms:

Slow query performance (>5s per cluster)
High quota usage
Many .getInfo() calls in logs

Root Cause:

Redundant geometry fetches
Recomputing metadata on every query
Not caching asset information

Solution:

Pass bbox through method chain (see Optimization Best Practices)
Cache asset type, band names, and metadata
Minimize calls to .getInfo()

Diagnosis:

# Count .getInfo() calls
import functools

original_getInfo = ee.ComputedObject.getInfo
call_count = [0]

@functools.wraps(original_getInfo)
def counted_getInfo(self, *args, **kwargs):
    call_count[0] += 1
    return original_getInfo(self, *args, **kwargs)

ee.ComputedObject.getInfo = counted_getInfo

# Run query
result = adapter.fetch(spec)
print(f"Total .getInfo() calls: {call_count[0]}")

# Restore the original method when done
ee.ComputedObject.getInfo = original_getInfo

Performance Metrics

Production Workload

Scale:

4,789 clusters
100+ Earth Engine assets
Multiple temporal ranges

Performance:

Total time: ~2.7 hours
Average query time: ~2 seconds
Timeout rate: <1% (timeouts recover on retry)
Fallback rate (MODIS_LANDCOVER): 100% (expected, uses 2019 data)

Before vs After Optimization

Metric	Before	After	Improvement
Hangs	Infinite (after 70-90 queries)	0 (60s timeout)	✅ Eliminated
API calls per query	6 (1 data + 5 geometry)	1 (data only)	83% reduction
Spatial accuracy	Wrong (1km uniform)	Correct (actual extent)	✅ Fixed
Temporal coverage	Fails on out-of-range	Auto-fallback with metadata	✅ Fixed
User experience	Manual intervention	Fully automated	✅ Production-ready

API Configuration: Legacy vs Cloud Project

Current Configuration

env-agents uses the earthengine-legacy API by default.

This is the correct choice for this project. Here's why.

Implementation

File: env_agents/adapters/earth_engine/production_adapter.py:134

def _ensure_ee_authenticated():
    """Authenticate with Earth Engine (singleton)"""
    # ... find credentials ...

    if credentials_path:
        credentials = ee.ServiceAccountCredentials(email=None, key_file=str(credentials_path))
        ee.Initialize(credentials)  # ← No project specified = earthengine-legacy

Why no project is specified:

Service account: gee-agent@ecognita-470619.iam.gserviceaccount.com
When project parameter is omitted, Earth Engine defaults to earthengine-legacy
This is visible in API error messages: https://earthengine.googleapis.com/v1/projects/earthengine-legacy/value:compute

Legacy API vs Cloud Project API

Feature	Legacy API (`earthengine-legacy`)	Cloud Project API (`ecognita-470619`)
Status	✅ Production-ready, working	❌ Requires additional IAM permissions
Authentication	Service account credentials only	Service account + Cloud Project IAM roles
Performance	Fast (2-7s for complex queries)	Unknown (permission denied in testing)
Quota Management	Shared pool, generous for research	Project-specific, transparent monitoring
Cost	Free for research use	Free for research, but requires billing setup
Monitoring	Limited visibility	Full Cloud Console monitoring
Setup Complexity	✅ Simple (just credentials file)	⚠️ Complex (requires IAM configuration)
Permissions Required	Service account credentials	`earthengine.computations.create` permission

Performance Benchmark Results

Date: 2025-10-21
Test: 5 query types × 2-3 iterations each

Legacy API Performance (earthengine-legacy):

✅ SRTM Small bbox (9 samples): 2.02s avg - 6 rows
✅ SRTM Large bbox (25 samples): 3.96s avg - 22 rows
✅ MODIS Single month (9×2): 3.48s avg - 139 rows
✅ MODIS Full year (1×23): 0.69s avg - 276 rows
✅ MODIS Large bbox (25×2): 7.34s avg - 600 rows

Cloud Project API Performance (ecognita-470619):

❌ Permission 'earthengine.computations.create' denied on resource 'projects/ecognita-470619'

Conclusion: The legacy API completed all five benchmark queries, while the Cloud Project API could not be benchmarked because the service account lacks the required permission.

Why Legacy API is the Right Choice

Advantages:

✅ Works out of the box - No additional IAM setup required
✅ Proven performance - All tests pass, fast query times
✅ Free for research - No billing concerns
✅ Simple deployment - Just drop credentials file
✅ Stable - Long-term supported by Google for backward compatibility

When Cloud Project API would be better:

You need detailed usage monitoring per project
You're hitting rate limits (we're not)
You need project-specific quota increases
You have full Cloud Project admin access
Google deprecates the legacy API (not currently planned)

Cloud Project API Configuration (If Needed)

To switch to the Cloud Project API, you would need to:

Grant IAM permissions:

Service Account: gee-agent@ecognita-470619.iam.gserviceaccount.com
Required Roles:
- roles/earthengine.viewer (read access)
- roles/earthengine.writer (create computations) ← Currently missing

Update code:

import json

# Load project_id from credentials
with open(credentials_path, 'r') as f:
    creds_data = json.load(f)
    project_id = creds_data.get('project_id')

# Initialize with explicit project
credentials = ee.ServiceAccountCredentials(email=None, key_file=str(credentials_path))
ee.Initialize(credentials, project=project_id)

Verify billing:
- Check: https://console.cloud.google.com/billing?project=ecognita-470619
- Ensure Earth Engine usage won't incur unexpected charges
- Research usage is typically free, but verify project settings

Test thoroughly:
- Run tests/benchmark_ee_api_performance.py
- Compare performance with legacy API
- Verify all query types work
- Check for any quota differences

Recommendation

Keep using earthengine-legacy unless:

You encounter rate limiting (not currently happening)
You need project-specific usage analytics
Google announces deprecation (monitor Earth Engine announcements)

Current performance is excellent:

Most queries: 2-4 seconds
Complex grid sampling: 7 seconds for 600 observations
Zero permission issues
Well-tested and production-proven

Testing API Configuration

Benchmark script: tests/benchmark_ee_api_performance.py

Run to compare APIs:

python tests/benchmark_ee_api_performance.py

What it tests:

Single Image queries (SRTM elevation)
ImageCollection queries (MODIS NDVI)
Small and large bboxes
Grid sampling with multiple spatial points
Time series queries

Expected behavior:

Legacy API: All tests pass, fast performance
Cloud API: Permission denied (without IAM setup)

Best Practices Summary

For Operators

Always enable timeout protection - Use threading-based timeouts for all .getInfo() calls
Monitor fallback rates - Check temporal_fallback_applied in observations
Use sequential processing - Avoid parallel execution for large workloads
Configure appropriate delays - 2s between queries is optimal for most workloads
Implement retry logic - 3 retries with 60s backoff for transient errors

For Developers

Minimize .getInfo() calls - Pass computed values through method chains
Use correct geometries - Don't impose artificial buffers
Cache metadata - Store asset types, band names, and temporal ranges
Enable fallback - Handle out-of-range dates gracefully
Log extensively - Track timing, fallbacks, and errors for debugging

Review Checklist

Before deploying Earth Engine adapter changes:

All .getInfo() calls wrapped with 60s timeout
Geometry computed once and passed through
Temporal fallback enabled with metadata annotation
Retry logic includes 'timeout' keyword
Logging covers timing, fallbacks, and errors
Test with out-of-range dates (e.g., MODIS_LANDCOVER 2021)
Test with hanging query (timeout fires and recovers)
Verify spatial context (multi-point clusters use actual extent)

Phase 1 Robustness Fixes (2025-10-22)

Overview

Comprehensive testing revealed edge cases with high-frequency data and unsupported asset types. Phase 1 fixes address 2 of 3 identified issues using simple heuristics.

Test Configuration:

17 assets across 3 priority tiers (MODIS, SMAP, Landsat, etc.)
Texas bounding box (270km × 165km)
Full year temporal range (2020)
Resolution: medium (3×3 or 5×5 grid)

Results: 15/17 assets passing (88%)

Issues Addressed

1. Asset Type Validation ✅ FIXED

Problem: Trying to sample FeatureCollection/Table assets as raster Images produced 25 repeated error messages and returned 0 rows without clear explanation.

Fix: Early asset type detection with clear error message

# production_adapter.py:207-225
asset_info = ee.data.getAsset(self.asset_id)
actual_type = asset_info.get('type', 'UNKNOWN')

if actual_type in ['FeatureCollection', 'TABLE', 'Table']:
    raise ValueError(
        f"Asset '{self.asset_id}' is type '{actual_type}', not supported. "
        f"This adapter only supports Image and ImageCollection (raster) assets."
    )

Impact:

ESA/WorldCereal: 25 errors → single fast failure (0.45s)
Clear guidance to users about unsupported types

2. High-Frequency Data Handling ✅ PARTIALLY FIXED

Problem: Daily collections with hundreds of images exceeded Earth Engine's 5000-element FeatureCollection limit.

Fix: Three-part strategy

Adaptive spatial sampling - Reduce to 3×3 grid for collections with >100 images
Conservative batching threshold - Trigger batching at 1000 samples (was 2000)
Smaller batch size - Target 800 samples/batch (was 2000)

# production_adapter.py:691-718
if count > 100:  # Daily or more frequent
    adaptive_max_samples = min(max_samples, 9)  # Force 3×3

needs_batching = (n_samples * count) > 1000  # Conservative threshold

max_images_per_batch = max(10, 800 // n_samples)  # Smaller batches
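
To see how the three thresholds interact, they can be combined into one planning function. This is an illustrative sketch, not the code in production_adapter.py: plan_batches is a hypothetical name, and it assumes the sampling grid always uses the full sample cap.

```python
import math


def plan_batches(max_samples, image_count):
    """Apply the Phase 1 heuristics and return the resulting batch plan."""
    # Collections with more than 100 images are treated as high-frequency: force 3x3
    n_samples = min(max_samples, 9) if image_count > 100 else max_samples

    # Conservative threshold: batch once samples x images exceeds 1000
    needs_batching = (n_samples * image_count) > 1000
    if not needs_batching:
        return {"n_samples": n_samples, "needs_batching": False,
                "images_per_batch": image_count, "n_batches": 1}

    # Target roughly 800 samples per batch, but never fewer than 10 images
    images_per_batch = max(10, 800 // n_samples)
    n_batches = math.ceil(image_count / images_per_batch)
    return {"n_samples": n_samples, "needs_batching": True,
            "images_per_batch": images_per_batch, "n_batches": n_batches}
```

For a daily collection over one year (365 images) with a 5×5 grid requested, this reduces the grid to 9 points and splits the work into 5 batches of up to 88 images each.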

Impact:

✅ MODIS/MOD09GA_006_NDSI: FIXED (was failing with >5000 elements)
- 365 days × 9 points × 1 band = 3,285 samples ✅
⚠️ NASA/SMAP: Still fails with memory limit exceeded (209s)
- 365 days × 9 points × 46 bands = 151,110 samples ❌

Test Results Summary

Priority	Success Rate	Notes
PRIORITY_1 (Core)	7/7 (100%)	All MODIS, SoilGrids, SRTM passing
PRIORITY_2 (Additional)	3/4 (75%)	SMAP still problematic
PRIORITY_3 (Edge Cases)	5/6 (83%)	NDSI fixed, WorldCereal fails fast

Performance Highlights:

MOD13Q1: 4.45s (6,900 rows, 25 coords)
MOD11A2: 1.93s (756 rows, 22 coords)
NDSI: 19.97s (5,102 rows, 9 coords) ← Was failing!

Known Limitations

SMAP Soil Moisture (NASA/SMAP/SPL4SMGP/008)

Issue: 46 bands × 365 days × 9 spatial points = 151,110 values, which exceeds the Earth Engine memory limit

Workarounds:

Reduce temporal range (recommended):

time_range=("2020-01-01", "2020-01-31")  # Single month

Use low resolution (single centroid):

resolution="low"  # Aggregated to single point

Explicit spatial cap:

extra={"max_samples": 4}  # Force 2×2 grid

Root Cause: Cannot filter bands via Earth Engine sampleRegions() API - must query all 46 bands or none.

Table/FeatureCollection Assets

Unsupported asset types:

ESA/WorldCereal (TABLE)
LARSE/GEDI (FeatureCollection)

Error message:

ValueError: Asset 'ESA/WorldCereal/AEZ/v100' is type 'TABLE', which is not supported.
This adapter only supports Image and ImageCollection (raster) assets.
Vector/Table assets require different query methods.

Why Simple Heuristics?

Current Approach: Count-based thresholds (count > 100, product > 1000, batch = 800)

Limitations:

No asset metadata about temporal frequency (daily vs weekly vs monthly)
Cannot query band count efficiently without extra API calls
Earth Engine lacks standardized period/cadence fields

Future Enhancement (Phase 2): Could query band count at initialization and use volume-based thresholds:

estimated_volume = spatial_samples * temporal_count * n_bands

if estimated_volume > 100_000:
    pass  # Force aggressive batching

See docs/adapters/SAMPLING_STRATEGY.md for detailed analysis and Phase 2 proposal.

Testing

Comprehensive test: tests/test_comprehensive_assets.py

Tests 17 assets across 3 priority tiers
Validates grid sampling, batching, error handling
Run time: ~5 minutes

python tests/test_comprehensive_assets.py

Change History

Date	Change	Status
2025-09-29	Threading-based timeout implemented	✅ Deployed
2025-09-29	Geometry optimization (1 API call instead of 6)	✅ Deployed
2025-09-29	Actual cluster bboxes (not uniform 1km)	✅ Deployed
2025-09-30	Temporal fallback with metadata annotation	✅ Deployed
2025-09-30	Consolidated operations guide created	✅ Deployed
2025-10-20	Grid sampling for Images and ImageCollections	✅ Deployed
2025-10-21	API configuration documentation (legacy vs Cloud Project)	✅ Deployed
2025-10-21	Server-side batching optimization (30-150× speedup)	✅ Deployed
2025-10-22	Phase 1 robustness fixes (asset validation, adaptive sampling, batch tuning)	✅ Current
