API INTEGRATION · DATA PIPELINE
Cross-Portal Data Synchronization for Debris Programs
How Watershed GeoData builds automated pipelines that keep parcel and operational data synchronized between monitoring firm portals and program ArcGIS Online environments without manual reconciliation.
SERVICE TYPE
System Design & Development
DOMAIN
Data Integration
PLATFORM
Python · ArcGIS API · OAuth 2.0
ARCHITECTURE
Two-Part Pipeline
The Problem
Two systems, one truth needed
On debris programs with separate monitoring firms, two authoritative data environments typically operate in parallel. The monitoring firm maintains a portal whose parcel status data is authoritative for field operations. The program maintains an ArcGIS Online system authoritative for management, eligibility tracking, and FEMA reporting.
Neither system communicates with the other. Status updates in the monitoring portal are not reflected in program dashboards, and vice versa. Program managers make decisions based on stale data, and manual reconciliation consumes significant staff time that should be spent on operations.
Our Approach
Automated bidirectional awareness
We build two-part Python pipelines that authenticate against both systems via OAuth, pull current parcel status from the monitoring portal, map field schemas to the program’s standardized format, and perform intelligent upsert operations against the master feature service.
The two-part architecture separates data extraction from the merge phase, allowing each to run independently for troubleshooting and enabling partial updates when one system is temporarily unavailable.
Two-Part Pipeline Architecture
Part 1: External Portal Extraction
Authenticates against the monitoring firm’s portal via OAuth. Queries their feature services for current parcel status and field operation records. Transforms field schemas to match the program’s standardized naming conventions. Stages transformed data in an intermediate feature service for validation before merging.
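The schema-transformation step can be sketched in a few lines. This is a minimal illustration, not the production pipeline: the portal field names and the mapping table below are hypothetical placeholders that each program defines per monitoring firm.

```python
# Hypothetical mapping from monitoring-portal field names to the
# program's standardized schema (real mappings are defined per firm).
FIELD_MAP = {
    "PARCEL_NUM": "parcel_id",
    "STAT_CD": "removal_status",
    "CMPLT_DT": "completion_date",
}

def transform_record(portal_record: dict) -> dict:
    """Rename mapped fields; drop fields the program schema doesn't track."""
    return {
        target: portal_record[source]
        for source, target in FIELD_MAP.items()
        if source in portal_record
    }
```

In the pipeline, records transformed this way are written to the intermediate staging feature service rather than directly to the master, so they can be validated before the merge runs.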
Part 2: Master Feature Merge
Reads staged data from the intermediate service. Performs parcel-level matching using unique identifiers. Executes upsert operations that update existing records, insert new ones, and preserve program-only fields that don’t exist in the external system. Tags all records with source and sync timestamp for full data provenance.
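The core upsert decision reduces to splitting staged records into inserts and updates by unique identifier. The sketch below assumes records are plain dicts keyed by a hypothetical `parcel_id` field; in the real pipeline the resulting adds and updates are applied against the master feature service.

```python
def plan_upsert(staged: list[dict], master: dict[str, dict]):
    """Split staged records into inserts and updates by unique identifier.

    Updates merge staged fields over the existing master record, so
    program-only fields (e.g. eligibility notes) that are absent from
    the staged external data survive the sync.
    """
    adds, updates = [], []
    for rec in staged:
        existing = master.get(rec["parcel_id"])
        if existing is None:
            adds.append(rec)                    # new parcel: insert
        else:
            updates.append({**existing, **rec})  # staged values overwrite
    return adds, updates
```

Note that the merge direction in the update branch is what distinguishes an upsert from a truncate-and-reload: existing fields are carried forward unless the staged record explicitly supplies a new value.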
Design Decisions
Two-Part Separation: Extraction runs independently from the merge, allowing troubleshooting of source-side issues without touching production data.
Upsert vs. Truncate: Unlike a consolidation tool’s truncate-and-reload approach, synchronization pipelines use upsert operations to preserve program-originated fields (eligibility notes, appeal status) that do not exist in the external system.
Conflict Resolution: External data takes precedence for operational fields (removal status, dates). Program data takes precedence for management fields (eligibility, appeal status). Conflicts are logged for manual review.
Authentication: OAuth 2.0 via ArcGIS Pro portal session for both systems. No stored credentials in the script.
Error Isolation: Per-parcel error handling means a single malformed record does not halt the entire sync. Failed records are logged and skipped with a summary report.
Audit Trail: Every sync tags records with source system, timestamp, and batch ID. This enables forensic review of when specific data entered the system and from which source.
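The conflict-resolution and audit-trail decisions above can be combined into a single per-parcel merge function. This is an illustrative sketch under stated assumptions: the field names, the `PROGRAM_WINS` set, and the batch-ID scheme are hypothetical, and in practice the protected-field list would come from program configuration.

```python
from datetime import datetime, timezone

# Management fields where program data takes precedence (hypothetical
# examples); all other fields default to external-portal precedence.
PROGRAM_WINS = {"eligibility", "appeal_status"}

def resolve(program_rec: dict, external_rec: dict, batch_id: str):
    """Merge one parcel's records: external wins on operational fields,
    program wins on management fields; disagreements are logged."""
    merged = dict(program_rec)
    conflicts = []
    for field, ext_val in external_rec.items():
        prog_val = program_rec.get(field)
        if field in PROGRAM_WINS and prog_val is not None:
            if prog_val != ext_val:
                conflicts.append((field, prog_val, ext_val))
            continue  # program value kept for management fields
        merged[field] = ext_val  # external precedence for everything else
    # Provenance tags: source system, sync timestamp, batch ID.
    merged["sync_source"] = "monitoring_portal"
    merged["sync_timestamp"] = datetime.now(timezone.utc).isoformat()
    merged["sync_batch_id"] = batch_id
    return merged, conflicts
```

Returning the conflict list rather than raising keeps the sync moving, consistent with the per-parcel error-isolation approach: conflicts surface in the summary report for manual review instead of halting the batch.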
Python · ArcGIS API for Python · OAuth 2.0 · REST API · Upsert Operations · Field Schema Mapping · Data Provenance
Need to Synchronize Data Between Disconnected Systems?
Watershed GeoData builds automated sync pipelines that keep multiple platforms aligned without manual data entry or copy-paste workflows.
