[FEATURE] Integrate with geopandas for Native Geospatial Data Handling #530

Open
@amrutha97

Goal

Add built-in support for reading, transforming, and displaying geospatial data via the geopandas library, making it easier to build map-based dashboards and spatial analytics apps with Preswald.


📌 Motivation

Preswald users are increasingly working with geospatial datasets (shapefiles, GeoJSON, spatial CSVs), but they currently have to transform and flatten geometry fields by hand before displaying them.

By natively integrating with geopandas, Preswald can:

  • Seamlessly support geospatial file formats
  • Simplify loading and preprocessing of spatial data
  • Prepare data for use with the upcoming geo() component or Plotly maps
  • Unlock use cases in real estate, environment, logistics, and more

✅ Acceptance Criteria

  • Add geopandas as a supported optional backend dependency (pip install geopandas)
  • Automatically use geopandas.read_file() when:
    • type = "geojson" or type = "shapefile" in preswald.toml
  • Handle loading of:
    • .geojson, .shp, .gpkg
  • Convert GeoDataFrame to a regular DataFrame with flattened geometry (WKT or GeoJSON format)
  • Add a flatten_geometry = true|false toggle to the data config (default: true)
  • Ensure compatibility with get_df() and downstream components (table(), plotly(), etc.)
  • Raise informative errors if geopandas is missing or file path is invalid
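
The last criterion could be covered by a small pre-flight check that runs before the loader. The sketch below is one possible shape for it, not existing Preswald code: the function name `check_geospatial_prerequisites` and the error wording are hypothetical, and it assumes the source is a single file on disk. It uses only the standard library, so it can run before geopandas is imported.

```python
import importlib.util
from pathlib import Path


def check_geospatial_prerequisites(path, module_name="geopandas"):
    """Raise a descriptive error if the backend or the source file is missing.

    Hypothetical helper: the name and messages are illustrative only.
    """
    # find_spec returns None when a top-level module cannot be located
    if importlib.util.find_spec(module_name) is None:
        raise ImportError(
            f"{module_name} is required for geospatial sources. "
            "Install it with: pip install preswald[geo]"
        )

    source = Path(path)
    if not source.is_file():
        raise FileNotFoundError(f"Geospatial source not found: {source}")

    return source
```

Checking the import first means a user without geopandas sees the install hint rather than a path error they cannot act on yet.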

🛠 Implementation Plan

1. Update Data Loader in data.py

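import geopandas as gpd
import pandas as pd

def load_geospatial_source(config):
    gdf = gpd.read_file(config["path"])  # handles .geojson, .shp, .gpkg

    if not config.get("flatten_geometry", True):
        return gdf

    # Return a plain DataFrame whose geometry column holds GeoJSON-like dicts
    geoms = [None if g is None else g.__geo_interface__ for g in gdf.geometry]
    df = pd.DataFrame(gdf.drop(columns="geometry"))
    df["geometry"] = geoms
    return df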

Detect geospatial sources by file extension (.geojson, .shp, .gpkg) or by type = "geojson" / type = "shapefile" in preswald.toml.
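
The detection rule could be sketched as a small predicate. This is a minimal illustration under assumptions: the function name `is_geospatial_source` and the two constants are hypothetical, and the config entry is assumed to be a plain dict with optional `type` and `path` keys, as in the loader above.

```python
from pathlib import Path

# Assumed values, taken from the acceptance criteria in this issue
GEO_TYPES = {"geojson", "shapefile"}
GEO_EXTENSIONS = {".geojson", ".shp", ".gpkg"}


def is_geospatial_source(config):
    """Return True if a data source entry should go through geopandas."""
    source_type = str(config.get("type", "")).lower()
    if source_type in GEO_TYPES:
        return True
    # Fall back to the file extension when no matching type is declared
    suffix = Path(str(config.get("path", ""))).suffix.lower()
    return suffix in GEO_EXTENSIONS
```

As written, either signal is enough on its own. Whether an explicit non-geospatial type (for example a CSV source whose path happens to end in .geojson) should override the extension is a design choice left open here.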

2. Example preswald.toml

[data.city_boundaries]
type = "geojson"
path = "data/cities.geojson"
flatten_geometry = true

🧪 Testing Plan

  • Load sample .geojson and .shp files
  • Confirm connect() and get_df() return a valid DataFrame
  • Use table(df) and plotly() to inspect spatial columns
  • Test behavior with and without flatten_geometry
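
For the first step, a test fixture does not need a real dataset. The sketch below writes a one-feature GeoJSON file using only the standard library. The helper name `write_sample_geojson` and the sample point are made up for illustration.

```python
import json
from pathlib import Path


def write_sample_geojson(path):
    """Write a one-feature GeoJSON FeatureCollection for use in tests."""
    collection = {
        "type": "FeatureCollection",
        "features": [
            {
                "type": "Feature",
                "properties": {"name": "Sample point"},
                # GeoJSON positions are [longitude, latitude]
                "geometry": {"type": "Point", "coordinates": [-122.42, 37.77]},
            }
        ],
    }
    path = Path(path)
    path.write_text(json.dumps(collection), encoding="utf-8")
    return path
```

A test could then pass the returned path as config["path"] to load_geospatial_source(), once with flatten_geometry = true and once with false, and compare the type of the geometry column in each result.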

📚 Docs To Update

  • docs/configuration.mdx → Add type = "geojson" / "shapefile" and flatten_geometry
  • docs/sdk/geo.mdx (future) → Add examples using geometry column
  • Add note about installing geopandas via extras:
    pip install preswald[geo]

🧩 Related Files

  • preswald/engine/managers/data.py
  • preswald.toml
  • Optional sample: examples/earthquakes.geojson

🔮 Future Enhancements

  • Detect and reproject coordinates (.to_crs())
  • Add spatial filter DSL (where geometry intersects...)
  • Integrate with geo() map-rendering component
  • Support live streaming geospatial data

Labels: enhancement (New feature or request)
