Lebovits/issu 1016 ndvi #1171
base: staging
Conversation
I like the number and small size of the helper classes. With our services, I'm planning to move to a pattern where the service takes the feature layer but then calls a transform-and-merge function that relies only on gdfs, so the logic can be tested without worrying about the feature layer class.
I added some suggestions to reduce how much we need to mock, and also had a concern about whether the row length of results could ever differ from primary_featurelayer.gdf.
This should be some very interesting and helpful functionality!
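The service pattern described above can be sketched as follows. This is a minimal illustration, not the PR's code; all names (transform_ndvi, ndvi_service, has_geometry) are hypothetical, and the FeatureLayer class is assumed to expose a gdf attribute.

```python
import geopandas as gpd


def transform_ndvi(gdf: gpd.GeoDataFrame) -> gpd.GeoDataFrame:
    """Pure gdf-in, gdf-out transform: unit-testable without FeatureLayer.

    The column added here is a stand-in for real NDVI logic.
    """
    out = gdf.copy()
    out["has_geometry"] = out.geometry.notna()
    return out


def ndvi_service(feature_layer) -> None:
    """Thin service wrapper: unwrap the gdf, delegate, reassign."""
    feature_layer.gdf = transform_ndvi(feature_layer.gdf)
```

Because transform_ndvi touches only GeoDataFrames, its tests need no mocks of the feature layer class.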
data/src/new_etl/data_utils/ndvi.py (Outdated)
from ..constants.services import CENSUS_BGS_URL

def get_current_summer_year() -> int:
This may produce surprising behavior around the end of May/early June if we are trying to rerun a pipeline or recreate old results. We may want to add an optional date parameter so we can control this when needed; that would also eliminate the need for freezegun in the tests.
I haven't yet looked at how this is used elsewhere, but is there a lag between when summer begins and when data is available for that summer that we need to worry about?
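The suggestion could look something like the sketch below. The June cutoff is an assumption for illustration (the quoted diff only shows the function returning today.year); the point is the optional parameter that makes reruns and tests deterministic without freezegun.

```python
from datetime import date
from typing import Optional


def get_current_summer_year(today: Optional[date] = None) -> int:
    """Return the year of the most recent summer.

    Accepts an optional reference date so callers (and tests) can pin
    the clock; defaults to the real current date.
    """
    if today is None:
        today = date.today()
    # Assumed cutoff: before June, the most recent summer was last year.
    return today.year if today.month >= 6 else today.year - 1
```

A test can then simply pass `date(2025, 5, 15)` instead of freezing time globally.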
data/src/new_etl/data_utils/ndvi.py (Outdated)
return today.year

def get_bbox_from_census_data() -> Tuple[Polygon, Tuple[float, float, float, float]]:
To make this easier to test, we may want this function to take an optional gdf rather than always reading from CENSUS_BGS_URL.
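A possible shape for that change is sketched below, assuming the function's job is to derive a bounding box from census block groups; the body is illustrative, since the original implementation isn't shown in the thread.

```python
from typing import Optional, Tuple

import geopandas as gpd
from shapely.geometry import Polygon, box

# Stand-in for the real CENSUS_BGS_URL constant from constants.services.
CENSUS_BGS_URL = "https://example.com/census_block_groups.geojson"


def get_bbox_from_census_data(
    gdf: Optional[gpd.GeoDataFrame] = None,
) -> Tuple[Polygon, Tuple[float, float, float, float]]:
    """Return the overall bounding box of the census block groups.

    Accepts an optional pre-loaded GeoDataFrame so tests can pass a
    small fixture instead of patching geopandas.read_file.
    """
    if gdf is None:
        gdf = gpd.read_file(CENSUS_BGS_URL)
    minx, miny, maxx, maxy = gdf.total_bounds
    return box(minx, miny, maxx, maxy), (minx, miny, maxx, maxy)
```

Tests then construct a two-polygon GeoDataFrame inline and assert on the returned bounds, with no network access or mocking.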
data/src/test/test_ndvi.py (Outdated)
self.assertFalse(is_cache_valid(wrong_year_path, 2025))
self.assertTrue(is_cache_valid(self.test_raster_path, 2025))

@patch("geopandas.read_file")
See if the patch/mocks can be removed after allowing an optional gdf to be passed to this function.
data/src/test/test_ndvi.py (Outdated)
@patch("new_etl.data_utils.ndvi.odc.stac.stac_load")
@patch("new_etl.data_utils.ndvi.get_current_summer_year")
@patch("new_etl.data_utils.ndvi.pystac_client.Client.open")
def test_generate_ndvi_data_failure(
Unless you are actually concerned the correct exception won't be hit, I'd just remove this test.
data/src/test/test_ndvi.py (Outdated)
@patch("tempfile.gettempdir")
@patch("new_etl.data_utils.ndvi.get_current_summer_year")
@patch("new_etl.data_utils.ndvi.get_bbox_from_census_data")
It is hard to reason about tests when so much is patched. See if we can eliminate at least two of the patches based on my suggestions.
data/src/new_etl/data_utils/ndvi.py (Outdated)
# Update the primary feature layer with the new columns from results
# (instead of creating a new one)
for column in results.columns:
Given that results is computed on a cleaned version of the gdf and we're adding its columns into a version that hasn't been cleaned, is there a possibility of misalignment or differing row lengths when merging in the columns?
It is probably worth making this logic a separate function, then trying out some filtering on a df and making sure things merge okay.
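One way to make that merge robust is to join by index rather than assigning positionally, so rows dropped during cleaning come back as NaN instead of shifting values. This is a sketch under assumed names (merge_result_columns is hypothetical); it is not the PR's implementation.

```python
import geopandas as gpd
import pandas as pd


def merge_result_columns(
    primary_gdf: gpd.GeoDataFrame, results: pd.DataFrame
) -> gpd.GeoDataFrame:
    """Copy columns from results into primary_gdf, aligned by index.

    Rows of primary_gdf that were filtered out before computing results
    receive NaN rather than a misaligned value, even when the two frames
    have different row counts.
    """
    out = primary_gdf.copy()
    for column in results.columns:
        # reindex aligns on the shared index; missing rows become NaN.
        out[column] = results[column].reindex(out.index)
    return out
```

Pulling the loop into its own function also makes the reviewer's suggested filter-then-merge test a three-line unit test.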
New Feature: Add NDVI and One-Year NDVI Change to ETL Pipeline
Description
This PR addresses issue #1016. It uses Planetary Computer to query Sentinel-2 satellite imagery for the most recent summer and the summer prior. It produces a raster with the most recent summer's NDVI at 10 m resolution, plus the one-year change in NDVI from the prior summer, each as a separate band. It saves the raster as a file in the tmp directory (currently around 65 MB), then uses exactextract to get the mean NDVI and mean NDVI change for each parcel in the primary feature layer dataset (all 500k+ properties in Philadelphia). The code checks whether the data for the most recent summer already exists in the tmp folder; if it does, it skips straight to extraction. The full run without cached data takes 10-15 minutes; with cached data, about 3 minutes. I've also added unit tests in test/test_ndvi.py.
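For reference, the NDVI quantity the pipeline computes per pixel follows the standard formula over Sentinel-2's near-infrared (B08) and red (B04) bands. The sketch below illustrates that formula only; it is not the PR's code, and the zero-denominator handling is an assumption.

```python
import numpy as np


def compute_ndvi(nir: np.ndarray, red: np.ndarray) -> np.ndarray:
    """NDVI = (NIR - Red) / (NIR + Red), bounded in [-1, 1].

    Pixels where both bands are zero (no signal) are assumed to map
    to 0.0 rather than dividing by zero.
    """
    nir = np.asarray(nir, dtype="float64")
    red = np.asarray(red, dtype="float64")
    denom = nir + red
    safe = np.where(denom == 0, 1.0, denom)  # placeholder divisor
    return np.where(denom == 0, 0.0, (nir - red) / safe)
```

The one-year change band is then simply the current summer's NDVI array minus the prior summer's, pixel by pixel.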