From the PO.DAAC Cookbook, to access the GitHub version of the notebook, follow this link.

SWOT Shapefile Data Conversion to CSV

This notebook shows how to merge (concatenate) multiple shapefiles into a single file. It also covers:

  • Converting the merged shapefile to a CSV file.
  • Querying the new dataset by a variable of the user’s choice, such as ‘reach_id’ or water surface elevation (‘wse’).
  • Exporting the queried subset as a CSV or shapefile.

Import libraries

import geopandas as gpd
import glob
from pathlib import Path
import pandas as pd
import os
import zipfile
import earthaccess

Before you start

Before beginning this tutorial, make sure you have an Earthdata Login account, which is required to access data from the NASA Earthdata system. Please visit https://urs.earthdata.nasa.gov to register. An account is free and takes only a moment to set up.

auth = earthaccess.login() 

Search for SWOT data

Let’s start our search for River Vector Shapefiles in North America. SWOT files come in “reach” and “node” versions within the same collection; here we want the 10 km reaches rather than the nodes. We will only get files for North America (‘NA’) and can also call out a specific pass number. Each dataset has its own short name associated with it; for the SWOT River shapefiles, it is SWOT_L2_HR_RiverSP_2.0.

results = earthaccess.search_data(short_name = 'SWOT_L2_HR_RIVERSP_2.0', 
                                  #temporal = ('2024-02-01 00:00:00', '2024-02-29 23:59:59'), # can also specify by time
                                  granule_name = '*Reach*_009_NA*') # here we filter by Reach files (not node), pass=009, continent code=NA
Granules found: 5

During the science orbit, each pass repeats once every 21 days. A given location may, however, be observed by several different passes within that 21-day cycle. See the SWOT swath visualizer for your location!
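To see what the granule_name wildcard in the search is selecting, the sketch below tests the same pattern against the granule name that appears in this notebook's download log. It uses Python's fnmatch as a local stand-in; the real matching happens on the server side, so treat this only as an illustration. The helper parse_riversp_name is hypothetical and assumes the underscore-separated field order seen in that file name.

```python
from fnmatch import fnmatchcase

# Granule name taken from the download log in this notebook.
name = "SWOT_L2_HR_RiverSP_Reach_008_009_NA_20231214T141139_20231214T141150_PIC0_01"

# Same wildcard pattern as passed to granule_name in the search.
pattern = "*Reach*_009_NA*"
matches = fnmatchcase(name, pattern)


def parse_riversp_name(granule_name):
    """Split a RiverSP granule name into labelled fields (hypothetical helper)."""
    parts = granule_name.split("_")
    # parts[0:4] is the product prefix: SWOT, L2, HR, RiverSP
    return {
        "feature": parts[4],    # Reach or Node
        "cycle": parts[5],
        "pass": parts[6],
        "continent": parts[7],
        "start": parts[8],
        "end": parts[9],
    }


fields = parse_riversp_name(name)
print(matches, fields["feature"], fields["cycle"], fields["pass"], fields["continent"])
```

Because the pattern requires the literal text _009_NA, it selects pass 009 regardless of the cycle number that precedes it.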

Download the Data into a folder

earthaccess.download(results, "../datasets/data_downloads/SWOT_files/")
folder = Path("../datasets/data_downloads/SWOT_files")
 Getting 5 granules, approx download size: 0.03 GB
File SWOT_L2_HR_RiverSP_Reach_008_009_NA_20231214T141139_20231214T141150_PIC0_01.zip already downloaded

Unzip shapefiles in existing folder

for item in os.listdir(folder): # loop through items in dir
    if item.endswith(".zip"): # check for ".zip" extension
        with zipfile.ZipFile(folder / item) as zip_ref: # open archive; it is closed automatically
            zip_ref.extractall(folder) # extract files to dir

Opening multiple shapefiles from within a folder

Let’s open all the shapefiles we’ve downloaded and combine them into one GeoDataFrame. This approach is ideal for a small number of granules, but if you’re looking to create large time series, consider using the PO.DAAC Hydrocron tool.

# Initialize list of shapefiles containing all dates
SWOT_HR_shps = []

# Loop through queried granules to stack all acquisition dates
for granule in results:
    # Take the file name from the granule's first download link
    filename = granule.data_links(access='external')[0].split("/")[-1]
    filename_shp = filename.replace('.zip', '.shp')
    filename_shp_path = folder / filename_shp  # pathlib join works on any operating system
    SWOT_HR_shps.append(gpd.read_file(filename_shp_path))

# Combine granules from all acquisition dates into one dataframe
SWOT_HR_df = gpd.GeoDataFrame(pd.concat(SWOT_HR_shps, ignore_index=True))

# Sort dataframe by reach_id and time
SWOT_HR_df = SWOT_HR_df.sort_values(['reach_id', 'time'])

SWOT_HR_df
reach_id time time_tai time_str p_lat p_lon river_name wse wse_u wse_r_u ... p_wid_var p_n_nodes p_dist_out p_length p_maf p_dam_id p_n_ch_max p_n_ch_mod p_low_slp geometry
0 71224500951 -1.000000e+12 -1.000000e+12 no_data 48.517717 -93.692086 Rainy River -1.000000e+12 -1.000000e+12 -1.000000e+12 ... 1480.031 53 244919.492 10586.381484 -1.000000e+12 0 1 1 0 LINESTRING (-93.76076 48.51651, -93.76035 48.5...
931 71224500951 -1.000000e+12 -1.000000e+12 no_data 48.517717 -93.692086 Rainy River -1.000000e+12 -1.000000e+12 -1.000000e+12 ... 1480.031 53 244919.492 10586.381484 -1.000000e+12 0 1 1 0 LINESTRING (-93.76076 48.51651, -93.76035 48.5...
1854 71224500951 -1.000000e+12 -1.000000e+12 no_data 48.517717 -93.692086 Rainy River -1.000000e+12 -1.000000e+12 -1.000000e+12 ... 1480.031 53 244919.492 10586.381484 -1.000000e+12 0 1 1 0 LINESTRING (-93.76076 48.51651, -93.76035 48.5...
2789 71224500951 -1.000000e+12 -1.000000e+12 no_data 48.517717 -93.692086 Rainy River -1.000000e+12 -1.000000e+12 -1.000000e+12 ... 1480.031 53 244919.492 10586.381484 -1.000000e+12 0 1 1 0 LINESTRING (-93.76076 48.51651, -93.76035 48.5...
3728 71224500951 -1.000000e+12 -1.000000e+12 no_data 48.517717 -93.692086 Rainy River -1.000000e+12 -1.000000e+12 -1.000000e+12 ... 1480.031 53 244919.492 10586.381484 -1.000000e+12 0 1 1 0 LINESTRING (-93.76076 48.51651, -93.76035 48.5...
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
930 77125000273 -1.000000e+12 -1.000000e+12 no_data 17.952683 -99.906755 no_data -1.000000e+12 -1.000000e+12 -1.000000e+12 ... 283915.163 49 484179.822 9729.640027 -1.000000e+12 0 2 1 0 LINESTRING (-99.93256 17.94746, -99.93273 17.9...
1853 77125000273 -1.000000e+12 -1.000000e+12 no_data 17.952683 -99.906755 no_data -1.000000e+12 -1.000000e+12 -1.000000e+12 ... 283915.163 49 484179.822 9729.640027 -1.000000e+12 0 2 1 0 LINESTRING (-99.93256 17.94746, -99.93273 17.9...
2788 77125000273 -1.000000e+12 -1.000000e+12 no_data 17.952683 -99.906755 no_data -1.000000e+12 -1.000000e+12 -1.000000e+12 ... 283915.163 49 484179.822 9729.640027 -1.000000e+12 0 2 1 0 LINESTRING (-99.93256 17.94746, -99.93273 17.9...
3727 77125000273 -1.000000e+12 -1.000000e+12 no_data 17.952683 -99.906755 no_data -1.000000e+12 -1.000000e+12 -1.000000e+12 ... 283915.163 49 484179.822 9729.640027 -1.000000e+12 0 2 1 0 LINESTRING (-99.93256 17.94746, -99.93273 17.9...
4666 77125000273 -1.000000e+12 -1.000000e+12 no_data 17.952683 -99.906755 no_data -1.000000e+12 -1.000000e+12 -1.000000e+12 ... 283915.163 49 484179.822 9729.640027 -1.000000e+12 0 2 1 0 LINESTRING (-99.93256 17.94746, -99.93273 17.9...

4667 rows × 127 columns
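The loop above derives each shapefile's local path from the granule's download link. A minimal sketch of that step on its own is shown here, using only the standard library so it behaves the same on Windows, macOS, and Linux. The helper local_shp_path and the example.com URL are hypothetical placeholders, and the sketch assumes, as the notebook does, that each archive contains a .shp file with the same base name as the .zip.

```python
from pathlib import Path
from urllib.parse import urlparse


def local_shp_path(url, folder):
    """Map a granule download URL to the extracted .shp path (hypothetical helper)."""
    zip_name = Path(urlparse(url).path).name  # last path component of the URL
    return Path(folder) / Path(zip_name).with_suffix(".shp")


# Placeholder URL for illustration only; real links come from data_links().
url = (
    "https://example.com/data/"
    "SWOT_L2_HR_RiverSP_Reach_008_009_NA_20231214T141139_20231214T141150_PIC0_01.zip"
)
path = local_shp_path(url, "../datasets/data_downloads/SWOT_files")
print(path.name)
```

Building the path with pathlib avoids hard-coding a path separator into the file name.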

Querying a Shapefile

Let’s get the attributes for a particular reach from the merged shapefile. Geopandas supports attribute queries, so you can select rows by a specific reach ID or by another variable such as reach length. Here, we’ll look at the reach with ID 77125000273. Reach IDs can be identified in the SWORD Database.

reach = SWOT_HR_df.query("reach_id == '77125000273'")
reach
reach_id time time_tai time_str p_lat p_lon river_name wse wse_u wse_r_u ... p_wid_var p_n_nodes p_dist_out p_length p_maf p_dam_id p_n_ch_max p_n_ch_mod p_low_slp geometry
930 77125000273 -1.000000e+12 -1.000000e+12 no_data 17.952683 -99.906755 no_data -1.000000e+12 -1.000000e+12 -1.000000e+12 ... 283915.163 49 484179.822 9729.640027 -1.000000e+12 0 2 1 0 LINESTRING (-99.93256 17.94746, -99.93273 17.9...
1853 77125000273 -1.000000e+12 -1.000000e+12 no_data 17.952683 -99.906755 no_data -1.000000e+12 -1.000000e+12 -1.000000e+12 ... 283915.163 49 484179.822 9729.640027 -1.000000e+12 0 2 1 0 LINESTRING (-99.93256 17.94746, -99.93273 17.9...
2788 77125000273 -1.000000e+12 -1.000000e+12 no_data 17.952683 -99.906755 no_data -1.000000e+12 -1.000000e+12 -1.000000e+12 ... 283915.163 49 484179.822 9729.640027 -1.000000e+12 0 2 1 0 LINESTRING (-99.93256 17.94746, -99.93273 17.9...
3727 77125000273 -1.000000e+12 -1.000000e+12 no_data 17.952683 -99.906755 no_data -1.000000e+12 -1.000000e+12 -1.000000e+12 ... 283915.163 49 484179.822 9729.640027 -1.000000e+12 0 2 1 0 LINESTRING (-99.93256 17.94746, -99.93273 17.9...
4666 77125000273 -1.000000e+12 -1.000000e+12 no_data 17.952683 -99.906755 no_data -1.000000e+12 -1.000000e+12 -1.000000e+12 ... 283915.163 49 484179.822 9729.640027 -1.000000e+12 0 2 1 0 LINESTRING (-99.93256 17.94746, -99.93273 17.9...

5 rows × 127 columns
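The query above quotes the ID because reach_id is stored as text. The sketch below shows three equivalent ways to make that selection on a small stand-in table. The values are illustrative and are not SWOT measurements, and plain pandas is used here because a GeoDataFrame inherits the same query and indexing behaviour.

```python
import pandas as pd

# Tiny stand-in for the merged table; values are illustrative, not SWOT data.
df = pd.DataFrame({
    "reach_id": ["71224500951", "77125000273", "77125000273"],
    "wse": [-1.0e12, 101.2, 101.5],
})

# reach_id is stored as text, so the comparison value must be quoted.
by_query = df.query("reach_id == '77125000273'")

# Equivalent boolean mask; handy when the ID lives in a variable.
target = "77125000273"
by_mask = df[df["reach_id"] == target]

# query() can also reference a Python variable with the @ prefix.
by_var = df.query("reach_id == @target")

print(len(by_query), by_mask.equals(by_query), by_var.equals(by_query))
```

The same pattern applies to numeric columns such as wse, where the comparison value is written without quotes.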

Converting to CSV

We can convert the merged timeseries geodataframe for this reach into a csv file.

reach.to_csv(folder / 'csv_77125000273.csv')
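The table output earlier in this notebook prints many values as -1.000000e+12, which look like fill values rather than measurements. As a hedged sketch, one option is to blank these out before writing the CSV. The threshold below is an assumption chosen for illustration, since the exact stored fill value may differ from the rounded value shown in the table. The rows are stand-ins, not real SWOT data, and the output goes to an in-memory buffer instead of a file.

```python
import io

import pandas as pd

# Assumed threshold: anything at or below this is treated as a fill value.
FILL_THRESHOLD = -9.0e11

# Illustrative stand-in rows; not real SWOT measurements.
reach_demo = pd.DataFrame({
    "reach_id": ["77125000273", "77125000273"],
    "wse": [-1.0e12, 101.5],
})

# Replace fill values with NaN in numeric columns only, leaving text untouched.
numeric_cols = reach_demo.select_dtypes("number").columns
cleaned = reach_demo.copy()
cleaned[numeric_cols] = cleaned[numeric_cols].mask(
    cleaned[numeric_cols] <= FILL_THRESHOLD
)

# index=False leaves the row-number column out of the CSV.
buffer = io.StringIO()
cleaned.to_csv(buffer, index=False)
print(buffer.getvalue())
```

NaN values are written as empty fields, so downstream tools read them as missing. To keep the geometry in a spatial format instead, a GeoDataFrame can be written back out with its to_file method.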