For an updated notebook using the latest data, see this notebook in the PO.DAAC Cookbook.
SWOT Hydrology Dataset Exploration in the Cloud
Accessing and Visualizing SWOT Datasets
Requirement:
This tutorial can only be run in an AWS cloud instance running in us-west-2: NASA Earthdata Cloud data in S3 can be directly accessed via the earthaccess Python library, and this access is limited to requests made from within the US West (Oregon, code: us-west-2) AWS region.
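As a quick, optional sanity check, you can ask the EC2 instance metadata service which region your notebook is running in. This is a minimal sketch; it assumes an EC2-style environment where the IMDSv1 metadata endpoint is reachable (IMDSv2-only instances will refuse the plain request):

```python
# Optional sanity check: query the EC2 instance metadata service (IMDSv1
# endpoint; assumes it is reachable from this instance) for the region
import requests

try:
    resp = requests.get(
        "http://169.254.169.254/latest/meta-data/placement/region", timeout=2
    )
    print("Running in AWS region:", resp.text if resp.ok else "unknown (IMDSv2-only instance?)")
except requests.exceptions.RequestException:
    print("Instance metadata service not reachable -- likely not running on EC2.")
```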
Learning Objectives:
- Access SWOT HR data products (archived in NASA Earthdata Cloud) within the AWS cloud, without downloading to a local machine
- Visualize accessed data for a quick check
SWOT Level 2 KaRIn High Rate Version 1.1 (where available) Datasets:
- River Vector Shapefile - SWOT_L2_HR_RIVERSP_1.1
- Lake Vector Shapefile - SWOT_L2_HR_LAKESP_1.1
- Water Mask Pixel Cloud NetCDF - SWOT_L2_HR_PIXC_1.1
- Water Mask Pixel Cloud Vector Attribute NetCDF - SWOT_L2_HR_PIXCVec_1.1
- Raster NetCDF - SWOT_L2_HR_Raster_1.1
- Single Look Complex Data Product - SWOT_L1B_HR_SLC_1.1
Notebook Author: Cassie Nickles, NASA PO.DAAC (Aug 2023) || Other Contributors: Zoe Walschots (PO.DAAC Summer Intern 2023), Catalina Taglialatela (NASA PO.DAAC), Luis Lopez (NASA NSIDC DAAC)
Last updated: 4 Dec 2023
Libraries Needed
```python
import glob
import os
import requests
import s3fs
import fiona
import netCDF4 as nc
import h5netcdf
import xarray as xr
import pandas as pd
import geopandas as gpd
import numpy as np
import matplotlib.pyplot as plt
import hvplot.xarray
import earthaccess
from earthaccess import Auth, DataCollections, DataGranules, Store
```
Earthdata Login
An Earthdata Login account is required to access data, as well as to discover restricted data, from the NASA Earthdata system. If you don’t already have one, please visit https://urs.earthdata.nasa.gov to register and manage your Earthdata Login account. The account is free to create and only takes a moment to set up. We use the earthaccess library to authenticate your login credentials below.
```python
auth = earthaccess.login()
```
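If you plan to rerun the notebook often, earthaccess can also persist your credentials to a ~/.netrc file via its persist flag, so later sessions authenticate without prompting. A small optional variation:

```python
# Optional: save credentials to ~/.netrc so future sessions can
# authenticate non-interactively (earthaccess.login supports persist=True)
auth = earthaccess.login(persist=True)
```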
Single File Access
1. River Vector Shapefiles
The S3 access link can be found using an earthaccess data search. Since this collection consists of Reach and Node files, we need to extract only the granule for the Reach file. We do this by filtering for the ‘Reach’ title in the data link.
Alternatively, Earthdata Search (see tutorial) can be used to search in a map-based graphical user interface.
For additional tips on spatial searching of SWOT HR L2 data, see also PO.DAAC Cookbook - SWOT Chapter tips section.
Search for the data of interest
```python
# Retrieve the granules we want by passing the data collection shortname,
# temporal bounds, and a granule-name wildcard filter to the
# `earthaccess.search_data` function
river_results = earthaccess.search_data(short_name = 'SWOT_L2_HR_RIVERSP_1.1',
                                        temporal = ('2023-04-08 00:00:00', '2023-04-22 23:59:59'),
                                        granule_name = '*Reach*_024_NA*') # here we filter by Reach files (not Node), pass #24 and continent code=NA for North America
# granule_name = '*Reach*_013_NA*' # would instead select pass #13, continent code=NA
```
Granules found: 15
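Before setting up cloud access, it can help to confirm the wildcard matched only Reach files. A minimal, optional check using the same data_links method the notebook calls later:

```python
# Optional: print the direct-access (s3://) links of the first few granules
# to confirm the wildcard filter matched only Reach files
for granule in river_results[:3]:
    print(granule.data_links(access='direct'))
```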
Set up an s3fs session for Direct Cloud Access
s3fs sessions are used for authenticated access to S3 buckets and allow typical file-system style operations. Below we create a session by passing in the data access information.
```python
fs_s3 = earthaccess.get_s3fs_session(results=river_results)
```
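Since the session behaves like a regular file system, standard operations such as info or open work directly on the granules' s3:// paths. A small optional check, reusing the first search result:

```python
# Optional: confirm the session can reach a granule by asking s3fs for the
# object metadata (size, type, etc.) of its zip file
first_link = river_results[0].data_links(access='direct')[0]
print(fs_s3.info(first_link))
```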
Create Fiona session to work with zip and embedded shapefiles in the AWS Cloud
The native format for this data is a .zip file, and we want the .shp file within it, so we will create a Fiona AWS session using the credentials from the s3fs session set up above to access the shapefiles inside the zip files. If we don’t do this, the alternative would be to download the data to the cloud environment (e.g. an EC2 instance or a user S3 bucket) and extract the .zip file there; a sketch of that alternative follows the session setup below.
```python
fiona_session = fiona.session.AWSSession(
    aws_access_key_id=fs_s3.storage_options["key"],
    aws_secret_access_key=fs_s3.storage_options["secret"],
    aws_session_token=fs_s3.storage_options["token"]
)
```
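For reference, the download-and-extract alternative mentioned above might look like the following sketch. The ./data directory is hypothetical; earthaccess.download fetches granules to local storage, and geopandas can read a shapefile straight from a local zip:

```python
# Alternative sketch: download one granule to local storage in the cloud
# environment ('./data' is a hypothetical directory), then read the
# shapefile directly out of the downloaded zip with geopandas
files = earthaccess.download(river_results[:1], "./data")
SWOT_HR_shp_local = gpd.read_file(f"zip://{files[0]}")
```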
```python
# Get the link for the first zip file
river_link = earthaccess.results.DataGranule.data_links(river_results[0], access='direct')[0]

# We use the zip+ prefix so fiona knows that we are operating on a zip file
river_shp_url = f"zip+{river_link}"

with fiona.Env(session=fiona_session):
    SWOT_HR_shp1 = gpd.read_file(river_shp_url)
```
```python
# View the attribute table
SWOT_HR_shp1
```
| | reach_id | time | time_tai | time_str | p_lat | p_lon | river_name | wse | wse_u | wse_r_u | ... | p_wid_var | p_n_nodes | p_dist_out | p_length | p_maf | p_dam_id | p_n_ch_max | p_n_ch_mod | p_low_slp | geometry |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 71185400013 | 7.342856e+08 | 7.342856e+08 | 2023-04-08T16:12:43Z | 55.405348 | -106.628388 | no_data | 3.864838e+02 | 1.139410e+00 | 1.135850e+00 | ... | 7863771.149 | 48 | 61917.017 | 9521.873154 | -1.000000e+12 | 0 | 10 | 2 | 0 | LINESTRING (-106.60903 55.44509, -106.60930 55... |
| 1 | 71185400021 | 7.342856e+08 | 7.342856e+08 | 2023-04-08T16:12:43Z | 55.452342 | -106.601114 | no_data | -1.000000e+12 | -1.000000e+12 | -1.000000e+12 | ... | 0.000 | 10 | 53346.297 | 1902.305299 | -1.000000e+12 | 0 | 5 | 1 | 0 | LINESTRING (-106.59293 55.45986, -106.59320 55... |
| 2 | 71185400033 | -1.000000e+12 | -1.000000e+12 | no_data | 55.632220 | -106.451323 | no_data | -1.000000e+12 | -1.000000e+12 | -1.000000e+12 | ... | 758315.173 | 14 | 28676.430 | 2858.149671 | -1.000000e+12 | 0 | 7 | 2 | 0 | LINESTRING (-106.47121 55.62881, -106.47073 55... |
| 3 | 71185400041 | 7.342856e+08 | 7.342856e+08 | 2023-04-08T16:12:43Z | 55.361687 | -106.646694 | no_data | 3.861999e+02 | 9.139000e-02 | 1.588000e-02 | ... | 0.000 | 5 | 62976.523 | 1059.505878 | -1.000000e+12 | 0 | 5 | 1 | 0 | LINESTRING (-106.64608 55.36668, -106.64607 55... |
| 4 | 71185400053 | 7.342856e+08 | 7.342856e+08 | 2023-04-08T16:12:43Z | 55.350062 | -106.647210 | no_data | 3.861795e+02 | 1.022600e-01 | 4.855000e-02 | ... | 3214.190 | 8 | 64492.945 | 1516.422084 | -1.000000e+12 | 0 | 1 | 1 | 0 | LINESTRING (-106.64728 55.35669, -106.64736 55... |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 594 | 75211000291 | -1.000000e+12 | -1.000000e+12 | no_data | 26.100287 | -98.270345 | Rio Bravo | -1.000000e+12 | -1.000000e+12 | -1.000000e+12 | ... | 123.027 | 53 | 238333.030 | 10660.888100 | -1.000000e+12 | 0 | 1 | 1 | 0 | LINESTRING (-98.25015 26.07251, -98.25039 26.0... |
| 595 | 75211000301 | -1.000000e+12 | -1.000000e+12 | no_data | 26.115209 | -98.305631 | Rio Grande | -1.000000e+12 | -1.000000e+12 | -1.000000e+12 | ... | 242.204 | 53 | 248976.010 | 10642.980241 | -1.000000e+12 | 0 | 1 | 1 | 0 | LINESTRING (-98.27467 26.11517, -98.27497 26.1... |
| 596 | 75211000683 | 7.342861e+08 | 7.342861e+08 | 2023-04-08T16:21:20Z | 25.955223 | -97.159176 | Rio Grande | 2.871000e-01 | 9.005000e-02 | 3.080000e-03 | ... | 436.214 | 18 | 9238.006 | 3611.160551 | -1.000000e+12 | 0 | 1 | 1 | 0 | LINESTRING (-97.14980 25.95092, -97.15011 25.9... |
| 597 | 75211000691 | 7.342861e+08 | 7.342861e+08 | 2023-04-08T16:21:20Z | 25.957129 | -97.209134 | Rio Grande | 3.374000e-01 | 9.102000e-02 | 1.360000e-02 | ... | 348.855 | 53 | 19926.935 | 10688.929343 | -1.000000e+12 | 0 | 1 | 1 | 0 | LINESTRING (-97.16943 25.96060, -97.16972 25.9... |
| 598 | 75211000701 | 7.342861e+08 | 7.342861e+08 | 2023-04-08T16:21:20Z | 25.945001 | -97.279869 | Rio Grande | 4.375000e-01 | 9.212000e-02 | 1.965000e-02 | ... | 203.786 | 53 | 30608.499 | 10681.563344 | -1.000000e+12 | 0 | 1 | 1 | 0 | LINESTRING (-97.25170 25.94769, -97.25200 25.9... |

599 rows × 127 columns
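Note the -1.000000e+12 entries above: these are fill values for reaches without valid observations on this pass. Before a quick-look analysis you might screen them out, as in this small sketch (wse is the water surface elevation column shown in the table):

```python
# Optional: screen out the -1e12 fill values in water surface elevation
# before doing any quick-look statistics
valid = SWOT_HR_shp1[SWOT_HR_shp1['wse'] > -1e11]
print(f"{len(valid)} of {len(SWOT_HR_shp1)} reaches have valid wse values")
```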
Quickly plot the SWOT river data
```python
# Simple plot
fig, ax = plt.subplots(figsize=(7,5))
SWOT_HR_shp1.plot(ax=ax, color='black')
```
```python
# Another way to plot geopandas dataframes is with `explore`, which also plots a basemap
SWOT_HR_shp1.explore()
```