From the PO.DAAC Cookbook, to access the GitHub version of the notebook, follow this link.

Working with SWOT Level 2 Water Mask Raster Image Data Product:

Local Machine Download Version

Author: Nicholas Tarpinian, PO.DAAC

Summary & Learning Objectives

This notebook showcases how to work with multiple files from the SWOT Raster Image data product, version C (aka 2.0), on a local machine.

  • Utilizing the earthaccess Python package. For more information visit: https://nsidc.github.io/earthaccess/
  • Querying the dataset at the user’s choice of resolution, either ‘100m’ or ‘250m’.
  • Visualizing multiple raster images on a single map.
  • Stacking multiple raster images and creating a time dimension to analyze over time.
  • Adjusting images based on the quality flag.

Requirements

1. Compute environment

This tutorial is written to run in the following environment:

  • Local compute environment (e.g., laptop or server): this tutorial can be run on your local machine.

2. Earthdata Login

An Earthdata Login account is required to access data, as well as to discover restricted data, from the NASA Earthdata system. Please visit https://urs.earthdata.nasa.gov to register and manage your Earthdata Login account. The account is free to create and only takes a moment to set up.

Import libraries

import os
from os.path import isfile, basename, abspath
import xarray as xr
import numpy as np
from datetime import datetime
from pathlib import Path
import hvplot
import hvplot.xarray 
import earthaccess

Authentication with earthaccess

In this notebook, we authenticate with Earthdata Login by calling earthaccess.login() in the cell below.

auth = earthaccess.login()
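earthaccess.login() will prompt for your Earthdata Login username and password if it cannot find stored credentials. If you prefer a non-interactive setup, the snippet below is a minimal sketch of two options available in recent earthaccess versions (reading credentials from environment variables, or saving them to a ~/.netrc file); adjust to your environment.

# Option 1: read credentials from the EARTHDATA_USERNAME / EARTHDATA_PASSWORD environment variables
# auth = earthaccess.login(strategy="environment")

# Option 2: prompt once, then persist the credentials to ~/.netrc for future sessions
# auth = earthaccess.login(persist=True)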

Search for SWOT Raster products using earthaccess

Each dataset has its own unique collection concept ID. For the SWOT_L2_HR_Raster_2.0 dataset, the collection concept ID can be found through NASA’s Common Metadata Repository (CMR) or Earthdata Search.

For this tutorial, we are looking at the Lake Mead Reservoir in the United States.

We used bbox finder to get the exact coordinates for our area of interest.
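Note that the bounding_box argument passed to earthaccess below is ordered (west, south, east, north) in decimal degrees, i.e. the lower-left corner followed by the upper-right corner. A minimal sketch (the lake_mead_bbox name is only illustrative; the search cell below passes the tuple directly):

# Bounding box as (min_lon, min_lat, max_lon, max_lat) around Lake Mead
lake_mead_bbox = (-115.112686, 35.740939, -114.224167, 36.937819)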

results = earthaccess.search_data(
    short_name='SWOT_L2_HR_RASTER_2.0',
    bounding_box=(-115.112686, 35.740939, -114.224167, 36.937819),
    temporal=('2024-02-01 12:00:00', '2024-02-01 23:59:59'),
    granule_name='*_100m_*',
    count=200
)
Granules found: 2
earthaccess.download(results, "data_downloads/SWOT_Raster_LakeMead/")
 Getting 2 granules, approx download size: 0.07 GB
Accessing cloud dataset using dataset endpoint credentials: https://archive.swot.podaac.earthdata.nasa.gov/s3credentials
Downloaded: data_downloads/SWOT_Raster_LakeMead/SWOT_L2_HR_Raster_100m_UTM11S_N_x_x_x_010_218_045F_20240201T183814_20240201T183835_PIC0_01.nc
Downloaded: data_downloads/SWOT_Raster_LakeMead/SWOT_L2_HR_Raster_100m_UTM11S_N_x_x_x_010_218_046F_20240201T183834_20240201T183855_PIC0_01.nc
['data_downloads/SWOT_Raster_LakeMead/SWOT_L2_HR_Raster_100m_UTM11S_N_x_x_x_010_218_045F_20240201T183814_20240201T183835_PIC0_01.nc',
 'data_downloads/SWOT_Raster_LakeMead/SWOT_L2_HR_Raster_100m_UTM11S_N_x_x_x_010_218_046F_20240201T183834_20240201T183855_PIC0_01.nc']
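The granule_name wildcard above limits the search to the 100m rasters. To work with the 250m resolution instead, swap the wildcard; a sketch of the alternative query (not run in this notebook):

results_250m = earthaccess.search_data(
    short_name='SWOT_L2_HR_RASTER_2.0',
    bounding_box=(-115.112686, 35.740939, -114.224167, 36.937819),
    temporal=('2024-02-01 12:00:00', '2024-02-01 23:59:59'),
    granule_name='*_250m_*',
    count=200
)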

Visualizing Multiple Tiles

Let’s now visualize multiple raster tiles in a folder and explore the data.

Here we utilize xarray.open_mfdataset, which supports opening multiple files as a single dataset; the two tiles are concatenated along the x dimension via combine='nested' and concat_dim='x'.

folder_path = "data_downloads/SWOT_Raster_LakeMead/"

file_paths = [os.path.join(folder_path, file) for file in os.listdir(folder_path) if file.endswith('.nc')]
ds = xr.open_mfdataset(file_paths, combine='nested', concat_dim='x')
ds
<xarray.Dataset>
Dimensions:                  (y: 2784, x: 3074)
Coordinates:
  * y                        (y) float64 3.899e+06 3.899e+06 ... 4.177e+06
  * x                        (x) float64 5.682e+05 5.683e+05 ... 6.928e+05
Data variables: (12/39)
    crs                      (x) object b'1' b'1' b'1' b'1' ... b'1' b'1' b'1'
    longitude                (y, x) float64 dask.array<chunksize=(512, 512), meta=np.ndarray>
    latitude                 (y, x) float64 dask.array<chunksize=(512, 512), meta=np.ndarray>
    wse                      (y, x) float32 dask.array<chunksize=(768, 768), meta=np.ndarray>
    wse_qual                 (y, x) float32 dask.array<chunksize=(2784, 1536), meta=np.ndarray>
    wse_qual_bitwise         (y, x) float64 dask.array<chunksize=(768, 768), meta=np.ndarray>
    ...                       ...
    load_tide_fes            (y, x) float32 dask.array<chunksize=(768, 768), meta=np.ndarray>
    load_tide_got            (y, x) float32 dask.array<chunksize=(768, 768), meta=np.ndarray>
    pole_tide                (y, x) float32 dask.array<chunksize=(768, 768), meta=np.ndarray>
    model_dry_tropo_cor      (y, x) float32 dask.array<chunksize=(768, 768), meta=np.ndarray>
    model_wet_tropo_cor      (y, x) float32 dask.array<chunksize=(768, 768), meta=np.ndarray>
    iono_cor_gim_ka          (y, x) float32 dask.array<chunksize=(768, 768), meta=np.ndarray>
Attributes: (12/49)
    Conventions:                   CF-1.7
    title:                         Level 2 KaRIn High Rate Raster Data Product
    source:                        Ka-band radar interferometer
    history:                       2024-02-05T08:50:33Z : Creation
    platform:                      SWOT
    references:                    V1.2.1
    ...                            ...
    x_min:                         568200.0
    x_max:                         721700.0
    y_min:                         3898600.0
    y_max:                         4052100.0
    institution:                   CNES
    product_version:               01
raster_plot = ds.wse.hvplot.quadmesh(x='x', y='y', rasterize=True, title='SWOT Raster 100m: Lake Mead Reservoir')
raster_plot.opts(width=700, height=600, colorbar=True)
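Optionally, the interactive figure can be written out as a standalone HTML file with hvplot.save (a sketch; the output file name is arbitrary):

# Save the interactive hvplot figure to a standalone HTML file
hvplot.save(raster_plot, 'swot_raster_lake_mead.html')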

Creating a Time Series

The SWOT Raster product does not include a time dimension; each file is a snapshot in time. However, a time coordinate can be inserted by extracting the acquisition time from the file name.

  1. Expand the time range of your earthaccess search to get an adequate range.
  2. Extract the datetime from the file name, then concatenate the datasets along the new time dimension.
  3. Save the output as a new NetCDF file.
time_results = earthaccess.search_data(
    short_name='SWOT_L2_HR_RASTER_2.0',
    bounding_box=(-114.502048, 36.060175, -114.390983, 36.210182),
    temporal=('2024-01-25 00:00:00', '2024-03-04 23:59:59'),
    granule_name='*_100m_*',
    count=200
)
Granules found: 3
earthaccess.download(time_results, "data_downloads/SWOT_Raster_LakeMead_Time/")
 Getting 3 granules, approx download size: 0.12 GB
Accessing cloud dataset using dataset endpoint credentials: https://archive.swot.podaac.earthdata.nasa.gov/s3credentials
Downloaded: data_downloads/SWOT_Raster_LakeMead_Time/SWOT_L2_HR_Raster_100m_UTM11S_N_x_x_x_010_205_109F_20240201T075048_20240201T075109_PIC0_01.nc
Downloaded: data_downloads/SWOT_Raster_LakeMead_Time/SWOT_L2_HR_Raster_100m_UTM11S_N_x_x_x_010_496_046F_20240211T170050_20240211T170111_PIC0_01.nc
Downloaded: data_downloads/SWOT_Raster_LakeMead_Time/SWOT_L2_HR_Raster_100m_UTM11S_N_x_x_x_011_205_109F_20240222T043554_20240222T043615_PIC0_01.nc
['data_downloads/SWOT_Raster_LakeMead_Time/SWOT_L2_HR_Raster_100m_UTM11S_N_x_x_x_010_205_109F_20240201T075048_20240201T075109_PIC0_01.nc',
 'data_downloads/SWOT_Raster_LakeMead_Time/SWOT_L2_HR_Raster_100m_UTM11S_N_x_x_x_010_496_046F_20240211T170050_20240211T170111_PIC0_01.nc',
 'data_downloads/SWOT_Raster_LakeMead_Time/SWOT_L2_HR_Raster_100m_UTM11S_N_x_x_x_011_205_109F_20240222T043554_20240222T043615_PIC0_01.nc']
folder_path = "data_downloads/SWOT_Raster_LakeMead_Time/"
file_paths = [os.path.join(folder_path, file) for file in os.listdir(folder_path) if file.endswith('.nc')]
def add_time_dimension(ds):
    # Extract date/time string from filename
    file_date = os.path.basename(ds.encoding['source']).split("_")[-4][:15]
    # Convert the date string to a datetime object
    time_value = datetime.strptime(file_date, "%Y%m%dT%H%M%S")
    # Assign the time coordinate to the dataset
    ds.coords['time'] = time_value
    return ds
datasets = []
file_names = []

for file_path in file_paths:
    dataset = xr.open_dataset(file_path)
    datasets.append(add_time_dimension(dataset))
    file_names.append(os.path.basename(file_path))
    dataset.close()
# sorting the time dimension in correct order
datasets.sort(key=lambda ds: ds.time.values)
ds2 = xr.concat(datasets, dim='time')
ds2
<xarray.Dataset>
Dimensions:                  (x: 1549, y: 1549, time: 3)
Coordinates:
  * x                        (x) float64 6.788e+05 6.789e+05 ... 8.336e+05
  * y                        (y) float64 3.9e+06 3.901e+06 ... 4.055e+06
  * time                     (time) datetime64[ns] 2024-02-01T07:50:48 ... 20...
Data variables: (12/39)
    crs                      (time) object b'1' b'1' b'1'
    longitude                (time, y, x) float64 nan nan nan ... nan nan nan
    latitude                 (time, y, x) float64 nan nan nan ... nan nan nan
    wse                      (time, y, x) float32 nan nan nan ... nan nan nan
    wse_qual                 (time, y, x) float32 nan nan nan ... nan nan nan
    wse_qual_bitwise         (time, y, x) float64 nan nan nan ... nan nan nan
    ...                       ...
    load_tide_fes            (time, y, x) float32 nan nan nan ... nan nan nan
    load_tide_got            (time, y, x) float32 nan nan nan ... nan nan nan
    pole_tide                (time, y, x) float32 nan nan nan ... nan nan nan
    model_dry_tropo_cor      (time, y, x) float32 nan nan nan ... nan nan nan
    model_wet_tropo_cor      (time, y, x) float32 nan nan nan ... nan nan nan
    iono_cor_gim_ka          (time, y, x) float32 nan nan nan ... nan nan nan
Attributes: (12/49)
    Conventions:                   CF-1.7
    title:                         Level 2 KaRIn High Rate Raster Data Product
    source:                        Ka-band radar interferometer
    history:                       2024-02-05T12:55:01Z : Creation
    platform:                      SWOT
    references:                    V1.2.1
    ...                            ...
    x_min:                         680100.0
    x_max:                         829300.0
    y_min:                         3903300.0
    y_max:                         4052400.0
    institution:                   CNES
    product_version:               01
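As an alternative to the explicit loop above, the same time coordinate can be attached while opening all the files at once by passing add_time_dimension as the preprocess callback of xarray.open_mfdataset. This is only a sketch; it assumes your xarray version exposes the source file path (ds.encoding['source']) inside the preprocess callback.

# Alternative: apply add_time_dimension to each file as it is opened,
# then concatenate the results along the new 'time' coordinate.
ds2_alt = xr.open_mfdataset(
    file_paths,
    preprocess=add_time_dimension,   # derives the timestamp from each file name
    combine='nested',
    concat_dim='time',
).sortby('time')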
timeplot = ds2.wse.hvplot.image(y='y', x='x')
timeplot.opts(width=700, height=500, colorbar=True)

Let’s plot the wse quality flag, wse_qual, which ranges from 0 to 3, where 0 = good, 1 = suspect, 2 = degraded, and 3 = bad (as described in the variable’s attributes when printed with xarray).

timeplot = ds2.wse_qual.hvplot.image(y='y', x='x')
timeplot.opts(width=700, height=500, colorbar=True)
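Before masking, it can help to see how many pixels fall into each quality category. Below is a minimal sketch using numpy (it loads the wse_qual values into memory and ignores NaNs):

# Count pixels per quality-flag value (0=good, 1=suspect, 2=degraded, 3=bad)
qual = ds2['wse_qual'].values
qual = qual[~np.isnan(qual)]
flags, counts = np.unique(qual, return_counts=True)
print(dict(zip(flags.astype(int), counts)))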

Masking a variable with its quality flag

variable_to_mask = ds2['wse']
mask_variable = ds2['wse_qual']
# Define the condition for masking based on the range of the quality flag
mask_condition = mask_variable < 2

masked_variable = variable_to_mask.where(mask_condition)
# Update the masked variable in the dataset
ds2['wse'] = masked_variable

ds2['wse'].hvplot.image(y='y', x='x').opts(width=700, height=500, colorbar=True)

Our end product is a time series of the data showing only the values where the quality flag is either good (0) or suspect (1).
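To reduce the masked rasters to a single line over time, we can average wse over the spatial dimensions. This is a minimal sketch using a plain (unweighted) mean over all valid pixels in each snapshot:

# Spatially averaged water surface elevation per overpass
wse_mean = ds2['wse'].mean(dim=['x', 'y'])
wse_mean.hvplot.line(x='time').opts(width=700, height=400)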

We can also save the quality-masked time series to a new NetCDF file.

output_folder = 'data_downloads/SWOT_Raster_Time/'
os.makedirs(output_folder, exist_ok=True)
output_netcdf_path = os.path.join(output_folder, "Output_Time.nc")
ds2.to_netcdf(output_netcdf_path)

print(f"Output complete: {output_netcdf_path}")
Output complete: data_downloads/SWOT_Raster_Time/Output_Time.nc
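As a quick check, the saved file can be reopened to confirm the time dimension and masked values round-trip correctly (a sketch):

# Reopen the saved NetCDF file and inspect its dimensions
check = xr.open_dataset(output_netcdf_path)
print(check.dims)
check.close()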

Appendix: Alternate Plot

# # Alternate plotting with matplotlib
# %matplotlib inline

# import matplotlib.pyplot as plt
# from matplotlib import animation
# from matplotlib.animation import FuncAnimation, PillowWriter
# from IPython.display import display, Image, HTML

# variable_name = 'wse'
# data = ds2[variable_name]

# fig, ax = plt.subplots(figsize=(10, 8))
# fig.set_tight_layout({'rect': [0.01, 0.01, 1.0, 1.0]})

# contour = ax.contourf(data.isel(time=0), cmap='viridis')
# cbar = plt.colorbar(contour)
# cbar.set_label('Water Surface Elevation (meters)', fontsize=14) 
# times = ds2.time.values

# # Function to update the plot for each time step
# def update(frame):
#     ax.clear()
#     contour = ax.contourf(data.isel(time=frame), cmap='viridis',)
#     formatted_time = str(times[frame])[:-7]
#     ax.set_title(f'Date: {formatted_time}')
#     ax.set_xlabel('Longitude', fontsize=14)
#     ax.set_ylabel('Latitude', fontsize=14)
#     ax.text(0.5, 1.05, 'SWOT Raster 100M Lake Mead Reservoir', transform=ax.transAxes, ha='center', fontsize=14)
#     return contour,

# # Creating a gif animation
# ani = animation.FuncAnimation(fig, update, repeat=True, frames=len(data['time']), blit=True, interval=3000)

# output = ('./time_series.gif')
# ani.save(output, writer='pillow', fps=.5)

# with open(output,'rb') as f:
#     display(Image(data=f.read(), format='gif'))

# plt.close(fig)
# ds2.close()