From the PO.DAAC Cookbook, to access the GitHub version of the notebook, follow this link

Access SWOT Oceanography Data in the Cloud

Summary

This notebook demonstrates direct access to PO.DAAC archived products in the Earthdata Cloud, hosted in AWS Simple Storage Service (S3). In this demo, we showcase two SWOT Level 2 Low Rate products:

  1. SWOT Level 2 KaRIn Low Rate Sea Surface Height Data Product - shortname SWOT_L2_LR_SSH_2.0
  2. SWOT Level 2 Nadir Altimeter Interim Geophysical Data Record with Waveforms - SSHA Version 2.0 - shortname SWOT_L2_NALT_IGDR_SSHA_2.0
    • This is a subcollection of the parent collection: SWOT_L2_NALT_IGDR_2.0

We will access the data from inside the AWS cloud (us-west-2 region, specifically) and load a time series made of multiple netCDF files into a single xarray dataset.

Requirement:

This tutorial can only be run in an AWS cloud instance running in us-west-2 region.

This instance will cost approximately $0.0832 per hour. The entire demo can run in considerably less time.
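
As a rough guide to what a run might cost, the hourly rate quoted above can be scaled by run time. This is a minimal sketch: the run lengths are hypothetical, and it ignores billing granularity and any storage or data-transfer charges.

```python
# Hourly rate quoted in this tutorial (USD per hour).
HOURLY_RATE_USD = 0.0832


def estimated_cost(minutes: float, hourly_rate: float = HOURLY_RATE_USD) -> float:
    """Return the approximate instance cost in USD for a run lasting `minutes`."""
    return hourly_rate * minutes / 60.0


# Hypothetical run lengths, chosen only for illustration.
for minutes in (15, 30, 60):
    print(f"{minutes:>3} min -> ${estimated_cost(minutes):.4f}")
```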

Learning Objectives:

  • authenticate with the earthaccess Python library using your NASA Earthdata Login
  • access DAAC data directly from the in-region S3 bucket without moving or downloading any files to your local (cloud) workspace
  • plot the first time step in the data

Note: no files are downloaded from the cloud; we work with the data directly in the AWS cloud.

Libraries Needed:

import xarray as xr
import s3fs
import cartopy.crs as ccrs
from matplotlib import pyplot as plt
import earthaccess
from earthaccess import Auth, DataCollections, DataGranules, Store
%matplotlib inline

Earthdata Login

An Earthdata Login account is required to access data, as well as discover restricted data, from the NASA Earthdata system. Please visit https://urs.earthdata.nasa.gov to register and manage your Earthdata Login account. The account is free and takes only a moment to set up. We use earthaccess to authenticate your login credentials below.

auth = earthaccess.login() 

1. SWOT Level 2 KaRIn Low Rate Sea Surface Height Data Product

Access Files without any Downloads to your running instance

Here, we use the earthaccess Python library to search for and then load the data directly into xarray without downloading any files. This dataset is currently restricted to a select few people, and can only be accessed with a compatible version of earthaccess. If zero granules are returned, make sure the correct version, ‘0.5.4’, is installed.

# Retrieve the granules for the time window we want.
# granule_name keeps only files with "Expert" in the file name. This collection has
# subcollections of 'Basic', 'Windwave', 'Unsmoothed' and 'Expert' granules.
karin_results = earthaccess.search_data(short_name='SWOT_L2_LR_SSH_2.0',
                                        temporal=("2024-02-01 12:00:00", "2024-02-01 19:43:00"),
                                        granule_name='*Expert*')
Granules found: 10
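
To see what a wildcard pattern such as '*Expert*' selects, the same idea can be tried locally with Python's fnmatch module. This is only an illustration: the search itself is matched by the Earthdata search service, not by fnmatch, and the file names below are made-up placeholders, not real granule names.

```python
from fnmatch import fnmatchcase

# Placeholder file names for illustration only; real names come from the search results.
example_names = [
    "SWOT_L2_LR_SSH_Basic_EXAMPLE.nc",
    "SWOT_L2_LR_SSH_Expert_EXAMPLE.nc",
    "SWOT_L2_LR_SSH_WindWave_EXAMPLE.nc",
    "SWOT_L2_LR_SSH_Unsmoothed_EXAMPLE.nc",
]

pattern = "*Expert*"  # same wildcard as passed to granule_name above

# Keep only the names that contain "Expert" anywhere in the string.
expert_only = [name for name in example_names if fnmatchcase(name, pattern)]
print(expert_only)
```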

Open with xarray

The files we are looking at are about 11-13 MB each, so the 10 we want to access total a little over 100 MB.

# open the granules and load them into a single xarray dataset
ds = xr.open_mfdataset(earthaccess.open(karin_results), combine='nested', concat_dim="num_lines", decode_times=False, engine='h5netcdf')
ds
Opening 10 granules, approx size: 0.32 GB
using endpoint: https://archive.swot.podaac.earthdata.nasa.gov/s3credentials
<xarray.Dataset>
Dimensions:                                (num_lines: 98660, num_pixels: 69,
                                            num_sides: 2)
Coordinates:
    latitude                               (num_lines, num_pixels) float64 dask.array<chunksize=(9866, 69), meta=np.ndarray>
    longitude                              (num_lines, num_pixels) float64 dask.array<chunksize=(9866, 69), meta=np.ndarray>
    latitude_nadir                         (num_lines) float64 dask.array<chunksize=(9866,), meta=np.ndarray>
    longitude_nadir                        (num_lines) float64 dask.array<chunksize=(9866,), meta=np.ndarray>
Dimensions without coordinates: num_lines, num_pixels, num_sides
Data variables: (12/98)
    time                                   (num_lines) float64 dask.array<chunksize=(9866,), meta=np.ndarray>
    time_tai                               (num_lines) float64 dask.array<chunksize=(9866,), meta=np.ndarray>
    ssh_karin                              (num_lines, num_pixels) float64 dask.array<chunksize=(9866, 69), meta=np.ndarray>
    ssh_karin_qual                         (num_lines, num_pixels) float64 dask.array<chunksize=(9866, 69), meta=np.ndarray>
    ssh_karin_uncert                       (num_lines, num_pixels) float32 dask.array<chunksize=(9866, 69), meta=np.ndarray>
    ssha_karin                             (num_lines, num_pixels) float64 dask.array<chunksize=(9866, 69), meta=np.ndarray>
    ...                                     ...
    swh_ssb_cor_source                     (num_lines, num_pixels) float32 dask.array<chunksize=(9866, 69), meta=np.ndarray>
    swh_ssb_cor_source_2                   (num_lines, num_pixels) float32 dask.array<chunksize=(9866, 69), meta=np.ndarray>
    wind_speed_ssb_cor_source              (num_lines, num_pixels) float32 dask.array<chunksize=(9866, 69), meta=np.ndarray>
    wind_speed_ssb_cor_source_2            (num_lines, num_pixels) float32 dask.array<chunksize=(9866, 69), meta=np.ndarray>
    volumetric_correlation                 (num_lines, num_pixels) float32 dask.array<chunksize=(9866, 69), meta=np.ndarray>
    volumetric_correlation_uncert          (num_lines, num_pixels) float32 dask.array<chunksize=(9866, 69), meta=np.ndarray>
Attributes: (12/62)
    Conventions:                                   CF-1.7
    title:                                         Level 2 Low Rate Sea Surfa...
    institution:                                   CNES
    source:                                        Ka-band radar interferometer
    history:                                       2024-02-03T22:27:17Z : Cre...
    platform:                                      SWOT
    ...                                            ...
    ellipsoid_semi_major_axis:                     6378137.0
    ellipsoid_flattening:                          0.0033528106647474805
    good_ocean_data_percent:                       76.4772191457865
    ssha_variance:                                 0.4263933333980923
    references:                                    V1.2.1
    equator_longitude:                             -5.36
    • Conventions :
      CF-1.7
      title :
      Level 2 Low Rate Sea Surface Height Data Product - Expert SSH with Wind and Wave
      institution :
      CNES
      source :
      Ka-band radar interferometer
      history :
      2024-02-03T22:27:17Z : Creation
      platform :
      SWOT
      reference_document :
      D-56407_SWOT_Product_Description_L2_LR_SSH
      contact :
      podaac@jpl.nasa.gov
      cycle_number :
      10
      pass_number :
      210
      equator_time :
      2024-02-01T11:57:39.935000Z
      short_name :
      L2_LR_SSH
      product_file_id :
      Expert
      crid :
      PIC0
      product_version :
      01
      pge_name :
      PGE_L2_LR_SSH
      pge_version :
      5.0.2
      time_coverage_start :
      2024-02-01T11:31:57.844839
      time_coverage_end :
      2024-02-01T12:23:25.880560
      geospatial_lon_min :
      270.91792399999997
      geospatial_lon_max :
      78.362457
      geospatial_lat_min :
      -78.271942
      geospatial_lat_max :
      78.27206799999999
      left_first_longitude :
      270.91792399999997
      left_first_latitude :
      78.27200599999999
      left_last_longitude :
      78.343086
      left_last_latitude :
      -77.05370099999999
      right_first_longitude :
      270.93575
      right_first_latitude :
      77.053837
      right_last_longitude :
      78.36245699999999
      right_last_latitude :
      -78.27186999999999
      wavelength :
      0.008385803020979021
      transmit_antenna :
      minus_y
      xref_l1b_lr_intf_file :
      SWOT_L1B_LR_INTF_010_210_20240201T113154_20240201T122329_PIC0_01.nc
      xref_l2_nalt_gdr_files :
      SWOT_IPN_2PfP010_209_20240201_104031_20240201_113158.nc, SWOT_IPN_2PfP010_210_20240201_113158_20240201_122325.nc, SWOT_IPN_2PfP010_211_20240201_122325_20240201_131452.nc
      xref_l2_rad_gdr_files :
      SWOT_IPRAD_2PaP010_209_20240201_104027_20240201_113202_PIC0_01.nc, SWOT_IPRAD_2PaP010_210_20240201_113154_20240201_122329_PIC0_01.nc, SWOT_IPRAD_2PaP010_211_20240201_122321_20240201_131455_PIC0_01.nc
      xref_int_lr_xover_cal_file :
      SWOT_INT_LR_XOverCal_20240131T233132_20240201T233223_PIC0_01.nc
      xref_statickarincal_files :
      SWOT_StaticKaRInCalAdjustableParam_20000101T000000_20991231T235959_20230823T210000_v106.nc
      xref_param_l2_lr_precalssh_file :
      SWOT_Param_L2_LR_PreCalSSH_20000101T000000_20991231T235959_20230815T120500_v301.nc
      xref_orbit_ephemeris_file :
      SWOT_POR_AXVCNE20240202_103657_20240131_225923_20240202_005923.nc
      xref_reforbittrack_files :
      SWOT_RefOrbitTrack125mPass1_Nom_20000101T000000_21000101T000000_20200617T193054_v101.txt, SWOT_RefOrbitTrack125mPass2_Nom_20000101T000000_21000101T000000_20200617T193054_v101.txt
      xref_meteorological_sealevel_pressure_files :
      SMM_PMA_AXVCNE20240201_164030_20240201_060000_20240201_060000, SMM_PMA_AXVCNE20240201_171613_20240201_120000_20240201_120000, SMM_PMA_AXVCNE20240202_030828_20240201_180000_20240201_180000
      xref_meteorological_wettroposphere_files :
      SMM_WEA_AXVCNE20240201_164030_20240201_060000_20240201_060000, SMM_WEA_AXVCNE20240201_171613_20240201_120000_20240201_120000, SMM_WEA_AXVCNE20240202_030828_20240201_180000_20240201_180000
      xref_meteorological_wind_files :
      SMM_VWA_AXVCNE20240201_164030_20240201_060000_20240201_060000, SMM_UWA_AXVCNE20240201_164030_20240201_060000_20240201_060000, SMM_UWA_AXVCNE20240201_171613_20240201_120000_20240201_120000, SMM_VWA_AXVCNE20240201_171613_20240201_120000_20240201_120000, SMM_UWA_AXVCNE20240202_030828_20240201_180000_20240201_180000, SMM_VWA_AXVCNE20240202_030828_20240201_180000_20240201_180000
      xref_meteorological_surface_pressure_files :
      SMM_PSA_AXVCNE20240201_174042_20240201_060000_20240201_060000, SMM_PSA_AXVCNE20240201_174042_20240201_120000_20240201_120000, SMM_PSA_AXVCNE20240202_054023_20240201_180000_20240201_180000
      xref_meteorological_temperature_files :
      SMM_T2M_AXPCNE20240201_174042_20240201_060000_20240201_060000.grb, SMM_T2M_AXPCNE20240201_174042_20240201_120000_20240201_120000.grb, SMM_T2M_AXPCNE20240202_054023_20240201_180000_20240201_180000.grb
      xref_meteorological_water_vapor_files :
      SMM_CWV_AXPCNE20240201_174042_20240201_060000_20240201_060000.grb, SMM_CWV_AXPCNE20240201_174042_20240201_120000_20240201_120000.grb, SMM_CWV_AXPCNE20240202_054023_20240201_180000_20240201_180000.grb
      xref_meteorological_cloud_liquid_water_files :
      SMM_CLW_AXPCNE20240201_174042_20240201_060000_20240201_060000.grb, SMM_CLW_AXPCNE20240201_174042_20240201_120000_20240201_120000.grb, SMM_CLW_AXPCNE20240202_054023_20240201_180000_20240201_180000.grb
      xref_model_significant_wave_height_files :
      SMM_SWH_AXPCNE20240201_174042_20240201_060000_20240201_060000.grb, SMM_SWH_AXPCNE20240201_174042_20240201_120000_20240201_120000.grb, SMM_SWH_AXPCNE20240202_054023_20240201_180000_20240201_180000.grb
      xref_gim_files :
      JPLQ0320.24I
      xref_pole_location_file :
      SMM_PO1_AXXCNE20240203_020000_19900101_000000_20240801_000000
      xref_dac_files :
      SMM_MOG_AXPCNE20240201_203002_20240201_060000_20240201_060000, SMM_MOG_AXPCNE20240201_203002_20240201_120000_20240201_120000, SMM_MOG_AXPCNE20240202_074502_20240201_180000_20240201_180000
      xref_precipitation_files :
      SMM_LSR_AXFCNE20240201_065551_20240201_060000_20240201_060000.grb, SMM_CRR_AXFCNE20240201_065551_20240201_060000_20240201_060000.grb, SMM_LSR_AXFCNE20240201_185554_20240201_120000_20240201_120000.grb, SMM_CRR_AXFCNE20240201_185554_20240201_120000_20240201_120000.grb, SMM_LSR_AXFCNE20240201_185554_20240201_180000_20240201_180000.grb, SMM_CRR_AXFCNE20240201_185554_20240201_180000_20240201_180000.grb
      xref_sea_ice_mask_files :
      SMM_ICS_AXFCNE20240202_042003_20240201_000000_20240201_235959.nc, SMM_ICN_AXFCNE20240202_041506_20240201_000000_20240201_235959.nc
      xref_wave_model_files :
      SMM_WMA_AXPCNE20240202_072016_20240201_030000_20240202_000000.grb
      xref_geco_database_version :
      v102
      ellipsoid_semi_major_axis :
      6378137.0
      ellipsoid_flattening :
      0.0033528106647474805
      good_ocean_data_percent :
      76.4772191457865
      ssha_variance :
      0.4263933333980923
      references :
      V1.2.1
      equator_longitude :
      -5.36
Cross Over Calibration Correction

To get the corrected SSHA, add the crossover calibration correction (height_cor_xover) to ssha_karin and store the result as a new variable:

      ds['ssha_karin_corrected'] = ds.ssha_karin + ds.height_cor_xover
      ds.ssha_karin_corrected
      <xarray.DataArray 'ssha_karin_corrected' (num_lines: 98660, num_pixels: 69)>
      dask.array<add, shape=(98660, 69), dtype=float64, chunksize=(9866, 69), chunktype=numpy.ndarray>
      Coordinates:
          latitude         (num_lines, num_pixels) float64 dask.array<chunksize=(9866, 69), meta=np.ndarray>
          longitude        (num_lines, num_pixels) float64 dask.array<chunksize=(9866, 69), meta=np.ndarray>
          latitude_nadir   (num_lines) float64 dask.array<chunksize=(9866,), meta=np.ndarray>
          longitude_nadir  (num_lines) float64 dask.array<chunksize=(9866,), meta=np.ndarray>
      Dimensions without coordinates: num_lines, num_pixels
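
The correction above is an elementwise sum, so any pixel where either input is missing (NaN) is also missing in the result. A minimal sketch with plain Python floats shows the same behaviour; the values are hypothetical and stand in for a handful of pixels.

```python
import math

# Hypothetical values in metres; NaN stands in for a missing measurement.
ssha_karin = [0.12, -0.05, math.nan, 0.30]
height_cor_xover = [0.01, math.nan, 0.02, -0.02]

# Elementwise sum, analogous to ds.ssha_karin + ds.height_cor_xover.
ssha_karin_corrected = [s + c for s, c in zip(ssha_karin, height_cor_xover)]

# A pixel is usable only if both inputs were valid.
valid = [not math.isnan(v) for v in ssha_karin_corrected]
print(ssha_karin_corrected)
print(valid)
```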
Plot

        plt.figure(figsize=(15, 5))
        ax = plt.axes(projection=ccrs.PlateCarree())
        ax.set_global()
        # plot the crossover-corrected SSHA computed above
        ds.ssha_karin_corrected.plot.pcolormesh(
            ax=ax, transform=ccrs.PlateCarree(), x="longitude", y="latitude",
            vmin=-1, vmax=1, cmap='coolwarm', add_colorbar=True
        )
        ax.coastlines()
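
The global attributes listed earlier report longitudes such as 270.9, which suggests this product uses a 0 to 360 degree convention. If you need to compare against data on a -180 to 180 grid, a small helper like the one below could be used. This is a sketch: wrap_longitude is a hypothetical name, not part of xarray or earthaccess.

```python
def wrap_longitude(lon_deg: float) -> float:
    """Map a longitude in degrees onto the half-open interval [-180, 180)."""
    return (lon_deg + 180.0) % 360.0 - 180.0


# Values taken from the geospatial_lon_min / geospatial_lon_max attributes above.
for lon in (270.917924, 78.362457):
    print(lon, "->", round(wrap_longitude(lon), 6))
```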

2. SWOT Level 2 Nadir Altimeter Interim Geophysical Data Record with Waveforms - SSHA Version 2.0

Access Files without any Downloads to your running instance

Here, we use the earthaccess Python library to search for and then load the data directly into xarray without downloading any files.

        # retrieve the granules for the time window we want
        nadir_results = earthaccess.search_data(short_name = 'SWOT_L2_NALT_IGDR_SSHA_2.0', temporal = ("2024-01-30 12:00:00", "2024-01-30 19:43:00"))
        Granules found: 10
        # print the direct (in-region S3) link for each granule
        for g in nadir_results:
            print(g.data_links(access='direct'))
        ['s3://podaac-swot-ops-cumulus-protected/SWOT_L2_NALT_IGDR_2.0/SWOT_IPR_2PfP010_154_20240130_113056_20240130_122223.nc']
        ['s3://podaac-swot-ops-cumulus-protected/SWOT_L2_NALT_IGDR_2.0/SWOT_IPR_2PfP010_155_20240130_122223_20240130_131350.nc']
        ['s3://podaac-swot-ops-cumulus-protected/SWOT_L2_NALT_IGDR_2.0/SWOT_IPR_2PfP010_156_20240130_131350_20240130_140516.nc']
        ['s3://podaac-swot-ops-cumulus-protected/SWOT_L2_NALT_IGDR_2.0/SWOT_IPR_2PfP010_157_20240130_140516_20240130_145643.nc']
        ['s3://podaac-swot-ops-cumulus-protected/SWOT_L2_NALT_IGDR_2.0/SWOT_IPR_2PfP010_158_20240130_145643_20240130_154810.nc']
        ['s3://podaac-swot-ops-cumulus-protected/SWOT_L2_NALT_IGDR_2.0/SWOT_IPR_2PfP010_159_20240130_154810_20240130_163937.nc']
        ['s3://podaac-swot-ops-cumulus-protected/SWOT_L2_NALT_IGDR_2.0/SWOT_IPR_2PfP010_160_20240130_163937_20240130_173104.nc']
        ['s3://podaac-swot-ops-cumulus-protected/SWOT_L2_NALT_IGDR_2.0/SWOT_IPR_2PfP010_161_20240130_173104_20240130_182230.nc']
        ['s3://podaac-swot-ops-cumulus-protected/SWOT_L2_NALT_IGDR_2.0/SWOT_IPR_2PfP010_162_20240130_182230_20240130_191357.nc']
        ['s3://podaac-swot-ops-cumulus-protected/SWOT_L2_NALT_IGDR_2.0/SWOT_IPR_2PfP010_163_20240130_191357_20240130_200524.nc']
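
Each direct link is an S3 URI made of a bucket name and an object key. The sketch below splits one of the links printed above using only the standard library. Reading the cycle and pass numbers out of the file name is an assumption inferred from the names shown in this notebook (for example, the KaRIn file with cycle_number 10 and pass_number 210 cross-references SWOT_IPN_2PfP010_210_...), not a documented parser.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse


def split_s3_uri(uri: str) -> tuple[str, str]:
    """Split an s3://bucket/key URI into (bucket, key)."""
    parsed = urlparse(uri)
    if parsed.scheme != "s3":
        raise ValueError(f"not an S3 URI: {uri}")
    return parsed.netloc, parsed.path.lstrip("/")


# One of the direct links printed above.
uri = ("s3://podaac-swot-ops-cumulus-protected/SWOT_L2_NALT_IGDR_2.0/"
       "SWOT_IPR_2PfP010_154_20240130_113056_20240130_122223.nc")

bucket, key = split_s3_uri(uri)
filename = PurePosixPath(key).name

# Assumed layout: the third field ends with the 3-digit cycle, the fourth is the pass.
fields = PurePosixPath(key).stem.split("_")
cycle_number = int(fields[2][-3:])
pass_number = int(fields[3])

print(bucket)
print(filename)
print(cycle_number, pass_number)
```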
        # open the granules and load them into an xarray dataset; 'group' must be specified for xarray to read these files
        ds_nadir = xr.open_mfdataset(earthaccess.open(nadir_results), combine='nested', concat_dim="time", decode_times=False, engine='h5netcdf', group='data_01')
        ds_nadir
        Opening 10 granules, approx size: 0.0 GB
        using endpoint: https://archive.swot.podaac.earthdata.nasa.gov/s3credentials
        <xarray.Dataset>
        Dimensions:                            (time: 27927)
        Coordinates:
          * time                               (time) float64 7.599e+08 ... 7.6e+08
            latitude                           (time) float64 dask.array<chunksize=(2806,), meta=np.ndarray>
            longitude                          (time) float64 dask.array<chunksize=(2806,), meta=np.ndarray>
        Data variables: (12/31)
            time_tai                           (time) float64 dask.array<chunksize=(2806,), meta=np.ndarray>
            surface_classification_flag        (time) float32 dask.array<chunksize=(2806,), meta=np.ndarray>
            rad_side_1_surface_type_flag       (time) float32 dask.array<chunksize=(2806,), meta=np.ndarray>
            rad_side_2_surface_type_flag       (time) float32 dask.array<chunksize=(2806,), meta=np.ndarray>
            alt_qual                           (time) float32 dask.array<chunksize=(2806,), meta=np.ndarray>
            rad_qual                           (time) float32 dask.array<chunksize=(2806,), meta=np.ndarray>
            ...                                 ...
            pole_tide                          (time) float32 dask.array<chunksize=(2806,), meta=np.ndarray>
            internal_tide_hret                 (time) float32 dask.array<chunksize=(2806,), meta=np.ndarray>
            wind_speed_alt                     (time) float32 dask.array<chunksize=(2806,), meta=np.ndarray>
            wind_speed_alt_mle3                (time) float32 dask.array<chunksize=(2806,), meta=np.ndarray>
            rad_water_vapor                    (time) float32 dask.array<chunksize=(2806,), meta=np.ndarray>
            rad_cloud_liquid_water             (time) float32 dask.array<chunksize=(2806,), meta=np.ndarray>
        • time
          PandasIndex
          PandasIndex(Float64Index([759929456.5671339, 759929457.6690049,  759929458.770874,
                         759929459.872746, 759929460.9746141, 759929462.0764852,
                        759929463.1783538, 759929464.2802248, 759929465.3820939,
                        759929466.4839649,
                        ...
                        759960314.5902781, 759960315.6921468, 759960316.7940178,
                        759960317.8958869, 759960318.9977579,  759960320.099627,
                         759960321.201498, 759960322.3033671, 759960323.4052382,
                        759960324.5071082],
                       dtype='float64', name='time', length=27927))
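
Because the files are opened with decode_times=False, the time values above are plain floating-point seconds. The sketch below converts the first value to a calendar date, assuming the epoch is 2000-01-01 00:00:00 UTC. That epoch is an assumption not stated in this notebook (check the units attribute of the time variable), although the result agrees with the start time in the first granule's file name (20240130_113056).

```python
from datetime import datetime, timedelta, timezone

# Assumed reference epoch; confirm against the 'units' attribute of the time variable.
ASSUMED_EPOCH = datetime(2000, 1, 1, tzinfo=timezone.utc)


def seconds_to_datetime(seconds: float) -> datetime:
    """Convert seconds since the assumed epoch to a timezone-aware datetime."""
    return ASSUMED_EPOCH + timedelta(seconds=seconds)


# First value of the time index shown above.
first = seconds_to_datetime(759929456.5671339)
print(first.isoformat())
```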
Plot

        plt.figure(figsize=(15, 5))
        ax = plt.axes(projection=ccrs.PlateCarree())
        ax.set_global()
        ax.coastlines()
        plt.scatter(x=ds_nadir.longitude, y=ds_nadir.latitude, c=ds_nadir.depth_or_elevation, marker='.')
        plt.colorbar().set_label('Depth or Elevation (m)')

A final word…

Accessing data directly from S3 and working with it in memory is affected by several factors.

1. The format of the data: archive formats like NetCDF, GeoTIFF, and HDF versus cloud-optimized data structures (Zarr, kerchunk, COG). Cloud-optimized formats are designed for reading only the pieces of data needed at the time of the request (e.g. a subset or a single timestep).
2. Tools like xarray make a lot of assumptions about how to open and read a file. Sometimes the internals don’t fit the xarray ‘mould’, and we need to continue working with data providers and software providers to make these two sides work together. Level 2 (non-gridded) data, specifically, suffers from some of these assumptions.