How do I access data stored in Earthdata Cloud in Python?

Direct Access

Once you have found the data you want to use, you have two options: download the data to work locally, or access the data directly to work in the cloud. The second option is called “Direct Cloud Access” or simply “Direct Access”. For the code below to run successfully, your compute instance must be in the Amazon Web Services (AWS) us-west-2 Region. We authenticate with an Earthdata Login stored in a netrc file; see the appendix for more information on Earthdata Login and netrc setup.
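Because authentication relies on a netrc file, it can help to confirm that the file contains an Earthdata Login entry before running anything else. The following is a minimal sketch using only the Python standard library. The helper name `has_earthdata_entry` is hypothetical, and the sketch assumes the file lives at `~/.netrc` (on Windows it is often named `_netrc`, in which case pass the path explicitly).

```python
import netrc
from pathlib import Path

# Host name of the Earthdata Login service.
EARTHDATA_HOST = "urs.earthdata.nasa.gov"


def has_earthdata_entry(netrc_path=None):
    """Return True if the netrc file holds a login and password for Earthdata Login.

    Hypothetical helper for illustration; defaults to ~/.netrc when no path is given.
    """
    path = Path(netrc_path) if netrc_path else Path.home() / ".netrc"
    if not path.is_file():
        return False
    try:
        entry = netrc.netrc(str(path)).authenticators(EARTHDATA_HOST)
    except netrc.NetrcParseError:
        # A malformed file is treated the same as a missing entry.
        return False
    if entry is None:
        return False
    login, _account, password = entry
    return bool(login) and bool(password)
```

Calling `has_earthdata_entry()` with no argument checks the default location. A `False` result suggests revisiting the netrc setup described in the appendix.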

Python

We can use the earthaccess Python library to search for the files and open them, then read them with the xarray library.

# Import packages
import earthaccess
import xarray as xr

# Authenticate with Earthdata Login using credentials from a netrc file
auth = earthaccess.login(strategy="netrc")
# Access land ice height from ATLAS/ICESat-2 V006 (10.5067/ATLAS/ATL06.006),
# searching over the western Greenland coast for two weeks in July 2022.
# The data are provided as HDF5 granules (files) that each span about 1/14th of an orbit.
results = earthaccess.search_data(short_name="ATL06",
                                  version="006",
                                  cloud_hosted=True,
                                  temporal=("2022-07-17", "2022-07-31"),
                                  bounding_box=(-51.96423, 68.10554, -48.71969, 70.70529))
Granules found: 5
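The search above takes a bounding box ordered as (west, south, east, north) in decimal degrees and a (start, end) pair of dates. A swapped latitude or a reversed date range can silently return no granules, so a small validation step before searching may be useful. The sketch below uses only the standard library. `build_search_kwargs` is a hypothetical helper, not part of earthaccess, and the final call to `earthaccess.search_data` is left as a comment because it needs network access and authentication.

```python
from datetime import date


def build_search_kwargs(short_name, version, bounding_box, temporal):
    """Validate search inputs and return keyword arguments for a granule search.

    bounding_box is (west, south, east, north) in decimal degrees;
    temporal is a (start, end) pair of ISO dates such as "2022-07-17".
    """
    west, south, east, north = bounding_box
    if not (-180 <= west <= 180 and -180 <= east <= 180):
        raise ValueError("longitudes must lie in [-180, 180]")
    if not (-90 <= south < north <= 90):
        raise ValueError("latitudes must satisfy -90 <= south < north <= 90")
    start, end = (date.fromisoformat(d) for d in temporal)
    if start > end:
        raise ValueError("temporal start must not be after temporal end")
    return {
        "short_name": short_name,
        "version": version,
        "cloud_hosted": True,
        "temporal": tuple(temporal),
        "bounding_box": (west, south, east, north),
    }


kwargs = build_search_kwargs(
    "ATL06",
    "006",
    bounding_box=(-51.96423, 68.10554, -48.71969, 70.70529),
    temporal=("2022-07-17", "2022-07-31"),
)
# With earthaccess installed and authenticated:
# results = earthaccess.search_data(**kwargs)
```

The helper deliberately does not require west to be less than east, so a box that crosses the antimeridian is not rejected.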
# Use xarray to load the data as a multifile dataset for a single group in the HDF5 file, in this case the land ice segments of beam gt1l:
ds = xr.open_mfdataset(earthaccess.open(results), group='/gt1l/land_ice_segments')
ds
 Opening 5 granules, approx size: 0.0 GB
<xarray.Dataset>
Dimensions:                (delta_time: 241711)
Coordinates:
  * delta_time             (delta_time) datetime64[ns] 2022-07-18T01:00:46.67...
    latitude               (delta_time) float64 dask.array<chunksize=(78325,), meta=np.ndarray>
    longitude              (delta_time) float64 dask.array<chunksize=(78325,), meta=np.ndarray>
Data variables:
    atl06_quality_summary  (delta_time) int8 dask.array<chunksize=(78325,), meta=np.ndarray>
    h_li                   (delta_time) float32 dask.array<chunksize=(78325,), meta=np.ndarray>
    h_li_sigma             (delta_time) float32 dask.array<chunksize=(78325,), meta=np.ndarray>
    segment_id             (delta_time) float64 dask.array<chunksize=(78325,), meta=np.ndarray>
    sigma_geo_h            (delta_time) float32 dask.array<chunksize=(78325,), meta=np.ndarray>
Attributes:
    Description:  The land_ice_height group contains the primary set of deriv...
    data_rate:    Data within this group are sparse.  Data values are provide...
Description:
    The land_ice_height group contains the primary set of derived ATL06 products. This includes geolocation, height, and standard error and quality measures for each segment. This group is sparse, meaning that parameters are provided only for pairs of segments for which at least one beam has a valid surface-height measurement.
data_rate:
    Data within this group are sparse. Data values are provided only for those ICESat-2 20m segments where at least one beam has a valid land ice height measurement.
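The example above reads a single group, `/gt1l/land_ice_segments`, which belongs to one of the six ICESat-2 beams. The sketch below builds the matching group path for every beam, assuming that the other beams (`gt1r`, `gt2l`, `gt2r`, `gt3l`, `gt3r`) use the same `land_ice_segments` layout. `BEAMS` and `land_ice_group` are hypothetical names introduced for illustration.

```python
# ICESat-2 ground tracks: three beam pairs (1-3), each with a left and a right beam.
BEAMS = [f"gt{pair}{side}" for pair in (1, 2, 3) for side in ("l", "r")]


def land_ice_group(beam):
    """Return the HDF5 group path holding the land ice segments of one beam."""
    if beam not in BEAMS:
        raise ValueError(f"unknown beam {beam!r}; expected one of {BEAMS}")
    return f"/{beam}/land_ice_segments"


# Map each beam name to its group path, e.g. "gt1l" -> "/gt1l/land_ice_segments".
group_paths = {beam: land_ice_group(beam) for beam in BEAMS}
```

Each path in `group_paths` could then be passed as the `group` argument of `xr.open_mfdataset`, in the same way as the single-beam call shown earlier.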
End User License Agreement (EULA)

Sometimes, accessing data in NASA Earthdata Cloud requires accepting an End User License Agreement (EULA). If you cannot access a dataset, this may be the cause. See these instructions for how to authorize EULAs.

Alternative Access Method without earthaccess

An alternative approach to accessing data is outlined in some notebooks in the Earthdata Cloud Cookbook Appendix. The earthaccess package uses these methods for its back end. See this GitHub folder.
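One common pattern for direct access without earthaccess is to request temporary S3 credentials from the data center and pass them to an S3 file system library such as s3fs. This may not match the Appendix notebooks exactly, so treat the following as a sketch under stated assumptions. It assumes the credentials arrive as JSON with `accessKeyId`, `secretAccessKey`, and `sessionToken` fields. The helper `s3fs_kwargs_from_credentials` is hypothetical, and the response shown contains placeholder values only.

```python
import json


def s3fs_kwargs_from_credentials(creds):
    """Map a temporary-credentials JSON object to s3fs.S3FileSystem keyword arguments."""
    required = ("accessKeyId", "secretAccessKey", "sessionToken")
    missing = [k for k in required if k not in creds]
    if missing:
        raise KeyError(f"credentials response is missing: {missing}")
    return {
        "key": creds["accessKeyId"],
        "secret": creds["secretAccessKey"],
        "token": creds["sessionToken"],
    }


# Placeholder values only; a real response comes from the data center's credentials endpoint.
example_response = json.loads(
    '{"accessKeyId": "EXAMPLE-KEY-ID", "secretAccessKey": "example-secret", '
    '"sessionToken": "example-token", "expiration": "2022-07-31 00:00:00+00:00"}'
)
fs_kwargs = s3fs_kwargs_from_credentials(example_response)
# With s3fs installed: fs = s3fs.S3FileSystem(**fs_kwargs)
```

Temporary credentials expire, so a long-running session would need to request new ones and rebuild the file system object.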