How do I access data stored in Earthdata Cloud in Python?
Direct Access
When you have found the data you want to use, you have two options: download the data to work locally, or access the data directly to work in the cloud. The second approach is called “Direct Cloud Access” or simply “Direct Access”. For the code below to run successfully, your compute instance must be in Amazon Web Services (AWS) Region us-west-2. We authenticate using a netrc file and an Earthdata Login; see the appendix for more information on Earthdata Login and netrc setup.
Python
We can use the earthaccess Python library to find the file URLs and then open them with the xarray library.
# Import packages
import earthaccess
import xarray as xr

# Authenticate with Earthdata Login
auth = earthaccess.login(strategy="netrc")
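The netrc strategy reads Earthdata Login credentials from a netrc file, so a missing or incomplete file is a common cause of login failures. The following is a minimal sketch, using only the Python standard library, of a check you could run before logging in. The helper name `has_earthdata_entry` is hypothetical, and the default location `~/.netrc` is an assumption (some Windows setups use `_netrc` instead).

```python
import netrc
from pathlib import Path

# Host name used by Earthdata Login.
EARTHDATA_HOST = "urs.earthdata.nasa.gov"


def has_earthdata_entry(netrc_path=None):
    """Return True if the netrc file has an entry for Earthdata Login.

    Hypothetical helper for illustration; it is not part of earthaccess.
    """
    # Assumption: the file lives at ~/.netrc unless a path is given.
    path = Path(netrc_path) if netrc_path else Path.home() / ".netrc"
    if not path.is_file():
        return False
    try:
        entries = netrc.netrc(str(path))
    except netrc.NetrcParseError:
        # The file exists but is malformed.
        return False
    # authenticators() returns None when there is no matching entry.
    return entries.authenticators(EARTHDATA_HOST) is not None


if __name__ == "__main__":
    print("Earthdata entry found:", has_earthdata_entry())
```

If the check reports that no entry was found, follow the netrc setup instructions in the appendix before calling `earthaccess.login`.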
# Access land ice height from ATLAS/ICESat-2 V006 (10.5067/ATLAS/ATL06.006),
# searching over the western Greenland coast for two weeks in July 2022.
# The data are provided as HDF5 granules (files) that span about 1/14th of an orbit.
results = earthaccess.search_data(
    short_name="ATL06",
    version="006",
    cloud_hosted=True,
    temporal=("2022-07-17", "2022-07-31"),
    bounding_box=(-51.96423, 68.10554, -48.71969, 70.70529),
)
Granules found: 5
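The `bounding_box` tuple in the search is ordered as lower-left longitude, lower-left latitude, upper-right longitude, upper-right latitude, that is (west, south, east, north). Swapped or out-of-range values are an easy mistake to make, so here is a small sketch of a sanity check you could run before searching. The helper `check_bounding_box` is hypothetical and not part of earthaccess; it deliberately does not compare west against east, on the assumption that a box crossing the antimeridian may be intended.

```python
def check_bounding_box(bbox):
    """Validate a (west, south, east, north) tuple in decimal degrees.

    Hypothetical helper for illustration; returns the tuple unchanged
    if it looks plausible, otherwise raises ValueError.
    """
    if len(bbox) != 4:
        raise ValueError("bounding box needs 4 values: west, south, east, north")
    west, south, east, north = bbox
    for lon in (west, east):
        if not -180.0 <= lon <= 180.0:
            raise ValueError(f"longitude out of range: {lon}")
    for lat in (south, north):
        if not -90.0 <= lat <= 90.0:
            raise ValueError(f"latitude out of range: {lat}")
    if south > north:
        raise ValueError("south latitude is greater than north latitude")
    return tuple(bbox)


# The western Greenland box used in the search.
greenland_bbox = check_bounding_box((-51.96423, 68.10554, -48.71969, 70.70529))
```

The validated tuple can then be passed as `bounding_box=greenland_bbox` to `earthaccess.search_data`.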
# Use xarray to load the data as a multifile dataset for a single group
# in the HDF5 file, in this case land ice segments:
ds = xr.open_mfdataset(earthaccess.open(results), group='/gt1l/land_ice_segments')
ds
Opening 5 granules, approx size: 0.0 GB
<xarray.Dataset>
Dimensions:                (delta_time: 241711)
Coordinates:
  * delta_time             (delta_time) datetime64[ns] 2022-07-18T01:00:46.67...
    latitude               (delta_time) float64 dask.array<chunksize=(78325,), meta=np.ndarray>
    longitude              (delta_time) float64 dask.array<chunksize=(78325,), meta=np.ndarray>
Data variables:
    atl06_quality_summary  (delta_time) int8 dask.array<chunksize=(78325,), meta=np.ndarray>
    h_li                   (delta_time) float32 dask.array<chunksize=(78325,), meta=np.ndarray>
    h_li_sigma             (delta_time) float32 dask.array<chunksize=(78325,), meta=np.ndarray>
    segment_id             (delta_time) float64 dask.array<chunksize=(78325,), meta=np.ndarray>
    sigma_geo_h            (delta_time) float32 dask.array<chunksize=(78325,), meta=np.ndarray>
Attributes:
    Description:  The land_ice_height group contains the primary set of deriv...
    data_rate:    Data within this group are sparse. Data values are provide...
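The `atl06_quality_summary` variable in this dataset is a flag whose values are 0 (best_quality) and 1 (potential_problem). A common next step is to keep only the segments flagged 0. The sketch below shows one way to do that with xarray's `where`, using a tiny synthetic dataset (the values are made up for illustration) so that it runs without cloud access; the same call should apply to `ds` from the cell above. Note that selecting only zeros is conservative and can discard usable data, particularly in cloudy, rough, or low-surface-reflectance conditions.

```python
import numpy as np
import xarray as xr

# Synthetic stand-in for the land ice segments group (made-up values).
ds_demo = xr.Dataset(
    {
        "h_li": ("delta_time", np.array([100.0, 101.5, 250.0, 99.25], dtype="float32")),
        "atl06_quality_summary": ("delta_time", np.array([0, 0, 1, 0], dtype="int8")),
    },
    coords={"delta_time": np.arange(4)},
)

# Keep only segments where no data-quality test found a problem.
# drop=True removes the masked segments instead of filling them with NaN.
best_quality = ds_demo.where(ds_demo["atl06_quality_summary"] == 0, drop=True)

print(best_quality["h_li"].values)
```

Because `where` fills masked positions with NaN before dropping them, integer variables such as the flag itself are promoted to floating point in the result.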
Variable details from the expanded dataset view (delta_time: 241711):

delta_time (delta_time) datetime64[ns] 2022-07-18T01:00:46.678760592 .....
    contentType: referenceInformation
    description: Number of GPS seconds since the ATLAS SDP epoch. The ATLAS Standard Data Products (SDP) epoch offset is defined within /ancillary_data/atlas_sdp_gps_epoch as the number of GPS seconds between the GPS epoch (1980-01-06T00:00:00.000000Z UTC) and the ATLAS SDP epoch. By adding the offset contained within atlas_sdp_gps_epoch to delta time parameters, the time in gps_seconds relative to the GPS epoch can be computed.

atl06_quality_summary (delta_time) int8
    description: The ATL06_quality_summary parameter indicates the best-quality subset of all ATL06 data. A zero in this parameter implies that no data-quality tests have found a problem with the segment, a one implies that some potential problem has been found. Users who select only segments with zero values for this flag can be relatively certain of obtaining high-quality data, but will likely miss a significant fraction of usable data, particularly in cloudy, rough, or low-surface-reflectance conditions.
    flag_meanings: best_quality potential_problem
    flag_values: [0 1]
    long_name: ATL06_Quality_Summary
    source: section 4.3
    units: 1
    valid_max: 1
    valid_min: 0
    array: 236.05 kiB, shape (241711,); chunk: 76.49 kiB, shape (78325,); 5 chunks in 11 graph layers; int8 numpy.ndarray

h_li (delta_time) float32
    contentType: physicalMeasurement
    description: Standard land-ice segment height determined by land ice algorithm, corrected for first-photon bias, representing the median-based height of the selected PEs
    long_name: Land Ice height
    source: section 4.4
    units: meters
    array: 0.92 MiB, shape (241711,); chunk: 305.96 kiB, shape (78325,); 5 chunks in 11 graph layers; float32 numpy.ndarray

h_li_sigma (delta_time) float32
    contentType: qualityInformation
    description: Propagated error due to sampling error and FPB correction from the land ice algorithm
    long_name: Expected RMS segment misfit
    source: section 4.4
    units: meters
    array: 0.92 MiB, shape (241711,); chunk: 305.96 kiB, shape (78325,); 5 chunks in 11 graph layers; float32 numpy.ndarray

segment_id (delta_time) float64
    contentType: referenceInformation
    description: Segment number, counting from the equator. Equal to the segment_id for the second of the two 20m ATL03 segments included in the 40m ATL06 segment
    long_name: Reference Point, m
    source: section 3.1.2.1
    units: 1
    array: 1.84 MiB, shape (241711,); chunk: 611.91 kiB, shape (78325,); 5 chunks in 11 graph layers; float64 numpy.ndarray

sigma_geo_h (delta_time) float32
    contentType: qualityInformation
    description: Total vertical geolocation error due to PPD and POD, including the effects of horizontal geolocation error on the segment vertical error.

Attributes:
    Description: The land_ice_height group contains the primary set of derived ATL06 products. This includes geolocation, height, and standard error and quality measures for each segment. This group is sparse, meaning that parameters are provided only for pairs of segments for which at least one beam has a valid surface-height measurement.
    data_rate: Data within this group are sparse. Data values are provided only for those ICESat-2 20m segments where at least one beam has a valid land ice height measurement.
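The delta_time description explains that adding the offset stored in /ancillary_data/atlas_sdp_gps_epoch to a delta_time value gives GPS seconds relative to the GPS epoch (1980-01-06). In the output shown here xarray has already decoded delta_time to datetime64, so this arithmetic only matters when you work with the raw values. The following is a minimal sketch of the calculation using the standard library. The function names are hypothetical, and the offset value in the example (1198800018 seconds) is an assumption for illustration; read the actual value from the file. The resulting datetime is on the GPS timescale and is not corrected for leap seconds.

```python
from datetime import datetime, timedelta

# GPS epoch as stated in the delta_time description.
GPS_EPOCH = datetime(1980, 1, 6, 0, 0, 0)


def delta_time_to_gps_seconds(delta_time_s, atlas_sdp_gps_epoch_s):
    """Add the ATLAS SDP epoch offset to a raw delta_time (both in seconds)."""
    return delta_time_s + atlas_sdp_gps_epoch_s


def gps_seconds_to_gps_datetime(gps_seconds):
    """Convert GPS seconds to a datetime on the GPS timescale.

    No leap-second correction is applied, so this is not UTC.
    """
    return GPS_EPOCH + timedelta(seconds=gps_seconds)


# Assumed offset for illustration; read the real one from
# /ancillary_data/atlas_sdp_gps_epoch in the granule.
EXAMPLE_OFFSET_S = 1198800018.0

gps_s = delta_time_to_gps_seconds(60.0, EXAMPLE_OFFSET_S)
print(gps_s, gps_seconds_to_gps_datetime(gps_s))
```

Keeping the offset as a function argument rather than a hard-coded constant means the same code works with whatever value a given granule reports.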
End User License Agreement (EULA)
Sometimes, accessing data in NASA Earthdata Cloud requires an End-User License Agreement (EULA). If you cannot access a dataset, this may be your issue! See these instructions for how to authorize EULAs.