How do I find data using code?
Introduction
Here are our recommended approaches for finding data with code, from the command line or a notebook.
In Python, we can use the earthaccess library (previously named earthdata).
To install the package, we’ll run this code from the command line. Note: you can run shell code directly from a Jupyter Notebook cell by prefixing it with !, so the command would be !conda install.
## In the command line
## Install earthaccess
conda install -c conda-forge earthaccess
This example searches for data from the Land Processes DAAC with a spatial bounding box and temporal range.
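The search in this example passes a bounding box and a date range, and both are easy to get backwards. Here is a minimal sketch of a sanity check that runs before any query is sent. It assumes the bounding box is ordered west, south, east, north in decimal degrees and that dates are ISO-formatted strings. The helper names validate_bbox and validate_temporal are hypothetical and are not part of earthaccess.

```python
from datetime import date


def validate_bbox(west, south, east, north):
    """Return the bounding box as a tuple, or raise ValueError if it looks wrong."""
    if not (-180 <= west <= 180 and -180 <= east <= 180):
        raise ValueError("longitudes must be between -180 and 180")
    if not (-90 <= south <= 90 and -90 <= north <= 90):
        raise ValueError("latitudes must be between -90 and 90")
    # West may legitimately exceed east for boxes crossing the antimeridian,
    # so only the latitude ordering is checked here.
    if south > north:
        raise ValueError("south must not exceed north")
    return (west, south, east, north)


def validate_temporal(start, end):
    """Return (start, end) as ISO date strings, or raise ValueError if reversed."""
    start_date = date.fromisoformat(start)
    end_date = date.fromisoformat(end)
    if start_date > end_date:
        raise ValueError("start date must not be after end date")
    return (start_date.isoformat(), end_date.isoformat())


# Values taken from the Nebraska example in this section
bbox = validate_bbox(-101.67271, 41.04754, -101.65344, 41.06213)
temporal = validate_temporal("2022-09-10", "2022-09-24")
print(bbox)
print(temporal)
```

The checked values can then be unpacked into the search calls, for example .bounding_box(*bbox) and .temporal(*temporal).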
## In Python
## Import packages
from earthaccess import DataGranules, DataCollections
from pprint import pprint
## We'll get 4 collections that match our keyword of interest
collections = DataCollections().keyword("REFLECTANCE").cloud_hosted(True).get(4)
## Let's print 2 collections
for collection in collections[0:2]:
    pprint(collection.summary())  # pprint prints directly and returns None, so don't wrap it in print()
    print(collection.abstract(), "\n")
## Search for files from the second dataset result over a small plot in Nebraska, USA for two weeks in September 2022
granules = (
    DataGranules()
    .concept_id("C2021957657-LPCLOUD")
    .temporal("2022-09-10", "2022-09-24")
    .bounding_box(-101.67271, 41.04754, -101.65344, 41.06213)
    .get()  # run the query; without .get() this is only an unexecuted query object
)
print(len(granules))
granules
To find data in R, we’ll also use the earthaccess Python package. We can do so from R using the reticulate package (cheatsheet). Note below that we import the Python library as an R object we name earthaccess, and that we use the earthaccess$ syntax to access functions from the earthaccess library. The granules object is a list of JSON dictionaries with some extra dictionaries.
## In R
## load R libraries
library(tidyverse) # install.packages("tidyverse")
library(reticulate) # install.packages("reticulate")
## load python library
earthaccess <- reticulate::import("earthaccess")
## use earthaccess to access data # https://nsidc.github.io/earthaccess/tutorials/search-granules/
granules <- earthaccess$search_data(
doi = "10.5067/SLREF-CDRV3",
temporal = reticulate::tuple("2017-01", "2017-02") # with an earthaccess update, this can be simply c() or list()
)
## Granules found: 72
## exploring
granules # this is the result of the get request.
class(granules) # "list"
## granules <- reticulate::py_to_r(granules) # Object to convert is not a Python object
Matlab code coming soon!
## In Matlab
## Coming soon!
With wget and curl:
## In the command line
## Coming soon!