```
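import xarray as xr
# Keep variable attributes (units, long_name, ...) through xarray operations
xr.set_options(keep_attrs=True)
# Importing hvplot.xarray registers the .hvplot accessor on xarray objects
import hvplot.xarray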
```

# Xarray

imported on: **2024-08-08**

This notebook is from the NASA Openscapes 2021 Cloud Hackathon repository.

The original source for this document is https://github.com/NASA-Openscapes/2021-Cloud-Hackathon/blob/main/tutorials/03_Xarray_hvplot.ipynb

# 03. Introduction to `xarray`

## Why do we need `xarray`?

As geoscientists, we often work with time series of data with two or more dimensions: a time series of calibrated, orthorectified satellite images; two-dimensional grids of surface air temperature from an atmospheric reanalysis; or three-dimensional (level, x, y) cubes of ocean salinity from an ocean model. These data are often provided in GeoTIFF, NetCDF or HDF format with rich and useful metadata that we want to retain, or even use in our analysis. Common analyses include calculating means, standard deviations and anomalies over time or over one or more spatial dimensions (e.g. zonal means). Model output often includes multiple variables to which you want to apply similar analyses.

The schematic above shows a typical data structure for multi-dimensional data. There are two data cubes, one for temperature and one for precipitation. Common coordinate variables, in this case latitude, longitude and time, are associated with each variable. Each variable, including coordinate variables, will have a set of attributes: name, units, missing value, etc. The file containing the data may also have attributes: the source of the data, the model name, and the coordinate reference system if the data are projected. Writing code using low-level packages such as `netcdf4` and `numpy` to read the data, then perform analysis, and write the results to file is time consuming and prone to errors.
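To make the structure described above concrete, here is a minimal sketch that builds the same kind of object directly in `xarray`. The variable names, coordinate values, attributes and random data are all made up for illustration; they do not come from any real file.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Hypothetical coordinates: 3 daily time steps on a 2 x 4 lat/lon grid
time = pd.date_range("2020-01-01", periods=3, freq="D")
lat = np.array([10.0, 20.0])
lon = np.array([0.0, 90.0, 180.0, 270.0])

# Synthetic data, one value per (time, lat, lon) cell
rng = np.random.default_rng(0)
shape = (len(time), len(lat), len(lon))
dims = ("time", "lat", "lon")

ds = xr.Dataset(
    data_vars={
        # Each variable is given as (dimension names, values, attributes)
        "temperature": (dims, 280.0 + 10.0 * rng.random(shape), {"units": "K"}),
        "precipitation": (dims, 5.0 * rng.random(shape), {"units": "mm/day"}),
    },
    coords={
        # Coordinate variables are shared by both data cubes
        "time": time,
        "lat": ("lat", lat, {"units": "degrees_north"}),
        "lon": ("lon", lon, {"units": "degrees_east"}),
    },
    # File-level (global) attributes
    attrs={"source": "synthetic example data"},
)

print(ds)
```

Both data cubes share a single set of coordinates, and the attributes travel with the variables rather than living in separate bookkeeping code.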

## What is `xarray`?

`xarray` is an open-source project and `python` package to work with labelled multi-dimensional arrays. It leverages `numpy`, `pandas`, `matplotlib` and `dask` to build `Dataset` and `DataArray` objects with built-in methods to subset, analyze, interpolate, and plot multi-dimensional data. It makes working with multi-dimensional data cubes efficient and fun. **It will change your life for the better. You’ll be more attractive, more interesting, and better equipped to take on life’s challenges.**
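As a small taste of what label-based methods look like, the sketch below subsets a `DataArray` by coordinate value and averages it over a named dimension. The array, its coordinate values and its `units` attribute are invented for this example.

```python
import numpy as np
import xarray as xr

# A hypothetical 3 x 4 field with labelled latitude and longitude
da = xr.DataArray(
    np.arange(12.0).reshape(3, 4),
    dims=("lat", "lon"),
    coords={"lat": [0.0, 10.0, 20.0], "lon": [0.0, 90.0, 180.0, 270.0]},
    name="example",
    attrs={"units": "1"},
)

# Subset by coordinate values rather than integer positions
subset = da.sel(lat=10.0, lon=slice(90.0, 180.0))

# Average over the longitude dimension (a zonal mean), keeping the attributes
zonal_mean = da.mean(dim="lon", keep_attrs=True)

print(subset.values)      # [5. 6.]
print(zonal_mean.values)  # [1.5 5.5 9.5]
```

Because dimensions are referred to by name, the code does not need to know which axis number holds longitude.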

## What you will learn from this tutorial

In this tutorial you will learn how to:

- load a netcdf file into `xarray`
- interrogate the `Dataset` and understand the difference between `DataArray` and `Dataset`
- subset a `Dataset`
- calculate annual and monthly mean fields
- calculate a time series of zonal means
- plot these results

As always, we’ll start by importing `xarray`. We’ll follow convention by giving the module the shortname `xr`.