Dask and Coiled

Use multiple CPUs in parallel on your cloud virtual machine to speed up computations with Python’s Dask package, or combine many VMs into a distributed cluster for parallel computations using Coiled.

Notebooks

  • Introduction to Dask Tutorial - covers the basics of using Dask for parallel computing with NASA Earth Data, entirely in the cloud.
  • Dask Function Replication Example - demonstrates a more complex example of replicating a function over many files in parallel using dask.delayed(). The example analysis generates spatial correlation maps of sea surface temperature vs sea surface height, using data sets available on PO.DAAC.
  • Dask Dataset Chunking Example - demonstrates a more complex example of applying computations to a large dataset via chunking and parallel computing. The example analysis generates seasonal cycles of sea surface temperature off the west coast of the U.S. for a decade of ultra-high resolution data. Parallel computations are performed on a single VM with a local Dask cluster.
  • Coiled Function Replication Example - demonstrates a more complex example of replicating a function over many files in parallel using coiled.function(). The example analysis generates spatial correlation maps of sea surface temperature vs sea surface height, using data sets available on PO.DAAC. This replicates the analysis from the Dask Function Replication Example, but changes the method of parallel computation. Instead of using a local cluster on a single VM (Dask), many VMs are combined into a distributed cluster (Coiled).
  • Coiled Dataset Chunking Example - demonstrates a more complex example of applying computations to a large dataset via chunking and parallel computing. The example analysis generates seasonal cycles of sea surface temperature off the west coast of the U.S. for a decade of ultra-high resolution data. Parallel computations are distributed over many VMs using Coiled’s coiled.Cluster().
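
The two patterns named in the list above, function replication and dataset chunking, can be sketched in a few lines of Dask. This is a minimal illustration under stated assumptions, not the code from the notebooks: the summarize function, the in-memory fake_files, and the small NumPy array are hypothetical stand-ins for the per-file analysis and the sea surface temperature data, and the threaded scheduler is chosen only to keep the sketch lightweight.

```python
import dask
import dask.array as da
import numpy as np
from dask import delayed


def summarize(values):
    """Hypothetical stand-in for a per-file analysis: return the mean."""
    return sum(values) / len(values)


# Pattern 1: function replication.
# Hypothetical in-memory "files"; the notebooks process real data files instead.
fake_files = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]

# Wrap each call with delayed(); this builds lazy tasks and runs nothing yet.
tasks = [delayed(summarize)(values) for values in fake_files]

# Run all tasks at once; the threaded scheduler may execute them in parallel.
file_means = dask.compute(*tasks, scheduler="threads")

# Pattern 2: dataset chunking.
# A small (time, space) array split into four 2 x 3 chunks.
data = np.arange(24, dtype=float).reshape(4, 6)
chunked = da.from_array(data, chunks=(2, 3))

# The mean over the first axis is computed chunk by chunk, then combined.
time_mean = chunked.mean(axis=0).compute()

print(file_means)
print(time_mean)
```

In both patterns Dask first builds a lazy task graph and only executes it when compute() is called. The Coiled notebooks apply the same two patterns but run the work on cloud VMs, which requires a Coiled account and is therefore not sketched here.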