ECCOv4 access via NASA’s EarthData in the Cloud#
This notebook demonstrates access to ECCOv4 model output. Broad information about the ECCO dataset can be found in the PODAAC website (see here).
Requirements to run this notebook
Have an Earth Data Login account
Have a Bearer Token.
Objectives
Use pydap’s client API to demonstrate
Access to NASA’s EarthData via the use of
tokens
.Tokens
are greatly favored overusername/password
, sincetokens
avoid repeated authentication over the many redirects.A workflow involving multiple OPeNDAP URLs and xarray parallelism, lazy evaluation, and plotting of Level 4 with complex Topology ECCOv4 Data via OPeNDAP.
Some variables of focus are
Author
: Miguel Jimenez-Urias, ‘24
import matplotlib.pyplot as plt
import numpy as np
import requests
from pydap.client import open_url
import cartopy.crs as ccrs
import json
import xarray as xr
Access EARTHDATA#
Many of the data variables can be browsed here. Here we will work with the original data defined on the Lat-Lon-Cap (LLC90) grid.
Grid_url = 'dap4://opendap.earthdata.nasa.gov/providers/POCLOUD/collections/ECCO%20Geometry%20Parameters%20for%20the%20Lat-Lon-Cap%2090%20(llc90)%20Native%20Model%20Grid%20(Version%204%20Release%204)/granules/GRID_GEOMETRY_ECCO_V4r4_native_llc0090'
Lazy access to remote data via pydap’s client API#
pydap
exploits the OPeNDAP’s separation between metadata and data, to create lazy dataset objects that point to the data
. These lazy objects contain all the attributes detailed in OPeNDAP’s metadata files (DMR).
ds_grid = open_url(Grid_url, session=my_session, protocol="dap4")
ds_grid.tree()
.GRID_GEOMETRY_ECCO_V4r4_native_llc0090.nc
├──dxC
├──PHrefF
├──XG
├──dyG
├──rA
├──hFacS
├──Zp1
├──Zl
├──rAw
├──dxG
├──maskW
├──YC
├──XC
├──maskS
├──YG
├──hFacC
├──drC
├──drF
├──XC_bnds
├──Zu
├──Z_bnds
├──YC_bnds
├──PHrefC
├──rAs
├──Depth
├──dyC
├──SN
├──rAz
├──maskC
├──CS
├──hFacW
├──Z
├──i
├──i_g
├──j
├──j_g
├──k
├──k_l
├──k_p1
├──k_u
├──nb
├──nv
└──tile
Note
PyDAP accesses the remote dataset’s metadata, and no data has been downloaded yet!
Download data into memory
The syntax is as follows:
# this fetches remote data into a pydap object container
pydap_Array = dataset[<VarName>][:]
where <VarName>
is the name of one of the variables in the pydap.model.BaseType
object. The pydap.model.BaseType
is a thing wrapper of numpy arrays
. The array has been downloaded into “local” memory (RAM) as (uncompressed) numpy arrays.
To extract the data as a numpy array, run
# The `.data` command allows direct access to the Numpy array (e.g. for manipulation)
pydap_Array.data
# lets download some data
Depth = ds_grid['Depth'][:]
print(type(Depth))
<class 'pydap.model.BaseType'>
Depth.attributes
{'_FillValue': 9.969209968e+36,
'long_name': 'model seafloor depth below ocean surface at rest',
'units': 'm',
'coordinate': 'XC YC',
'coverage_content_type': 'modelResult',
'standard_name': 'sea_floor_depth_below_geoid',
'comment': "Model sea surface height (SSH) of 0m corresponds to an ocean surface at rest relative to the geoid. Depth corresponds to seafloor depth below geoid. Note: the MITgcm used by ECCO V4r4 implements 'partial cells' so the actual model seafloor depth may differ from the seafloor depth provided by the input bathymetry file.",
'coordinates': 'YC XC',
'origname': 'Depth',
'fullnamepath': '/Depth',
'checksum': array([950678499], dtype=uint32),
'Maps': ()}
Depth.shape, Depth.dimensions
((13, 90, 90), ['tile', 'j', 'i'])
Plot Depth along native grid
ECCO
data is defined on a Cube Sphere – meaning the horizontal grid contains an extra
dimension: tile
or face
. You can inspect the data in its native grid by plotting all horizontal data onto a grid as follows:
Variable = [Depth[i].data for i in range(13)]
clevels = np.linspace(0, 6000, 100)
cMap = 'Greys_r'
fig, axes = plt.subplots(nrows=5, ncols=5, figsize=(8, 8), gridspec_kw={'hspace':0.01, 'wspace':0.01})
AXES = [
axes[4, 0], axes[3, 0], axes[2, 0], axes[4, 1], axes[3, 1], axes[2, 1],
axes[1, 1],
axes[1, 2], axes[1, 3], axes[1, 4], axes[0, 2], axes[0, 3], axes[0, 4],
]
for i in range(len(AXES)):
AXES[i].contourf(Variable[i], clevels, cmap=cMap)
for ax in np.ravel(axes):
ax.axis('off')
plt.setp(ax.get_xticklabels(), visible=False)
plt.setp(ax.get_yticklabels(), visible=False)
plt.show()
Fig. 1. Depth
plotted on a horizontal layout. Data on tiles 0-5
follow C-ordering
, whereas data on tiles 7-13
follow F-ordering
. Data on the arctic cap
, is defined on a polar coordinate grid.
Plot with corrected Topology
fig, axes = plt.subplots(nrows=4, ncols=4, figsize=(8, 8), gridspec_kw={'hspace':0.01, 'wspace':0.01})
AXES_NR = [
axes[3, 0], axes[2, 0], axes[1, 0], axes[3, 1], axes[2, 1], axes[1, 1],
]
AXES_CAP = [axes[0, 0]]
AXES_R = [
axes[1, 2], axes[2, 2], axes[3, 2], axes[1, 3], axes[2, 3], axes[3, 3],
]
for i in range(len(AXES_NR)):
AXES_NR[i].contourf(Variable[i], clevels, cmap=cMap)
for i in range(len(AXES_CAP)):
AXES_CAP[i].contourf(Variable[6].transpose()[:, ::-1], clevels, cmap=cMap)
for i in range(len(AXES_R)):
AXES_R[i].contourf(Variable[7+i].transpose()[::-1, :], clevels, cmap=cMap)
for ax in np.ravel(axes):
ax.axis('off')
plt.setp(ax.get_xticklabels(), visible=False)
plt.setp(ax.get_yticklabels(), visible=False)
plt.show()
Fig. 2. Depth
plotted on a horizontal layout that approximates lat-lon
display. Data on the arctic cap
, however, remains on a polar coordinate grid.
pydap + xarray to aggregate multiple URLs
OPeNDAP
allows remote access via dataURL
s. Each dataURL
represents a variable, i.e. a piece of the whole puzzle. We can exploit xarray
concatenation and parallelism to combine multiple dataURL
s, and thus multiple pydap objects, into a single xarray.Dataset
. Can do this because xarray has long-implemented pydap as an engine to access/open remote datasets
.
A single URL
Many OPeNDAP
servers can serve data in either DAP2
and DAP4
(much newer) model implementations. In pydap
, you can specify which of the two implementations by replacing the beginning of the URL with either one of dap2
or dap4
. Since xarray
imports pydap internally as is, then xarray can recognize this behavior as well.
Below we access remote Temperature and Salinity ECCO data with xarray
via (internally) pydap
.
baseURL = 'dap4://opendap.earthdata.nasa.gov/providers/POCLOUD/collections/'
Temp_Salt = "ECCO%20Ocean%20Temperature%20and%20Salinity%20-%20Monthly%20Mean%20llc90%20Grid%20(Version%204%20Release%204)/granules/OCEAN_TEMPERATURE_SALINITY_mon_mean_"
year = '2017-'
month = '01'
end_ = '_ECCO_V4r4_native_llc0090'
Temp_2017 = baseURL + Temp_Salt +year + month + end_
dataset = xr.open_dataset(Temp_2017, engine='pydap', session=my_session)
dataset
<xarray.Dataset> Size: 47MB Dimensions: (tile: 13, j_g: 90, i_g: 90, k_p1: 51, k_l: 50, j: 90, i: 90, time: 1, k: 50, nb: 4, k_u: 50, nv: 2) Coordinates: (12/24) XG (tile, j_g, i_g) float32 421kB ... Zp1 (k_p1) float32 204B ... Zl (k_l) float32 200B ... YC (tile, j, i) float32 421kB ... XC (tile, j, i) float32 421kB ... YG (tile, j_g, i_g) float32 421kB ... ... ... * k_p1 (k_p1) int32 204B 0 1 2 3 4 5 6 7 8 ... 43 44 45 46 47 48 49 50 * k_u (k_u) int32 200B 0 1 2 3 4 5 6 7 8 ... 41 42 43 44 45 46 47 48 49 * nb (nb) float32 16B 0.0 1.0 2.0 3.0 * nv (nv) float32 8B 0.0 1.0 * tile (tile) int32 52B 0 1 2 3 4 5 6 7 8 9 10 11 12 * time (time) datetime64[ns] 8B 2017-01-16T12:00:00 Data variables: SALT (time, k, tile, j, i) float32 21MB ... THETA (time, k, tile, j, i) float32 21MB ... Attributes: (12/62) acknowledgement: This research was carried out by the Jet... author: Ian Fenty and Ou Wang cdm_data_type: Grid comment: Fields provided on the curvilinear lat-l... Conventions: CF-1.8, ACDD-1.3 coordinates_comment: Note: the global 'coordinates' attribute... ... ... time_coverage_duration: P1M time_coverage_end: 2017-02-01T00:00:00 time_coverage_resolution: P1M time_coverage_start: 2017-01-01T00:00:00 title: ECCO Ocean Temperature and Salinity - Mo... uuid: f5b7028c-4181-11eb-b7e6-0cc47a3f47b1
Add a Constraint expression to the URL to ONLY retrieve THETA#
You can do that by appending to the <base_URL> the CE syntax:
<base_URL>?dap4.ce=/<VarName>
where <VarName>
is the name of the Array you want to keep
Note
When using xarray
, we need to include all dimensions associated with THETA
in the Constraint Expression
.
CE = '?dap4.ce=/THETA;/SALT;/tile;/j;/k;/i;/time'
dataset = xr.open_dataset(Temp_2017+CE, engine='pydap', session=my_session)
Multiple files in parallel#
We can exploit xarray
’s intuitive concat
+ merge
capabilities to read multiple data URL in parallel. To accomplish that, we need to create a list of all relevant URLs that each point to an OPeNDAP dataset.
Temp_2017 = [baseURL + Temp_Salt + year + f'{i:02}' + end_ + CE for i in range(1, 13)]
theta_salt_ds = xr.open_mfdataset(
Temp_2017,
engine='pydap',
session=my_session,
parallel=True,
combine='nested',
concat_dim='time',
)
Finally, we plot THETA
#
Variable = [dataset['THETA'].isel(time=0, k=0, tile=i) for i in range(13)]
clevels = np.linspace(-5, 30, 100)
cMap='RdBu_r'
fig, axes = plt.subplots(nrows=4, ncols=4, figsize=(8, 8), gridspec_kw={'hspace':0.01, 'wspace':0.01})
AXES_NR = [
axes[3, 0], axes[2, 0], axes[1, 0], axes[3, 1], axes[2, 1], axes[1, 1],
]
AXES_CAP = [axes[0, 0]]
AXES_R = [
axes[1, 2], axes[2, 2], axes[3, 2], axes[1, 3], axes[2, 3], axes[3, 3],
]
for i in range(len(AXES_NR)):
AXES_NR[i].contourf(Variable[i], clevels, cmap=cMap)
for i in range(len(AXES_CAP)):
AXES_CAP[i].contourf(Variable[6].transpose()[:, ::-1], clevels, cmap=cMap)
for i in range(len(AXES_R)):
AXES_R[i].contourf(Variable[7+i].transpose()[::-1, :], clevels, cmap=cMap)
for ax in np.ravel(axes):
ax.axis('off')
plt.setp(ax.get_xticklabels(), visible=False)
plt.setp(ax.get_yticklabels(), visible=False)
plt.show()
Fig. 3. Surface temperature
, plotted on a horizontal layout that approximates lat-lon
display. Data on the arctic cap
, however, remains on a polar coordinate grid.