ECCOv4 access via NASA’s OPeNDAP in the Cloud.#

This notebook demonstrates access to ECCOv4 model output. Broad information about the ECCO dataset can be found in the PODAAC website (see here).

Requirements to run this notebook

  1. Have an Earth Data Login account

  2. Have a Bearer Token.

Objectives

Use pydap, Xarray and OPeNDAP to

  • Discover all OPeNDAP URLs associated with a ECCOv4 data collection. This is a Level 4 dataset defined in the CubedSphere ECCOv4

  • Authenticate via EDL (token based)

  • Explore ECCOv4 collection and filter variables, and coordinates

  • Consolidate Metadata at the collection level

  • Download/stream a subset of interest

Some variables of focus are

Author: Miguel Jimenez-Urias, ‘25

from pydap.net import create_session
from pydap.client import get_cmr_urls, consolidate_metadata, open_url
import xarray as xr
import datetime as dt
import earthaccess
import matplotlib.pyplot as plt
import numpy as np
import pydap
print("xarray version: ", xr.__version__)
print("pydap version: ", pydap.__version__)
xarray version:  2026.4.0
pydap version:  3.5.10.dev14+g40484efcb

Explore the ECCOv4 Collection#

The entire ECCOv4 simulation is available via NASA’s OPeNDAP service, but it is split into several collections. Each collection is available via Netcdf files. In this tutorial, we will subset Temperature, Salinity at the oceanic surface, for a region of interest: The North Atlantic Ocean. Subsetting the CubedSphere requires specialized code. Since the data is tiled, we will only focus for sake of simplicity, on subsetting the tiles.

For more information about the Ocean Temperature and Salinity, you can checkout this resource https://podaac.jpl.nasa.gov/ECCO?sections=data

ecco_ts_ccid = "C1991543728-POCLOUD" # 
time_range = [dt.datetime(2007, 1, 1), dt.datetime(2017, 12, 31)] # One month of data

cmr_urls = get_cmr_urls(ccid=ecco_ts_ccid, time_range=time_range, limit=1000) # you can incread the limit of results
print("################################################ \n We found a total of ", len(cmr_urls), "OPeNDAP URLS!!!\n################################################")
################################################ 
 We found a total of  133 OPeNDAP URLS!!!
################################################

Authenticate#

To hide the abstraction, we will use earthaccess to authenticate, and create a CachedSession to consolidate all metadata pre-download. This approach can result in speed up of 100x when aggregating all URLs into a single Xarray Dataset object.

auth = earthaccess.login(strategy="netrc", persist=True) # you will be promted to add your EDL credentials

# pass Token Authorization to a new Session.
cache_kwargs={'cache_name':'data/ECCOv4'}
my_session = create_session(use_cache=True, session=auth.get_session(), cache_kwargs=cache_kwargs)
my_session.cache.clear()

Explore Variables in collection and filter down to keep only desirable ones#

Here we demonstrate pydap as a exploratory metadata tool. Without downloading any array data, will let us know:

  • Variables Names, sizes, attributes

  • Dimensions and Coordinates

  • Allow us to construct an OPeNDAP Constraint Expression that subsets by variable name

pyds = open_url(cmr_urls[0].replace("https", "dap4"), session=my_session)
pyds.tree()
.OCEAN_TEMPERATURE_SALINITY_mon_mean_2006-12_ECCO_V4r4_native_llc0090.nc
├──XG
├──Zp1
├──Zl
├──YC
├──XC
├──SALT
├──YG
├──XC_bnds
├──Zu
├──THETA
├──Z_bnds
├──YC_bnds
├──time_bnds
├──Z
├──i
├──i_g
├──j
├──j_g
├──k
├──k_l
├──k_p1
├──k_u
├──nb
├──nv
├──tile
└──time

Note

NetCDF files are self describing, and often contain all information necessary to interprete the data. At the collection level (dataset), much of this information is duplicated across each file. With OPeNDAP we can reduce the amount of data processed/download with Constraint Expressions.

We are interested only in Salinity (SALT) and Temperature (THETA), and the necessary extra dimensions/coordinates to work with this arrays. We use PyDAP to help us figure this out

pyds['THETA'].dims
['/time', '/k', '/tile', '/j', '/i']
pyds['SALT'].dims
['/time', '/k', '/tile', '/j', '/i']
pyds['THETA'].coordinates, pyds["SALT"].coordinates
('YC Z XC time', 'YC Z XC time')

Filter by Variable Names (CEs)#

We use the information on dimensions and coordinates, to further filter all possible variables in the dataset. We accomplish this by adding a query parameter of the form:

<base_url> + ?dap4.ce=/var_name1;...

Any variable name included in the query expression, will be processed and all others will be ignored.

Note

You can also construct the full URL with constraint expressions interactively, by selecting manually the variables on the datasets’s Data Request Form and selecting Copy (Raw) Data URL. To go to this page, you need to append a .dmr to each opendap url and insert it into a browser.

dims = pyds['THETA'].dims
Vars = ['/THETA', "/SALT"] + dims

# Below construct Contraint Expression
CE = "?dap4.ce="+(";").join(Vars)
print("Constraint Expression: ", CE)
Constraint Expression:  ?dap4.ce=/THETA;/SALT;/time;/k;/tile;/j;/i

DAP4 URLs#

To tell pydap and Xarray which OPeNDAP protocol to use to stream data, we can replace the scheme of each url with dap4. DAP4 is a relatively new protocol (compared to DAP2, and it is widely used across NASA.

ECCO_urls = [url.replace("https", "dap4") + CE for url in cmr_urls]
ECCO_urls[0]
'dap4://opendap.earthdata.nasa.gov/providers/POCLOUD/collections/ECCO%20Ocean%20Temperature%20and%20Salinity%20-%20Monthly%20Mean%20llc90%20Grid%20(Version%204%20Release%204)/granules/OCEAN_TEMPERATURE_SALINITY_mon_mean_2006-12_ECCO_V4r4_native_llc0090?dap4.ce=/THETA;/SALT;/time;/k;/tile;/j;/i'

CubedSphere: Visualizing Depth in the native grid#

Before continuing, we explore the complex topology of the ECCO dataset via the Grid file.

Note

Many of the coordinate variables in the Temperature/Salinity file are also present in the Grid file.

ECCO data is defined on a Cube Sphere – meaning the horizontal grid contains an extra dimension: tile or face. You can inspect the data in its native grid by plotting all horizontal data onto a grid. For that, we look at the Depth variable. This is NOT included in same collection as Temperature. Below we provide the Cloud OPeNDAP URL, which can be queried from the CMR or Earthdata Search.

## Concept collection ID for ECCO grid
grid_ccid = "C2013557893-POCLOUD"
Grid_url = get_cmr_urls(ccid=grid_ccid)[0] # only one element
Grid_url = Grid_url.replace("https", "dap4")

grid_ds = open_url(Grid_url)
grid_ds
<DatasetType with children 'dxC', 'PHrefF', 'XG', 'dyG', 'rA', 'hFacS', 'Zp1', 'Zl', 'rAw', 'dxG', 'maskW', 'YC', 'XC', 'maskS', 'YG', 'hFacC', 'drC', 'drF', 'XC_bnds', 'Zu', 'Z_bnds', 'YC_bnds', 'PHrefC', 'rAs', 'Depth', 'dyC', 'SN', 'rAz', 'maskC', 'CS', 'hFacW', 'Z', 'i', 'i_g', 'j', 'j_g', 'k', 'k_l', 'k_p1', 'k_u', 'nb', 'nv', 'tile'>

Note

The coordinates for THETA and SALT are present in the single file for Grid. And so we will not include them in our workflow until the end

Download a single variable#

With pure PyDAP, slicing an array triggers the download of that array. We illustrate that below

%%time
Depth = grid_ds["Depth"][:].data  # <-------- LOADS DATA as in-memory numpy array
CPU times: user 76.9 ms, sys: 14.3 ms, total: 91.2 ms
Wall time: 6.33 s
Depth.shape
(13, 90, 90)
Variable = [Depth[i] for i in range(13)]
clevels =  np.linspace(0, 6000, 100)
cMap = 'Greys_r'
fig, axes = plt.subplots(nrows=5, ncols=5, figsize=(8, 8), gridspec_kw={'hspace':0.01, 'wspace':0.01})
AXES = [
    axes[4, 0], axes[3, 0], axes[2, 0], axes[4, 1], axes[3, 1], axes[2, 1],
    axes[1, 1], 
    axes[1, 2], axes[1, 3], axes[1, 4], axes[0, 2], axes[0, 3], axes[0, 4],
]
for i in range(len(AXES)):
    AXES[i].contourf(Variable[i], clevels, cmap=cMap)

for ax in np.ravel(axes):
    ax.axis('off')
    plt.setp(ax.get_xticklabels(), visible=False)
    plt.setp(ax.get_yticklabels(), visible=False)

plt.show()
../_images/8deb991839e486f093d257b856cf5d089cf76987d9d0f66604c31bd6f61ab15d.png

Fig. 1. Depth plotted on a horizontal layout. Data on tiles 0-5 follow C-ordering, whereas data on tiles 7-13 follow F-ordering. Data on the arctic cap, is defined on a polar coordinate grid.

Plot with corrected Topology

fig, axes = plt.subplots(nrows=4, ncols=4, figsize=(8, 8), gridspec_kw={'hspace':0.01, 'wspace':0.01})
AXES_NR = [
    axes[3, 0], axes[2, 0], axes[1, 0], axes[3, 1], axes[2, 1], axes[1, 1],
]
AXES_CAP = [axes[0, 0]]
AXES_R = [
    axes[1, 2], axes[2, 2], axes[3, 2], axes[1, 3], axes[2, 3], axes[3, 3],
]
for i in range(len(AXES_NR)):
    AXES_NR[i].contourf(Variable[i], clevels, cmap=cMap)

for i in range(len(AXES_CAP)):
    AXES_CAP[i].contourf(np.transpose(Variable[6])[:, ::-1], clevels, cmap=cMap)

for i in range(len(AXES_R)):
    AXES_R[i].contourf(np.transpose(Variable[7+i])[::-1, :], clevels, cmap=cMap)

for ax in np.ravel(axes):
    ax.axis('off')
    plt.setp(ax.get_xticklabels(), visible=False)
    plt.setp(ax.get_yticklabels(), visible=False)

plt.show()
../_images/57aff92f93cde9a0259585ca5662b6f3488687e2a0b3ec72699e276a03b23821.png

Fig. 2. Depth plotted on a horizontal layout that approximates lat-lon display. Data on the arctic cap, however, remains on a polar coordinate grid.

Consolidate Metadata#

Aggregating multiple remote files at the collection level requires persistent internet connection. The pydap backend allows to download and store the metadata required by xarray locally as a sqlite3 database, and this database can be used as session manager (for futher downloads). Creating this databaset can be done once, and reused, version controlled, etc. Reusing this database can cut the time to generate the aggregated dataset view from minutes to seconds.

Persisting metadata for later reuse#

When executing consolidate_metadata below, a sqlite3 database is created in the default directory specified by the cache_name, in the CachedSession. This metadata file can be stored persistently, and version-controlled, to avoid running consolidate metadata everytime.

Note

To persist the metadata database, avoid executing session.cache.clear(). You can also place the .sqlite in a version controlled repository local to your workflow that can track any changes to it. With this, re-executing consolidate_metadata will exit in <1 second.

Warning

After creatng the sqlite database, you can access it across Kernel restarts by simply creating the session as done above, choosing the cache_name that matches the name of the sqlite database. BUT you will may also need to re-execute consolidate_metadata, as in some cases consolidate_metadata creates a special key_fn to reuse the cache associated with certain dimensions across urls. This behavior will go away in a future release.

%%time
consolidate_metadata(ECCO_urls, session=my_session, concat_dim='time')
datacube has dimensions ['i[0:1:89]', 'j[0:1:89]', 'k[0:1:49]', 'tile[0:1:12]'] , and concat dim: `['time']`
CPU times: user 2.63 s, sys: 785 ms, total: 3.42 s
Wall time: 1min 16s
len(my_session.cache.urls())
268

Dataset aggregation via Xarray#

Before creating the aggregated dataset, we know we are interested in:

  • Surface data (a single element k=0, where k is the dimension along the vertical axis).

  • Tiles 2, 6, and 10, cover the North Atlantic Ocean (see depth plots below).

We want the OPeNDAP server to do the subsetting for us, reducing the amount of data downloaded. As such, we need to define the chunking that will match our server request. See below

A note about chunks#

With OPeNDAP, any download from a granule should be considered as a single chunk, from the perspective of Xarray. The DAP4 protocol does send chunk responses over the web, but these are assembled by pydap in memory as individual numpy arrays. You can see the perspective of Xarray on the remote chunking by passing a chunks={} argument when opening the dataset.

%%time
ds = xr.open_mfdataset(
    ECCO_urls, 
    engine='pydap',
    session=my_session, 
    parallel=True,
    combine='nested',
    concat_dim='time',
    chunks={},
)
ds
CPU times: user 4.05 s, sys: 477 ms, total: 4.52 s
Wall time: 4.81 s
<xarray.Dataset> Size: 6GB
Dimensions:  (time: 133, k: 50, tile: 13, j: 90, i: 90)
Coordinates:
  * time     (time) datetime64[ns] 1kB 2006-12-16T12:00:00 ... 2017-12-16T06:...
  * k        (k) int32 200B 0 1 2 3 4 5 6 7 8 9 ... 41 42 43 44 45 46 47 48 49
  * tile     (tile) int32 52B 0 1 2 3 4 5 6 7 8 9 10 11 12
  * j        (j) int32 360B 0 1 2 3 4 5 6 7 8 9 ... 81 82 83 84 85 86 87 88 89
  * i        (i) int32 360B 0 1 2 3 4 5 6 7 8 9 ... 81 82 83 84 85 86 87 88 89
Data variables:
    SALT     (time, k, tile, j, i) float32 3GB dask.array<chunksize=(1, 50, 13, 90, 90), meta=np.ndarray>
    THETA    (time, k, tile, j, i) float32 3GB dask.array<chunksize=(1, 50, 13, 90, 90), meta=np.ndarray>
Attributes: (12/62)
    acknowledgement:                 This research was carried out by the Jet...
    author:                          Ian Fenty and Ou Wang
    cdm_data_type:                   Grid
    comment:                         Fields provided on the curvilinear lat-l...
    Conventions:                     CF-1.8, ACDD-1.3
    coordinates_comment:             Note: the global 'coordinates' attribute...
    ...                              ...
    time_coverage_duration:          P1M
    time_coverage_end:               2007-01-01T00:00:00
    time_coverage_resolution:        P1M
    time_coverage_start:             2006-12-01T00:00:00
    title:                           ECCO Ocean Temperature and Salinity - Mo...
    uuid:                            f32555f0-4181-11eb-bed0-0cc47a3f47a3

Xarray treats the remote data variable as an individual chunk, no matter the size of the OPeNDAP subset#

ds['THETA']
<xarray.DataArray 'THETA' (time: 133, k: 50, tile: 13, j: 90, i: 90)> Size: 3GB
dask.array<concatenate, shape=(133, 50, 13, 90, 90), dtype=float32, chunksize=(1, 50, 13, 90, 90), chunktype=numpy.ndarray>
Coordinates:
  * time     (time) datetime64[ns] 1kB 2006-12-16T12:00:00 ... 2017-12-16T06:...
  * k        (k) int32 200B 0 1 2 3 4 5 6 7 8 9 ... 41 42 43 44 45 46 47 48 49
  * tile     (tile) int32 52B 0 1 2 3 4 5 6 7 8 9 10 11 12
  * j        (j) int32 360B 0 1 2 3 4 5 6 7 8 9 ... 81 82 83 84 85 86 87 88 89
  * i        (i) int32 360B 0 1 2 3 4 5 6 7 8 9 ... 81 82 83 84 85 86 87 88 89
Attributes:
    long_name:              Potential temperature 
    units:                  degree_C
    coverage_content_type:  modelResult
    standard_name:          sea_water_potential_temperature
    comment:                Sea water potential temperature is the temperatur...
    valid_min:              -2.2909388542175293
    valid_max:              36.032955169677734
    origname:               THETA
    fullnamepath:           /THETA
    Maps:                   ()

For Chunk Considerations#

Download performance is a game of Chunk Sizes. For OPeNDAP, this concerns the size of the slice, which is bounded from above by the size of the granule (at the variable level). However, if the chunks are too small, the download will take forever. Too large, and you will download too much data from S3 (assuming this tutorial is being run outside of the region). In that last scenario (data is on S3), egress charges are important, and we find ourselves with the different choices for our approach to download data:

  • Minimize the amount of data leaving the cloud. This is accompished by the following chunking: chunks={'k':1, 'tile':1}. However, since we will download 3 tiles per granule, chunking the dataset will multiply by a factor of 3 the amount of request made to the remote OPeNDAP server, only for the data to be assembled by Xarray afterwards, once downloaded locally and before storing.

  • Optimized performance. In this case, this is done by choosing the following chunking: chunks={'k':1}. It will download all the tiles per granule, and the top level only. All these are easily subsetted by Xarray once data is downloaded. This option also keeps the number of requests to the OPeNDAP server at a minimum.

Warning

Even with the optimized performance approach, this may not be fast. Download speed will depend on internet connection, or data locality. Overall, the number of request to the server will be ~600 in this scenario, all behind some form of authentication.

%%time
ds = xr.open_mfdataset(
    ECCO_urls, 
    engine='pydap',
    session=my_session,
    parallel=True,
    combine='nested',
    concat_dim='time',
    chunks={'k':1},
)
ds
CPU times: user 3.75 s, sys: 402 ms, total: 4.15 s
Wall time: 4.54 s
<xarray.Dataset> Size: 6GB
Dimensions:  (time: 133, k: 50, tile: 13, j: 90, i: 90)
Coordinates:
  * time     (time) datetime64[ns] 1kB 2006-12-16T12:00:00 ... 2017-12-16T06:...
  * k        (k) int32 200B 0 1 2 3 4 5 6 7 8 9 ... 41 42 43 44 45 46 47 48 49
  * tile     (tile) int32 52B 0 1 2 3 4 5 6 7 8 9 10 11 12
  * j        (j) int32 360B 0 1 2 3 4 5 6 7 8 9 ... 81 82 83 84 85 86 87 88 89
  * i        (i) int32 360B 0 1 2 3 4 5 6 7 8 9 ... 81 82 83 84 85 86 87 88 89
Data variables:
    SALT     (time, k, tile, j, i) float32 3GB dask.array<chunksize=(1, 1, 13, 90, 90), meta=np.ndarray>
    THETA    (time, k, tile, j, i) float32 3GB dask.array<chunksize=(1, 1, 13, 90, 90), meta=np.ndarray>
Attributes: (12/62)
    acknowledgement:                 This research was carried out by the Jet...
    author:                          Ian Fenty and Ou Wang
    cdm_data_type:                   Grid
    comment:                         Fields provided on the curvilinear lat-l...
    Conventions:                     CF-1.8, ACDD-1.3
    coordinates_comment:             Note: the global 'coordinates' attribute...
    ...                              ...
    time_coverage_duration:          P1M
    time_coverage_end:               2007-01-01T00:00:00
    time_coverage_resolution:        P1M
    time_coverage_start:             2006-12-01T00:00:00
    title:                           ECCO Ocean Temperature and Salinity - Mo...
    uuid:                            f32555f0-4181-11eb-bed0-0cc47a3f47a3

Include the grid data#

Now we will access the remote Grid file to include some additional grid variables. This new individual dataset will be merged with our temperature/salinity dataset.

Note

We will NOT use the same CachedSession object. Instead, we will initiate a different session object. Note the speed it takes may be similar (same order) as aggregating 100s of urls. This is what consolidate metadata does.

%%time
### create anew session
session = create_session(session=auth.get_session())

### create an individual dataset with only the variables of interest
grid_ds = xr.open_dataset(Grid_url+"?dap4.ce=/Depth;/XC;/YC;/i;/j;/tile", engine='pydap', session=session)
grid_ds
CPU times: user 83.2 ms, sys: 11.8 ms, total: 95 ms
Wall time: 8.11 s
<xarray.Dataset> Size: 1MB
Dimensions:  (tile: 13, j: 90, i: 90)
Coordinates:
  * tile     (tile) int32 52B 0 1 2 3 4 5 6 7 8 9 10 11 12
  * j        (j) int32 360B 0 1 2 3 4 5 6 7 8 9 ... 81 82 83 84 85 86 87 88 89
  * i        (i) int32 360B 0 1 2 3 4 5 6 7 8 9 ... 81 82 83 84 85 86 87 88 89
    YC       (tile, j, i) float32 421kB ...
    XC       (tile, j, i) float32 421kB ...
Data variables:
    Depth    (tile, j, i) float32 421kB ...
Attributes: (12/58)
    acknowledgement:                 This research was carried out by the Jet...
    author:                          Ian Fenty and Ou Wang
    cdm_data_type:                   Grid
    comment:                         Fields provided on the curvilinear lat-l...
    Conventions:                     CF-1.8, ACDD-1.3
    coordinates_comment:             Note: the global 'coordinates' attribute...
    ...                              ...
    references:                      ECCO Consortium, Fukumori, I., Wang, O.,...
    source:                          The ECCO V4r4 state estimate was produce...
    standard_name_vocabulary:        NetCDF Climate and Forecast (CF) Metadat...
    summary:                         This dataset provides geometric paramete...
    title:                           ECCO Geometry Parameters for the Lat-Lon...
    uuid:                            87ff7d24-86e5-11eb-9c5f-f8f21e2ee3e0
#### Combine the two datasets into a single dataset reference
nds = xr.merge([ds, grid_ds])
nds
<xarray.Dataset> Size: 6GB
Dimensions:  (time: 133, k: 50, tile: 13, j: 90, i: 90)
Coordinates:
  * time     (time) datetime64[ns] 1kB 2006-12-16T12:00:00 ... 2017-12-16T06:...
  * k        (k) int32 200B 0 1 2 3 4 5 6 7 8 9 ... 41 42 43 44 45 46 47 48 49
  * tile     (tile) int32 52B 0 1 2 3 4 5 6 7 8 9 10 11 12
  * j        (j) int32 360B 0 1 2 3 4 5 6 7 8 9 ... 81 82 83 84 85 86 87 88 89
  * i        (i) int32 360B 0 1 2 3 4 5 6 7 8 9 ... 81 82 83 84 85 86 87 88 89
    YC       (tile, j, i) float32 421kB ...
    XC       (tile, j, i) float32 421kB ...
Data variables:
    SALT     (time, k, tile, j, i) float32 3GB dask.array<chunksize=(1, 1, 13, 90, 90), meta=np.ndarray>
    THETA    (time, k, tile, j, i) float32 3GB dask.array<chunksize=(1, 1, 13, 90, 90), meta=np.ndarray>
    Depth    (tile, j, i) float32 421kB ...
Attributes: (12/62)
    acknowledgement:                 This research was carried out by the Jet...
    author:                          Ian Fenty and Ou Wang
    cdm_data_type:                   Grid
    comment:                         Fields provided on the curvilinear lat-l...
    Conventions:                     CF-1.8, ACDD-1.3
    coordinates_comment:             Note: the global 'coordinates' attribute...
    ...                              ...
    time_coverage_duration:          P1M
    time_coverage_end:               2007-01-01T00:00:00
    time_coverage_resolution:        P1M
    time_coverage_start:             2006-12-01T00:00:00
    title:                           ECCO Ocean Temperature and Salinity - Mo...
    uuid:                            f32555f0-4181-11eb-bed0-0cc47a3f47a3

Stream all surface data with OPeNDAP, subset tiles, and store data locally with Xarray and PyDAP as engine#

%%time
nds.isel(k=0, tile=[2,6,10]).to_netcdf("data/ECCOv4_NA.nc4")
CPU times: user 373 ms, sys: 45.1 ms, total: 418 ms
Wall time: 20.7 s
---------------------------------------------------------------------------
ResponseError                             Traceback (most recent call last)
ResponseError: too many 500 error responses

The above exception was the direct cause of the following exception:

MaxRetryError                             Traceback (most recent call last)
File ~/mambaforge/envs/pydap_docs/lib/python3.11/site-packages/requests/adapters.py:696, in HTTPAdapter.send(self, request, stream, timeout, verify, cert, proxies)
    695 try:
--> 696     resp = conn.urlopen(
    697         method=request.method,
    698         url=url,
    699         body=request.body,  # type: ignore[arg-type]  # urllib3 stubs don't accept Iterable[bytes | str]
    700         headers=request.headers,  # type: ignore[arg-type]  # urllib3#3072
    701         redirect=False,
    702         assert_same_host=False,
    703         preload_content=False,
    704         decode_content=False,
    705         retries=self.max_retries,
    706         timeout=resolved_timeout,
    707         chunked=chunked,
    708     )
    710 except (ProtocolError, OSError) as err:

File ~/mambaforge/envs/pydap_docs/lib/python3.11/site-packages/urllib3/connectionpool.py:955, in HTTPConnectionPool.urlopen(self, method, url, body, headers, retries, redirect, assert_same_host, timeout, pool_timeout, release_conn, chunked, body_pos, preload_content, decode_content, **response_kw)
    954     log.debug("Retry: %s", url)
--> 955     return self.urlopen(
    956         method,
    957         url,
    958         body,
    959         headers,
    960         retries=retries,
    961         redirect=redirect,
    962         assert_same_host=assert_same_host,
    963         timeout=timeout,
    964         pool_timeout=pool_timeout,
    965         release_conn=release_conn,
    966         chunked=chunked,
    967         body_pos=body_pos,
    968         preload_content=preload_content,
    969         decode_content=decode_content,
    970         **response_kw,
    971     )
    973 return response

File ~/mambaforge/envs/pydap_docs/lib/python3.11/site-packages/urllib3/connectionpool.py:955, in HTTPConnectionPool.urlopen(self, method, url, body, headers, retries, redirect, assert_same_host, timeout, pool_timeout, release_conn, chunked, body_pos, preload_content, decode_content, **response_kw)
    954     log.debug("Retry: %s", url)
--> 955     return self.urlopen(
    956         method,
    957         url,
    958         body,
    959         headers,
    960         retries=retries,
    961         redirect=redirect,
    962         assert_same_host=assert_same_host,
    963         timeout=timeout,
    964         pool_timeout=pool_timeout,
    965         release_conn=release_conn,
    966         chunked=chunked,
    967         body_pos=body_pos,
    968         preload_content=preload_content,
    969         decode_content=decode_content,
    970         **response_kw,
    971     )
    973 return response

File ~/mambaforge/envs/pydap_docs/lib/python3.11/site-packages/urllib3/connectionpool.py:945, in HTTPConnectionPool.urlopen(self, method, url, body, headers, retries, redirect, assert_same_host, timeout, pool_timeout, release_conn, chunked, body_pos, preload_content, decode_content, **response_kw)
    944 try:
--> 945     retries = retries.increment(method, url, response=response, _pool=self)
    946 except MaxRetryError:

File ~/mambaforge/envs/pydap_docs/lib/python3.11/site-packages/urllib3/util/retry.py:543, in Retry.increment(self, method, url, response, error, _pool, _stacktrace)
    542     reason = error or ResponseError(cause)
--> 543     raise MaxRetryError(_pool, url, reason) from reason  # type: ignore[arg-type]
    545 log.debug("Incremented Retry for (url='%s'): %r", url, new_retry)

MaxRetryError: HTTPSConnectionPool(host='opendap.earthdata.nasa.gov', port=443): Max retries exceeded with url: /providers/POCLOUD/collections/ECCO%20Ocean%20Temperature%20and%20Salinity%20-%20Monthly%20Mean%20llc90%20Grid%20(Version%204%20Release%204)/granules/OCEAN_TEMPERATURE_SALINITY_mon_mean_2014-08_ECCO_V4r4_native_llc0090.dap?dap4.ce=/THETA%5B0:1:0%5D%5B0:1:0%5D%5B0:1:12%5D%5B0:1:89%5D%5B0:1:89%5D&dap4.checksum=true (Caused by ResponseError('too many 500 error responses'))

During handling of the above exception, another exception occurred:

RetryError                                Traceback (most recent call last)
Cell In[24], line 1
----> 1 get_ipython().run_cell_magic('time', '', 'nds.isel(k=0, tile=[2,6,10]).to_netcdf("data/ECCOv4_NA.nc4")\n')

File <timed eval>:1
----> 1 'Could not get source, probably due dynamically evaluated source code.'

File ~/mambaforge/envs/pydap_docs/lib/python3.11/site-packages/xarray/core/dataset.py:2134, in Dataset.to_netcdf(self, path, mode, format, group, engine, encoding, unlimited_dims, compute, invalid_netcdf, auto_complex)
   2130         if encoding is None:
   2131             encoding = {}
   2132         from xarray.backends.writers import to_netcdf
   2133 
-> 2134         return to_netcdf(  # type: ignore[return-value]  # mypy cannot resolve the overloads:(
   2135             self,
   2136             path,
   2137             mode=mode,

File ~/mambaforge/envs/pydap_docs/lib/python3.11/site-packages/xarray/backends/writers.py:450, in to_netcdf(dataset, path_or_file, mode, format, group, engine, encoding, unlimited_dims, compute, multifile, invalid_netcdf, auto_complex)
    447     if multifile:
    448         return writer, store
--> 450     writes = writer.sync(compute=compute)
    452 finally:
    453     if not multifile and not autoclose:  # type: ignore[redundant-expr,unused-ignore]

File ~/mambaforge/envs/pydap_docs/lib/python3.11/site-packages/xarray/backends/common.py:420, in ArrayWriter.sync(self, compute, chunkmanager_store_kwargs)
    417 if chunkmanager_store_kwargs is None:
    418     chunkmanager_store_kwargs = {}
--> 420 delayed_store = chunkmanager.store(
    421     self.sources,
    422     self.targets,
    423     lock=self.lock,
    424     compute=compute,
    425     flush=True,
    426     regions=self.regions,
    427     **chunkmanager_store_kwargs,
    428 )
    429 self.sources = []
    430 self.targets = []

File ~/mambaforge/envs/pydap_docs/lib/python3.11/site-packages/xarray/namedarray/daskmanager.py:247, in DaskManager.store(self, sources, targets, **kwargs)
    239 def store(
    240     self,
    241     sources: Any | Sequence[Any],
    242     targets: Any,
    243     **kwargs: Any,
    244 ) -> Any:
    245     from dask.array import store
--> 247     return store(
    248         sources=sources,
    249         targets=targets,
    250         **kwargs,
    251     )

File ~/mambaforge/envs/pydap_docs/lib/python3.11/site-packages/dask/array/core.py:1220, in store(***failed resolving arguments***)
   1217 if not return_stored:
   1218     import dask
-> 1220     dask.compute(arrays, **kwargs)
   1221     return None
   1222 else:

File ~/mambaforge/envs/pydap_docs/lib/python3.11/site-packages/dask/base.py:685, in compute(traverse, optimize_graph, scheduler, get, *args, **kwargs)
    682     expr = expr.optimize()
    683     keys = list(flatten(expr.__dask_keys__()))
--> 685     results = schedule(expr, keys, **kwargs)
    687 return repack(results)

File ~/mambaforge/envs/pydap_docs/lib/python3.11/site-packages/xarray/core/indexing.py:686, in ImplicitToExplicitIndexingAdapter.__array__(self, dtype, copy)
    682 def __array__(
    683     self, dtype: DTypeLike | None = None, /, *, copy: bool | None = None
    684 ) -> np.ndarray:
    685     if Version(np.__version__) >= Version("2.0.0"):
--> 686         return np.asarray(self.get_duck_array(), dtype=dtype, copy=copy)
    687     else:
    688         return np.asarray(self.get_duck_array(), dtype=dtype)

File ~/mambaforge/envs/pydap_docs/lib/python3.11/site-packages/xarray/core/indexing.py:691, in ImplicitToExplicitIndexingAdapter.get_duck_array(self)
    690 def get_duck_array(self):
--> 691     return self.array.get_duck_array()

File ~/mambaforge/envs/pydap_docs/lib/python3.11/site-packages/xarray/core/indexing.py:924, in CopyOnWriteArray.get_duck_array(self)
    923 def get_duck_array(self):
--> 924     return self.array.get_duck_array()

File ~/mambaforge/envs/pydap_docs/lib/python3.11/site-packages/xarray/coding/common.py:80, in _ElementwiseFunctionArray.get_duck_array(self)
     79 def get_duck_array(self):
---> 80     return self.func(self.array.get_duck_array())

File ~/mambaforge/envs/pydap_docs/lib/python3.11/site-packages/xarray/core/indexing.py:764, in LazilyIndexedArray.get_duck_array(self)
    761 from xarray.backends.common import BackendArray
    763 if isinstance(self.array, BackendArray):
--> 764     array = self.array[self.key]
    765 else:
    766     array = apply_indexer(self.array, self.key)

File ~/mambaforge/envs/pydap_docs/lib/python3.11/site-packages/xarray/backends/pydap_.py:51, in PydapArrayWrapper.__getitem__(self, key)
     50 def __getitem__(self, key):
---> 51     return indexing.explicit_indexing_adapter(
     52         key, self.shape, indexing.IndexingSupport.BASIC, self._getitem
     53     )

File ~/mambaforge/envs/pydap_docs/lib/python3.11/site-packages/xarray/core/indexing.py:1156, in explicit_indexing_adapter(key, shape, indexing_support, raw_indexing_method)
   1134 """Support explicit indexing by delegating to a raw indexing method.
   1135 
   1136 Outer and/or vectorized indexers are supported by indexing a second time
   (...)   1153 Indexing result, in the form of a duck numpy-array.
   1154 """
   1155 raw_key, numpy_indices = decompose_indexer(key, shape, indexing_support)
-> 1156 result = raw_indexing_method(raw_key.tuple)
   1157 if numpy_indices.tuple:
   1158     # index the loaded duck array
   1159     indexable = as_indexable(result)

File ~/mambaforge/envs/pydap_docs/lib/python3.11/site-packages/xarray/backends/pydap_.py:56, in PydapArrayWrapper._getitem(self, key)
     55 def _getitem(self, key):
---> 56     result = robust_getitem(self.array, key, catch=ValueError)
     57     result = np.asarray(result.data)
     58     axis = tuple(n for n, k in enumerate(key) if isinstance(k, integer_types))

File ~/mambaforge/envs/pydap_docs/lib/python3.11/site-packages/xarray/backends/common.py:300, in robust_getitem(array, key, catch, max_retries, initial_delay)
    298 for n in range(max_retries + 1):
    299     try:
--> 300         return array[key]
    301     except catch:
    302         if n == max_retries:

File ~/mambaforge/envs/pydap_docs/lib/python3.11/site-packages/pydap/model.py:538, in BaseType.__getitem__(self, index)
    535     return out
    537 out = copy.copy(self)
--> 538 data = self._get_data_index(index)
    540 # Check if index is a full slice (e.g., [:], ..., or tuple of all slice(None))
    541 if (
    542     index == slice(None)
    543     or index == Ellipsis
    544     or (isinstance(index, tuple) and all(i == slice(None) for i in index))
    545 ):

File ~/mambaforge/envs/pydap_docs/lib/python3.11/site-packages/pydap/model.py:587, in BaseType._get_data_index(self, index)
    583     return np.vectorize(decode_np_strings, otypes=self._data.dtype.char)(
    584         self._data[index]
    585     )
    586 else:
--> 587     return self._get_data()[index]

File ~/mambaforge/envs/pydap_docs/lib/python3.11/site-packages/pydap/handlers/dap.py:599, in BaseProxyDap4.__getitem__(self, index, build_only)
    597     if self.id not in _vars and "debug" not in self.session.cache.cache_name:
    598         cache_kwargs = {"skip": True}
--> 599 r = GET(
    600     url,
    601     self.application,
    602     self.session,
    603     timeout=self.timeout,
    604     verify=self.verify,
    605     get_kwargs=self.get_kwargs,
    606     cache_kwargs=cache_kwargs,
    607 )
    609 dataset = UNPACKDAP4DATA(r, self.checksums, self.user_charset).dataset
    610 self._data = dataset[self.id]._data

File ~/mambaforge/envs/pydap_docs/lib/python3.11/site-packages/pydap/net.py:74, in GET(url, application, session, timeout, verify, use_cache, session_kwargs, cache_kwargs, get_kwargs)
     72     _, _, path, _, query, fragment = urlparse(url)
     73     url = urlunparse(("", "", path, "", _quote(query), fragment))
---> 74 res = create_request(
     75     url,
     76     application=application,
     77     session=session,
     78     timeout=timeout,
     79     verify=verify,
     80     session_kwargs=session_kwargs,
     81     cache_kwargs=cache_kwargs,
     82     get_kwargs=get_kwargs,
     83 )
     84 if isinstance(res, webob_Request):
     85     res = get_response(res, application, verify=verify)

File ~/mambaforge/envs/pydap_docs/lib/python3.11/site-packages/pydap/net.py:172, in create_request(url, application, timeout, verify, session, session_kwargs, cache_kwargs, get_kwargs)
    170 session_kwargs = session_kwargs or {}
    171 try:
--> 172     req = get_request(
    173         url,
    174         base_session=session,
    175         timeout=timeout,
    176         verify=verify,
    177         get_kwargs=get_kwargs,
    178         backend=cache_kwargs.pop("backend", None),
    179         cache_name=cache_kwargs.pop("http-cache", None),
    180         backend_options=cache_kwargs.pop("backend_options", None),
    181         cache_kwargs=cache_kwargs,
    182         session_kwargs=session_kwargs,
    183     )
    184     req.raise_for_status()
    185     return req

File ~/mambaforge/envs/pydap_docs/lib/python3.11/site-packages/pydap/net.py:331, in get_request(url, base_session, timeout, verify, get_kwargs, backend, cache_name, backend_options, cache_kwargs, session_kwargs)
    329 if should_skip_cache(url, s) or skip:
    330     with s.cache_disabled():
--> 331         req = s.get(
    332             url,
    333             timeout=timeout,
    334             verify=verify,
    335             allow_redirects=True,
    336             **(get_kwargs or {}),
    337         )
    338 else:
    339     req = s.get(
    340         url,
    341         timeout=timeout,
   (...)    344         **(get_kwargs or {}),
    345     )

File ~/mambaforge/envs/pydap_docs/lib/python3.11/site-packages/requests_cache/session.py:133, in CacheMixin.get(self, url, params, **kwargs)
    131 def get(self, url: str, params=None, **kwargs) -> AnyResponse:  # type: ignore
    132     kwargs.setdefault('allow_redirects', True)
--> 133     return self.request('GET', url, params=params, **kwargs)

File ~/mambaforge/envs/pydap_docs/lib/python3.11/site-packages/requests_cache/session.py:189, in CacheMixin.request(self, method, url, headers, expire_after, only_if_cached, refresh, force_refresh, *args, **kwargs)
    187 headers = set_request_headers(headers, expire_after, only_if_cached, refresh, force_refresh)
    188 with patch_form_boundary() if kwargs.get('files') else nullcontext():
--> 189     return super().request(method, url, *args, headers=headers, **kwargs)

File ~/mambaforge/envs/pydap_docs/lib/python3.11/site-packages/requests/sessions.py:651, in Session.request(self, method, url, params, data, headers, cookies, files, auth, timeout, allow_redirects, proxies, hooks, stream, verify, cert, json)
    646 send_kwargs = {
    647     "timeout": timeout,
    648     "allow_redirects": allow_redirects,
    649 }
    650 send_kwargs.update(settings)
--> 651 resp = self.send(prep, **send_kwargs)
    653 return resp

File ~/mambaforge/envs/pydap_docs/lib/python3.11/site-packages/requests_cache/session.py:255, in CacheMixin.send(self, request, expire_after, only_if_cached, refresh, force_refresh, **kwargs)
    253     response = self._resend(request, actions, cached_response, **kwargs)  # type: ignore
    254 elif actions.send_request:
--> 255     response = self._send_and_cache(request, actions, cached_response, **kwargs)
    256 else:
    257     response = cached_response  # type: ignore  # Guaranteed to be non-None by this point

File ~/mambaforge/envs/pydap_docs/lib/python3.11/site-packages/requests_cache/session.py:279, in CacheMixin._send_and_cache(self, request, actions, cached_response, **kwargs)
    275 """Send a request and cache the response, unless disabled by settings or headers.
    276 If applicable, also handle conditional requests.
    277 """
    278 request = actions.update_request(request)
--> 279 response = super().send(request, **kwargs)
    280 actions.update_from_response(response)
    282 if not actions.skip_write:

File ~/mambaforge/envs/pydap_docs/lib/python3.11/site-packages/requests/sessions.py:784, in Session.send(self, request, **kwargs)
    781 start = preferred_clock()
    783 # Send the request
--> 784 r = adapter.send(request, **kwargs)
    786 # Total elapsed time of the request (approximately)
    787 elapsed = preferred_clock() - start

File ~/mambaforge/envs/pydap_docs/lib/python3.11/site-packages/requests/adapters.py:720, in HTTPAdapter.send(self, request, stream, timeout, verify, cert, proxies)
    717         raise ConnectTimeout(e, request=request)
    719 if isinstance(e.reason, ResponseError):
--> 720     raise RetryError(e, request=request)
    722 if isinstance(e.reason, _ProxyError):
    723     raise ProxyError(e, request=request)

RetryError: HTTPSConnectionPool(host='opendap.earthdata.nasa.gov', port=443): Max retries exceeded with url: /providers/POCLOUD/collections/ECCO%20Ocean%20Temperature%20and%20Salinity%20-%20Monthly%20Mean%20llc90%20Grid%20(Version%204%20Release%204)/granules/OCEAN_TEMPERATURE_SALINITY_mon_mean_2014-08_ECCO_V4r4_native_llc0090.dap?dap4.ce=/THETA%5B0:1:0%5D%5B0:1:0%5D%5B0:1:12%5D%5B0:1:89%5D%5B0:1:89%5D&dap4.checksum=true (Caused by ResponseError('too many 500 error responses'))

Finally, inspect the downloaded data#

mds = xr.open_dataset("data/ECCOv4_NA.nc4")
mds
Variable = [mds['THETA'][0, i, :, :] for i in range(3)]
clevels = np.linspace(-5, 30, 100)
cMap='RdBu_r'
ocean_mask = mds["Depth"]>0
fig, axes = plt.subplots(nrows=2, ncols=2, figsize=(8, 8), gridspec_kw={'hspace':0.001, 'wspace':0.001})
AXES_NR = [
    axes[1, 1],
]
AXES_CAP = [axes[0, 1]]
AXES_R = [
    axes[1, 0],
]
for i in range(len(AXES_NR)):
    ocean_mask.isel(tile=0).plot(ax=AXES_NR[i], cmap="Greys_r", add_colorbar=False)
    Variable[0].where(ocean_mask.isel(tile=0)).plot(ax=AXES_NR[i], levels=clevels, cmap=cMap, add_colorbar=False)

for i in range(len(AXES_CAP)):
    ocean_mask.isel(tile=1).transpose().plot(ax= AXES_CAP[i], cmap="Greys_r", add_colorbar=False, xincrease=False)
    Variable[1].transpose().where(ocean_mask.isel(tile=1)).plot(ax=AXES_CAP[i], levels=clevels, cmap=cMap, add_colorbar=False, xincrease=False)


for i in range(len(AXES_R)):
    # AXES_R[i].contourf(Variable[2].transpose()[::-1, :], clevels, cmap=cMap)
    ocean_mask.isel(tile=2).transpose().plot(ax= AXES_R[i], cmap="Greys_r", add_colorbar=False, yincrease=False)
    Variable[2].transpose().where(ocean_mask.isel(tile=2)).plot(ax=AXES_R[i], levels=clevels, cmap=cMap, add_colorbar=False, yincrease=False)
for ax in np.ravel(axes):
    ax.axis('off')
    plt.setp(ax.get_xticklabels(), visible=False)
    plt.setp(ax.get_yticklabels(), visible=False)
    plt.setp(ax.title, visible=False)

plt.show()

Fig. 3. Surface temperature, plotted on a horizontal layout that approximates lat-lon display. Data on the arctic cap, however, remains on a polar coordinate grid.