ECCOv4 access via NASA’s OPeNDAP in the Cloud.#
This notebook demonstrates access to ECCOv4 model output. Broad information about the ECCO dataset can be found in the PODAAC website (see here).
Requirements to run this notebook
Have an Earth Data Login account
Have a Bearer Token.
Objectives
Use pydap, Xarray and OPeNDAP to
Discover all OPeNDAP URLs associated with a ECCOv4 data collection. This is a Level 4 dataset defined in the CubedSphere ECCOv4
Authenticate via EDL (token based)
Explore ECCOv4 collection and filter variables, and coordinates
Consolidate Metadata at the collection level
Download/stream a subset of interest
Some variables of focus are
Author: Miguel Jimenez-Urias, ‘25
from pydap.net import create_session
from pydap.client import get_cmr_urls, consolidate_metadata, open_url
import xarray as xr
import datetime as dt
import earthaccess
import matplotlib.pyplot as plt
import numpy as np
import pydap
print("xarray version: ", xr.__version__)
print("pydap version: ", pydap.__version__)
xarray version: 2026.4.0
pydap version: 3.5.10.dev14+g40484efcb
Explore the ECCOv4 Collection#
The entire ECCOv4 simulation is available via NASA’s OPeNDAP service, but it is split into several collections. Each collection is available via Netcdf files. In this tutorial, we will subset Temperature, Salinity at the oceanic surface, for a region of interest: The North Atlantic Ocean. Subsetting the CubedSphere requires specialized code. Since the data is tiled, we will only focus for sake of simplicity, on subsetting the tiles.
For more information about the Ocean Temperature and Salinity, you can checkout this resource https://podaac.jpl.nasa.gov/ECCO?sections=data
ecco_ts_ccid = "C1991543728-POCLOUD" #
time_range = [dt.datetime(2007, 1, 1), dt.datetime(2017, 12, 31)] # One month of data
cmr_urls = get_cmr_urls(ccid=ecco_ts_ccid, time_range=time_range, limit=1000) # you can incread the limit of results
print("################################################ \n We found a total of ", len(cmr_urls), "OPeNDAP URLS!!!\n################################################")
################################################
We found a total of 133 OPeNDAP URLS!!!
################################################
Authenticate#
To hide the abstraction, we will use earthaccess to authenticate, and create a CachedSession to consolidate all metadata pre-download. This approach can result in speed up of 100x when aggregating all URLs into a single Xarray Dataset object.
auth = earthaccess.login(strategy="netrc", persist=True) # you will be promted to add your EDL credentials
# pass Token Authorization to a new Session.
cache_kwargs={'cache_name':'data/ECCOv4'}
my_session = create_session(use_cache=True, session=auth.get_session(), cache_kwargs=cache_kwargs)
my_session.cache.clear()
Explore Variables in collection and filter down to keep only desirable ones#
Here we demonstrate pydap as a exploratory metadata tool. Without downloading any array data, will let us know:
Variables Names, sizes, attributes
Dimensions and Coordinates
Allow us to construct an OPeNDAP Constraint Expression that subsets by variable name
pyds = open_url(cmr_urls[0].replace("https", "dap4"), session=my_session)
pyds.tree()
.OCEAN_TEMPERATURE_SALINITY_mon_mean_2006-12_ECCO_V4r4_native_llc0090.nc
├──XG
├──Zp1
├──Zl
├──YC
├──XC
├──SALT
├──YG
├──XC_bnds
├──Zu
├──THETA
├──Z_bnds
├──YC_bnds
├──time_bnds
├──Z
├──i
├──i_g
├──j
├──j_g
├──k
├──k_l
├──k_p1
├──k_u
├──nb
├──nv
├──tile
└──time
Note
NetCDF files are self describing, and often contain all information necessary to interprete the data. At the collection level (dataset), much of this information is duplicated across each file. With OPeNDAP we can reduce the amount of data processed/download with Constraint Expressions.
We are interested only in Salinity (SALT) and Temperature (THETA), and the necessary extra dimensions/coordinates to work with this arrays. We use PyDAP to help us figure this out
pyds['THETA'].dims
['/time', '/k', '/tile', '/j', '/i']
pyds['SALT'].dims
['/time', '/k', '/tile', '/j', '/i']
pyds['THETA'].coordinates, pyds["SALT"].coordinates
('YC Z XC time', 'YC Z XC time')
Filter by Variable Names (CEs)#
We use the information on dimensions and coordinates, to further filter all possible variables in the dataset. We accomplish this by adding a query parameter of the form:
<base_url> + ?dap4.ce=/var_name1;...
Any variable name included in the query expression, will be processed and all others will be ignored.
Note
You can also construct the full URL with constraint expressions interactively, by selecting manually the variables on the datasets’s Data Request Form and selecting Copy (Raw) Data URL. To go to this page, you need to append a .dmr to each opendap url and insert it into a browser.
dims = pyds['THETA'].dims
Vars = ['/THETA', "/SALT"] + dims
# Below construct Contraint Expression
CE = "?dap4.ce="+(";").join(Vars)
print("Constraint Expression: ", CE)
Constraint Expression: ?dap4.ce=/THETA;/SALT;/time;/k;/tile;/j;/i
DAP4 URLs#
To tell pydap and Xarray which OPeNDAP protocol to use to stream data, we can replace the scheme of each url with dap4. DAP4 is a relatively new protocol (compared to DAP2, and it is widely used across NASA.
ECCO_urls = [url.replace("https", "dap4") + CE for url in cmr_urls]
ECCO_urls[0]
'dap4://opendap.earthdata.nasa.gov/providers/POCLOUD/collections/ECCO%20Ocean%20Temperature%20and%20Salinity%20-%20Monthly%20Mean%20llc90%20Grid%20(Version%204%20Release%204)/granules/OCEAN_TEMPERATURE_SALINITY_mon_mean_2006-12_ECCO_V4r4_native_llc0090?dap4.ce=/THETA;/SALT;/time;/k;/tile;/j;/i'
CubedSphere: Visualizing Depth in the native grid#
Before continuing, we explore the complex topology of the ECCO dataset via the Grid file.
Note
Many of the coordinate variables in the Temperature/Salinity file are also present in the Grid file.
ECCO data is defined on a Cube Sphere – meaning the horizontal grid contains an extra dimension: tile or face. You can inspect the data in its native grid by plotting all horizontal data onto a grid. For that, we look at the Depth variable. This is NOT included in same collection as Temperature. Below we provide the Cloud OPeNDAP URL, which can be queried from the CMR or Earthdata Search.
## Concept collection ID for ECCO grid
grid_ccid = "C2013557893-POCLOUD"
Grid_url = get_cmr_urls(ccid=grid_ccid)[0] # only one element
Grid_url = Grid_url.replace("https", "dap4")
grid_ds = open_url(Grid_url)
grid_ds
<DatasetType with children 'dxC', 'PHrefF', 'XG', 'dyG', 'rA', 'hFacS', 'Zp1', 'Zl', 'rAw', 'dxG', 'maskW', 'YC', 'XC', 'maskS', 'YG', 'hFacC', 'drC', 'drF', 'XC_bnds', 'Zu', 'Z_bnds', 'YC_bnds', 'PHrefC', 'rAs', 'Depth', 'dyC', 'SN', 'rAz', 'maskC', 'CS', 'hFacW', 'Z', 'i', 'i_g', 'j', 'j_g', 'k', 'k_l', 'k_p1', 'k_u', 'nb', 'nv', 'tile'>
Note
The coordinates for THETA and SALT are present in the single file for Grid. And so we will not include them in our workflow until the end
Download a single variable#
With pure PyDAP, slicing an array triggers the download of that array. We illustrate that below
%%time
Depth = grid_ds["Depth"][:].data # <-------- LOADS DATA as in-memory numpy array
CPU times: user 76.9 ms, sys: 14.3 ms, total: 91.2 ms
Wall time: 6.33 s
Depth.shape
(13, 90, 90)
Variable = [Depth[i] for i in range(13)]
clevels = np.linspace(0, 6000, 100)
cMap = 'Greys_r'
fig, axes = plt.subplots(nrows=5, ncols=5, figsize=(8, 8), gridspec_kw={'hspace':0.01, 'wspace':0.01})
AXES = [
axes[4, 0], axes[3, 0], axes[2, 0], axes[4, 1], axes[3, 1], axes[2, 1],
axes[1, 1],
axes[1, 2], axes[1, 3], axes[1, 4], axes[0, 2], axes[0, 3], axes[0, 4],
]
for i in range(len(AXES)):
AXES[i].contourf(Variable[i], clevels, cmap=cMap)
for ax in np.ravel(axes):
ax.axis('off')
plt.setp(ax.get_xticklabels(), visible=False)
plt.setp(ax.get_yticklabels(), visible=False)
plt.show()
Fig. 1. Depth plotted on a horizontal layout. Data on tiles 0-5 follow C-ordering, whereas data on tiles 7-13 follow F-ordering. Data on the arctic cap, is defined on a polar coordinate grid.
Plot with corrected Topology
fig, axes = plt.subplots(nrows=4, ncols=4, figsize=(8, 8), gridspec_kw={'hspace':0.01, 'wspace':0.01})
AXES_NR = [
axes[3, 0], axes[2, 0], axes[1, 0], axes[3, 1], axes[2, 1], axes[1, 1],
]
AXES_CAP = [axes[0, 0]]
AXES_R = [
axes[1, 2], axes[2, 2], axes[3, 2], axes[1, 3], axes[2, 3], axes[3, 3],
]
for i in range(len(AXES_NR)):
AXES_NR[i].contourf(Variable[i], clevels, cmap=cMap)
for i in range(len(AXES_CAP)):
AXES_CAP[i].contourf(np.transpose(Variable[6])[:, ::-1], clevels, cmap=cMap)
for i in range(len(AXES_R)):
AXES_R[i].contourf(np.transpose(Variable[7+i])[::-1, :], clevels, cmap=cMap)
for ax in np.ravel(axes):
ax.axis('off')
plt.setp(ax.get_xticklabels(), visible=False)
plt.setp(ax.get_yticklabels(), visible=False)
plt.show()
Fig. 2. Depth plotted on a horizontal layout that approximates lat-lon display. Data on the arctic cap, however, remains on a polar coordinate grid.
Consolidate Metadata#
Aggregating multiple remote files at the collection level requires persistent internet connection. The pydap backend allows to download and store the metadata required by xarray locally as a sqlite3 database, and this database can be used as session manager (for futher downloads). Creating this databaset can be done once, and reused, version controlled, etc. Reusing this database can cut the time to generate the aggregated dataset view from minutes to seconds.
Persisting metadata for later reuse#
When executing consolidate_metadata below, a sqlite3 database is created in the default directory specified by the cache_name, in the CachedSession. This metadata file can be stored persistently, and version-controlled, to avoid running consolidate metadata everytime.
Note
To persist the metadata database, avoid executing session.cache.clear(). You can also place the .sqlite in a version controlled repository local to your workflow that can track any changes to it. With this, re-executing consolidate_metadata will exit in <1 second.
Warning
After creatng the sqlite database, you can access it across Kernel restarts by simply creating the session as done above, choosing the cache_name that matches the name of the sqlite database. BUT you will may also need to re-execute consolidate_metadata, as in some cases consolidate_metadata creates a special key_fn to reuse the cache associated with certain dimensions across urls. This behavior will go away in a future release.
%%time
consolidate_metadata(ECCO_urls, session=my_session, concat_dim='time')
datacube has dimensions ['i[0:1:89]', 'j[0:1:89]', 'k[0:1:49]', 'tile[0:1:12]'] , and concat dim: `['time']`
CPU times: user 2.63 s, sys: 785 ms, total: 3.42 s
Wall time: 1min 16s
len(my_session.cache.urls())
268
Dataset aggregation via Xarray#
Before creating the aggregated dataset, we know we are interested in:
Surface data (a single element
k=0, wherekis the dimension along the vertical axis).Tiles
2,6, and10, cover the North Atlantic Ocean (see depth plots below).
We want the OPeNDAP server to do the subsetting for us, reducing the amount of data downloaded. As such, we need to define the chunking that will match our server request. See below
A note about chunks#
With OPeNDAP, any download from a granule should be considered as a single chunk, from the perspective of Xarray. The DAP4 protocol does send chunk responses over the web, but these are assembled by pydap in memory as individual numpy arrays. You can see the perspective of Xarray on the remote chunking by passing a chunks={} argument when opening the dataset.
%%time
ds = xr.open_mfdataset(
ECCO_urls,
engine='pydap',
session=my_session,
parallel=True,
combine='nested',
concat_dim='time',
chunks={},
)
ds
CPU times: user 4.05 s, sys: 477 ms, total: 4.52 s
Wall time: 4.81 s
<xarray.Dataset> Size: 6GB
Dimensions: (time: 133, k: 50, tile: 13, j: 90, i: 90)
Coordinates:
* time (time) datetime64[ns] 1kB 2006-12-16T12:00:00 ... 2017-12-16T06:...
* k (k) int32 200B 0 1 2 3 4 5 6 7 8 9 ... 41 42 43 44 45 46 47 48 49
* tile (tile) int32 52B 0 1 2 3 4 5 6 7 8 9 10 11 12
* j (j) int32 360B 0 1 2 3 4 5 6 7 8 9 ... 81 82 83 84 85 86 87 88 89
* i (i) int32 360B 0 1 2 3 4 5 6 7 8 9 ... 81 82 83 84 85 86 87 88 89
Data variables:
SALT (time, k, tile, j, i) float32 3GB dask.array<chunksize=(1, 50, 13, 90, 90), meta=np.ndarray>
THETA (time, k, tile, j, i) float32 3GB dask.array<chunksize=(1, 50, 13, 90, 90), meta=np.ndarray>
Attributes: (12/62)
acknowledgement: This research was carried out by the Jet...
author: Ian Fenty and Ou Wang
cdm_data_type: Grid
comment: Fields provided on the curvilinear lat-l...
Conventions: CF-1.8, ACDD-1.3
coordinates_comment: Note: the global 'coordinates' attribute...
... ...
time_coverage_duration: P1M
time_coverage_end: 2007-01-01T00:00:00
time_coverage_resolution: P1M
time_coverage_start: 2006-12-01T00:00:00
title: ECCO Ocean Temperature and Salinity - Mo...
uuid: f32555f0-4181-11eb-bed0-0cc47a3f47a3Xarray treats the remote data variable as an individual chunk, no matter the size of the OPeNDAP subset#
ds['THETA']
<xarray.DataArray 'THETA' (time: 133, k: 50, tile: 13, j: 90, i: 90)> Size: 3GB
dask.array<concatenate, shape=(133, 50, 13, 90, 90), dtype=float32, chunksize=(1, 50, 13, 90, 90), chunktype=numpy.ndarray>
Coordinates:
* time (time) datetime64[ns] 1kB 2006-12-16T12:00:00 ... 2017-12-16T06:...
* k (k) int32 200B 0 1 2 3 4 5 6 7 8 9 ... 41 42 43 44 45 46 47 48 49
* tile (tile) int32 52B 0 1 2 3 4 5 6 7 8 9 10 11 12
* j (j) int32 360B 0 1 2 3 4 5 6 7 8 9 ... 81 82 83 84 85 86 87 88 89
* i (i) int32 360B 0 1 2 3 4 5 6 7 8 9 ... 81 82 83 84 85 86 87 88 89
Attributes:
long_name: Potential temperature
units: degree_C
coverage_content_type: modelResult
standard_name: sea_water_potential_temperature
comment: Sea water potential temperature is the temperatur...
valid_min: -2.2909388542175293
valid_max: 36.032955169677734
origname: THETA
fullnamepath: /THETA
Maps: ()For Chunk Considerations#
Download performance is a game of Chunk Sizes. For OPeNDAP, this concerns the size of the slice, which is bounded from above by the size of the granule (at the variable level). However, if the chunks are too small, the download will take forever. Too large, and you will download too much data from S3 (assuming this tutorial is being run outside of the region). In that last scenario (data is on S3), egress charges are important, and we find ourselves with the different choices for our approach to download data:
Minimize the amount of data leaving the cloud. This is accompished by the following chunking:
chunks={'k':1, 'tile':1}. However, since we will download 3 tiles per granule, chunking the dataset will multiply by a factor of 3 the amount of request made to the remote OPeNDAP server, only for the data to be assembled by Xarray afterwards, once downloaded locally and before storing.Optimized performance. In this case, this is done by choosing the following chunking:
chunks={'k':1}. It will download all the tiles per granule, and the top level only. All these are easily subsetted by Xarray once data is downloaded. This option also keeps the number of requests to the OPeNDAP server at a minimum.
Warning
Even with the optimized performance approach, this may not be fast. Download speed will depend on internet connection, or data locality. Overall, the number of request to the server will be ~600 in this scenario, all behind some form of authentication.
%%time
ds = xr.open_mfdataset(
ECCO_urls,
engine='pydap',
session=my_session,
parallel=True,
combine='nested',
concat_dim='time',
chunks={'k':1},
)
ds
CPU times: user 3.75 s, sys: 402 ms, total: 4.15 s
Wall time: 4.54 s
<xarray.Dataset> Size: 6GB
Dimensions: (time: 133, k: 50, tile: 13, j: 90, i: 90)
Coordinates:
* time (time) datetime64[ns] 1kB 2006-12-16T12:00:00 ... 2017-12-16T06:...
* k (k) int32 200B 0 1 2 3 4 5 6 7 8 9 ... 41 42 43 44 45 46 47 48 49
* tile (tile) int32 52B 0 1 2 3 4 5 6 7 8 9 10 11 12
* j (j) int32 360B 0 1 2 3 4 5 6 7 8 9 ... 81 82 83 84 85 86 87 88 89
* i (i) int32 360B 0 1 2 3 4 5 6 7 8 9 ... 81 82 83 84 85 86 87 88 89
Data variables:
SALT (time, k, tile, j, i) float32 3GB dask.array<chunksize=(1, 1, 13, 90, 90), meta=np.ndarray>
THETA (time, k, tile, j, i) float32 3GB dask.array<chunksize=(1, 1, 13, 90, 90), meta=np.ndarray>
Attributes: (12/62)
acknowledgement: This research was carried out by the Jet...
author: Ian Fenty and Ou Wang
cdm_data_type: Grid
comment: Fields provided on the curvilinear lat-l...
Conventions: CF-1.8, ACDD-1.3
coordinates_comment: Note: the global 'coordinates' attribute...
... ...
time_coverage_duration: P1M
time_coverage_end: 2007-01-01T00:00:00
time_coverage_resolution: P1M
time_coverage_start: 2006-12-01T00:00:00
title: ECCO Ocean Temperature and Salinity - Mo...
uuid: f32555f0-4181-11eb-bed0-0cc47a3f47a3Include the grid data#
Now we will access the remote Grid file to include some additional grid variables. This new individual dataset will be merged with our temperature/salinity dataset.
Note
We will NOT use the same CachedSession object. Instead, we will initiate a different session object. Note the speed it takes may be similar (same order) as aggregating 100s of urls. This is what consolidate metadata does.
%%time
### create anew session
session = create_session(session=auth.get_session())
### create an individual dataset with only the variables of interest
grid_ds = xr.open_dataset(Grid_url+"?dap4.ce=/Depth;/XC;/YC;/i;/j;/tile", engine='pydap', session=session)
grid_ds
CPU times: user 83.2 ms, sys: 11.8 ms, total: 95 ms
Wall time: 8.11 s
<xarray.Dataset> Size: 1MB
Dimensions: (tile: 13, j: 90, i: 90)
Coordinates:
* tile (tile) int32 52B 0 1 2 3 4 5 6 7 8 9 10 11 12
* j (j) int32 360B 0 1 2 3 4 5 6 7 8 9 ... 81 82 83 84 85 86 87 88 89
* i (i) int32 360B 0 1 2 3 4 5 6 7 8 9 ... 81 82 83 84 85 86 87 88 89
YC (tile, j, i) float32 421kB ...
XC (tile, j, i) float32 421kB ...
Data variables:
Depth (tile, j, i) float32 421kB ...
Attributes: (12/58)
acknowledgement: This research was carried out by the Jet...
author: Ian Fenty and Ou Wang
cdm_data_type: Grid
comment: Fields provided on the curvilinear lat-l...
Conventions: CF-1.8, ACDD-1.3
coordinates_comment: Note: the global 'coordinates' attribute...
... ...
references: ECCO Consortium, Fukumori, I., Wang, O.,...
source: The ECCO V4r4 state estimate was produce...
standard_name_vocabulary: NetCDF Climate and Forecast (CF) Metadat...
summary: This dataset provides geometric paramete...
title: ECCO Geometry Parameters for the Lat-Lon...
uuid: 87ff7d24-86e5-11eb-9c5f-f8f21e2ee3e0#### Combine the two datasets into a single dataset reference
nds = xr.merge([ds, grid_ds])
nds
<xarray.Dataset> Size: 6GB
Dimensions: (time: 133, k: 50, tile: 13, j: 90, i: 90)
Coordinates:
* time (time) datetime64[ns] 1kB 2006-12-16T12:00:00 ... 2017-12-16T06:...
* k (k) int32 200B 0 1 2 3 4 5 6 7 8 9 ... 41 42 43 44 45 46 47 48 49
* tile (tile) int32 52B 0 1 2 3 4 5 6 7 8 9 10 11 12
* j (j) int32 360B 0 1 2 3 4 5 6 7 8 9 ... 81 82 83 84 85 86 87 88 89
* i (i) int32 360B 0 1 2 3 4 5 6 7 8 9 ... 81 82 83 84 85 86 87 88 89
YC (tile, j, i) float32 421kB ...
XC (tile, j, i) float32 421kB ...
Data variables:
SALT (time, k, tile, j, i) float32 3GB dask.array<chunksize=(1, 1, 13, 90, 90), meta=np.ndarray>
THETA (time, k, tile, j, i) float32 3GB dask.array<chunksize=(1, 1, 13, 90, 90), meta=np.ndarray>
Depth (tile, j, i) float32 421kB ...
Attributes: (12/62)
acknowledgement: This research was carried out by the Jet...
author: Ian Fenty and Ou Wang
cdm_data_type: Grid
comment: Fields provided on the curvilinear lat-l...
Conventions: CF-1.8, ACDD-1.3
coordinates_comment: Note: the global 'coordinates' attribute...
... ...
time_coverage_duration: P1M
time_coverage_end: 2007-01-01T00:00:00
time_coverage_resolution: P1M
time_coverage_start: 2006-12-01T00:00:00
title: ECCO Ocean Temperature and Salinity - Mo...
uuid: f32555f0-4181-11eb-bed0-0cc47a3f47a3Stream all surface data with OPeNDAP, subset tiles, and store data locally with Xarray and PyDAP as engine#
%%time
nds.isel(k=0, tile=[2,6,10]).to_netcdf("data/ECCOv4_NA.nc4")
CPU times: user 373 ms, sys: 45.1 ms, total: 418 ms
Wall time: 20.7 s
---------------------------------------------------------------------------
ResponseError Traceback (most recent call last)
ResponseError: too many 500 error responses
The above exception was the direct cause of the following exception:
MaxRetryError Traceback (most recent call last)
File ~/mambaforge/envs/pydap_docs/lib/python3.11/site-packages/requests/adapters.py:696, in HTTPAdapter.send(self, request, stream, timeout, verify, cert, proxies)
695 try:
--> 696 resp = conn.urlopen(
697 method=request.method,
698 url=url,
699 body=request.body, # type: ignore[arg-type] # urllib3 stubs don't accept Iterable[bytes | str]
700 headers=request.headers, # type: ignore[arg-type] # urllib3#3072
701 redirect=False,
702 assert_same_host=False,
703 preload_content=False,
704 decode_content=False,
705 retries=self.max_retries,
706 timeout=resolved_timeout,
707 chunked=chunked,
708 )
710 except (ProtocolError, OSError) as err:
File ~/mambaforge/envs/pydap_docs/lib/python3.11/site-packages/urllib3/connectionpool.py:955, in HTTPConnectionPool.urlopen(self, method, url, body, headers, retries, redirect, assert_same_host, timeout, pool_timeout, release_conn, chunked, body_pos, preload_content, decode_content, **response_kw)
954 log.debug("Retry: %s", url)
--> 955 return self.urlopen(
956 method,
957 url,
958 body,
959 headers,
960 retries=retries,
961 redirect=redirect,
962 assert_same_host=assert_same_host,
963 timeout=timeout,
964 pool_timeout=pool_timeout,
965 release_conn=release_conn,
966 chunked=chunked,
967 body_pos=body_pos,
968 preload_content=preload_content,
969 decode_content=decode_content,
970 **response_kw,
971 )
973 return response
File ~/mambaforge/envs/pydap_docs/lib/python3.11/site-packages/urllib3/connectionpool.py:955, in HTTPConnectionPool.urlopen(self, method, url, body, headers, retries, redirect, assert_same_host, timeout, pool_timeout, release_conn, chunked, body_pos, preload_content, decode_content, **response_kw)
954 log.debug("Retry: %s", url)
--> 955 return self.urlopen(
956 method,
957 url,
958 body,
959 headers,
960 retries=retries,
961 redirect=redirect,
962 assert_same_host=assert_same_host,
963 timeout=timeout,
964 pool_timeout=pool_timeout,
965 release_conn=release_conn,
966 chunked=chunked,
967 body_pos=body_pos,
968 preload_content=preload_content,
969 decode_content=decode_content,
970 **response_kw,
971 )
973 return response
File ~/mambaforge/envs/pydap_docs/lib/python3.11/site-packages/urllib3/connectionpool.py:945, in HTTPConnectionPool.urlopen(self, method, url, body, headers, retries, redirect, assert_same_host, timeout, pool_timeout, release_conn, chunked, body_pos, preload_content, decode_content, **response_kw)
944 try:
--> 945 retries = retries.increment(method, url, response=response, _pool=self)
946 except MaxRetryError:
File ~/mambaforge/envs/pydap_docs/lib/python3.11/site-packages/urllib3/util/retry.py:543, in Retry.increment(self, method, url, response, error, _pool, _stacktrace)
542 reason = error or ResponseError(cause)
--> 543 raise MaxRetryError(_pool, url, reason) from reason # type: ignore[arg-type]
545 log.debug("Incremented Retry for (url='%s'): %r", url, new_retry)
MaxRetryError: HTTPSConnectionPool(host='opendap.earthdata.nasa.gov', port=443): Max retries exceeded with url: /providers/POCLOUD/collections/ECCO%20Ocean%20Temperature%20and%20Salinity%20-%20Monthly%20Mean%20llc90%20Grid%20(Version%204%20Release%204)/granules/OCEAN_TEMPERATURE_SALINITY_mon_mean_2014-08_ECCO_V4r4_native_llc0090.dap?dap4.ce=/THETA%5B0:1:0%5D%5B0:1:0%5D%5B0:1:12%5D%5B0:1:89%5D%5B0:1:89%5D&dap4.checksum=true (Caused by ResponseError('too many 500 error responses'))
During handling of the above exception, another exception occurred:
RetryError Traceback (most recent call last)
Cell In[24], line 1
----> 1 get_ipython().run_cell_magic('time', '', 'nds.isel(k=0, tile=[2,6,10]).to_netcdf("data/ECCOv4_NA.nc4")\n')
File <timed eval>:1
----> 1 'Could not get source, probably due dynamically evaluated source code.'
File ~/mambaforge/envs/pydap_docs/lib/python3.11/site-packages/xarray/core/dataset.py:2134, in Dataset.to_netcdf(self, path, mode, format, group, engine, encoding, unlimited_dims, compute, invalid_netcdf, auto_complex)
2130 if encoding is None:
2131 encoding = {}
2132 from xarray.backends.writers import to_netcdf
2133
-> 2134 return to_netcdf( # type: ignore[return-value] # mypy cannot resolve the overloads:(
2135 self,
2136 path,
2137 mode=mode,
File ~/mambaforge/envs/pydap_docs/lib/python3.11/site-packages/xarray/backends/writers.py:450, in to_netcdf(dataset, path_or_file, mode, format, group, engine, encoding, unlimited_dims, compute, multifile, invalid_netcdf, auto_complex)
447 if multifile:
448 return writer, store
--> 450 writes = writer.sync(compute=compute)
452 finally:
453 if not multifile and not autoclose: # type: ignore[redundant-expr,unused-ignore]
File ~/mambaforge/envs/pydap_docs/lib/python3.11/site-packages/xarray/backends/common.py:420, in ArrayWriter.sync(self, compute, chunkmanager_store_kwargs)
417 if chunkmanager_store_kwargs is None:
418 chunkmanager_store_kwargs = {}
--> 420 delayed_store = chunkmanager.store(
421 self.sources,
422 self.targets,
423 lock=self.lock,
424 compute=compute,
425 flush=True,
426 regions=self.regions,
427 **chunkmanager_store_kwargs,
428 )
429 self.sources = []
430 self.targets = []
File ~/mambaforge/envs/pydap_docs/lib/python3.11/site-packages/xarray/namedarray/daskmanager.py:247, in DaskManager.store(self, sources, targets, **kwargs)
239 def store(
240 self,
241 sources: Any | Sequence[Any],
242 targets: Any,
243 **kwargs: Any,
244 ) -> Any:
245 from dask.array import store
--> 247 return store(
248 sources=sources,
249 targets=targets,
250 **kwargs,
251 )
File ~/mambaforge/envs/pydap_docs/lib/python3.11/site-packages/dask/array/core.py:1220, in store(***failed resolving arguments***)
1217 if not return_stored:
1218 import dask
-> 1220 dask.compute(arrays, **kwargs)
1221 return None
1222 else:
File ~/mambaforge/envs/pydap_docs/lib/python3.11/site-packages/dask/base.py:685, in compute(traverse, optimize_graph, scheduler, get, *args, **kwargs)
682 expr = expr.optimize()
683 keys = list(flatten(expr.__dask_keys__()))
--> 685 results = schedule(expr, keys, **kwargs)
687 return repack(results)
File ~/mambaforge/envs/pydap_docs/lib/python3.11/site-packages/xarray/core/indexing.py:686, in ImplicitToExplicitIndexingAdapter.__array__(self, dtype, copy)
682 def __array__(
683 self, dtype: DTypeLike | None = None, /, *, copy: bool | None = None
684 ) -> np.ndarray:
685 if Version(np.__version__) >= Version("2.0.0"):
--> 686 return np.asarray(self.get_duck_array(), dtype=dtype, copy=copy)
687 else:
688 return np.asarray(self.get_duck_array(), dtype=dtype)
File ~/mambaforge/envs/pydap_docs/lib/python3.11/site-packages/xarray/core/indexing.py:691, in ImplicitToExplicitIndexingAdapter.get_duck_array(self)
690 def get_duck_array(self):
--> 691 return self.array.get_duck_array()
File ~/mambaforge/envs/pydap_docs/lib/python3.11/site-packages/xarray/core/indexing.py:924, in CopyOnWriteArray.get_duck_array(self)
923 def get_duck_array(self):
--> 924 return self.array.get_duck_array()
File ~/mambaforge/envs/pydap_docs/lib/python3.11/site-packages/xarray/coding/common.py:80, in _ElementwiseFunctionArray.get_duck_array(self)
79 def get_duck_array(self):
---> 80 return self.func(self.array.get_duck_array())
File ~/mambaforge/envs/pydap_docs/lib/python3.11/site-packages/xarray/core/indexing.py:764, in LazilyIndexedArray.get_duck_array(self)
761 from xarray.backends.common import BackendArray
763 if isinstance(self.array, BackendArray):
--> 764 array = self.array[self.key]
765 else:
766 array = apply_indexer(self.array, self.key)
File ~/mambaforge/envs/pydap_docs/lib/python3.11/site-packages/xarray/backends/pydap_.py:51, in PydapArrayWrapper.__getitem__(self, key)
50 def __getitem__(self, key):
---> 51 return indexing.explicit_indexing_adapter(
52 key, self.shape, indexing.IndexingSupport.BASIC, self._getitem
53 )
File ~/mambaforge/envs/pydap_docs/lib/python3.11/site-packages/xarray/core/indexing.py:1156, in explicit_indexing_adapter(key, shape, indexing_support, raw_indexing_method)
1134 """Support explicit indexing by delegating to a raw indexing method.
1135
1136 Outer and/or vectorized indexers are supported by indexing a second time
(...) 1153 Indexing result, in the form of a duck numpy-array.
1154 """
1155 raw_key, numpy_indices = decompose_indexer(key, shape, indexing_support)
-> 1156 result = raw_indexing_method(raw_key.tuple)
1157 if numpy_indices.tuple:
1158 # index the loaded duck array
1159 indexable = as_indexable(result)
File ~/mambaforge/envs/pydap_docs/lib/python3.11/site-packages/xarray/backends/pydap_.py:56, in PydapArrayWrapper._getitem(self, key)
55 def _getitem(self, key):
---> 56 result = robust_getitem(self.array, key, catch=ValueError)
57 result = np.asarray(result.data)
58 axis = tuple(n for n, k in enumerate(key) if isinstance(k, integer_types))
File ~/mambaforge/envs/pydap_docs/lib/python3.11/site-packages/xarray/backends/common.py:300, in robust_getitem(array, key, catch, max_retries, initial_delay)
298 for n in range(max_retries + 1):
299 try:
--> 300 return array[key]
301 except catch:
302 if n == max_retries:
File ~/mambaforge/envs/pydap_docs/lib/python3.11/site-packages/pydap/model.py:538, in BaseType.__getitem__(self, index)
535 return out
537 out = copy.copy(self)
--> 538 data = self._get_data_index(index)
540 # Check if index is a full slice (e.g., [:], ..., or tuple of all slice(None))
541 if (
542 index == slice(None)
543 or index == Ellipsis
544 or (isinstance(index, tuple) and all(i == slice(None) for i in index))
545 ):
File ~/mambaforge/envs/pydap_docs/lib/python3.11/site-packages/pydap/model.py:587, in BaseType._get_data_index(self, index)
583 return np.vectorize(decode_np_strings, otypes=self._data.dtype.char)(
584 self._data[index]
585 )
586 else:
--> 587 return self._get_data()[index]
File ~/mambaforge/envs/pydap_docs/lib/python3.11/site-packages/pydap/handlers/dap.py:599, in BaseProxyDap4.__getitem__(self, index, build_only)
597 if self.id not in _vars and "debug" not in self.session.cache.cache_name:
598 cache_kwargs = {"skip": True}
--> 599 r = GET(
600 url,
601 self.application,
602 self.session,
603 timeout=self.timeout,
604 verify=self.verify,
605 get_kwargs=self.get_kwargs,
606 cache_kwargs=cache_kwargs,
607 )
609 dataset = UNPACKDAP4DATA(r, self.checksums, self.user_charset).dataset
610 self._data = dataset[self.id]._data
File ~/mambaforge/envs/pydap_docs/lib/python3.11/site-packages/pydap/net.py:74, in GET(url, application, session, timeout, verify, use_cache, session_kwargs, cache_kwargs, get_kwargs)
72 _, _, path, _, query, fragment = urlparse(url)
73 url = urlunparse(("", "", path, "", _quote(query), fragment))
---> 74 res = create_request(
75 url,
76 application=application,
77 session=session,
78 timeout=timeout,
79 verify=verify,
80 session_kwargs=session_kwargs,
81 cache_kwargs=cache_kwargs,
82 get_kwargs=get_kwargs,
83 )
84 if isinstance(res, webob_Request):
85 res = get_response(res, application, verify=verify)
File ~/mambaforge/envs/pydap_docs/lib/python3.11/site-packages/pydap/net.py:172, in create_request(url, application, timeout, verify, session, session_kwargs, cache_kwargs, get_kwargs)
170 session_kwargs = session_kwargs or {}
171 try:
--> 172 req = get_request(
173 url,
174 base_session=session,
175 timeout=timeout,
176 verify=verify,
177 get_kwargs=get_kwargs,
178 backend=cache_kwargs.pop("backend", None),
179 cache_name=cache_kwargs.pop("http-cache", None),
180 backend_options=cache_kwargs.pop("backend_options", None),
181 cache_kwargs=cache_kwargs,
182 session_kwargs=session_kwargs,
183 )
184 req.raise_for_status()
185 return req
File ~/mambaforge/envs/pydap_docs/lib/python3.11/site-packages/pydap/net.py:331, in get_request(url, base_session, timeout, verify, get_kwargs, backend, cache_name, backend_options, cache_kwargs, session_kwargs)
329 if should_skip_cache(url, s) or skip:
330 with s.cache_disabled():
--> 331 req = s.get(
332 url,
333 timeout=timeout,
334 verify=verify,
335 allow_redirects=True,
336 **(get_kwargs or {}),
337 )
338 else:
339 req = s.get(
340 url,
341 timeout=timeout,
(...) 344 **(get_kwargs or {}),
345 )
File ~/mambaforge/envs/pydap_docs/lib/python3.11/site-packages/requests_cache/session.py:133, in CacheMixin.get(self, url, params, **kwargs)
131 def get(self, url: str, params=None, **kwargs) -> AnyResponse: # type: ignore
132 kwargs.setdefault('allow_redirects', True)
--> 133 return self.request('GET', url, params=params, **kwargs)
File ~/mambaforge/envs/pydap_docs/lib/python3.11/site-packages/requests_cache/session.py:189, in CacheMixin.request(self, method, url, headers, expire_after, only_if_cached, refresh, force_refresh, *args, **kwargs)
187 headers = set_request_headers(headers, expire_after, only_if_cached, refresh, force_refresh)
188 with patch_form_boundary() if kwargs.get('files') else nullcontext():
--> 189 return super().request(method, url, *args, headers=headers, **kwargs)
File ~/mambaforge/envs/pydap_docs/lib/python3.11/site-packages/requests/sessions.py:651, in Session.request(self, method, url, params, data, headers, cookies, files, auth, timeout, allow_redirects, proxies, hooks, stream, verify, cert, json)
646 send_kwargs = {
647 "timeout": timeout,
648 "allow_redirects": allow_redirects,
649 }
650 send_kwargs.update(settings)
--> 651 resp = self.send(prep, **send_kwargs)
653 return resp
File ~/mambaforge/envs/pydap_docs/lib/python3.11/site-packages/requests_cache/session.py:255, in CacheMixin.send(self, request, expire_after, only_if_cached, refresh, force_refresh, **kwargs)
253 response = self._resend(request, actions, cached_response, **kwargs) # type: ignore
254 elif actions.send_request:
--> 255 response = self._send_and_cache(request, actions, cached_response, **kwargs)
256 else:
257 response = cached_response # type: ignore # Guaranteed to be non-None by this point
File ~/mambaforge/envs/pydap_docs/lib/python3.11/site-packages/requests_cache/session.py:279, in CacheMixin._send_and_cache(self, request, actions, cached_response, **kwargs)
275 """Send a request and cache the response, unless disabled by settings or headers.
276 If applicable, also handle conditional requests.
277 """
278 request = actions.update_request(request)
--> 279 response = super().send(request, **kwargs)
280 actions.update_from_response(response)
282 if not actions.skip_write:
File ~/mambaforge/envs/pydap_docs/lib/python3.11/site-packages/requests/sessions.py:784, in Session.send(self, request, **kwargs)
781 start = preferred_clock()
783 # Send the request
--> 784 r = adapter.send(request, **kwargs)
786 # Total elapsed time of the request (approximately)
787 elapsed = preferred_clock() - start
File ~/mambaforge/envs/pydap_docs/lib/python3.11/site-packages/requests/adapters.py:720, in HTTPAdapter.send(self, request, stream, timeout, verify, cert, proxies)
717 raise ConnectTimeout(e, request=request)
719 if isinstance(e.reason, ResponseError):
--> 720 raise RetryError(e, request=request)
722 if isinstance(e.reason, _ProxyError):
723 raise ProxyError(e, request=request)
RetryError: HTTPSConnectionPool(host='opendap.earthdata.nasa.gov', port=443): Max retries exceeded with url: /providers/POCLOUD/collections/ECCO%20Ocean%20Temperature%20and%20Salinity%20-%20Monthly%20Mean%20llc90%20Grid%20(Version%204%20Release%204)/granules/OCEAN_TEMPERATURE_SALINITY_mon_mean_2014-08_ECCO_V4r4_native_llc0090.dap?dap4.ce=/THETA%5B0:1:0%5D%5B0:1:0%5D%5B0:1:12%5D%5B0:1:89%5D%5B0:1:89%5D&dap4.checksum=true (Caused by ResponseError('too many 500 error responses'))
Finally, inspect the downloaded data#
mds = xr.open_dataset("data/ECCOv4_NA.nc4")
mds
Variable = [mds['THETA'][0, i, :, :] for i in range(3)]
clevels = np.linspace(-5, 30, 100)
cMap='RdBu_r'
ocean_mask = mds["Depth"]>0
fig, axes = plt.subplots(nrows=2, ncols=2, figsize=(8, 8), gridspec_kw={'hspace':0.001, 'wspace':0.001})
AXES_NR = [
axes[1, 1],
]
AXES_CAP = [axes[0, 1]]
AXES_R = [
axes[1, 0],
]
for i in range(len(AXES_NR)):
ocean_mask.isel(tile=0).plot(ax=AXES_NR[i], cmap="Greys_r", add_colorbar=False)
Variable[0].where(ocean_mask.isel(tile=0)).plot(ax=AXES_NR[i], levels=clevels, cmap=cMap, add_colorbar=False)
for i in range(len(AXES_CAP)):
ocean_mask.isel(tile=1).transpose().plot(ax= AXES_CAP[i], cmap="Greys_r", add_colorbar=False, xincrease=False)
Variable[1].transpose().where(ocean_mask.isel(tile=1)).plot(ax=AXES_CAP[i], levels=clevels, cmap=cMap, add_colorbar=False, xincrease=False)
for i in range(len(AXES_R)):
# AXES_R[i].contourf(Variable[2].transpose()[::-1, :], clevels, cmap=cMap)
ocean_mask.isel(tile=2).transpose().plot(ax= AXES_R[i], cmap="Greys_r", add_colorbar=False, yincrease=False)
Variable[2].transpose().where(ocean_mask.isel(tile=2)).plot(ax=AXES_R[i], levels=clevels, cmap=cMap, add_colorbar=False, yincrease=False)
for ax in np.ravel(axes):
ax.axis('off')
plt.setp(ax.get_xticklabels(), visible=False)
plt.setp(ax.get_yticklabels(), visible=False)
plt.setp(ax.title, visible=False)
plt.show()
Fig. 3. Surface temperature, plotted on a horizontal layout that approximates lat-lon display. Data on the arctic cap, however, remains on a polar coordinate grid.