Acceso a ECCOv4 por medio de EarthData in the Cloud de la NASA#

Este tutorial demuestra acceso al producto ECCOv4 de un modelo numerico global. Para mayor informacion del producto de ECCO, puede ir aqui.

Requisitos

  1. Tener una cuenta de Earth Data Login por medio de la NASA.

  2. Tener un Token valido.

Tambien se puede utilizar el metodo de Nombre de Usuario / Contrasena descrito en el tutorial de Autenticacion

Objectivos

Utilizar pydap para demostrar

  • Acceso to archivos cientificos de la NASA por medio del uso de tokens y EarthData como metodo de autenticacion.

Algunas variables de interes son:

Autor: Miguel Jimenez-Urias, ‘24

import matplotlib.pyplot as plt
import numpy as np
from pydap.net import create_session
from pydap.client import open_url
import xarray as xr
import pydap
print("xarray version: ", xr.__version__)
print("pydap version: ", pydap.__version__)
xarray version:  2026.4.0
pydap version:  3.5.10.dev14+g40484efcb

Acceso a EARTHDATA#

El listado de las variables de este producto pueden encontrarse aqui. En este caso, accederemos a las variables en su malla original, el LL90.

Grid_url = 'https://opendap.earthdata.nasa.gov/providers/POCLOUD/collections/ECCO%20Geometry%20Parameters%20for%20the%20Lat-Lon-Cap%2090%20(llc90)%20Native%20Model%20Grid%20(Version%204%20Release%204)/granules/GRID_GEOMETRY_ECCO_V4r4_native_llc0090'

Autenticacion via .netrc#

Las credenciales son recuperadas automaticamente por pydap.

my_session = create_session()

Alternativa: El use the tokens#

session_extra = {"token": "YourToken"}

# initialize a requests.session object with the token headers. All handled by pydap.
my_session = create_session(session_kwargs=session_extra)

Accesso a los metadatos solamente, por medio de pydap#

pydap aprovecha el protocolo de OPeNDAP, el cual permite la separacion de los metadatos de los valores numericos. Esto permite una inspeccion remota de los datasets.

ds_grid = open_url(Grid_url, session=my_session, protocol="dap4")
ds_grid.tree()
.GRID_GEOMETRY_ECCO_V4r4_native_llc0090.nc
├──dxC
├──PHrefF
├──XG
├──dyG
├──rA
├──hFacS
├──Zp1
├──Zl
├──rAw
├──dxG
├──maskW
├──YC
├──XC
├──maskS
├──YG
├──hFacC
├──drC
├──drF
├──XC_bnds
├──Zu
├──Z_bnds
├──YC_bnds
├──PHrefC
├──rAs
├──Depth
├──dyC
├──SN
├──rAz
├──maskC
├──CS
├──hFacW
├──Z
├──i
├──i_g
├──j
├──j_g
├──k
├──k_l
├──k_p1
├──k_u
├──nb
├──nv
└──tile

Note

PyDAP accesa solo a los metadatos de archivo en el servidor de OPeNDAP, y ningun arreglo numerico ha sido descargardo hasta este punto!

Para descargar los arreglos numericos hay que indexar la variable

Esto es, de la siguiente manera:

# this fetches remote data into a pydap object container
pydap_Array = dataset[<VarName>][:]

donde <VarName> es el nombre de una de las variables. El producto sera una representation “en memoria” de tipo pydap.model.BaseType, el cual permite el acceso a los arreglos de numpy.

Para extraer los valores de la variable remota, hay que ejecutar el siguiente comando

# The `.data` command allows direct access to the Numpy array (e.g. for manipulation)
pydap_Array.data
# lets download some data
Depth = ds_grid['Depth'][:]
print(type(Depth))
<class 'pydap.model.BaseType'>
Depth.attributes
{'_FillValue': 9.969209968e+36,
 'long_name': 'model seafloor depth below ocean surface at rest',
 'units': 'm',
 'coordinate': 'XC YC',
 'coverage_content_type': 'modelResult',
 'standard_name': 'sea_floor_depth_below_geoid',
 'comment': "Model sea surface height (SSH) of 0m corresponds to an ocean surface at rest relative to the geoid. Depth corresponds to seafloor depth below geoid. Note: the MITgcm used by ECCO V4r4 implements 'partial cells' so the actual model seafloor depth may differ from the seafloor depth provided by the input bathymetry file.",
 'coordinates': 'YC XC',
 'origname': 'Depth',
 'fullnamepath': '/Depth',
 'Maps': (),
 '_DAP4_Checksum_CRC32': np.uint32(3811813944)}
Depth.shape, Depth.dims
((13, 90, 90), ['/tile', '/j', '/i'])

Visualizando el fondo oceanico Depth en la malla original del modelo

En este caso, el producto ECCO esta definido en una malla con topologia de un Cubo Esferico. De esta manera, la malla horizontal contiene una dimension extra llamada: tile o face. A continuacion visualizamos la variable en su topologia original

Variable = [Depth[i].data for i in range(13)]
clevels =  np.linspace(0, 6000, 100)
cMap = 'Greys_r'
fig, axes = plt.subplots(nrows=5, ncols=5, figsize=(8, 8), gridspec_kw={'hspace':0.01, 'wspace':0.01})
AXES = [
    axes[4, 0], axes[3, 0], axes[2, 0], axes[4, 1], axes[3, 1], axes[2, 1],
    axes[1, 1], 
    axes[1, 2], axes[1, 3], axes[1, 4], axes[0, 2], axes[0, 3], axes[0, 4],
]
for i in range(len(AXES)):
    AXES[i].contourf(Variable[i], clevels, cmap=cMap)

for ax in np.ravel(axes):
    ax.axis('off')
    plt.setp(ax.get_xticklabels(), visible=False)
    plt.setp(ax.get_yticklabels(), visible=False)

plt.show()
../_images/8deb991839e486f093d257b856cf5d089cf76987d9d0f66604c31bd6f61ab15d.png

Fig. 1. La variable Depth visualizada en una malla horitonzal. Tiles con valores 0-5 tienen un ordenamiento de valores (indiciales) C-ordering, mientras que cualquier arreglo numerico tiles 7-13 siguen el F-ordering. Cualquier arreglo numerico en el arctic cap, tiene un comportamiento de una coordenada polar.

Visualizacion con topologia corregida

fig, axes = plt.subplots(nrows=4, ncols=4, figsize=(8, 8), gridspec_kw={'hspace':0.01, 'wspace':0.01})
AXES_NR = [
    axes[3, 0], axes[2, 0], axes[1, 0], axes[3, 1], axes[2, 1], axes[1, 1],
]
AXES_CAP = [axes[0, 0]]
AXES_R = [
    axes[1, 2], axes[2, 2], axes[3, 2], axes[1, 3], axes[2, 3], axes[3, 3],
]
for i in range(len(AXES_NR)):
    AXES_NR[i].contourf(Variable[i], clevels, cmap=cMap)

for i in range(len(AXES_CAP)):
    AXES_CAP[i].contourf(Variable[6].transpose()[:, ::-1], clevels, cmap=cMap)

for i in range(len(AXES_R)):
    AXES_R[i].contourf(Variable[7+i].transpose()[::-1, :], clevels, cmap=cMap)

for ax in np.ravel(axes):
    ax.axis('off')
    plt.setp(ax.get_xticklabels(), visible=False)
    plt.setp(ax.get_yticklabels(), visible=False)

plt.show()
../_images/57aff92f93cde9a0259585ca5662b6f3488687e2a0b3ec72699e276a03b23821.png

Fig. 2. Visualizacion de la variable Depth en una malla horizontal que approxima una malla horizontal uniforme con ejes lat-lon. Sin embargo, cualquier arreglo numerico en el arctic cap, continua en una malla que approxima una malla en coordenadas polares

Utilizando xarray#

pydap se puede llamar dentro de xarray para acceder a los archivos expuestos mediante el servidor de OPeNDAP. En particular, xarray permite aggregar multiples archivos remotos de opendap. A continuacion demostramos un pequeno ejemplo.

Un URL individual –> un archivo remoto

Below we access remote Temperature and Salinity ECCO data with xarray via (internally) pydap.

baseURL = 'https://opendap.earthdata.nasa.gov/providers/POCLOUD/collections/'
Temp_Salt = "ECCO%20Ocean%20Temperature%20and%20Salinity%20-%20Monthly%20Mean%20llc90%20Grid%20(Version%204%20Release%204)/granules/OCEAN_TEMPERATURE_SALINITY_mon_mean_"
year = '2017-'
month = '01'
end_ = '_ECCO_V4r4_native_llc0090'
Temp_2017 = baseURL + Temp_Salt +year  + month + end_

# Convertimos el URL en un DAP4 opendap url
Temp_2017 = Temp_2017.replace("https", "dap4")
Temp_2017
'dap4://opendap.earthdata.nasa.gov/providers/POCLOUD/collections/ECCO%20Ocean%20Temperature%20and%20Salinity%20-%20Monthly%20Mean%20llc90%20Grid%20(Version%204%20Release%204)/granules/OCEAN_TEMPERATURE_SALINITY_mon_mean_2017-01_ECCO_V4r4_native_llc0090'
pyds = open_url(Temp_2017, session=my_session)
pyds.tree()
.OCEAN_TEMPERATURE_SALINITY_mon_mean_2017-01_ECCO_V4r4_native_llc0090.nc
├──XG
├──Zp1
├──Zl
├──YC
├──XC
├──SALT
├──YG
├──XC_bnds
├──Zu
├──THETA
├──Z_bnds
├──YC_bnds
├──time_bnds
├──Z
├──i
├──i_g
├──j
├──j_g
├──k
├──k_l
├──k_p1
├──k_u
├──nb
├──nv
├──tile
└──time

Para evitar descargas variables que no sean necesariamente de interest, se le puede instruir al Servidor Hyrax que variables requiere uno.

Para esto utilizaremos las Expresiones de Restriccion (CE, por sus siglas en ingles). Este metodo nos ayudara a construir datasets mucho mas simples, y evitar descargar N veces, informacion que no es requerida.

A continuacion demostramos el caso de solo requerir la variable THETA y sus dimensiones.

dims = pyds['/THETA'].dims
Vars = ['/THETA'] + dims

# Below construct Contraint Expression
CE = "?dap4.ce="+(";").join(Vars)
print("Constraint Expression: ", CE)
Constraint Expression:  ?dap4.ce=/THETA;/time;/k;/tile;/j;/i
Temp_2017 = baseURL + Temp_Salt +year  + month + end_ + CE
Temp_2017
'https://opendap.earthdata.nasa.gov/providers/POCLOUD/collections/ECCO%20Ocean%20Temperature%20and%20Salinity%20-%20Monthly%20Mean%20llc90%20Grid%20(Version%204%20Release%204)/granules/OCEAN_TEMPERATURE_SALINITY_mon_mean_2017-01_ECCO_V4r4_native_llc0090?dap4.ce=/THETA;/time;/k;/tile;/j;/i'

Importante#

Ahora aplicamos la CE a todos los possibles URLs the nuestro dataset.

Temp_2017 = [baseURL.replace("https", "dap4") + Temp_Salt + year + f'{i:02}' + end_ + CE for i in range(1, 13)]
Temp_2017[:3]
['dap4://opendap.earthdata.nasa.gov/providers/POCLOUD/collections/ECCO%20Ocean%20Temperature%20and%20Salinity%20-%20Monthly%20Mean%20llc90%20Grid%20(Version%204%20Release%204)/granules/OCEAN_TEMPERATURE_SALINITY_mon_mean_2017-01_ECCO_V4r4_native_llc0090?dap4.ce=/THETA;/time;/k;/tile;/j;/i',
 'dap4://opendap.earthdata.nasa.gov/providers/POCLOUD/collections/ECCO%20Ocean%20Temperature%20and%20Salinity%20-%20Monthly%20Mean%20llc90%20Grid%20(Version%204%20Release%204)/granules/OCEAN_TEMPERATURE_SALINITY_mon_mean_2017-02_ECCO_V4r4_native_llc0090?dap4.ce=/THETA;/time;/k;/tile;/j;/i',
 'dap4://opendap.earthdata.nasa.gov/providers/POCLOUD/collections/ECCO%20Ocean%20Temperature%20and%20Salinity%20-%20Monthly%20Mean%20llc90%20Grid%20(Version%204%20Release%204)/granules/OCEAN_TEMPERATURE_SALINITY_mon_mean_2017-03_ECCO_V4r4_native_llc0090?dap4.ce=/THETA;/time;/k;/tile;/j;/i']

#

Debajo inicializaremos una session que persista, cada vez que descargue datos de servidores de OPeNDAP. En este caso utilizaremost la libraria requests_cache. El comportamiento es casi identico excepto la siguiente celda:

cache_session = create_session(use_cache=True, cache_kwargs={"cache_name":"data/ecco"}) # caching the session
cache_session.cache.clear() # clear all cache to demonstrate the behavior

Warning

A partir de la version de pydap >= 3.5.5, existe el nuevo metodo experimental consolidated_metadata que permite a pydap descargar todos los metadatos, y reusarlos. Este metodo continua en desarrollo y sus propiedades pueden cambiar en las siguientes versiones.

from pydap.client import consolidate_metadata
%%time
consolidate_metadata(Temp_2017, session=cache_session, concat_dim='time')
datacube has dimensions ['i[0:1:89]', 'j[0:1:89]', 'k[0:1:49]', 'tile[0:1:12]'] , and concat dim: `['time']`
CPU times: user 2.49 s, sys: 537 ms, total: 3.02 s
Wall time: 28.8 s
%%time
theta_salt_ds = xr.open_mfdataset(
    Temp_2017, 
    engine='pydap',
    session=cache_session, 
    parallel=True, 
    combine='nested', 
    concat_dim='time',
)
CPU times: user 989 ms, sys: 143 ms, total: 1.13 s
Wall time: 2.99 s
Traceback (most recent call last):

  File ~/mambaforge/envs/pydap_docs/lib/python3.11/site-packages/IPython/core/interactiveshell.py:3748 in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)

  Cell In[21], line 1
    get_ipython().run_cell_magic('time', '', "theta_salt_ds = xr.open_mfdataset(\n    Temp_2017, \n    engine='pydap',\n    session=cache_session, \n    parallel=True, \n    combine='nested', \n    concat_dim='time',\n)\n")

  File ~/mambaforge/envs/pydap_docs/lib/python3.11/site-packages/IPython/core/interactiveshell.py:2594 in run_cell_magic
    result = fn(*args, **kwargs)

  File ~/mambaforge/envs/pydap_docs/lib/python3.11/site-packages/IPython/core/magics/execution.py:1448 in time
    raise captured_exception

  File ~/mambaforge/envs/pydap_docs/lib/python3.11/site-packages/IPython/core/magics/execution.py:1412 in time
    exec(code, glob, local_ns)

  File <timed exec>:1

  File ~/mambaforge/envs/pydap_docs/lib/python3.11/site-packages/xarray/backends/api.py:1662 in open_mfdataset
    datasets, closers = dask.compute(datasets, closers)

  File ~/mambaforge/envs/pydap_docs/lib/python3.11/site-packages/dask/base.py:685 in compute
    results = schedule(expr, keys, **kwargs)

  File ~/mambaforge/envs/pydap_docs/lib/python3.11/site-packages/xarray/backends/api.py:607 in open_dataset
    backend_ds = backend.open_dataset(

  File ~/mambaforge/envs/pydap_docs/lib/python3.11/site-packages/xarray/backends/pydap_.py:292 in open_dataset
    ds = store_entrypoint.open_dataset(

  File ~/mambaforge/envs/pydap_docs/lib/python3.11/site-packages/xarray/backends/store.py:42 in open_dataset
    vars, attrs = filename_or_obj.load()

  File ~/mambaforge/envs/pydap_docs/lib/python3.11/site-packages/xarray/backends/common.py:370 in load
    (_decode_variable_name(k), v) for k, v in self.get_variables().items()

  File ~/mambaforge/envs/pydap_docs/lib/python3.11/site-packages/xarray/backends/pydap_.py:194 in get_variables
    return FrozenDict((k, self.open_store_variable(self.ds[k])) for k in _vars)

  File ~/mambaforge/envs/pydap_docs/lib/python3.11/site-packages/xarray/core/utils.py:522 in FrozenDict
    return Frozen(dict(*args, **kwargs))

  File ~/mambaforge/envs/pydap_docs/lib/python3.11/site-packages/xarray/backends/pydap_.py:194 in <genexpr>
    return FrozenDict((k, self.open_store_variable(self.ds[k])) for k in _vars)

  File ~/mambaforge/envs/pydap_docs/lib/python3.11/site-packages/xarray/backends/pydap_.py:169 in open_store_variable
    data_array = self._get_data_array(var)

  File ~/mambaforge/envs/pydap_docs/lib/python3.11/site-packages/xarray/backends/pydap_.py:228 in _get_data_array
    get_batch_data(var, checksums=self._checksums)

  File ~/mambaforge/envs/pydap_docs/lib/python3.11/site-packages/pydap/client.py:1265 in get_batch_data
    fetch_consolidated(ds, cache_urls=cache_urls, checksums=checksums)

  File ~/mambaforge/envs/pydap_docs/lib/python3.11/site-packages/pydap/client.py:1427 in fetch_consolidated
    pyds = UNPACKDAP4DATA(r, checksums=checksums).dataset

  File ~/mambaforge/envs/pydap_docs/lib/python3.11/site-packages/pydap/handlers/dap.py:1079 in __init__
    dataset = dmr_to_dataset(self.dmr, dmrVersion=self.dmrVersion)

  File ~/mambaforge/envs/pydap_docs/lib/python3.11/site-packages/pydap/parsers/dmr.py:257 in dmr_to_dataset
    dom_et = copy.deepcopy(DMRParser(dmr, dmrVersion=dmrVersion).node)

  File ~/mambaforge/envs/pydap_docs/lib/python3.11/site-packages/pydap/parsers/dmr.py:413 in __init__
    self.node = ET.fromstring(_dmr)

  File ~/mambaforge/envs/pydap_docs/lib/python3.11/xml/etree/ElementTree.py:1350 in XML
    parser.feed(text)

  File <string>
    
ParseError: syntax error: line 1, column 0
theta_salt_ds

Finalmente, visualizamos THETA#

Variable = [theta_salt_ds['THETA'][0, 0, i, :, :] for i in range(13)]
clevels = np.linspace(-5, 30, 100)
cMap='RdBu_r'
fig, axes = plt.subplots(nrows=4, ncols=4, figsize=(8, 8), gridspec_kw={'hspace':0.01, 'wspace':0.01})
AXES_NR = [
    axes[3, 0], axes[2, 0], axes[1, 0], axes[3, 1], axes[2, 1], axes[1, 1],
]
AXES_CAP = [axes[0, 0]]
AXES_R = [
    axes[1, 2], axes[2, 2], axes[3, 2], axes[1, 3], axes[2, 3], axes[3, 3],
]
for i in range(len(AXES_NR)):
    AXES_NR[i].contourf(Variable[i], clevels, cmap=cMap)

for i in range(len(AXES_CAP)):
    AXES_CAP[i].contourf(Variable[6].transpose()[:, ::-1], clevels, cmap=cMap)

for i in range(len(AXES_R)):
    AXES_R[i].contourf(Variable[7+i].transpose()[::-1, :], clevels, cmap=cMap)

for ax in np.ravel(axes):
    ax.axis('off')
    plt.setp(ax.get_xticklabels(), visible=False)
    plt.setp(ax.get_yticklabels(), visible=False)

plt.show()

Fig. 3. Visualizacion de la variable Surface temperature, similar al metodo utilizado en la Figura 2.