How can I diagnose downloading issues?

There are several ways to diagnose issues you run into when downloading data from an OPeNDAP server via pydap. A download through OPeNDAP/pydap retrieves either metadata only (i.e. dmr, or dds/das) or binary data (dap / dods). The type of response you get depends on the suffix you append to the URL.

| Appendable suffix | Type of Response  |
|-------------------|-------------------|
| .dmr              | DAP4 Metadata     |
| .dmr.html         | DAP4 Request Form |
| .dmr.xml          | DAP4 Metadata     |
| .dds              | DAP2 Metadata     |
| .das              | DAP2 Metadata     |
| .html             | DAP2 Request Form |
| .dap              | DAP4 Binary       |
| .dods             | DAP2 Binary       |
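
To try each response in turn, the suffixes listed above can be appended programmatically. The following is a minimal sketch; the helper name `response_urls` and the example URL are hypothetical and not part of pydap.

```python
# Suffixes listed above, grouped by protocol version.
SUFFIXES = {
    "dap4": [".dmr", ".dmr.html", ".dmr.xml", ".dap"],
    "dap2": [".dds", ".das", ".html", ".dods"],
}


def response_urls(data_url, protocol="dap4"):
    """Return one URL per response type for the given protocol."""
    base = data_url.rstrip("/")
    return [base + suffix for suffix in SUFFIXES[protocol]]


if __name__ == "__main__":
    # Placeholder URL: replace with the URL of your own dataset.
    for url in response_urls("https://example.org/opendap/data.nc"):
        print(url)
```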

Note

If you are primarily using xarray and dataset generation or download times are slow, try using only pydap. xarray tends to be slower than pydap because of the extra functionality it adds to the dataset (and the many internal checks required to do so).

Pythonic Approach

pydap.client.open_url uses Python's requests library to authenticate and download data. You can try to download any of the following responses via requests.session:

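import requests
session = requests.session()

# placeholder: replace with the URL of your remote dataset (without any suffix)
data_url = "https://example.org/opendap/data.nc"

# assuming url points to a DAP4 dataset, otherwise replace `dmr` with `dds` and `dap` with `dods`
rdmr = session.get(data_url+".dmr")
rdap = session.get(data_url+".dap")
print(rdmr.status_code, rdap.status_code)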
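
To compare how long the raw responses take against pydap itself, each call can be wrapped in a small timer. This is only a sketch: `timed` is a hypothetical helper, not something pydap or requests provides.

```python
import time


def timed(func, *args, **kwargs):
    """Call func(*args, **kwargs) and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = func(*args, **kwargs)
    return result, time.perf_counter() - start
```

For example, `rdmr, t_dmr = timed(session.get, data_url + ".dmr")` returns the response together with the time the metadata request took, and the same wrapper can be applied to `pydap.client.open_url` for a rough comparison.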

a) If rdmr returns a 200 status code, pydap should be able to create a pydap dataset. If rdmr returns a 401 or 403 HTTP error, you may be experiencing authentication issues. Make sure the right credentials are stored in a local netrc file and that they remain valid. requests, and therefore pydap, should recover these credentials automatically as long as the netrc file is in the default location.
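
The checks in a) can be sketched with the standard library alone. Both function names below are hypothetical. Note that the host to look up is the machine that performs the authentication, which may differ from the host serving the data.

```python
import netrc


def explain_status(status_code):
    """Map an HTTP status code to a rough diagnosis (illustrative only)."""
    if status_code == 200:
        return "ok"
    if status_code in (401, 403):
        return "possible authentication issue"
    return "other HTTP issue"


def has_netrc_entry(host, netrc_path=None):
    """Return True if the netrc file holds credentials that apply to host.

    With netrc_path=None the default location (~/.netrc) is read.
    A `default` entry in the file also counts as a match.
    """
    try:
        entries = netrc.netrc(netrc_path)
    except (OSError, netrc.NetrcParseError):
        # Missing, unreadable, or malformed netrc file.
        return False
    return entries.authenticators(host) is not None
```

A call such as `explain_status(rdmr.status_code)` gives a first hint, and `has_netrc_entry(...)` with the authentication host confirms whether credentials are available at all. It does not check that they are still valid.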

Warning

Some older GrADS servers expose a DAP2 data URL beginning with http://, even though this URL scheme is no longer supported by NASA. Try replacing http with https in the URL. See this 2025 GitHub issue.
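
The scheme swap can be done safely with the standard library rather than a plain string replacement, which could also alter other parts of the URL. A minimal sketch, with `force_https` as a hypothetical helper name:

```python
from urllib.parse import urlsplit, urlunsplit


def force_https(url):
    """Return url with an http scheme upgraded to https; other URLs are unchanged."""
    parts = urlsplit(url)
    if parts.scheme == "http":
        parts = parts._replace(scheme="https")
    return urlunsplit(parts)
```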

b) If both rdmr and rdap are much faster than pydap at creating a dataset or downloading data, the dataset may contain a large number of variables, or the remote data may have many small chunks. Try first creating a pydap dataset with only a few variables from the remote dataset, and subset these by their indexes. This documentation should help you learn how to do so.
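
One way to restrict a request to a few variables and index ranges is a DAP4 constraint expression appended to the URL. The sketch below only builds such a URL. The helper name, the variable names, and the assumption that the server is DAP4 and accepts an unencoded `dap4.ce` query are all illustrative, so check them against your server.

```python
def dap4_subset_url(data_url, selections):
    """Build a DAP4 URL restricted to some variables and index ranges.

    selections maps a variable name to a list of (start, stop) index
    pairs, one per dimension (stop is inclusive in DAP), or to None to
    request the whole variable.
    """
    terms = []
    for name, ranges in selections.items():
        term = "/" + name
        for start, stop in ranges or []:
            # DAP index ranges are written [start:stride:stop].
            term += f"[{start}:1:{stop}]"
        terms.append(term)
    return data_url + "?dap4.ce=" + ";".join(terms)


if __name__ == "__main__":
    # Hypothetical dataset URL and variable names.
    print(dap4_subset_url(
        "https://example.org/opendap/data.nc",
        {"sst": [(0, 0), (0, 9)], "time": None},
    ))
```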

Warning

If dataset creation is fast but downloading the array is extremely slow, it is very likely that the variables in the remote dataset have lots of very small chunks. A sign of this behavior is that downloading the binary data is extremely slow compared to the metadata. This scenario is unfortunate. One workaround is to download many spatial subsets of the remote dataset and aggregate them on your machine.
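
The subset-and-aggregate approach can be sketched as follows. Both helpers are hypothetical, and `fetch` stands for whatever function you write to download one index range, for example by slicing a pydap variable along one dimension.

```python
def index_tiles(length, tile_size):
    """Split range(length) into consecutive (start, stop) pairs, stop exclusive."""
    return [
        (start, min(start + tile_size, length))
        for start in range(0, length, tile_size)
    ]


def download_in_tiles(fetch, length, tile_size):
    """Call fetch(start, stop) once per tile and return the pieces in order."""
    return [fetch(start, stop) for start, stop in index_tiles(length, tile_size)]
```

The returned pieces can then be joined locally, for instance with `numpy.concatenate` along the tiled axis if `fetch` returns NumPy arrays.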

curl

curl is a great tool for diagnosing HTTP errors such as redirect issues, authentication errors, etc. If you cannot download an OPeNDAP response with curl, then you likely won't be able to download it with pydap.

The following command is useful when downloading:

curl -L -n -v -o output.dmr "http:// ... .dmr"

where -L tells curl to follow redirects, -n instructs curl to read authentication credentials from the .netrc file (in the default location), -v instructs curl to “be verbose”, and -o writes the remote resource to a file named output.dmr.

If timing remains an issue, or if HTTP errors persist, please consider opening an issue on the pydap/issue_tracker.