intake_erddap.ERDDAPCatalogReader

class intake_erddap.ERDDAPCatalogReader(server: str, bbox: Tuple[float, float, float, float] | None = None, standard_names: List[str] | None = None, variable_names: List[str] | None = None, start_time: datetime | str | None = None, end_time: datetime | str | None = None, search_for: List[str] | None = None, kwargs_search: MutableMapping[str, str | int | float | Sequence[str]] = None, category_search: Tuple[str, str] | None = None, erddap_client: Type[ERDDAP] | None = None, use_source_constraints: bool = True, protocol: str = 'tabledap', chunks: dict | None = None, xarray_kwargs: dict | None = None, metadata: dict = None, variables: list = None, query_type: str = 'union', cache_period: int | float | None = 500, open_kwargs: dict = None, mask_failed_qartod: bool = False, dropna: bool = False, cache_kwargs: dict | None = None, **kwargs)

Makes data sources out of all datasets on the given ERDDAP service.

Parameters:
  • server (str) –

    URL to the ERDDAP service. Example: "https://coastwatch.pfeg.noaa.gov/erddap"

    Note

    Do not include a trailing slash.

  • bbox (tuple of 4 floats, optional) – For explicit geographic search queries, pass a tuple of four floats in the bbox argument. The bounding box parameters are (min_lon, min_lat, max_lon, max_lat).

  • standard_names (list of str, optional) – For explicit search queries for datasets containing a given standard_name, use this argument. Example: ["air_temperature", "air_pressure"].

  • variable_names (list of str, optional) – For explicit search queries for datasets containing a variable with a given name. This can be useful when the client knows of a particular variable name or a convention applied where there is no CF standard name.

  • start_time (datetime or str, optional) – For explicit search queries for datasets that contain data after start_time.

  • end_time (datetime or str, optional) – For explicit search queries for datasets that contain data before end_time.

  • search_for (list of str, optional) – For explicit search queries for datasets that contain any of the terms specified in this keyword argument.

  • kwargs_search (dict, optional) –

    Keyword arguments to input to search on the server before making the catalog. Options are:

    • to search by bounding box: include all of min_lon, max_lon, min_lat, max_lat (int or float). Longitudes must be between -180 and +180.

    • to search within a datetime range: include both min_time and max_time as interpretable datetime strings, e.g., "2021-1-1".

    • to search using a textual keyword: include search_for as either a string or a list of strings. Multiple values will be searched individually and combined in the final catalog results.

  • category_search (list, tuple, optional) – Use this to narrow search by ERDDAP category. The syntax is (category, key), e.g. (“standard_name”, “temp”). category is the ERDDAP category for filtering results. Good choices for selecting variables are “standard_name” and “variableName”. key is the custom_criteria key to narrow the search by, which will be matched to the category results using the custom_criteria that must be set up or input by the user, with cf-pandas. Currently only a single key can be matched at a time.

  • use_source_constraints (bool, default True) – Any relevant search parameter defined in kwargs_search will be passed to the source objects as constraints.

  • protocol (str, default "tabledap") – One of the two supported ERDDAP data access protocols: "griddap" or "tabledap". "tabledap" presents tabular datasets using pandas, while "griddap" uses xarray.

  • chunks (dict, optional) – For griddap protocol, pass a dictionary of chunk sizes for the xarray.

  • xarray_kwargs (dict, optional) – For griddap protocol, a dictionary of keyword arguments to pass to the xarray.open_dataset method.

  • metadata (dict, optional) – Extra metadata for the intake catalog.

  • variables (list of str, optional) – List of variables to limit the dataset to, if available. If you’re not sure what variables are available, check info_url for the station, or look up the dataset on the ERDDAP server.

  • query_type (str, default "union") – Specifies how the catalog should apply the query parameters. Choices are "union" or "intersection". If the query_type is set to "intersection", then the set of results will be the intersection of each individual query made to ERDDAP. This is equivalent to a logical AND of the results. If the value is "union" then the results will be the union of each resulting dataset. This is equivalent to a logical OR.

  • open_kwargs (dict, optional) – Keyword arguments to pass to the open method of the ERDDAP Reader, e.g. pandas read_csv. response is an optional keyword argument that erddapy uses to determine the response format. The default is "csvp"; for TableDAP Readers, "csv" and "csv0" are reasonable choices too.

  • mask_failed_qartod (bool, default False) – WARNING ALPHA FEATURE. If True and *_qc_agg columns associated with data columns are available, data values associated with QARTOD flags other than 1 and 2 will be replaced with NaN. Has not been thoroughly tested.

  • dropna (bool, default False) – WARNING ALPHA FEATURE. If True, rows whose data columns are all NaN will be dropped from the data frame. Has not been thoroughly tested.

  • cache_kwargs (dict, optional) – WARNING ALPHA FEATURE. If you want to have the data you access stored locally in a cache, use this keyword to input a dictionary of keywords. The cache is set up using fsspec’s simple cache. Example configuration is cache_kwargs=dict(cache_storage="/tmp/fnames/", same_names=True).
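
The query_type behaviour described above can be pictured with plain Python sets. This is only a sketch of the set logic, not the library's implementation: the dataset IDs and the combine_results helper are hypothetical, and the two input sets stand in for the results of two separate searches against the server.

```python
# Hypothetical dataset IDs, as if returned by two separate ERDDAP searches.
standard_name_hits = {"ds_buoy_a", "ds_buoy_b", "ds_glider_c"}
bbox_hits = {"ds_buoy_b", "ds_glider_c", "ds_station_d"}


def combine_results(result_sets, query_type="union"):
    """Combine per-query sets of dataset IDs (illustrative helper only)."""
    if query_type not in ("union", "intersection"):
        raise ValueError("query_type must be 'union' or 'intersection'")
    result_sets = [set(r) for r in result_sets]
    if not result_sets:
        return set()
    if query_type == "union":
        # Logical OR: a dataset is kept if any query returned it.
        return set().union(*result_sets)
    # Logical AND: a dataset is kept only if every query returned it.
    return set.intersection(*result_sets)


union_ids = combine_results([standard_name_hits, bbox_hits], "union")
intersection_ids = combine_results([standard_name_hits, bbox_hits], "intersection")

print(sorted(union_ids))
print(sorted(intersection_ids))
```

With "union" the catalog can only grow as more search criteria are added, while "intersection" narrows it, so "intersection" is usually the better fit when every criterion must hold.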

search_url

If a search is performed on the ERDDAP server, the search url is saved as an attribute.

Type:

str
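
To give a sense of what a saved search URL may look like, the sketch below assembles an ERDDAP advanced-search URL with the standard library. It is an illustration under assumptions, not the code this reader runs: sketch_search_url is a hypothetical helper, only a handful of ERDDAP's search parameters are included, and the URL actually stored in search_url may carry additional parameters.

```python
from urllib.parse import urlencode


def sketch_search_url(server, search_for=None, bbox=None, response="csv"):
    """Build an ERDDAP advanced-search URL (hypothetical helper)."""
    if server.endswith("/"):
        # Mirrors the note on the server parameter: no trailing slash.
        raise ValueError("server must not end with a trailing slash")
    params = {"page": 1, "itemsPerPage": 1000}
    if search_for:
        params["searchFor"] = search_for
    if bbox is not None:
        # bbox follows the documented order: (min_lon, min_lat, max_lon, max_lat).
        min_lon, min_lat, max_lon, max_lat = bbox
        params.update(minLon=min_lon, minLat=min_lat, maxLon=max_lon, maxLat=max_lat)
    return f"{server}/search/advanced.{response}?{urlencode(params)}"


url = sketch_search_url(
    "https://coastwatch.pfeg.noaa.gov/erddap",
    search_for="salinity",
    bbox=(-130.0, 30.0, -120.0, 40.0),
)
print(url)
```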

server

The Base URL of the ERDDAP instance.

Type:

str

Methods

__call__(*args, **kwargs)

New version of this instance with altered arguments

apply(func, *args[, output_instance])

Make a pipeline by applying a function to this reader's output

discover(**kwargs)

Part of the data

doc()

Doc associated with loading function

from_dict(data)

Recreate instance from the results of to_dict()

get_client()

Return an initialized ERDDAP Client.

get_search_urls()

Return the search URLs used in generating the catalog.

output_doc()

Doc associated with output type

pprint()

Produce nice text formatting of the instance's contents

qname()

package.module:class name of this class, makes str for import_name

read()

Produce data artefact

to_cat([name])

Create a Catalog containing only this reader

to_dict()

Dictionary representation of the instance's contents

to_entry()

Create an entry version of this, ready to be inserted into a Catalog

to_reader([outtype, reader])

Make a different reader for the data used by this reader

auto_pipeline

check_imports

tab_completion_fixer

__init__(server: str, bbox: Tuple[float, float, float, float] | None = None, standard_names: List[str] | None = None, variable_names: List[str] | None = None, start_time: datetime | str | None = None, end_time: datetime | str | None = None, search_for: List[str] | None = None, kwargs_search: MutableMapping[str, str | int | float | Sequence[str]] = None, category_search: Tuple[str, str] | None = None, erddap_client: Type[ERDDAP] | None = None, use_source_constraints: bool = True, protocol: str = 'tabledap', chunks: dict | None = None, xarray_kwargs: dict | None = None, metadata: dict = None, variables: list = None, query_type: str = 'union', cache_period: int | float | None = 500, open_kwargs: dict = None, mask_failed_qartod: bool = False, dropna: bool = False, cache_kwargs: dict | None = None, **kwargs)
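
A minimal sketch of how the signature above might be used. The argument names come from the signature; the specific values (bounding box, dates, standard names) are made up for illustration. The build_catalog wrapper is hypothetical, and the sketch assumes that calling read() on the reader yields the catalog. Running build_catalog() requires intake-erddap to be installed and network access to the server.

```python
from datetime import datetime

# Illustrative arguments; every key is a parameter from the signature above.
reader_kwargs = dict(
    server="https://coastwatch.pfeg.noaa.gov/erddap",  # no trailing slash
    bbox=(-130.0, 30.0, -120.0, 40.0),  # (min_lon, min_lat, max_lon, max_lat)
    standard_names=["air_temperature", "air_pressure"],
    start_time=datetime(2021, 1, 1),
    end_time=datetime(2021, 2, 1),
    query_type="intersection",  # keep only datasets matching every criterion
    protocol="tabledap",
)


def build_catalog():
    """Construct the reader and read it (hypothetical wrapper)."""
    # Imported lazily so reader_kwargs can be inspected without the package.
    import intake_erddap

    reader = intake_erddap.ERDDAPCatalogReader(**reader_kwargs)
    # Assumption: read() produces the catalog of matching datasets.
    return reader.read()
```

Keeping the arguments in a dictionary makes it easy to log or reuse the same search definition across several servers by changing only the server entry.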

Methods

__init__(server[, bbox, standard_names, ...])

apply(func, *args[, output_instance])

Make a pipeline by applying a function to this reader's output

auto_pipeline(outtype[, avoid])

check_imports()

discover(**kwargs)

Part of the data

doc()

Doc associated with loading function

from_dict(data)

Recreate instance from the results of to_dict()

get_client()

Return an initialized ERDDAP Client.

get_search_urls()

Return the search URLs used in generating the catalog.

output_doc()

Doc associated with output type

pprint()

Produce nice text formatting of the instance's contents

qname()

package.module:class name of this class, makes str for import_name

read()

Produce data artefact

tab_completion_fixer(item)

to_cat([name])

Create a Catalog containing only this reader

to_dict()

Dictionary representation of the instance's contents

to_entry()

Create an entry version of this, ready to be inserted into a Catalog

to_reader([outtype, reader])

Make a different reader for the data used by this reader

Attributes

data

The BaseData this reader depends on, if it has one

func

function name for loading data

func_doc

docstring origin if not from func

implements

datatype(s) this applies to

imports

top-level packages required to use this

name

optional_imports

packages that might be required by some options

other_funcs

function names to recognise when matching user calls

output_instance

type the reader produces

token

Token is computed from all non-_ attributes and then cached.

transform