zcollection.dataset.Dataset
- class zcollection.dataset.Dataset(variables, *, attrs=None, block_size_limit=None, chunks=None, delayed=None)
Bases: object
Hold variables, dimensions, and attributes that together form a dataset.
- Parameters:
variables (DelayedArray | Array) – A dictionary of variables in the dataset, with variable names as keys and Array or DelayedArray objects as values.
attrs (Sequence[Attribute] | None) – A tuple of global attributes on this dataset.
block_size_limit (int | None) – The maximum size, in bytes, of a block/chunk of a variable's data. Defaults to 128 MiB.
chunks (Sequence[Dimension] | None) – A dictionary of chunk sizes for each dimension.
delayed (bool | None) – A boolean indicating whether the dataset contains delayed variables (numpy arrays wrapped in dask arrays).
- Raises:
ValueError – If the dataset contains variables with the same dimensions but with different values.
ValueError – If the dataset contains both delayed and non-delayed variables.
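The two error conditions above can be illustrated with a small, library-free sketch. It is not zcollection's implementation: `ToyVariable` and `check_variables` are hypothetical names introduced here, and the first condition is read as "two variables disagree on the size of a shared dimension", which is an assumption about the wording above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyVariable:
    """Hypothetical stand-in for a dataset variable (not a zcollection class)."""
    name: str
    dimensions: tuple  # dimension names, e.g. ("time", "x")
    shape: tuple       # one size per dimension name
    delayed: bool      # True if the data would be a lazy, dask-backed array


def check_variables(variables):
    """Return {dimension name: size}, raising ValueError on either conflict."""
    sizes = {}
    delayed_flags = set()
    for var in variables:
        delayed_flags.add(var.delayed)
        for dim, size in zip(var.dimensions, var.shape):
            # setdefault stores the first size seen for this dimension and
            # returns the stored one afterwards, so a mismatch is detected.
            if sizes.setdefault(dim, size) != size:
                raise ValueError(
                    f"dimension {dim!r} has conflicting sizes: "
                    f"{sizes[dim]} and {size}")
    if len(delayed_flags) > 1:
        raise ValueError("cannot mix delayed and non-delayed variables")
    return sizes
```

Under these assumptions, two variables sharing `time` with the same length pass the check, while a variable declaring a different length for `time`, or a single delayed variable among non-delayed ones, raises `ValueError`.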
Notes
The dataset is a dictionary-like container of variables. It also holds the dimensions and attributes of the dataset. If the dataset contains delayed variables, the values are DelayedArray objects. Otherwise, the values are Array objects. It is impossible to mix delayed and non-delayed variables in the same dataset.
Attributes
A dictionary of dimension names and their index in the dataset
zcollection.variable.abc.Variable objects.
The list of global attributes on this dataset
Chunk size for each dimension
Maximum data chunk size
The type of variables in the dataset
Dimensions chunk size as a tuple.
Return the total number of bytes in the dataset.
Public Methods
add_variable(var, /[, data]) – Add a variable to the dataset.
compute(**kwargs) – Compute the dataset variables.
concat(other, dim) – Concatenate datasets along a dimension.
delete(indexer, axis) – Return a new dataset without the data selected by the provided indices.
drops_vars(names) – Drop variables from the dataset.
fill_attrs(mds) – Fill the attributes of the dataset and its variables using the provided metadata.
from_xarray(zds[, delayed]) – Create a new dataset from an xarray dataset.
isel(slices) – Return a new dataset with each array indexed along the specified slices.
merge(other) – Merge the provided dataset into this dataset.
metadata() – Get the dataset metadata.
persist(*[, compress]) – Persist the dataset variables.
rechunk(**kwargs) – Rechunk the dataset.
rename(names) – Rename variables in the dataset.
select_variables_by_dims(dims[, predicate]) – Return a new dataset containing only the variables that have the specified dimensions if predicate is true; otherwise, only the variables that do not have them.
select_vars(names) – Return a new dataset containing only the selected variables.
set_for_insertion(mds) – Create a new dataset ready to be inserted into a collection.
to_dict([variables]) – Convert the dataset to a dictionary mapping variable names to their data.
to_xarray(**kwargs) – Convert the dataset to an xarray dataset.
to_zarr(path[, fs, parallel]) – Write the dataset to a Zarr store.
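The effect of the predicate flag in select_variables_by_dims can be sketched without the library. The function below is a hypothetical stand-in operating on a plain mapping of variable names to dimension names. It assumes that "have the specified dimensions" means "uses at least one of them", and that predicate defaults to true; this page confirms neither.

```python
def select_by_dims(variables, dims, predicate=True):
    """Filter a {name: dimension names} mapping by dimension membership.

    Hypothetical helper, not part of zcollection. A variable "matches"
    when it uses at least one of the requested dimensions (an assumption).
    """
    wanted = set(dims)
    return {
        name: var_dims
        for name, var_dims in variables.items()
        # Keep matching variables when predicate is true,
        # and the non-matching ones when it is false.
        if bool(wanted & set(var_dims)) == predicate
    }


variables = {
    "sst": ("time", "x"),
    "lon": ("x",),
    "flag": ("time",),
}
with_x = select_by_dims(variables, ["x"])
without_x = select_by_dims(variables, ["x"], predicate=False)
```

Here `with_x` keeps `sst` and `lon`, while `without_x` keeps only `flag`; the two calls partition the variables between them.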
Special Methods
__bool__()
__getattr__(name)
__getitem__(name) – Return a variable from the dataset.
Helper for pickle.
__len__()
__repr__() – Return repr(self).
__setstate__(state)
__str__() – Return str(self).