zcollection.view.View.map_overlap
- View.map_overlap(func, /, *args, depth=1, delayed=True, filters=None, partition_size=None, npartitions=None, selected_variables=None, **kwargs)
Map a function over the partitions of the view with some overlap.
- Parameters:
func (PartitionCallable) – The function to apply to every partition of the view.
depth (int) – The depth of the overlap between the partitions. The signature above gives a default of 1; a depth of 0 means no overlap. If depth is greater than 0, the function is applied to the partition and to the neighbors selected by the depth. If func accepts partition_info as a keyword argument, it is passed a tuple containing the name of the partitioned dimension and the slice that selects the partition, without the overlap, in the dataset.
*args – The positional arguments to pass to the function.
delayed (bool) – Whether to load the data lazily as dask arrays (True) or directly into memory (False).
filters (str | Callable[[Dict[str, int]], bool] | None) – The predicate used to filter the partitions to process. For more information on the predicate, see the documentation of the zcollection.Collection.partitions() method.
partition_size (int | None) – The length of each bag partition.
npartitions (int | None) – The number of desired bag partitions.
selected_variables (Iterable[str] | None) – A list of variables to retain from the view. If None, all variables are loaded. Useful to load only a subset of the view.
**kwargs – The keyword arguments to pass to the function.
- Returns:
A bag of tuples, each pairing the partition scheme with the result of the function for that partition.
- Return type:
Example
>>> futures = view.map_overlap(
...     lambda x: (x["var1"] + x["var2"]).values,
...     depth=1)
>>> for item in futures:
...     print(item)
[1.0, 2.0, 3.0, 4.0]
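The role of depth and partition_info can be illustrated with a small pure-Python sketch. It is not the zcollection implementation and does not use the library. The helper map_overlap_sketch, the callback rolling_mean_3, and the dimension name "time" are hypothetical names chosen for illustration, and partitions are modeled as plain lists. The sketch shows the idea described above: each partition is extended with its neighbors, and the callback receives a slice that locates the original partition inside the extended window.

```python
from typing import Callable, List, Sequence, Tuple

PartitionInfo = Tuple[str, slice]


def map_overlap_sketch(
    partitions: Sequence[List[float]],
    func: Callable[..., List[float]],
    depth: int = 1,
    dim: str = "time",  # hypothetical name of the partitioned dimension
) -> List[List[float]]:
    """Apply func to each partition extended by `depth` neighbors per side."""
    results = []
    for i in range(len(partitions)):
        lo = max(0, i - depth)
        hi = min(len(partitions), i + depth + 1)
        # Concatenate the partition with the neighbors selected by depth.
        window = [value for part in partitions[lo:hi] for value in part]
        # Slice selecting the central partition, without the overlap.
        start = sum(len(part) for part in partitions[lo:i])
        central = slice(start, start + len(partitions[i]))
        results.append(func(window, partition_info=(dim, central)))
    return results


def rolling_mean_3(values: List[float],
                   partition_info: PartitionInfo) -> List[float]:
    """Centered 3-point mean, returned only for the central partition."""
    _, central = partition_info
    out = []
    for j in range(central.start, central.stop):
        lo = max(0, j - 1)
        hi = min(len(values), j + 2)
        out.append(sum(values[lo:hi]) / (hi - lo))
    return out


partitions = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
smoothed = map_overlap_sketch(partitions, rolling_mean_3, depth=1)
isolated = map_overlap_sketch(partitions, rolling_mean_3, depth=0)
print(smoothed)
print(isolated)
```

With depth=1 the values at partition boundaries are computed from neighboring data, so the result matches what a single unpartitioned pass would give. With depth=0 each partition is processed in isolation and the boundary values differ.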