pasteur.utils.data.LazyDataset

class pasteur.utils.data.LazyDataset(merged_load, partitions=None)

Methods

`are_partitioned`(*positional, **keyword)	Returns whether the provided datasets are partitioned.
`cache`(*positional, **keyword)
`items`()
`keys`()
`separate`()	Splits the datasets into partitioned and not partitioned and returns them.
`values`()
`wrap`(*positional, **keyword)	Converts provided arguments to lazy.
`zip`(*positional, **keyword)	Aligns and returns a dictionary of partition ids to partitions.
`zip_values`(*positional, **keyword)	Same as zip, but doesn't return partition names and works even if the datasets are not partitioned, by returning a single partition.

static are_partitioned(*positional, **keyword)

Returns whether the provided datasets are partitioned. If they are, checks they have the same partitions.

classmethod cache(*positional, **keyword)

separate()

Splits the datasets into partitioned and not partitioned and returns them.

non_partitioned, partitioned = separate_partitioned(datasets)

Return type: tuple[dict[str, LazyDataset[TypeVar(A)]], dict[str, LazyDataset[TypeVar(A)]]]
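
The split described above can be illustrated with a minimal, self-contained sketch. It is not the library's implementation: it assumes a partitioned dataset is modelled as a plain dict from partition id to partition and anything else counts as non-partitioned, and the helper name `split_by_partitioning` is hypothetical. The return order (non-partitioned first) follows the usage line in the docstring.

```python
from typing import Any


def split_by_partitioning(
    datasets: dict[str, Any],
) -> tuple[dict[str, Any], dict[str, Any]]:
    """Hypothetical helper: split named datasets into two groups.

    Assumption of this sketch: a dict value stands for a partitioned
    dataset (partition id -> partition); any other value is unpartitioned.
    """
    non_partitioned = {n: d for n, d in datasets.items() if not isinstance(d, dict)}
    partitioned = {n: d for n, d in datasets.items() if isinstance(d, dict)}
    return non_partitioned, partitioned


non_partitioned, partitioned = split_by_partitioning(
    {"meta": "lookup-table", "events": {"p0": [1, 2], "p1": [3]}}
)
```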

classmethod wrap(*positional, **keyword)

Converts provided arguments to lazy. Tuples, dicts, and lists are traversed, and every object found in them is wrapped in a LazyDataset.
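
The traversal described above might look roughly like the following sketch. The `Lazy` class and `wrap_nested` function are hypothetical stand-ins written for illustration only; they do not reproduce `LazyDataset` or its constructor, and leaving already-wrapped objects untouched is an assumption of this sketch.

```python
from typing import Any, Callable


class Lazy:
    """Hypothetical lazy wrapper: holds a zero-argument loader."""

    def __init__(self, load: Callable[[], Any]):
        self._load = load

    def __call__(self) -> Any:
        # The wrapped value is only produced when the wrapper is called.
        return self._load()


def wrap_nested(obj: Any) -> Any:
    """Recursively wrap every leaf found in dicts, lists, and tuples."""
    if isinstance(obj, Lazy):
        return obj  # already lazy (an assumption of this sketch)
    if isinstance(obj, dict):
        return {key: wrap_nested(value) for key, value in obj.items()}
    if isinstance(obj, (list, tuple)):
        # Rebuild the same container type around the wrapped items.
        return type(obj)(wrap_nested(value) for value in obj)
    return Lazy(lambda: obj)


wrapped = wrap_nested({"tables": [1, 2], "meta": ("a",)})
```

The containers keep their shape, and only the leaves become callables that return the original value on demand.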

static zip(*positional, **keyword)

Aligns and returns a dictionary of partition ids to partitions.

Partitions can be a list, if positional arguments were provided, or a dictionary if keyword arguments were provided.

Warning: all partitioned datasets should have the same keys.
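
To make the alignment concrete, here is a minimal sketch under stated assumptions: each partitioned dataset is modelled as a plain dict from partition id to partition, the function name `zip_partitions` is hypothetical, and rejecting a mix of positional and keyword arguments is a simplification of this sketch rather than documented behaviour.

```python
from typing import Any


def zip_partitions(*positional: dict[str, Any], **keyword: dict[str, Any]) -> dict[str, Any]:
    """Hypothetical helper: align datasets by partition id."""
    if positional and keyword:
        raise ValueError("this sketch accepts positional or keyword arguments, not both")
    datasets = list(positional) if positional else list(keyword.values())
    if not datasets:
        return {}
    ids = set(datasets[0])
    if any(set(d) != ids for d in datasets[1:]):
        raise ValueError("all partitioned sets should have the same keys")
    if positional:
        # Positional arguments: each partition id maps to a list.
        return {pid: [d[pid] for d in positional] for pid in datasets[0]}
    # Keyword arguments: each partition id maps to a dict keyed by argument name.
    return {pid: {name: d[pid] for name, d in keyword.items()} for pid in datasets[0]}


a = {"p0": 1, "p1": 2}
b = {"p0": 10, "p1": 20}
by_position = zip_partitions(a, b)
by_name = zip_partitions(x=a, y=b)
```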

static zip_values(*positional, **keyword)

Same as zip, but doesn’t return partition names and works even if the datasets are not partitioned, by returning a single partition.
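
The single-partition fallback can be sketched as follows. As before, this is an illustration and not the library's code: partitioned datasets are modelled as plain dicts from partition id to partition, `zip_partition_values` is a hypothetical name, only positional arguments are handled, and mixed partitioned and non-partitioned inputs are rejected for simplicity.

```python
from typing import Any


def zip_partition_values(*datasets: Any) -> list[list[Any]]:
    """Hypothetical helper: return aligned partitions without their ids."""
    partitioned = [d for d in datasets if isinstance(d, dict)]
    if not partitioned:
        # Nothing is partitioned: treat the inputs as one single partition.
        return [list(datasets)]
    if len(partitioned) != len(datasets):
        raise ValueError("this sketch does not handle mixed inputs")
    ids = list(partitioned[0])
    if any(set(d) != set(ids) for d in partitioned[1:]):
        raise ValueError("all partitioned sets should have the same keys")
    return [[d[pid] for d in datasets] for pid in ids]


aligned = zip_partition_values({"p0": 1, "p1": 2}, {"p0": 10, "p1": 20})
single = zip_partition_values("train", "test")
```

Callers can iterate over the result the same way in both cases, which is the convenience the description points to.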
