pasteur.utils.data.LazyDataset
- class pasteur.utils.data.LazyDataset(merged_load, partitions=None)
Attributes
Methods
- are_partitioned(*positional, **keyword): Returns whether the provided datasets are partitioned.
- cache(*positional, **keyword)
- items()
- keys()
- separate(): Splits the datasets into partitioned and not partitioned and returns them.
- values()
- wrap(*positional, **keyword): Converts provided arguments to lazy.
- zip(*positional, **keyword): Aligns and returns a dictionary of partition ids to partitions.
- zip_values(*positional, **keyword): Same as zip, but doesn't return partition names and works even if the datasets are not partitioned, by returning a single partition.
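The alignment that zip and zip_values describe can be pictured with a small standalone sketch. It does not use pasteur: `ToyLazy`, `zip_partitions` and `zip_values_sketch` are hypothetical names, and several details are assumptions rather than documented behaviour (partitions stored as a dict of partition id to loader, non-partitioned datasets handed whole to every partition, a `ValueError` on mismatched ids).

```python
from typing import Any, Callable, Optional


class ToyLazy:
    """Hypothetical stand-in for LazyDataset; not pasteur's implementation."""

    def __init__(self, merged_load: Callable[[], Any],
                 partitions: Optional[dict] = None):
        self.merged_load = merged_load  # loads the whole dataset
        self.partitions = partitions    # assumed: {partition id: loader} or None


def zip_partitions(datasets: dict) -> dict:
    """Map each partition id to a {dataset name: loader} dict."""
    ids = None
    for ds in datasets.values():
        if ds.partitions is None:
            continue
        if ids is None:
            ids = list(ds.partitions)
        elif set(ids) != set(ds.partitions):
            raise ValueError("datasets have different partitions")
    if ids is None:
        raise ValueError("no dataset is partitioned")
    # Assumption: a non-partitioned dataset is passed whole to every partition.
    return {
        pid: {
            name: ds.partitions[pid] if ds.partitions is not None else ds.merged_load
            for name, ds in datasets.items()
        }
        for pid in ids
    }


def zip_values_sketch(datasets: dict) -> list:
    """Like zip_partitions, but drops the ids and tolerates no partitioning."""
    if all(ds.partitions is None for ds in datasets.values()):
        # Nothing is partitioned: return one "partition" holding everything.
        return [{name: ds.merged_load for name, ds in datasets.items()}]
    return list(zip_partitions(datasets).values())
```

With one partitioned and one plain dataset, `zip_partitions` yields one entry per partition id, while `zip_values_sketch` on the plain dataset alone yields a single-element list.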
- static are_partitioned(*positional, **keyword)
Returns whether the provided datasets are partitioned. If they are, checks they have the same partitions.
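As a rough illustration of that check, here is a standalone sketch that does not import pasteur. `ToyLazy` and `are_partitioned_sketch` are hypothetical names. The sketch assumes that partitions are a dict keyed by partition id, that the group counts as partitioned when at least one dataset is, and that a mismatch raises `ValueError`; the real method may differ on each point.

```python
from typing import Any, Callable, Optional


class ToyLazy:
    """Hypothetical stand-in for LazyDataset; not pasteur's implementation."""

    def __init__(self, merged_load: Callable[[], Any],
                 partitions: Optional[dict] = None):
        self.merged_load = merged_load
        self.partitions = partitions  # assumed: {partition id: loader} or None

    @property
    def partitioned(self) -> bool:
        return self.partitions is not None


def are_partitioned_sketch(*datasets: ToyLazy) -> bool:
    """Return True if any dataset is partitioned.

    All partitioned datasets must expose the same partition ids,
    otherwise a ValueError is raised (an assumption of this sketch).
    """
    ids = None
    for ds in datasets:
        if not ds.partitioned:
            continue
        if ids is None:
            ids = set(ds.partitions)
        elif set(ds.partitions) != ids:
            raise ValueError("partitioned datasets have different partitions")
    return ids is not None
```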
- property partitioned
- property sample
- separate()
Splits the datasets into non-partitioned and partitioned ones and returns both groups, in that order:
non_partitioned, partitioned = separate_partitioned(datasets)
- Return type:
tuple[dict[str, LazyDataset[TypeVar(A)]], dict[str, LazyDataset[TypeVar(A)]]]
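The split and its return shape can be sketched without pasteur as follows. `ToyLazy` and `separate_sketch` are hypothetical names, and the sketch assumes that a dataset counts as partitioned exactly when it was given a `partitions` argument.

```python
from typing import Any, Callable, Optional


class ToyLazy:
    """Hypothetical stand-in for LazyDataset; not pasteur's implementation."""

    def __init__(self, merged_load: Callable[[], Any],
                 partitions: Optional[dict] = None):
        self.merged_load = merged_load
        self.partitions = partitions

    @property
    def partitioned(self) -> bool:
        # Assumption: partitioned means a partitions mapping was supplied.
        return self.partitions is not None


def separate_sketch(datasets: dict) -> tuple:
    """Split a {name: dataset} dict into (non_partitioned, partitioned)."""
    non_partitioned = {n: d for n, d in datasets.items() if not d.partitioned}
    partitioned = {n: d for n, d in datasets.items() if d.partitioned}
    return non_partitioned, partitioned


datasets = {
    "table": ToyLazy(lambda: [1, 2], {"p0": lambda: [1], "p1": lambda: [2]}),
    "meta": ToyLazy(lambda: {"rows": 2}),
}
non_partitioned, partitioned = separate_sketch(datasets)
```

Both returned dicts keep the original names as keys, which matches the documented return type of two `dict[str, LazyDataset]` values.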
- property shape
- classmethod wrap(*positional, **keyword)
Converts provided arguments to lazy. Tuples, dicts, and lists are traversed, and every object found in them is wrapped in a LazyDataset.
- Return type:
Any
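The traversal described for wrap can be illustrated with a standalone recursive sketch. It is not pasteur's code: `ToyLazy` and `wrap_nested` are hypothetical names, the helper takes a single object rather than `*positional, **keyword`, and passing already-lazy objects through unchanged is an assumption.

```python
from typing import Any, Callable, Optional


class ToyLazy:
    """Hypothetical stand-in for LazyDataset; not pasteur's implementation."""

    def __init__(self, merged_load: Callable[[], Any],
                 partitions: Optional[dict] = None):
        self.merged_load = merged_load
        self.partitions = partitions

    def __call__(self) -> Any:
        # Calling the wrapper loads the underlying value.
        return self.merged_load()


def wrap_nested(obj: Any) -> Any:
    """Recurse into tuples, lists and dicts; wrap every leaf in ToyLazy."""
    if isinstance(obj, ToyLazy):
        return obj  # assumption: already-lazy objects are left as they are
    if isinstance(obj, tuple):
        return tuple(wrap_nested(o) for o in obj)
    if isinstance(obj, list):
        return [wrap_nested(o) for o in obj]
    if isinstance(obj, dict):
        return {k: wrap_nested(v) for k, v in obj.items()}
    return ToyLazy(lambda: obj)


wrapped = wrap_nested({"a": [1, (2, 3)], "b": 4})
```

The container structure is preserved, so `wrapped["a"][1]` is still a tuple, and only the leaves become lazy wrappers that return their value when called.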