pasteur.dataset

Description

This module defines Dataset, the initial entry point for data in Pasteur.

Classes

Dataset(**_)

A class for a dataset named `name` that creates a set of tables based on the provided dependencies.

TabularDataset(**_)

Boilerplate for a tabular dataset.

TypedDataset(**_)

Extend from this class to create an intermediary step in ingestion, where each table is loaded from `<dataset>.raw@<table>` into a parquet one, `<dataset>.typed.<table>`.
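
The naming pattern in the TypedDataset description can be spelled out with a small sketch. The helpers below are hypothetical and are not part of Pasteur's API; they only format the two names quoted above, and the dataset and table names are placeholders chosen for illustration.

```python
# Hypothetical helpers, NOT part of Pasteur's API: they only spell out the
# naming pattern quoted in the TypedDataset description.


def raw_name(dataset: str, table: str) -> str:
    """Name of the raw input for a table: <dataset>.raw@<table>."""
    return f"{dataset}.raw@{table}"


def typed_name(dataset: str, table: str) -> str:
    """Name of the typed parquet output: <dataset>.typed.<table>."""
    return f"{dataset}.typed.{table}"


def typed_mapping(dataset: str, tables: list[str]) -> dict[str, str]:
    """Map each raw table name to the typed name it would be loaded into."""
    return {raw_name(dataset, t): typed_name(dataset, t) for t in tables}


# "adult" and "table" are placeholder names, assumed for this example only.
mapping = typed_mapping("adult", ["table"])
print(mapping)
```

How Pasteur itself registers and resolves these names is not shown here; the sketch only makes the raw-to-typed correspondence explicit.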