pasteur.transform.SeqTransformer

class pasteur.transform.SeqTransformer(**_)

Sequence Transformers are a generalised version of Reference Transformers that can be used to process event data.

Sequence Transformers receive unprocessed parent columns, references and the ID table. Then, it is up to them to process the data and return the encoded version. They can also push columns upstream to parents, through context tables.

Event-based data is sequential, so Sequence Transformers may need to know the order of each row. To support this, the main Sequence Transformer, called the sequencer, is processed first and returns an additional data column and attribute during fitting. This column and attribute are then fed to the other Sequence Transformers.
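To make the idea of a per-row order column concrete, here is a minimal sketch that uses only pandas and does not call pasteur. The table, the column names (`parent_id`, `time`) and the helper `sequence_order` are assumptions made for illustration; the actual sequencer may derive its column differently.

```python
import pandas as pd

# Hypothetical event table: each row is an event that belongs to a parent row.
events = pd.DataFrame(
    {
        "parent_id": [10, 10, 11, 10, 11],
        "time": pd.to_datetime(
            ["2021-01-03", "2021-01-01", "2021-01-05", "2021-01-02", "2021-01-04"]
        ),
    },
    index=pd.Index([0, 1, 2, 3, 4], name="event_id"),
)


def sequence_order(df: pd.DataFrame, parent_col: str, order_col: str) -> pd.Series:
    """Return the 0-based position of each row within its parent's sequence."""
    # Sort events chronologically within each parent.
    ordered = df.sort_values([parent_col, order_col])
    # Number the rows of each parent group: 0, 1, 2, ...
    seq = ordered.groupby(parent_col).cumcount()
    # Restore the original row order so the result aligns with `df`.
    return seq.reindex(df.index).rename("seq")


seq = sequence_order(events, "parent_id", "time")
print(seq)
```

A column like `seq` is the kind of ordering information that other transformers could consume once the sequencer has produced it.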

Attributes

deterministic

For a given input, the output is always the same.

lossless

The decoded output equals the input.

stateful

Whether the transformer fits variables, i.e. keeps state learned from the data during fitting.

name

Methods

fit(table, data[, ref, ids, seq_val, seq])

Fits to the provided data.

fit_transform(table, data[, ref, ids, ...])

get_attributes()

get_factory(*args, **kwargs)

Returns a factory that registers this module to the system.

get_seq_value()

reduce(other)

reverse(data, ctx[, ref, ids])

When reversing, the data column contains encoded data, whereas the ref column contains decoded/original data.

transform(data[, ref, ids, seq])

deterministic = True

For a given input, the output is always the same.

fit(table, data, ref=None, ids=None, seq_val=None, seq=None)

Fits to the provided data.

Return type:

tuple[SeqValue, Series] | None

fit_transform(table, data, ref=None, ids=None, seq_val=None, seq=None)

Return type:

tuple[DataFrame, dict[str, DataFrame]] | tuple[DataFrame, dict[str, DataFrame], Series]

get_attributes()

Return type:

tuple[Mapping[str | tuple[str, ...], Attribute], dict[str, Mapping[str | tuple[str, ...], Attribute]]]

classmethod get_factory(*args, **kwargs)

Returns a factory that registers this module to the system.

Any *args and **kwargs passed to this function will be saved and passed to the module’s __init__() method when calling build().
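The deferred-construction behaviour described above can be sketched in plain Python. Everything in this block (`Factory`, `Module`, `ToyTransformer`, `max_len`) is a hypothetical stand-in written for illustration, not pasteur's implementation, and it leaves out the registration step.

```python
class Factory:
    """Hypothetical factory: stores constructor arguments until build() is called."""

    def __init__(self, cls, *args, **kwargs):
        self._cls = cls
        self._args = args
        self._kwargs = kwargs

    def build(self):
        # The saved arguments are only passed to __init__() at this point.
        return self._cls(*self._args, **self._kwargs)


class Module:
    """Hypothetical base class exposing a get_factory() classmethod."""

    @classmethod
    def get_factory(cls, *args, **kwargs):
        return Factory(cls, *args, **kwargs)


class ToyTransformer(Module):
    def __init__(self, max_len=10, **_):
        self.max_len = max_len


factory = ToyTransformer.get_factory(max_len=5)  # nothing is instantiated yet
transformer = factory.build()                    # __init__(max_len=5) runs here
print(transformer.max_len)
```

Deferring construction this way lets a configuration list hold lightweight factories and create the actual modules only when they are needed.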

get_seq_value()

Return type:

SeqValue | None

lossless = True

The decoded output equals the input.

name: str

reduce(other)

reverse(data, ctx, ref=None, ids=None)

When reversing, the data column contains encoded data, whereas the ref column contains decoded/original data. Therefore, the referred columns have to be decoded first.
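A toy example shows why the referred columns must be decoded first. It assumes a simple offset encoding (value minus reference) and hypothetical column names; pasteur's actual encodings and the `reverse()` signature are not reproduced here.

```python
import pandas as pd

# Hypothetical reference (parent) column holding decoded/original values.
ref = pd.Series([100, 100, 250], name="admission_day")
# Hypothetical data column holding original event days.
data = pd.Series([103, 110, 251], name="event_day")


def toy_encode(data: pd.Series, ref: pd.Series) -> pd.Series:
    """Encode each value as its offset from the reference value."""
    return (data - ref).rename(data.name)


def toy_reverse(enc: pd.Series, ref: pd.Series) -> pd.Series:
    """Recover the original values; `ref` must already be decoded."""
    return (enc + ref).rename(enc.name)


encoded = toy_encode(data, ref)
decoded = toy_reverse(encoded, ref)
print(encoded.tolist(), decoded.tolist())
```

If `ref` were still in its encoded form when `toy_reverse` ran, the sum would not reproduce the original values, which is why the decoding order matters.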

Return type:

DataFrame

stateful = False

Whether the transformer fits variables, i.e. keeps state learned from the data during fitting.

transform(data, ref=None, ids=None, seq=None)

Return type:

tuple[DataFrame, dict[str, DataFrame]] | tuple[DataFrame, dict[str, DataFrame], Series]
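Because the return type is a union of a 2-tuple and a 3-tuple, calling code has to handle both shapes. The helper below is a hypothetical sketch, not part of pasteur. It assumes the optional third element is the extra column returned by the sequencer, which the signature suggests but this page does not state.

```python
from typing import Any, Optional, Tuple


def split_transform_result(result: tuple) -> Tuple[Any, dict, Optional[Any]]:
    """Normalise a 2- or 3-element result into (encoded, context, seq)."""
    if len(result) == 3:
        encoded, ctx, seq = result
        return encoded, ctx, seq
    encoded, ctx = result
    # No third element was returned, so there is no sequence column.
    return encoded, ctx, None


# Placeholder values stand in for DataFrames and a Series.
print(split_transform_result(("enc", {"parent": "ctx"})))
print(split_transform_result(("enc", {"parent": "ctx"}, "seq")))
```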