pasteur.encode.PostprocessEncoder

class pasteur.encode.PostprocessEncoder(*args, _from_factory=False, **kwargs)

Works like AttributeEncoder, but allows customizing the tables after they have been encoded, or adding additional ones.

Unlike AttributeEncoder, this encoder does not parallelize per table, so avoid it unless customization is required.

The context tables and their metadata are merged into the parent tables automatically, so they are not passed as an argument to finalize().

Default implementations are provided that behave like the standard AttributeEncoder.

Attributes

name

Methods

decode(enc)

Return type: DataFrame

encode(data)

Return type: DataFrame

finalize(meta, tables, ids)

Return type: dict[str, Any]

fit(attr, data)

get_factory(*args, **kwargs)

Returns a factory that registers this module with the system.

get_metadata()

Return type: dict[str | tuple[str], TypeVar(META)]

reduce(other)

undo(meta, data)

Undoes finalize() and returns a tuple of (ids, tables).

decode(enc)

Return type:

DataFrame

encode(data)

Return type:

DataFrame

finalize(meta, tables, ids)

Return type:

dict[str, Any]

fit(attr, data)

classmethod get_factory(*args, **kwargs)

Returns a factory that registers this module with the system.

Any *args and **kwargs passed to this function will be saved and passed to the module’s __init__() method when calling build().
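
The deferred-construction behaviour described for get_factory() can be sketched roughly as follows. This is a self-contained illustration that does not import pasteur: SketchFactory and SketchEncoder are hypothetical stand-ins, and forwarding _from_factory=True from build() is an assumption of this sketch, not confirmed behaviour of the library.

```python
from typing import Any


class SketchFactory:
    """Hypothetical factory: stores a class plus the arguments to build it with."""

    def __init__(self, module_cls: type, *args: Any, **kwargs: Any) -> None:
        self.module_cls = module_cls
        self.args = args
        self.kwargs = kwargs

    def build(self) -> Any:
        # The saved arguments reach __init__() only at this point.
        # Passing _from_factory=True is an assumption of this sketch.
        return self.module_cls(*self.args, _from_factory=True, **self.kwargs)


class SketchEncoder:
    """Hypothetical stand-in for a module class; not pasteur's implementation."""

    name: str = "sketch"

    def __init__(self, *args: Any, _from_factory: bool = False, **kwargs: Any) -> None:
        self.args = args
        self.kwargs = kwargs
        self.from_factory = _from_factory

    @classmethod
    def get_factory(cls, *args: Any, **kwargs: Any) -> SketchFactory:
        # Nothing is instantiated here; the arguments are only recorded.
        return SketchFactory(cls, *args, **kwargs)


factory = SketchEncoder.get_factory("numerical", bins=8)
encoder = factory.build()
print(encoder.args, encoder.kwargs, encoder.from_factory)
```

The point of the pattern is that a factory can be declared up front with its configuration, while the module instance itself is created later, when build() is called.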

get_metadata()

Return type:

dict[str | tuple[str], TypeVar(META)]

name: str = ''

reduce(other)

undo(meta, data)

Undoes finalize() and returns a tuple of (ids, tables).

Return type:

tuple[dict[str, DataFrame], dict[str, DataFrame]]
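
The contract between finalize() and undo() can be illustrated with a small round trip. The sketch below is an assumption-heavy stand-in: it does not subclass PostprocessEncoder, it uses plain lists of dicts in place of DataFrames so it runs without pandas, and the key layout ("table/…", "ids/…", "extra/…") is invented for illustration only. What it mirrors from the documentation is the argument order of both methods and the fact that undo() returns (ids, tables).

```python
from typing import Any

# Stand-in for a pandas DataFrame: a list of row dictionaries.
Table = list[dict[str, Any]]


class SketchPostprocessEncoder:
    """Hypothetical stand-in that mimics the finalize()/undo() signatures."""

    def finalize(
        self, meta: dict[str, Any], tables: dict[str, Table], ids: dict[str, Table]
    ) -> dict[str, Any]:
        data: dict[str, Any] = {}
        for name, rows in tables.items():
            # Copy rows so that later changes to the output leave inputs intact.
            data[f"table/{name}"] = [dict(row) for row in rows]
            data[f"ids/{name}"] = [dict(row) for row in ids[name]]
        # Customization step: add an extra derived entry (row counts per table).
        data["extra/row_counts"] = {name: len(rows) for name, rows in tables.items()}
        return data

    def undo(
        self, meta: dict[str, Any], data: dict[str, Any]
    ) -> tuple[dict[str, Table], dict[str, Table]]:
        # Recover only what finalize() received; derived extras are dropped.
        ids = {k[len("ids/"):]: v for k, v in data.items() if k.startswith("ids/")}
        tables = {k[len("table/"):]: v for k, v in data.items() if k.startswith("table/")}
        return ids, tables


enc = SketchPostprocessEncoder()
tables = {"visits": [{"age_bin": 3}, {"age_bin": 5}]}
ids = {"visits": [{"patient_id": 10}, {"patient_id": 11}]}

data = enc.finalize({}, tables, ids)
out_ids, out_tables = enc.undo({}, data)
print(out_ids == ids, out_tables == tables)
```

Whatever finalize() adds or reshapes, undo() has to be able to strip it again, so that decoding receives the same ids and tables that encoding produced.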