pasteur.utils.progress.init_pool
- class pasteur.utils.progress.init_pool(max_workers=None, refresh_processes=None)
Methods
- __init__(max_workers=None, refresh_processes=None)
Creates a shared process pool for all threads in this process.
max_workers should be set based on either the number of available cores or the amount of RAM (in GB) each process will require, whichever is the tighter limit.
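As a rough illustration of the sizing rule above, the sketch below picks the smaller of the core count and the number of workers that fit in memory. The helper name `suggest_max_workers` and its parameters are hypothetical and not part of pasteur; the RAM figures are supplied by the caller rather than detected.

```python
import os
from typing import Optional


def suggest_max_workers(
    total_ram_gb: float,
    ram_per_worker_gb: float,
    cores: Optional[int] = None,
) -> int:
    """Hypothetical helper: cap the worker count by both cores and RAM."""
    if ram_per_worker_gb <= 0:
        raise ValueError("ram_per_worker_gb must be positive")
    # Fall back to the detected core count, or 1 if it cannot be determined.
    cores = cores or os.cpu_count() or 1
    # How many workers fit in memory if each needs ram_per_worker_gb.
    by_ram = int(total_ram_gb // ram_per_worker_gb)
    # Never go below one worker, never above the core count.
    return max(1, min(cores, by_ram))
```

The result could then be passed as `max_workers`, for example when each process is expected to hold a large dataset partition in memory.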
log_queue connects the subprocesses to the main process logger; see pasteur.kedro.runner.parallel.py.
refresh_processes sets maxtasksperchild for the pool: each worker process is replaced after completing that many tasks, which prevents memory leaks from snowballing from node to node. However, because every restarted worker has to repeat its imports, execution is slower.
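The effect of maxtasksperchild can be seen with the standard library alone. The sketch below is not pasteur's implementation; it assumes only that refresh_processes is forwarded to a `multiprocessing.Pool`-style pool, as the description suggests. The helper `worker_pids` is hypothetical, and the "fork" start method is chosen here purely to keep the example short (POSIX only).

```python
import multiprocessing as mp
import os


def worker_pids(maxtasksperchild, n_tasks=4):
    """Run n_tasks trivial tasks on a one-worker pool; return the PID serving each."""
    # Assumption: "fork" (POSIX only) is used for brevity; it is not
    # necessarily the start method pasteur uses.
    ctx = mp.get_context("fork")
    with ctx.Pool(processes=1, maxtasksperchild=maxtasksperchild) as pool:
        # os.getpid runs inside the worker, so each result identifies
        # the worker process that handled that task.
        return [pool.apply(os.getpid) for _ in range(n_tasks)]


if __name__ == "__main__":
    # Without a limit, the same worker handles every task.
    print("reused:", worker_pids(maxtasksperchild=None))
    # With a limit of 1, the worker is replaced after each task.
    print("fresh: ", worker_pids(maxtasksperchild=1))
```

With a limit of 1, each task runs in a newly started process, so any memory leaked by one task is released when that worker exits. The cost is the startup work repeated for every replacement worker.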