Limiting threads in TensorFlow #84

Open
@rth

Description

As far as I can tell, limiting the number of threads in TensorFlow with threadpoolctl currently doesn't work.

For instance, with the following minimal example using TensorFlow 2.5.0,
example.py

import tensorflow as tf
import numpy as np

from threadpoolctl import threadpool_limits

with threadpool_limits(limits=1):
    X = tf.constant(np.arange(0, 5000**2, dtype=np.int32), shape=(5000, 5000))

    # If the limit were honored, this matmul would run on a single thread.
    tf.matmul(X, X)

running,

time python example.py

on a 64-core CPU produces

real    0m3.781s
user    1m8.685s

so the user (CPU) time is still much larger than the real run time, meaning that many CPU cores are being used.

This becomes an issue when people run scikit-learn's GridSearchCV or cross_validate on a Keras or TensorFlow model, since it then results in CPU over-subscription. I'm surprised there aren't more issues about this in scikit-learn.
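
To make the scenario concrete, here is a minimal sketch of the over-subscription pattern, assuming the tf.keras scikit-learn wrapper that shipped with TensorFlow 2.5; the file name, model and data are just placeholders.
cv_example.py

import numpy as np
import tensorflow as tf
from sklearn.model_selection import cross_validate
from tensorflow.keras.wrappers.scikit_learn import KerasClassifier

def build_model():
    model = tf.keras.Sequential([
        tf.keras.layers.Dense(32, activation="relu", input_shape=(20,)),
        tf.keras.layers.Dense(1, activation="sigmoid"),
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy",
                  metrics=["accuracy"])
    return model

X = np.random.rand(1000, 20).astype(np.float32)
y = np.random.randint(0, 2, size=1000)

clf = KerasClassifier(build_fn=build_model, epochs=1, verbose=0)

# With n_jobs=-1, joblib starts one worker process per core, and each worker's
# TensorFlow runtime in turn uses all cores, so the machine is over-subscribed.
cross_validate(clf, X, y, cv=5, n_jobs=-1)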

Regrettably, TensorFlow doesn't recognize any environment variable to limit the number of CPU cores either. The only workaround I found is to set the CPU affinity mask with taskset. But then again, that wouldn't help for cross-validation, since joblib would need to set the affinity mask when creating new processes, which is currently not supported.
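
For reference, the single-process workaround mentioned above would look something like this (pinning the interpreter to one core; the core index is arbitrary):

taskset -c 0 python example.py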

Has anyone looked into this in the past by any chance?
