Commit bf20ab1

Improve process section in configure documentation (#673)

1 parent: b07308e

File tree: 1 file changed (+7, -4)

docs/source/clusters-configuration-setup.rst

Lines changed: 7 additions & 4 deletions
@@ -69,10 +69,13 @@ This can be avoided by always using 'GiB' in dask-jobqueue configuration.
 Processes
 ---------
 
-By default Dask will run one Python process per job. However, you can
-optionally choose to cut up that job into multiple processes using the
-``processes`` configuration value. This can be advantageous if your
-computations are bound by the GIL, but disadvantageous if you plan to
+By default Dask will try to cut up a job into multiple processes based on
+the value of ``cores``. If ``cores`` is less than or equal to 4, then the number
+of processes will be equal to the ``cores`` value. Otherwise, the number of processes
+will be at least ``sqrt(cores)``, and the rest will be threads, so that
+``cores = processes * threads`` is maintained. You can control the number of
+processes using the ``processes`` configuration value. Using processes can be advantageous
+if your computations are bound by the GIL, but disadvantageous if you plan to
 communicate a lot between processes. Typically we find that for pure Numpy
 workloads a low number of processes (like one) is best, while for pure Python
 workloads a high number of processes (like one process per two cores) is best.
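The heuristic described in the new wording can be sketched in Python. This is a minimal illustration only, assuming the split picks the smallest process count at or above ``sqrt(cores)`` that divides ``cores`` evenly; ``default_process_split`` is a hypothetical helper for this sketch, not dask-jobqueue's actual function.

```python
import math

def default_process_split(cores):
    """Sketch of the documented heuristic (hypothetical helper, not
    dask-jobqueue's implementation).

    Returns (processes, threads) with cores == processes * threads.
    """
    if cores <= 4:
        # Small jobs: one process per core, one thread each.
        processes = cores
    else:
        # At least sqrt(cores) processes; bump up until the count
        # divides cores evenly so cores = processes * threads holds.
        processes = math.ceil(math.sqrt(cores))
        while cores % processes != 0:
            processes += 1
    threads = cores // processes
    return processes, threads
```

For example, ``cores=8`` yields 4 processes of 2 threads each, and ``cores=16`` yields 4 processes of 4 threads each; either default can be overridden by setting ``processes`` explicitly.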
