Description
I have used the provided syntax:

```python
from sklearn.cluster import Birch
from joblib import parallel_backend
from joblibspark import register_spark

register_spark()  # required so joblib knows about the 'spark' backend

clusterer = Birch()
with parallel_backend('spark', n_jobs=100):
    clusterer.fit(df.toPandas())
```
The Spark UI does not register it as a job and no executors are deployed. However, the example provided in the docs does get registered as a Spark job.
Error: `Unable to allocate 920. GiB for an array with shape (123506239506,) and data type float64`
WeichenXu123 commented on Dec 8, 2022
This looks like it is running out of memory; it should not be an issue with joblib-spark.
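As a back-of-envelope check (not part of the original thread), the quoted allocation size matches the reported array shape: each `float64` element takes 8 bytes, so the requested array really is about 920 GiB.

```python
# Verify the size NumPy reported for an array of shape (123506239506,)
# with dtype float64 (8 bytes per element).
n_elements = 123_506_239_506
size_gib = n_elements * 8 / 2**30  # bytes -> GiB

print(round(size_gib))  # -> 920
```

This suggests the failure happens while materializing data in local memory (e.g. `df.toPandas()` collects the whole DataFrame to the driver), independent of which joblib backend is active.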