Parallelisation not working for BIRCH clustering

I used the syntax provided in the docs:
```
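from joblib import parallel_backend
from sklearn.cluster import Birch

clusterer = Birch()
with parallel_backend('spark', n_jobs=100):
    clusterer.fit(df.toPandas())  # toPandas() collects the data to the driver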
```   
     
The Spark UI does not register this as a job, and no executors are deployed. The example provided in the docs, however, does get registered as a Spark job.

Error - _"Unable to allocate 920. GiB for an array with shape (123506239506,) and data type float64"_    
<img width="888" alt="Screenshot 2022-08-27 at 12 33 08 AM" src="https://user-images.githubusercontent.com/98305872/186974714-e7ab9ea2-9ff4-44b4-af6a-e638e5d411a5.png">
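As a sanity check on the error message, the sketch below recomputes the requested allocation from the reported shape and dtype. It also tests whether the array length has the form n(n-1)/2, which would be consistent with a condensed pairwise-distance array over n items. That interpretation is an assumption, not something the error message confirms, and the variable names are illustrative only.

```python
import math

# Values copied from the error message in this issue.
n_elements = 123_506_239_506
bytes_per_float64 = 8

# Requested allocation in GiB (1 GiB = 2**30 bytes).
size_gib = n_elements * bytes_per_float64 / 2**30
print(f"{size_gib:.1f} GiB")

# Assumption: check whether the length equals n * (n - 1) / 2,
# i.e. the size of a condensed pairwise-distance array over n items.
n = (1 + math.isqrt(1 + 8 * n_elements)) // 2
is_condensed = n * (n - 1) // 2 == n_elements
print(n, is_condensed)
```

If that reading is right, the memory would be requested in a single in-memory step on the driver rather than in work dispatched through joblib, which might explain why no Spark job shows up.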



Parallelisation not working for BIRCH clustering #42


WeichenXu123 commented on Dec 8, 2022
