Description
I'm crawling a large directory structure that contains tens of thousands of high-resolution images.
Using the CNN() method, it OOMs before it finishes the scan.
```
Traceback (most recent call last):
  File "/home/philglau/dedup_py/main.py", line 85, in <module>
    search('PyCharm')
  File "/home/philglau/dedup_py/main.py", line 50, in search
    encodings = cnn.encode_images(image_dir=image_dir,recursive=True,num_enc_workers=1)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/philglau/miniconda3/envs/id/lib/python3.11/site-packages/imagededup/methods/cnn.py", line 251, in encode_images
    return self._get_cnn_features_batch(image_dir=image_dir, recursive=recursive, num_workers=num_enc_workers)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/philglau/miniconda3/envs/id/lib/python3.11/site-packages/imagededup/methods/cnn.py", line 146, in _get_cnn_features_batch
    arr = self.model(ims.to(self.device))
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/philglau/miniconda3/envs/id/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1739, in _wrapped_call_impl
```
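For reference, the call that triggers this is essentially the following (`image_dir` below is just a placeholder for the root of my actual directory tree):

```python
from imagededup.methods import CNN

image_dir = "/path/to/large/image/tree"  # placeholder for my actual directory

cnn = CNN()
# recursive scan of tens of thousands of high-resolution images
encodings = cnn.encode_images(image_dir=image_dir, recursive=True, num_enc_workers=1)
```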
I think the problem is that during the scan it reads every image and, crucially, runs apply_mobilenet_preprocess(), which performs the transform on the GPU. However, those transformed tensors don't appear to be consumed; in other words, it seems to scan the entire structure before proceeding to encode the results.
Or at least that's what it looks like to me (alternatively, it is doing the encoding, but some of the GPU memory is not being released after each batch is consumed).
- CNN.encode_images() calls _get_cnn_features_batch
- _get_cnn_features_batch calls img_dataloader(image_dir='mypath')
- img_dataloader calls ImgDataset with a basenet_preprocess set
- ImgDataset.__getitem__ then applies the self.basenet_preprocess
- self.basenet_preprocess then hits apply_mobilenet_preprocess()
- which calls self.transform() which moves the data to the GPU
I believe it's all these apply_mobilenet_preprocess() calls that are filling the GPU before the encoder has a chance to consume them (or at least that's my guess).
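For comparison, here is a minimal sketch of the pattern I would expect to keep VRAM bounded: preprocessing stays on the CPU inside the Dataset, each batch is moved to the GPU only inside a torch.no_grad() loop, and the features are moved straight back to the CPU. The torchvision MobileNet and the preprocessing constants below are stand-ins for illustration, not imagededup's actual internals:

```python
import torch
from torch.utils.data import DataLoader, Dataset
from torchvision import models, transforms
from PIL import Image


class CpuImageDataset(Dataset):
    """Returns preprocessed tensors on the CPU; no .to(device) in here."""

    def __init__(self, files):
        self.files = files
        self.preprocess = transforms.Compose([
            transforms.Resize((224, 224)),
            transforms.ToTensor(),
            transforms.Normalize(mean=[0.485, 0.456, 0.406],
                                 std=[0.229, 0.224, 0.225]),
        ])

    def __len__(self):
        return len(self.files)

    def __getitem__(self, idx):
        img = Image.open(self.files[idx]).convert("RGB")
        return self.preprocess(img)  # stays on the CPU until the batch loop


def encode_all(files, batch_size=16):
    device = torch.device("cuda")
    # stand-in backbone; imagededup uses its own MobileNet wrapper internally
    model = models.mobilenet_v3_small(weights="DEFAULT").features.to(device).eval()
    loader = DataLoader(CpuImageDataset(files), batch_size=batch_size,
                        num_workers=2, pin_memory=True)
    feats = []
    with torch.no_grad():                        # no autograd graph retained
        for batch in loader:
            out = model(batch.to(device))        # only one batch lives on the GPU
            feats.append(out.flatten(1).cpu())   # move results off the GPU immediately
    return torch.cat(feats)
```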
Here's a screenshot from nvtop while CNN.encode_images() is still scanning the directories:
Shortly thereafter, the encode_images() process crashes once the GPU (an RTX 3090 with 24 GB of VRAM) goes OOM. I've tried adjusting the batch size with cnn.batch_size = 16 and other, lower values, but that makes no difference; it still always OOMs.
As shown in the screenshot, memory usage keeps increasing, but there is little or no compute happening on the GPU while it fills up.
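If it helps narrow this down, something like the small helper below could be dropped into the batch loop (exactly where it would go inside imagededup is my assumption) to confirm from within Python whether allocated memory climbs batch after batch, rather than just the allocator's reserved pool:

```python
import torch

def log_vram(tag=""):
    # allocated = memory held by live tensors; reserved = what PyTorch's caching
    # allocator has claimed from the driver. A steady climb in "allocated"
    # batch after batch would mean inputs/outputs are being retained somewhere.
    alloc_mib = torch.cuda.memory_allocated() / 1024 ** 2
    reserved_mib = torch.cuda.memory_reserved() / 1024 ** 2
    print(f"{tag}: allocated={alloc_mib:.0f} MiB, reserved={reserved_mib:.0f} MiB")
```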