How to design negative samples for Florence-2 model training? #144
Replies: 10 comments
-
We're currently experiencing a situation where our model's mAP (mean Average Precision) metrics are degrading while the loss values suggest overfitting. Our current saving strategy is based solely on validation loss, as shown in the following code snippet:

    def save_best(self, processor: AutoProcessor, model: AutoModelForCausalLM, val_loss: float):
        """Saves the best model checkpoint if the validation loss improves.

        Args:
            processor (AutoProcessor): The processor to save.
            model (AutoModelForCausalLM): The model to save.
            val_loss (float): The current validation loss.
        """
        if val_loss < self.best_val_loss:
            self.best_val_loss = val_loss
            save_model(self.best_checkpoint_dir, processor, model)
            print(f"New best model saved with validation loss: {self.best_val_loss}")

I've been looking at our model saving strategy, and I'm curious about your thoughts on its effectiveness. While we're using validation loss as the primary metric for saving the best model, it seems that our mAP scores are not reflecting the improvements we see in the loss.

Do you think relying solely on validation loss is the best approach for designing our model saving criteria? Would it be more beneficial to consider a combination of metrics, such as both validation loss and mAP, to ensure we're not just minimizing loss but also improving the model's precision? Or are there other metrics or strategies you believe would be more suitable for our current situation? Looking forward to your insights on this matter.
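One way to picture the combined-criterion idea is a small tracker that can follow any scalar metric in either direction. This is only a sketch, not part of the library: `BestCheckpointTracker`, `save_fn`, `mode`, and `min_delta` are hypothetical names introduced for illustration.

```python
import math
from typing import Callable


class BestCheckpointTracker:
    """Tracks the best value of one metric and calls a save callback on improvement.

    Hypothetical helper for illustration; not an existing library class.
    """

    def __init__(self, save_fn: Callable[[], None], mode: str = "max", min_delta: float = 0.0):
        if mode not in ("min", "max"):
            raise ValueError("mode must be 'min' or 'max'")
        self.save_fn = save_fn      # called with no arguments when the metric improves
        self.mode = mode            # "min" for losses, "max" for scores such as mAP
        self.min_delta = min_delta  # improvement required before a new save is triggered
        self.best = -math.inf if mode == "max" else math.inf

    def update(self, value: float) -> bool:
        """Records a new metric value; returns True if a checkpoint was saved."""
        if self.mode == "max":
            improved = value > self.best + self.min_delta
        else:
            improved = value < self.best - self.min_delta
        if improved:
            self.best = value
            self.save_fn()
        return improved
```

With two trackers, one in `"min"` mode fed with validation loss and one in `"max"` mode fed with mAP, each writing to its own directory, the best-mAP checkpoint would be kept even when the loss curve points elsewhere.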
-
Hi @David-19940718 👋🏻 First of all, I'm thrilled to have users like you who are eager to experiment early on and push the library forward.

Regarding negative samples, I don't think there are any established best practices at the moment, but I'll ask a few people involved in VLM training about it. I thought a good idea, and potentially simple to implement, would be to use the COCO dataset as negative samples. For example, splitting the training into two parts. In the first part, you fine-tune only on your dataset, and in the second part, on a mix of your dataset and the COCO dataset. This way, in the first phase, the model quickly learns your classes, and in the second phase, it becomes resistant to overfitting.

As for your second question, the ability to define any metric as a condition for saving a checkpoint sounds very reasonable. I'll try to add a GH issue to add such support.
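The two-phase schedule described above could be sketched roughly as follows. Everything here is an assumption made for illustration: `build_epoch`, `phase_one_epochs`, and `negative_ratio` are hypothetical names, samples are treated as opaque items, and how a negative image's target text should be annotated is left open, as it is in this thread.

```python
import random
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def build_epoch(
    own: Sequence[T],
    negatives: Sequence[T],
    epoch: int,
    phase_one_epochs: int,
    negative_ratio: float = 0.25,
    seed: int = 0,
) -> List[T]:
    """Returns the shuffled sample list for one epoch.

    Phase 1 (epoch < phase_one_epochs): only samples from your own dataset.
    Phase 2: your own samples plus a random subset of negatives, sized as
    negative_ratio * len(own) and capped at the number of negatives available.
    """
    rng = random.Random(seed + epoch)  # fresh but reproducible draw per epoch
    samples = list(own)
    if epoch >= phase_one_epochs and negatives:
        k = min(len(negatives), int(len(own) * negative_ratio))
        samples += rng.sample(list(negatives), k)
    rng.shuffle(samples)
    return samples
```

Redrawing the negative subset every epoch means the model sees a different slice of the negative pool each time, while the ratio keeps your own classes dominant in each batch.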
-
Thank you for your detailed and encouraging response. 😄
-
Hi @SkalskiP, by introducing appropriate data augmentation strategies, I've observed a significant reduction in overfitting. Moreover, under the same experimental conditions, mAP has improved by several percentage points. It might be worth considering adding this feature in a future release.
-
Hi @David-19940718 👋🏻 That looks fantastic! Could you tell me exactly what strategies you employed?
-
Sure! The main strategies I employed are:

    from torch.utils.data import Dataset
    from torchvision import transforms

    # JSONLDataset comes from the library; its import is omitted here, as in the original snippet.

    class DetectionDataset(Dataset):
        def __init__(self, jsonl_file_path: str, image_directory_path: str, split_name: str):
            self.dataset = JSONLDataset(jsonl_file_path, image_directory_path)
            self.mode = split_name
            # Augmentations are defined for the training split only.
            self.transform = None
            if split_name == "train":
                self.transform = transforms.Compose([
                    transforms.RandomHorizontalFlip(p=0.5),
                    transforms.ColorJitter(brightness=0.2, contrast=0.2, saturation=0.2, hue=0.1),
                ])

        def __len__(self):
            return len(self.dataset)

        def __getitem__(self, idx):
            image, data = self.dataset[idx]
            prefix = data["prefix"]
            suffix = data["suffix"]
            # Apply data augmentation. Note that the flip changes only the image;
            # the box coordinates in `suffix` are not mirrored here.
            if self.mode == "train":
                image = self.transform(image)
            return prefix, suffix, image
-
Hi @David-19940718 👋🏻 Oh, so you ended up using fairly traditional data augmentation techniques? From what I see, you applied flipping. I understand that you also had to augment the object detection suffix in the process.
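For context on augmenting the suffix: a horizontal flip of the image only stays consistent with the labels if the box coordinates are mirrored too. Below is a minimal sketch, assuming the detection suffix encodes each box as four consecutive `<loc_N>` tokens in `x1, y1, x2, y2` order, quantized into 1000 bins. `hflip_suffix` is a hypothetical helper, not a library function.

```python
import re

# Assumed suffix format: label text followed by <loc_x1><loc_y1><loc_x2><loc_y2>.
LOC_BOX = re.compile(r"<loc_(\d+)><loc_(\d+)><loc_(\d+)><loc_(\d+)>")


def hflip_suffix(suffix: str, num_bins: int = 1000) -> str:
    """Mirrors every box in the suffix horizontally; labels and y values are unchanged."""
    max_bin = num_bins - 1

    def flip(match: re.Match) -> str:
        x1, y1, x2, y2 = (int(g) for g in match.groups())
        # After mirroring, the old right edge becomes the new left edge.
        return f"<loc_{max_bin - x2}><loc_{y1}><loc_{max_bin - x1}><loc_{y2}>"

    return LOC_BOX.sub(flip, suffix)
```

Because `transforms.RandomHorizontalFlip` makes its random decision internally, keeping image and suffix in sync would require drawing the decision yourself and then applying `torchvision.transforms.functional.hflip` to the image together with a helper like this one on the suffix.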
-
Yes, I just did a simple initial validation. I applied some basic data augmentation techniques to get started and test things out. 😄
-
@David-19940718 would you perhaps have a moment to draft a PR introducing basic data augmentation?
-
Hello! I'd be interested to know if there are any updates on this. I'm currently fine-tuning Florence-2-base-ft for object detection and have tried the following:
Leaving the negative samples out entirely still gave better results than either of the two annotation methods I tried, with which the model did not converge as well.
-
Question
Hi, @skylargivens,
We currently have a good understanding of how to create positive samples for the Florence-2 model, using a format like this:
However, I'm unclear on how to properly design negative samples for training. Negative samples are crucial for improving the model's ability to discriminate and reduce false positives. Some questions I have:
Any guidance or best practices for creating effective negative samples would be greatly appreciated. This will help ensure we're training the Florence-2 model optimally for object detection tasks.
Additional
If there are any existing resources, documentation, or examples specifically for Florence-2 negative sample creation, please point me in that direction. Also, if there are any tools or scripts the team recommends for generating or augmenting negative samples, that information would be very helpful.