Thanks for your excellent work on benchmarking video reasoning!
I was examining the test set and noticed that some of the questions and videos don't match. The evidence is that some questions and explanations in test_Video-Holmes.json reference timestamps that lie beyond the end of the corresponding video. For example:
{
"video ID": "yHioWLdgnMI",
"Question ID": 394,
"Question Type": "TCI",
"Question": "What is the main reason why the man in the suit fell into the lower level of the store?",
"Options": {
"A": "The floor is dilapidated due to years of neglect.",
"B": "The truck driver starts the mechanism.",
"C": "Accidentally touched the switch during the fight.",
"D": "The barber shop owner presses the button.",
"E": "Remote control by lower-level personnel",
"F": "The man in the suit jumped in voluntarily."
},
"Answer": "D",
"Explanation": "At 2:38, after the barber shop owner regained consciousness and pressed the mechanism button, it directly caused the man in the suit to fall to the lower level."
},
{
"video ID": "yHioWLdgnMI",
"Question ID": 396,
"Question Type": "MHR",
"Question": "What does the shot of the barber shop owner pressing the button at 2:38 imply?",
"Options": {
"A": "The store is about to explode.",
"B": "Initiate self-destruct sequence",
"C": "Send a distress signal to the outside world.",
"D": "Turn off the monitoring system",
"E": "The boss and the subordinates are accomplices.",
"F": "Release smoke interference"
},
"Answer": "E",
"Explanation": "The button directly led the man in the suit to fall into the lower space where killers gathered, indicating a collaborative relationship between the boss and the lower-level personnel."
},
The 2:38 timestamp in the Explanation of question 394 and in the Question of question 396 is invalid, as the video yHioWLdgnMI.mp4 is only 0:59 long. Based on the content of the questions, I'm guessing that the video was cut too short.
I'm not sure whether there are more cases like this. I hope you can look into it.
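In case it helps with checking the rest of the test set, here is a minimal sketch of how such cases could be found automatically. It is not part of the benchmark code. It assumes the entries are dicts with the same keys as the excerpt above, that timestamps are written as m:ss or mm:ss, and that a `durations` mapping from video ID to length in seconds has been collected separately (for example with ffprobe); the function name and that mapping are hypothetical.

```python
import re

# Matches m:ss or mm:ss timestamps such as "2:38".
# Assumption: the dataset does not use h:mm:ss timestamps.
TIMESTAMP = re.compile(r"\b(\d{1,2}):([0-5]\d)\b")


def find_out_of_range_timestamps(entries, durations):
    """Return (question_id, field, timestamp) for timestamps past the video's end.

    entries:   list of dicts shaped like the records in test_Video-Holmes.json.
    durations: hypothetical mapping of video ID -> duration in seconds,
               gathered separately from the video files.
    """
    problems = []
    for entry in entries:
        duration = durations.get(entry["video ID"])
        if duration is None:
            continue  # duration unknown, nothing to compare against
        for field in ("Question", "Explanation"):
            for match in TIMESTAMP.finditer(entry.get(field, "")):
                seconds = int(match.group(1)) * 60 + int(match.group(2))
                if seconds > duration:
                    problems.append((entry["Question ID"], field, match.group(0)))
    return problems


# Usage with the two entries quoted above (unrelated fields omitted).
sample_entries = [
    {
        "video ID": "yHioWLdgnMI",
        "Question ID": 394,
        "Question": "What is the main reason why the man in the suit fell "
                    "into the lower level of the store?",
        "Explanation": "At 2:38, after the barber shop owner regained "
                       "consciousness and pressed the mechanism button, ...",
    },
    {
        "video ID": "yHioWLdgnMI",
        "Question ID": 396,
        "Question": "What does the shot of the barber shop owner pressing "
                    "the button at 2:38 imply?",
        "Explanation": "The button directly led the man in the suit to fall "
                       "into the lower space where killers gathered, ...",
    },
]
sample_durations = {"yHioWLdgnMI": 59}  # the video is 0:59 long

flagged = find_out_of_range_timestamps(sample_entries, sample_durations)
for question_id, field, timestamp in flagged:
    print(f"Question {question_id}: {field} mentions {timestamp}")
```

Running the same check over the full JSON with real durations would only catch mismatches that involve an explicit timestamp, so questions without one would still need a manual look.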