
Commit 537be72

Update instructions for ScienceQA

1 parent 2f439b5

4 files changed: +8533 -21 lines

README.md

Lines changed: 10 additions & 21 deletions
@@ -213,24 +213,9 @@ python -m llava.eval.model_vqa_science \
     --model-name /path/to/LLaVA-13b-v0-science_qa \
     --question-file /path/to/ScienceQA/data/scienceqa/llava_test.json \
     --image-folder /path/to/ScienceQA/data/scienceqa/images/test \
-    --answers-file vqa/results/ScienceQA/test_llava-13b.jsonl
-```
-
-Alternatively, you may evaluate this with multiple GPUs, and concatenate the generated jsonl files.
-
-```Shell
-CHUNKS=8
-CHUNK_IDX=0
-CUDA_VISIBLE_DEVICES=CHUNK_IDX python model_vqa_science.py \
-    --model-name /path/to/LLaVA-13b-v0-science_qa \
-    --question-file /path/to/ScienceQA/data/scienceqa/llava_test.json \
-    --image-folder /path/to/ScienceQA/data/scienceqa/images/test \
-    --answers-file vqa/results/ScienceQA/test_llava-13b-chunk${CHUNKS}_${CHUNK_IDX}.jsonl \
-    --num-chunks $CHUNKS \
-    --chunk-idx $CHUNK_IDX
-
-# after running this for all chunks, concatenate the results
-cat {...} > vqa/results/ScienceQA/test_llava-13b.jsonl
+    --answers-file vqa/results/ScienceQA/test_llava-13b.jsonl \
+    --answer-prompter \
+    --conv-mode simple
 ```
 
 3. Evaluate the generated responses
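
The block removed above documents the chunking interface of `model_vqa_science.py` (`--num-chunks`, `--chunk-idx`, one `CUDA_VISIBLE_DEVICES` device per chunk), but it launches only a single chunk at a time (and its `CUDA_VISIBLE_DEVICES=CHUNK_IDX` is missing the `$` expansion). A minimal Shell sketch of launching every chunk in parallel, reusing the flags and answers-file naming from that removed block; the loop itself is an assumption, not necessarily the contents of the repository's `sqa_eval_batch.sh`:

```Shell
# Illustrative sketch: run one evaluation process per GPU, one chunk each.
# Flags and answers-file naming come from the removed example above; the
# parallel loop itself is an assumption, not the repository's script.
CHUNKS=8
for IDX in $(seq 0 $((CHUNKS - 1))); do
    CUDA_VISIBLE_DEVICES=$IDX python model_vqa_science.py \
        --model-name /path/to/LLaVA-13b-v0-science_qa \
        --question-file /path/to/ScienceQA/data/scienceqa/llava_test.json \
        --image-folder /path/to/ScienceQA/data/scienceqa/images/test \
        --answers-file vqa/results/ScienceQA/test_llava-13b-chunk${CHUNKS}_${IDX}.jsonl \
        --num-chunks $CHUNKS \
        --chunk-idx $IDX &
done
wait  # block until all chunks have finished writing their answer files
```

Backgrounding each process with `&` and then waiting lets the eight chunks run concurrently, one per GPU.
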
@@ -240,17 +225,21 @@ python eval_science_qa.py \
     --base-dir /path/to/ScienceQA/data/scienceqa \
     --result-file vqa/results/ScienceQA/test_llava-13b.jsonl \
     --output-file vqa/results/ScienceQA/test_llava-13b_output.json \
-    --result-file vqa/results/ScienceQA/test_llava-13b_result.json \
+    --output-result vqa/results/ScienceQA/test_llava-13b_result.json
 ```
 
+Alternatively, you may evaluate with multiple GPUs and concatenate the generated jsonl files. Please refer to our scripts for [batch evaluation](scripts/sqa_eval_batch.sh) and [results gathering](scripts/sqa_eval_gather.sh).
+
+For reference, we attach our prediction file `test_llava-13b_result.json` [here](llava/eval/table/results/test_sqa_llava_13b_v0.json) for comparison when reproducing our results and for further detailed analysis.
+
 ## Fine-tuning
 ### Data
 
-The current version of LLaVA is fine-tuned from a Vicuna-13B model. We use approximately 600K filtered CC3M in feature alignment pretraining and 150K GPT-generated multimodal instruction-following data in finetuning. For detailed description of the data generation pipeline, please refer see our [paper](#).
+The current version of LLaVA is fine-tuned from a Vicuna-13B model. We use approximately 600K filtered image-text pairs from CC3M for feature alignment pretraining and 150K GPT-generated multimodal instruction-following samples for finetuning. For a detailed description of the data generation pipeline, please see our [paper](https://arxiv.org/abs/2304.08485).
 
 We are working on a more capable model that is pretrained with data at a larger scale. Stay tuned!
 
-We release all three types of multimodal instruction-following data. The use of these data is subject to OpenAI [TOS](#).
+We release all three types of multimodal instruction-following data. The use of these data is subject to the OpenAI [TOS](https://openai.com/policies/terms-of-use).
 
 ### Code and Hyperparameters
 We fine-tune the model using the code from [FastChat](https://github.com/lm-sys/FastChat). We use a similar set of hyperparameters as Vicuna for finetuning. The hyperparameters used in both pretraining and finetuning are provided below.
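
The new instructions delegate concatenation to `scripts/sqa_eval_gather.sh` in place of the removed `cat {...}` one-liner, which left the file list elided. Assuming the per-chunk naming from the removed example, the gather step could look like the following sketch; it is illustrative, not the script's actual contents:

```Shell
# Illustrative sketch: merge the per-chunk jsonl files into one answers file.
# File naming follows the removed example; the loop itself is an assumption.
CHUNKS=8
OUTPUT=vqa/results/ScienceQA/test_llava-13b.jsonl
> "$OUTPUT"  # start from an empty merged file
for IDX in $(seq 0 $((CHUNKS - 1))); do
    cat vqa/results/ScienceQA/test_llava-13b-chunk${CHUNKS}_${IDX}.jsonl >> "$OUTPUT"
done
```

Since each line of a `.jsonl` file is a self-contained JSON record, plain concatenation yields a valid merged file regardless of chunk order.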
