2. Generate the ScienceQA dataset in LLaVA conversation-style format.
```Shell
python scripts/convert_sqa_to_llava.py \
    convert_to_llava \
    --base-dir /path/to/ScienceQA/data/scienceqa \
    --prompt-format "QCM-LEA" \
    --split {train,val,minival,test,minitest}
```
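When preparing several splits at once, the conversion command above can be wrapped in a small loop. This is a dry-run sketch: it only prints the commands via `echo`, and the `BASE_DIR` value and split list are placeholders, not the repo's defaults.

```Shell
# Dry-run sketch: print one conversion command per split.
# BASE_DIR and the split list are placeholders -- adjust them to your setup,
# then drop the leading `echo` to actually run the conversions.
BASE_DIR=/path/to/ScienceQA/data/scienceqa
for SPLIT in train val test; do
    echo python scripts/convert_sqa_to_llava.py convert_to_llava \
        --base-dir "$BASE_DIR" \
        --prompt-format "QCM-LEA" \
        --split "$SPLIT"
done
```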
#### Training
**NOTE**: Because the ScienceQA experiments were done earlier, the current checkpoints were trained *without* the `<im_start>` and `<im_end>` tokens. Here we provide the training scripts for the current checkpoints.
You can download our pretrained projector weights from our [Model Zoo](), or train your own projector weights using [`pretrain.sh`](https://github.com/haotian-liu/LLaVA/blob/main/scripts/pretrain.sh).
2. Finetuning
You may download our pretrained projector `llava-13b-v0-pretrain-no_im_start_end_token.bin` [here](https://huggingface.co/liuhaotian/LLaVA-13b-pretrain-projector-v0/blob/main/LLaVA-13b-pretrain-projector-v0-CC3M-595K-original_caption-no_im_token.bin).
See [`finetune_sqa.sh`](https://github.com/haotian-liu/LLaVA/blob/main/scripts/finetune_sqa.sh).
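Putting the two steps together, a minimal sketch might look like the following. The `resolve/main/` URL is derived from the blob link above, and the commands are printed (`echo`) rather than executed; verify both against the repository before running them for real.

```Shell
# Dry-run sketch: fetch the projector weights, then launch finetuning.
# Drop the leading `echo`s once the URL and script path are verified.
PROJECTOR_URL=https://huggingface.co/liuhaotian/LLaVA-13b-pretrain-projector-v0/resolve/main/LLaVA-13b-pretrain-projector-v0-CC3M-595K-original_caption-no_im_token.bin
echo wget "$PROJECTOR_URL"
echo bash scripts/finetune_sqa.sh
```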
#### Evaluation
1. Download our pretrained LLaVA-13B (delta) weights for the ScienceQA dataset [here](https://huggingface.co/liuhaotian/LLaVA-13b-delta-v0-science_qa), and convert the delta weights to actual weights.
```Shell
python -m llava.model.apply_delta \
    --base /path/to/llama-13b \
    --target /path/to/LLaVA-13b-v0-science_qa \
    --delta liuhaotian/LLaVA-13b-delta-v0-science_qa
```
2. Multiple-GPU inference
You may evaluate this with multiple GPUs, and concatenate the generated jsonl files. Please refer to our script for [batch evaluation](https://github.com/haotian-liu/LLaVA/blob/main/scripts/sqa_eval_batch.sh) and [results gathering](https://github.com/haotian-liu/LLaVA/blob/main/scripts/sqa_eval_gather.sh).
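The merge step itself is plain shell. As a minimal sketch (the shard and output file names here are made up for illustration; the real names come from the batch-evaluation script), per-GPU JSONL shards can be concatenated like this:

```Shell
# Create two fake per-GPU prediction shards, then merge them.
# Shard and output names are illustrative, not the repo's actual layout.
mkdir -p /tmp/sqa_eval_demo
cd /tmp/sqa_eval_demo
echo '{"question_id": 0, "text": "The answer is A."}' > result_chunk0.jsonl
echo '{"question_id": 1, "text": "The answer is B."}' > result_chunk1.jsonl
cat result_chunk*.jsonl > merged_predictions.jsonl
wc -l merged_predictions.jsonl    # one line per question across all shards
```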
For reference, we attach our prediction file [`test_sqa_llava_13b_v0.json`](https://github.com/haotian-liu/LLaVA/blob/main/llava/eval/table/results/test_sqa_llava_13b_v0.json) for comparison when reproducing our results, and for further detailed analysis.