guspan-tanadi
diff --git a/LICENSE b/LICENSE
Lines changed: 1 addition & 1 deletion
diff --git a/README.md b/README.md
Lines changed: 93 additions & 2 deletions
diff --git a/data/generate_trainauglist.py b/data/generate_trainauglist.py
Lines changed: 23 additions & 0 deletions
diff --git a/data/xml2png_context.py b/data/xml2png_context.py
Lines changed: 89 additions & 0 deletions
diff --git a/data/xml2png_voc.py b/data/xml2png_voc.py
Lines changed: 82 additions & 0 deletions
diff --git a/experiment/blpseg-context/__init__.py b/experiment/blpseg-context/__init__.py
diff --git a/experiment/blpseg-context/config.py b/experiment/blpseg-context/config.py
Lines changed: 63 additions & 0 deletions
@@ -1,6 +1,6 @@
 MIT License
 
-Copyright (c) 2023 Hibercraft
+Copyright (c) 2022 Hibercraft
 
 Permission is hereby granted, free of charge, to any person obtaining a copy
 of this software and associated documentation files (the "Software"), to deal
 
@@ -1,4 +1,95 @@
 # BLPSeg
-The implementation of [BLPSeg: Balance the Label Preference in Scribble-Supervised Semantic Segmentation](https://github.com/YudeWang/BLPSeg).
 
-The code is coming soon.
+The implementation of [**BLPSeg: Balance the Label Preference in Scribble-Supervised Semantic Segmentation**](https://ieeexplore.ieee.org/abstract/document/10225696).
+
+## Abstract
+
+Scribble-supervised semantic segmentation is an appealing weakly supervised technique with low labeling cost. Existing approaches mainly consider diffusing the labeled region of scribble by low-level feature similarity to narrow the supervision gap between scribble labels and mask labels. In this study, we observe an annotation bias between scribble and object mask, i.e., label workers tend to scribble on the spacious region instead of corners. This label preference makes the model learn well on those frequently labeled regions but poor on rarely labeled pixels. Therefore, we propose BLPSeg to balance the label preference for complete segmentation. Specifically, the BLPSeg first predicts an annotation probability map to evaluate the rarity of labels on each image, then utilizes a novel BLP loss to balance the model training by up-weighting those rare annotations. Additionally, to further alleviate the impact of label preference, we design a local aggregation module (LAM) to propagate supervision from labeled to unlabeled regions in gradient backpropagation. We conduct extensive experiments to illustrate the effectiveness of our BLPSeg. Our single-stage method even outperforms other advanced multi-stage methods and achieves state-of-the-art performance.
+
+## Installation
+
+- Linux with Python 3.6
+- pytorch 1.13.0, torchvision 0.14.0
+- CUDA 11.7
+- 2 x TITAN RTX GPUs (24G)
+- `pip install -r requirements.txt`
+
+
+## Getting Started
+
+### Preparing Dataset
+
+This repository supports the PASCAL VOC 2012 and PASCAL-Context datasets. Organize them as follows (soft links are recommended):
+```
+data/
+	VOCdevkit/
+		VOC2012/
+			Annotations/
+			JPEGImages/
+			ImageSets/
+			SegmentationClass/
+			SegmentationClassAug/
+				xxxx.png
+				......
+			SegmentationObject/
+		Context/
+			ImageSets/
+			JPEGImages/
+			SegmentationClass/
+		scribble_annotation/
+			pascal_2012/
+			pascal_2012_label/
+			pascal_context/	
+			pascal_context_label/
+```
+
+1. Download the PASCAL VOC 2012 dataset following the [official instructions](http://host.robots.ox.ac.uk/pascal/VOC/voc2012/#devkit).
+2. Download the PASCAL VOC 2012 trainaug annotations (10582 images) from [here](https://www.dropbox.com/s/oeu149j8qtbs1x0/SegmentationClassAug.zip?dl=0) and place the folder at `data/VOCdevkit/VOC2012/SegmentationClassAug/`.
+3. Generate the training list file `data/VOCdevkit/VOC2012/ImageSets/Segmentation/trainaug.txt` for the trainaug set (1464 images from the official VOC12 dataset plus 9118 additional images determined by `data/VOCdevkit/VOC2012/SegmentationClassAug`):
+```
+cd data
+python generate_trainauglist.py
+```
+4. Download the PASCAL-Context dataset from [here](https://www.cs.stanford.edu/~roozbeh/pascal-context/).
+5. Download the scribble annotations from [PASCAL-Scribble](https://jifengdai.org/downloads/scribble_sup/), then convert the `.xml` scribble annotation files into `.png` pixel-level annotations:
+```
+cd data
+python xml2png_voc.py
+python xml2png_context.py
+```
+
+### Train & Evaluation 
+
+We take the experiments on PASCAL VOC 2012 as an example. First, switch to the experiment folder:
+```
+cd experiment/blpseg-voc
+```
+Set the corresponding options in `config.py`, then run:
+```
+python train.py
+```
+Check `config_dict['TEST_CKPT']` in `config.py`, then run the evaluation script:
+```
+python test.py
+```
+## Model Zoo
+
+| Model | Dataset | mIoU% (w/o CRF) | Download|
+|:------|:--------|------|---------|
+| BLPSeg-res101 | PASCAL VOC 2012 | 77.559 | [Google Drive](https://drive.google.com/file/d/13UJZOZVIZDkdbYAhEANJks8in2sbCD93/view?usp=sharing)/[Baiduyun Drive](https://pan.baidu.com/s/1iuKk-8AgMjK78SyEOtj_ow?pwd=d9ie)(code: d9ie) |
+| BLPSeg-res101 | PASCAL-Context | 45.745 | [Google Drive](https://drive.google.com/file/d/1TiVU2toU6wr1_xa6nbVuP29up_Wt4tff/view?usp=sharing)/[Baiduyun Drive](https://pan.baidu.com/s/155noxNOA9EnTZ4_6Yy01sA?pwd=pls1)(code: pls1) |
+
+## Citations
+
+Please cite our paper if the code is helpful to your research.
+
+```
+@article{wang2023blpseg,
+  title={BLPSeg: Balance the Label Preference in Scribble-Supervised Semantic Segmentation},
+  author={Wang, Yude and Zhang, Jie and Kan, Meina and Shan, Shiguang and Chen, Xilin},
+  journal={IEEE Transactions on Image Processing},
+  year={2023},
+  publisher={IEEE}
+}
+```
+
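The directory layout described in the README can be checked before training. The following is a minimal sketch, not part of the repository: `missing_dirs` and `EXPECTED_DIRS` are hypothetical names, and the list covers only a subset of the folders from the README tree.

```python
import os

# Sub-directories taken from the layout in the README; the list is an
# illustrative subset and the helper below is hypothetical.
EXPECTED_DIRS = [
    "VOC2012/JPEGImages",
    "VOC2012/ImageSets",
    "VOC2012/SegmentationClassAug",
    "Context/JPEGImages",
    "Context/SegmentationClass",
    "scribble_annotation/pascal_2012",
    "scribble_annotation/pascal_context",
]

def missing_dirs(devkit_root, expected=EXPECTED_DIRS):
    """Return the expected sub-directories that do not exist under devkit_root."""
    return [d for d in expected
            if not os.path.isdir(os.path.join(devkit_root, d))]
```

Calling `missing_dirs("data/VOCdevkit")` from the repository root would list whatever still needs to be downloaded or linked; an empty list means the checked folders are all in place.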
@@ -0,0 +1,23 @@
+import os
+import argparse
+import pandas as pd
+
+if __name__ == '__main__':
+	parser = argparse.ArgumentParser()
+	parser.add_argument('--list_folder', type=str, default='./VOCdevkit/VOC2012/ImageSets/Segmentation')
+	parser.add_argument('--aug_folder', type=str, default='./VOCdevkit/VOC2012/SegmentationClassAug')
+	args = parser.parse_args()
+
+	train_file = os.path.join(args.list_folder, 'train.txt')
+	val_file = os.path.join(args.list_folder, 'val.txt')
+	trainaug_file = os.path.join(args.list_folder, 'trainaug.txt')
+	train_list = pd.read_csv(train_file, names=['filename'])['filename'].values
+	val_list = set(pd.read_csv(val_file, names=['filename'])['filename'].values)
+	files = os.listdir(args.aug_folder)
+	# trainaug = every augmented mask whose image is not in the official val split
+	with open(trainaug_file, 'w') as trainaug_txt_file:
+		for f in files:
+			fname = os.path.splitext(f)[0]
+			if fname not in val_list:
+				trainaug_txt_file.write(fname+'\n')
+	
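The list generation in this script can also be sketched as a standalone function without pandas. This is an illustrative variant, not the author's code: `build_trainaug_list` is a hypothetical name, and unlike the script it sorts the output and skips files that do not end in `.png`.

```python
import os

def build_trainaug_list(aug_folder, val_file, out_file):
    """Write every mask stem in aug_folder that is not listed in val_file."""
    with open(val_file) as f:
        val_ids = {line.strip() for line in f if line.strip()}
    # Assumption: masks are stored as <image id>.png
    stems = sorted(
        os.path.splitext(name)[0]
        for name in os.listdir(aug_folder)
        if name.endswith(".png")
    )
    kept = [s for s in stems if s not in val_ids]
    with open(out_file, "w") as f:
        f.writelines(s + "\n" for s in kept)
    return kept
```

Returning the kept ids makes it easy to verify the expected count (10582 for trainaug, according to the README) right after generation.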
@@ -0,0 +1,89 @@
+import os
+import argparse
+import numpy as np
+import xml.dom.minidom as minidom
+from xml.dom.minidom import parse
+from PIL import Image
+from tqdm import tqdm
+import time
+
+def xml2dict(xml_file):
+    result = {}
+    tree = minidom.parse(xml_file)
+    collection = tree.documentElement
+    size = collection.getElementsByTagName('size')[0]
+    h = int(size.getElementsByTagName('height')[0].childNodes[0].data)
+    w = int(size.getElementsByTagName('width')[0].childNodes[0].data)
+    result['size'] = (h,w)
+    result['filename'] = collection.getElementsByTagName('filename')[0].childNodes[0].data
+    polygons = collection.getElementsByTagName('polygon')
+    polygon_list = []
+    for polygon in polygons:
+        single_polygon_dict = {}
+        single_polygon_dict['category'] = polygon.getElementsByTagName('tag')[0].childNodes[0].data
+        points = polygon.getElementsByTagName('point')
+        point_list = []
+        for point in points:
+            x = int(point.getElementsByTagName('X')[0].childNodes[0].data)
+            y = int(point.getElementsByTagName('Y')[0].childNodes[0].data)
+            x = max(min(x,w-1),0)
+            y = max(min(y,h-1),0)
+            point_list.append((y,x))
+        single_polygon_dict['points'] = point_list
+        polygon_list.append(single_polygon_dict)
+    result['polygons'] = polygon_list
+    return result
+
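+def drawline(img, pos1, pos2, value):
+    # Rasterize the segment pos1 -> pos2 (both given as (row, col)) by stepping along its longer axis
+    r1,c1 = pos1
+    r2,c2 = pos2
+    m = max(np.abs(r1-r2), np.abs(c1-c2))
+    if m <= 1:
+        # Identical or adjacent points: the caller marks both endpoints itself
+        return img
+    delta_r = (r2-r1)/m
+    delta_c = (c2-c1)/m
+    for i in range(m):
+        r = int(r1 + delta_r*i)
+        c = int(c1 + delta_c*i)
+        img[r,c] = value
+    return img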
+
+if __name__ == '__main__':
+    parser = argparse.ArgumentParser()
+    parser.add_argument('--xml', type=str, default='./VOCdevkit/scribble_annotation/pascal_context')
+    parser.add_argument('--save', type=str, default='./VOCdevkit/scribble_annotation/pascal_context_label')
+    args = parser.parse_args()
+
+    cls2idx = {'background':0, 'plane': 1, 'bike': 2, 'bird': 3, 'boat': 4, 'bottle': 5, 'bus': 6, 
+                'car': 7, 'cat': 8, 'chair': 9, 'cow': 10, 'table': 11, 'dog': 12, 'horse': 13, 'motorbike': 14, 
+                'person': 15, 'plant': 16, 'sheep': 17, 'sofa': 18, 'train': 19, 'monitor': 20, 'bag': 21, 'bed': 22, 
+                'bench': 23, 'book': 24, 'building': 25, 'cabinet': 26, 'ceiling': 27, 'cloth': 28, 'computer': 29, 
+                'cup': 30, 'door': 31, 'fence': 32, 'floor': 33, 'flower': 34, 'food': 35, 'grass': 36, 'ground': 37, 
+                'keyboard': 38, 'light': 39, 'mountain': 40, 'mouse': 41, 'curtain': 42, 'platform': 43, 'sign': 44, 
+                'plate': 45, 'road': 46, 'rock': 47, 'shelves': 48, 'sidewalk': 49, 'sky': 50, 'snow': 51, 'bedclothes': 52, 
+                'track': 53, 'tree': 54, 'truck': 55, 'wall': 56, 'water': 57, 'window': 58, 'wood': 59}
+    g = os.walk(args.xml)
+    if not os.path.exists(args.save):
+        os.makedirs(args.save)
+    
+    for path, dir_list, file_list in g:
+        with tqdm(total=len(file_list)) as pbar:
+            pbar.set_description('Processing:')
+            for file_name in file_list:
+                filename = os.path.join(path, file_name)
+                info = xml2dict(filename)
+                label = np.ones(info['size'])*255
+                for polygon in info['polygons']:
+                    clsidx = cls2idx[polygon['category']]
+                    for i in range(len(polygon['points'])-1):
+                        point1 = polygon['points'][i]
+                        point2 = polygon['points'][i+1]
+                        label = drawline(label, point1, point2, clsidx)
+                        label[point1] = clsidx
+                        label[point2] = clsidx
+                label = label.astype(np.uint8)
+                label = Image.fromarray(label)
+                out_name = os.path.join(args.save, file_name.replace('.xml','.png'))
+                label.save(out_name)
+                time.sleep(0.01)
+                pbar.update(1)
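For comparison, the segment rasterization in `drawline` can be written in vectorized form. This is a sketch under stated assumptions, not a drop-in replacement: `draw_segment` is a hypothetical name, it rounds to the nearest pixel where `drawline` truncates with `int()`, and it marks both endpoints itself.

```python
import numpy as np

def draw_segment(label, p1, p2, value):
    """Mark the pixels on the straight segment p1 -> p2 (row, col) with value."""
    (r1, c1), (r2, c2) = p1, p2
    # One sample per step along the longer axis, endpoints included
    n = max(abs(r2 - r1), abs(c2 - c1)) + 1
    rows = np.rint(np.linspace(r1, r2, n)).astype(int)
    cols = np.rint(np.linspace(c1, c2, n)).astype(int)
    label[rows, cols] = value
    return label

# 255 marks unlabeled pixels, as in the conversion script
label = np.full((5, 5), 255, dtype=np.uint8)
draw_segment(label, (0, 0), (4, 4), 3)
```

Sampling along the longer axis guarantees at least one marked pixel per row or column crossed, so the scribble stays connected.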
@@ -0,0 +1,82 @@
+import os
+import argparse
+import numpy as np
+import xml.dom.minidom as minidom
+from xml.dom.minidom import parse
+from PIL import Image
+from tqdm import tqdm
+import time
+
+def xml2dict(xml_file):
+    result = {}
+    tree = minidom.parse(xml_file)
+    collection = tree.documentElement
+    size = collection.getElementsByTagName('size')[0]
+    h = int(size.getElementsByTagName('height')[0].childNodes[0].data)
+    w = int(size.getElementsByTagName('width')[0].childNodes[0].data)
+    result['size'] = (h,w)
+    result['filename'] = collection.getElementsByTagName('filename')[0].childNodes[0].data
+    polygons = collection.getElementsByTagName('polygon')
+    polygon_list = []
+    for polygon in polygons:
+        single_polygon_dict = {}
+        single_polygon_dict['category'] = polygon.getElementsByTagName('tag')[0].childNodes[0].data
+        points = polygon.getElementsByTagName('point')
+        point_list = []
+        for point in points:
+            x = int(point.getElementsByTagName('X')[0].childNodes[0].data)
+            y = int(point.getElementsByTagName('Y')[0].childNodes[0].data)
+            x = max(min(x,w-1),0)
+            y = max(min(y,h-1),0)
+            point_list.append((y,x))
+        single_polygon_dict['points'] = point_list
+        polygon_list.append(single_polygon_dict)
+    result['polygons'] = polygon_list
+    return result
+
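+def drawline(img, pos1, pos2, value):
+    # Rasterize the segment pos1 -> pos2 (both given as (row, col)) by stepping along its longer axis
+    r1,c1 = pos1
+    r2,c2 = pos2
+    m = max(np.abs(r1-r2), np.abs(c1-c2))
+    if m <= 1:
+        # Identical or adjacent points: the caller marks both endpoints itself
+        return img
+    delta_r = (r2-r1)/m
+    delta_c = (c2-c1)/m
+    for i in range(m):
+        r = int(r1 + delta_r*i)
+        c = int(c1 + delta_c*i)
+        img[r,c] = value
+    return img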
+
+if __name__ == '__main__':
+    parser = argparse.ArgumentParser()
+    parser.add_argument('--xml', type=str, default='./VOCdevkit/scribble_annotation/pascal_2012')
+    parser.add_argument('--save', type=str, default='./VOCdevkit/scribble_annotation/pascal_2012_label')
+    args = parser.parse_args()
+
+    cls2idx = {'background':0, 'plane':1, 'bike':2, 'bird':3, 'boat':4, 'bottle':5, 'bus':6, 'car':7, 'cat':8, 'chair':9, 'cow':10,
+              'table':11, 'dog':12, 'horse':13, 'motorbike':14, 'person':15, 'plant':16, 'sheep':17, 'sofa':18, 'train':19, 'monitor':20}
+    g = os.walk(args.xml)
+    if not os.path.exists(args.save):
+        os.makedirs(args.save)
+    for path, dir_list, file_list in g:
+        with tqdm(total=len(file_list)) as pbar:
+            pbar.set_description('Processing:')
+            for file_name in file_list:
+                filename = os.path.join(path, file_name)
+                info = xml2dict(filename)
+                label = np.ones(info['size'])*255
+                for polygon in info['polygons']:
+                    clsidx = cls2idx[polygon['category']]
+                    for i in range(len(polygon['points'])-1):
+                        point1 = polygon['points'][i]
+                        point2 = polygon['points'][i+1]
+                        label = drawline(label, point1, point2, clsidx)
+                        label[point1] = clsidx
+                        label[point2] = clsidx
+                label = label.astype(np.uint8)
+                label = Image.fromarray(label)
+                out_name = os.path.join(args.save, file_name.replace('.xml','.png'))
+                label.save(out_name)
+                time.sleep(0.01)
+                pbar.update(1)
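The PNG labels written by this script use 255 for unlabeled pixels and the `cls2idx` index for scribbled ones. A quick way to sanity-check a converted label is to measure how sparse it is. The sketch below assumes the label is already loaded as a NumPy array; `scribble_stats` and `IGNORE_INDEX` are hypothetical names.

```python
import numpy as np

IGNORE_INDEX = 255  # value the conversion script leaves on unlabeled pixels

def scribble_stats(label):
    """Return (fraction of labeled pixels, sorted class indices present)."""
    labeled = label != IGNORE_INDEX
    classes = sorted(int(c) for c in np.unique(label[labeled]))
    return float(labeled.mean()), classes

# Toy 4x5 label: three 'person' pixels (index 15) and one 'background' pixel (index 0)
label = np.full((4, 5), IGNORE_INDEX, dtype=np.uint8)
label[1, 1:4] = 15
label[3, 0] = 0
fraction, classes = scribble_stats(label)
```

On real scribble labels the labeled fraction is expected to be small, since only thin strokes are annotated; a fraction of zero would suggest that the class names in the XML did not match `cls2idx` or that the file held no polygons.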
@@ -0,0 +1,63 @@
+# ----------------------------------------
+# Written by Yude Wang
+# ----------------------------------------
+import torch
+import argparse
+import os
+import sys
+import cv2
+import time
+
+config_dict = {
+		'EXP_NAME': 'blpseg-context',
+
+		'DATA_NAME': 'ContextDataset',
+		'DATA_YEAR': 2012,
+		'DATA_AUG': True,
+		'DATA_WORKERS': 4,
+		'DATA_MEAN': [0.485, 0.456, 0.406],
+		'DATA_STD': [0.229, 0.224, 0.225],
+		'DATA_RANDOMSCALE': [0.75, 1.25],
+		'DATA_RANDOM_H': 10,
+		'DATA_RANDOM_S': 10,
+		'DATA_RANDOM_V': 10,
+		'DATA_RANDOMCROP': 384,
+		'DATA_RANDOMROTATION': 0,
+		'DATA_RANDOMFLIP': 0.5,
+
+		'MODEL_NAME': 'BLPSeg',
+		'MODEL_BACKBONE': 'resnet101',
+		'MODEL_BACKBONE_PRETRAIN': True,
+		'MODEL_PPM_DIM': 256,
+		'MODEL_NUM_CLASSES': 60,
+		'MODEL_FREEZEBN': False,
+		'MODEL_LAM_SIGMA': 6,
+
+		'LOSS_GAMMA': 2.0,
+		'LOSS_UNLABEL_CLASS_W': 0.02,
+
+		'TRAIN_LR': 2.4e-5,
+		'TRAIN_MOMENTUM': 0.9,
+		'TRAIN_WEIGHT_DECAY': 0.01,
+		'TRAIN_BN_MOM': 0.1,
+		'TRAIN_POWER': 0.9,
+		'TRAIN_BATCHES': 8,
+		'TRAIN_SHUFFLE': False,
+		'TRAIN_MINEPOCH': 0,
+		'TRAIN_EPOCHS': 85,
+		'TRAIN_TBLOG': True,
+		'TRAIN_ST_POINT': 20000,
+
+		'TEST_MULTISCALE': [0.5, 0.75, 1, 1.25],
+		'TEST_FLIP': True,
+		'TEST_CRF': False,
+		'TEST_BATCHES': 1,		
+}
+
+config_dict['ROOT_DIR'] = os.path.abspath(os.path.join(os.path.dirname(__file__),'..','..'))
+config_dict['MODEL_SAVE_DIR'] = os.path.join(config_dict['ROOT_DIR'],'model',config_dict['EXP_NAME'])
+config_dict['TRAIN_CKPT'] = None
+config_dict['LOG_DIR'] = os.path.join(config_dict['ROOT_DIR'],'log',config_dict['EXP_NAME'])
+config_dict['TEST_CKPT'] = os.path.join(config_dict['ROOT_DIR'],f'model/{config_dict["EXP_NAME"]}/BLPSeg_resnet101_ContextDataset_epoch85.pth')
+
+sys.path.insert(0, os.path.join(config_dict['ROOT_DIR'], 'lib'))
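How the repository root is resolved depends on whether the path is anchored to the module location or to the working directory. The sketch below illustrates the difference between the string literal `"__file__"` and the module attribute `__file__`; `root_from` is a hypothetical helper and `/repo/experiment/blpseg-context` is an assumed location used only for illustration.

```python
import os

# A quoted "__file__" is an ordinary string with no directory part,
# so its dirname is the empty string.
literal_dir = os.path.dirname("__file__")

def root_from(base_dir):
    """Go two levels up from base_dir, as the ROOT_DIR expression does."""
    return os.path.abspath(os.path.join(base_dir, "..", ".."))

# With an empty base, the result is resolved against os.getcwd()
cwd_based_root = root_from(literal_dir)

# With the directory of the config module (hypothetical path), the result
# no longer depends on where the script is launched from
module_based_root = root_from("/repo/experiment/blpseg-context")
```

Both forms give the same root when the scripts are launched from inside the experiment folder, as the README instructs; they differ only when the working directory is somewhere else.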