
Hello, thank you for open-sourcing your excellent work. While testing your model I found some flaws in the app code #17

@f-chen165

Description


1. First, the video_input.change wiring of the data-loading component: process_example modifies video_input, which causes the loading event to keep re-triggering itself. My fix is below (the idea is to neither modify video_input nor return it):
video_input.change(
    fn=process_example,
    inputs=[
        video_input,
        video_caption,
        target_region_frame1_caption,
        point_prompt,
        click_state
    ],
    outputs=[
        video_caption,
        target_region_frame1_caption,
        inference_state,
        video_state,
        video_info,
        template_frame,
        image_selection_slider,
        track_pause_number_slider,
        point_prompt,
        clear_button_click,
        tracking_video_predict_button,
        video_output,
        inpaint_video_predict_button,
        run_status,
        # video_input  <- removed so the change event does not re-fire
    ]
)
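This is general Gradio behavior rather than anything specific to this repo: when a .change handler's outputs include the component it listens on, writing a new value back re-fires the event. A minimal self-contained sketch of the difference (hypothetical component names):

import gradio as gr

with gr.Blocks() as demo:
    video_input = gr.Video()
    status = gr.Textbox(label="status")

    def on_change(path):
        # Return only derived outputs. If video_input were also listed in
        # outputs and given a new value here, this .change event would fire
        # again on every upload, looping the loading logic.
        return f"loaded: {path}"

    video_input.change(fn=on_change, inputs=[video_input], outputs=[status])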

Extract frames from the uploaded video

def get_frames_from_video(video_input, video_state):
    video_path = video_input
    frames = []
    user_name = time.time()
    vr = VideoReader(video_path)
    original_fps = vr.get_avg_fps()

    # If fps > 8, downsample frames to roughly 8 fps
    if original_fps > 8:
        total_frames = len(vr)
        sample_interval = max(1, int(original_fps / 8))
        frame_indices = list(range(0, total_frames, sample_interval))
        frames = vr.get_batch(frame_indices).asnumpy()
    else:
        frames = vr.get_batch(list(range(len(vr)))).asnumpy()

    # Take only the first 49 frames
    frames = frames[:49]

    # Resize all frames to 480x720 (cv2.resize takes (width, height))
    resized_frames = []
    for frame in frames:
        resized_frame = cv2.resize(frame, (720, 480))
        resized_frames.append(resized_frame)
    frames = np.array(resized_frames)

    init_start = time.time()
    inference_state = predictor.init_state(images=frames, offload_video_to_cpu=True, async_loading_frames=True)
    init_time = time.time() - init_start
    print(f"Inference state initialization took {init_time:.2f}s")

    fps = 8
    image_size = (frames[0].shape[0], frames[0].shape[1])
    # Initialize video_state
    video_state = {
        "user_name": user_name,
        "video_name": os.path.split(video_path)[-1],
        "origin_images": frames,
        "painted_images": frames.copy(),
        "masks": [np.zeros((frames[0].shape[0], frames[0].shape[1]), np.uint8)] * len(frames),
        "logits": [None] * len(frames),
        "select_frame_number": 0,
        "fps": fps,
        "ann_obj_id": 0
    }
    video_info = "Video Name: {}, FPS: {}, Total Frames: {}, Image Size: {}".format(
        video_state["video_name"], video_state["fps"], len(frames), image_size)

    video_input_tem = generate_video_from_frames(frames, output_path=f"{GRADIO_TEMP_DIR}/inpaint/{video_state['video_name']}", fps=video_state["fps"])

    return (
        gr.update(visible=True),
        gr.update(visible=True),
        inference_state,
        video_state,
        video_info,
        video_state["origin_images"][0],
        gr.update(visible=False, maximum=len(frames), value=1, interactive=True),
        gr.update(visible=False, maximum=len(frames), value=len(frames), interactive=True),
        gr.update(visible=True, interactive=True),
        gr.update(visible=True, interactive=True),
        gr.update(visible=True, interactive=True),
        gr.update(visible=True),
        gr.update(visible=True, interactive=True),  # inpaint_video_predict_button (see point 3)
        create_status("Video uploaded. Try clicking the image to add targets to track and inpaint.", StatusMessage.SUCCESS),
        # video_input  <- removed from the return tuple
    )
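One note on the sampling arithmetic above: int(original_fps / 8) rounds down, so the effective rate ends up at or above 8 fps rather than exactly 8. A quick check:

original_fps = 30
sample_interval = max(1, int(original_fps / 8))  # int(3.75) -> 3
effective_fps = original_fps / sample_interval   # 10.0 fps, not 8.0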

2. Because the change in point 1 returns one fewer value, running the example test cases fails with an error about a missing argument. My fix was simply to comment out the last return value there as well (sorry, I forgot exactly where that code is).
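For reference, the shape of that change is just trimming the trailing video_input value so the handler returns exactly as many values as the shortened outputs list expects. A sketch, with load_example as a hypothetical stand-in for whatever helper the example path actually calls:

def process_example(video_input, video_caption, target_region_frame1_caption,
                    point_prompt, click_state):
    results = load_example(video_input, video_caption,
                           target_region_frame1_caption,
                           point_prompt, click_state)  # hypothetical helper
    # Drop the trailing video_input echo so len(results) matches outputs.
    return results[:-1]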

3. I found that the Inpainting button could not be clicked after SAM2 finished processing. My fix was to set interactive=True once the data has finished loading (the code for this is already in point 1).
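In isolation, the change for point 3 is just adding interactive=True to the update that targets inpaint_video_predict_button (illustrative; its actual position in the return tuple is shown in the function above):

import gradio as gr

# Before: the button became visible after loading but stayed disabled.
inpaint_btn_update = gr.update(visible=True)
# After: clickable once frames and inference_state are ready.
inpaint_btn_update = gr.update(visible=True, interactive=True)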

Activity

f-chen165 (Author) commented on Mar 27, 2025

To add to the above: tracking_video_predict_button also runs into an error. My fix is as follows:

Track the video from the selected image and mask

tracking_video_predict_button.click(
    fn=vos_tracking_video,
    inputs=[inference_state, video_state, interactive_state, run_status],
    outputs=[
        inference_state, 
        video_output, 
        video_state, 
        interactive_state, 
        run_status,
        inpaint_video_predict_button,
        enhance_button,
        enhance_target_region_frame1_button,
        enhance_editing_instruction_button,
        # notes_accordion  # Use the accordion reference instead of string
    ]
)

VOS tracking

def vos_tracking_video(inference_state, video_state, interactive_state, previous_status):
    height, width = video_state["origin_images"][0].shape[0:2]

    masks = []
    for out_frame_idx, out_obj_ids, out_mask_logits in predictor.propagate_in_video(inference_state):
        mask = np.zeros([480, 720, 1])
        for i in range(len(out_mask_logits)):
            out_mask = out_mask_logits[i].cpu().squeeze().detach().numpy()
            out_mask[out_mask > 0] = 1
            out_mask[out_mask <= 0] = 0
            out_mask = out_mask[:, :, None]
            mask += out_mask
        mask = cv2.resize(mask, (width, height))
        mask = mask[:, :, None]
        mask[mask > 0.5] = 1
        mask[mask < 1] = 0
        mask = scipy.ndimage.binary_dilation(mask, iterations=6)
        masks.append(mask)
    masks = np.array(masks)

    painted_images = None
    if interactive_state["track_end_number"]:
        video_state["masks"][video_state["select_frame_number"]:interactive_state["track_end_number"]] = masks
        org_images = video_state["origin_images"][video_state["select_frame_number"]:interactive_state["track_end_number"]]
        color = 255 * np.ones((1, org_images.shape[-3], org_images.shape[-2], 3)) * np.array([[[[0, 1, 1]]]])
        painted_images = np.uint8((1 - 0.5 * masks) * org_images + 0.5 * masks * color)
        video_state["painted_images"][video_state["select_frame_number"]:interactive_state["track_end_number"]] = painted_images
    else:
        video_state["masks"] = masks
        org_images = video_state["origin_images"]
        color = 255 * np.ones((1, org_images.shape[-3], org_images.shape[-2], 3)) * np.array([[[[0, 1, 1]]]])
        painted_images = np.uint8((1 - 0.5 * masks) * org_images + 0.5 * masks * color)
        video_state["painted_images"] = painted_images
    if painted_images is not None:
        video_output = generate_video_from_frames(video_state["painted_images"], output_path=f"{GRADIO_TEMP_DIR}/track/{video_state['video_name']}", fps=video_state["fps"])
    else:
        raise ValueError("No tracking images found")
    interactive_state["inference_times"] += 1

    print(f"func-vos_tracking_video: {video_output}")
    return (
        inference_state,
        video_output,
        video_state,
        interactive_state,
        update_status(previous_status, "Track the selected target region, and then you can use the masks for inpainting.", StatusMessage.SUCCESS),
        gr.Button.update(visible=True, interactive=True),  # inpaint_video_predict_button
        gr.Button.update(visible=True, interactive=True),  # enhance_button
        gr.Button.update(visible=True, interactive=True),  # enhance_target_region_frame1_button
        gr.Button.update(visible=True, interactive=True),  # enhance_editing_instruction_button
        # gr.Accordion(label="My Accordion", open=True)  # open the accordion
    )
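The underlying rule here is general Gradio behavior, not anything VideoPainter-specific: a handler must return exactly as many values as its outputs list names, in the same order, or the event fails with an error like the one above. A minimal reproduction:

import gradio as gr

with gr.Blocks() as demo:
    a = gr.Textbox()
    b = gr.Textbox()
    btn = gr.Button("run")

    def handler():
        # Returning a single value while outputs lists two components
        # errors at click time; returning one value per component fixes it.
        return "first", "second"

    btn.click(fn=handler, inputs=[], outputs=[a, b])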
yxbian23 (Collaborator) commented on Apr 8, 2025


Thank you for your interest in our work! We greatly appreciate your suggestions!

If you have time, could you submit a PR? This would ensure that I can apply your optimizations exactly as intended. If not, I will review and implement the optimizations you mentioned at a later time.

f-chen165 (Author) commented on Apr 9, 2025


I'm really sorry: in my later testing I made fairly large changes to the code, and I can no longer roll back to the version you would want.

yxbian23 (Collaborator) commented on Apr 9, 2025


Oh! Still, big thanks to you! I will review everything you mentioned above and optimize the current app code! 😄
