A curated list of the latest research papers, projects, and resources related to DiT/FLUX. The content is updated automatically every day.
Last Update: 2025-06-20 06:35:16
Thanks to @longxiang-ai for the template.
- Image Editing (39 papers) - Papers about image editing with Diffusion Transformer or FLUX
- Image Generation (201 papers) - Papers focusing on image generation with Diffusion Transformer or FLUX
- Video Related (138 papers) - Papers about video generation and editing with Diffusion Transformer or FLUX
- VINCIE: Unlocking In-context Image Editing from Video (Published: 2025-06-12)
Authors: Leigang Qu, Feng Cheng, Ziyan Yang, Qi Zhao, Shanchuan Lin, Yichun Shi, Yicong Li, Wenjie Wang, Tat-Seng Chua, Lu Jiang
Links:
Keywords: image editing, diffusion transformer
- Flow Diverse and Efficient: Learning Momentum Flow Matching via Stochastic Velocity Field Sampling (Published: 2025-06-10)
Authors: Zhiyuan Ma, Ruixun Liu, Sixian Liu, Jianjun Li, Bowen Zhou
Links:
Keywords: rectified flow, FLUX
- Image Editing As Programs with Diffusion Models (Published: 2025-06-04)
Authors: Yujia Hu, Songhua Liu, Zhenxiong Tan, Xingyi Yang, Xinchao Wang
Links:
Keywords: text-to-image, image generation, image editing, diffusion transformer
- ByteMorph: Benchmarking Instruction-Guided Image Editing with Non-Rigid Motions (Published: 2025-06-03)
Authors: Di Chang, Mingdeng Cao, Yichun Shi, Bo Liu, Shengqu Cai, Shijie Zhou, Weilin Huang, Gordon Wetzstein, Mohammad Soleymani, Peng Wang
Links:
Keywords: image editing, diffusion transformer
- RelationAdapter: Learning and Transferring Visual Relation with Diffusion Transformers (Published: 2025-06-03)
Authors: Yan Gong, Yiren Song, Yicheng Li, Chenglin Li, Yin Zhang
Links:
Keywords: image editing, diffusion transformer
- LoRAShop: Training-Free Multi-Concept Image Generation and Editing with Rectified Flow Transformers (Published: 2025-05-29)
Authors: Yusuf Dalva, Hidir Yesiltepe, Pinar Yanardag
Links:
Keywords: image generation, rectified flow, image editing, FLUX, diffusion transformer
- HiDream-I1: A High-Efficient Image Generative Foundation Model with Sparse Diffusion Transformer (Published: 2025-05-28)
Authors: Qi Cai, Jingwen Chen, Yang Chen, Yehao Li, Fuchen Long, Yingwei Pan, Zhaofan Qiu, Yiheng Zhang, Fengbin Gao, Peihan Xu, Yimeng Wang, Kai Yu, Wenxuan Chen, Ziwei Feng, Zijian Gong, Jianzhuang Pan, Yi Peng, Rui Tian, Siyu Wang, Bo Zhao, Ting Yao, Tao Mei
Links:
Keywords: text-to-image, image generation, image editing, diffusion transformer
- DanceGRPO: Unleashing GRPO on Visual Generation (Published: 2025-05-12)
Authors: Zeyue Xue, Jie Wu, Yu Gao, Fangyuan Kong, Lingting Zhu, Mengzhao Chen, Zhiheng Liu, Wei Liu, Qiushan Guo, Weilin Huang, Ping Luo
Links:
Keywords: text-to-image, video generation, rectified flow, FLUX
- Lay-Your-Scene: Natural Scene Layout Generation with Diffusion Transformers (Published: 2025-05-07)
Authors: Divyansh Srivastava, Xiang Zhang, He Wen, Chenru Wen, Zhuowen Tu
Links:
Keywords: image generation, Control, Controllable, image editing, diffusion transformer
- In-Context Edit: Enabling Instructional Image Editing with In-Context Generation in Large Scale Diffusion Transformer (Published: 2025-04-29)
Authors: Zechuan Zhang, Ji Xie, Yu Lu, Zongxin Yang, Yi Yang
Links:
Keywords: image editing, diffusion transformer
- Disentangling Instruction Influence in Diffusion Transformers for Parallel Multi-Instruction-Guided Image Editing (Published: 2025-04-07)
Authors: Hui Liu, Bin Zou, Suiyun Zhang, Kecheng Chen, Rui Liu, Haoliang Li
Links:
Keywords: image editing, Control, diffusion transformer
- Detection Limits and Statistical Separability of Tree Ring Watermarks in Rectified Flow-based Text-to-Image Generation Models (Published: 2025-04-04)
Authors: Ved Umrajkar, Aakash Kumar Singh
Links:
Keywords: text-to-image, image generation, inversion, rectified flow, FLUX
- DiT4SR: Taming Diffusion Transformer for Real-World Image Super-Resolution (Published: 2025-03-30)
Authors: Zheng-Peng Duan, Jiawei Zhang, Xin Jin, Ziheng Zhang, Zheng Xiong, Dongqing Zou, Jimmy Ren, Chun-Le Guo, Chongyi Li
Links:
Keywords: image super-resolution, image generation, Control, diffusion transformer
- FreeFlux: Understanding and Exploiting Layer-Specific Roles in RoPE-Based MMDiT for Versatile Image Editing (Published: 2025-03-20)
Authors: Tianyi Wei, Yifan Zhou, Dongdong Chen, Xingang Pan
Links:
Keywords: text-to-image, image generation, image editing, FLUX, diffusion transformer
- Personalize Anything for Free with Diffusion Transformer (Published: 2025-03-16)
Authors: Haoran Feng, Zehuan Huang, Lin Li, Hairong Lv, Lu Sheng
Links:
Keywords: image editing, image generation, Control, diffusion transformer
- NAMI: Efficient Image Generation via Progressive Rectified Flow Transformers (Published: 2025-03-12)
Authors: Yuhang Ma, Bo Cheng, Shanyuan Liu, Ao Ma, Xiaoyu Wu, Liebucha Wu, Dawei Leng, Yuhui Yin
Links:
Keywords: image generation, rectified flow, diffusion transformer
- Bokeh Diffusion: Defocus Blur Control in Text-to-Image Diffusion Models (Published: 2025-03-11)
Authors: Armando Fortes, Tianyi Wei, Shangchen Zhou, Xingang Pan
Links:
Keywords: text-to-image, Control, inversion, image editing, FLUX
- Seedream 2.0: A Native Chinese-English Bilingual Image Generation Foundation Model (Published: 2025-03-10)
Authors: Lixue Gong, Xiaoxia Hou, Fanshi Li, Liang Li, Xiaochen Lian, Fei Liu, Liyang Liu, Wei Liu, Wei Lu, Yichun Shi, Shiqi Sun, Yu Tian, Zhi Tian, Peng Wang, Xun Wang, Ye Wang, Guofeng Wu, Jie Wu, Xin Xia, Xuefeng Xiao, Linjie Yang, Zhonghua Zhai, Xinyu Zhang, Qi Zhang, Yuwei Zhang, Shijia Zhao, Jianchao Yang, Weilin Huang
Links:
Keywords: image generation, image editing, FLUX
- TIDE : Temporal-Aware Sparse Autoencoders for Interpretable Diffusion Transformers in Image Generation (Published: 2025-03-10)
Authors: Victor Shea-Jay Huang, Le Zhuo, Yi Xin, Zhaokai Wang, Peng Gao, Hongsheng Li
Links:
Keywords: image editing, image generation, Control, diffusion transformer
- X2I: Seamless Integration of Multimodal Understanding into Diffusion Transformer via Attention Distillation (Published: 2025-03-08)
Authors: Jian Ma, Qirong Peng, Xu Guo, Chen Chen, Haonan Lu, Zhenyu Yang
Links:
Keywords: text-to-image, image generation, Control, image editing, image to image, diffusion transformer
- AnyRefill: A Unified, Data-Efficient Framework for Left-Prompt-Guided Vision Tasks (Published: 2025-02-16)
Authors: Ming Xie, Chenjie Cao, Yunuo Cai, Xiangyang Xue, Yu-Gang Jiang, Yanwei Fu
Links:
Keywords: text-to-image, image editing, diffusion transformer
- EliGen: Entity-Level Controlled Image Generation with Regional Attention (Published: 2025-01-02)
Authors: Hong Zhang, Zhongjie Duan, Xingjun Wang, Yingda Chen, Yu Zhang
Links:
Keywords: text-to-image, image generation, Control, image inpainting, diffusion transformer
- Context Canvas: Enhancing Text-to-Image Diffusion Models with Knowledge Graph-Based RAG (Published: 2024-12-12)
Authors: Kavana Venkatesh, Yusuf Dalva, Ismini Lourentzou, Pinar Yanardag
Links:
Keywords: text-to-image, Control, image editing, FLUX
- FluxSpace: Disentangled Semantic Editing in Rectified Flow Transformers (Published: 2024-12-12)
Authors: Yusuf Dalva, Kavana Venkatesh, Pinar Yanardag
Links:
Keywords: image generation, Control, rectified flow, image editing, FLUX
- AMO Sampler: Enhancing Text Rendering with Overshooting (Published: 2024-11-28)
Authors: Xixi Hu, Keyang Xu, Bo Liu, Qiang Liu, Hongliang Fei
Links:
Keywords: text-to-image, image generation, Control, rectified flow, FLUX
- Prediction with Action: Visual Policy Learning via Joint Denoising Process (Published: 2024-11-27)
Authors: Yanjiang Guo, Yucheng Hu, Jianke Zhang, Yen-Jen Wang, Xiaoyu Chen, Chaochao Lu, Jianyu Chen
Links:
Keywords: image editing, image generation, Control, diffusion transformer
- HeadRouter: A Training-free Image Editing Framework for MM-DiTs by Adaptively Routing Attention Heads (Published: 2024-11-22)
Authors: Yu Xu, Fan Tang, Juan Cao, Yuxin Zhang, Xiaoyu Kong, Jintao Li, Oliver Deussen, Tong-Yee Lee
Links:
Keywords: image generation, image editing, diffusion transformer
- Stable Flow: Vital Layers for Training-Free Image Editing (Published: 2024-11-21)
Authors: Omri Avrahami, Or Patashnik, Ohad Fried, Egor Nemchinov, Kfir Aberman, Dani Lischinski, Daniel Cohen-Or
Links:
Keywords: image editing, Control, diffusion transformer, inversion
- Oscillation Inversion: Understand the structure of Large Flow Model through the Lens of Inversion Method (Published: 2024-11-17)
Authors: Yan Zheng, Zhenxiao Liang, Xiaoyan Cong, Lanqing Guo, Yuehao Wang, Peihao Wang, Zhangyang Wang
Links:
Keywords: text-to-image, inversion, rectified flow, image editing, FLUX
- Latent Space Disentanglement in Diffusion Transformers Enables Precise Zero-shot Semantic Editing (Published: 2024-11-12)
Authors: Zitao Shuai, Chenwei Wu, Zhengxu Tang, Bowen Song, Liyue Shen
Links:
Keywords: image editing, image generation, Control, diffusion transformer
- Taming Rectified Flow for Inversion and Editing (Published: 2024-11-07)
Authors: Jiangshan Wang, Junfu Pu, Zhongang Qi, Jiayi Guo, Yue Ma, Nisha Huang, Yuxin Chen, Xiu Li, Ying Shan
Links:
Keywords: inversion, video generation, video editing, rectified flow, FLUX, diffusion transformer
- DiT4Edit: Diffusion Transformer for Image Editing (Published: 2024-11-05)
Authors: Kunyu Feng, Yue Ma, Bingyuan Wang, Chenyang Qi, Haozhe Chen, Qifeng Chen, Zeyu Wang
Links:
Keywords: image generation, Control, inversion, image editing, diffusion transformer
- FiTv2: Scalable and Improved Flexible Vision Transformer for Diffusion Model (Published: 2024-10-17)
Authors: ZiDong Wang, Zeyu Lu, Di Huang, Cai Zhou, Wanli Ouyang, Lei Bai
Links:
Keywords: image generation, rectified flow, diffusion transformer
- Semantic Image Inversion and Editing using Rectified Stochastic Differential Equations (Published: 2024-10-14)
Authors: Litu Rout, Yujia Chen, Nataniel Ruiz, Constantine Caramanis, Sanjay Shakkottai, Wen-Sheng Chu
Links:
Keywords: Control, inversion, rectified flow, image editing, FLUX
- Effective Diffusion Transformer Architecture for Image Super-Resolution (Published: 2024-09-29)
Authors: Kun Cheng, Lei Yu, Zhijun Tu, Xiao He, Liyu Chen, Yong Guo, Mingrui Zhu, Nannan Wang, Xinbo Gao, Jie Hu
Links:
Keywords: image generation, image super-resolution, diffusion transformer
- PixWizard: Versatile Image-to-Image Visual Assistant with Open-Language Instructions (Published: 2024-09-23)
Authors: Weifeng Lin, Xinyu Wei, Renrui Zhang, Le Zhuo, Shitian Zhao, Siyuan Huang, Huan Teng, Junlin Xie, Yu Qiao, Peng Gao, Hongsheng Li
Links:
Keywords: text-to-image, image generation, Control, Controllable, image editing, diffusion transformer
- Latent Space Disentanglement in Diffusion Transformers Enables Zero-shot Fine-grained Semantic Editing (Published: 2024-08-23)
Authors: Zitao Shuai, Chenwei Wu, Zhengxu Tang, Bowen Song, Liyue Shen
Links:
Keywords: image editing, text-to-image, Control, diffusion transformer
- Lazy Diffusion Transformer for Interactive Image Editing (Published: 2024-04-18)
Authors: Yotam Nitzan, Zongze Wu, Richard Zhang, Eli Shechtman, Daniel Cohen-Or, Taesung Park, Michaël Gharbi
Links:
Keywords: image editing, diffusion transformer
- Lightning-Fast Image Inversion and Editing for Text-to-Image Diffusion Models (Published: 2023-12-19)
Authors: Dvir Samuel, Barak Meiri, Haggai Maron, Yoad Tewel, Nir Darshan, Shai Avidan, Gal Chechik, Rami Ben-Ari
Links:
Keywords: text-to-image, inversion, image editing, FLUX
Showing the latest 50 out of 201 papers
- Evolutionary Caching to Accelerate Your Off-the-Shelf Diffusion Model (Published: 2025-06-18)
Authors: Anirud Aggarwal, Abhinav Shrivastava, Matthew Gwilliam
Links:
Keywords: image generation, Control, diffusion transformer, FLUX
- MMMG: A Massive, Multidisciplinary, Multi-Tier Generation Benchmark for Text-to-Image Reasoning (Published: 2025-06-12)
Authors: Yuxuan Luo, Yuhui Yuan, Junwen Chen, Haonan Cai, Ziyi Yue, Yuwei Yang, Fatima Zohra Daha, Ji Li, Zhouhui Lian
Links:
Keywords: text-to-image, image generation, FLUX
- ReSim: Reliable World Simulation for Autonomous Driving (Published: 2025-06-11)
Authors: Jiazhi Yang, Kashyap Chitta, Shenyuan Gao, Long Chen, Yuqian Shao, Xiaosong Jia, Hongyang Li, Andreas Geiger, Xiangyu Yue, Li Chen
Links:
Keywords: Controllable, Control, diffusion transformer
- HadaNorm: Diffusion Transformer Quantization through Mean-Centered Transformations (Published: 2025-06-11)
Authors: Marco Federici, Riccardo Del Chiaro, Boris van Breugel, Paul Whatmough, Markus Nagel
Links:
Keywords: image generation, diffusion transformer
- HunyuanVideo-HOMA: Generic Human-Object Interaction in Multimodal Driven Human Animation (Published: 2025-06-10)
Authors: Ziyao Huang, Zixiang Zhou, Juan Cao, Yifeng Ma, Yi Chen, Zejing Rao, Zhiyong Xu, Hongmei Wang, Qin Lin, Yuan Zhou, Qinglin Lu, Fan Tang
Links:
Keywords: video generation, Control, diffusion transformer
- MADFormer: Mixed Autoregressive and Diffusion Transformers for Continuous Image Generation (Published: 2025-06-09)
Authors: Junhao Chen, Yulia Tsvetkov, Xiaochuang Han
Links:
Keywords: image generation, Control, diffusion transformer
- FullDiT2: Efficient In-Context Conditioning for Video Diffusion Transformers (Published: 2025-06-04)
Authors: Xuanhua He, Quande Liu, Zixuan Ye, Weicai Ye, Qiulin Wang, Xintao Wang, Qifeng Chen, Pengfei Wan, Di Zhang, Kun Gai
Links:
Keywords: video generation, video editing, Control, diffusion transformer
- Image Editing As Programs with Diffusion Models (Published: 2025-06-04)
Authors: Yujia Hu, Songhua Liu, Zhenxiong Tan, Xingyi Yang, Xinchao Wang
Links:
Keywords: text-to-image, image generation, image editing, diffusion transformer
- EmoArt: A Multidimensional Dataset for Emotion-Aware Artistic Generation (Published: 2025-06-04)
Authors: Cheng Zhang, Hongxia Xie, Bin Wen, Songhan Zuo, Ruoxuan Zhang, Wen-Huang Cheng
Links:
Keywords: text-to-image, image generation, FLUX
- OmniV2V: Versatile Video Generation and Editing via Dynamic Content Manipulation (Published: 2025-06-02)
Authors: Sen Liang, Zhentao Yu, Zhengguang Zhou, Teng Hu, Hongmei Wang, Yi Chen, Qin Lin, Yuan Zhou, Xin Li, Qinglin Lu, Zhibo Chen
Links:
Keywords: Controllable, video generation, Control, diffusion transformer
- Evaluating Robot Policies in a World Model (Published: 2025-05-31)
Authors: Julian Quevedo, Percy Liang, Sherry Yang
Links:
Keywords: video generation, Control, diffusion transformer
- Seg2Any: Open-set Segmentation-Mask-to-Image Generation with Precise Shape and Semantic Control (Published: 2025-05-31)
Authors: Danfeng Li, Hui Zhang, Sheng Wang, Jiacheng Li, Zuxuan Wu
Links:
Keywords: text-to-image, image generation, Control, FLUX, diffusion transformer
- Foresight: Adaptive Layer Reuse for Accelerated and High-Quality Text-to-Video Generation (Published: 2025-05-31)
Authors: Muhammad Adnan, Nithesh Kurella, Akhil Arunkumar, Prashant J. Nair
Links:
Keywords: text-to-image, video generation, diffusion transformer
- Interpreting Large Text-to-Image Diffusion Models with Dictionary Learning (Published: 2025-05-30)
Authors: Stepan Shabalin, Ayush Panda, Dmitrii Kharlapenko, Abdur Raheem Ali, Yixiong Hao, Arthur Conmy
Links:
Keywords: text-to-image, image generation, Control, FLUX
- LoRAShop: Training-Free Multi-Concept Image Generation and Editing with Rectified Flow Transformers (Published: 2025-05-29)
Authors: Yusuf Dalva, Hidir Yesiltepe, Pinar Yanardag
Links:
Keywords: image generation, rectified flow, image editing, FLUX, diffusion transformer
- Muddit: Liberating Generation Beyond Text-to-Image with a Unified Discrete Diffusion Model (Published: 2025-05-29)
Authors: Qingyu Shi, Jinbin Bai, Zhuoran Zhao, Wenhao Chai, Kaidong Yu, Jianzong Wu, Shuangyong Song, Yunhai Tong, Xiangtai Li, Xuelong Li, Shuicheng Yan
Links:
Keywords: text-to-image, image generation, diffusion transformer
- HiDream-I1: A High-Efficient Image Generative Foundation Model with Sparse Diffusion Transformer (Published: 2025-05-28)
Authors: Qi Cai, Jingwen Chen, Yang Chen, Yehao Li, Fuchen Long, Yingwei Pan, Zhaofan Qiu, Yiheng Zhang, Fengbin Gao, Peihan Xu, Yimeng Wang, Kai Yu, Wenxuan Chen, Ziwei Feng, Zijian Gong, Jianzhuang Pan, Yi Peng, Rui Tian, Siyu Wang, Bo Zhao, Ting Yao, Tao Mei
Links:
Keywords: text-to-image, image generation, image editing, diffusion transformer
- Q-VDiT: Towards Accurate Quantization and Distillation of Video-Generation Diffusion Transformers (Published: 2025-05-28)
Authors: Weilun Feng, Chuanguang Yang, Haotong Qin, Xiangqi Li, Yu Wang, Zhulin An, Libo Huang, Boyu Diao, Zixiang Zhao, Yongjun Xu, Michele Magno
Links:
Keywords: video generation, image generation, diffusion transformer
- AlignGen: Boosting Personalized Image Generation with Cross-Modality Prior Alignment (Published: 2025-05-28)
Authors: Yiheng Lin, Shifang Zhao, Ting Liu, Xiaochao Qu, Luoqi Liu, Yao Zhao, Yunchao Wei
Links:
Keywords: text-to-image, image generation, diffusion transformer
- Frame In-N-Out: Unbounded Controllable Image-to-Video Generation (Published: 2025-05-27)
Authors: Boyang Wang, Xuweiyi Chen, Matheus Gadelha, Zezhou Cheng
Links:
Keywords: Controllable, video generation, Control, diffusion transformer
- Hierarchical Masked Autoregressive Models with Low-Resolution Token Pivots (Published: 2025-05-26)
Authors: Guangting Zheng, Yehao Li, Yingwei Pan, Jiajun Deng, Ting Yao, Yanyong Zhang, Tao Mei
Links:
Keywords: text-to-image, image generation, diffusion transformer
- Dynamic-I2V: Exploring Image-to-Video Generation Models via Multimodal LLM (Published: 2025-05-26)
Authors: Peng Liu, Xiaoming Ren, Fengkai Liu, Qingsong Xie, Quanlong Zheng, Yanhao Zhang, Haonan Lu, Yujiu Yang
Links:
Keywords: video generation, Control, diffusion transformer
- MMIG-Bench: Towards Comprehensive and Explainable Evaluation of Multi-Modal Image Generation Models (Published: 2025-05-26)
Authors: Hang Hua, Ziyun Zeng, Yizhi Song, Yunlong Tang, Liu He, Daniel Aliaga, Wei Xiong, Jiebo Luo
Links:
Keywords: text-to-image, image generation, FLUX
- Sparse VideoGen2: Accelerate Video Generation with Sparse Attention via Semantic-Aware Permutation (Published: 2025-05-24)
Authors: Shuo Yang, Haocheng Xi, Yilong Zhao, Muyang Li, Jintao Zhang, Han Cai, Yujun Lin, Xiuyu Li, Chenfeng Xu, Kelly Peng, Jianfei Chen, Song Han, Kurt Keutzer, Ion Stoica
Links:
Keywords: video generation, Control, diffusion transformer
- Mod-Adapter: Tuning-Free and Versatile Multi-concept Personalization via Modulation Adapter (Published: 2025-05-24)
Authors: Weizhi Zhong, Huan Yang, Zheng Liu, Huiguo He, Zijian He, Xuesong Niu, Di Zhang, Guanbin Li
Links:
Keywords: text-to-image, image generation, diffusion transformer
- RestoreVAR: Visual Autoregressive Generation for All-in-One Image Restoration (Published: 2025-05-23)
Authors: Sudarshan Rajagopalan, Kartik Narayan, Vishal M. Patel
Links:
Keywords: image generation, diffusion transformer
- Interspatial Attention for Efficient 4D Human Video Generation (Published: 2025-05-21)
Authors: Ruizhi Shao, Yinghao Xu, Yujun Shen, Ceyuan Yang, Yang Zheng, Changan Chen, Yebin Liu, Gordon Wetzstein
Links:
Keywords: Controllable, video generation, Control, diffusion transformer
- Scaling Diffusion Transformers Efficiently via $μ$P (Published: 2025-05-21)
Authors: Chenyu Zheng, Xinyu Zhang, Rongzhen Wang, Wei Huang, Zhi Tian, Weilin Huang, Jun Zhu, Chongxuan Li
Links:
Keywords: text-to-image, image generation, diffusion transformer
- LMP: Leveraging Motion Prior in Zero-Shot Video Generation with Diffusion Transformer (Published: 2025-05-20)
Authors: Changgu Chen, Xiaoyan Yang, Junwei Shu, Changbo Wang, Yang Li
Links:
Keywords: video generation, Control, diffusion transformer
- Swin DiT: Diffusion Transformer using Pseudo Shifted Windows (Published: 2025-05-19)
Authors: Jiafu Wu, Yabiao Wang, Jian Li, Jinlong Peng, Yun Cao, Chengjie Wang, Jiangning Zhang
Links:
Keywords: image generation, diffusion transformer
- SounDiT: Geo-Contextual Soundscape-to-Landscape Generation (Published: 2025-05-19)
Authors: Junbo Wang, Haofeng Tan, Bowen Liao, Albert Jiang, Teng Fei, Qixing Huang, Zhengzhong Tu, Shan Ye, Yuhao Kang
Links:
Keywords: image generation, diffusion transformer
- Exploring the Deep Fusion of Large Language Models and Diffusion Transformers for Text-to-Image Synthesis (Published: 2025-05-15)
Authors: Bingda Tang, Boyang Zheng, Xichen Pan, Sayak Paul, Saining Xie
Links:
Keywords: text-to-image, image generation, Control, diffusion transformer
- BLIP3-o: A Family of Fully Open Unified Multimodal Models-Architecture, Training and Dataset (Published: 2025-05-14)
Authors: Jiuhai Chen, Zhiyang Xu, Xichen Pan, Yushi Hu, Can Qin, Tom Goldstein, Lifu Huang, Tianyi Zhou, Saining Xie, Silvio Savarese, Le Xue, Caiming Xiong, Ran Xu
Links:
Keywords: image generation, diffusion transformer
- Mini Diffuser: Fast Multi-task Diffusion Policy Training Using Two-level Mini-batches (Published: 2025-05-14)
Authors: Yutong Hu, Pinhao Song, Kehan Wen, Renaud Detry
Links:
Keywords: image generation, diffusion transformer
- DanceGRPO: Unleashing GRPO on Visual Generation (Published: 2025-05-12)
Authors: Zeyue Xue, Jie Wu, Yu Gao, Fangyuan Kong, Lingting Zhu, Mengzhao Chen, Zhiheng Liu, Wei Liu, Qiushan Guo, Weilin Huang, Ping Luo
Links:
Keywords: text-to-image, video generation, rectified flow, FLUX
- Accelerating Diffusion Transformer via Increment-Calibrated Caching with Channel-Aware Singular Value Decomposition (Published: 2025-05-09)
Authors: Zhiyuan Chen, Keyi Li, Yifan Jia, Le Ye, Yufei Ma
Links:
Keywords: image generation, diffusion transformer
- Lay-Your-Scene: Natural Scene Layout Generation with Diffusion Transformers (Published: 2025-05-07)
Authors: Divyansh Srivastava, Xiang Zhang, He Wen, Chenru Wen, Zhuowen Tu
Links:
Keywords: image generation, Control, Controllable, image editing, diffusion transformer
- Deepfakes on Demand: the rise of accessible non-consensual deepfake image generators (Published: 2025-05-06)
Authors: Will Hawkins, Chris Russell, Brent Mittelstadt
Links:
Keywords: text-to-image, FLUX
- DualReal: Adaptive Joint Training for Lossless Identity-Motion Fusion in Video Customization (Published: 2025-05-04)
Authors: Wenchuan Wang, Mengqi Huang, Yijing Tu, Zhendong Mao
Links:
Keywords: video generation, Control, diffusion transformer
- JointDiT: Enhancing RGB-Depth Joint Modeling with Diffusion Transformers (Published: 2025-05-01)
Authors: Kwon Byung-Ki, Qi Dai, Lee Hyoseok, Chong Luo, Tae-Hyun Oh
Links:
Keywords: image generation, Control, diffusion transformer
- Can We Achieve Efficient Diffusion without Self-Attention? Distilling Self-Attention into Convolutions (Published: 2025-04-30)
Authors: ZiYi Dong, Chengxing Zhou, Weijian Deng, Pengxu Wei, Xiangyang Ji, Liang Lin
Links:
Keywords: image generation, diffusion transformer
- DiVE: Efficient Multi-View Driving Scenes Generation Based on Video Diffusion Transformer (Published: 2025-04-28)
Authors: Junpeng Jiang, Gangyi Hong, Miao Zhang, Hengtong Hu, Kun Zhan, Rui Shao, Liqiang Nie
Links:
Keywords: video generation, Control, diffusion transformer
- Physics-based super-resolved simulation of 3D elastic wave propagation adopting scalable Diffusion Transformer (Published: 2025-04-24)
Authors: Hugo Gabrielidis, Filippo Gatti, Stéphane Vialle
Links:
Keywords: image generation, diffusion transformer
- Boosting Generative Image Modeling via Joint Image-Feature Synthesis (Published: 2025-04-22)
Authors: Theodoros Kouzelis, Efstathios Karypidis, Ioannis Kakogeorgiou, Spyros Gidaris, Nikos Komodakis
Links:
Keywords: image generation, diffusion transformer
- Acquire and then Adapt: Squeezing out Text-to-Image Model for Image Restoration (Published: 2025-04-21)
Authors: Junyuan Deng, Xinyi Wu, Yongxing Yang, Congchao Zhu, Song Wang, Zhenyao Wu
Links:
Keywords: text-to-image, image generation, Control, FLUX, diffusion transformer
- Text-Audio-Visual-conditioned Diffusion Model for Video Saliency Prediction (Published: 2025-04-19)
Authors: Li Yu, Xuanzhe Sun, Wei Zhou, Moncef Gabbouj
Links:
Keywords: image generation, diffusion transformer
- Entropy Rectifying Guidance for Diffusion and Flow Models (Published: 2025-04-18)
Authors: Tariq Berrada Ifriqi, Adriana Romero-Soriano, Michal Drozdzal, Jakob Verbeek, Karteek Alahari
Links:
Keywords: text-to-image, image generation, diffusion transformer
- InstantCharacter: Personalize Any Characters with a Scalable Diffusion Transformer Framework (Published: 2025-04-16)
Authors: Jiale Tao, Yanbing Zhang, Qixun Wang, Yiji Cheng, Haofan Wang, Xu Bai, Zhengguang Zhou, Ruihuang Li, Linqing Wang, Chunyu Wang, Qin Lin, Qinglin Lu
Links:
Keywords: Controllable, image generation, Control, diffusion transformer
- Using LLMs as prompt modifier to avoid biases in AI image generators (Published: 2025-04-15)
Authors: René Peinl
Links:
Keywords: text-to-image, image generation, FLUX
- D$^2$iT: Dynamic Diffusion Transformer for Accurate Image Generation (Published: 2025-04-13)
Authors: Weinan Jia, Mengqi Huang, Nan Chen, Lei Zhang, Zhendong Mao
Links:
Keywords: image generation, diffusion transformer
Showing the latest 50 out of 138 papers
- iDiT-HOI: Inpainting-based Hand Object Interaction Reenactment via Video Diffusion Transformer (Published: 2025-06-15)
Authors: Zhelun Shen, Chenming Wu, Junsheng Zhou, Chen Zhao, Kaisiyuan Wang, Hang Zhou, Yingying Li, Haocheng Feng, Wei He, Jingdong Wang
Links:
Keywords: video generation, diffusion transformer
- DreamActor-H1: High-Fidelity Human-Product Demonstration Video Generation via Motion-designed Diffusion Transformers (Published: 2025-06-12)
Authors: Lizhen Wang, Zhurong Xia, Tianshu Hu, Pengrui Wang, Pengfei Wang, Zerong Zheng, Ming Zhou
Links:
Keywords: video generation, diffusion transformer
- HunyuanVideo-HOMA: Generic Human-Object Interaction in Multimodal Driven Human Animation (Published: 2025-06-10)
Authors: Ziyao Huang, Zixiang Zhou, Juan Cao, Yifeng Ma, Yi Chen, Zejing Rao, Zhiyong Xu, Hongmei Wang, Qin Lin, Yuan Zhou, Qinglin Lu, Fan Tang
Links:
Keywords: video generation, Control, diffusion transformer
- Astraea: A GPU-Oriented Token-wise Acceleration Framework for Video Diffusion Transformers (Published: 2025-06-05)
Authors: Haosong Liu, Yuge Cheng, Zihan Liu, Aiyue Chen, Jing Lin, Yiwu Yao, Chen Chen, Jingwen Leng, Yu Feng, Minyi Guo
Links:
Keywords: video generation, diffusion transformer
- LayerFlow: A Unified Model for Layer-aware Video Generation (Published: 2025-06-04)
Authors: Sihui Ji, Hao Luo, Xi Chen, Yuanpeng Tu, Yiyang Wang, Hengshuang Zhao
Links:
Keywords: video generation, diffusion transformer
- FullDiT2: Efficient In-Context Conditioning for Video Diffusion Transformers (Published: 2025-06-04)
Authors: Xuanhua He, Quande Liu, Zixuan Ye, Weicai Ye, Qiulin Wang, Xintao Wang, Qifeng Chen, Pengfei Wan, Di Zhang, Kun Gai
Links:
Keywords: video generation, video editing, Control, diffusion transformer
- Chipmunk: Training-Free Acceleration of Diffusion Transformers with Dynamic Column-Sparse Deltas (Published: 2025-06-03)
Authors: Austin Silveria, Soham V. Govande, Daniel Y. Fu
Links:
Keywords: video generation, diffusion transformer, FLUX
- Sparse-vDiT: Unleashing the Power of Sparse Attention to Accelerate Video Diffusion Transformers (Published: 2025-06-03)
Authors: Pengtao Chen, Xianfang Zeng, Maosen Zhao, Peng Ye, Mingzhu Shen, Wei Cheng, Gang Yu, Tao Chen
Links:
Keywords: video generation, diffusion transformer
- OmniV2V: Versatile Video Generation and Editing via Dynamic Content Manipulation (Published: 2025-06-02)
Authors: Sen Liang, Zhentao Yu, Zhengguang Zhou, Teng Hu, Hongmei Wang, Yi Chen, Qin Lin, Yuan Zhou, Xin Li, Qinglin Lu, Zhibo Chen
Links:
Keywords: Controllable, video generation, Control, diffusion transformer
- LongDWM: Cross-Granularity Distillation for Building a Long-Term Driving World Model (Published: 2025-06-02)
Authors: Xiaodong Wang, Zhirong Wu, Peixi Peng
Links:
Keywords: video generation, diffusion transformer
- Playing with Transformer at 30+ FPS via Next-Frame Diffusion (Published: 2025-06-02)
Authors: Xinle Cheng, Tianyu He, Jiayi Xu, Junliang Guo, Di He, Jiang Bian
Links:
Keywords: video generation, diffusion transformer
- Evaluating Robot Policies in a World Model (Published: 2025-05-31)
Authors: Julian Quevedo, Percy Liang, Sherry Yang
Links:
Keywords: video generation, Control, diffusion transformer
- Foresight: Adaptive Layer Reuse for Accelerated and High-Quality Text-to-Video Generation (Published: 2025-05-31)
Authors: Muhammad Adnan, Nithesh Kurella, Akhil Arunkumar, Prashant J. Nair
Links:
Keywords: text-to-image, video generation, diffusion transformer
- STORK: Improving the Fidelity of Mid-NFE Sampling for Diffusion and Flow Matching Models (Published: 2025-05-30)
Authors: Zheng Tan, Weizhen Wang, Andrea L. Bertozzi, Ernest K. Ryu
Links:
Keywords: video generation, FLUX
- Q-VDiT: Towards Accurate Quantization and Distillation of Video-Generation Diffusion Transformers (Published: 2025-05-28)
Authors: Weilun Feng, Chuanguang Yang, Haotong Qin, Xiangqi Li, Yu Wang, Zhulin An, Libo Huang, Boyu Diao, Zixiang Zhao, Yongjun Xu, Michele Magno
Links:
Keywords: video generation, image generation, diffusion transformer
- Frame In-N-Out: Unbounded Controllable Image-to-Video Generation (Published: 2025-05-27)
Authors: Boyang Wang, Xuweiyi Chen, Matheus Gadelha, Zezhou Cheng
Links:
Keywords: Controllable, video generation, Control, diffusion transformer
- Minute-Long Videos with Dual Parallelisms (Published: 2025-05-27)
Authors: Zeqing Wang, Bowen Zheng, Xingyi Yang, Zhenxiong Tan, Yuecong Xu, Xinchao Wang
Links:
Keywords: video generation, diffusion transformer
- RainFusion: Adaptive Video Generation Acceleration via Multi-Dimensional Visual Redundancy (Published: 2025-05-27)
Authors: Aiyue Chen, Bin Dong, Jingru Li, Jing Lin, Kun Tian, Yiwu Yao, Gongyi Wang
Links:
Keywords: video generation, diffusion transformer
- Dynamic-I2V: Exploring Image-to-Video Generation Models via Multimodal LLM (Published: 2025-05-26)
Authors: Peng Liu, Xiaoming Ren, Fengkai Liu, Qingsong Xie, Quanlong Zheng, Yanhao Zhang, Haonan Lu, Yujiu Yang
Links:
Keywords: video generation, Control, diffusion transformer
- The Role of Video Generation in Enhancing Data-Limited Action Understanding (Published: 2025-05-26)
Authors: Wei Li, Dezhao Luo, Dongbao Yang, Zhenhang Li, Weiping Wang, Yu Zhou
Links:
Keywords: video generation, diffusion transformer
- SRDiffusion: Accelerate Video Diffusion Inference via Sketching-Rendering Cooperation (Published: 2025-05-25)
Authors: Shenggan Cheng, Yuanxin Wei, Lansong Diao, Yong Liu, Bujiao Chen, Lianghua Huang, Yu Liu, Wenyuan Yu, Jiangsu Du, Wei Lin, Yang You
Links:
Keywords: video generation, video editing, diffusion transformer
- Sparse VideoGen2: Accelerate Video Generation with Sparse Attention via Semantic-Aware Permutation (Published: 2025-05-24)
Authors: Shuo Yang, Haocheng Xi, Yilong Zhao, Muyang Li, Jintao Zhang, Han Cai, Yujun Lin, Xiuyu Li, Chenfeng Xu, Kelly Peng, Jianfei Chen, Song Han, Kurt Keutzer, Ion Stoica
Links:
Keywords: video generation, Control, diffusion transformer
- VORTA: Efficient Video Diffusion via Routing Sparse Attention (Published: 2025-05-24)
Authors: Wenhao Sun, Rong-Cheng Tu, Yifu Ding, Zhao Jin, Jingyi Liao, Shunyu Liu, Dacheng Tao
Links:
Keywords: video generation, diffusion transformer
- DVD-Quant: Data-free Video Diffusion Transformers Quantization (Published: 2025-05-24)
Authors: Zhiteng Li, Hanxuan Li, Junyi Wu, Kai Liu, Linghe Kong, Guihai Chen, Yulun Zhang, Xiaokang Yang
Links:
Keywords: video generation, diffusion transformer
- Training-Free Efficient Video Generation via Dynamic Token Carving (Published: 2025-05-22)
Authors: Yuechen Zhang, Jinbo Xing, Bin Xia, Shaoteng Liu, Bohao Peng, Xin Tao, Pengfei Wan, Eric Lo, Jiaya Jia
Links:
Keywords: video generation, diffusion transformer
- Interspatial Attention for Efficient 4D Human Video Generation (Published: 2025-05-21)
Authors: Ruizhi Shao, Yinghao Xu, Yujun Shen, Ceyuan Yang, Yang Zheng, Changan Chen, Yebin Liu, Gordon Wetzstein
Links:
Keywords: Controllable, video generation, Control, diffusion transformer
- Grouping First, Attending Smartly: Training-Free Acceleration for Diffusion Transformers (Published: 2025-05-20)
Authors: Sucheng Ren, Qihang Yu, Ju He, Alan Yuille, Liang-Chieh Chen
Links:
Keywords: video generation, diffusion transformer, FLUX
- LMP: Leveraging Motion Prior in Zero-Shot Video Generation with Diffusion Transformer (Published: 2025-05-20)
Authors: Changgu Chen, Xiaoyan Yang, Junwei Shu, Changbo Wang, Yang Li
Links:
Keywords: video generation, Control, diffusion transformer
- DraftAttention: Fast Video Diffusion via Low-Resolution Attention Guidance (Published: 2025-05-17)
Authors: Xuan Shen, Chenxia Han, Yufa Zhou, Yanyue Xie, Yifan Gong, Quanyi Wang, Yiwei Wang, Yanzhi Wang, Pu Zhao, Jiuxiang Gu
Links:
Keywords: video generation, diffusion transformer
- DanceGRPO: Unleashing GRPO on Visual Generation (Published: 2025-05-12)
Authors: Zeyue Xue, Jie Wu, Yu Gao, Fangyuan Kong, Lingting Zhu, Mengzhao Chen, Zhiheng Liu, Wei Liu, Qiushan Guo, Weilin Huang, Ping Luo
Links:
Keywords: text-to-image, video generation, rectified flow, FLUX
- Generative Pre-trained Autoregressive Diffusion Transformer (Published: 2025-05-12)
Authors: Yuan Zhang, Jiacheng Jiang, Guoqing Ma, Zhiying Lu, Haoyang Huang, Jianlong Yuan, Nan Duan
Links:
Keywords: video generation, diffusion transformer
- DualReal: Adaptive Joint Training for Lossless Identity-Motion Fusion in Video Customization (Published: 2025-05-04)
Authors: Wenchuan Wang, Mengqi Huang, Yijing Tu, Zhendong Mao
Links:
Keywords: video generation, Control, diffusion transformer
- DiVE: Efficient Multi-View Driving Scenes Generation Based on Video Diffusion Transformer (Published: 2025-04-28)
Authors: Junpeng Jiang, Gangyi Hong, Miao Zhang, Hengtong Hu, Kun Zhan, Rui Shao, Liqiang Nie
Links:
Keywords: video generation, Control, diffusion transformer
- DiTPainter: Efficient Video Inpainting with Diffusion Transformers (Published: 2025-04-22)
Authors: Xian Wu, Chang Liu
Links:
Keywords: video inpainting, video generation, diffusion transformer
- Turbo2K: Towards Ultra-Efficient and High-Quality 2K Video Synthesis (Published: 2025-04-20)
Authors: Jingjing Ren, Wenbo Li, Zhongdao Wang, Haoze Sun, Bangzhen Liu, Haoyu Chen, Jiaqi Xu, Aoxue Li, Shifeng Zhang, Bin Shao, Yong Guo, Lei Zhu
Links:
Keywords: video generation, diffusion transformer
- VGDFR: Diffusion-based Video Generation with Dynamic Latent Frame Rate (Published: 2025-04-16)
Authors: Zhihang Yuan, Rui Xie, Yuzhang Shang, Hanling Zhang, Siyuan Wang, Shengen Yan, Guohao Dai, Yu Wang
Links:
Keywords: video generation, diffusion transformer
- Analysis of Attention in Video Diffusion Transformers (Published: 2025-04-14)
Authors: Yuxin Wen, Jim Wu, Ajay Jain, Tom Goldstein, Ashwinee Panda
Links:
Keywords: video editing, diffusion transformer
- Diffusion Transformers for Tabular Data Time Series Generation (Published: 2025-04-10)
Authors: Fabrizio Garuti, Enver Sangineto, Simone Luetto, Lorenzo Forni, Rita Cucchiara
Links:
Keywords: video generation, diffusion transformer
- DyDiT++: Dynamic Diffusion Transformers for Efficient Visual Generation (Published: 2025-04-09)
Authors: Wangbo Zhao, Yizeng Han, Jiasheng Tang, Kai Wang, Hao Luo, Yibing Song, Gao Huang, Fan Wang, Yang You
Links:
Keywords: text-to-image, image generation, video generation, FLUX, diffusion transformer
- DiTaiListener: Controllable High Fidelity Listener Video Generation with Diffusion (Published: 2025-04-05)
Authors: Maksim Siniukov, Di Chang, Minh Tran, Hongkun Gong, Ashutosh Chaubey, Mohammad Soleymani
Links:
Keywords: Controllable, video generation, Control, diffusion transformer
- SkyReels-A2: Compose Anything in Video Diffusion Transformers (Published: 2025-04-03)
Authors: Zhengcong Fei, Debang Li, Di Qiu, Jiahua Wang, Yikun Dou, Rui Wang, Jingtao Xu, Mingyuan Fan, Guibin Chen, Yang Li, Yahui Zhou
Links:
Keywords: Controllable, video generation, Control, diffusion transformer
- OmniTalker: One-shot Real-time Text-Driven Talking Audio-Video Generation With Multimodal Style Mimicking (Published: 2025-04-03)
Authors: Zhongjian Wang, Peng Zhang, Jinwei Qi, Guangyuan Wang, Chaonan Ji, Sheng Xu, Bang Zhang, Liefeng Bo
Links:
Keywords: video generation, diffusion transformer
- JavisDiT: Joint Audio-Video Diffusion Transformer with Hierarchical Spatio-Temporal Prior Synchronization (Published: 2025-03-30)
Authors: Kai Liu, Wei Li, Lai Chen, Shengqiong Wu, Yanhao Zheng, Jiayi Ji, Fan Zhou, Rongxin Jiang, Jiebo Luo, Hao Fei, Tat-Seng Chua
Links:
Keywords: video generation, diffusion transformer
- DynamiCtrl: Rethinking the Basic Structure and the Role of Text for High-quality Human Image Animation (Published: 2025-03-27)
Authors: Haoyu Zhao, Zhongang Qi, Cong Wang, Qingping Zheng, Guansong Lu, Fei Chen, Hang Xu, Zuxuan Wu
Links:
Keywords: video generation, Control, diffusion transformer
- Wan: Open and Advanced Large-Scale Video Generative Models (Published: 2025-03-26)
Authors: Team Wan, Ang Wang, Baole Ai, Bin Wen, Chaojie Mao, Chen-Wei Xie, Di Chen, Feiwu Yu, Haiming Zhao, Jianxiao Yang, Jianyuan Zeng, Jiayu Wang, Jingfeng Zhang, Jingren Zhou, Jinkai Wang, Jixuan Chen, Kai Zhu, Kang Zhao, Keyu Yan, Lianghua Huang, Mengyang Feng, Ningyi Zhang, Pandeng Li, Pingyu Wu, Ruihang Chu, Ruili Feng, Shiwei Zhang, Siyang Sun, Tao Fang, Tianxing Wang, Tianyi Gui, Tingyu Weng, Tong Shen, Wei Lin, Wei Wang, Wei Wang, Wenmeng Zhou, Wente Wang, Wenting Shen, Wenyuan Yu, Xianzhong Shi, Xiaoming Huang, Xin Xu, Yan Kou, Yangyu Lv, Yifei Li, Yijing Liu, Yiming Wang, Yingya Zhang, Yitong Huang, Yong Li, You Wu, Yu Liu, Yulin Pan, Yun Zheng, Yuntao Hong, Yupeng Shi, Yutong Feng, Zeyinzi Jiang, Zhen Han, Zhi-Fan Wu, Ziyu Liu
Links:
Keywords: video generation, video editing, diffusion transformer
- Mask$^2$DiT: Dual Mask-based Diffusion Transformer for Multi-Scene Long Video Generation (Published: 2025-03-25)
Authors: Tianhao Qi, Jianlong Yuan, Wanquan Feng, Shancheng Fang, Jiawei Liu, SiYu Zhou, Qian He, Hongtao Xie, Yongdong Zhang
Links:
Keywords: video generation, diffusion transformer
- AudCast: Audio-Driven Human Video Generation by Cascaded Diffusion Transformers (Published: 2025-03-25)
Authors: Jiazhi Guan, Kaisiyuan Wang, Zhiliang Xu, Quanwei Yang, Yasheng Sun, Shengyi He, Borong Liang, Yukang Cao, Yingying Li, Haocheng Feng, Errui Ding, Jingdong Wang, Youjian Zhao, Hang Zhou, Ziwei Liu
Links:
Keywords: video generation, diffusion transformer
- FuXi-RTM: A Physics-Guided Prediction Framework with Radiative Transfer Modeling (Published: 2025-03-25)
Authors: Qiusheng Huang, Xiaohui Zhong, Xu Fan, Lei Chen, Hao Li
Links:
Keywords: video generation, FLUX
- Long-Context Autoregressive Video Modeling with Next-Frame Prediction (Published: 2025-03-25)
Authors: Yuchao Gu, Weijia Mao, Mike Zheng Shou
Links:
Keywords: video generation, diffusion transformer
- Generating, Fast and Slow: Scalable Parallel Video Generation with Video Interface Networks (Published: 2025-03-21)
Authors: Bhishma Dedhia, David Bourgin, Krishna Kumar Singh, Yuheng Li, Yan Kang, Zhan Xu, Niraj K. Jha, Yuchen Liu
Links:
Keywords: video generation, diffusion transformer
- Scalable Diffusion Models with Transformers (ICCV 2023)
Authors: William Peebles, Saining Xie
Code: 🔗 GitHub
Keywords: diffusion model, transformer architecture
Feel free to submit Pull Requests to improve this list! Please follow these formats:
- Paper entry format:
**[Paper Title](link)** - Brief description
- Project entry format:
[Project Name](link) - Project description