Zhixuan Liu

I am a first-year PhD student in the Robotics Institute at Carnegie Mellon University, advised by Prof. Jean Oh and Dr. Ji Zhang in the roBot Intelligence Group (BIG). I received a Master's degree in Robotics from CMU in 2024. Before joining CMU, I received my Bachelor's degree in Computer Science and Engineering from The Chinese University of Hong Kong, Shenzhen in 2022.

Email / LinkedIn / Google Scholar / GitHub

Research Interests

My research interests lie in the area of robotics, generative models and computer vision.

Publications

	MOSAIC: Generating Consistent, Privacy-Preserving Scenes from Multiple Depth Views in Multi-Room Environments Zhixuan Liu, Haokun Zhu, Rui Chen, Jonathan Francis, Soonmin Hwang, Ji Zhang, Jean Oh ICCV, 2025 Paper / Project Page / Code MOSAIC generates multi-view consistent images based on depth priors along robot navigation trajectories. It handles arbitrary viewpoint changes in multi-room environments and generalizes to open-vocabulary contexts.
	SCoFT: Self-Contrastive Fine-Tuning for Equitable Image Generation Zhixuan Liu, Peter Schaldenbrand, Beverley-Claire Okogwu, Wenxuan Peng, Youngsik Yun, Andrew Hundt, Jihie Kim, Jean Oh CVPR, 2024 Paper / Project Page / Code SCoFT leverages the model's intrinsic biases to refine itself, shifting away from misrepresentations of a culture and toward equitable image generation.
	Towards Equitable Representation in Text-to-Image Synthesis Models with the Cross-Cultural Understanding Benchmark (CCUB) Dataset Zhixuan Liu, Youeun Shin, Beverley-Claire Okogwu, Youngsik Yun, Lia Coleman, Peter Schaldenbrand, Jihie Kim, Jean Oh AAAI workshop on Creative AI Across Modalities, 2023 Paper / Code We fine-tune a text-to-image generative model (Stable Diffusion) and an LLM (GPT-3) on our CCUB dataset to achieve culturally aware text-to-image synthesis.
	StyleCLIPDraw: Coupling Content and Style in Text-to-Drawing Translation Peter Schaldenbrand, Zhixuan Liu, Jean Oh IJCAI, 2022 NeurIPS Workshop on Machine Learning for Creativity and Design, 2021 (Oral) Paper / Oral Presentation / Code / Demo / What's AI on YouTube StyleCLIPDraw is a text-to-drawing synthesis model with artistic control via a given style image and content control via a language description.
	Towards Real-Time Text2Video via CLIP-Guided, Pixel-Level Optimization Peter Schaldenbrand, Zhixuan Liu, Jean Oh NeurIPS Workshop on Machine Learning for Creativity and Design, 2022 Paper / Project Page / Demo / Code An approach to generating videos in real time from a series of given language descriptions.
	SongBot: An Interactive Music Generation Robotic System for Non-musicians Learning from a Song Kaiwen Xue, Zhixuan Liu, Jiaying Li, Xiaoqiang Ji, Huihuan Qian IEEE International Conference on Real-time Computing and Robotics (RCAR), 2021 Paper / Video SongBot is an interactive music generation system that helps non-musician learners draw inspiration from a song.