Robotics paper index
EquiSteer: Cross-Attention Steering Towards a Fairer Text-Guided Image Generation
One-line summary
A robotics research paper on EquiSteer: Cross-Attention Steering Towards a Fairer Text-Guided Image Generation.
Engineering notes
Engineering notes will be added by the Robot Papers editorial team.
Chinese explanation / 中文解读
中文解读待补充:本站会优先为 VLA、具身智能、人形机器人控制、机器人操作等高价值论文补充中文说明。
Original abstract
Text-to-image diffusion models power everyday creative tasks, but they still reproduce the demographic biases in their training data. On common prompts such as ``a photo of a nurse,'' ``a photo of a CEO'', they skew their outputs toward one gender, driven by the statistics of training data rather than anything in the text. Existing debiasing methods show promise in narrow settings but require retraining, batch-level control, or prompt-specific tuning, limiting their scalability. We propose \emph{EquiSteer}, a training-free method that works per sample by steering cross-attention (CA) activations at inference time. For each target attribute, EquiSteer precomputes steering vectors from contrastive prompts. Then at generation time, a prompt-aware gate leaves attribute-specific prompts untouched, while for neutral ones it clears existing attribute signals from the CA activations and injects a target attribute. Across SD-1.5, SD-2.1, SDXL, and SANA, EquiSteer reduces the average parity gap by up to $87\%$, with minimal effect on image quality and text-image alignment. Code is available at \href{https://github.com/Atmyre/EquiSteer}{https://github.com/Atmyre/EquiSteer}.%
Links and sources
Need this topic turned into a technical roadmap?
Robot Papers can prepare a custom robotics literature review, code map, dataset map, and B2B technology assessment.
Request B2B research
Comments