Fast Data Attribution for Text-to-Image Models

Sheng-Yu Wang, Aaron Hertzmann, Alexei A. Efros, Richard Zhang, Jun-Yan Zhu

NeurIPS 2025
[Code] | [Paper (NeurIPS 2025)]
|
Attribution performance vs. throughput (MSCOCO Models). Previous methods (AbU, D-TRAK) offer high attribution performance but are computationally expensive for deployment. Fast image similarity using off-the-shelf features (DINO) lacks attribution accuracy. We distill slower attribution methods into a feature space that retains attribution performance while enabling fast deployment.
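To make the distillation concrete, here is a minimal PyTorch sketch of one way to tune such a feature space: a lightweight projection head on frozen backbone features, trained so that the student's similarity scores reproduce the slow teacher's (e.g., AbU) attribution ranking. The head size, temperature, and listwise KL loss are illustrative assumptions, not the paper's exact recipe.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class AttributionHead(nn.Module):
    """Projection tuned so that similarity between a generated image and a
    training image approximates the teacher's attribution score."""
    def __init__(self, in_dim=768, out_dim=256):  # dims are assumptions
        super().__init__()
        self.proj = nn.Linear(in_dim, out_dim)

    def forward(self, x):
        return F.normalize(self.proj(x), dim=-1)

def distill_step(head, opt, query_feats, train_feats, teacher_scores, tau=0.07):
    # query_feats:    (B, D) frozen features of generated images
    # train_feats:    (N, D) frozen features of training images
    # teacher_scores: (B, N) attribution scores from the slow method
    logits = head(query_feats) @ head(train_feats).T  # student similarities
    # Listwise distillation: match the teacher's ranking distribution.
    loss = F.kl_div(F.log_softmax(logits / tau, dim=-1),
                    F.softmax(teacher_scores / tau, dim=-1),
                    reduction='batchmean')
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()
```

At deployment, only the frozen backbone and the small head run per query, so attribution reduces to a single nearest-neighbor lookup.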
Which feature space is good for attribution? (MSCOCO Models) We compare different feature spaces before and after tuning for attribution, measuring mAP against the ground-truth ranking generated by AbU+. While text-only embeddings perform well before tuning, image-only embeddings become stronger after tuning. Combining both performs best and is our final method.
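For reference, the mAP numbers can be computed with a short script like the one below, which scores a feature-space ranking against ground-truth positives; we assume here that the AbU+ ground truth is materialized as a set of top-ranked influential training images per query, which is one plausible protocol rather than the paper's exact one.

```python
import numpy as np

def average_precision(ranked_ids, positive_ids):
    """AP of one ranked list against its ground-truth positives
    (e.g., the top images under the AbU+ ranking)."""
    positives = set(positive_ids)
    hits, precision_sum = 0, 0.0
    for rank, idx in enumerate(ranked_ids, start=1):
        if idx in positives:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / max(len(positives), 1)

def mean_average_precision(rankings, positives_per_query):
    return float(np.mean([average_precision(r, p)
                          for r, p in zip(rankings, positives_per_query)]))

# Example: ground-truth influential images {3, 7} for one query.
print(average_precision([7, 1, 3, 2], [3, 7]))  # (1/1 + 2/3) / 2 ≈ 0.833
```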
Qualitative results (MSCOCO Models). For each generated image and its text prompt on the left, we show the top-5 training images retrieved by: DINO + CLIP-Text (top row), Ours (middle row), and the ground-truth influential examples via AbU+ (bottom row). Compared to the untuned baseline, our distilled feature space yields attributions that match the ground-truth examples more closely.
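Retrieval itself is plain nearest-neighbor search in the tuned feature space. A minimal sketch, assuming precomputed DINO image embeddings and CLIP text embeddings fused by simple concatenation (the fusion scheme here is our assumption):

```python
import torch
import torch.nn.functional as F

def combine(img_feat, txt_feat):
    # Concatenate L2-normalized image and text embeddings so that both
    # modalities contribute to the similarity score.
    return torch.cat([F.normalize(img_feat, dim=-1),
                      F.normalize(txt_feat, dim=-1)], dim=-1)

def top_k(query_img, query_txt, train_img, train_txt, k=5):
    q = F.normalize(combine(query_img, query_txt), dim=-1)  # (d,)
    t = F.normalize(combine(train_img, train_txt), dim=-1)  # (N, d)
    return torch.topk(t @ q, k).indices                     # top-k train ids
```

In practice the train-side vectors would be precomputed once and indexed (e.g., with an approximate nearest-neighbor library) so each query costs milliseconds.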
Results and discussion on Stable Diffusion models. Below we show qualitative results for Stable Diffusion models. For each generated image (left), we compare the DINO+CLIP-Text baseline (top row), our calibrated feature ranker (middle row), and AbU+ ground-truth attributions (bottom row). Both AbU+ and our method tend to retrieve images that reflect textual cues rather than visual similarity.
Sheng-Yu Wang, Aaron Hertzmann, Alexei A. Efros, Richard Zhang, Jun-Yan Zhu. Data Attribution for Text-to-Image Models by Unlearning Synthesized Images. In NeurIPS, 2024. (Paper)
|
|
Acknowledgements

Citation
@inproceedings{wang2025fastgda,
  title={Fast Data Attribution for Text-to-Image Models},
  author={Wang, Sheng-Yu and Hertzmann, Aaron and Efros, Alexei A and Zhang, Richard and Zhu, Jun-Yan},
  booktitle={NeurIPS},
  year={2025},
}