Rewriting Geometric Rules of a GAN

Sheng-Yu Wang¹

David Bau²

Jun-Yan Zhu¹

¹CMU

²Northeastern University

Code [GitHub]

SIGGRAPH 2022 [Paper]

Slides [pptx]

With our method, a user can edit a GAN model to synthesize many unseen objects with the desired shape. The user is asked to warp just a handful of generated images by defining several control points to obtain the customized models. While the edited models change an object’s shape, other visual cues, such as pose, color, texture, and background, are faithfully preserved after the modification.

Abstract

Deep generative models make visual content creation more accessible to novice users by automating the synthesis of diverse, realistic content based on a collected dataset. However, the current machine learning approaches miss a key element of the creative process: the ability to synthesize things that go far beyond the data distribution and everyday experience. To begin to address this issue, we enable a user to "warp" a given model by editing just a handful of original model outputs with desired geometric changes. Our method applies a low-rank update to a single model layer to reconstruct edited examples. Furthermore, to combat overfitting, we propose a latent space augmentation method based on style-mixing. Our method allows a user to create a model that synthesizes endless objects with defined geometric changes, enabling the creation of a new generative model without the burden of curating a large-scale dataset. We also demonstrate that edited models can be composed to achieve aggregated effects, and we present an interactive interface to enable users to create new models through composition. Empirical measurements on multiple test cases suggest an advantage of our method over recent GAN fine-tuning methods. Finally, we showcase several applications using the edited models, including latent space interpolation and image editing.
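
As a rough illustration of the low-rank update mentioned in the abstract, the sketch below adds a rank-r correction to one weight matrix. It is not the released implementation: the function name `low_rank_update`, the factor names `u` and `v`, and the matrix shapes are assumptions made for this example, and in practice the factors would be optimized to reconstruct the edited examples rather than sampled at random.

```python
import numpy as np

def low_rank_update(weight, u, v):
    """Return weight + u @ v.T, a correction of rank at most u.shape[1].

    weight: (out_dim, in_dim) matrix of the single layer being edited.
    u:      (out_dim, r) factor (hypothetical name).
    v:      (in_dim, r) factor (hypothetical name).
    """
    if u.shape[1] != v.shape[1]:
        raise ValueError("u and v must share the same rank dimension")
    return weight + u @ v.T

# Toy usage with random factors; real factors would be learned.
rng = np.random.default_rng(0)
weight = rng.standard_normal((8, 6))
u = rng.standard_normal((8, 2))
v = rng.standard_normal((6, 2))
edited_weight = low_rank_update(weight, u, v)
```

With rank r, the update has r * (out_dim + in_dim) free parameters instead of out_dim * in_dim, which is one reason a low-rank change can be fitted from only a handful of edited examples.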

Comparison with text-to-image model

Our method enables edits that are difficult to describe in text; warping edits, in particular, are unnatural to specify precisely in words. To showcase this, we compare our edited models with DALLE-2, using the text prompts that we found to best match the warping edits. Even so, we observe that DALLE-2 introduces unintended color and texture changes, while our method yields consistent shape changes across all model samples.

Paper

Sheng-Yu Wang, David Bau, Jun-Yan Zhu.
Rewriting Geometric Rules of a GAN.
In SIGGRAPH, 2022. (Paper)

Method

A user first edits a handful of samples from the pre-trained generative model. We then train a customized model so that it synthesizes new samples with the visual effect specified by the user's edits. To prevent overfitting, we apply style-mixing augmentation to the edited samples. For each sample, we mix the original latent code with a new, randomly sampled texture latent code. Since the augmented samples preserve the original shapes and poses, we can apply the same user edit to them and obtain a training set with diverse texture variations. We train the customized model on the augmented training set using the LPIPS loss.
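
The style-mixing augmentation described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: it assumes a StyleGAN-like generator with one latent vector per layer, assumes that later layers mainly control texture, and uses a hypothetical crossover index `texture_start`. The texture codes are drawn from a standard normal as a stand-in; a real pipeline would obtain them from the generator's own latent sampling.

```python
import numpy as np

def style_mix(w_orig, w_texture, texture_start):
    """Keep layers [0, texture_start) of w_orig; take the rest from w_texture.

    Both inputs have shape (num_layers, latent_dim).
    """
    if w_orig.shape != w_texture.shape:
        raise ValueError("latent codes must have the same shape")
    mixed = w_orig.copy()
    mixed[texture_start:] = w_texture[texture_start:]
    return mixed

def augment(w_orig, num_aug, texture_start, rng):
    """Return num_aug mixed codes that all share the coarse layers of w_orig."""
    return np.stack([
        # Stand-in texture code (assumption): standard normal noise.
        style_mix(w_orig, rng.standard_normal(w_orig.shape), texture_start)
        for _ in range(num_aug)
    ])

rng = np.random.default_rng(0)
w_orig = rng.standard_normal((14, 512))   # hypothetical layer count and width
augmented = augment(w_orig, num_aug=4, texture_start=8, rng=rng)
```

Because every augmented code keeps the coarse layers of the original, the generated images are expected to keep the same shape and pose, so the single user-defined warp can be reused on each of them.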

Results

Warp edits. Below we show warped models with different object categories.

Color edits. Our method can also be applied to color edits. The colored strokes specify the locations to perform coloring changes, while the darker region defines the region to be preserved. The edited models produce precise coloring changes in the specified parts.

Compose edited models. We can compose the edited models into a new model with aggregated geometric changes by simply blending the model weights linearly. We present an interface for users to easily create a new model by composing the edited models made beforehand. Please visit this notebook for more details.
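
One plausible reading of linear weight blending is sketched below: each edited model contributes its weight difference from the shared pre-trained model, scaled by a user-chosen strength. The function name `compose`, the dictionary layout, and the strength values are assumptions for illustration; the linked notebook remains the reference for the actual procedure.

```python
import numpy as np

def compose(base, edited_models, strengths):
    """Blend edited models linearly around a shared base model.

    base:          dict mapping parameter name -> ndarray (pre-trained weights).
    edited_models: list of dicts with the same keys and shapes as base.
    strengths:     one scalar per edited model (hypothetical blending knob).
    """
    if len(edited_models) != len(strengths):
        raise ValueError("need exactly one strength per edited model")
    composed = {}
    for name, w0 in base.items():
        # Sum of scaled weight differences; stays 0 if no models are given.
        delta = sum(s * (m[name] - w0) for m, s in zip(edited_models, strengths))
        composed[name] = w0 + delta
    return composed

base = {"layer.weight": np.zeros((2, 2))}
model_a = {"layer.weight": np.array([[1.0, 0.0], [0.0, 0.0]])}
model_b = {"layer.weight": np.array([[0.0, 0.0], [0.0, 2.0]])}
blended = compose(base, [model_a, model_b], strengths=[1.0, 0.5])
```

Under this formulation a strength of 1 for a single model reproduces that edited model exactly, and a strength of 0 leaves the pre-trained weights unchanged.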

Latent space edits. Our edited models can generate smooth transitions between two random samples by interpolating the latent space. We can also apply GANSpace edits to our models to change the object attributes such as poses or colors.
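
The two latent-space operations above can be illustrated with a short sketch. It assumes plain linear interpolation between latent codes and approximates a GANSpace-style edit as a shift along a principal direction of sampled latents; the function names and array shapes are hypothetical, and images would come from feeding the resulting codes to the edited generator.

```python
import numpy as np

def interpolate_latents(w_a, w_b, num_steps):
    """Return num_steps codes moving linearly from w_a to w_b (inclusive)."""
    ts = np.linspace(0.0, 1.0, num_steps)
    return np.stack([(1.0 - t) * w_a + t * w_b for t in ts])

def principal_directions(w_samples, k):
    """Top-k unit-norm PCA directions of latent samples, shape (k, latent_dim)."""
    centered = w_samples - w_samples.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return vt[:k]

def shift_along(w, direction, strength):
    """Move a latent code along one direction by the given strength."""
    return w + strength * direction

rng = np.random.default_rng(0)
w_a, w_b = np.zeros(4), np.ones(4)
path = interpolate_latents(w_a, w_b, num_steps=5)
directions = principal_directions(rng.standard_normal((100, 4)), k=2)
w_edited = shift_along(w_a, directions[0], strength=3.0)
```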

Related works

Model editing:

S.-Y. Wang, D. Bau, J.-Y. Zhu. "Sketch Your Own GAN". In ICCV 2021.

R. Gal, O. Patashnik, H. Maron, A. Bermano, G. Chechik, D. Cohen-Or. "StyleGAN-NADA: CLIP-Guided Domain Adaptation of Image Generators". In SIGGRAPH 2022.

Low-rank model updates:

D. Bau, S. Liu, T. Wang, J.-Y. Zhu, A. Torralba. "Rewriting a Deep Generative Model". In ECCV 2020.

K. Meng*, D. Bau*, A. Andonian, Y. Belinkov. "Locating and Editing Factual Associations in GPT". In NeurIPS 2022.

Few-shot finetuning:

T. Karras, M. Aittala, J. Hellsten, S. Laine, J. Lehtinen, T. Aila. "Training Generative Adversarial Networks with Limited Data". In NeurIPS 2020.

S. Zhao, Z. Liu, J. Lin, J.-Y. Zhu, S. Han. "Differentiable Augmentation for Data-Efficient GAN Training". In NeurIPS 2020.

U. Ojha, Y. Li, J. Lu, A. A. Efros, Y. J. Lee, E. Shechtman, R. Zhang. "Few-shot Image Generation via Cross-domain Correspondence". In CVPR 2021.

A. Sauer, K. Chitta, J. Müller, A. Geiger. "Projected GANs Converge Faster". In NeurIPS 2021.

N. Kumari, R. Zhang, E. Shechtman, J.-Y. Zhu. "Ensembling Off-the-shelf Models for GAN Training". In CVPR 2022.

Acknowledgements

We thank Richard Zhang, Nupur Kumari, Gaurav Parmar, and George Cazenavette for the helpful discussions. We thank Nupur Kumari and George Cazenavette again for their significant help with proofreading and writing suggestions. We are grateful to Ruihan Gao for the edit examples. We truly appreciate that Flower, Sheng-Yu's sister's cat, agreed to have her portrait edited in the figure. S.-Y. Wang is partly supported by an Uber Presidential Fellowship. The work is partly supported by Adobe Inc. and Naver Corporation. Website template is from Colorful Colorization.