With our method, a user can edit a GAN model to synthesize many unseen objects with the desired shape. The user is asked to warp just a handful of generated images by defining several control points to obtain the customized models. While the edited models change an object’s shape, other visual cues, such as pose, color, texture, and background, are faithfully preserved after the modification.
Deep generative models make visual content creation more accessible to novice users by automating the synthesis of diverse, realistic content based on a collected dataset. However, current machine learning approaches miss a key element of the creative process: the ability to synthesize things that go far beyond the data distribution and everyday experience. To begin to address this issue, we enable a user to "warp" a given model by editing just a handful of original model outputs with desired geometric changes. Our method applies a low-rank update to a single model layer to reconstruct the edited examples. Furthermore, to combat overfitting, we propose a latent space augmentation method based on style-mixing. Our method allows a user to create a model that synthesizes endless objects with the defined geometric changes, enabling the creation of a new generative model without the burden of curating a large-scale dataset. We also demonstrate that edited models can be composed to achieve aggregated effects, and we present an interactive interface that lets users create new models through composition. Empirical measurements on multiple test cases suggest that our method has an advantage over recent GAN fine-tuning methods. Finally, we showcase several applications of the edited models, including latent space interpolation and image editing.
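To make the idea of a low-rank update to a single layer concrete, here is a minimal NumPy sketch. It assumes the update is parameterized as a product of two thin factors; the names `U`, `V`, `low_rank_update`, and all sizes are hypothetical and are not taken from the authors' code. In practice the factors would be optimized to reconstruct the edited examples, which is not shown here.

```python
import numpy as np

def low_rank_update(weight, U, V):
    """Return weight + U @ V.T, a modification of rank at most U.shape[1]."""
    return weight + U @ V.T

rng = np.random.default_rng(0)
out_dim, in_dim, rank = 8, 6, 2  # hypothetical layer sizes and rank

W = rng.standard_normal((out_dim, in_dim))       # stand-in for one layer's weight
U = 0.1 * rng.standard_normal((out_dim, rank))   # hypothetical learnable factor
V = 0.1 * rng.standard_normal((in_dim, rank))    # hypothetical learnable factor

W_edit = low_rank_update(W, U, V)

# The change to the layer touches only a rank-`rank` subspace.
delta_rank = np.linalg.matrix_rank(W_edit - W, tol=1e-8)
```

Only `out_dim * rank + in_dim * rank` numbers are free in this parameterization, which is one reason a low-rank update can be fit from a handful of examples.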
Comparison with text-to-image model
Our method enables editing modalities that are difficult to describe in text, since it is unnatural to specify warping edits precisely in words. To showcase this, we compare our edited models with DALLE-2, using the text prompts that we found to best match the warping edits. Even so, we observe that DALLE-2 introduces unintended color and texture changes, while our method yields consistent shape changes across all model samples.
A user first edits a handful of samples from the pre-trained generative model. We then train a customized model so that it can synthesize new samples with a visual effect similar to the one specified by the user edit. To prevent overfitting, we apply style-mixing augmentation to the edited samples. For each sample, we mix the original latent code with a new, randomly sampled texture latent code. Since the augmented samples still preserve shapes and poses, we can apply the same user edit to obtain a training set with diverse texture variations. We train the customized model on the augmented training set using the LPIPS loss.
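The style-mixing augmentation described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: it assumes a StyleGAN-like per-layer latent code of shape `(num_layers, dim)`, assumes that layers before a hypothetical `crossover` index govern shape and pose while later layers govern texture, and uses standard normal samples as a stand-in for texture codes that would normally come from the generator's mapping network.

```python
import numpy as np

def style_mix(w_plus, w_texture, crossover):
    """Keep layers [0, crossover) from w_plus and take the rest from w_texture."""
    if w_plus.shape != w_texture.shape:
        raise ValueError("latent codes must have the same shape")
    mixed = w_plus.copy()
    mixed[crossover:] = w_texture[crossover:]
    return mixed

def augment(w_plus, num_aug, crossover, rng):
    """Build num_aug style-mixed variants of one sample's latent code."""
    # Assumption: random normal codes stand in for mapped texture latents.
    textures = rng.standard_normal((num_aug,) + w_plus.shape)
    return np.stack([style_mix(w_plus, t, crossover) for t in textures])

rng = np.random.default_rng(0)
num_layers, dim = 14, 512  # hypothetical latent layout
w = rng.standard_normal((num_layers, dim))

batch = augment(w, num_aug=4, crossover=8, rng=rng)
```

Because the first `crossover` layers are shared by every variant, the same user warp can be reused on each augmented sample, which is what makes the augmented training set cheap to build.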
Warp edits. Below we show warped models with different object categories.
Color edits. Our method can also be applied to color edits. The colored strokes specify where to apply color changes, while the darker region marks the area to be preserved. The edited models produce precise color changes in the specified parts.
Compose edited models. We can compose the edited models into a new model with aggregated geometric changes by simply blending the model weights linearly. We present an interface that lets users easily create a new model by composing previously edited models. Please visit this notebook for more details.
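As a rough illustration of linear weight blending, the sketch below adds scaled weight differences from each edited model to the pre-trained weights. The text only states that weights are blended linearly; this particular formula, the function name `compose_models`, the per-edit `strengths`, and the dictionary layout are assumptions made for the example.

```python
import numpy as np

def compose_models(base, edited_models, strengths):
    """Blend weights as base + sum_i strengths[i] * (edited_i - base)."""
    if len(edited_models) != len(strengths):
        raise ValueError("need exactly one strength per edited model")
    composed = {}
    for name, w0 in base.items():
        # Each edited model contributes its offset from the base weights.
        delta = sum(s * (m[name] - w0) for m, s in zip(edited_models, strengths))
        composed[name] = w0 + delta
    return composed

# Toy weights standing in for the pre-trained and edited generators.
base = {"layer.weight": np.zeros((2, 2))}
edit_a = {"layer.weight": np.eye(2)}
edit_b = {"layer.weight": np.ones((2, 2))}

composed = compose_models(base, [edit_a, edit_b], [1.0, 0.5])
only_a = compose_models(base, [edit_a, edit_b], [1.0, 0.0])
```

Setting a strength to 0 removes that edit and setting it to 1 applies it fully, so an interface could plausibly expose one slider per edited model.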
Latent space edits. Our edited models can generate smooth transitions between two random samples by interpolating the latent space. We can also apply GANSpace edits to our models to change the object attributes such as poses or colors.
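A transition between two samples can be sketched as a straight-line path between their latent codes. Linear interpolation is an assumption here, since the text does not say which scheme is used, and `interpolate_latents` is a hypothetical helper; each code on the path would then be passed to the edited generator, which is not shown.

```python
import numpy as np

def interpolate_latents(w_a, w_b, num_steps):
    """Return num_steps latent codes evenly spaced on the line from w_a to w_b."""
    ts = np.linspace(0.0, 1.0, num_steps)
    return np.stack([(1.0 - t) * w_a + t * w_b for t in ts])

rng = np.random.default_rng(0)
dim = 512  # hypothetical latent dimension
w_a = rng.standard_normal(dim)
w_b = rng.standard_normal(dim)

path = interpolate_latents(w_a, w_b, num_steps=5)
```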
We thank Richard Zhang, Nupur Kumari, Gaurav Parmar, and George Cazenavette for the helpful discussions. We thank Nupur Kumari and George Cazenavette again for their significant help with proofreading and writing suggestions. We are grateful to Ruihan Gao for the edit examples. We truly appreciate that Flower, Sheng-Yu's sister's cat, agreed to have her portrait edited in the figure. S.-Y. Wang is partly supported by an Uber Presidential Fellowship. The work is partly supported by Adobe Inc. and Naver Corporation. Website template is from Colorful Colorization.