StackGAN: Text to Photo-realistic Image Synthesis

This isn’t the first text-description-to-image-generation research I’ve seen, which is a bit mind-blowing to say. It does, however, produce the best-quality results I’ve seen so far.

This research, by Han Zhang, Tao Xu, Hongsheng Li, Shaoting Zhang, Xiaolei Huang, Xiaogang Wang, and Dimitris Metaxas, stacks two generative adversarial networks to go from the text input to the generated-photo output: a Stage-I GAN sketches a rough, low-resolution image from the text embedding, and a Stage-II GAN conditions on that sketch plus the same text to produce the higher-resolution, photo-realistic result.
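
To make the stacking idea concrete, here is a minimal PyTorch sketch of my own (not the authors’ code): a Stage-I generator maps a text embedding plus noise to a small 64×64 image, and a Stage-II generator conditions on that image and the same embedding to produce a 256×256 image. The layer sizes and embedding dimensions here are illustrative assumptions, not the paper’s exact configuration.

```python
import torch
import torch.nn as nn

def upsample_block(in_ch, out_ch):
    # Double the spatial resolution with a transposed convolution.
    return nn.Sequential(
        nn.ConvTranspose2d(in_ch, out_ch, kernel_size=4, stride=2, padding=1),
        nn.BatchNorm2d(out_ch),
        nn.ReLU(inplace=True),
    )

class StageIGenerator(nn.Module):
    def __init__(self, embed_dim=128, noise_dim=100):
        super().__init__()
        self.fc = nn.Linear(embed_dim + noise_dim, 512 * 4 * 4)
        self.net = nn.Sequential(              # 4x4 -> 64x64
            upsample_block(512, 256),
            upsample_block(256, 128),
            upsample_block(128, 64),
            nn.ConvTranspose2d(64, 3, 4, stride=2, padding=1),
            nn.Tanh(),
        )

    def forward(self, text_embedding, noise):
        x = self.fc(torch.cat([text_embedding, noise], dim=1))
        return self.net(x.view(-1, 512, 4, 4))

class StageIIGenerator(nn.Module):
    def __init__(self, embed_dim=128):
        super().__init__()
        self.encode = nn.Sequential(           # 64x64 -> 16x16
            nn.Conv2d(3, 128, 4, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(128, 256, 4, stride=2, padding=1), nn.ReLU(inplace=True),
        )
        self.decode = nn.Sequential(           # 16x16 -> 256x256
            upsample_block(256 + embed_dim, 128),
            upsample_block(128, 64),
            upsample_block(64, 32),
            nn.ConvTranspose2d(32, 3, 4, stride=2, padding=1),
            nn.Tanh(),
        )

    def forward(self, low_res_image, text_embedding):
        x = self.encode(low_res_image)
        # Tile the text embedding over the 16x16 feature map and concatenate it.
        e = text_embedding[:, :, None, None].expand(-1, -1, x.size(2), x.size(3))
        return self.decode(torch.cat([x, e], dim=1))

# Stage-I sketches the layout; Stage-II refines it into the final photo.
text_embedding = torch.randn(1, 128)   # stand-in for a real sentence encoder
noise = torch.randn(1, 100)
low_res = StageIGenerator()(text_embedding, noise)       # (1, 3, 64, 64)
high_res = StageIIGenerator()(low_res, text_embedding)   # (1, 3, 256, 256)
```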

Since the sentences are encoded as embedding vectors, the model can also interpolate between two descriptions and generate an image at each step along the way.
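
A quick sketch of that interpolation, reusing the toy generators above; the two embeddings here are random stand-ins for the output of a real sentence encoder:

```python
import torch

emb_a = torch.randn(1, 128)   # embedding of sentence A (stand-in)
emb_b = torch.randn(1, 128)   # embedding of sentence B (stand-in)
noise = torch.randn(1, 100)

g1, g2 = StageIGenerator(), StageIIGenerator()
for t in torch.linspace(0, 1, steps=8):
    mixed = (1 - t) * emb_a + t * emb_b   # point on the line between A and B
    image = g2(g1(mixed, noise), mixed)   # (1, 3, 256, 256) image for this blend
```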

https://arxiv.org/abs/1612.03242