Learning to play TORCS

Ben Lau has been doing some detailed writeups on deep learning. Today’s post is on using the Deep Deterministic Policy Gradient algorithm to drive a race car in TORCS. With just 300 lines of Python code and a detailed breakdown of how it all works, it’s a good introduction to the concepts.
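One small piece of DDPG that the writeup covers is the use of slowly-tracking "target" copies of the actor and critic networks, updated a little at a time so learning stays stable. A minimal numpy sketch of that soft target update (the tau value and toy weight arrays here are illustrative, not Lau's code):

```python
import numpy as np

def soft_update(target_weights, source_weights, tau=0.001):
    """DDPG-style soft target update: theta' <- tau*theta + (1 - tau)*theta'."""
    return [(1 - tau) * t + tau * s
            for t, s in zip(target_weights, source_weights)]

# Toy example: two arrays standing in for a network's layer weights.
target = [np.zeros(3), np.zeros(2)]
source = [np.ones(3), np.ones(2)]

for _ in range(1000):  # repeated updates slowly pull the target toward the source
    target = soft_update(target, source, tau=0.01)

print(target[0])  # close to 1.0 after many updates
```

The slow tracking is what keeps the critic's bootstrapped targets from chasing a moving network.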

(via https://www.youtube.com/watch?v=8CNck-hdys8)




Mike Tyka @ alt-AI

Mike Tyka has been working with neural networks and machine learning at Google for quite some time, and this talk from alt-AI is a great introduction to how neural networks can be used for artistic outputs, including a clear explanation of what is going on under the hood.

Even if you know the basics, the t-SNE map of the image classes is worth watching.



Tired of seeing “roses are red” memes with terrible scansion, Darius Kazemi created a bot with terrible scansion that reports on headlines in the meme format.

Scansion is significant for another reason: metre and rhyme can introduce structure to generated text, making it feel more intentional and meaningful to the human reader.
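As a toy illustration of what checking scansion involves, here's a sketch that compares a line's stress pattern against a metrical template. The stress dictionary is hand-built for the example; a real bot would draw on a pronunciation resource like the CMU Pronouncing Dictionary, and this is not Kazemi's actual method:

```python
# Toy metre check: map each word to its syllable stress pattern
# (1 = stressed, 0 = unstressed) and compare against a template.
STRESS = {
    "roses": "10", "are": "0", "red": "1",
    "violets": "10", "blue": "1",
}

def stress_pattern(line):
    return "".join(STRESS.get(w.strip(",.").lower(), "") for w in line.split())

def fits(line, template):
    return stress_pattern(line) == template

print(fits("Roses are red,", "1001"))    # True: matches the expected beat
print(fits("Violets are blue,", "1001")) # True
```

A generator that filters its candidate lines through a check like this gets metre "for free," which is exactly the structure the paragraph above is talking about.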




A Mighty Host

This bot, by Mike Lynch, describes new arrivals to the Trojan war. Inspired by Umberto Eco’s musings on the catalog of ships in the Iliad, the Haskell-based generator uses data sources like WordNet, NLTK, and Wikipedia, and invents new allies for the fleet:

Sporting raw sienna hijabs, two hundred iron-shouldered flat-greaved diggers arrived from Sokółka, home of white wolves.

Badr Ḩunayn, realm of diving petrels, sent four score claw-boned thin-doomed honghuzi.

From Burgos came fifteen dozen tight-beating live-jawed shield-maidens, carrying gladii.

From Nîmes, where they dread the large poodle, came eleven score broad-eating villains.

Nāmagiripettai, country of Ixodes spinipalpiss, sent five score moon-booted knights.

Twelve hundred pale-booted glad-chested corybantes, who drink the lucid waters of the Lyarujegeli, came from Aonla.

Eighteen score star-hung fast-handed paladins came from Cary, where they fight the Chihuahuan spotted whiptail. 

There’s some striking imagery in there.

I’m reminded of Borges’ use of lists, such as the Celestial Emporium of Benevolent Knowledge. And of C.S. Lewis’s work on medieval literature’s use of lists, encyclopedias, and bestiaries…influenced, I suspect, by Virgil’s Aeneid, which was influenced by the Iliad, and so we come full-circle.

The pan-temporal character of the hosts is also a suitably medieval conception of the Trojan war: medieval literature and art tended to conflate all time periods and depict them in contemporary dress and behavior.

https://twitter.com/amightyhost
http://bots.mikelynch.org/amightyhost/










The Cartographic Emporium at gozzys.com

A map generator aimed at making dungeon and wilderness maps for roleplaying games. Unlike some other generators, it just makes the maps: no monsters or treasure lists here. The focus is, instead, on presentation.

You can always manually pair it with other generators, like the ones at Abulafia.

http://www.gozzys.com/




Title Generation for User Generated Videos

Human-authored clickbait has competition: here’s an algorithm that can automatically generate titles for videos. Combine it with a neural net that learns which titles get the most traffic and pretty soon the Internet will be able to curate cat videos without human intervention.

The researchers (Kuo-Hao Zeng, Tseng-Hung Chen, Juan Carlos Niebles, and Min Sun) recognized that most video captioning algorithms were trained on isolated clips rather than the longer, rambling versions that people are more likely to upload. So their approach identifies the most salient event in the video and captions that.

They also used an interesting additional data source: sentences that don’t have a paired video. This let them learn from a much larger vocabulary. 

Of course, as the recent debacle of Facebook’s trending algorithm demonstrates, purely AI-run automation is frequently inferior to a human augmented with AI support. Centaurs tend to outperform both humans and computers. Better AI still helps, though: it lets you hand more of the work to the AI half of the collaboration, empowering the human even more.

Speaking as someone who has produced a lot of stock video clips, this tech would really come in handy. Captioning and keywording videos takes a lot of thought, and searching through thousands of badly-keyworded videos is draining. A better way to automate 80% of the task is incredibly useful.

http://arxiv.org/abs/1608.07068






WaveFunctionCollapse

An image-and-tile generation approach inspired by quantum mechanics. Do watch it in action: it’s fascinating, and it will give you a better idea of how the algorithm works. It starts with the average pixel values of the source image it is imitating, and on each step chooses an unobserved region with the lowest entropy and collapses it to a defined state.
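The observe-and-propagate loop can be sketched with a toy tile set. To be clear, this skips the pixel-overlap model the repository actually implements, and the tiles and adjacency rules here are invented for illustration; it only shows the "pick the lowest-entropy cell, collapse it, propagate constraints" core:

```python
import random

# Toy WFC-style loop: each cell starts as a superposition of all tiles.
TILES = {"sea", "coast", "land"}
ALLOWED = {  # which tiles may sit next to each other ("land" never touches "sea")
    "sea": {"sea", "coast"},
    "coast": {"sea", "coast", "land"},
    "land": {"coast", "land"},
}
W = H = 6
grid = [[set(TILES) for _ in range(W)] for _ in range(H)]

def neighbors(x, y):
    for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        if 0 <= x + dx < W and 0 <= y + dy < H:
            yield x + dx, y + dy

def propagate(x, y):
    # Remove neighbor options that no remaining tile here would allow.
    stack = [(x, y)]
    while stack:
        cx, cy = stack.pop()
        allowed = set().union(*(ALLOWED[t] for t in grid[cy][cx]))
        for nx, ny in neighbors(cx, cy):
            new = grid[ny][nx] & allowed
            if new != grid[ny][nx]:
                grid[ny][nx] = new
                stack.append((nx, ny))

random.seed(0)
while True:
    # "Observe": pick an uncollapsed cell with the fewest options (lowest entropy).
    open_cells = [(x, y) for y in range(H) for x in range(W) if len(grid[y][x]) > 1]
    if not open_cells:
        break
    x, y = min(open_cells, key=lambda c: len(grid[c[1]][c[0]]))
    grid[y][x] = {random.choice(sorted(grid[y][x]))}  # collapse to one state
    propagate(x, y)

for row in grid:
    print(" ".join(next(iter(c))[0] for c in row))
```

Even at this scale the entropy heuristic matters: collapsing the most-constrained cell first keeps contradictions rare.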

It can also be used with tiles instead of pixels, or with three-dimensional voxel tiles. Additionally, it can be used with constraints, making it easier to combine it with other generative approaches.

All of which are excellent properties for generative algorithms to have. This is the exact kind of thing that I can see being used and reused in many different systems for many different effects.

The source code and way more detailed explanations can be found on the repository: https://github.com/mxgmn/WaveFunctionCollapse




Generating Faces with Deconvolution Networks

You remember yesterday’s Neural Photo Editor? The interactivity was great and all, but the pictures were kind of small and mushy. Is there ever going to be a neural network that has actually convincing detail?

If you hadn’t guessed it by now, the answer is that there already are: here’s an example of one. Using a dataset with higher-resolution images and the usual clever processing, it generates new faces and expressions with a high degree of detail.

Inspired by research into generating chairs with convolutional networks, Michael Flynn created this project. There’s a blogpost talking about it in more depth, but what I wanted to focus on was how it demonstrates flexibility.

While you can feed invalid inputs into it, most of the unit-length inputs result in reasonable faces and emotions, and it can smoothly interpolate between them. Being able to transition between states (or pick something in-between) is a very useful property for any generative algorithm. And, of course, the quality of the results is only going to improve going into the future.
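When the inputs live on the unit sphere, interpolating along a straight line pulls intermediate points off the sphere and away from the region the network was trained on; spherical linear interpolation (slerp) avoids that. A generic sketch of the trick (I don't know whether this particular repo uses slerp or plain linear interpolation):

```python
import numpy as np

def slerp(a, b, t):
    """Spherical interpolation between two unit-length vectors: intermediate
    points stay on the unit sphere instead of cutting through its interior."""
    omega = np.arccos(np.clip(np.dot(a, b), -1.0, 1.0))
    if np.isclose(omega, 0.0):
        return a
    return (np.sin((1 - t) * omega) * a + np.sin(t * omega) * b) / np.sin(omega)

rng = np.random.default_rng(0)
a = rng.normal(size=128); a /= np.linalg.norm(a)
b = rng.normal(size=128); b /= np.linalg.norm(b)

mid = slerp(a, b, 0.5)
print(np.linalg.norm(mid))  # ~1.0: the midpoint stays on the unit sphere
```

Sweeping t from 0 to 1 gives exactly the kind of smooth face-to-face transition described above.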

Despite that, I also like the glitchy nature of the attempted rotations; it’s not a look you can get otherwise. I do hope that there will be room in future research for some of these oddball side effects. You never know what weird look will be the foundation of a future aesthetic movement.

The code is on GitHub: https://github.com/zo7/facegen

(via https://www.youtube.com/watch?v=UdTq_Q-WgTs)







Neural Photo Editor

I’ve been talking about future artistic tools for a while. Here’s one example of how it might work.

This Neural Photo Editor takes a neural network trained on a dataset of celebrity faces and gives you an interface to edit the displayed image. By painting colors, you can shift the output in the direction you want.
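The general trick behind this kind of editor is to adjust the latent vector rather than the pixels: the paint defines a target in image space, and gradient descent nudges the latent code until the decoded image matches it. Here's a toy sketch of that idea with a linear stand-in for the generator; the real editor backpropagates through a trained network, and everything below (the decoder, sizes, learning rate) is invented for illustration:

```python
import numpy as np

# Toy "decoder": a fixed linear map from a 16-dim latent to 64 "pixels".
rng = np.random.default_rng(1)
D = rng.normal(size=(64, 16)) / 4.0

def decode(z):
    return D @ z

z = rng.normal(size=16)
target = decode(z).copy()
mask = np.zeros(64, dtype=bool)
mask[:8] = True          # the region the user painted over
target[mask] = 0.9       # the painted color value

for _ in range(5000):    # gradient descent on z, not on the pixels
    err = (decode(z) - target) * mask
    z -= 0.1 * (D.T @ err)  # gradient of the masked squared error w.r.t. z

print(np.max(np.abs(decode(z)[mask] - 0.9)))  # painted region approaches the target
```

Because every pixel is decoded from the same latent code, the unpainted regions shift too — which is why these edits stay face-like instead of leaving a flat smear of color.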

It has its limitations, but it points a way towards a future toolset for image editing. Retouching portraits will probably come first, since that’s a nice restricted dataset, but you could train it on any kind of image. Imagine editing a landscape painting with a tool like this, or quickly enhancing your concept sketches. Or interactively controlling a style transfer on your photos.

It also illustrates flexibility: while some of the images generated aren’t very coherent, most of them are. Having an easy way to explore a mostly valid image-space is a very powerful tool. The low resolution is the biggest barrier to having this be usable right now, and that will be overcome soon enough.

https://github.com/ajbrock/Neural-Photo-Editor




SmoothLife (2011)

I’m fond of artificial life experiments–like Conway’s Game of Life, to pick the most common example of cellular automata–so this extension of the idea to use floating-point values has been in my bookmarks for a while.

Stephan Rafler, who came up with this family of rulesets, explained how it works in these slides, but the basic idea is that instead of looking at individual neighboring cells, it looks at a circular region. Unlike some other attempts to generalize Life to a continuous space, the SmoothLife rulesets include many of the interesting features of Life, such as gliders.
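That circular-region idea can be sketched directly: for every cell, compute the average filling of an inner disk and of an outer ring, then apply a birth/survival rule to those two numbers. The radii and thresholds below are illustrative rather than Rafler's published parameters, and the hard thresholds stand in for his smooth sigmoid steps:

```python
import numpy as np

# SmoothLife-style neighborhoods: the inner disk generalizes "the cell itself",
# the outer ring generalizes "the eight neighbors".
N, R_INNER, R_OUTER = 64, 3.0, 9.0

yy, xx = np.mgrid[:N, :N]
# Distances on a torus (wrap-around world), measured from cell (0, 0).
d = np.sqrt(np.minimum(yy, N - yy) ** 2 + np.minimum(xx, N - xx) ** 2)
inner = (d <= R_INNER).astype(float)
ring = ((d > R_INNER) & (d <= R_OUTER)).astype(float)
inner /= inner.sum()
ring /= ring.sum()

def averages(field):
    # Circular convolution via FFT gives every cell its disk and ring averages.
    F = np.fft.rfft2(field)
    m = np.fft.irfft2(F * np.fft.rfft2(inner), s=field.shape)
    n = np.fft.irfft2(F * np.fft.rfft2(ring), s=field.shape)
    return m, n

def step(field, birth=(0.28, 0.37), survive=(0.27, 0.45)):
    m, n = averages(field)
    # Hard thresholds for clarity; Rafler's actual rules use smooth sigmoid
    # steps on m and n, which is what makes the simulation "smooth".
    lo = np.where(m > 0.5, survive[0], birth[0])
    hi = np.where(m > 0.5, survive[1], birth[1])
    return ((n >= lo) & (n <= hi)).astype(float)

rng = np.random.default_rng(0)
field = (rng.random((N, N)) < 0.4).astype(float)
for _ in range(5):
    field = step(field)
```

Replacing the two thresholds with continuous sigmoids, and the binary field with floating-point values, is the step that turns this into the organic-looking simulation in the video.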

The result is a very organic-looking simulation.

The source code is on SourceForge, though it may be easier to run the version that’s implemented as a ruleset for Ready.

(via https://www.youtube.com/watch?v=KJe9H6qS82I)