An eye for an eye, a tooth for a tooth

It seems that almost every week there is a new and improved text-to-image system. This latest iteration from Katherine Crowson (@rivershavewings) is (currently) one of the best.

One of the reasons CLIP-guided image generation is so compelling is that it challenges the artist to develop an understanding of how the world is represented inside CLIP – the philosophical problem of other minds, applied to a non-human entity.

Much time is spent on prompt engineering, trying to coax out pleasing images from the various systems. Even for the simplest sentence, there is rarely a one-to-one relationship between the prompt and the output.

Language is innately metaphorical; communication is a subtle game played between speakers that relies on a huge body of shared cultural understanding.

CLIP, of course, does not have this background knowledge, yet it will produce a representation of any sequence of words you throw at it.

Proverbs are traditional sayings, short and pithy, and frequently metaphorical – their meaning relies on exactly the kind of subtle knowledge that CLIP lacks. The resulting images are therefore a fascinating insight into the disjunction between artificial and real intelligence.

I have generated approximately five thousand interpretations across hundreds of proverbs. Click the button at the top of this post and take a glimpse into the chasm of meaning.

Below are a few of my favourites.