A few days ago I picked out the app Wombo Dream because it lets you create works of art using artificial intelligence. You can read how to use the app in the article on Wombo AI. In this blog post, however, I would like to go into how the technology behind the app works.
VQGAN + CLIP – machine learning algorithms working together
Wombo Dream, like many other apps that create generative art, is essentially built on two artificial neural networks that work together to create the images. These two networks are called VQGAN and CLIP.
VQGAN (Vector Quantized Generative Adversarial Network) is a neural network used to generate images that look similar to other images. CLIP (Contrastive Language-Image Pre-training), on the other hand, is a neural network trained to determine how well a text description fits an image.
CLIP provides feedback to VQGAN on how best to match the image to the text prompt. VQGAN adjusts the image accordingly and passes it back to CLIP to check how well it fits the text. This process is repeated a few hundred times, resulting in the AI-generated images.
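The feedback loop described above can be sketched in a few lines of Python. This is only a toy illustration: `generate` and `score` are hypothetical stand-ins for VQGAN and CLIP (no real networks are involved), the "image" is just a short list of numbers, and `TEXT_EMBEDDING` is a made-up vector. What it is meant to show is the shape of the loop: generate, score against the prompt, nudge the generator's input, repeat.

```python
import math

# Assumption: a made-up vector standing in for CLIP's encoding of the prompt.
TEXT_EMBEDDING = [0.6, -0.2, 0.7, 0.1]


def generate(latent):
    """Toy 'generator' (stand-in for VQGAN): maps a latent vector to an 'image'."""
    return [math.tanh(x) for x in latent]


def score(image, text_embedding):
    """Toy 'critic' (stand-in for CLIP): cosine similarity of image and text."""
    dot = sum(a * b for a, b in zip(image, text_embedding))
    norm_i = math.sqrt(sum(a * a for a in image))
    norm_t = math.sqrt(sum(b * b for b in text_embedding))
    return dot / (norm_i * norm_t + 1e-12)


def optimize(latent, text_embedding, iterations=250, step=0.05, eps=1e-4):
    """Repeatedly nudge the latent so that the score goes up."""
    latent = list(latent)
    history = []
    for _ in range(iterations):
        base = score(generate(latent), text_embedding)
        history.append(base)
        # Finite-difference estimate of how each latent entry affects the score.
        grad = []
        for i in range(len(latent)):
            bumped = list(latent)
            bumped[i] += eps
            grad.append((score(generate(bumped), text_embedding) - base) / eps)
        # Move the latent a small step in the direction that improves the score.
        latent = [x + step * g for x, g in zip(latent, grad)]
    return latent, history


final_latent, scores = optimize([0.1, 0.1, 0.1, 0.1], TEXT_EMBEDDING)
print(f"score at start: {scores[0]:.3f}, score at end: {scores[-1]:.3f}")
```

The real notebooks work on actual images and use the gradients of the networks themselves instead of the finite-difference estimate used here, but the generate, score, adjust cycle is the same idea.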
The two algorithms were combined by Ryan Murdock and Katherine Crowson, both of whom are enthusiastic about AI-generated art.
I would like to use an example to show how the procedure works. The text prompt for the following project was "nether portal rendered in Cinema 4D". A total of 250 iterations were run, and I saved a screenshot every 50 iterations. Here is the result:
You could also run more iterations, but for small resolutions 250 is a good value. In practice, 500 to 700 iterations have proven to be a sensible upper range: more iterations mean more computing time, but beyond that point they only add a few details that are hardly visible.
Some people run up to 2000 iterations, but I think that is a special case and rather overkill for amateur artists like me.
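The schedule from the example (250 iterations, one saved image every 50) can be sketched like this. The function names `snapshot_steps`, `run_with_snapshots` and `fake_step` are hypothetical and chosen only for illustration; in a real notebook the step function would update the image and the save function would write it to disk.

```python
def snapshot_steps(total_iterations, every):
    """Return the iteration numbers at which an image would be saved."""
    return list(range(every, total_iterations + 1, every))


def run_with_snapshots(total_iterations, every, step_fn, save_fn):
    """Run step_fn once per iteration and call save_fn every `every` iterations."""
    state = None
    for i in range(1, total_iterations + 1):
        state = step_fn(i, state)
        if i % every == 0:
            save_fn(i, state)
    return state


def fake_step(i, state):
    """Placeholder step: a real notebook would update the image here."""
    return f"image after {i} iterations"


saved = []
run_with_snapshots(250, 50, fake_step, lambda i, image: saved.append(i))
print(saved)  # [50, 100, 150, 200, 250]
```

Raising the total to 500 or 700 only changes the first argument; the computing time grows with every extra iteration, which is why the snapshot interval is a handy way to see when the image stops changing noticeably.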
Create AI art without programming skills
The Wombo Dream app now makes it possible to use this combination of VQGAN and CLIP without any programming knowledge. The text input and the selected style are passed on to the program, and the AI then creates the corresponding image.
In addition to Wombo Dream, there are other apps and options for creating art with AI and the VQGAN and CLIP networks. I have created a small (certainly incomplete) list for you here:
My tip: Google Colab Pro
My current choice for generating AI-based art from text input is the Google Colab notebook. It is basically free, and you can still quickly understand how it works. If you invest 10 euros a month in Google Colab Pro, it also creates the images 6x faster than on the free tier.
Speed is important, especially at the beginning, because as a newcomer you don't want to wait forever for an image to be generated. Fast generation of the finished "work of art" also makes it easier to experiment with the settings and the instructions for the AI.
Jens has been running the blog since 2012. He appears as Sir Apfelot for his readers and helps them with problems of a technical nature. In his free time he drives electric unicycles, takes photos (preferably with his iPhone, of course), climbs around in the Hessian mountains or hikes with the family. His articles deal with Apple products, news from the world of drones or solutions for current bugs.