How does the Wombo Dream app work?

A few days ago I picked out the Wombo Dream app because it lets you create works of art using artificial intelligence. You can read how to use the app in the article on Wombo AI. In this blog post, however, I would like to go into how the technology behind the app works.

Painting with an AI and the Wombo AI app - it can be that easy.

VQGAN + CLIP – Machine learning algorithms working together

Like many other apps that create generative art, Wombo Dream is essentially based on two artificial neural networks that work together to create the images. These two networks are called VQGAN and CLIP.

VQGAN is a neural network used to generate images that look similar to other images. CLIP, on the other hand, is a neural network trained to determine how well a text description fits an image.
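To give a rough idea of what "how well a text fits an image" can mean in numbers: CLIP turns both the text and the image into vectors (so-called embeddings) and compares them, typically using cosine similarity. The following is only a small sketch of that comparison. The three-dimensional vectors are made up, and the function name cosine_similarity is my own choice for illustration, not part of CLIP.

```python
import math

def cosine_similarity(a, b):
    """Return the cosine of the angle between two vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Made-up stand-ins for embeddings (assumption: real ones have hundreds of dimensions).
text_embedding = [0.9, 0.1, 0.3]
matching_image = [0.8, 0.2, 0.4]
unrelated_image = [-0.5, 0.9, 0.0]

score_match = cosine_similarity(text_embedding, matching_image)
score_unrelated = cosine_similarity(text_embedding, unrelated_image)

# The image that points in a similar direction as the text gets the higher score.
print(score_match > score_unrelated)
```

The closer the score is to 1, the better the image and the text fit together in this simplified picture.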

CLIP provides feedback to VQGAN on how best to match the image to the text prompt. VQGAN adjusts the image accordingly and passes it back to CLIP to check how well it fits the text. This process is repeated a few hundred times, resulting in the AI-generated images.
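The back and forth between the two networks can be pictured as a simple loop. The following is a heavily simplified toy sketch and not the real VQGAN or CLIP code: the "image" is just a short list of numbers, the "text prompt" is a made-up target list, and the functions score, feedback and generate are hypothetical stand-ins that I invented for illustration.

```python
def score(image, target):
    """Toy stand-in for CLIP: higher means the 'image' fits the 'text' better."""
    return -sum((p - t) ** 2 for p, t in zip(image, target))

def feedback(image, target):
    """Direction in which each value should move to raise the score."""
    return [2 * (t - p) for p, t in zip(image, target)]

def generate(target, iterations=250, step=0.01):
    """Toy stand-in for the generator: adjust the 'image' a little each round."""
    image = [0.5] * len(target)          # inconspicuous starting "seed"
    history = [score(image, target)]
    for _ in range(iterations):
        hint = feedback(image, target)   # the scorer says what to change
        image = [p + step * h for p, h in zip(image, hint)]  # small adjustment
        history.append(score(image, target))
    return image, history

# Made-up numbers standing in for the text prompt (assumption for this sketch).
target = [0.1, 0.9, 0.4, 0.7]
image, history = generate(target)
print("score at start:", history[0])
print("score at end:  ", history[-1])
```

In this toy version the score improves quickly at first and only slightly towards the end, because each round closes a fixed fraction of the remaining gap.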

The two algorithms were combined by Ryan Murdock and Katherine Crowson, who are both enthusiastic about AI-generated art.

I would like to use an example to show how the procedure works. The text input for the following project was "nether portal rendered in Cinema 4D". A total of 250 iterations were run, and I saved a screenshot every 50 iterations. Here is the result:

Image 1: Everything always starts with a rather inconspicuous "seed" - a colored area with some light structures.

Image 2: After 50 iterations, a lot has happened and you can see a kind of nether portal from Minecraft.

Image 3: After the first 100 iterations, the rough structures, the colors and the main motif have essentially already formed.

Image 4: After 150 iterations. The remaining calculations mainly work on subtleties.

Image 5: We've now gone through 200 iterations, and the bright flames at the top of the portal are still changing.

Image 6: The result after 250 iterations. Basically a pretty work of art for Minecraft fans.
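The six images follow directly from the numbers in my example: 250 iterations with a screenshot every 50 iterations, plus the seed at the very beginning. As a small sketch (the function name snapshot_steps is just my own choice for illustration), the schedule can be calculated like this:

```python
TOTAL_ITERATIONS = 250   # number of iterations in the example run
SNAPSHOT_EVERY = 50      # a screenshot was saved every 50 iterations

def snapshot_steps(total, every):
    """Iterations at which a snapshot is saved, including the seed at step 0."""
    return list(range(0, total + 1, every))

steps = snapshot_steps(TOTAL_ITERATIONS, SNAPSHOT_EVERY)
for number, step in enumerate(steps, start=1):
    print(f"Image {number}: after {step} iterations")
```

If you change TOTAL_ITERATIONS to 500, for example, the same schedule would give you eleven snapshots instead of six.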

You could also run more iterations, but for small resolutions 250 is a good value. In practice, values between 500 and 700 have proven useful. Going beyond that costs more computing time, while ultimately only a few details are added that are hardly visible.

Some people run up to 2000 iterations, but I think that is a special case and rather overkill for amateur artists like me.

Create AI art without programming skills

The Wombo Dream app lets you use this combination of the two algorithms VQGAN and CLIP without any programming knowledge. Your text input and the selected style are passed on to the program, and the AI then creates the corresponding image.

In addition to Wombo Dream, there are other apps and options for creating art with AI and the VQGAN and CLIP networks. I have created a small (certainly incomplete) list for you here:

My tip: Google Colab Pro

My current choice for generating AI-based art from text input is a Google Colab notebook. It is basically free, and you can still quickly understand how it works. If you also invest 10 euros a month in Google Colab Pro, it creates the images 6x faster than in the free version.

Speed is important, especially at the beginning, because as a newcomer you don't want to wait forever for an image to be generated. And quickly generating the finished "work of art" ultimately also helps you experiment with the settings and instructions for the AI.

-

Did you like the article, and did the instructions on the blog help you? Then I would be happy if you supported the blog via a Steady membership or on Patreon.

1 comment

  1. Rich says:

    Wow, that's awsome Sir Apfelot, thank you.