DALL-E: How far can you push it?

A simple introduction to DALL-E would be that it is an artificial intelligence service that converts text to images. You write a description of the visuals, give it to DALL-E, and you should get your image back.

On the main page you can find a few examples of what it is possible to create, along with the descriptions used to create those images.

Examples on DALL-E’s front page

So, my first thought was just to take the same description and try to replicate the image. A very nice candidate turned up: the pink balloon dog.

Pink dog example, provided on DALL-E front page

After I gave it the description “3D render of a pink balloon dog in a violet room”, DALL-E returned four options. However, in my opinion, the one in DALL-E’s own example looks slightly better.

3D render of a pink balloon dog in a violet room

Then I decided to make the task a bit more complicated and gave it this query:

3D render of a blue balloon giraffe in a green room

Here I changed the colors and the animal, asking for a giraffe instead of a dog. A giraffe folded from a balloon is very close to a balloon dog, just with a much longer neck, so I thought it shouldn’t be a big issue. As it turned out, it was quite a challenge for DALL-E. The giraffe was recognizable, but it didn’t look like one folded from a balloon.

3D render of a blue balloon giraffe in a green room

The last challenge I decided to give it was multiple animals in one picture. This was my query:

3D render of a blue balloon dog, red balloon giraffe and black balloon cat in an orange room

I stated that each animal should be a balloon animal and gave each one its own color. Otherwise the query was much like the one in DALL-E’s example, just with multiple animals.

This time the results were not so great. In one of the pictures only two animals were present, and instead of a red balloon giraffe I got just two red balloons.

3D render of a blue balloon dog, red balloon giraffe and black balloon cat in an orange room

After doing this test, I have a few observations:

  • Some of the example pictures seem to be more or less predefined: by using the same query, it should be possible to get quite nice results.
  • DALL-E gets confused when it is asked to create several different types of objects. In my example, asking for different animals was enough to get a worse result.
  • Describing objects with different adjectives might also confuse DALL-E, as it may lose track of which adjective belongs to which object.
  • I wrote the query with the different objects separated by commas. It might be that DALL-E does not handle punctuation marks very well.
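
All three queries in this test follow the same template, so variations can be generated instead of typed by hand. Below is a minimal Python sketch of that idea. The names `balloon_prompt`, `article`, and `separator` are made up for illustration and are not part of any DALL-E API; the code only builds the query text and does not send it anywhere. The `separator` argument is there so the comma guess from the last observation can be tried with a different joining word.

```python
def article(word):
    """Return 'an' before a vowel letter, otherwise 'a' (rough heuristic)."""
    return "an" if word[0].lower() in "aeiou" else "a"


def balloon_prompt(animals, room_color, separator=", "):
    """Build a query in the style used in this post.

    animals: non-empty list of (color, animal) pairs, e.g. [("pink", "dog")].
    room_color: color of the room, e.g. "violet".
    separator: text placed between all but the last two animals
               (hypothetical knob for testing the punctuation guess).
    """
    parts = [f"{color} balloon {animal}" for color, animal in animals]
    if len(parts) > 1:
        subject = separator.join(parts[:-1]) + " and " + parts[-1]
    else:
        subject = parts[0]
    return (
        f"3D render of {article(subject)} {subject} "
        f"in {article(room_color)} {room_color} room"
    )


# The three queries used in this test.
single = balloon_prompt([("pink", "dog")], "violet")
changed = balloon_prompt([("blue", "giraffe")], "green")
multi = balloon_prompt(
    [("blue", "dog"), ("red", "giraffe"), ("black", "cat")], "orange"
)

# Same animals, but joined with "and" instead of commas.
multi_no_commas = balloon_prompt(
    [("blue", "dog"), ("red", "giraffe"), ("black", "cat")],
    "orange",
    separator=" and ",
)

for query in (single, changed, multi, multi_no_commas):
    print(query)
```

Feeding both `multi` and `multi_no_commas` to DALL-E and comparing the results would be one way to check whether the commas really matter, or whether the number of different objects is the actual problem.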
