What are the trends in the development of artificial intelligence in 2022?
Any answer has to mention the rise of "multimodal AI", especially text-to-image generation tools.
From DALL-E to Imagen, Parti, and Nuwa, these tools can all produce strikingly high-quality images.
The most typical example is OpenAI's DALL-E 2.
Since DALL-E came out, you may have seen it generate many painting-style pictures, such as astronauts riding horses in space.
However, very few DALL-E images try to express abstract concepts.
So Gabriele Sgroi, a machine learning scientist, set out to explore how DALL-E handles this task.
He tested the oil pastel and painting styles on the themes of sadness, love, anger, happiness, justice, and injustice.
Sadness
Anger
Happiness
Love
Painting style
Sadness
Love
Anger
Happiness
There are also other abstract concepts to look at: justice and injustice.
Justice
Injustice
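The test grid described above, two styles crossed with six themes, can be laid out as a short script. This is only a sketch: the prompt template is a hypothetical stand-in, since the article does not quote the prompts Sgroi actually used, and no image-generation service is called.

```python
from itertools import product

# Styles and themes named in the article.
STYLES = ["oil pastel", "painting"]
THEMES = ["sadness", "love", "anger", "happiness", "justice", "injustice"]


def build_prompts(styles, themes):
    """Return one prompt string per (style, theme) pair.

    The "<style> of <theme>" wording is a hypothetical template,
    not the wording used in the original experiment.
    """
    return [f"{style} of {theme}" for style, theme in product(styles, themes)]


prompts = build_prompts(STYLES, THEMES)
for prompt in prompts:
    print(prompt)
```

Keeping the styles and themes in separate lists makes it easy to extend the comparison to other styles or other abstract concepts.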
Gabriele Sgroi believes that paintings are more insightful than images that limit the depiction of emotion to people's facial expressions.
All images in this article (including the cover image) were generated with DALL-E; for each prompt, all of the images returned by the first generation were used.
These examples show that although a given emotion is not always clearly identifiable, DALL-E has a strong overall sense of style, and the painting style yields more abstract and complex pictures.
Among them, most of the pictures representing justice depict a Greek goddess, but the images representing injustice are really confusing.
Overall, Sgroi observed that the results depend heavily on the chosen style.
In most cases, DALL-E also writes the name of the emotion on the resulting drawing.
DALL-E appears to show some level of understanding of the emotions tested, correctly relating them to facial expressions and to the colors or symbols typically associated with them.
Sgroi said it would be interesting to further investigate how representations of the same emotion differ across styles, and to examine whether the observed bias between positive and negative emotions still holds in other examples.
Did DALL-E fail?
Ironically, DALL-E 2 is billed as being good at understanding the text prompts used to generate images.
However, some users discovered that when it cannot understand the text, it places the text itself in the generated image.
One example involves the painting "This Is Not a Pipe" by the artist René Magritte.
AI researcher Janelle Shane also asked DALL-E 2 to generate company logos, but found that none of the images spelled the words correctly.
Waffle House generation example
You could also say that DALL-E 2 understands some scientific laws, because it can easily depict falling objects or astronauts floating in space.
However, if you ask it to generate an anatomical drawing, an X-ray image, a mathematical proof, or a blueprint, the resulting image may look correct on the surface while being fundamentally wrong.
For example, its picture of the solar system drawn to scale is a mess, with something shaped like the Earth in the lower left corner and an object resembling a poached egg in the upper left corner.
It tries to make up something visually similar without understanding the meaning, explained OpenAI researcher Aditya Ramesh.
So DALL-E 2 does not know what science is; it only knows how to read text and draw illustrations.
And when DALL-E 2 generates human faces, they are so realistic that it is almost unbelievable.
During training, OpenAI introduced deepfake safeguards to prevent the model from memorizing faces that often appear on the Internet.
If an uploaded image contains a real face, even that of an unknown person, the system refuses to generate the content.
However, another problem arises: OpenAI said that the system is optimized for images with a single focus of attention.
For example, it is very successful at generating a detailed portrait of "an astronaut staring at the earth with a longing expression on his face".
However, when DALL-E is asked to generate several people at once, it breaks down, so it is very poor at group shots and crowd scenes.
In addition, DALL-E also generates some biased images.
The OpenAI team has begun to correct these biases through machine learning.
For example, during the training of DALL-E 2, the researchers adjusted the training method and increased the weight of images of women so that they were more likely to be generated.
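The idea of giving one group of images more weight can be illustrated with a toy calculation. The sketch below is not OpenAI's procedure, which the article does not describe in detail; the dataset, the group labels, and the weight of 4.0 are all invented for illustration. It only shows how a per-group weight changes the share of each group when examples are sampled in proportion to their weight.

```python
from collections import Counter


def sampling_probabilities(labels, weights):
    """Share of each label when every example is drawn in proportion
    to the weight assigned to its label (default weight 1.0)."""
    counts = Counter(labels)
    mass = {label: counts[label] * weights.get(label, 1.0) for label in counts}
    total = sum(mass.values())
    return {label: value / total for label, value in mass.items()}


# Hypothetical, deliberately imbalanced toy dataset: 80 vs 20 examples.
labels = ["male"] * 80 + ["female"] * 20

before = sampling_probabilities(labels, {})               # no reweighting
after = sampling_probabilities(labels, {"female": 4.0})   # assumed weight

print("before:", before)
print("after: ", after)
```

With equal weights the minority group makes up a fifth of the samples; with the assumed weight of 4.0 both groups are drawn equally often, which is the general effect the reweighting aims for.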
DALL-E will bring more surprises in the future.