Microsoft Bing has improved its ability to generate images from text, and Adobe also released Firefly today to enter the generative AI game.
Last night was a lively one. With Nvidia GTC underway and Google officially opening up testing of Bard, Microsoft Bing was not about to sit on the sidelines.
Today, Microsoft officially announced that the Bing search engine has integrated OpenAI's DALL·E model, adding AI image generation. In other words, after bringing in ChatGPT, Bing has been upgraded once again: Bing Image Creator now lets users generate images with the DALL·E model.
"For users with Bing preview access, Bing Image Creator will be fully integrated into the Bing chat experience, launching first in Creative mode," explained Yusuf Mehdi, Microsoft's head of consumer marketing. "By entering a description of an image, providing additional context such as location or activity, and selecting an art style, Image Creator will generate an image from the user's imagination."
Bing chat has three response modes: Creative, Balanced, and Precise. Results generated in Creative mode are typically "original and imaginative," while Precise mode favors accuracy and relevance, producing more factual and concise answers. For now, Image Creator is available only in Creative mode.
It is worth mentioning that users without access to the Bing preview can still try Image Creator on its own by going directly to bing.com/create. Only English input is supported at present; Microsoft says more languages will be added over time.
In addition, Microsoft also launched new AI-powered Visual Stories and Knowledge Cards 2.0 in Bing.
Here is a brief look at OpenAI's line of research on text-to-image generation, the DALL·E series.
On January 6, 2021, the OpenAI blog introduced two neural networks connecting text and images: DALL·E and CLIP. DALL·E generates images directly from text, while CLIP matches images against text categories. The two releases drew considerable attention from the community.
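To make CLIP's "matching images against text categories" concrete, here is a minimal conceptual sketch of the idea, not the real model: both modalities are encoded into a shared embedding space, and the best caption is the one with the highest (temperature-scaled) cosine similarity to the image embedding. The embeddings, captions, and the scale factor below are made up purely for illustration.

```python
import numpy as np

def normalize(v):
    # Project embeddings onto the unit sphere so dot products become cosine similarities.
    return v / np.linalg.norm(v, axis=-1, keepdims=True)

def match_image_to_texts(image_emb, text_embs, scale=100.0):
    """Return softmax probabilities over candidate captions, CLIP-style."""
    image_emb = normalize(image_emb)
    text_embs = normalize(text_embs)
    logits = scale * (text_embs @ image_emb)  # scaled cosine similarities
    exp = np.exp(logits - logits.max())       # stable softmax
    return exp / exp.sum()

# Toy stand-ins for encoder outputs; a real CLIP model would produce these.
image_emb = np.array([0.9, 0.1, 0.0])
text_embs = np.array([
    [0.0, 1.0, 0.0],   # "a photo of a dog"
    [1.0, 0.0, 0.0],   # "a photo of an avocado chair"
    [0.0, 0.0, 1.0],   # "a photo of a cat"
])
probs = match_image_to_texts(image_emb, text_embs)
best = int(probs.argmax())  # index of the best-matching caption
```

In this toy setup the image embedding lies closest to the second caption, so the matcher selects index 1; the same argmax-over-similarities logic is how CLIP performs zero-shot classification.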
According to the blog, DALL·E can turn a wide range of concepts expressed in natural language into plausible images. It is essentially a 12-billion-parameter version of GPT-3 trained to generate images from text descriptions.
A DALL·E example: given the prompt "avocado-shaped chair," it produces images of green avocado chairs in various shapes.
Two months later, DALL·E's paper and code were made public.
Around April 7, 2022, DALL·E received an upgrade: DALL·E 2. Compared with DALL·E, DALL·E 2 generates images from user descriptions at higher resolution and lower latency. The new version also adds features such as editing existing images.
OpenAI also published DALL·E 2's research paper, "Hierarchical Text-Conditional Image Generation with CLIP Latents".
Paper address: https://cdn.openai.com/papers/dall-e-2.pdf
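At a high level, the paper factorizes text-to-image generation into two stages: a prior that produces a CLIP image embedding from a caption, and a decoder that generates an image conditioned on that embedding. In the paper's notation, with $y$ the caption, $x$ the image, and $z_i$ the CLIP image embedding of $x$:

```latex
P(x \mid y) = P(x, z_i \mid y) = P(x \mid z_i, y)\, P(z_i \mid y)
```

The first equality holds because $z_i$ is a deterministic function of $x$; the second is the chain rule. Sampling $z_i$ from the prior and then decoding it yields the generated image.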
The DALL·E version Bing has integrated should be an updated iteration. To some extent, this makes up for ChatGPT's current lack of cross-modal generation; and once GPT-4's multimodal capabilities open up, we may see even more new experiences.
Finally, one more piece of generative AI released today has drawn attention and discussion across the industry.
That is Adobe's Firefly, a family of generative AI models for creative expression that let users quickly modify images by typing commands. Firefly is now in beta, and interested readers can apply for access.
It now seems that more and more players are entering the generative AI arena, and the competition is growing increasingly fierce.