According to news on March 4, the ChatGPT-based Bing chat has shown many users the power of AI, and Microsoft has recently introduced a more capable, general-purpose AI model: Kosmos-1. ChatGPT is a text-only large language model (LLM), while Kosmos-1 is a multimodal large language model (MLLM).
Kosmos-1 can analyze the content of images, solve visual puzzles, perform visual text recognition, take visual IQ tests, and understand natural language instructions.
IT House learned from reports that multimodal models of this kind are meant to integrate text, audio, images, video and other inputs, with the goal of building a general-purpose artificial intelligence that can handle tasks at a human level. Kosmos-1 itself works with text and images.
"As a fundamental component of intelligence, multimodal perception is a necessary condition for realizing artificial intelligence," the researchers wrote in their academic paper. Visual examples in the Kosmos-1 paper show the model analyzing images and answering questions about them, reading text in images, writing image captions, and taking a visual IQ test, on which it scored 22-26% accuracy.
Microsoft says it plans to make Kosmos-1 available to developers, although the GitHub page cited by the paper had no apparent Kosmos-specific code at the time of publication.