Recently, the AI chatbot ChatGPT has taken the internet by storm, with netizens marveling at its remarkably high emotional intelligence and formidable capabilities. Taking the college entrance examination, fixing code, drafting novels... Under the prompts of countless netizens it keeps outdoing itself, and it can even write an entire program that pieces together a puppy for you. All of these skills were built on GPT-3.5 alone. On March 15, the AI world was refreshed once again when OpenAI released its latest model, GPT-4.
Compared with its predecessor, GPT-4 not only demonstrates stronger language understanding but can also process image content, and its exam scores can surpass those of 90% of human test takers. So what other capabilities does this "heaven-defying" GPT-4 have, and how was it made?
After the release of GPT-4, the OpenAI official website was congested for a time, and many users said on social media that they had immediately subscribed to the Plus service. The term "GPT-4" also quickly climbed the trending lists in the United States, Japan and other regions, as well as the Weibo hot search list and the Douyin trending list in China.
According to OpenAI's official introduction, GPT-4 is a large multimodal model that accepts image and text input and produces text output. While it is less capable than humans in many real-world scenarios, it exhibits human-level performance on a variety of professional and academic benchmarks. For example, GPT-4 passes a simulated bar exam with a score around the top 10% of test takers, whereas GPT-3.5 scored around the bottom 10%.
In the official demonstration video, Greg Brockman, President and Co-founder of OpenAI, also gave users a sneak peek at the image recognition capabilities of the latest version of the system. This feature is not yet public and is only being tested by a company called Be My Eyes. It will allow GPT-4 to analyze images submitted with prompts and to answer questions or perform tasks based on those images. "GPT-4 is not just a language model, it is also a visual model," Brockman said. "It can flexibly accept input with arbitrarily interspersed images and text, a bit like a document."
At another point in the demonstration, Greg Brockman submitted a photo of a rough, hand-drawn website sketch to GPT-4, and the system created a working website based on the drawing.
Some in the industry claim that GPT-4 is 571 times more powerful than GPT-3, and that the three professions that benefit most from this latest achievement are writers, marketers and entrepreneurs. Comparing the two, industry insiders also point to other advantages of GPT-4, such as more training data, more diverse and creative responses, and a shorter response time of about one second. We believe that one of the main trends reflected in this upgrade is multimodality. The model has become larger and more complex, and different types of data can be fed into the same model, giving it a better understanding of our surroundings and the real world.
In addition, GPT-4 shows its strength across languages. In 24 of the 26 languages tested, including some low-resource languages such as Latvian and Welsh, GPT-4 outperformed the English-language performance of other large language models such as GPT-3.5. In Chinese, GPT-4 achieves 80.1% accuracy.
However, OpenAI also listed the shortcomings of GPT-4 on its official website. It still has known limitations, including social bias, fabrication of facts, and vulnerability to adversarial prompts. OpenAI stated that as society adopts AI models, it will increase transparency, encourage and promote user education and broader AI literacy, and work to expand the avenues through which people can provide input on shaping AI models.
With the arrival of GPT-4, we have found that although it is still less capable than humans in many real-world scenarios, it performs at a level comparable to humans on various professional and academic benchmarks. This also means that GPT-4 has indeed taken a step forward toward commercialization.
Previously, the performance of GPT-3 in professional fields was considered unsatisfactory. On the Uniform Bar Exam (MBE + MEE + MPT) in the United States, GPT-3.5 ranked only in the bottom 10%, whereas GPT-4 already ranks in the top 10%. GPT-4's capabilities in professional fields have improved greatly, and in some of them it is gradually approaching or even surpassing humans. This opens up more possibilities for GPT-4 in many B2B (to-business) fields.
For example, in areas such as assistive tools for professional work, knowledge retrieval applications, and vocational education and training, the capabilities brought by GPT-4 may be revolutionary.
After the release of GPT-4, Microsoft immediately stated: "If you have used the new Bing preview at any time in the past five weeks, you have already experienced an early version of this powerful model from OpenAI." This means that the new Bing was already running on GPT-4. Over the past few weeks, many people have used a Bing enhanced by GPT-4, although only its text capabilities were enabled. Microsoft did not describe this as a "world premiere," but given that it has invested US$13 billion (approximately RMB 90 billion) in OpenAI, the early access it received in exchange is reasonable.
In addition to Microsoft's new Bing, many companies are already incorporating GPT-4 into their products, including the language learning app Duolingo, Be My Eyes (software that helps visually impaired users), the payments company Stripe, and the international financial services company Morgan Stanley.
But we have to admit that although GPT-4 has greatly broadened the scenarios in which large models might be commercialized, many people still regard computing power and R&D costs as obstacles that are hard to overcome when putting large models into practice. After all, the R&D and computing expenditures for large models currently look frighteningly high. The costs previously disclosed for ChatGPT, both for a single training run and for daily operation, ran into the millions of dollars. It may be difficult to bring costs under control for commercial use in the short term.
As we all know, ChatGPT is a large-scale natural language processing model developed by OpenAI, but many people do not know that its history can be traced back to 2015. That year, OpenAI was co-founded by Tesla's Elon Musk, Sam Altman and other investors, with the aim of advancing the field of artificial intelligence. Musk left in 2018 over differences about the company's direction.
OpenAI first became famous for its GPT series of natural language processing models. Starting in 2018, OpenAI began releasing the generative pre-trained language model GPT (Generative Pre-trained Transformer), which can be used to generate articles and code and to perform tasks such as machine translation and question answering.
The number of parameters has exploded with each generation of GPT models. GPT-2, released in February 2019, had 1.5 billion parameters. GPT-3, released by OpenAI in May 2020, has 175 billion parameters and was at that time the world's most advanced natural language generation model.
When GPT-3 appeared, as an unsupervised model (now often called a self-supervised model), it could complete almost all natural language processing tasks, such as question-oriented search, reading comprehension, semantic inference, machine translation, article generation, and automated question answering.
Moreover, the model performed excellently on many tasks, for example reaching the state of the art at the time on French-to-English and German-to-English machine translation. The articles it generated were almost impossible to tell apart from human-written ones. Even more surprisingly, it achieved almost 100% accuracy on two-digit addition and subtraction, and it could even generate code automatically from a task description. A single unsupervised model with so many functions and such good results seemed to offer hope for general artificial intelligence. This may be the main reason why GPT-3 had such a great impact.
In 2021, OpenAI announced a new technology called "DALL-E," an AI system capable of generating images. At the same time, OpenAI continued to develop more advanced natural language processing technology.
I believe everyone knows the story after that. At the end of 2022, ChatGPT, based on GPT-3.5, swept the world at lightning speed and became the biggest tech sensation on the planet. And now the release of GPT-4 has hit the trending lists in many countries.
What is the future of GPT? It is foreseeable that, as algorithms and computing power continue to advance, ChatGPT will evolve into more capable versions, be applied in more and more fields, and generate more and better conversations and content for people.
Maybe GPT is still a long way from true "AI", but we might as well regard it as a starting point: the starting point of a new "AI" era.