This article is reprinted with the authorization of AI New Media Qubit (public account ID: QbitAI). Please contact the source for reprinting.
Chinese players are rushing into the ChatGPT race, and its prowess across industries is plain for all to see.
But it is not yet clear when exactly it can be put to work, especially in difficult, high-barrier industries such as health care.
Now, a Harvard Medical School professor has personally tested the performance of ChatGPT.
The results showed that it correctly diagnosed 39 out of 45 cases, an accuracy rate of 87% (well above the 51% rate of existing online diagnostic tools), and gave appropriate triage recommendations in 30 cases.
He said that ChatGPT's performance in assisted diagnosis is close to that of doctors. If so, when can it start work?
In fact, this is the question facing most Chinese players at present: the dividend is here, so how do you claim it first?
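The percentages quoted above follow directly from the case counts. A minimal sketch of the arithmetic, using only the figures reported in the article (the helper name `rate` is just for illustration):

```python
def rate(hits: int, total: int) -> float:
    """Return hits/total expressed as a percentage."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * hits / total


TOTAL_CASES = 45                          # cases in the test, as reported
diagnosis_rate = rate(39, TOTAL_CASES)    # correct diagnoses
triage_rate = rate(30, TOTAL_CASES)       # appropriate triage recommendations
BASELINE = 51.0                           # existing online tools, as reported

print(f"diagnosis accuracy: {diagnosis_rate:.1f}%")
print(f"triage accuracy:    {triage_rate:.1f}%")
print(f"gap to baseline:    {diagnosis_rate - BASELINE:.1f} points")
```

The diagnosis figure rounds to the 87% cited; the triage count works out to roughly two thirds of the cases.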
We have previously laid out the technical and ecosystem difficulties behind replicating a Chinese version of ChatGPT. Clearly, that cannot be achieved in the short term.
Now a new idea has emerged: directly build an industry-specific, vertical version of ChatGPT.
Is this approach feasible?
At its technical core, building ChatGPT cannot bypass three elements: computing power, data and algorithms.
In terms of computing power, OpenAI leans on its powerful backer Microsoft, which provides 285,000 CPU cores and 10,000 Nvidia V100 GPUs. Training GPT-3 once costs as much as 4.6 million U.S. dollars. In terms of data, the GPT series has been iteratively optimized: GPT-3, which once amazed everyone, has 175 billion parameters, while its predecessor GPT-2 had only 1.5 billion. The algorithms, naturally, also rest on many years of deep accumulation; otherwise the model could not show "human-like" autonomous learning, nor adapt quickly to multiple fields and scenarios.
Add to this the feedback from its ecosystem, which closes the iteration loop. Since GPT-3, OpenAI has built its own "GPT ecosystem" in the form of open interfaces. According to statistics from the gpt3demo website, 656 applications have so far been developed using GPT-3 series models.
Such technical and ecosystem barriers mean that replicating ChatGPT is not that easy. As a result, the industry has begun to discuss vertical versions of ChatGPT.
First, from a technical point of view, the core challenge is to match or exceed ChatGPT on tasks in a vertical field with fewer parameters, for example tens of billions. This may be harder than reproducing ChatGPT itself: with far fewer parameters, you cannot rely on brute-force scale alone, and superb model design and compression skills are also required.
Another challenge is the difference in data sources. Companies like Google and Microsoft have natural sources of general data, but their accumulation of specialized data cannot compare with that of vertical players.
This is especially true in public-welfare industries such as health care, which are highly specialized and broad in scope. The high-quality data required may be no smaller than what ChatGPT needed, and most of it cannot be scraped online. Vertical players who have been rooted in the field for years, however, have already built their own industry ecosystems and hold rich industry data and knowledge. That accumulation lays the necessary foundation for reproducing ChatGPT.
From the perspective of value, the demand in vertical industries is real. Demand for medical care, for example, is substantial, so once ChatGPT is deployed in health care it will deliver great social value. In the past, users habitually turned to search engines and apps to help diagnose their own illnesses, often with minimal results.
Harvard Medical School professor Ateev Mehrotra found in his tests that the average accuracy of existing online diagnostic tools is only 51%, while ChatGPT reached 87%, so he believes ChatGPT has the potential to become a game changer in medical diagnosis.
So, as a way to speed up the deployment of ChatGPT applications, creating a vertical version of ChatGPT is feasible in terms of both technical difficulty and value demand. AI players in China are already doing this.
Yunzhisheng ChatGPT Industry Edition
According to the latest disclosures, Yunzhisheng, a unicorn in the intelligent voice sector, is building an industry edition of ChatGPT. It is using health care as the entry point to build a medical-industry version of ChatGPT. At the same time, it plans to build a platform based on the industry version, expand quickly to other fields, and then integrate the domain models with MoE (Mixture of Experts) technology to train a general ChatGPT model.
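To make the Mixture of Experts idea concrete, here is a minimal sketch of how a gate might blend the outputs of several domain models. It is an illustration under stated assumptions, not Yunzhisheng's design: the expert names, the gate scores and the `moe_combine` helper are all hypothetical, and the article does not describe the actual routing scheme.

```python
import math
from typing import Dict, List


def softmax(scores: List[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def moe_combine(gate_scores: Dict[str, float],
                expert_outputs: Dict[str, List[float]],
                top_k: int = 2) -> List[float]:
    """Blend the outputs of the top_k experts, weighted by the gate.

    gate_scores:    hypothetical per-expert relevance scores for one input
    expert_outputs: each expert's output vector for the same input
    """
    # Keep only the top_k highest-scoring experts (sparse routing).
    chosen = sorted(gate_scores, key=gate_scores.get, reverse=True)[:top_k]
    weights = softmax([gate_scores[name] for name in chosen])

    mixed = [0.0] * len(expert_outputs[chosen[0]])
    for name, weight in zip(chosen, weights):
        for i, value in enumerate(expert_outputs[name]):
            mixed[i] += weight * value
    return mixed


# Hypothetical domain experts and scores, for illustration only.
gate_scores = {"medical": 2.0, "education": 2.0, "industrial": -1.0}
expert_outputs = {
    "medical": [1.0, 0.0],
    "education": [0.0, 1.0],
    "industrial": [5.0, 5.0],
}
mixed = moe_combine(gate_scores, expert_outputs, top_k=2)
print(mixed)
```

Only the two highest-scoring experts contribute, so the low-scoring "industrial" expert is ignored for this input. That sparsity is what lets a combined model grow in total capacity without every domain model running on every request.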
This special-to-general idea is in fact Yunzhisheng's long-standing "U X" approach. Here, "U" refers to the base platform for developing and efficiently training general-purpose large models; "X" refers to the dedicated large-model versions used across multiple industries.
In fact, this is also becoming the route by which many companies enter the ChatGPT race, since it lets them take advantage of their existing specialized data.
However, it is not that easy, all the more so because Yunzhisheng chose as its entry point the medical industry, which places higher demands on the quality of generated content.
The most important problem is improving the reliability of medical knowledge. What ChatGPT does best is talk nonsense with a straight face. For chat-style search and content production on Bing today that is not a big problem, and users enjoy it.
But in industry applications, such errors are often hard for non-professionals to detect, which can lead to various risks. An industry version of ChatGPT must therefore put an end to all such nonsense, especially in sectors such as health care, education and industrial production, which have extremely high requirements for generated content, low tolerance for error, and correspondingly higher requirements for data quality.
The second problem is achieving cost-effectiveness in the industry. For any technology to be deployed at scale, it must answer the question of "how to maximize the effect with limited resources."
This is also the only way for industry versions of ChatGPT to be deployed: the model must achieve the same effect as ChatGPT at a smaller parameter scale. That poses plenty of problems for these companies.
In fact, Yunzhisheng also acknowledged that the industry version of ChatGPT may still need tens of billions of parameters, and that achieving good results and large-scale application is no small challenge.
To some extent, creating an industry version of ChatGPT is harder than building the current general-purpose ChatGPT, but these problems must be solved before ChatGPT can truly be deployed in industry. In short, what is needed is the capability to engineer ChatGPT.
This is a path that no one entering the game can avoid.
Seen this way, Yunzhisheng has undoubtedly chosen the harder route by making health care its entry point. Medicine has always been regarded as a field with high industry barriers, strong specialization and great technical difficulty, which is also why there are so few medical AI players compared with the boom in other industries.
But once the medical version of ChatGPT breaks through, deployment in other fields, including the eventual general-purpose large model, will yield twice the result with half the effort.
As an AI company founded in 2012, Yunzhisheng has closely followed cutting-edge AI technology and actively promoted its industrial application, including the upgrade and industrial application of deep learning algorithms in 2012 and the Atlas supercomputing platform in 2016. Its knowledge graph and full-stack AI technology applications have since been upgraded to AGI cognitive technology based on the ChatGPT framework.
At the same time, it has been deeply involved in the medical industry for nearly 10 years and has accumulated industry knowledge, data and applications. It also won the first prize of the Beijing Science and Technology Progress Award in 2019.
Asked whether it is confident in creating an industry version of ChatGPT, Yunzhisheng said: We are completely confident.
To summarize, building ChatGPT is inseparable from high-quality data, leading algorithms and sufficient computing power. The vertical version of ChatGPT also requires deeper engineering capabilities.
On these counts, Yunzhisheng is indeed a useful reference for the industry.
In terms of data, over the past 10 years Yunzhisheng has accumulated a full range of industry data. Its patient-facing guidance, pre-consultation, patient education and follow-up systems, together with its clinician-facing voice medical record, medical record quality control, single-disease quality control and medical risk management systems, have been deployed in nearly 400 hospitals. The data scale is said to have reached 5T, providing a data foundation for large language models in the medical industry.
In terms of algorithms, the cognitive intelligence represented by ChatGPT is itself a core technological strength of Yunzhisheng. The company has built one of the largest medical knowledge graphs in the country. From 2019 to 2022, Yunzhisheng's cognitive intelligence technology won 7 championships and 5 runner-up awards in relevant domestic and international evaluations. Its self-developed medical pre-trained language model CirBERTa once topped the leaderboard of the Chinese medical information processing challenge.
In terms of computing power, the Yunzhisheng supercomputing platform is reported to reach 8 billion floating-point operations per second, which can support models with hundreds of billions of parameters.
In terms of large-model engineering, Yunzhisheng has developed the CirBERTa model, reproduced the GPT-2 model, and used model compression and knowledge distillation to achieve a nearly hundred-fold speedup in online inference, laying the foundation for the widespread application of large models.
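The article does not say how Yunzhisheng implements knowledge distillation. As a hedged illustration of the general technique, the sketch below computes the widely used soft-target loss, in which a small "student" model is trained to match the temperature-softened output distribution of a large "teacher". The function names and example logits are assumptions made for this example.

```python
import math
from typing import List


def softmax_t(logits: List[float], temperature: float) -> List[float]:
    """Softmax with temperature; a higher temperature gives a softer distribution."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(teacher_logits: List[float],
                      student_logits: List[float],
                      temperature: float = 2.0) -> float:
    """KL(teacher || student) on softened distributions, scaled by T^2.

    The T^2 factor is the conventional scaling that keeps gradient
    magnitudes comparable across temperatures.
    """
    p = softmax_t(teacher_logits, temperature)
    q = softmax_t(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return temperature ** 2 * kl


# Hypothetical logits over three classes, for illustration only.
teacher = [4.0, 1.0, 0.5]
good_student = [3.8, 1.1, 0.4]
poor_student = [0.5, 1.0, 4.0]

print(distillation_loss(teacher, good_student))
print(distillation_loss(teacher, poor_student))
```

A student whose logits track the teacher's gets a loss near zero, while one that ranks the classes differently is penalized heavily. Minimizing this loss is how a much smaller model can inherit most of a larger model's behavior, which is what makes the inference speedups described above possible.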
In addition, as the industry version of ChatGPT, content quality assurance is also a key part.
The solution provided by Yunzhisheng is to use the continuous learning and knowledge embedding technology applied in CirBERTa to optimize the knowledge acquisition and update mechanism of the ChatGPT model based on the accumulation of existing knowledge graphs.
According to reports, this can ensure the correctness of the knowledge in ChatGPT answers, and at the same time, it can also provide knowledge traceability information.
In addition, Yunzhisheng's industry-leading medical record quality control technology can automatically find problems in generated medical records, and then automatically produce the user feedback data required by reinforcement learning from human feedback (RLHF), a core technology of ChatGPT, to accelerate model optimization.
Finally, back to the event itself: earlier discussions of ChatGPT's value to the industry were all framed at the macro level of industrial ecosystems and model innovation, such as human-computer interaction, information distribution and content production.
Now, as more and more vertical enterprises enter the market, what ChatGPT means for enterprises is also emerging. It offers a new AGI technology paradigm: an approach to integrating industry knowledge and solving problems based on a large-scale general-purpose foundation model combined with lightweight industry application optimization.
In the past, players in these scenarios may have been in a state of confusion when exploring AI, "seeing a mountain as a mountain, then seeing a mountain as not a mountain." Now it is more a case of "the mountain looks smaller, and you know there will be a road."
The "intelligence" demonstrated by ChatGPT has brought them a clear technical direction.
Yunzhisheng CEO Huang Wei feels this deeply. He believes the impact of ChatGPT runs much deeper even than that of AlphaGo, and is equivalent to a new "industrial revolution".
The biggest advantage of this revolution is that, through the self-supervised attention mechanism, it can make full use of massive unlabeled data to train general foundation models, and combine perception, cognition and generation in a unified framework to achieve "end-to-end" integration, so that machine intelligence shows directly in high-quality generated results. The human-guided, data-driven learning method adopted by the machine is completely different from the logical thinking of humans. It is like the way airplanes fly using jet propulsion and aerodynamics, which is completely different from the way birds flap their wings.
Whether for the entire industry or for an individual enterprise, the value ChatGPT brings leaves them no choice but to follow.
In particular, players who own application scenarios remain the group most likely to reap the ChatGPT dividend.
They have scenarios, data, and deep industry barriers. Once they have ChatGPT capabilities, they can be the first to deploy them in their industry. This is a first-mover advantage that other players cannot match.
When the last AI wave hit, it was the scenario players who were first to reap the AI dividend. The difference now is that ChatGPT arrives with a clear technical path, so deployment will naturally be much faster than before.
Yunzhisheng CEO Huang Wei also gave a clear timeline:
Successful application solutions will be deployed within the year.