In December, computational biologists Casey Greene and Milton Pividori embarked on an unusual experiment: They asked a non-scientist assistant to help them improve three research papers.
Their diligent assistant recommended revisions to various parts of each document within seconds; reviewing each manuscript took about five minutes. In one biology manuscript, the assistant even caught an error in a reference to an equation.
The reviews didn't always go smoothly, but the final manuscripts were easier to read, and the fees were modest: less than $0.50 per document.
Paper address: https://www.biorxiv.org/content/10.1101/2023.01.21.525030v1
As Greene and Pividori reported in a preprint on January 23, the assistant is not a person but an artificial-intelligence algorithm called GPT-3, first released in 2020.
It is one of the much-hyped generative AI, chatbot-style tools that can produce convincingly fluent text, whether asked to write prose, poetry, or computer code, or to edit a research paper.
The most famous of these tools, also known as large language models (LLMs), is ChatGPT, a version of GPT-3 that shot to fame after its release last November because it was free and easy to access. Other generative AI systems can produce images or sounds.
Article address: https://www.nature.com/articles/d41586-021-00530-0
#"I'm very impressed," said Pividori, who works at the University of Pennsylvania in Philadelphia. “This will help us make our researchers more productive.” Other scientists said they now use LLM regularly, not just to edit manuscripts but also to help them write or check code and brainstorm ideas.
"I use LLMs every day now," says Hafsteinn Einarsson, a computer scientist at the University of Iceland in Reykjavik.
He started with GPT-3 but later switched to ChatGPT, which helps him write presentation slides, student exams and coursework, and convert student theses into papers. "A lot of people are using it as a digital secretary or assistant," he said.
LLMs are already part of search engines, coding assistants, and even a chatbot that negotiates with other companies' chatbots to get better prices on products.
ChatGPT's creator, San Francisco, California-based OpenAI, has announced a $20-per-month subscription service that promises faster response times and priority access to new features (though its trial version remains free).
Tech giant Microsoft, which had already invested in OpenAI, announced a further investment in January, reportedly around $10 billion.
LLMs are destined to be incorporated into general-purpose word- and data-processing software. The future ubiquity of generative AI in society seems certain, especially since today's tools represent the technology in its infancy.
But LLMs have also raised a wide range of concerns, from their propensity to return falsehoods to worries about people passing off AI-generated text as their own.
Article address: https://www.nature.com/articles/d41586-023-00288-7
When Nature asked researchers about the potential uses of chatbots such as ChatGPT, particularly in science, their excitement was mixed with apprehension.
"If you believe that this technology has the potential to be transformative, then I think you have to be nervous about it," said Greene of the University of Colorado School of Medicine in Aurora. Researchers say much will depend on how future regulations and guidelines limit the use of AI chatbots.
Some researchers believe that LLMs are well suited to speeding up tasks such as writing papers or grant applications, as long as there is human oversight.
"Scientists no longer sit down and write lengthy introductions for grant applications," says Almira Osmanovic Thunström, a neurobiologist at Sahlgrenska University Hospital in Gothenburg, Sweden, who co-authored a manuscript using GPT-3 as an experiment. "They'll just ask the system to do this."
Tom Tumiel, a research engineer at London-based software consultancy InstaDeep, said he uses LLMs every day as assistants to help him write code.
"It's almost like a better Stack Overflow," he said, referring to the popular community site where programmers answer each other's questions.
But researchers stress that LLMs are fundamentally unreliable at answering questions and sometimes produce incorrect responses. "We need to be vigilant when we use these systems to produce knowledge."
This unreliability is baked into the way LLMs are built. ChatGPT and its competitors work by learning the statistical patterns of language (including any untruths, biases, or outdated knowledge) in vast databases of online text.
When an LLM receives a prompt (such as Greene and Pividori's carefully structured requests to rewrite parts of a manuscript), it simply spits out, word by word, whatever continuation of the conversation seems stylistically plausible.
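To make that word-by-word continuation concrete, here is a minimal sketch using the small, openly available GPT-2 model via the Hugging Face transformers library, purely as an illustrative stand-in (ChatGPT's own model is not publicly downloadable, and the prompt below is invented):

```python
# A minimal sketch of word-by-word continuation, using the small, openly
# available GPT-2 model as a stand-in (ChatGPT's own model is not public).
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

prompt = "Please rewrite this sentence so that it is clearer:"
input_ids = tokenizer(prompt, return_tensors="pt").input_ids

with torch.no_grad():
    # At each step the model samples a next token that is statistically
    # plausible given the text so far; nothing checks factual accuracy.
    output_ids = model.generate(
        input_ids,
        max_new_tokens=30,
        do_sample=True,
        top_k=50,
        pad_token_id=tokenizer.eos_token_id,
    )

print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```

Nothing in this loop consults a source of facts, which is why fluent output can still be wrong.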
The result is that LLMs are prone to producing false and misleading information, especially for technical topics for which they may not have much data to train on. LLMs also cannot show the sources of their information; if asked to write an academic paper, they will make up fictitious citations.
"The tool cannot be trusted to correctly process facts or produce reliable references," noted an editorial published in the journal Nature Machine Intelligence on ChatGPT in January.
Article address: https://www.nature.com/articles/d41586-023-00107-z
With these caveats, ChatGPT and other LLMs can be effective assistants for researchers who have enough expertise to spot problems directly or to verify answers easily, such as whether an explanation or a suggestion of computer code is correct.
But these tools can mislead naive users. Last December, for example, Stack Overflow temporarily banned the use of ChatGPT after the site's moderators found themselves inundated with incorrect but seemingly convincing LLM-generated answers sent in by enthusiastic users.
This can be a nightmare for search engines.
Some search-engine tools, such as the researcher-focused Elicit, get around the attribution problem by using an LLM's capabilities first to guide queries for relevant literature and then to briefly summarize each of the websites or documents that the engine finds, thus producing output of apparently referenced content (although an LLM may still mis-summarize each individual document).
Companies that build LLMs are also well aware of these problems.
Last September, DeepMind published a paper on a "dialogue agent" called Sparrow; CEO and co-founder Demis Hassabis later told Time magazine that it would be released in private beta this year, with the goal of developing features including the ability to cite sources, the magazine reported.
Other competitors, such as Anthropic, say they have fixed some of ChatGPT's issues.
Some scientists say that, for now, ChatGPT has not been trained on enough specialized content to be helpful on technical topics.
Kareem Carr, a doctoral student in biostatistics at Harvard University in Cambridge, Massachusetts, was disappointed when he tried it at work.
"I think ChatGPT would have a hard time achieving the level of specificity I need," he said. (Even so, Carr says that when he asked ChatGPT for 20 ways to solve a research problem, it responded with gibberish and a helpful idea—a statistical term he’d never heard of that led him to A new frontier in the academic literature.)
Some tech companies are training chatbots on specialized scientific literature, although they have run into their own problems.
Last November, Meta, the tech giant that owns Facebook, released an LLM called Galactica, trained on scientific abstracts and intended to be particularly good at producing academic content and answering research questions. The demo was pulled from public access (although its code remains available) after users got it to produce inaccuracies and racism.
"It's no longer possible to have some fun by abusing it casually. Happy?" Yann LeCun, Meta's chief artificial intelligence scientist, responded to the criticism on Twitter.
Galactica had run into a familiar safety issue that ethicists have been pointing out for years: without output controls, LLMs can easily be used to generate hate speech and spam, as well as racist, sexist, and other harmful associations that may be implicit in their training data.
Beyond directly generating toxic content, there are concerns that AI chatbots will absorb historical biases or ideas about the world from their training data, such as the superiority of particular cultures, says Shobita Parthasarathy, director of the science, technology and public policy program at the University of Michigan. Because the companies creating large LLMs are mostly in, and from, these cultures, they may make little attempt to overcome such biases, which are systemic and hard to correct, she adds.
OpenAI tried to sidestep many of these issues when it decided to release ChatGPT publicly. It limited the model's knowledge base to 2021, blocked it from browsing the internet, and installed filters intended to make the tool refuse to generate content for sensitive or toxic prompts.
Achieving this, however, required human moderators to label reams of toxic text. Journalists reported that these workers were poorly paid and that some suffered trauma. Similar concerns about worker exploitation have been raised about social-media companies, which employ people to train automated bots to flag toxic content.
Even so, OpenAI's guardrails have not been entirely successful. In December, Steven Piantadosi, a computational neuroscientist at the University of California, Berkeley, tweeted that he had asked ChatGPT to develop a Python program to determine whether a person should be tortured on the basis of their country of origin. The chatbot responded with code inviting the user to enter a country and, for certain countries, printing "This person should be tortured." (OpenAI has since closed off that kind of question.)
Last year, a group of academics released an alternative LLM called BLOOM. The researchers tried to reduce harmful outputs by training it on a smaller selection of higher-quality, multilingual text sources. The team also made its training data fully open (unlike OpenAI). Researchers have urged big tech companies to follow this example responsibly, but it is unclear whether they will comply.
Some researchers say academia should reject support for large commercial LLMs altogether. In addition to issues such as bias, safety concerns and exploited workers, these computationally intensive algorithms also require large amounts of energy to train, raising concerns about their ecological footprint.
A further worry is that by offloading their thinking to automated chatbots, researchers may lose the ability to articulate their own ideas.
"As academics, why are we eager to use and promote this product?" Iris van Rooij, a computational cognitive scientist at Radboud University in Nijmegen, the Netherlands, wrote in a blog, urging academics The world resists their appeal.
A further source of confusion is the legal status of some LLMs, which were trained on content scraped from the internet, sometimes with murky permissions. Copyright and licensing laws currently cover direct reproduction of pixels, text, and software, but not imitations of their style.
Problems arise when those AI-generated imitations are produced by training on the originals. The creators of some AI art programs, including Stable Diffusion and Midjourney, are currently being sued by artists and photography agencies; OpenAI and Microsoft (along with its subsidiary technology site GitHub) are also being sued over the creation of their AI coding assistant, Copilot. The outcry could force changes to the law, said Lilian Edwards, an expert in internet law at Newcastle University in the UK.
Some researchers therefore say that setting boundaries around these tools may be critical. Edwards suggests that existing laws on discrimination and bias (as well as planned regulation of dangerous uses of AI) will help keep the use of LLMs honest, transparent and fair. "There's a ton of law out there," she said. "It's just a matter of applying it or tweaking it a little bit."
At the same time, there is a push for transparency about the use of LLMs. Academic publishers, including Nature, say scientists should disclose the use of LLMs in research papers; teachers say they expect similar disclosure from their students. The journal Science has gone a step further, saying that text generated by ChatGPT or any other artificial-intelligence tool cannot be used in its papers.
Article address: https://www.nature.com/articles/d41586-023-00191-1
A key technical question is whether AI-generated content can be easily detected. Many researchers are working on this, and the central idea is to use LLMs themselves to spot AI-generated text.
For example, last December, Edward Tian, a computer science undergraduate student at Princeton University in New Jersey, released GPTZero. This AI detection tool analyzes text in two ways.
One is "perplexity", which measures how familiar the text seems to an LLM. Tian's tool uses an earlier model, GPT-2; if it finds most of the words and sentences predictable, the text is likely to have been AI-generated.
The other is "burstiness", which checks the variation within the text: AI-generated text tends to be more uniform in tone, cadence, and perplexity than text written by humans.
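As a rough illustration of the perplexity signal, the sketch below scores a passage with GPT-2 through the Hugging Face transformers library; the sample passage and the threshold are invented for illustration and are not GPTZero's actual code or criteria:

```python
# A minimal sketch of the perplexity idea: score a passage with GPT-2 and
# treat unusually low perplexity as a hint that it may be machine-generated.
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

def perplexity(text: str) -> float:
    """Return GPT-2's perplexity for the given text."""
    enc = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        # Using the input ids as labels gives the average next-token loss.
        loss = model(enc.input_ids, labels=enc.input_ids).loss
    return torch.exp(loss).item()

sample = "Large language models learn statistical patterns in online text."
ppl = perplexity(sample)
print(f"Perplexity: {ppl:.1f}")
# Lower perplexity means GPT-2 finds the text more predictable, which
# detectors read as weak evidence of AI generation; 50 is an arbitrary cutoff.
print("Possibly AI-generated" if ppl < 50 else "Likely human-written")
```

A real detector combines this score with the burstiness measure across many sentences rather than relying on a single threshold.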
For scientists' purposes, a tool being developed by the anti-plagiarism software developer Turnitin may be particularly important, because Turnitin's products are already used by schools, universities, and academic publishers around the world. The company says it has been developing AI-detection software since GPT-3 was released in 2020 and expects to launch it in the first half of this year.
In addition, OpenAI itself had already released a detector for GPT-2, and it released another detection tool in January.
None of these tools claims to be foolproof, however, particularly if the AI-generated text is subsequently edited.
Moreover, detectors may falsely suggest that some human-written text was produced by an AI, said Scott Aaronson, a computer scientist at the University of Texas at Austin and a visiting researcher at OpenAI. The company said that in tests, its latest tool incorrectly labelled human-written text as AI-written 9% of the time and correctly identified AI-written text only 26% of the time. Further evidence would probably be needed before, for example, accusing a student of concealing their use of AI solely on the basis of a detector test, Aaronson said.
Another idea is to give AI-generated content its own watermark. Last November, Aaronson announced that he and OpenAI were working on a way to watermark ChatGPT output. It has not been released yet, but in a preprint posted on January 24, a team led by Tom Goldstein, a computer scientist at the University of Maryland in College Park, proposed one way of making a watermark. The idea is to use a random-number generator at the particular moments when the LLM is generating its output, to create lists of plausible alternative words from which the LLM is instructed to choose. This leaves a trace of chosen words in the final text that is statistically identifiable but not obvious to a reader. Editing could remove the trace, but Goldstein suggests that doing so would require changing more than half of the words.
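The mechanism can be pictured with a toy sketch. This is a deliberately simplified illustration of the "green list" idea, not the Goldstein team's implementation; the 14-word vocabulary and all names here are made up:

```python
# Toy green-list watermark: the previous word seeds a random split of the
# vocabulary, generation favours the "green" half, and a detector counts how
# often each word falls in the green list seeded by its predecessor.
import hashlib
import random

VOCAB = ["the", "a", "model", "text", "word", "output", "writes", "chooses",
         "random", "list", "reader", "trace", "statistical", "pattern"]

def green_list(prev_word: str, fraction: float = 0.5) -> set:
    """Deterministically derive a 'green' subset of the vocabulary from the previous word."""
    seed = int(hashlib.sha256(prev_word.encode()).hexdigest(), 16)
    rng = random.Random(seed)
    shuffled = VOCAB[:]
    rng.shuffle(shuffled)
    return set(shuffled[: int(len(shuffled) * fraction)])

def generate_watermarked(start: str, length: int, seed: int = 0) -> list:
    """Toy 'LLM' that always picks its next word from the green list."""
    rng = random.Random(seed)
    words = [start]
    for _ in range(length):
        words.append(rng.choice(sorted(green_list(words[-1]))))
    return words

def green_fraction(words: list) -> float:
    """Detector: fraction of words in the green list seeded by their predecessor."""
    hits = sum(1 for prev, cur in zip(words, words[1:]) if cur in green_list(prev))
    return hits / max(len(words) - 1, 1)

watermarked = generate_watermarked("the", 30)
print("watermarked text  :", round(green_fraction(watermarked), 2))  # close to 1.0
human_like = ["the", "reader", "writes", "a", "statistical", "pattern", "the", "model"]
print("unwatermarked text:", round(green_fraction(human_like), 2))   # expected near 0.5
```

Because ordinary writing has no reason to favour the green half, a long passage whose green fraction sits far above one half is statistical evidence of the watermark, which is exactly the trace the paragraph above describes.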
Aaronson points out that one advantage of watermarking is that it will never produce false positives. If there is a watermark, the text was generated using AI. However, it won't be foolproof, he said. “If you’re determined enough, there’s definitely a way to defeat any watermarking scheme.” Detection tools and watermarks only make it more difficult — not impossible — to use AI deceptively.
Meanwhile, the companies building LLMs are busy developing more sophisticated chatbots trained on larger data sets (OpenAI is expected to release GPT-4 this year), including tools aimed specifically at academic or medical work. In late December, Google and DeepMind published a clinically focused preprint about an LLM called Med-PaLM. The tool can answer some open-ended medical questions almost as well as an average human physician, although it still has shortcomings and reliability issues.
Eric Topol, director of the Scripps Research Translational Institute in San Diego, California, said he hopes that in the future, AIs that include LLMs will even be able to aid cancer diagnoses, and the understanding of the disease, by cross-checking text from the academic literature against body-scan images. But he stressed that all of this would require judicious oversight by experts.
The computer science behind generative artificial intelligence is advancing so fast that innovations are emerging every month. How researchers choose to use them will determine their future and ours. "It's crazy to think that at the beginning of 2023, we've seen the end of this," Topol said. "This is really just the beginning."