According to news on August 2, a Google research team is running an experiment in which it uses OpenAI's GPT-4 to break through the security protections of other AI models. The team has already breached the AI-Guardian review system and shared the relevant technical details.
IT House learned that AI-Guardian is an AI review system that detects whether an image contains inappropriate content and whether the image has been modified by another AI. If the system detects either sign, it prompts an administrator to handle it.
In a paper on using GPT-4 to design attack methods and explain how they work, Nicholas Carlini, a researcher at Google DeepMind, discusses how these methods can be used to deceive AI-Guardian's defense mechanism.
According to the report, GPT-4 produces a series of scripts and explanations that are used to deceive AI-Guardian. The paper notes that GPT-4 can make AI-Guardian judge "a picture of someone holding a gun" to be "a picture of someone holding a harmless apple", so that AI-Guardian lets the image through. Google's research team said that with the help of GPT-4 they successfully "cracked" AI-Guardian's defenses, reducing the model's accuracy from 98% to just 8%.
The paper has been published on arXiv, where interested readers can learn more. However, the developers of AI-Guardian pointed out that the Google team's attack method will no longer work against future versions of AI-Guardian. Since other models are expected to follow suit, the current attack from Google will likely serve only as a reference going forward.