According to news on February 2, Microsoft software engineering manager Shane Jones recently discovered a vulnerability in OpenAI’s DALL-E 3 model that reportedly allows it to generate a range of inappropriate content. Jones reported the vulnerability to the company but was asked to keep it confidential. He eventually decided to disclose it publicly anyway.
▲ Image source: Shane Jones’s publicly disclosed report
Jones discovered the vulnerability in OpenAI’s text-to-image model DALL-E 3 through independent research in December last year. The vulnerability makes it possible to bypass the model’s AI guardrails and generate a range of NSFW content. The discovery has attracted widespread attention and sparked discussion about the safety and ethics of AI systems. OpenAI has stated that it will fix the vulnerability as soon as possible to ensure that its systems operate correctly and securely. The incident is another reminder that safety and ethical issues deserve close attention when developing and applying artificial intelligence technology.
Jones subsequently reported the vulnerability to Microsoft and OpenAI and published an open letter on LinkedIn. He said the vulnerability may pose a security risk to the public and called on OpenAI to temporarily take the DALL-E 3 model offline until it is resolved.
Jones was then approached by Microsoft’s legal department and senior executives, who told him to delete the LinkedIn open letter immediately and to stop disclosing anything externally, without giving any explanation. Jones repeatedly tried to raise the issue through internal channels but received no response, and the vulnerability was not fixed. He then disclosed the vulnerability to the media and to the relevant authorities.
Jones said that the AI-generated indecent photos of the singer Taylor Swift that recently appeared on the Internet are related to this vulnerability. The photos are said to have been generated with Microsoft Designer’s AI feature, and the underlying model of Designer is DALL-E 3. Microsoft was therefore seriously negligent in issuing a "gag order" in this incident.
Microsoft has since responded officially to Engadget and other media, saying that it will address the employee’s concerns and fix the related vulnerabilities. However, Microsoft also claimed that the vulnerability disclosed by Jones actually has a low success rate and "cannot bypass all of the security mechanisms Microsoft has set up for the model," and that "it is currently unclear whether this vulnerability is related to the Taylor Swift indecent photo incident."