Recently, Stability.ai, a company founded and funded by Emad Mostaque, announced the public release of Stable Diffusion, its AI art model.
You might think this is just another attempt at AI in the arts, but it's much more than that, for two reasons. First, unlike DALL-E 2, Stable Diffusion is open source. This means that anyone can build on it, for free, to create applications for specific text-to-image tasks. Additionally, the developers of Midjourney implemented a feature that lets users combine it with Stable Diffusion, which has led to some amazing results. Just imagine what will happen in the next few months.
Second, unlike DALL-E mini and Disco Diffusion, Stable Diffusion can create stunningly realistic and artistic work, with nothing to envy in OpenAI's or Google's models. People even claim that it is the new SOTA among "generative search engines". (Unless otherwise stated, all images in this article were created using Stable Diffusion.)
Stable Diffusion embodies the best features of the AI art world: it is arguably the best AI art model available, and it is open source. This is simply unheard of and will have a huge impact. What's even more interesting is that news about these services may reach you through the most unexpected sources: your parents, your children, your partner, your friends, or your colleagues. These people are often outsiders to what is happening in artificial intelligence, and they are about to discover its latest trends. Art could be the way AI finally knocks on the door of those who are blind to the future. Isn't that poetic?
Stability.ai was born to create "open AI tools that allow us to realize our potential." Not just a research model that never gets into the hands of most people, but a tool with real-world applications, open for me and you to use and explore.
This is what makes it different from other tech companies, like OpenAI, which jealously guards the secrets of its best systems (GPT-3 and DALL-E 2), or Google, which never even intended to release its own (PaLM, LaMDA, Imagen, or Parti) as private betas. Stability.ai's public release goes beyond sharing model weights and code, which are critical to the health of science and technology but are not something most people care about. It also provides a code-free, ready-to-use website for those of us who don't want to code or don't know how.
The website is called DreamStudio Lite. It is free to use for up to 200 images. Like DALL-E 2, it has a paid subscription model: £10 gets you 1,000 images (OpenAI refills 15 credits per month, but to get more you have to buy a 115-credit pack for $15). That works out to about US$0.03 per image for DALL-E 2 and £0.01 per image for Stable Diffusion. Stable Diffusion can also be used at scale via an API (cost scales linearly, so 100K generations cost £1,000). Beyond image generation, Stability.ai will soon announce DreamStudio Pro (audio/video) and Enterprise (studio). Another feature DreamStudio may implement soon is the ability to generate images from other images, instead of the usual text-to-image setup.
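The pricing figures quoted above can be checked with a little arithmetic. The sketch below is only illustrative: the pack prices come from the text, but the assumption that one DALL-E 2 credit yields four images is not stated in the article, and the function and variable names are made up for this example. The two prices are in different currencies and no conversion is attempted.

```python
# Per-image cost arithmetic for the prices quoted in the text.
# ASSUMPTION (not from the article): one DALL-E 2 credit yields 4 images.
DALLE_IMAGES_PER_CREDIT = 4


def cost_per_image(pack_price: float, units: int, images_per_unit: int = 1) -> float:
    """Return the price of one image, given a pack of `units` credits."""
    return pack_price / (units * images_per_unit)


# DreamStudio: 1,000 images for £10.
dreamstudio_gbp = cost_per_image(10.0, 1000)

# DALL-E 2: a 115-credit pack for $15.
dalle_usd = cost_per_image(15.0, 115, DALLE_IMAGES_PER_CREDIT)

# API usage at scale: cost grows linearly with the number of generations.
api_total_gbp = 100_000 * dreamstudio_gbp

print(f"DreamStudio: £{dreamstudio_gbp:.2f} per image")
print(f"DALL-E 2:    ${dalle_usd:.2f} per image")
print(f"100K API generations: £{api_total_gbp:.0f}")
```

Under that four-images-per-credit assumption, the numbers match the article's rounded figures; with one image per credit, the DALL-E 2 price would instead be roughly $0.13 per image.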
On the website, there is also a resource on prompt engineering, which you may find useful if you are new to this area. Plus, unlike DALL-E 2, you can control parameters to influence the outcome and retain more agency over it. Stability.ai has done everything it can to facilitate access to the model. OpenAI was first and had to move more slowly to assess the potential risks and biases inherent in the model, but it didn't need to keep the model in closed beta for so long or build a business model that restricts creativity. Both Midjourney and Stable Diffusion have proven this.
Open source technology has its own limitations. Openness should come before privacy and tight control, but not before safety. As the company explains in the announcement, the model is released under "a license that allows for both commercial and non-commercial use," with a focus on open and responsible downstream use of the model. The license also mandates that derivative works be subject to at least the same use-based restrictions.
The open source model is good in itself, but if we don't want this technology to end up hurting people, or adding more noise to the internet in the form of misinformation, it is equally important to establish reasonable guardrails. "Because these models are trained on a wide range of image-text pairs scraped from the Internet, the models may reproduce some social biases and produce unsafe content, so open mitigation strategies and public discussion of these biases can allow everyone to be part of this conversation." In any case, openness with safety > privacy and control.
With a solid foundation of ethical values and openness, Stable Diffusion promises to outperform its competitors in real-world impact.
If you want to download it and run it on your own computer, you should know that it requires 6.9 GB of VRAM. That fits on a high-end consumer GPU, which makes it lighter than DALL-E 2, but still out of reach for most users. The rest, like me, can start using DreamStudio right away.
Stable Diffusion is widely regarded as the best AI art model currently available and will become the foundation for countless applications, networks, and services, redefining how we create and interact with art. Now, apps designed for specific use cases will be built from the ground up for everyone to use. People are enhancing children's drawings, making collages with outpainting and inpainting, designing magazine covers, drawing comics, creating morphed and animated videos, generating images from images, and more. Some of these applications are already possible in DALL-E and Midjourney, but Stable Diffusion can push the current creative revolution into the next stage. In the words of Andrej Karpathy, former Tesla AI director and a student of Fei-Fei Li, "artistic creation has entered a new era of human AI cooperation."
AI art models like Stable Diffusion are a new class of tools and should be understood with a new frame of mind, suited to the new reality we live in. We cannot simply draw analogies or parallels to other eras and expect to accurately explain or predict the future. Some things will be similar, some won't. We must treat this coming future as uncharted territory.
There is no doubt that the public release of Stable Diffusion is the most important and influential event so far in the field of AI art models, and this is just the beginning.
Emad Mostaque, one of the authors, said on Twitter: "Expect quality to continue to rise across the board as we release faster, better and more specific models. Not just images, but audio next month, then move on to 3D, video. Language, code, and more training."
We are on the verge of a multi-year revolution in the way we interact, connect and understand art, and creativity in general. And not just in the philosophical, intellectual realm, but as something that everyone now shares and experiences. The creative world will change forever and we must have open and respectful conversations to create a better future for all. Only when open source technology is used responsibly can we create the change we want to see.