This year marks the breakout of AI video generation, with models and products led by Sora emerging in quick succession. In just a few months, dozens of video generation tools have launched, and AI-based video creation is starting to take off. But new technology also brings new challenges and doubts. Beyond the well-known "blind box" problem, where the result of each generation is unpredictable, AI-generated video is frequently criticized for poor controllability and a cumbersome processing workflow. OpenAI invited professional video production teams to test Sora. Among them, the Toronto-based Shy Kids team used Sora to produce a short film built around a balloon man, which combined creativity with AI technology and left audiences impressed.
In fact, the short film was not Sora's direct output but was assembled from multiple video clips. Because Sora struggles to keep the protagonist consistent across separately generated videos, the team relied on extensive manual post-editing to arrive at the final film. As the Shy Kids creators concluded, "Sora's technology is cool, but its generation process is difficult to control." Precise control over generated content is an essential requirement in AI video creation, and it remains a major challenge for today's algorithms.
To this end, at the just-concluded World Artificial Intelligence Conference (WAIC) in Shanghai, DAMO Academy released "Xunguang", a one-stop AI video creation platform aimed at PUGC (professional user-generated content). It assists users with scripts, storyboards, and similar tasks, and improves the efficiency of the whole creative process through workflow integration. It supports rich AI editing of both generated and uploaded materials, offering more than ten AI editing functions such as character control, scene control, style transfer, camera movement control, and target addition, elimination, and modification, so that elements and objects in a video can be controlled precisely. DAMO Academy hopes the Xunguang platform will further improve the efficiency of AI video creation. The goal is to use AI to reshape the entire traditional video production process and create a new video workflow for the AI era.
An industry first: layer-based video editing
In the early stages of Xunguang's development, DAMO Academy conducted extensive, in-depth research with film and television practitioners and creators to understand their needs and pain points in video AIGC creation. They found that video layers were the most frequently mentioned and most urgent need among almost all video creators. Based on this, the Xunguang platform has launched a systematic video layer editing function for the first time in the industry. By entering text, users can generate a video that matches the description and has a transparent background, then blend it into another background video with one click. This builds on traditional video generation by producing content in a more flexible form, as layers. Xunguang also provides layer separation: with a single tap, the selected target is split out into its own layer video, which can then be composited smoothly onto different background videos.
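As a rough illustration of the one-click blending described above, compositing a transparent-background layer onto a background video can be reduced to per-pixel alpha blending. The sketch below is not Xunguang's implementation, which the article does not describe. The function names and the representation of frames as nested lists of pixel tuples are assumptions made for illustration, and the blend is the standard "over" operator with straight (non-premultiplied) alpha.

```python
from typing import List, Tuple

# Hypothetical pixel and frame types, chosen only for this illustration.
RGBA = Tuple[int, int, int, int]  # 8-bit channels; alpha 0 = fully transparent
RGB = Tuple[int, int, int]


def blend_pixel(fg: RGBA, bg: RGB) -> RGB:
    """Composite one foreground pixel over one background pixel."""
    alpha = fg[3] / 255.0
    # "Over" operator: weight the foreground by alpha, the background by the rest.
    return tuple(round(alpha * f + (1.0 - alpha) * b) for f, b in zip(fg[:3], bg))


def blend_frame(fg_frame: List[List[RGBA]], bg_frame: List[List[RGB]]) -> List[List[RGB]]:
    """Composite a foreground layer frame over a background frame of the same size."""
    same_height = len(fg_frame) == len(bg_frame)
    if not same_height or any(len(fr) != len(br) for fr, br in zip(fg_frame, bg_frame)):
        raise ValueError("foreground and background frames must have the same size")
    return [
        [blend_pixel(f, b) for f, b in zip(fg_row, bg_row)]
        for fg_row, bg_row in zip(fg_frame, bg_frame)
    ]


def blend_video(fg_frames, bg_frames):
    """Blend two clips frame by frame; zip stops at the end of the shorter clip."""
    return [blend_frame(f, b) for f, b in zip(fg_frames, bg_frames)]


if __name__ == "__main__":
    # One 1x2 frame: an opaque red pixel and a fully transparent pixel over blue.
    layer = [[(255, 0, 0, 255), (255, 0, 0, 0)]]
    background = [[(0, 0, 255), (0, 0, 255)]]
    print(blend_video([layer], [background]))
```

A production pipeline would work on decoded video frames with an array library and would also have to handle resizing, color spaces, and edge quality, none of which this sketch attempts.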
Users can combine different foreground layers with different backgrounds to produce new videos. Layer fusion further stimulates creativity and imagination in AI creation while keeping scenes and characters consistent across multiple shots. In DAMO Academy's view, AI will not replace creators' work; it will optimize the video creation workflow and become a new engine for creativity.
A one-stop AI creation platform: simpler interaction, richer editing capabilities
Script creation, storyboard design, material editing... Traditional video creation has a clear division of labor across its steps, and the cycle is lengthy. With the support of AI, creative steps that were once scattered across different production processes can now be completed smoothly on the Xunguang platform. "We hope to make video editing as simple, intuitive, and easy to use as working with PPT," said Chen Weihua, a senior algorithm expert at DAMO Academy's Vision Technology Laboratory, speaking at the event. He noted that interaction is a major highlight of the platform.
The platform's design takes the characteristics of AI video creation fully into account, abstracting each video project into multiple sub-shots. Users can automatically generate a group of sub-shots from a script, or upload original video material and have the algorithm divide it into sub-shots. In the creative space, users can easily view every shot. Multiple shots within a scene can be collapsed or expanded, the order of scenes can be adjusted by dragging, and shots within a scene can also be dragged and dropped. Users can add new sub-shots at any position, call image or video generation to produce content, or add existing materials.
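To make the project, scene, and shot hierarchy described above more concrete, here is a minimal sketch of how such a structure could be modeled. It is an illustration only: the class names, fields, and methods are hypothetical and do not reflect Xunguang's actual data model or API, which the article does not disclose.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Shot:
    """One sub-shot; `source` records whether it was generated or uploaded (assumed field)."""
    description: str
    source: str = "generated"


@dataclass
class Scene:
    """A group of shots that can be collapsed or expanded in the editor."""
    title: str
    shots: List[Shot] = field(default_factory=list)
    collapsed: bool = False

    def move_shot(self, old_index: int, new_index: int) -> None:
        """Reorder a shot within the scene, as a drag-and-drop would."""
        self.shots.insert(new_index, self.shots.pop(old_index))

    def add_shot(self, shot: Shot, position: int) -> None:
        """Insert a new shot at any position in the scene."""
        self.shots.insert(position, shot)


@dataclass
class Project:
    """A video project abstracted into scenes, each holding sub-shots."""
    name: str
    scenes: List[Scene] = field(default_factory=list)

    def move_scene(self, old_index: int, new_index: int) -> None:
        """Reorder scenes within the project."""
        self.scenes.insert(new_index, self.scenes.pop(old_index))


if __name__ == "__main__":
    project = Project("demo", [Scene("Opening", [Shot("wide shot"), Shot("close-up")])])
    project.scenes[0].add_shot(Shot("title card", source="uploaded"), position=0)
    project.scenes[0].move_shot(0, 2)
    print([shot.description for shot in project.scenes[0].shots])
```

Keeping shots in plain ordered lists makes reordering a simple pop-and-insert, which is one straightforward way to back a drag-and-drop interface.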
For each shot, Xunguang provides a complete set of intelligent AI video editing capabilities that operate at the semantic level, driven by user intent, rather than at the pixel level. Any local target in a shot, such as a body, face, foreground, or background, can be finely edited and modified. Examples include camera movement control that understands spatial depth, and target elimination or modification that understands the relative relationships between objects. For editing a video's global elements, the platform offers more than 20 style transfers. Xunguang also provides practical video editing functions such as frame rate control and video super-resolution. "We hope that all elements in a video can be edited and modified, so as to provide users with the greatest freedom in creation," said Chen Weihua.
Today, we are in the midst of the AIGC wave of change, and AI has the potential to give rise to new video workflows. Both professional film and television practitioners and UGC users who love creating stand to benefit. "If you want to do your job well, you must first sharpen your tools." DAMO Academy hopes the Xunguang video creation platform can become every creator's own video studio, enabling closer and more efficient collaboration between AI and creators and truly unleashing the productivity of AI.
To this end, the DAMO Academy Vision Technology Laboratory has built up substantial technical reserves. The laboratory is committed to research on multimodal visual signal understanding and generation. Its current key research directions include more accurate image/video/3D content generation, more controllable image/video/3D content editing, more efficient generation frameworks, and multimodal understanding-generation frameworks.
Chen Weihua said that "Xunguang" will open for internal beta testing in the near future and will continue to iterate and optimize its interactions. Creators are welcome to customize their own AI workflows.
Internal beta application address: https://xunguang.damo-vision.com/