Revisiting the reproduction of Sora: the one who is looked up to and the one who is forgotten


WBOY

Mar 27, 2024 pm 07:21 PM

openai industry video generation sora

On February 16, OpenAI released Sora, a blockbuster model in the field of video generation.

Sora's faith in the scaling law and its groundbreaking technical innovations have kept it at the forefront. It also proves once again that "brute force works miracles" still applies to text-to-video generation.

The technical details disclosed about Sora are far from enough to give a full picture, and Sora is not yet open to the public. Since its release, thinking and discussion about Sora have never stopped.

Sora's biggest impact on the AI field as a whole lies in the ideas and frameworks it introduced for video generation. It also triggered a wave of attempts to reproduce Sora that continues to this day.

The motivation to reproduce Sora comes from the technical persistence and technical ideals of technicians on the one hand, and the foreseeable business value in the future on the other.

In addition, it cannot be ignored that OpenAI, the AI research organization still nicknamed "CloseAI", has become an industry benchmark: almost every product it releases brings disruptive innovation. But OpenAI seems to be going further and further down the closed-source road, which has further ignited the public's passion for reproducing Sora. We have reason to believe that in the next few months, multiple Sora-like models will be released and open sourced one after another.

In the more than a month since Sora was released, what is the progress of the discussion and reproduction of its related technological innovations? Let’s take a look below.

Regarding the reproduction of Sora, this article starts from the following three aspects:

It has been more than a month since Sora was released. What is the current progress of the reproduction?
How likely is a successful reproduction? What is the domestic technical foundation?
Is Sora a world model? Can it help us get to AGI? Is it necessary to reproduce it?

Sora-like model

Three models that have been released and widely discussed are Snap Video, Open-Sora 1.0, and Mora.

Snap Video

Snap Video is a Sora-like model released on February 29 by Snap, the company behind the Snapchat photo-sharing app, together with institutions such as the University of Trento. It uses a scalable spatio-temporal Transformer.

Portal:

"The first batch of Sora-like models has appeared: Snap launches Snap Video, better than Pika and not inferior to Gen-2"

Open-Sora 1.0

Portal:

"Don't wait for OpenAI, wait for Open-Sora to be fully open source"

Mora

Mora is a multi-agent framework proposed a few days ago by researchers from Lehigh University and Microsoft Research. The framework integrates several advanced visual AI agents to replicate the general video generation capabilities that Sora has demonstrated.

Portal: "Replicating Sora's general video generation capabilities, the open source multi-agent framework Mora is here"

Although the current reproductions still fall short of Sora, there have been clear technical breakthroughs in just over a month, which is an encouraging signal. According to incomplete statistics, nearly 10 domestic teams are reproducing Sora. Let us wait and see.

Technical architecture innovation before DiT

The DiT (Diffusion Transformer) architecture used by Sora is currently seen as its biggest technical innovation, but looking back, domestic work may have gotten there earlier.

U-ViT Architecture

In September 2022, a Tsinghua team submitted a paper titled "All are Worth Words: A ViT Backbone for Diffusion Models", two months earlier than DiT. The paper proposes U-ViT, a Transformer-based network architecture, as a replacement for the CNN-based U-Net, which coincides with Sora's idea of combining the Transformer with diffusion models.
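As the paper's title suggests, a ViT-style backbone treats its inputs as a sequence of tokens ("words") rather than as a feature map processed by convolutions. The block below is only a minimal sketch of that tokenization step, not the authors' implementation: the helper names `patchify` and `build_token_sequence`, the toy image, and the hand-written time token are all hypothetical, and a real model would pass each token through learned embeddings and Transformer blocks.

```python
def patchify(image, patch):
    """Split an H x W x C image (nested lists) into flattened patch tokens."""
    h, w = len(image), len(image[0])
    assert h % patch == 0 and w % patch == 0, "image size must be divisible by patch size"
    tokens = []
    for top in range(0, h, patch):
        for left in range(0, w, patch):
            token = []
            for y in range(top, top + patch):
                for x in range(left, left + patch):
                    token.extend(image[y][x])  # append every channel of this pixel
            tokens.append(token)
    return tokens


def build_token_sequence(image, patch, time_token):
    """Prepend a (hypothetical) diffusion-timestep token to the patch tokens."""
    return [time_token] + patchify(image, patch)


# Toy 4x4 "noisy image" with 3 channels; each pixel value encodes its position.
image = [[[y * 4 + x] * 3 for x in range(4)] for y in range(4)]
tokens = patchify(image, patch=2)
sequence = build_token_sequence(image, patch=2, time_token=[0.5] * 12)
print(len(tokens), len(tokens[0]), len(sequence))
```

Each 2x2 patch with 3 channels becomes one 12-value token, so the Transformer sees a short sequence instead of a pixel grid. That is the property that lets the same backbone design be reused for diffusion.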

Portal:

"Are domestic companies expected to make Sora? This large model team from Tsinghua University gives hope"

Video Diffusion Transformer (VDT), posted on arXiv in May 2023, was led by a research team at Renmin University of China in cooperation with the University of California, Berkeley, and the University of Hong Kong. It is a unified Transformer-based video generation framework, and the paper also explains in detail the reasons for adopting the Transformer architecture.

Portal:

"Domestic universities build Sora-like model VDT, universal video diffusion Transformer is accepted by ICLR 2024"

In terms of core technical innovation, domestic exploration may not be lagging behind but leading the way. However, because of resource constraints, technical roadmap planning and other reasons, it had not achieved results comparable to Sora.

Sora has undoubtedly verified a technically feasible path, and our own early exploration of the technical architecture will make it easier for us to reproduce Sora. We are even optimistic about surpassing Sora in some areas.

Is Sora a world model?

Another hot discussion triggered by Sora is about the world model.

The videos generated by Sora undoubtedly show a certain understanding of the physical world. In the classic "pirate ships battling in a coffee cup" clip, for example, fluid dynamics, lighting and other characteristics of the physical world are visible to the naked eye.

But some scientists, represented by Yann LeCun, strongly argue that Sora's training method has nothing to do with a world model.

So is Sora a world model? Does it understand the physical world? Discussions about this have spread across forums and live broadcasts. Clearly, opinions differ on what a world model even is.

What we can be clear about is that if Sora is a world model, then the ideal of general artificial intelligence (AGI) may arrive sooner than we expect. Then it is necessary to reproduce Sora.

We remain curious about Sora and continue to explore possible answers to the following questions.

Can the video generation architectures and technologies that came before Sora still be used? How?
Who is forgotten after Sora? Who is looked up to?
What should other startups and teams besides OpenAI do, and how?
Will Sora change the mainstream technology architecture? Will the architecture represented by DiT be the mainstream architecture choice in the future?
Should domestic technical teams reproduce Sora? Why?
It is known that nearly 10 teams are reproducing Sora. What future landscape might we see?
Why OpenAI? Can OpenAI's model be replicated?
What is the global video generation landscape like after Sora? How will it develop and change?
What do you make of some star startups publicly stating that they will not build a Sora-like model?
What is the future of multi-modal large models?
How do you view Sora’s impact from different perspectives? (Perspectives of investors, non-technical people, state-owned enterprises, AI entrepreneurs, practitioners, etc.)
What kind of social role does OpenAI play? What do you think of this company?
……

The impact brought by Sora is disruptive, so the search for answers to the questions above will continue. As a team focused on the exploration and application of cutting-edge AI technologies, we are once again devoting our AI technology forum to the field of video generation.

On April 13th, at Liudaokou, Beijing, we planned a technical forum to focus on technological innovation, thinking and application practice after the release of Sora. The event will bring together many important guests, and we will also discuss the issues mentioned above in more depth.

We believe this event can have a positive and inspiring effect, with a view to promoting the development and spread of technology in China's AI open source community.

Guest lineup

This forum has a strong guest lineup. We have invited:

Mr. Zhang Junlin, a well-known technical expert in the industry, who will give an in-depth breakdown of Sora's core technology
Zeng Yan of ByteDance, author of the popular video generation model PixelDance, who will share the technical innovation and applications behind PixelDance
Dr. Gao Yizhao, team leader of the Sora-like model VDT and CEO of Sophon Engine, a startup incubated by Renmin University of China, who will break down the technical innovation and practice of VDT in detail
Chen Shi, Investment Partner at Fengrui Capital, who will bring unique observations from the perspective of investors and institutions, an indispensable part of the AI field
Mr. Tong Tong, Head of Algorithm Technology at China Mobile Information Technology Co., Ltd., who will share his latest thinking; state-owned enterprises responded quickly after the release of Sora and hold a place in the AI field
Mr. Bian Zhengda, CTO of Luchen Technology and technical lead of the Sora-like model Open-Sora 1.0, who will break down in detail how to reproduce Sora, along with his team's own thinking and practice
More important guests are being invited one after another...

Zhang Junlin

Council member of the Chinese Information Processing Society of China, Ph.D. from the Institute of Software, Chinese Academy of Sciences

Currently serves as the head of new technology research and development for Sina Weibo. Previously, he served as a senior technical expert at Alibaba and was responsible for the new technology team. Author of technical books "This is Search Engine: Detailed Explanation of Core Technology" and "Big Data Daily Record: Architecture and Algorithms".

Zeng Yan

ByteDance Research Algorithm Engineer

Focuses on cutting-edge research in areas such as video generation and multi-modal pre-training. The models Zeng has led have powered ByteDance businesses including video generation, short-video review, e-commerce customer service, Toutiao and educational problem solving. Zeng has published eight related first-author papers in top international conferences and journals such as TPAMI, ICML, CVPR and ACL, and serves as a reviewer for TPAMI, ICML, NIPS, ICLR and others. The PixelDance video generation foundation model, developed under Zeng's lead, was the first in the industry to combine high dynamics with stability and the first to generate a 3-minute continuous story animation.

Chen Shi

Investment Partner at Fengrui Capital

Focuses on investment in technology, software, the Internet, consumer and other fields. Before joining Fengrui Capital, he had 5 years of management experience at Alibaba, where he served as vice president of Alibaba Mobile Business Group, senior executive of Alibaba Culture and Entertainment Group, and international class committee member of Youku and UC, and was deeply involved in business decision-making and management execution for product lines including UC, AutoNavi, Youku, Tudou, Shenma Search and UC International.

A serial entrepreneur for 15 years, he was a member of the core management teams of UC (the world's largest third-party mobile browser, acquired by Alibaba in 2014) and Lakala (a well-known Chinese third-party payment company, SZ: 300773), serving as vice president and CTO respectively. He was once a happy programmer, user growth expert and technology enthusiast.

He holds bachelor's and master's degrees in Mechanical and Electrical Engineering from Beijing University of Aeronautics and Astronautics. In 2023, he was named to EqualOcean's "Top 30 Global Investors in 2023" and Jiazi Guangnian's "Top 20 Best Investors in Artificial Intelligence and Big Data in 2022-2023".

Gao Yizhao

Sophon Engine CEO

Ph.D. from the Gaoling School of Artificial Intelligence, Renmin University of China. An expert in multi-modal large models, he has published many papers in top journals and conferences and led a team through the training of the WenLan large model. He has been involved throughout in the development and promotion of Sophon Engine's models and products.

Bian Zhengda

CTO of Luchen Technology

Graduated from the National University of Singapore. He published a paper at SC, the world's top supercomputing conference. He has 7 years of experience in high-performance AI systems and is the core developer of the Colossal-AI system.

Tong Tong

Head of Algorithm Technology of China Mobile Information Technology Co., Ltd.

Ph.D. in AI from the Institute of Automation, Chinese Academy of Sciences. He is currently responsible for research and development of multi-modal large models, digital humans, intelligent agents and related fields at China Mobile Information Technology Co., Ltd., and has brought into production key technologies such as text-to-image generation, text-to-video generation, and large-model-based action recognition and object detection. He has published 12 papers and holds 12 company patents and 4 software copyrights.

More experts are being confirmed, so stay tuned.

Video generation technology and application - Sora era

This site's AI technology forum closely tracks technological breakthroughs in the AI field. To explore in depth Sora's impact on technology and on all walks of life, we have specially planned the "Video Generation Technology and Application - Sora Era" AI technology forum.

We hope to help enterprises and practitioners keep up with technological trends and gain a comprehensive understanding of breakthroughs and application practices in cutting-edge fields such as Sora, video generation technology, and multi-modal large models.

Faced with the onslaught of AI video generation, only by actively embracing learning and daring to try can we seize the technological trend and break through.

Looking forward to meeting you in Haidian District, Beijing on April 13, 2024.

The registration channel for the forum is officially opened. Scan the QR code on the poster to go directly to the event page. Due to the late release of guest introductions, the early bird discount period for this forum has been extended.

From now until 23:55 on April 7, tickets are 200 yuan off: the early bird price is 699 yuan (original price 899 yuan). There are further exclusive discounts for group purchases of five people; please see the event details page.

Past participants of this site’s AI technology forum, please add Alice’s WeChat account separately to get direct access to the exclusive discount link.

Activity Highlights

Free permanent access to the videos and courseware of the previous forum event "Video Generation Frontier Research and Application" (if you already purchased the previous event, please contact Alice for a deduction; after purchasing this event, remember to contact Alice to redeem the previous videos)
Permanent access to the post-event videos and courseware of this "Video Generation Technology and Application - Sora Era" forum
A gathering of university professors and heavyweight industry experts, to master the latest technology and broaden technical horizons
Face-to-face communication with technical experts and in-depth connection after the meeting
Coverage of core technology breakdowns, best practices from star products, and discussion of the technology's future
Learning support throughout: gift packs of learning materials before and after the conference
Join the video generation high-quality technology exchange community and follow up on the industry’s cutting-edge technology and information in a timely manner
Enjoy a 15% discount on tickets for related paid activities under this site

Technical Exchange Community

To facilitate technical exchange, we have also set up a video generation technology exchange group. Practitioners who care about Sora, video generation and multi-modal large models are welcome to scan the QR code to join and discuss technical details and industry observations in depth.

For questions about business cooperation, group purchasing, invoices, content and other matters related to this event, please add Alice, the person in charge of this event, on WeChat or get in touch by email.

WeChat: 15650753618

Email: jiayaning@jiqizhixin.com

About invoices: After successful registration, you can apply for an invoice in the Huodongxing app once the event has ended. The invoice is an electronic VAT invoice and will be sent to the registration email address once issued.

Become a forum volunteer: Help with on-site tasks such as sign-in, guidance and keeping order. Working meals are included. Current students are given priority. If interested, please contact Alice.
