-
- LeCun's new work: hierarchical world model, data-driven humanoid robot control
- With large models providing the intelligence, humanoid robots have become a new trend. The science-fiction robots that "can tell I'm not human" seem to be getting closer. However, thinking and acting like humans is still a hard engineering problem for robots, especially humanoid ones. Take something as simple as learning to walk: training with reinforcement learning may evolve into something like this: nothing is wrong in principle (the reward mechanism is followed) and the goal of climbing the stairs is achieved, except that the process is rather abstract and departs from how most humans would move. The reason robots struggle to act as "naturally" as humans lies in the high dimensionality of the observation and action spaces, as well as the inherent instability of the bipedal form. To address this, LeCun participated in
- AI 1090 2024-06-13 11:37:17
-
- Good news in the field of 3D asset generation: The Institute of Automation and Beijing University of Posts and Telecommunications teams jointly create a new paradigm of material generation
- The AIxiv column is where this site publishes academic and technical content. In the past few years, the AIxiv column has received and reported on more than 2,000 submissions, covering top laboratories at major universities and companies around the world, effectively promoting academic exchange and dissemination. If you have excellent work to share, feel free to submit it or contact us for coverage. Submission email: liyazhou@jiqizhixin.com; zhaoyunfeng@jiqizhixin.com. In today's digital era, 3D assets play an important role in building the metaverse, realizing digital twins, and applying virtual and augmented reality, driving technological innovation and improving the user experience. Existing 3D asset generation methods often leverage generative models
- AI 899 2024-06-13 11:09:54
-
- In 18 months, the OpenAI team developed GPT-4o
- Altman: Without his (Prafulla Dhariwal's) vision, talent, belief and determination, there would be no GPT-4o. "GPT-4o would not have been possible without the vision, talent, belief and long-term determination of @prafdhar. It is these efforts (and the work of many others) that have led to what I hope will be a revolution in the way computers are used." Two days after OpenAI released its new-generation flagship model GPT-4o, OpenAI CEO Altman paid tribute to one of the people behind the project. Co-founder Greg Brockman said that after 18 months of work by multiple teams across OpenAI, "GPT-4o is the result of the entire team's efforts."
- AI 718 2024-06-13 10:33:27
-
- Scientists use GenAI to discover new insights in physics
- Researchers at MIT and the University of Basel in Switzerland have developed a new machine learning (ML) framework that could help uncover new insights in materials science. The results of the study are published in Physical Review Letters. The research uses a neural-network-based approach to rapidly predict and optimize material properties by analyzing large amounts of materials data. The GenAI framework is highly automated and efficient and could help accelerate the progress of materials research; the researchers say their framework could be applied to a variety of problems. When water transforms from liquid to solid, properties such as its volume and density change markedly. Phase changes in water are so common that we rarely give them serious thought, yet water is a complex physical system. During a phase change
- AI 438 2024-06-13 10:32:22
-
- The world model goes diffusion, too! And the trained agent turns out to be pretty good
- World models provide a way to train reinforcement learning agents in a safe and sample-efficient manner. Recently, world models have mainly operated on sequences of discrete latent variables to simulate environment dynamics. However, compressing observations into compact discrete representations may discard visual details that matter for reinforcement learning. Meanwhile, diffusion models have become the dominant approach to image generation, posing a challenge to discrete latent models. Motivated by this paradigm shift, researchers from the University of Geneva, the University of Edinburgh, and Microsoft Research jointly proposed DIAMOND (DIffusion As a Model Of eNvironment Dreams), a reinforcement learning agent trained inside a diffusion world model. Paper address: https:
- AI 429 2024-06-13 10:12:24
-
- 2024 Zhiyuan Conference Agenda Revealed丨Artificial Intelligence Talent Development Exchange Meeting
- From June 14 to 15, 2024, the 6th Beijing Zhiyuan Conference will be held both offline and online, with the offline venue at the Zhongguancun National Independent Innovation Demonstration Zone Conference Center. The 2024 Zhiyuan Conference once again brings together outstanding researchers of the year with a global perspective to exchange new ideas, explore new approaches, and lead new frontiers. Registration is now officially open. Artificial Intelligence Talent Development Exchange Meeting | On the afternoon of June 14, the 2024 Beijing Zhiyuan Conference will hold a closed-door exchange meeting on artificial intelligence talent development, and we sincerely invite you to discuss key issues in cultivating AI talent. The discussion forum will focus on directions such as intelligence, natural language, machine vision, multimodality, reinforcement learning, and AI for Science, providing you with
- AI 1276 2024-06-13 10:00:59
-
- HKU and ByteDance propose a new paradigm for multimodal large models that simulates human perception before cognition to accurately locate objects in an image
- Currently, multimodal large language models (MLLMs) have demonstrated strong cognitive understanding across multiple visual tasks. However, most multimodal large models are limited to one-way image understanding and have difficulty mapping what they understand back onto the image. For example, a model can easily say what objects are in a picture, but it cannot accurately locate those objects within the picture. This lack of localization capability directly limits the application of multimodal large models in downstream areas such as image editing, autonomous driving, and robot control. To address this problem, researchers from the University of Hong Kong and ByteDance's commercialization team proposed a new paradigm, Groma, which uses regional image encoding to improve the perceptual localization capabilities of multimodal large models. With localization integrated, Groma can directly associate text content with image regions to display
- AI 814 2024-06-12 22:18:00
-
- Tsinghua University and Zhipu AI open source GLM-4: launching a new revolution in natural language processing
- Since the launch of ChatGLM-6B on March 14, 2023, the GLM series of models has received widespread attention and recognition. Especially after ChatGLM3-6B was open-sourced, developers were full of anticipation for the fourth-generation model from Zhipu AI. That expectation has now been fully met with the release of GLM-4-9B. The birth of GLM-4-9B: to give small models (10B and below) more powerful capabilities, the GLM technical team launched this new fourth-generation open-source model in the GLM series, GLM-4-9B, after nearly half a year of exploration. The model greatly compresses model size while preserving accuracy, with faster inference and higher efficiency. The GLM technical team's exploration has not
- AI 1050 2024-06-12 20:38:02
-
- 7B? 13B? 175B? Interpreting the parameters of large models
- Models also come in large and small sizes, and their size is measured by the number of parameters. GPT-3 has 175 billion parameters, and Grok-1 is even more impressive, with 314 billion parameters. Of course, there are also slimmer ones like Llama, whose parameter counts are only between 7 billion and 70 billion. The 70B mentioned here does not refer to the amount of training data, but to the densely packed parameters in the model (a rough memory estimate for these sizes is sketched after this entry). These parameters are like small "brain cells": the more there are, the smarter the model can be and the better it can capture the intricate relationships in the data. With these "brain cells," models may perform better at tasks. However, these parameters, especially in large-scale models, can also cause problems. These "brain cells" are
- AI 834 2024-06-12 20:04:15
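- To make these parameter counts concrete, the short Python sketch below estimates the raw memory needed just to store the weights of models at the sizes mentioned above, at a few common numeric precisions. It is an illustrative back-of-the-envelope calculation, not taken from the article, and it ignores activations, optimizer state, and the KV cache.

```python
# Rough weight-storage footprint for different parameter counts and precisions.
# Purely illustrative; real deployments also need memory for activations,
# optimizer state (during training), and the KV cache (during inference).

BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "int8": 1, "int4": 0.5}

MODELS = {            # approximate parameter counts mentioned in the article
    "Llama-7B": 7e9,
    "Llama-70B": 70e9,
    "GPT-3 (175B)": 175e9,
    "Grok-1 (314B)": 314e9,
}

def weight_memory_gb(n_params: float, dtype: str) -> float:
    """Gigabytes needed just to hold the weights at the given precision."""
    return n_params * BYTES_PER_PARAM[dtype] / 1e9

for name, n_params in MODELS.items():
    row = ", ".join(f"{dtype}: {weight_memory_gb(n_params, dtype):,.0f} GB"
                    for dtype in BYTES_PER_PARAM)
    print(f"{name:>13} -> {row}")
```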
-
- YOLOCS: Effectively reducing the spatial complexity of feature maps
- Paper address: YOLOCS: Object Detection based on Dense Channel Compression for Feature Spatial Solidification (arxiv.org). 01 Overview: In this article, the researchers examine the correlation between channel features and convolution kernels during feature purification and gradient backpropagation, focusing on forward and backward propagation within the network. On this basis, they propose a feature spatial solidification method called dense channel compression. Building on the core concepts of the method, two innovative modules for the backbone and head networks are introduced: Dense Channel Compression for Feature Spatial Solidification (DCFS) and asymmetric multi-stage compression.
- AI 455 2024-06-12 17:49:26
-
- Meta launches 'Chameleon' to challenge GPT-4o, 34B parameters lead the multi-modal revolution! 10 trillion token training refreshes SOTA
- The emergence of GPT-4o has once again created a new paradigm for multimodal model development! Why? OpenAI calls it the first "natively" multimodal model, meaning GPT-4o is unlike all previous models. Traditional multimodal foundation models usually use a dedicated "encoder" or "decoder" for each modality, keeping the modalities separate. However, this approach limits the model's ability to effectively fuse cross-modal information. GPT-4o is the first model trained "end to end" across text, visual, and audio modalities, with all inputs and outputs processed by a single neural network. And now the industry's first model that dares to challenge GPT-4o has appeared! Recently, from the Meta team
- AI 928 2024-06-12 13:18:58
-
- 3 times the generation speed and reduced memory costs, an efficient decoding framework that surpasses Medusa2 is finally here
- Efficiently decoding n-token sequences with the CLLMs + Jacobi decoding framework. Traditionally, large language models (LLMs) are thought of as sequential decoders that decode tokens one by one. A research team from Shanghai Jiao Tong University and the University of California shows that pre-trained LLMs can easily be taught to become efficient parallel decoders, and introduces a new family of parallel decoders called Consistency Large Language Models (CLLMs), which reduce inference latency by efficiently decoding an n-token sequence at each inference step (a toy sketch of this parallel decoding loop follows this entry). The paper shows that "imitating the cognitive process by which humans form complete sentences in their heads before expressing them word by word can be effectively learned by simply fine-tuning pre-trained LLMs."
- AI 1006 2024-06-12 11:55:28
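- As a rough illustration of the Jacobi-style parallel decoding idea described above, here is a minimal, self-contained sketch. It assumes a HuggingFace-style causal LM whose forward call returns `.logits`; it is not the authors' CLLMs implementation, just the fixed-point iteration the excerpt refers to.

```python
import torch

def jacobi_decode(model, prefix_ids: torch.Tensor, n_tokens: int, max_iters: int = 32):
    """Toy Jacobi-style parallel decoding (illustrative, not the CLLMs code).

    Rather than emitting one token per forward pass, we guess an entire
    n-token block and then refine every position in parallel until the
    block stops changing, i.e. reaches a fixed point of greedy decoding.
    """
    device = prefix_ids.device
    block = torch.zeros(n_tokens, dtype=torch.long, device=device)  # initial guess

    for _ in range(max_iters):
        ids = torch.cat([prefix_ids, block]).unsqueeze(0)        # shape (1, L + n)
        logits = model(ids).logits[0]                             # shape (L + n, vocab)
        # Greedy prediction for every block position, computed in one pass.
        new_block = logits[len(prefix_ids) - 1 : -1].argmax(dim=-1)
        if torch.equal(new_block, block):                         # fixed point reached
            return new_block
        block = new_block
    return block
```

- In the worst case this loop needs as many iterations as ordinary autoregressive decoding; the consistency fine-tuning described in the article aims to make the model converge to the fixed point in far fewer iterations, which is where the speedup comes from.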
-
- Ilya's first action after leaving his job: Liked this paper, and netizens rushed to read it
- Since Ilya Sutskever officially announced his departure from OpenAI, his next move has become the focus of everyone's attention, with some people watching his every move closely. And sure enough, Ilya just liked ❤️ a new paper, and netizens rushed to read it. The paper comes from MIT; the authors put forward a hypothesis that can be summed up in one sentence: neural networks trained on different data and modalities, with different objectives, are tending to converge to a shared statistical model of the real world in their representation spaces. They named this conjecture the Platonic Representation Hypothesis, in reference to Plato's Allegory of the Cave and his ideas about the nature of ideal reality. Ilya's taste in papers is still reliable: after reading it, some netizens called it the best paper they have seen this year.
- AI 715 2024-06-12 11:22:14
-
- GraphRAG with knowledge graph retrieval augmentation (implemented with Neo4j code)
- Graph Retrieval-Augmented Generation (GraphRAG) is gradually gaining popularity and has become a powerful complement to traditional vector search methods. The approach takes advantage of the structural characteristics of graph databases, organizing data as nodes and relationships, to enhance the depth and contextual relevance of retrieved information. Graphs have a natural advantage in representing and storing diverse, interrelated information, easily capturing complex relationships and attributes between different data types. Vector databases struggle with this kind of structured information; they focus more on processing unstructured data represented by high-dimensional vectors. In RAG applications, combining structured graph data with unstructured text vector search (sketched after this entry) lets us enjoy the advantages of both at the same time, which is what this article discusses. Structure
- AI 1487 2024-06-12 10:32:28
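- Below is a minimal sketch of what a GraphRAG-style retrieval step over Neo4j might look like: a vector-index lookup over text chunks followed by a graph expansion to the entities they mention. The index name, node labels, relationship types, and connection details are illustrative assumptions, not taken from the article's implementation.

```python
from neo4j import GraphDatabase

# Illustrative GraphRAG-style retrieval: vector similarity search over chunk
# nodes, then graph expansion to the entities they mention. All labels,
# property names, and the index name are hypothetical.
driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))

RETRIEVAL_QUERY = """
CALL db.index.vector.queryNodes('chunk_embeddings', $k, $query_embedding)
YIELD node AS chunk, score
OPTIONAL MATCH (chunk)-[:MENTIONS]->(e:Entity)-[r]-(related:Entity)
RETURN chunk.text AS text,
       score,
       collect(DISTINCT e.name + ' -[' + type(r) + ']-> ' + related.name) AS graph_context
ORDER BY score DESC
"""

def retrieve(query_embedding: list[float], k: int = 5) -> list[dict]:
    """Return the top-k chunks plus the graph context around each one."""
    with driver.session() as session:
        result = session.run(RETRIEVAL_QUERY, k=k, query_embedding=query_embedding)
        return [record.data() for record in result]

# The returned text and graph_context strings can then be concatenated into the
# prompt of a language model, combining unstructured and structured context.
```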
-
- With built-in 10,000+ popular Github code libraries, Baidu officially released Comate Code Knowledge Enhancement 2.0
- On May 18, the 7th iTechClub North China Internet Technology Elite Summit Forum was held. The director of Baidu's Engineering Performance Department gave a keynote speech titled "Towards a New Paradigm of AI-Native R&D for Human-Machine Collaboration" and released the latest result of Baidu's intelligent coding assistant Comate: Comate Code Knowledge Enhancement 2.0. This is the first intelligent coding assistant in China to support real-time retrieval, with more than 10,000 popular GitHub code libraries built in, bringing developers around the world an unprecedented programming experience. As one of the highlights of the conference, Comate Code Knowledge Enhancement 2.0 drew great attention from attendees. The intelligent coding assistant Comate is an intelligent code completion and
- AI 1086 2024-06-11 22:45:15