-
- From bare metal to a 70-billion-parameter large model: a tutorial and ready-to-use scripts
- We know that LLMs are trained on large-scale compute clusters using massive amounts of data. This site has covered many of the methods and technologies used to assist and improve LLM training. Today we share an article that goes deep into the underlying infrastructure: how to turn a pile of "bare metal" machines, without even an operating system, into a cluster for training LLMs. The article comes from Imbue, an AI startup that aims to achieve general intelligence by understanding how machines think. Turning bare metal into a training cluster was not an easy process, full of exploration and trial and error, but Imbue ultimately succeeded in training a 70-billion-parameter LLM, and in the process accumulated…
- AI 783 2024-07-24 20:13:31
-
- How do you create an open source model that can beat GPT-4o? For Llama 3.1 405B, Meta spells it out in this paper
- After an "accidental leak" two days early, Llama 3.1 was finally officially released last night. Llama 3.1 extends the context length to 128K and comes in three versions: 8B, 70B, and 405B, once again single-handedly raising the bar in the large-model race. For the AI community, the greatest significance of Llama 3.1 405B is that it raises the ceiling of what an open source base model can do; Meta says its performance is comparable to the best closed source models across a range of tasks. The table below shows the performance of the current Llama 3 series on key benchmarks, and the 405B model comes very close to GPT-4o. At the same time, Meta announced "The Llam…
- AI 1085 2024-07-24 18:42:03
-
- 11 times the performance: Georgia Tech and Tsinghua teams use AI to help discover new energy storage materials, published in a Nature sub-journal
- Editor | Radish Skin. Electrostatic capacitors are key energy storage components in advanced power systems in defense, aviation, energy, and transportation. Energy density is the figure of merit of an electrostatic capacitor and is determined primarily by the choice of dielectric material. Most industrial-grade polymer dielectrics are flexible polyolefins or rigid aromatics that offer either high energy density or high thermal stability, but not both. Here, a research team from the Georgia Institute of Technology, the University of Connecticut, and Tsinghua University used artificial intelligence (AI), polymer chemistry, and molecular engineering to discover a series of polynorbornene and polyimide…
- AI 447 2024-07-24 17:42:52
-
- Neural networks also have spatial awareness! They learn to create maps in Minecraft, published in a Nature sub-journal
- This is the first time humans have demonstrated that neural networks can create their own maps. Imagine you are in an unfamiliar town. Even if the surroundings are strange at first, you can explore and eventually draw a map of the environment in your brain, including the positional relationships between buildings, streets, signs, and so on. This ability to construct spatial maps in the brain underlies higher-order cognition in humans: for example, language is theorized to be encoded by map-like structures in the brain. Yet even the most advanced artificial intelligence and neural networks cannot build such a map out of thin air. "There's a sense that even the most advanced…
- AI 701 2024-07-24 09:38:12
-
- The first open source model to surpass GPT-4o! Llama 3.1 leaked: 405 billion parameters, with download links and model cards available
- Get your GPUs ready! Llama 3.1 has finally appeared, though the source is not Meta itself. Today, news of the leaked new Llama model went viral on Reddit: in addition to the base model, the leak includes benchmark results for the 8B, 70B, and largest 405B versions. The figure below compares each Llama 3.1 version against OpenAI's GPT-4o and Llama 3 8B/70B; even the 70B version exceeds GPT-4o on multiple benchmarks. Image source: https://x.com/mattshumer_/status/1815444612414087294 Obviously, the 3.1 versions of the 8B and 70…
- AI 1295 2024-07-23 20:51:33
-
- ECCV 2024 | BlazeBVD, a general method for blind video deflickering jointly proposed by Meitu and the National University of Science and Technology of China, is here
- The AIxiv column is where this site publishes academic and technical content. Over the past few years it has received more than 2,000 submissions covering top laboratories at major universities and companies around the world, effectively promoting academic exchange and dissemination. If you have excellent work to share, feel free to contribute or contact us: liyazhou@jiqizhixin.com; zhaoyunfeng@jiqizhixin.com. In recent years the short-video ecosystem has risen rapidly, and creative and editing tools for short video keep emerging. Wink, the professional mobile video editing tool owned by Meitu, leads the way with its original video-quality restoration capabilities, used at home and abroad…
- AI 438 2024-07-23 15:13:34
-
- Xiaomi-backed embodied intelligence robotics company and welding giant officially announce strategic cooperation
- Recently, "Xiaoyu Intelligent Manufacturing", the first embodied intelligence company invested in by Xiaomi Group, reached a major strategic cooperation with Tangshan Panasonic, a joint venture of industry giant Panasonic, to jointly develop advanced large-model intelligent welding robots. On July 18, the signing ceremony between Tangshan Panasonic Industrial Robot Co., Ltd. ("Tangshan Panasonic") and Beijing Xiaoyu Intelligent Manufacturing Technology Co., Ltd. ("Xiaoyu Intelligent Manufacturing") was successfully held at Tangshan Panasonic's headquarters. Panasonic Industrial Machinery general manager Hashiyama Yuichiro, executive deputy general manager Liu Zheng, Xiaoyu Intelligent Manufacturing founder and CEO Qiao Zhongliang, and co-founder and vice president Li Chuan, among other leaders, attended the ceremony. Both parties expressed their enthusiasm for the cooperation…
- AI 475 2024-07-23 14:50:54
-
- Unlimited video generation, planning, and decision-making: Diffusion Forcing integrates next-token prediction and full-sequence diffusion
- Autoregressive large language models built on the next-token-prediction paradigm have become popular all over the world, while the vast number of synthetic images and videos on the Internet has shown us the power of diffusion models. Recently, a research team at MIT CSAIL (including Chen Boyuan, a PhD student at MIT) successfully combined the strengths of full-sequence diffusion models and next-token models, proposing a new training and sampling paradigm: Diffusion Forcing (DF). Paper title: Diffusion Forcing: Next-token Prediction Meets Full-Sequence Diffusion. Paper address: https:/…
- AI 1164 2024-07-23 14:05:21
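The core idea the item above describes, letting one model span full-sequence diffusion and next-token prediction, rests on giving each token its own noise level during training. The toy sketch below shows only that noise-level assignment step; the function name, the discrete-level scheme, and the three mode labels are illustrative assumptions, not the paper's actual continuous-token formulation.

```python
import random

def sample_noise_levels(seq_len, num_levels, mode):
    """Assign a diffusion noise level to each token of one training sequence.

    'full_sequence': classic diffusion, one shared level for all tokens.
    'next_token': clean prefix, only the final token is noised.
    'diffusion_forcing': each token draws an independent level, which is
    what lets a single model interpolate between the two regimes.
    """
    if mode == "full_sequence":
        k = random.randrange(num_levels)
        return [k] * seq_len
    if mode == "next_token":
        return [0] * (seq_len - 1) + [random.randrange(num_levels)]
    if mode == "diffusion_forcing":
        return [random.randrange(num_levels) for _ in range(seq_len)]
    raise ValueError(f"unknown mode: {mode}")

random.seed(0)
print(sample_noise_levels(5, 10, "diffusion_forcing"))
```

Under this view, next-token prediction and full-sequence diffusion are just two corners of the space of per-token noise schedules that the diffusion-forcing model is trained to handle.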
-
- After "Alibaba Star", Alibaba's Taotian restarts recruitment of top technical talent, with annual salaries starting at one million as standard
- On July 22, Alibaba Taotian Group's "T-Star Program for Top Talents" was officially launched. The program recruits competition, academic, and practical experts from cutting-edge technology fields worldwide, providing these "genius youths" with top technical topics, computing resources, R&D platform resources, starting annual salaries of one million, and room to grow under exclusive mentoring from senior leaders. The reporter learned that the T-Star program is a continuation of the "Alibaba Star" program, which originated in 2011 and aims to attract the youngest top technical talent. In the past, most recruits were PhDs, and vice-president-level…
- AI 900 2024-07-22 21:20:23
-
- ICML 2024 Oral | Is DPO better suited to LLMs than PPO? The latest findings from Tsinghua's Wu Yi team
- Wu Yi is an assistant professor at the Institute for Interdisciplinary Information Sciences at Tsinghua University and was previously a full-time researcher at OpenAI. His research covers reinforcement learning, large model alignment, human-computer interaction, and robot learning. He received his PhD from the University of California, Berkeley in 2019, advised by Stu…
- AI 402 2024-07-22 18:41:23
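For readers unfamiliar with the DPO side of the comparison above: the standard DPO objective (from the original DPO paper, not from the Tsinghua work discussed here) can be sketched for a single preference pair. The function name and the example log-probabilities are illustrative assumptions.

```python
import math

def dpo_loss(logp_chosen, logp_rejected,
             ref_logp_chosen, ref_logp_rejected, beta=0.1):
    """DPO loss for one preference pair.

    logp_* are summed log-probabilities of a response under the policy
    being trained; ref_logp_* are the same quantities under the frozen
    reference model.
    """
    # Implicit rewards: beta-scaled log-ratios against the reference model.
    reward_chosen = beta * (logp_chosen - ref_logp_chosen)
    reward_rejected = beta * (logp_rejected - ref_logp_rejected)
    # Logistic (Bradley-Terry) loss on the reward margin.
    margin = reward_chosen - reward_rejected
    return -math.log(1.0 / (1.0 + math.exp(-margin)))

# Loss shrinks as the policy favors the chosen response more strongly
# than the reference model does, relative to the rejected one.
print(round(dpo_loss(-10.0, -12.0, -11.0, -11.0), 4))
```

Unlike PPO, there is no explicit reward model or on-policy sampling here; the preference data enters the loss directly, which is the design trade-off the paper above examines.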
-
- A new standard for AI imaging: top performance with only 1% of the original data; general medical foundation model published in a Nature sub-journal
- Editor | Cabbage Leaf. Massively pre-trained foundation models have achieved great success in non-medical fields. However, training these models typically requires large, comprehensive datasets, in contrast to the smaller, more specialized datasets common in biomedical imaging. Researchers at the Fraunhofer Institute for Digital Medicine MEVIS in Germany proposed a multi-task learning strategy that decouples the number of training tasks from memory requirements. They trained a universal biomedical pre-trained model (UMedPT) on a multi-task database (including tomography, microscopy, and X-ray images), adopting various labeling strategies such as classification, segmentation, and…
- AI 1061 2024-07-22 17:38:00
-
- ECCV 2024 | To improve GPT-4V and Gemini's performance on detection tasks, you need this prompting paradigm
- The authors of this article are from Zhejiang University, Shanghai Artificial Intelligence Laboratory, the Chinese University of Hong Kong, the University of Sydney, and the University of Oxford. Author list: Wu Yixuan, Wang Yizhou, Tang Shixiang, Wu Wenhao, He Tong, Wanli Ouyang, Philip Torr, Jia…
- AI 605 2024-07-22 17:28:30
-
- KDD 2024 | The University of Hong Kong's Chao Huang team deeply analyzes the 'unknown boundary' of large models in graph machine learning
- The main authors of this article are from the Data Intelligence Lab at the University of Hong Kong. First author Ren Xubin and second author Tang Jiabin are both first-year PhD students at HKU's School of Data Science, supervised by Da…
- AI 1194 2024-07-22 16:54:34
-
- The University of Science and Technology of China and Huawei's Noah's Ark Lab propose the Entropy Law, revealing the relationship between large model performance, data compression ratio, and training loss
- This work was completed by IEEE Fellow Chen Enhong's team at the National Key Laboratory of Cognitive Intelligence, University of Science and Technology of China, together with Huawei's Noah's Ark Lab. Professor Chen Enhong's team is deeply engaged in data mining and machine learning and has published many papers in top journals and conferences. Google Scholar…
- AI 837 2024-07-22 16:39:35
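The Entropy Law described above relates model performance to how compressible the training data is. A minimal way to measure such a compression ratio on raw text is sketched below, using zlib as a stand-in compressor; this is an assumption for illustration, not the authors' actual measurement setup.

```python
import random
import string
import zlib

def compression_ratio(data: bytes) -> float:
    """Compressed size divided by raw size; lower means more redundant,
    more compressible data."""
    return len(zlib.compress(data, 9)) / len(data)

random.seed(0)
# Highly repetitive text compresses far better than random letters.
repetitive = b"the cat sat on the mat. " * 50
noisy = "".join(random.choice(string.ascii_letters) for _ in range(1200)).encode()
print(compression_ratio(repetitive) < compression_ratio(noisy))  # True
```

Intuitively, a very low ratio signals redundant data that carries little new information per byte, which is the kind of quantity the Entropy Law connects to downstream model performance and training loss.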
-
- Weights, code, and datasets all open source, with performance exceeding Mistral-7B: Apple's small model is here
- Are small models the trend? This week OpenAI launched the small model GPT-4o mini, officially kicking off the small-model race, and Apple has now joined it. As one of the research institutions in the DataComp-LM (DCLM) project, Apple released the DCLM-7B open source model on Hugging Face. The model's performance surpasses Mistral-7B and approaches other leading open source models, including Llama 3 and Gemma. Paper link: https://arxiv.org/pdf/2406.11794 Project link: https://huggingface.co/apple/DCLM-7…
- AI 515 2024-07-22 16:18:40