- Falcon returns after a year! 11 billion parameters and 5.5 trillion tokens, performance surpassing Llama 3
- In the past few days, the world's attention has largely been captured by OpenAI's release of GPT-4o. At the same time, OpenAI's challengers are making history of their own. On May 14, the Technology Innovation Institute (TII) under the Abu Dhabi Advanced Technology Research Council (ATRC) released the new-generation Falcon 2 models. Falcon 2 11B is already open for access, and Falcon 2 11B VLM will open soon. The new generation of "Falcon" returned to the arena at 12 noon on May 14 and, once launched, quickly topped the HN front page. Last year, Falcon caused a stir when it first launched, surpassing Llama by a wide margin. According to Hugging Face
- AI 1102 2024-06-09 17:25:31
-
- OpenAI CEO responds to the 'hush agreement': the dispute again comes down to equity, and Altman says 'it's my fault'
- Since the resignations of Ilya Sutskever and superalignment head Jan Leike, OpenAI has remained in turmoil: more and more people have resigned, which has also surfaced more conflicts. Yesterday, the focus of controversy turned to a strict "hush agreement". Kelsey Piper broke the news that employees' onboarding documents include the instruction: "Within sixty days of leaving the company, you must sign a separation document containing a 'general release.' If you do not complete it within 60 days, your equity benefits will be cancelled." The screenshot of this document caused controversy and prompted OpenAI's CEO to respond quickly: we have never clawed back anyone's vested equity. If people do not sign the separation agreement (or do not agree to the non-disparagement agreement),
- AI 882 2024-06-09 17:07:32
-
- Tsinghua University, Huawei and others proposed iVideoGPT: specializing in interactive world models
- iVideoGPT meets the high-interactivity needs of world models. Generative models have made significant progress in recent years, with video generation becoming a new frontier. An important application of these generative video models is to learn, in an unsupervised manner, on diverse Internet-scale data in order to build predictive world models. Such world models are expected to accumulate common-sense knowledge about how the world works, allowing predictions of potential future outcomes conditioned on the behavior of agents. By leveraging these world models, reinforcement-learning agents can imagine, reason, and plan inside the world model, and thereby acquire new skills more safely and efficiently in the real world with only a small amount of real experimentation. Although generative models are fundamentally related to world models, they are
- AI 879 2024-06-09 17:06:01
-
- New work by Bengio et al.: attention can be regarded as an RNN; the new model is comparable to the Transformer but far more memory-efficient.
- Advances in sequence modeling have been extremely impactful because they play an important role in a wide range of applications, including reinforcement learning (e.g., robotics and autonomous driving) and time-series classification (e.g., financial fraud detection and medical diagnosis). In the past few years, the emergence of the Transformer marked a major breakthrough in sequence modeling, mainly because the Transformer provides a high-performance architecture that can exploit GPU parallelism. However, the Transformer has high computational overhead at inference time, mainly due to memory and compute requirements that grow quadratically with sequence length, which limits its use in low-resource environments (e.g., mobile and embedded devices). Although techniques such as KV caching can improve inference efficiency,
- AI 608 2024-06-09 16:50:32
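- The observation in the entry above, that softmax attention can be evaluated as a recurrence, can be illustrated with a small sketch. This is not the paper's model, only a minimal NumPy illustration (shapes and names are my own assumptions) of how attention for a single query can be computed token by token with constant memory, matching the usual quadratic-memory formulation:

```python
import numpy as np

def attention_as_recurrence(q, keys, values):
    """Softmax attention for one query, computed by scanning tokens one at a
    time while keeping only O(1) state (running max, normalizer, weighted sum)."""
    m = -np.inf                                   # running max of scores (stability)
    denom = 0.0                                   # running softmax normalizer
    num = np.zeros_like(values[0], dtype=float)   # running weighted sum of values
    for k, v in zip(keys, values):
        s = float(q @ k)                          # score for this token
        m_new = max(m, s)
        scale = np.exp(m - m_new) if np.isfinite(m) else 0.0
        denom = denom * scale + np.exp(s - m_new)
        num = num * scale + np.exp(s - m_new) * v
        m = m_new
    return num / denom

# Sanity check against the standard formulation softmax(qK^T)V.
rng = np.random.default_rng(0)
q, K, V = rng.normal(size=8), rng.normal(size=(16, 8)), rng.normal(size=(16, 4))
w = np.exp(K @ q - np.max(K @ q)); w /= w.sum()
assert np.allclose(attention_as_recurrence(q, K, V), w @ V)
```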
-
- Holding gauze and grasping needles, NVIDIA cooperates with many universities to develop surgical robots
- NVIDIA teamed up with researchers from the University of Toronto, the University of California, Berkeley, ETH Zurich, and the Georgia Institute of Technology to develop ORBIT-Surgical, a simulation framework for training robots that can improve a team's technical skills while reducing surgeons' cognitive load. ORBIT-Surgical is an AI-based simulation framework that achieves highly realistic surgical simulation through a virtual surgical environment and an intelligent coaching system. Doctors can interact with the system to simulate the varied situations and complications of real surgery. This simulation technology can not only help with training for laparoscopic surgery (also
- AI 494 2024-06-09 13:23:16
-
- Using CLIP as an RNN gets into CVPR: it can segment countless concepts without training | Oxford University & Google Research
- Calling CLIP recurrently can effectively segment countless concepts without additional training: any phrase, including movie characters, landmarks, brands, and general categories. This new result from a joint Oxford University and Google Research team has been accepted at CVPR 2024, and the code has been open-sourced. The team proposes a new technique called CLIP as RNN (CaR for short), which addresses several key problems in open-vocabulary image segmentation. No training data required: traditional methods need large numbers of mask annotations or image-text datasets for fine-tuning, whereas CaR works without any additional training data. Open-vocabulary limitations: pre-trained vision-language models (VLMs) have limited ability to handle open vocabularies after fine-tuning. C
- AI 375 2024-06-09 12:53:28
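- To make the recurrence described above concrete, here is a minimal, hypothetical sketch of a CaR-style loop. The functions `propose_mask` and `clip_score`, the threshold, and the iteration cap are placeholders introduced for illustration and are not the authors' API; the point is only the structure: propose a mask per text query, score each masked region with a frozen CLIP, drop weak queries, and feed the reduced query set back in until it stops changing.

```python
def car_segment(image, text_queries, propose_mask, clip_score,
                threshold=0.5, max_iters=10):
    """CaR-style recurrent filtering loop (illustrative pseudostructure only).
    `propose_mask(image, query, all_queries)` and `clip_score(image, mask, query)`
    are stand-ins for a mask-proposal step and a CLIP image-text similarity."""
    queries = list(text_queries)
    masks = {}
    for _ in range(max_iters):
        masks = {q: propose_mask(image, q, queries) for q in queries}
        kept = [q for q in queries
                if clip_score(image, masks[q], q) >= threshold]
        if kept == queries:              # fixed point: query set is stable
            break
        queries = kept                   # the "RNN" step: recurse on the survivors
    return {q: masks[q] for q in queries}
```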
-
- Supports synthesis of one-minute high-definition videos: Huake et al. propose UniAnimate, a new framework for human dance video generation.
- The AIxiv column is a column where this site publishes academic and technical content. In the past few years, the AIxiv column has received more than 2,000 reports, covering top laboratories from major universities and companies around the world, effectively promoting academic exchange and dissemination. If you have excellent work that you would like to share, please feel free to contribute or contact us for coverage. Submission email: liyazhou@jiqizhixin.com; zhaoyunfeng@jiqizhixin.com. Human dance video generation is a compelling and challenging controllable video synthesis task that aims to generate high-quality, lifelike continuous video from an input reference image and a target pose sequence. With the rapid development of video generation technology, especially the iterative evolution of generative models,
- AI 976 2024-06-09 11:10:58
-
- A smoother control algorithm than PID | Carnegie Mellon University
- The MPC control algorithm, short for Model Predictive Control, is a control technique based on a dynamic model of the system. It works by predicting the system's future behavior with a mathematical model and optimizing the control inputs based on those predictions to achieve the desired output. The core idea of MPC is that, in each control cycle, it optimizes the predicted behavior over a finite future horizon to obtain the best control input, applies only the first input of that sequence, and repeats the optimization at the next step. MPC is widely used and is especially suitable for control problems that must satisfy constraints. By combining system models and optimization techniques, MP
- AI 767 2024-06-09 09:57:28
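- As a concrete illustration of the receding-horizon idea summarized above (not the material the article itself refers to), here is a minimal unconstrained linear MPC sketch in NumPy. The double-integrator model, cost weights, and horizon are assumptions chosen purely for the example:

```python
import numpy as np

# Discrete double-integrator plant (position, velocity), 0.1 s time step.
dt = 0.1
A = np.array([[1.0, dt], [0.0, 1.0]])
B = np.array([[0.5 * dt**2], [dt]])
Q = np.diag([10.0, 1.0])      # weight on state tracking error
R = np.array([[0.1]])         # weight on control effort
N = 20                        # prediction horizon (steps)

def mpc_step(x0, x_ref):
    """One receding-horizon step: predict N steps ahead, solve the
    unconstrained finite-horizon QP in closed form, return the first input."""
    nx, nu = A.shape[0], B.shape[1]
    # Stacked prediction over the horizon: X = Sx @ x0 + Su @ U.
    Sx = np.vstack([np.linalg.matrix_power(A, k + 1) for k in range(N)])
    Su = np.zeros((N * nx, N * nu))
    for i in range(N):
        for j in range(i + 1):
            Su[i*nx:(i+1)*nx, j*nu:(j+1)*nu] = np.linalg.matrix_power(A, i - j) @ B
    Qbar, Rbar = np.kron(np.eye(N), Q), np.kron(np.eye(N), R)
    Xref = np.tile(x_ref, N)
    # Minimize (Sx x0 + Su U - Xref)' Qbar (.) + U' Rbar U in closed form.
    H = Su.T @ Qbar @ Su + Rbar
    f = Su.T @ Qbar @ (Sx @ x0 - Xref)
    U = np.linalg.solve(H, -f)
    return U[:nu]             # apply only the first control input, then re-plan

# Closed-loop simulation: drive the state from rest at 0 to position 1.
x, target = np.array([0.0, 0.0]), np.array([1.0, 0.0])
for _ in range(100):
    x = A @ x + B @ mpc_step(x, target)
print(np.round(x, 3))         # should end up close to [1, 0]
```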
-
- The model-wrapping scandal angers the director of the Stanford AI Lab! Two members of the plagiarizing team take the blame while a third disappears and his past record is exposed. Netizens: time to re-evaluate Chinese open-source models
- There is a follow-up to the incident of the Stanford team plagiarizing a Tsinghua-affiliated large model: the Llama3-V team has admitted the plagiarism, and the two Stanford undergraduates have even cut ties with the third author. The latest apology tweets were posted by Siddharth Sharma and Aksh Garg. Not among them is Mustafa Aljadery ("Lao Mu" for short) of the University of Southern California, who is accused of being the main party at fault and who has been missing since yesterday: "We had hoped Lao Mu would make the first statement, but we have been unable to contact him since yesterday. Siddharth, myself (Aksh) and Lao Mu released Llama3-V together, and Lao Mu wrote the code for the project. Siddharth's and my role was to help him promote it on Medium and T
- AI 1161 2024-06-09 09:38:08
-
- Upstaged by OpenAI again, Google launches an open-source vision-language model: PaliGemma
- Foreword: this model combines the SigLIP vision model with the Gemma language model. Both are open components, which makes PaliGemma excel at tasks that combine vision and language. PaliGemma's use cases include image captioning, image tagging, and visual question answering. These applications exploit PaliGemma's ability to understand image content and extract key features, then turn that information into language output for interaction with users or automated content generation. This flexibility makes PaliGemma suitable not only for research and development environments but also for commercial applications such as customer service and content recommendation systems. What can PaliGemma do? Images can be used when prompting.
- AI 550 2024-06-09 09:17:06
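- For a rough sense of how such a vision-language model is typically used, here is a minimal inference sketch assuming the Hugging Face transformers integration and the publicly listed google/paligemma-3b-mix-224 checkpoint (gated; the license must be accepted on the Hub first). The image URL and prompt are placeholders, not examples from the article:

```python
import requests
import torch
from PIL import Image
from transformers import AutoProcessor, PaliGemmaForConditionalGeneration

model_id = "google/paligemma-3b-mix-224"          # gated checkpoint on the Hub
model = PaliGemmaForConditionalGeneration.from_pretrained(model_id)
processor = AutoProcessor.from_pretrained(model_id)

url = "https://example.com/cat.jpg"               # placeholder image URL
image = Image.open(requests.get(url, stream=True).raw)
prompt = "answer en What is in this image?"       # VQA-style task prefix

inputs = processor(text=prompt, images=image, return_tensors="pt")
with torch.no_grad():
    output = model.generate(**inputs, max_new_tokens=20)

# Decode only the newly generated tokens (the answer), not the prompt.
prompt_len = inputs["input_ids"].shape[-1]
print(processor.decode(output[0][prompt_len:], skip_special_tokens=True))
```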
-
- LightGBM in practice + random-search tuning: 96.67% accuracy
- Hello everyone, I am Peter~ LightGBM is a classic machine learning algorithm; its background, principles, and characteristics are well worth studying. The LightGBM algorithm offers high efficiency, scalability, and high accuracy. This article briefly introduces the characteristics and principles of LightGBM along with a case study based on LightGBM and random-search optimization. The LightGBM algorithm: in machine learning, gradient boosting machines (GBMs) are a class of powerful ensemble learning algorithms that build a strong model by gradually adding weak learners (usually decision trees) to minimize prediction error. GBMs are often used to minimize pre-
- AI 706 2024-06-08 22:45:30
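- A minimal sketch of the workflow named in the title: LightGBM's scikit-learn wrapper tuned with scikit-learn's RandomizedSearchCV. The article does not specify its dataset or search space, so the iris data, parameter ranges, and split below are assumptions for illustration; the resulting accuracy will not necessarily match the 96.67% figure.

```python
from lightgbm import LGBMClassifier
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score
from sklearn.model_selection import RandomizedSearchCV, train_test_split

# Small demo dataset and a held-out test split (assumed, not from the article).
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)

# Randomly sample hyperparameter combinations instead of an exhaustive grid.
param_distributions = {
    "num_leaves": [15, 31, 63],
    "learning_rate": [0.01, 0.05, 0.1, 0.2],
    "n_estimators": [50, 100, 200],
    "min_child_samples": [5, 10, 20],
}
search = RandomizedSearchCV(
    estimator=LGBMClassifier(random_state=42),
    param_distributions=param_distributions,
    n_iter=20, cv=5, scoring="accuracy", random_state=42)
search.fit(X_train, y_train)

print("best params:", search.best_params_)
y_pred = search.best_estimator_.predict(X_test)
print("test accuracy:", accuracy_score(y_test, y_pred))
```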
-
- Mistral's open-source code model takes the throne! Codestral is trained on over 80 languages, and domestic Tongyi developers are asking to get involved!
- Produced by 51CTO Technology Stack (WeChat ID: blog51cto). Mistral has released its first code model, Codestral-22B! What's crazy about this model is not only that it is trained on over 80 programming languages, including ones like Swift that many code models ignore. The speeds are also completely different: asked to write a "publish/subscribe" system in Go, GPT-4o is still streaming its output while Codestral has already handed in its answer, so fast it's hard to follow! Since the model has only just launched, it has not yet been publicly benchmarked, but according to the people in charge at Mistral, Codestral is currently the best-performing open-source code model. Interested friends can head over to Hugging Face: https
- AI 1113 2024-06-08 21:55:01
-
- Towards 'Closed Loop' | PlanAgent: New SOTA for closed-loop planning of autonomous driving based on MLLM!
- The deep reinforcement learning team at the Institute of Automation, Chinese Academy of Sciences, together with Li Auto and others, has proposed PlanAgent, a new closed-loop planning framework for autonomous driving based on a multimodal large language model (MLLM). The method takes a bird's-eye view of the scene and graph-based text prompts as input, and uses the multimodal understanding and common-sense reasoning capabilities of the MLLM to reason hierarchically, from scene understanding to the generation of lateral and longitudinal motion instructions, and further produces the commands required by the planner. The method is evaluated on the large-scale and challenging nuPlan benchmark, and experiments show that PlanAgent achieves state-of-the-art (SOTA) performance on both common and long-tail scenarios. Compared with conventional large language model (LLM) methods, PlanAgent
- AI 314 2024-06-08 21:30:27
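- The hierarchical flow described above can be sketched loosely in code. This is a hypothetical pseudostructure, not the authors' implementation: `mllm`, the prompt wording, and the command format are placeholders introduced only to show the BEV-plus-text input and the scene-understanding-to-motion-instruction hierarchy.

```python
from dataclasses import dataclass

@dataclass
class PlannerCommand:
    lateral: str        # e.g. "keep lane" or "change lane left" (assumed format)
    longitudinal: str   # e.g. "maintain speed" or "decelerate" (assumed format)

def plan_step(bev_image, scene_graph_text, mllm):
    """One closed-loop planning step: BEV image + graph-based text prompt in,
    hierarchical reasoning, structured command for the downstream planner out."""
    # Level 1: scene understanding from the multimodal input.
    scene = mllm(image=bev_image,
                 prompt=f"Describe the driving scene.\n{scene_graph_text}")
    # Level 2: motion instructions conditioned on the scene description.
    lateral = mllm(image=bev_image,
                   prompt=f"{scene}\nChoose a lateral maneuver.")
    longitudinal = mllm(image=bev_image,
                        prompt=f"{scene}\nChoose a longitudinal maneuver.")
    # Level 3: hand a structured command to the planner; re-plan next cycle.
    return PlannerCommand(lateral=lateral, longitudinal=longitudinal)
```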
-
- Modularly rebuilding LLaVA: to swap a component, just add 1-2 files. The open-source TinyLLaVA Factory is here.
- The TinyLLaVA+ project was jointly created by Professor Wu Ji's team from the Multimedia Signal and Intelligent Information Processing Laboratory (MSIIP) of the Department of Electronic Engineering, Tsinghua University, and Professor Huang Lei's team from the School of Artificial Intelligence, Beihang University. Tsinghua's MSIIP laboratory has long focused on research fields such as intelligent healthcare, natural language processing and knowledge discovery, and multimodality. The Beihang team has long focused on research fields such as deep learning, multimodality, and computer vision. The goal of the TinyLLaVA+ project is to develop a small cross-language intelligent assistant with multimodal capabilities such as language understanding, question answering, and dialogue. The project team will play to their respective strengths, jointly tackle technical problems, and realize the design and development of intelligent assistants. This will provide opportunities for intelligent healthcare, natural language processing and knowledge
- AI 421 2024-06-08 21:21:29
-
- Is the U.S. far behind in robot applications? After 15 years, ten top universities restart the 'National Robotics Roadmap'
- Robotics technology has a 70-year history, and the United States led the way from its inception. But by 2009, when the United States released its National Robotics Roadmap for the first time, U.S. adoption in industrial applications (such as automobiles, aerospace, and home appliances) had dropped to fourth place in the world. In the 15 years since, U.S. adoption of robotics technology has continued to grow, yet it now ranks only tenth in the world, and the Asian robot market has expanded to 5-10 times the size of the U.S. market. China is "far ahead" in this field: in 2023, China purchased approximately 52% of the robots sold. This suggests that robotics is no longer a national priority in the United States. Most recently, from the University of California, Pennsylvania
- AI 1061 2024-06-08 20:57:00