- Which human preference optimization algorithm is best? A guided tour of DPO, IPO, and KTO
- Collecting human labels on the relative quality of model-generated content and fine-tuning large language models to match those preferences via reinforcement learning from human feedback (RLHF) has greatly advanced conversational AI. However, RLHF is a complex and often unstable process, so directly aligning models with human preferences through an optimization objective has become a hot research topic. This article, based on a Hugging Face blog post, compares the performance of three common human preference optimization algorithms. The authors ran extensive experiments evaluating these three reinforcement-learning-free tuning methods (also called preference tuning) across different models and different hyperparameters.
- AI 727 2024-08-05 21:19:22
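The three algorithms differ mainly in the loss they apply to preference data. As a rough illustration only (this is not code from the blog post), a minimal DPO loss for a single chosen/rejected pair might look like the following; IPO swaps the log-sigmoid for a squared term, and KTO drops the pairing requirement entirely:

```python
import math

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Sketch of the DPO loss for one preference pair.

    Arguments are summed log-probabilities of the chosen and rejected
    responses under the trained policy and a frozen reference model.
    beta controls how far the policy may drift from the reference.
    """
    # Implicit reward margin: how much more the policy prefers the
    # chosen response over the rejected one, relative to the reference.
    margin = beta * ((policy_chosen_logp - ref_chosen_logp)
                     - (policy_rejected_logp - ref_rejected_logp))
    # -log sigmoid(margin): small when the margin is large and positive.
    return math.log(1.0 + math.exp(-margin))
```

With a zero margin the loss is log 2 ≈ 0.693; training pushes the margin positive, driving the loss toward zero.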
-
- Xiaohongshu's 11th-anniversary letter admits to "big company disease": bureaucracy, delayed decision-making, and a need to start over
- News from this site on August 2. According to Sanyan Technology, on Xiaohongshu's 11th anniversary, founders Mao Wenchao (alias Xingya) and Qu Fang (alias Mulan) admitted in their anniversary letter that, as the company grew rapidly, so-called "big company disease" also appeared at Xiaohongshu. The letter mentions that in two organizational surveys conducted last year, daily feedback from employees revealed cases that deviated from the company's founding intent and only added organizational overhead: "For example, some employees act very much like officials and do not get involved themselves; when difficulties arise, they only push front-line staff to handle things. Some leaders spend every day parsing their superiors' intentions word by word, while important issues affecting the user experience go ignored."
- It Industry 1045 2024-08-05 21:06:32
-
- 'The best of both worlds': designing molecules from scratch with the deep learning architecture S4 for chemical language modeling
- Editor | KX. Generative deep learning is reshaping drug design. Chemical language models (CLMs), which generate molecules as strings, are particularly important to this process. Recently, researchers from Eindhoven University of Technology in the Netherlands introduced a recent deep learning architecture, the Structured State Space Sequence (S4) model, into de novo drug design. S4 excels at learning the global properties of a sequence. So, can S4 advance chemical language modeling for design from scratch? To answer this question, the researchers systematically compared S4 with state-of-the-art CLMs on a range of drug discovery tasks.
- AI 923 2024-08-05 20:58:22
-
- How to choose a compression and quantization scheme for large models? A comprehensive evaluation of Wuwen Xinqiong's Qllm-Eval: multi-model, multi-parameter, multi-dimensional
- Large language models based on the Transformer architecture show excellent performance across benchmarks, but parameter counts in the tens of billions, hundreds of billions, or even trillions bring high serving costs. For example, GPT-3 has 175 billion parameters; stored in FP16, the model is about 350 GB, while even Nvidia's latest B200 GPU has only 192 GB of memory, let alone other GPUs and edge devices. Large-model compression "slims down" a large model so that it fits resource-constrained scenarios, reducing storage, memory-access, and compute overhead. The goal is to improve inference throughput while losing as little model quality as possible, so that large models can run on IoT edge devices, embedded robots, offline mobile applications, and so on.
- AI 603 2024-08-05 20:56:12
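The article's storage numbers follow directly from data-type sizes: 175B parameters × 2 bytes (FP16) ≈ 350 GB, and INT8 would halve that. A minimal sketch of symmetric per-tensor INT8 quantization, as an illustration of the general idea rather than Qllm-Eval's own code:

```python
import numpy as np

def quantize_int8(w):
    """Symmetric per-tensor INT8 quantization of a weight array."""
    scale = np.abs(w).max() / 127.0          # map the largest weight to 127
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Recover an FP32 approximation of the original weights."""
    return q.astype(np.float32) * scale
```

Each weight then takes 1 byte instead of 2, at the cost of a rounding error of at most half the scale per weight.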
-
- Does fine-tuning large models have to rely on human data? DeepMind: self-training with feedback is better
- Faced with the common practice of fine-tuning large models mainly on human-generated data, Google DeepMind has explored a more efficient way to reduce this dependence. As we can see, large language models (LLMs) are changing the deep learning landscape, demonstrating superior capabilities in generating human-quality text and solving a variety of language tasks. While the industry has further improved performance on specific tasks through supervised fine-tuning on human-collected data, obtaining high-quality human data faces significant bottlenecks, especially for tasks that involve solving complex problems and require substantial resources and expertise. How to solve this? Synthetic data generated by models is a potential alternative that can be scalable and cost-effective, as long as data quality is maintained.
- AI 890 2024-08-05 20:48:40
-
- Integrating new capabilities, resonating with computing power: Bose Quantum releases a new-generation coherent optical quantum computer with 550 computational qubits
- On April 18, 2024, Beijing Bose Quantum Technology Co., Ltd. ("Bose Quantum") held its 2024 product launch in Wangjing, Beijing, under the theme "Integration of new qualities and resonance of computing power". Bose Quantum released core research results including its new-generation coherent optical quantum computer with 550 computational qubits, "Tiangong Quantum Brain 550W", and the Kaiwu SDK, fully demonstrating the integration of quantum computing and AI as a starting point for practical quantum computing. In 2024, quantum technology is an important part of the development of future industries and new productive forces, and Beijing's future industrial layout explicitly includes it.
- AI 994 2024-08-05 20:43:00
-
- Apple teaches large models to be lazy: spit out the first token faster while maintaining accuracy
- Being lazy can make you work better. Llama 3.1 has just been released; have you tried it yet? Even on the latest top-spec personal computer, running the smallest 8B version can still cause significant delays. To improve model inference efficiency, researchers have come up with a variety of methods, but many of them sacrifice some accuracy. Recently, a research team from Apple and Meta AI proposed a new method that can more than double the inference speed of the Llama 2 pre-filling stage while keeping accuracy from dropping significantly, which may offer some inspiration for accelerating Llama 3.1. They call this approach LazyLLM, short for Lazy Large Language Model.
- AI 562 2024-08-05 20:41:02
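The reported trick is dynamic token pruning during prefill: prompt tokens that receive little attention from the current position are deferred, and can be revived later if they become relevant. A toy sketch of the selection step (my own illustration with a hypothetical `prune_tokens` helper, not Apple's code):

```python
import numpy as np

def prune_tokens(hidden, attn_to_last, keep_ratio=0.5):
    """Keep only the prompt tokens most attended to by the last token.

    hidden:       (seq_len, dim) hidden states entering a layer
    attn_to_last: (seq_len,) attention weights from the final position
    Returns the pruned hidden states plus the kept indices, so that
    deferred tokens can be revived at a later step if needed.
    """
    seq_len = hidden.shape[0]
    k = max(1, int(seq_len * keep_ratio))
    keep = np.sort(np.argsort(attn_to_last)[-k:])  # top-k, original order
    return hidden[keep], keep
```

Because later layers process fewer tokens, the time to the first generated token shrinks, which is exactly the prefill latency the article is about.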
-
- Technology Last Night and This Morning, 0805: the world's first 18650 potassium battery launches; a Lalamove driver is reported for refusing to transport a body; the Avita 11/12 extended-range versions launch in September
- It's "Technology Last Night and This Morning" time. Hello everyone, it is Monday, August 5, 2024. Today's important technology news: the world's first 18650 potassium-ion battery has been released and could replace lithium batteries. Group1 announced the launch of the world's first 18650 cylindrical-case potassium-ion battery, a breakthrough that promises a sustainable and cost-effective alternative to traditional lithium-ion batteries. >>View details. The Thai Prime Minister has ordered an investigation into Temu, a subsidiary of Pinduoduo, over whether it complies with the law and pays the required taxes. According to Thaibsworld, Thai Prime Minister Srettha Thavisin ordered the Ministry of Digital Economy and Society, the tax authorities, and the police to investigate Pinduoduo's e-commerce companies.
- It Industry 1107 2024-08-05 20:38:50
-
- Nature sub-journal: a Transformer-based inverse protein sequence design method that is 10 times faster
- Editor | Radish Skin. Protein design and engineering are advancing at an unprecedented pace thanks to deep learning. However, current models cannot naturally account for non-protein entities during the design process. Here, researchers at the École Polytechnique Fédérale de Lausanne (EPFL) in Switzerland propose a deep learning method, based entirely on a geometric transformer over atomic coordinates and element names, that predicts protein sequences from backbone scaffolds under constraints imposed by different molecular environments. Using this method, the researchers produced highly thermostable, catalytically active enzymes with a high success rate, which is expected to increase the versatility of protein design pipelines toward desired functions.
- AI 934 2024-08-05 20:33:31
-
- Mass production of Apple's first foldable device hits a snag; Jeff Pu says a 2025 or 2026 launch will be difficult
- According to news on August 3, MacRumors recently obtained a new investor report from Jeff Pu, an analyst at Haitong International Securities, which says the mass-production plan for Apple's first foldable device has been "delayed" and may not reach mass production in 2025 or 2026 as expected. Previously, in a May report, Jeff Pu had predicted that Apple's first foldable device would enter mass production in 2025 or 2026. He also predicted at the time that Apple might first launch a large-screen foldable iPad or MacBook, and later a foldable-screen iPhone with greater market potential. That series of predictions attracted widespread attention from the market and consumers; the latest report, however, changes the picture.
- It Industry 387 2024-08-05 20:32:02
-
- A Transformer author returns to Google as Character.AI's founding team is 'acquired': Google wants the people, not the company
- Will AI startups all end up inside big companies? Overnight, the battle-royale field of generative AI has shrunk again. On Friday, startup Character.AI announced it has signed an agreement with Google granting Google a non-exclusive license to Character.AI's large language model (LLM) technology. Google also announced the rehiring of Noam Shazeer and Daniel De Freitas. Noam Shazeer is the founder and CEO of Character.AI and one of the authors of the Transformer paper; he once served as chief software engineer at Google. Daniel De Freitas is the president of Character.AI and was a senior engineer at Google.
- AI 820 2024-08-05 20:17:10
-
- The high-definition video isn't real: 3D scenes rendered from a few photos make it hard to tell real from fake
- Note that the animation above is entirely a 3D scene rendered from multiple photos; it is difficult for humans to spot its flaws. So let's look at how this is achieved. Meshes and points are the most common 3D scene representations because they are explicit and well suited to fast GPU/CUDA-based rasterization. In contrast, state-of-the-art Neural Radiance Field (NeRF) methods build on continuous scene representations, typically synthesizing novel views of a captured scene with multi-layer perceptrons (MLPs) optimized via volumetric ray rendering. While the continuity of these methods helps optimization, the stochastic sampling required for rendering is expensive and noisy. Researchers from Université Côte d'Azur have introduced a new method that combines the best of both approaches.
- AI 573 2024-08-05 20:15:51
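Despite their differences, both families of methods blend samples along each viewing ray with the same front-to-back alpha compositing rule, w_i = α_i · Π_{j&lt;i}(1 − α_j). A minimal sketch of that shared step (illustrative only, not the paper's code):

```python
def composite_weights(alphas):
    """Front-to-back compositing weights for samples along one ray.

    Each sample's weight is its opacity times the transmittance,
    i.e. the light remaining after all samples in front of it.
    """
    weights, transmittance = [], 1.0
    for a in alphas:
        weights.append(a * transmittance)
        transmittance *= (1.0 - a)
    return weights
```

The weights plus any leftover transmittance always sum to 1, so the blended color is a convex combination of the samples; the methods differ mainly in how those samples are produced (random ray samples vs. rasterized primitives).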
-
- Why is the late interaction model the standard for the next generation of RAG?
- AIxiv is a column where this site publishes academic and technical content. Over the past few years, it has carried more than 2,000 reports covering top laboratories from major universities and companies around the world, effectively promoting academic exchange and dissemination. If you have excellent work to share, please contribute or contact us: liyazhou@jiqizhixin.com; zhaoyunfeng@jiqizhixin.com. Zhang Yingfeng: co-founder of InfiniFlow, with many years of experience in search, AI, and infrastructure development, currently building next-generation RAG core products. In developing a RAG system, a good reranker model is an indispensable link.
- AI 1115 2024-08-05 20:15:22
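"Late interaction" refers to the ColBERT-style scoring scheme in which query and document tokens are embedded independently and only interact at ranking time: each query token takes its maximum similarity over the document's tokens, and those maxima are summed (MaxSim). A minimal sketch (illustrative, not the article's code):

```python
import numpy as np

def late_interaction_score(query_vecs, doc_vecs):
    """ColBERT-style MaxSim: embed tokens independently, interact late.

    query_vecs: (num_query_tokens, dim) token embeddings
    doc_vecs:   (num_doc_tokens, dim) token embeddings
    """
    sim = query_vecs @ doc_vecs.T        # all pairwise similarities
    return float(sim.max(axis=1).sum())  # best doc match per query token
```

Because document token embeddings can be precomputed and indexed, this retains much of a cross-encoder reranker's quality at a far lower query-time cost, which is why it is attractive for RAG reranking.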
-
- ECCV 2024 | Harvard team develops FairDomain to achieve fairness in cross-domain medical image segmentation and classification
- Editor | ScienceAI. Author | Yu Tian's team. In artificial intelligence (AI), and especially medical AI, addressing fairness is crucial to ensuring equitable medical outcomes. Recent efforts to enhance fairness have introduced new methods and datasets. However, fairness has been little explored in the context of domain shift, even though clinics often rely on different imaging technologies (e.g., different retinal imaging modalities) for patient diagnosis. This paper proposes FairDomain, the first systematic study of algorithmic fairness under domain shift. The authors test state-of-the-art domain adaptation (DA) and domain generalization (DG) algorithms on medical image segmentation and classification tasks, aiming to understand how bias transfers between different domains.
- AI 1193 2024-08-05 20:04:36
-
- From now on, more than 100 million developers on GitHub can directly access the world's top large models to build AI applications
- The new GitHub Models feature launched by GitHub is expected to accelerate the arrival of the AI-engineer era. What? The familiar code-hosting platform GitHub has evolved again: the platform now also provides a playground for large AI models. All the popular large models you can name, including Microsoft's Phi-3, OpenAI's GPT-4o, Meta's Llama 3.1, Cohere's Command R+, and Mistral AI's Mistral Large, can be tried in an interactive sandbox. In the coming months, GitHub will add more language, vision, and other types of models.
- AI 1086 2024-08-05 19:36:38