  • Which human preference optimization algorithm is better? Follow the master to understand DPO, IPO and KTO
    Collecting human labels on the relative quality of model-generated content, and then fine-tuning unsupervised large language models to match those preferences through reinforcement learning from human feedback (RLHF), has greatly advanced conversational AI. However, because RLHF is a complex and often unstable process, research into aligning model outputs with human preferences by directly optimizing a loss function has become a hot topic. This article is a Hugging Face blog post that compares the performance of three common human preference optimization algorithms. The authors ran extensive experiments, across different models and hyperparameters, to evaluate three feasible methods for tuning language models without reinforcement learning (that is, preference tuning). this
    AI 727 2024-08-05 21:19:22
  • Xiaohongshu's 11th anniversary letter admits to "big company disease": heavy bureaucracy, slow decision-making, and the need to start over
    This site reported on August 2 that, according to Sanyan Technology, Xiaohongshu's founders Mao Wenchao (nickname: Xingya) and Qu Fang (nickname: Mulan) admitted in the company's 11th anniversary letter that, as the company has grown rapidly, so-called "big company disease" has appeared at Xiaohongshu too. The letter says that in the two organizational surveys conducted among Xiaohongshu staff last year, and in day-to-day feedback from colleagues, the founders saw bad cases that strayed from the company's founding intentions and only added organizational overhead. "For example, some colleagues carry strong bureaucratic airs and never get hands-on; when difficulties arise, they only push front-line colleagues to deal with them. Some leaders spend every day parsing their superiors' intentions word by word while turning a blind eye to important problems that already affect the user experience." "There are still some responsible persons who do not
    IT Industry 1045 2024-08-05 21:06:32
  • 'The best of both worlds', designing molecules from scratch, deep learning architecture S4 for chemical language modeling
    Editor | KX Generative deep learning is reshaping drug design. Chemical language models (CLMs), which generate molecules in the form of molecular strings, are particularly important to this process. Recently, researchers from Eindhoven University of Technology in the Netherlands introduced a recent deep learning architecture, the Structured State Space Sequence (S4) model, into de novo drug design. S4 performs very well at learning the global properties of sequences. So can S4 advance chemical language modeling for de novo design? To answer this question, the researchers systematically compared S4 with state-of-the-art CLMs on a range of drug discovery tasks.
    AI 923 2024-08-05 20:58:22
  • How to choose a compression and quantization scheme for large models? Qllm-Eval, a comprehensive evaluation of quantization schemes from Infinigence AI (Wuwen Xinqiong): multi-model, multi-parameter, multi-dimensional
    Large language models based on the Transformer architecture perform very well on a range of benchmarks, but parameter counts in the tens of billions, hundreds of billions, or even trillions bring high serving costs. For example, GPT-3 has 175 billion parameters; stored in FP16 (2 bytes per parameter), the model takes about 350GB, while even Nvidia's latest B200 GPU has only 192GB of memory, to say nothing of other GPUs and edge devices. Large model compression "slims down" large models so that they fit into resource-constrained scenarios, reducing model storage, memory access, and compute overhead. The goal is to raise inference throughput while losing as little model performance as possible, so that large models can run on IoT edge devices, embedded robots, offline mobile applications, and similar targets.
    AI 603 2024-08-05 20:56:12
  • Does fine-tuning large models have to rely on human data? DeepMind: Self-training with feedback is better
    Fine-tuning large models today relies mainly on human-generated data, and Google DeepMind has explored a more efficient approach that reduces this dependence. As we can all see, large language models (LLMs) are changing the deep learning landscape, demonstrating strong capabilities in generating human-quality text and solving a variety of language tasks. While the industry has further improved performance on specific tasks through supervised fine-tuning on human-collected data, obtaining high-quality human data is a significant bottleneck. This is especially true for tasks that involve solving complex problems, which demand significant resources and expertise. How can this be solved? Model-generated synthetic data is a potential alternative that can be scalable and cost-effective, as long as data quality is maintained.
    AI 890 2024-08-05 20:48:40
  • Integration of new qualities and resonance of computing power: Bose Quantum releases a new generation of 550 computational qubit coherent optical quantum computer
    On April 18, 2024, Beijing Bose Quantum Technology Co., Ltd. (hereinafter "Bose Quantum") held its 2024 new product launch event in Wangjing, Beijing, under the theme "Integration of new qualities and resonance of computing power". The company unveiled core research results including "Tiangong Quantum Brain 550W", a new-generation coherent optical quantum computer with 550 computational qubits, and the Kaiwu SDK, which demonstrate the integration of quantum computing and AI and mark a starting point for practical quantum computing. In 2024, quantum technology will be an important part of the development of future industries and new productive forces. Beijing's future industrial layout clearly proposes
    AI 994 2024-08-05 20:43:00
  • Apple lets large models learn to be lazy: spit out the first token faster and maintain accuracy
    Being lazy can help you work better. Llama 3.1 has just been released; have you tried it yet? Even on a top-spec personal computer, running the smallest 8B version may still incur noticeable latency. To improve model inference efficiency, researchers have come up with a variety of methods, but many of them sacrifice some accuracy. Recently, a research team from Apple and Meta AI proposed a new method that speeds up the pre-filling stage of Llama 2 by more than 2x without a significant drop in accuracy, which may offer some inspiration for accelerating Llama 3.1. They call this approach LazyLLM, which stands for Lazy Large Language Model. Paper title: LazyL
    AI 562 2024-08-05 20:41:02
  • Technology Last Night and This Morning 0805: The world's first 18650 potassium-ion battery is launched; Lalamove driver draws a complaint for refusing to transport a body; Avita 11/12 extended-range versions to launch in September
    Welcome to "Technology Last Night and This Morning". Hello everyone, it is Monday, August 5, 2024. Today's key science and technology news: The world's first 18650 potassium-ion battery is released as a possible replacement for lithium batteries. Group1 announced the launch of the world's first 18650 cylindrical potassium-ion battery, a breakthrough that promises a sustainable and cost-effective alternative to traditional lithium-ion batteries. The Thai Prime Minister orders an investigation into Temu, a subsidiary of Pinduoduo, over whether it complies with the law and pays the required taxes. According to Thaibsworld, Thai Prime Minister Srettha Thavisin has ordered the Ministry of Digital Economy and Society, the Taxation Bureau, and the police to investigate Pinduoduo's e-commerce companies
    IT Industry 1107 2024-08-05 20:38:50
  • Nature sub-journal: 10 times faster, a Transformer-based inverse protein sequence design method
    Editor | Radish Skin Protein design and engineering are advancing at an unprecedented pace thanks to advances in deep learning. However, current models cannot naturally account for non-protein entities during the design process. Here, researchers at the Ecole Polytechnique Fédérale de Lausanne (EPFL) in Switzerland propose a deep learning method, based entirely on a geometric transformer of atomic coordinates and element names, that predicts protein sequences from backbone scaffolds under constraints imposed by different molecular environments. Using this method, researchers can produce highly thermostable, catalytically active enzymes with a high success rate. This is expected to increase the versatility of protein design pipelines in achieving desired functions. This study uses "Context-awaregeometricde
    AI 934 2024-08-05 20:33:31
  • Mass production of Apple's first foldable device is hindered, Jeff Pu said it will be difficult to achieve in 2025 or 2026
    According to news on August 3, MacRumors recently obtained a new investor report from Jeff Pu, an analyst at Haitong International Securities, which says that the mass production plan for Apple's first foldable device has been "delayed" and that mass production may not be achieved in 2025 or 2026 as expected. Previously, Jeff Pu predicted in a May report that Apple's first foldable device would enter mass production in 2025 and 2026. He also predicted at the time that Apple might first launch a large-screen foldable iPad or MacBook, and then a foldable iPhone with greater market potential. These predictions attracted widespread attention from the market and consumers. However, according to the editor's understanding, the latest report has brought a
    IT Industry 387 2024-08-05 20:32:02
  • The author of Transformer returns to Google, and the founding team of Character.AI is 'acquired': the people are wanted, not the company
    Will AI startups all end up inside big companies? Overnight, the "battle royale" of generative AI has shrunk again. Startup Character.AI announced on Friday that it has signed an agreement under which Google obtains a non-exclusive license to Character.AI's large language model (LLM) technology. Google also announced that it is rehiring Noam Shazeer and Daniel De Freitas. Noam Shazeer is the founder and CEO of Character.AI and one of the authors of the Transformer paper; he previously served as a principal software engineer at Google. Daniel De Freitas is the president of Character.AI and previously served as a senior engineer at Google.
    AI 820 2024-08-05 20:17:10
  • This high-definition video is not real footage: 3D scenes rendered from a few photos are hard to tell apart from reality
    Note that the animation above is a 3D scene rendered entirely from multiple photos, and its flaws are hard for humans to spot. So let's look at how this is done. Meshes and points are the most common representations of 3D scenes because they are explicit and well suited to fast GPU/CUDA-based rasterization. In contrast, state-of-the-art Neural Radiance Field (NeRF) methods build on continuous scene representations, typically optimizing a multi-layer perceptron (MLP) with volumetric ray-marching to synthesize novel views of the captured scene. While the continuity of these methods helps optimization, the stochastic sampling required for rendering is costly and can produce noise. Researchers from Université Côte d'Azur have introduced a new method that combines the two methods
    AI 573 2024-08-05 20:15:51
  • Why is the late interaction model standard for the next generation of RAG?
    The AIxiv column is where this site publishes academic and technical content. Over the past few years, the column has received and reported on more than 2,000 pieces, covering top laboratories at major universities and companies around the world and effectively promoting academic exchange and dissemination. If you have excellent work to share, please feel free to submit it or contact us for coverage. Submission email: liyazhou@jiqizhixin.com; zhaoyunfeng@jiqizhixin.com Zhang Yingfeng: co-founder of InfiniFlow, with many years of experience in search, AI, and infrastructure development, currently working on building next-generation core RAG products. In developing a RAG system, a good reranker model is an indispensable component.
    AI 1115 2024-08-05 20:15:22
  • ECCV2024 | Harvard team develops FairDomain to achieve fairness in cross-domain medical image segmentation and classification
    Editor | ScienceAI Author | YuTian Team In artificial intelligence (AI), and especially medical AI, addressing fairness is crucial to ensuring equitable medical outcomes. Recent efforts to enhance fairness have introduced new methods and datasets. However, fairness has been little explored under domain transfer, even though clinics often rely on different imaging technologies (e.g., different retinal imaging modalities) for patient diagnosis. This paper proposes FairDomain, the first systematic study of algorithmic fairness under domain transfer. We test state-of-the-art domain adaptation (DA) and domain generalization (DG) algorithms on medical image segmentation and classification tasks, aiming to understand how bias transfers between domains.
    AI 1193 2024-08-05 20:04:36
  • From now on, more than 100 million developers on GitHub can directly access the world's top large models to build AI applications
    "GitHub Models", a new feature launched by GitHub, is expected to accelerate the arrival of the AI engineer era. What? GitHub, the familiar code hosting platform, has evolved again! The platform now also offers a playground for large AI models. Popular large models across the industry, including Microsoft's Phi-3, OpenAI's GPT-4o, Meta's Llama 3.1, Cohere's Command R+, and Mistral AI's Mistral Large, can be tried in an interactive sandbox. In the coming months, GitHub will add more language, vision, and other types of models. In other words, the model in this picture
    AI 1086 2024-08-05 19:36:38
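
The memory figures quoted in the Qllm-Eval item above (175 billion parameters stored in FP16 taking about 350GB, against 192GB on a B200) can be checked with a short calculation. The sketch below is only an illustration: the helper name `weight_memory_gb` is hypothetical, the 8-bit and 4-bit rows are assumed bit widths not taken from the article, and the figures cover weights only (no activations or KV cache), using 1GB = 10^9 bytes.

```python
import math


def weight_memory_gb(num_params: float, bytes_per_param: float) -> float:
    """Memory needed for the weights alone, in GB (1 GB = 1e9 bytes)."""
    return num_params * bytes_per_param / 1e9


GPT3_PARAMS = 175e9      # parameter count quoted in the item above
B200_MEMORY_GB = 192     # GPU memory quoted in the item above

# FP16 uses 2 bytes per parameter, matching the ~350GB figure in the text.
fp16_gb = weight_memory_gb(GPT3_PARAMS, 2)

# Assumed quantization widths, for illustration only.
int8_gb = weight_memory_gb(GPT3_PARAMS, 1)
int4_gb = weight_memory_gb(GPT3_PARAMS, 0.5)

# Minimum number of 192GB GPUs needed just to hold the FP16 weights.
gpus_for_fp16 = math.ceil(fp16_gb / B200_MEMORY_GB)

for name, size in [("FP16", fp16_gb), ("INT8", int8_gb), ("INT4", int4_gb)]:
    fits = "fits" if size <= B200_MEMORY_GB else "does not fit"
    print(f"{name}: {size:.1f} GB, {fits} in one {B200_MEMORY_GB} GB GPU")
```

Under these assumptions the FP16 weights need at least two such GPUs, while the lower bit widths fit on one, which is the motivation for compression that the item describes.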
