Intelligent data annotation solution: a crowdsourcing platform that welcomes the era of large models

Jan 22, 2024 pm 11:39 PM
AI

On May 26, NetEase's Fuxi Youling crowdsourcing platform made its debut at the China International Big Data Industry Expo. Independently developed by NetEase Fuxi, it is an online task platform built around human-machine collaboration, and it is currently the only crowdsourcing platform on the market that supports real-time human-machine interactive annotation. The platform aims to ease labor shortages across industries and to offer the public more convenient and engaging online employment opportunities. Enterprise customers can quickly model and publish tasks on it, while gig workers can pick up tasks freely, without restrictions on time or location. In this way, the Fuxi Youling crowdsourcing platform gives enterprises and individuals a more efficient and flexible working model.

Artificial intelligence is rapidly changing the way people live and work. With the rapid development of large language models and multi-modal large models, the field of data annotation has entered a new period of vigorous growth, and large amounts of data are constantly emerging across fields. Both the demand side and the supply side, however, face a major challenge: finding an efficient way to provide high-quality, low-cost data support. This affects not only the accuracy and practicality of AI technology but also the development prospects of the entire industry. The data annotation industry therefore needs continuous innovation and improvement to meet the needs of AI technology and to develop sustainably.

To keep pace with the big data era, many artificial intelligence companies have begun to establish training and management systems for data trainers, and they continue to innovate technically and improve data quality. However, as labor costs rise, more and more organizations are looking for more efficient and economical ways to annotate data. The NetEase Fuxi Youling crowdsourcing platform was created in response, built on the idea of HITL (Human-in-the-Loop).

The idea of human-machine collaboration injects new vitality into the data annotation industry

At this Data Expo, the Fuxi Youling crowdsourcing platform demonstrated its distinctive capabilities and advantages: combining human intelligence and decision-making with the computing power of machine learning to achieve high-quality data annotation. Through a detailed, rigorous annotation process and a systematic scoring system, the platform maintains the accuracy and reliability of the data. Fuxi Youling has also adopted a series of technical measures, described below, that aim to reduce costs, shorten the annotation cycle, and ensure data quality.

Data closed loop

After annotators complete the data annotation, the platform supports feeding the results back into model training in real time. The task issuer can evaluate the model before and after training, compare the two to see how the annotation results improve the model, and have the model updated automatically. The updated model can then assist subsequent annotation tasks, further improving the quality and efficiency of data annotation.
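The closed loop described above can be sketched in a few lines. This is only an illustrative toy, not the platform's implementation: the one-dimensional threshold classifier, the function names (`fit_threshold`, `accuracy`, `closed_loop_round`), and the rule "keep the retrained model only if hold-out accuracy does not drop" are all assumptions made for the example.

```python
from statistics import mean

def fit_threshold(samples):
    """Toy 'model': the midpoint between the mean values of class 0 and class 1.

    samples is a list of (value, label) pairs with label 0 or 1.
    """
    zeros = [x for x, y in samples if y == 0]
    ones = [x for x, y in samples if y == 1]
    return (mean(zeros) + mean(ones)) / 2

def accuracy(threshold, samples):
    """Fraction of samples for which 'value > threshold' matches the label."""
    return sum((x > threshold) == bool(y) for x, y in samples) / len(samples)

def closed_loop_round(train, new_annotations, holdout):
    """Retrain with freshly annotated data and compare before/after on a hold-out set."""
    before = accuracy(fit_threshold(train), holdout)
    updated_train = train + new_annotations
    after = accuracy(fit_threshold(updated_train), holdout)
    # Hypothetical acceptance rule: only adopt the retrained model if it does not regress.
    accepted = after >= before
    return {"before": before, "after": after, "accepted": accepted,
            "train": updated_train if accepted else train}

train = [(0.0, 0), (10.0, 1)]
new_annotations = [(1.0, 0), (2.0, 0), (4.0, 1), (5.0, 1)]
holdout = [(2.0, 0), (3.0, 0), (4.0, 1), (6.0, 1)]
result = closed_loop_round(train, new_annotations, holdout)
print(result["before"], result["after"], result["accepted"])
```

In a real deployment the toy classifier would be replaced by the actual model, and the hold-out set would need to be kept separate from the data that annotators are labeling.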

Full data inspection

The platform supports automatic quality inspection of all task data, and the task issuer can flexibly configure the inspection process. The platform combines users' historical task levels with their user portraits to carry out quality inspection, and it also brings models into the process, so that AI and people take part in quality control together and tasks are ultimately delivered with high accuracy.
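One way to picture how annotator history and a model can share quality control is a per-item routing rule. The sketch below is hypothetical: the function name `inspection_decision`, the three outcomes, and the thresholds `trusted_level` and `confident` are assumptions for illustration, not settings documented for the platform.

```python
def inspection_decision(annotator_label, model_label, model_confidence,
                        annotator_level, trusted_level=4, confident=0.9):
    """Return 'accept', 'review' or 'reject' for one annotated item.

    annotator_level stands in for the historical task level of the annotator;
    model_confidence is the checking model's confidence in its own label.
    """
    agrees = annotator_label == model_label
    # Agreement plus at least one strong signal: pass automatically.
    if agrees and (annotator_level >= trusted_level or model_confidence >= confident):
        return "accept"
    # A confident model contradicting a low-level annotator: send back.
    if not agrees and model_confidence >= confident and annotator_level < trusted_level:
        return "reject"
    # Everything else goes to a human reviewer.
    return "review"

print(inspection_decision("cat", "cat", 0.60, annotator_level=5))
print(inspection_decision("cat", "dog", 0.95, annotator_level=1))
print(inspection_decision("cat", "dog", 0.95, annotator_level=5))
```

Because every item passes through this rule, all data is inspected, while human reviewers only see the cases where the signals disagree or are weak.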

User Portraits

The platform has a complete user portrait and task matching mechanism. Drawing on each user's past task performance and personal label data, it matches users to the varied requirements of different task types and assigns each task to the people best suited to it, so as to meet the quality, efficiency, and cost requirements of data annotation tasks.
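A minimal sketch of portrait-based matching follows. The `Annotator` fields, the scoring formula, and the `tag_weight` parameter are assumptions chosen for illustration; the source does not describe how the platform actually scores candidates.

```python
from dataclasses import dataclass, field

@dataclass
class Annotator:
    name: str
    accuracy_by_type: dict                   # task type -> historical accuracy in [0, 1]
    tags: set = field(default_factory=set)   # personal label data, e.g. languages spoken

def match_score(annotator, task_type, required_tags, tag_weight=0.3):
    """Blend historical accuracy on this task type with overlap on required tags."""
    acc = annotator.accuracy_by_type.get(task_type, 0.0)
    if required_tags:
        overlap = len(annotator.tags & required_tags) / len(required_tags)
    else:
        overlap = 0.0
    return (1 - tag_weight) * acc + tag_weight * overlap

def pick_annotators(annotators, task_type, required_tags, k=2):
    """Return the names of the k best-matching annotators."""
    ranked = sorted(annotators,
                    key=lambda a: match_score(a, task_type, required_tags),
                    reverse=True)
    return [a.name for a in ranked[:k]]

pool = [
    Annotator("a", {"speech": 0.95}, {"mandarin"}),
    Annotator("b", {"speech": 0.70}, {"english"}),
    Annotator("c", {"text": 0.99}, {"mandarin"}),
]
print(pick_annotators(pool, "speech", {"mandarin"}, k=2))
```

A production matcher would also have to account for cost and availability, which this sketch leaves out.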

Swarm Intelligence

The platform uses user portraits to select a diverse set of annotators, has several of them label the same item redundantly, and applies algorithmic methods such as interval estimation and truth inference so that they jointly decide the final label, ensuring the objectivity and accuracy of the final results.
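As a rough illustration of redundant annotation with interval estimation, the sketch below takes a majority vote over the labels for one item and uses the Wilson score interval to judge whether the majority is well supported. Majority voting is only the simplest form of truth inference; the function names and the `min_lower_bound` cutoff are assumptions, and the platform's actual algorithms are not described in the source.

```python
import math
from collections import Counter

def wilson_interval(successes, total, z=1.96):
    """Wilson score interval for a binomial proportion (roughly 95% for z=1.96)."""
    if total == 0:
        return (0.0, 1.0)
    p = successes / total
    denom = 1 + z * z / total
    centre = (p + z * z / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    return (max(0.0, centre - half), min(1.0, centre + half))

def aggregate(labels, min_lower_bound=0.5):
    """Majority vote over redundant labels; flag the item when support is weak."""
    label, votes = Counter(labels).most_common(1)[0]
    low, high = wilson_interval(votes, len(labels))
    return {"label": label, "interval": (low, high),
            "needs_review": low < min_lower_bound}

strong = aggregate(["cat"] * 9 + ["dog"])
weak = aggregate(["cat", "cat", "dog"])
print(strong["label"], strong["needs_review"])
print(weak["label"], weak["needs_review"])
```

With nine of ten annotators agreeing, the lower bound of the interval stays above one half and the label is accepted; with two of three, the interval is too wide and the item is flagged for more labels or expert review.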

According to the person in charge of the platform, it currently focuses mainly on cognitive work, driven by the need of AIGC and other artificial intelligence technologies to collect and label multi-modal data such as text, images, and speech. As communication technologies such as 5G become widespread, the platform will take on more decision-making tasks, such as remote control. Based on digital twin technology, offline work will be digitized and moved online, allowing users to complete tasks in a gamified digital twin environment and enjoy their work.

NetEase Fuxi Youling platform uses AI technology and manual annotation to ensure the quality and accuracy of data annotation and improve data annotation efficiency. It not only provides reliable and efficient data services for enterprises, but also contributes to the vigorous development of AI technology.

The Youling crowdsourcing platform helps AI technology flourish

During the exhibition, Dr. Wu Runze of NetEase Fuxi Lab also gave a talk on the theme "Application Practice of NetEase Fuxi Data Crowdsourcing Empowering Large Models".

Dr. Wu said that NetEase Fuxi has been deeply involved in large model technology since 2019, taking text pre-training and multi-modal pre-training as its main entry points and relying on the data crowdsourcing platform to provide a closed loop of high-quality data feedback. It has overcome key technical challenges such as unified representation construction, distributed object storage, and large-scale vector engines. The work was selected as a "Pioneer Project" of Zhejiang Province and was officially recognized for funding. It has also successfully incubated two major game-vertical products: the Danqingyue art platform and intelligent game NPCs.

Currently, the Fuxi Youling crowdsourcing platform is used in multiple products and scenarios within NetEase Group. In the open world of the "Nishuihan" mobile game, smart NPCs with delicate emotions, sensitive reactions, realistic movements, and rich expressions are deeply loved by players. These smart NPCs require massive amounts of high-quality human feedback data to support them.

NetEase Fuxi Youling Crowdsourcing provides a range of data services for the game's intelligent NPC model, including voice collection, text annotation, emotion judgment, and image annotation, and ultimately supports the creation of intelligent game NPCs across multiple dimensions such as text, voice, and facial expressions. This reflects the deep integration NetEase has built up across game engines and AI to solve the closed-loop problem of large-scale computing power, data, and pre-trained models.

At present, the NetEase Fuxi Youling crowdsourcing platform has processed hundreds of millions of data items. While ensuring the performance of game AI, it can collect feedback from game players more efficiently and further improve AI performance, so that the technology can be applied in more diverse scenarios. Based on the concepts of openness, cooperation, and win-win, NetEase Fuxi will invite partners from upstream and downstream of the industry chain to jointly create a new era of AI digitalization.
