Pengambilan Generasi Tambahan (RAG) adalah teknik yang menjadikan model bahasa besar (LLMS) lebih bijak dan lebih tepat dengan membenarkan mereka menggunakan maklumat luar apabila menghasilkan teks.
Cabaran besar, bagaimanapun, memilih dokumen atau petikan yang tepat dari koleksi data yang besar.RankGPT menangani isu ini dengan meningkatkan langkah semula dalam saluran paip RAG. Ia menggunakan keupayaan pemahaman yang mendalam LLM untuk menilai dan (semula) yang lebih baik maklumat yang paling relevan.
Dalam artikel ini, kami akan memperkenalkan RankGPT dan menunjukkan bagaimana anda dapat mengintegrasikannya ke dalam aplikasi AI RAG anda.
Pengambilan Generasi Tambahan (RAG) adalah kaedah yang menggabungkan LLM dengan sistem pengambilan maklumat. Ini bermakna apabila LLM diminta untuk menghasilkan teks, ia boleh menarik maklumat yang relevan dari sumber luaran, menjadikan responsnya lebih tepat dan dimaklumkan.RAG terdiri daripada dua komponen utama -retriever dan penjana -dan komponen pilihan, reranker:
Retriever -Tugas Retriever adalah untuk mencari dokumen atau segmen teks yang relevan dari satu set besar dokumen berdasarkan pertanyaan pengguna. Ia menggunakan algoritma seperti BM25 untuk menilai dokumen dengan kaitannya.
Reranker (Pilihan) - Reranker mengambil set awal dokumen yang diambil dan mengembalikan semula mereka untuk memastikan yang paling relevan berada di bahagian atas. Ini membantu menapis maklumat yang kurang berguna dan memberi tumpuan kepada apa yang penting.
Apabila menggunakan GPT-4 dengan generasi permutasi pengajaran sifar, RANKGPT mengatasi sistem yang diawasi terkemuka di pelbagai tanda aras seperti TREC, BEIR, dan Mr.Tydi.
RankGPT menggunakan penyulingan permutasi untuk memindahkan kebolehan ranking model besar seperti GPT-4 ke dalam model yang lebih kecil, khusus.
Model -model yang lebih kecil ini mengekalkan prestasi tinggi sementara menjadi lebih efisien. Sebagai contoh, model 440m suling mengatasi model 3B yang diawasi pada penanda aras Beir, mengurangkan kos pengiraan dengan ketara semasa mencapai hasil yang lebih baik.
mengendalikan maklumat baru dan tidak diketahuiGPT-4 mencapai prestasi terkini pada ujian ini, menunjukkan keupayaannya untuk mengendalikan pertanyaan baru dan tidak kelihatan.
Prestasi penanda aras RANKGPT
Sumber: Weiwei Sun et al., 2023
RankGPT (GPT-4) juga melakukan yang kuat pada dataset Mr.Tydi, yang membawa purata skor NDCG@10 62.93, mengalahkan kedua-dua BM25 dan MMARCOCE. Ia secara konsisten mengatasi BM25 dan bahkan melampaui mmarcoce dalam banyak bahasa, terutama di Indonesia dan Swahili.
Secara keseluruhan, RankGPT menjaringkan tertinggi dalam banyak bahasa, seperti Bengali, Indonesia, dan Jepun, dengan hanya beberapa kes di mana ia sedikit tertinggal di belakang Mmarcoce.
Sumber: Weiwei Sun et al., 2023
Di seluruh semua hasil penanda aras, RankGPT (GPT-4) secara konsisten mengatasi kaedah lain, sama ada ia diawasi atau tidak diselia, menunjukkan keupayaan unggulnya dalam pengalihan semula.
Inilah cara kita dapat mengintegrasikan RankGPT ke dalam saluran paip RAG.
Langkah 1: Klon Repositori RankGPT
git clone https://github.com/sunnweiwei/RankGPT
Navigasi ke direktori RankGPT dan pasang pakej yang diperlukan. Anda mungkin ingin membuat persekitaran maya dan memasang pakej menggunakan keperluan yang disediakan.txt:
pip install -r requirements.txt
di sini, kami menggunakan pertanyaan contoh ringkas dan dokumen yang diambil oleh repositori RankGPT yang asal.
item = { 'query': 'How much impact do masks have on preventing the spread of the COVID-19?', 'hits': [ {'content': 'Title: Universal Masking is Urgent in the COVID-19 Pandemic: SEIR and Agent Based Models, Empirical Validation, Policy Recommendations Content: We present two models for the COVID-19 pandemic predicting the impact of universal face mask wearing upon the spread of the SARS-CoV-2 virus--one employing a stochastic dynamic network based compartmental SEIR (susceptible-exposed-infectious-recovered) approach, and the other employing individual ABM (agent-based modelling) Monte Carlo simulation--indicating (1) significant impact under (near) universal masking when at least 80% of a population is wearing masks, versus minimal impact when only 50% or less of the population is wearing masks, and (2) significant impact when universal masking is adopted early, by Day 50 of a regional outbreak, versus minimal impact when universal masking is adopted late. These effects hold even at the lower filtering rates of homemade masks. To validate these theoretical models, we compare their predictions against a new empirical data set we have collected'}, {'content': 'Title: Masking the general population might attenuate COVID-19 outbreaks Content: The effect of masking the general population on a COVID-19 epidemic is estimated by computer simulation using two separate state-of-the-art web-based softwares, one of them calibrated for the SARS-CoV-2 virus. The questions addressed are these: 1. Can mask use by the general population limit the spread of SARS-CoV-2 in a country? 2. What types of masks exist, and how elaborate must a mask be to be effective against COVID-19? 3. Does the mask have to be applied early in an epidemic? 4. A brief general discussion of masks and some possible future research questions regarding masks and SARS-CoV-2. Results are as follows: (1) The results indicate that any type of mask, even simple home-made ones, may be effective. Masks use seems to have an effect in lowering new patients even the protective effect of each mask (here dubbed"one-mask protection") is'}, {'content': 'Title: To mask or not to mask: Modeling the potential for face mask use by the general public to curtail the COVID-19 pandemic Content: Face mask use by the general public for limiting the spread of the COVID-19 pandemic is controversial, though increasingly recommended, and the potential of this intervention is not well understood. We develop a compartmental model for assessing the community-wide impact of mask use by the general, asymptomatic public, a portion of which may be asymptomatically infectious. Model simulations, using data relevant to COVID-19 dynamics in the US states of New York and Washington, suggest that broad adoption of even relatively ineffective face masks may meaningfully reduce community transmission of COVID-19 and decrease peak hospitalizations and deaths. Moreover, mask use decreases the effective transmission rate in nearly linear proportion to the product of mask effectiveness (as a fraction of potentially infectious contacts blocked) and coverage rate (as'} ] }
anda boleh menggunakan saluran paip permutasi yang disediakan untuk mudah mengubah dokumen yang diambil dengan RankGPT.
from rank_gpt import permutation_pipeline new_item = permutation_pipeline( item, rank_start=0, rank_end=3, model_name='gpt-3.5-turbo', api_key='Your OPENAI Key!' ) print(new_item)
ini akan menghasilkan urutan baru dokumen berikut:
{ 'query': 'How much impact do masks have on preventing the spread of the COVID-19?', 'hits': [ {'content': 'Title: Universal Masking is Urgent in the COVID-19 Pandemic: SEIR and Agent Based Models, Empirical Validation, Policy Recommendations Content: We present two models for the COVID-19 pandemic predicting the impact of universal face mask wearing upon the spread of the SARS-CoV-2 virus--one employing a stochastic dynamic network based compartmental SEIR (susceptible-exposed-infectious-recovered) approach, and the other employing individual ABM (agent-based modelling) Monte Carlo simulation--indicating (1) significant impact under (near) universal masking when at least 80% of a population is wearing masks, versus minimal impact when only 50% or less of the population is wearing masks, and (2) significant impact when universal masking is adopted early, by Day 50 of a regional outbreak, versus minimal impact when universal masking is adopted late. These effects hold even at the lower filtering rates of homemade masks. To validate these theoretical models, we compare their predictions against a new empirical data set we have collected'}, {'content': 'Title: To mask or not to mask: Modeling the potential for face mask use by the general public to curtail the COVID-19 pandemic Content: Face mask use by the general public for limiting the spread of the COVID-19 pandemic is controversial, though increasingly recommended, and the potential of this intervention is not well understood. We develop a compartmental model for assessing the community-wide impact of mask use by the general, asymptomatic public, a portion of which may be asymptomatically infectious. Model simulations, using data relevant to COVID-19 dynamics in the US states of New York and Washington, suggest that broad adoption of even relatively ineffective face masks may meaningfully reduce community transmission of COVID-19 and decrease peak hospitalizations and deaths. Moreover, mask use decreases the effective transmission rate in nearly linear proportion to the product of mask effectiveness (as a fraction of potentially infectious contacts blocked) and coverage rate (as'}, {'content': 'Title: Masking the general population might attenuate COVID-19 outbreaks Content: The effect of masking the general population on a COVID-19 epidemic is estimated by computer simulation using two separate state-of-the-art web-based softwares, one of them calibrated for the SARS-CoV-2 virus. The questions addressed are these: 1. Can mask use by the general population limit the spread of SARS-CoV-2 in a country? 2. What types of masks exist, and how elaborate must a mask be to be effective against COVID-19? 3. Does the mask have to be applied early in an epidemic? 4. A brief general discussion of masks and some possible future research questions regarding masks and SARS-CoV-2. Results are as follows: (1) The results indicate that any type of mask, even simple home-made ones, may be effective. Masks use seems to have an effect in lowering new patients even the protective effect of each mask (here dubbed"one-mask protection") is'} ] }
Untuk pelaksanaan langkah-langkah demi langkah permutasi, anda boleh berinteraksi secara langsung dengan RankGPT untuk membuat dan memproses arahan permutasi seperti berikut:
from rank_gpt import ( create_permutation_instruction, run_llm, receive_permutation ) # Create permutation generation instruction messages = create_permutation_instruction( item=item, rank_start=0, rank_end=3, model_name='gpt-3.5-turbo' )
[{'role': 'system', 'content': 'You are RankGPT, an intelligent assistant that can rank passages based on their relevancy to the query.'}, {'role': 'user', 'content': 'I will provide you with 3 passages, each indicated by number identifier []. \nRank the passages based on their relevance to query: How much impact do masks have on preventing the spread of the COVID-19?.'}, {'role': 'assistant', 'content': 'Okay, please provide the passages.'}, {'role': 'user', 'content': '[1] Title: Universal Masking is Urgent in the COVID-19 Pandemic: SEIR and Agent Based Models, Empirical Validation, Policy Recommendations Content: We present two models for the COVID-19 pandemic predicting the impact of universal face mask wearing upon the spread of the SARS-CoV-2 virus--one employing a stochastic dynamic network based compartmental SEIR (susceptible-exposed-infectious-recovered) approach, and the other employing individual ABM (agent-based modelling) Monte Carlo simulation--indicating (1) significant impact under (near) universal masking when at least 80% of a population is wearing masks, versus minimal impact when only 50% or less of the population is wearing masks, and (2) significant impact when universal masking is adopted early, by Day 50 of a regional outbreak, versus minimal impact when universal masking is adopted late. These effects hold even at the lower filtering rates of homemade masks. To validate these theoretical models, we compare their predictions against a new empirical data set we have collected'}, {'role': 'assistant', 'content': 'Received passage [1].'}, {'role': 'user', 'content': '[2] Title: Masking the general population might attenuate COVID-19 outbreaks Content: The effect of masking the general population on a COVID-19 epidemic is estimated by computer simulation using two separate state-of-the-art web-based softwares, one of them calibrated for the SARS-CoV-2 virus. The questions addressed are these: 1. Can mask use by the general population limit the spread of SARS-CoV-2 in a country? 2. What types of masks exist, and how elaborate must a mask be to be effective against COVID-19? 3. Does the mask have to be applied early in an epidemic? 4. A brief general discussion of masks and some possible future research questions regarding masks and SARS-CoV-2. Results are as follows: (1) The results indicate that any type of mask, even simple home-made ones, may be effective. Masks use seems to have an effect in lowering new patients even the protective effect of each mask (here dubbed"one-mask protection") is'}, {'role': 'assistant', 'content': 'Received passage [2].'}, {'role': 'user', 'content': '[3] Title: To mask or not to mask: Modeling the potential for face mask use by the general public to curtail the COVID-19 pandemic Content: Face mask use by the general public for limiting the spread of the COVID-19 pandemic is controversial, though increasingly recommended, and the potential of this intervention is not well understood. We develop a compartmental model for assessing the community-wide impact of mask use by the general, asymptomatic public, a portion of which may be asymptomatically infectious. Model simulations, using data relevant to COVID-19 dynamics in the US states of New York and Washington, suggest that broad adoption of even relatively ineffective face masks may meaningfully reduce community transmission of COVID-19 and decrease peak hospitalizations and deaths. Moreover, mask use decreases the effective transmission rate in nearly linear proportion to the product of mask effectiveness (as a fraction of potentially infectious contacts blocked) and coverage rate (as'}, {'role': 'assistant', 'content': 'Received passage [3].'}, {'role': 'user', 'content': 'Search Query: How much impact do masks have on preventing the spread of the COVID-19?. \nRank the 3 passages above based on their relevance to the search query. The passages should be listed in descending order using identifiers. The most relevant passages should be listed first. The output format should be [] > [], e.g., [1] > [2]. Only response the ranking results, do not say any word or explain.'}]
# Get ChatGPT predicted permutation permutation = run_llm( messages, api_key='Your OPENAI Key!', model_name='gpt-3.5-turbo' )
'[1] > [3] > [2]'
# Use permutation to re-rank the passage item = receive_permutation( item, permutation, rank_start=0, rank_end=3 )
{'query': 'How much impact do masks have on preventing the spread of the COVID-19?', 'hits': [{'content': 'Title: Universal Masking is Urgent in the COVID-19 Pandemic: SEIR and Agent Based Models, Empirical Validation, Policy Recommendations Content: We present two models for the COVID-19 pandemic predicting the impact of universal face mask wearing upon the spread of the SARS-CoV-2 virus--one employing a stochastic dynamic network based compartmental SEIR (susceptible-exposed-infectious-recovered) approach, and the other employing individual ABM (agent-based modelling) Monte Carlo simulation--indicating (1) significant impact under (near) universal masking when at least 80% of a population is wearing masks, versus minimal impact when only 50% or less of the population is wearing masks, and (2) significant impact when universal masking is adopted early, by Day 50 of a regional outbreak, versus minimal impact when universal masking is adopted late. These effects hold even at the lower filtering rates of homemade masks. To validate these theoretical models, we compare their predictions against a new empirical data set we have collected'}, {'content': 'Title: To mask or not to mask: Modeling the potential for face mask use by the general public to curtail the COVID-19 pandemic Content: Face mask use by the general public for limiting the spread of the COVID-19 pandemic is controversial, though increasingly recommended, and the potential of this intervention is not well understood. We develop a compartmental model for assessing the community-wide impact of mask use by the general, asymptomatic public, a portion of which may be asymptomatically infectious. Model simulations, using data relevant to COVID-19 dynamics in the US states of New York and Washington, suggest that broad adoption of even relatively ineffective face masks may meaningfully reduce community transmission of COVID-19 and decrease peak hospitalizations and deaths. Moreover, mask use decreases the effective transmission rate in nearly linear proportion to the product of mask effectiveness (as a fraction of potentially infectious contacts blocked) and coverage rate (as'}, {'content': 'Title: Masking the general population might attenuate COVID-19 outbreaks Content: The effect of masking the general population on a COVID-19 epidemic is estimated by computer simulation using two separate state-of-the-art web-based softwares, one of them calibrated for the SARS-CoV-2 virus. The questions addressed are these: 1. Can mask use by the general population limit the spread of SARS-CoV-2 in a country? 2. What types of masks exist, and how elaborate must a mask be to be effective against COVID-19? 3. Does the mask have to be applied early in an epidemic? 4. A brief general discussion of masks and some possible future research questions regarding masks and SARS-CoV-2. Results are as follows: (1) The results indicate that any type of mask, even simple home-made ones, may be effective. Masks use seems to have an effect in lowering new patients even the protective effect of each mask (here dubbed"one-mask protection") is'}]}
Jika anda perlu meletakkan lebih banyak dokumen daripada model yang boleh mengendalikan sekaligus, gunakan strategi tetingkap gelongsor. Berikut adalah cara memohon strategi tetingkap gelongsor untuk menduduki semula dokumen:
from rank_gpt import sliding_windows api_key = "Your OPENAI Key" new_item = sliding_windows( item, rank_start=0, rank_end=3, window_size=2, step=1, model_name='gpt-3.5-turbo', api_key=api_key ) print(new_item)
Dalam contoh ini, tetingkap gelongsor mempunyai saiz 2 dan saiz langkah 1, yang bermaksud ia memproses dua dokumen pada satu masa, memindahkan satu dokumen ke hadapan untuk pas kedudukan seterusnya.
dengan menggunakan LLMS untuk menilai lebih baik kaitan maklumat, RankGPT meningkatkan ketepatan penyortiran dan kandungan semula.
Ini menangani isu -isu biasa seperti memastikan kandungan adalah tepat, meningkatkan kecekapan, dan mengurangkan kemungkinan menghasilkan maklumat yang mengelirukan.
secara keseluruhan, RankGPT menyumbang untuk membina aplikasi RAG yang lebih dipercayai dan tepat.
Atas ialah kandungan terperinci RANKGPT sebagai ejen peringkat semula untuk RAG (tutorial). Untuk maklumat lanjut, sila ikut artikel berkaitan lain di laman web China PHP!