>檢索增強生成(RAG)是一種使大型語言模型(LLMS)更聰明,更準確的技術,可以使它們在生成文本時使用外部信息。 但是,最大的挑戰是從大量數據集中挑選正確的文檔或段落。
通過改善RAG管道中的重新排列步驟來解決此問題。它使用LLM的深刻理解能力來更好地評估和(重新)排名哪些信息是最相關的。>
>在本文中,我們將介紹Rankgpt並演示如何將其集成到您的RAG AI應用程序中。檢索增強生成(RAG)是一種將LLM與信息檢索系統相結合的方法。這意味著,當要求LLM生成文本時,它可以從外部來源中獲取相關信息,從而使其響應更加準確和了解。
檢索器 - 檢索員的工作是根據用戶查詢從大量文檔中查找相關文檔或文本段。它使用BM25之類的算法來通過其相關性對文檔進行排名。
生成器 - 生成器是使用檢索文檔生成最終輸出的LLM。訪問相關的外部數據可以產生更準確的響應。
>raggpt在抹布
這些較小的模型保持高性能,同時更有效。例如,蒸餾440m模型的表現優於貝爾基准上的3B監督模型,大大降低了計算成本,同時實現了更好的結果。
處理新的和未知的信息gpt-4在該測試中實現了最先進的性能,證明了其有效處理新的和看不見的查詢的能力。
rankgpt基準性能
如下表所示,
來源:Weiwei Sun等,2023
來源:Weiwei Sun等,2023
>最後,RankGPT在Noveleval數據集中進行了測試,該數據集測量了模型可以根據最新信息和陌生信息對段落進行排名。 RankGPT(GPT-4)在所有評估指標中均得分最高(NDCG@1,NDCG@5和NDCG@10),尤其是NDCG@10分數為90.45。它的表現優於其他強大模型,例如Monot5(3b)和Monobert(340m),它突出了其作為Reranker的強勁表現。
來源:Weiwei Sun等,2023
在所有基準結果中,ranggpt(GPT-4)始終優於其他方法,無論是被監督還是不受監督,證明了它在重新加工方面的卓越能力。>在抹布管道中實現rankgpt
這是我們可以將rankgpt集成到抹布管道中的方式。 >
首先,您需要克隆rankgpt存儲庫。在您的終端中運行以下命令:
git clone https://github.com/sunnweiwei/RankGPT
>導航到rankgpt目錄並安裝所需的軟件包。您可能需要創建虛擬環境並使用提供的要求安裝軟件包。
pip install -r requirements.txt
>您可以使用提供的置換管道輕鬆使用rankgpt重新錄製的文檔。
item = { 'query': 'How much impact do masks have on preventing the spread of the COVID-19?', 'hits': [ {'content': 'Title: Universal Masking is Urgent in the COVID-19 Pandemic: SEIR and Agent Based Models, Empirical Validation, Policy Recommendations Content: We present two models for the COVID-19 pandemic predicting the impact of universal face mask wearing upon the spread of the SARS-CoV-2 virus--one employing a stochastic dynamic network based compartmental SEIR (susceptible-exposed-infectious-recovered) approach, and the other employing individual ABM (agent-based modelling) Monte Carlo simulation--indicating (1) significant impact under (near) universal masking when at least 80% of a population is wearing masks, versus minimal impact when only 50% or less of the population is wearing masks, and (2) significant impact when universal masking is adopted early, by Day 50 of a regional outbreak, versus minimal impact when universal masking is adopted late. These effects hold even at the lower filtering rates of homemade masks. To validate these theoretical models, we compare their predictions against a new empirical data set we have collected'}, {'content': 'Title: Masking the general population might attenuate COVID-19 outbreaks Content: The effect of masking the general population on a COVID-19 epidemic is estimated by computer simulation using two separate state-of-the-art web-based softwares, one of them calibrated for the SARS-CoV-2 virus. The questions addressed are these: 1. Can mask use by the general population limit the spread of SARS-CoV-2 in a country? 2. What types of masks exist, and how elaborate must a mask be to be effective against COVID-19? 3. Does the mask have to be applied early in an epidemic? 4. A brief general discussion of masks and some possible future research questions regarding masks and SARS-CoV-2. Results are as follows: (1) The results indicate that any type of mask, even simple home-made ones, may be effective. Masks use seems to have an effect in lowering new patients even the protective effect of each mask (here dubbed"one-mask protection") is'}, {'content': 'Title: To mask or not to mask: Modeling the potential for face mask use by the general public to curtail the COVID-19 pandemic Content: Face mask use by the general public for limiting the spread of the COVID-19 pandemic is controversial, though increasingly recommended, and the potential of this intervention is not well understood. We develop a compartmental model for assessing the community-wide impact of mask use by the general, asymptomatic public, a portion of which may be asymptomatically infectious. Model simulations, using data relevant to COVID-19 dynamics in the US states of New York and Washington, suggest that broad adoption of even relatively ineffective face masks may meaningfully reduce community transmission of COVID-19 and decrease peak hospitalizations and deaths. Moreover, mask use decreases the effective transmission rate in nearly linear proportion to the product of mask effectiveness (as a fraction of potentially infectious contacts blocked) and coverage rate (as'} ] }
這將導致以下新的文檔順序:
from rank_gpt import permutation_pipeline new_item = permutation_pipeline( item, rank_start=0, rank_end=3, model_name='gpt-3.5-turbo', api_key='Your OPENAI Key!' ) print(new_item)
>分步教學置換生成
{ 'query': 'How much impact do masks have on preventing the spread of the COVID-19?', 'hits': [ {'content': 'Title: Universal Masking is Urgent in the COVID-19 Pandemic: SEIR and Agent Based Models, Empirical Validation, Policy Recommendations Content: We present two models for the COVID-19 pandemic predicting the impact of universal face mask wearing upon the spread of the SARS-CoV-2 virus--one employing a stochastic dynamic network based compartmental SEIR (susceptible-exposed-infectious-recovered) approach, and the other employing individual ABM (agent-based modelling) Monte Carlo simulation--indicating (1) significant impact under (near) universal masking when at least 80% of a population is wearing masks, versus minimal impact when only 50% or less of the population is wearing masks, and (2) significant impact when universal masking is adopted early, by Day 50 of a regional outbreak, versus minimal impact when universal masking is adopted late. These effects hold even at the lower filtering rates of homemade masks. To validate these theoretical models, we compare their predictions against a new empirical data set we have collected'}, {'content': 'Title: To mask or not to mask: Modeling the potential for face mask use by the general public to curtail the COVID-19 pandemic Content: Face mask use by the general public for limiting the spread of the COVID-19 pandemic is controversial, though increasingly recommended, and the potential of this intervention is not well understood. We develop a compartmental model for assessing the community-wide impact of mask use by the general, asymptomatic public, a portion of which may be asymptomatically infectious. Model simulations, using data relevant to COVID-19 dynamics in the US states of New York and Washington, suggest that broad adoption of even relatively ineffective face masks may meaningfully reduce community transmission of COVID-19 and decrease peak hospitalizations and deaths. Moreover, mask use decreases the effective transmission rate in nearly linear proportion to the product of mask effectiveness (as a fraction of potentially infectious contacts blocked) and coverage rate (as'}, {'content': 'Title: Masking the general population might attenuate COVID-19 outbreaks Content: The effect of masking the general population on a COVID-19 epidemic is estimated by computer simulation using two separate state-of-the-art web-based softwares, one of them calibrated for the SARS-CoV-2 virus. The questions addressed are these: 1. Can mask use by the general population limit the spread of SARS-CoV-2 in a country? 2. What types of masks exist, and how elaborate must a mask be to be effective against COVID-19? 3. Does the mask have to be applied early in an epidemic? 4. A brief general discussion of masks and some possible future research questions regarding masks and SARS-CoV-2. Results are as follows: (1) The results indicate that any type of mask, even simple home-made ones, may be effective. Masks use seems to have an effect in lowering new patients even the protective effect of each mask (here dubbed"one-mask protection") is'} ] }
from rank_gpt import ( create_permutation_instruction, run_llm, receive_permutation ) # Create permutation generation instruction messages = create_permutation_instruction( item=item, rank_start=0, rank_end=3, model_name='gpt-3.5-turbo' )
[{'role': 'system', 'content': 'You are RankGPT, an intelligent assistant that can rank passages based on their relevancy to the query.'}, {'role': 'user', 'content': 'I will provide you with 3 passages, each indicated by number identifier []. \nRank the passages based on their relevance to query: How much impact do masks have on preventing the spread of the COVID-19?.'}, {'role': 'assistant', 'content': 'Okay, please provide the passages.'}, {'role': 'user', 'content': '[1] Title: Universal Masking is Urgent in the COVID-19 Pandemic: SEIR and Agent Based Models, Empirical Validation, Policy Recommendations Content: We present two models for the COVID-19 pandemic predicting the impact of universal face mask wearing upon the spread of the SARS-CoV-2 virus--one employing a stochastic dynamic network based compartmental SEIR (susceptible-exposed-infectious-recovered) approach, and the other employing individual ABM (agent-based modelling) Monte Carlo simulation--indicating (1) significant impact under (near) universal masking when at least 80% of a population is wearing masks, versus minimal impact when only 50% or less of the population is wearing masks, and (2) significant impact when universal masking is adopted early, by Day 50 of a regional outbreak, versus minimal impact when universal masking is adopted late. These effects hold even at the lower filtering rates of homemade masks. To validate these theoretical models, we compare their predictions against a new empirical data set we have collected'}, {'role': 'assistant', 'content': 'Received passage [1].'}, {'role': 'user', 'content': '[2] Title: Masking the general population might attenuate COVID-19 outbreaks Content: The effect of masking the general population on a COVID-19 epidemic is estimated by computer simulation using two separate state-of-the-art web-based softwares, one of them calibrated for the SARS-CoV-2 virus. The questions addressed are these: 1. Can mask use by the general population limit the spread of SARS-CoV-2 in a country? 2. What types of masks exist, and how elaborate must a mask be to be effective against COVID-19? 3. Does the mask have to be applied early in an epidemic? 4. A brief general discussion of masks and some possible future research questions regarding masks and SARS-CoV-2. Results are as follows: (1) The results indicate that any type of mask, even simple home-made ones, may be effective. Masks use seems to have an effect in lowering new patients even the protective effect of each mask (here dubbed"one-mask protection") is'}, {'role': 'assistant', 'content': 'Received passage [2].'}, {'role': 'user', 'content': '[3] Title: To mask or not to mask: Modeling the potential for face mask use by the general public to curtail the COVID-19 pandemic Content: Face mask use by the general public for limiting the spread of the COVID-19 pandemic is controversial, though increasingly recommended, and the potential of this intervention is not well understood. We develop a compartmental model for assessing the community-wide impact of mask use by the general, asymptomatic public, a portion of which may be asymptomatically infectious. Model simulations, using data relevant to COVID-19 dynamics in the US states of New York and Washington, suggest that broad adoption of even relatively ineffective face masks may meaningfully reduce community transmission of COVID-19 and decrease peak hospitalizations and deaths. Moreover, mask use decreases the effective transmission rate in nearly linear proportion to the product of mask effectiveness (as a fraction of potentially infectious contacts blocked) and coverage rate (as'}, {'role': 'assistant', 'content': 'Received passage [3].'}, {'role': 'user', 'content': 'Search Query: How much impact do masks have on preventing the spread of the COVID-19?. \nRank the 3 passages above based on their relevance to the search query. The passages should be listed in descending order using identifiers. The most relevant passages should be listed first. The output format should be [] > [], e.g., [1] > [2]. Only response the ranking results, do not say any word or explain.'}]
# Get ChatGPT predicted permutation permutation = run_llm( messages, api_key='Your OPENAI Key!', model_name='gpt-3.5-turbo' )
'[1] > [3] > [2]'
# Use permutation to re-rank the passage item = receive_permutation( item, permutation, rank_start=0, rank_end=3 )
{'query': 'How much impact do masks have on preventing the spread of the COVID-19?', 'hits': [{'content': 'Title: Universal Masking is Urgent in the COVID-19 Pandemic: SEIR and Agent Based Models, Empirical Validation, Policy Recommendations Content: We present two models for the COVID-19 pandemic predicting the impact of universal face mask wearing upon the spread of the SARS-CoV-2 virus--one employing a stochastic dynamic network based compartmental SEIR (susceptible-exposed-infectious-recovered) approach, and the other employing individual ABM (agent-based modelling) Monte Carlo simulation--indicating (1) significant impact under (near) universal masking when at least 80% of a population is wearing masks, versus minimal impact when only 50% or less of the population is wearing masks, and (2) significant impact when universal masking is adopted early, by Day 50 of a regional outbreak, versus minimal impact when universal masking is adopted late. These effects hold even at the lower filtering rates of homemade masks. To validate these theoretical models, we compare their predictions against a new empirical data set we have collected'}, {'content': 'Title: To mask or not to mask: Modeling the potential for face mask use by the general public to curtail the COVID-19 pandemic Content: Face mask use by the general public for limiting the spread of the COVID-19 pandemic is controversial, though increasingly recommended, and the potential of this intervention is not well understood. We develop a compartmental model for assessing the community-wide impact of mask use by the general, asymptomatic public, a portion of which may be asymptomatically infectious. Model simulations, using data relevant to COVID-19 dynamics in the US states of New York and Washington, suggest that broad adoption of even relatively ineffective face masks may meaningfully reduce community transmission of COVID-19 and decrease peak hospitalizations and deaths. Moreover, mask use decreases the effective transmission rate in nearly linear proportion to the product of mask effectiveness (as a fraction of potentially infectious contacts blocked) and coverage rate (as'}, {'content': 'Title: Masking the general population might attenuate COVID-19 outbreaks Content: The effect of masking the general population on a COVID-19 epidemic is estimated by computer simulation using two separate state-of-the-art web-based softwares, one of them calibrated for the SARS-CoV-2 virus. The questions addressed are these: 1. Can mask use by the general population limit the spread of SARS-CoV-2 in a country? 2. What types of masks exist, and how elaborate must a mask be to be effective against COVID-19? 3. Does the mask have to be applied early in an epidemic? 4. A brief general discussion of masks and some possible future research questions regarding masks and SARS-CoV-2. Results are as follows: (1) The results indicate that any type of mask, even simple home-made ones, may be effective. Masks use seems to have an effect in lowering new patients even the protective effect of each mask (here dubbed"one-mask protection") is'}]}
在此示例中,滑動窗口的大小為2,步驟大小為1,這意味著它一次處理兩個文檔,將一個文檔移動到下一個排名。
結論from rank_gpt import sliding_windows api_key = "Your OPENAI Key" new_item = sliding_windows( item, rank_start=0, rank_end=3, window_size=2, step=1, model_name='gpt-3.5-turbo', api_key=api_key ) print(new_item)
這解決了常見問題,例如確保內容在點上,提高效率並降低產生誤導信息的可能性。
總體而言,RankGPT有助於構建更可靠,更準確的RAG應用程序。以上是作為抹布(教程)的重新排列代理。的詳細內容。更多資訊請關注PHP中文網其他相關文章!