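Retrieval-augmented generation (RAG) is a technique that makes large language models (LLMs) smarter and more accurate by letting them draw on external information when generating text.
A major challenge, however, is selecting the right documents or passages from a huge collection of data. RankGPT addresses this problem by improving the re-ranking step of the RAG pipeline. It uses the deep language understanding of LLMs to better assess which pieces of information are most relevant and to (re-)rank them accordingly. This article introduces RankGPT and shows how to integrate it into RAG AI applications.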
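Retrieval-augmented generation (RAG) combines LLMs with information retrieval systems. When an LLM is asked to generate text, it can pull relevant information from external sources, which makes its responses more accurate and better informed. RAG consists of two main components, a retriever and a generator, plus an optional component, the reranker.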
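Retriever - The retriever's job is to find relevant documents or text segments in a large document set based on the user's query. It ranks documents by relevance using algorithms such as BM25.
Reranker (optional) - The reranker takes the initial set of retrieved documents and reorders them so that the most relevant ones are at the top. This helps filter out less useful information and focus on what matters.
Generator - The generator takes the query together with the retrieved (and optionally reranked) documents and produces the final response.
The role and benefits of RankGPT in RAG
RankGPT uses LLMs to assess the relevance of retrieved documents or text segments and makes sure the most important ones are ranked at the top. With RankGPT in place, the generator in the RAG pipeline receives higher-quality input, which leads to more accurate responses.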
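The retrieve, rerank, generate flow described above can be sketched in a few lines of Python. This is only an illustrative sketch: the whitespace tokenizer, the BM25 parameters `k1` and `b`, and the helper names `bm25_retrieve` and `build_context` are assumptions made for this example and are not part of RankGPT. The reranker is passed in as a plain callable, so an LLM-based reranker could be plugged in at that point.

```python
import math
from collections import Counter
from typing import Callable, List, Optional


def tokenize(text: str) -> List[str]:
    """Lowercase whitespace tokenizer (a deliberate simplification)."""
    return text.lower().split()


def bm25_retrieve(query: str, docs: List[str], top_k: int = 3,
                  k1: float = 1.5, b: float = 0.75) -> List[str]:
    """Return the top_k documents ranked by a basic Okapi BM25 score."""
    if not docs:
        return []
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(docs)
    avg_len = sum(len(t) for t in tokenized) / n_docs
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for toks in tokenized for term in set(toks))
    query_terms = tokenize(query)

    def score(toks: List[str]) -> float:
        tf = Counter(toks)
        total = 0.0
        for term in query_terms:
            if term not in tf:
                continue
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(toks) / avg_len)
            total += idf * tf[term] * (k1 + 1) / norm
        return total

    ranked = sorted(range(n_docs), key=lambda i: score(tokenized[i]), reverse=True)
    return [docs[i] for i in ranked[:top_k]]


Reranker = Callable[[str, List[str]], List[str]]


def build_context(query: str, docs: List[str],
                  reranker: Optional[Reranker] = None,
                  top_k: int = 3, keep: int = 2) -> List[str]:
    """Retrieve candidates, optionally rerank them, and keep the best few
    as input for the generator."""
    candidates = bm25_retrieve(query, docs, top_k=top_k)
    if reranker is not None:
        candidates = reranker(query, candidates)
    return candidates[:keep]
```

Keeping the reranker behind a simple `(query, passages) -> passages` interface means the retriever and generator do not need to change when the re-ranking method does.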
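Improved relevance and performance
RankGPT's ranking ability can also be distilled into smaller, specialized models. These smaller models are far more efficient while still maintaining high performance. For example, a distilled 440M model outperformed a 3B supervised model on the BEIR benchmark, achieving better results at a much lower computational cost.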
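Handling new and unknown information
On a test built from recent, previously unseen information (NovelEval, covered in the benchmark results), GPT-4 achieved state-of-the-art performance, demonstrating its ability to handle new and unseen queries effectively.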
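RankGPT benchmark performance
RankGPT (GPT-4) outperforms all other models on TREC and BEIR, with an average NDCG@10 score of 53.68, as shown in the table below. It achieved the best results on the BEIR datasets, beating strong supervised models such as monoT5 (3B) and Cohere Rerank-v2. Even with GPT-3.5-turbo, RankGPT achieves competitive scores, which shows that it is a highly effective reranker.
Source: Weiwei Sun et al., 2023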
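Since every result in this section is reported as NDCG@k, a small worked example of the metric may help. The sketch below uses one common formulation (exponential gain with a log2 position discount); gain conventions differ between evaluation tools, and for simplicity the ideal ranking here is computed only from the relevance labels in the list being scored, which is an assumption of this example. Papers usually report the value multiplied by 100.

```python
import math
from typing import Sequence


def dcg_at_k(relevances: Sequence[float], k: int) -> float:
    """Discounted cumulative gain over the first k ranked items."""
    return sum((2 ** rel - 1) / math.log2(rank + 2)
               for rank, rel in enumerate(relevances[:k]))


def ndcg_at_k(relevances: Sequence[float], k: int = 10) -> float:
    """DCG normalised by the DCG of the best possible ordering."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0


# Relevance labels of the passages, in the order a reranker returned them.
perfect = ndcg_at_k([3, 2, 1, 0])   # already in the ideal order
swapped = ndcg_at_k([0, 1], k=2)    # the relevant passage is ranked second
print(round(perfect, 4), round(swapped, 4))
```

A reranker improves NDCG@10 whenever it moves highly relevant passages closer to the top of the list, because gains at later ranks are discounted more heavily.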
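RankGPT (GPT-4) also performs strongly on the Mr.TyDi dataset, leading with an average NDCG@10 score of 62.93 and beating both BM25 and mmarcoCE. It consistently outperforms BM25 and surpasses mmarcoCE in many languages, most notably Indonesian and Swahili.
Overall, RankGPT achieved the top score in many languages, including Bengali, Indonesian, and Japanese.
Source: Weiwei Sun et al., 2023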
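Finally, RankGPT was tested on the NovelEval dataset, which measures whether a model can rank passages based on recent, unfamiliar information. RankGPT (GPT-4) scored highest on all evaluation metrics (NDCG@1, NDCG@5, and NDCG@10), with an NDCG@10 score of 90.45 in particular. It outperformed other strong models such as monoT5 (3B) and monoBERT (340M).
Across all benchmark results, RankGPT (GPT-4) consistently outperforms other methods, whether supervised or unsupervised, demonstrating its strong re-ranking ability.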
Implementing RankGPT in a RAG pipeline
Here is how to integrate RankGPT into a RAG pipeline.
Step 1: Clone the RankGPT repository
git clone https://github.com/sunnweiwei/RankGPT
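Move into the RankGPT directory and install the required packages. It is recommended to create a virtual environment first and then install the packages from the provided requirements file.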
pip install -r requirements.txt
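With the provided permutation pipeline, you can easily rerank retrieved documents with RankGPT. Start from an item that holds the query and the retrieved passages: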
item = {
    'query': 'How much impact do masks have on preventing the spread of the COVID-19?',
    'hits': [
        {'content': 'Title: Universal Masking is Urgent in the COVID-19 Pandemic: SEIR and Agent Based Models, Empirical Validation, Policy Recommendations Content: We present two models for the COVID-19 pandemic predicting the impact of universal face mask wearing upon the spread of the SARS-CoV-2 virus--one employing a stochastic dynamic network based compartmental SEIR (susceptible-exposed-infectious-recovered) approach, and the other employing individual ABM (agent-based modelling) Monte Carlo simulation--indicating (1) significant impact under (near) universal masking when at least 80% of a population is wearing masks, versus minimal impact when only 50% or less of the population is wearing masks, and (2) significant impact when universal masking is adopted early, by Day 50 of a regional outbreak, versus minimal impact when universal masking is adopted late. These effects hold even at the lower filtering rates of homemade masks. To validate these theoretical models, we compare their predictions against a new empirical data set we have collected'},
        {'content': 'Title: Masking the general population might attenuate COVID-19 outbreaks Content: The effect of masking the general population on a COVID-19 epidemic is estimated by computer simulation using two separate state-of-the-art web-based softwares, one of them calibrated for the SARS-CoV-2 virus. The questions addressed are these: 1. Can mask use by the general population limit the spread of SARS-CoV-2 in a country? 2. What types of masks exist, and how elaborate must a mask be to be effective against COVID-19? 3. Does the mask have to be applied early in an epidemic? 4. A brief general discussion of masks and some possible future research questions regarding masks and SARS-CoV-2. Results are as follows: (1) The results indicate that any type of mask, even simple home-made ones, may be effective. Masks use seems to have an effect in lowering new patients even the protective effect of each mask (here dubbed"one-mask protection") is'},
        {'content': 'Title: To mask or not to mask: Modeling the potential for face mask use by the general public to curtail the COVID-19 pandemic Content: Face mask use by the general public for limiting the spread of the COVID-19 pandemic is controversial, though increasingly recommended, and the potential of this intervention is not well understood. We develop a compartmental model for assessing the community-wide impact of mask use by the general, asymptomatic public, a portion of which may be asymptomatically infectious. Model simulations, using data relevant to COVID-19 dynamics in the US states of New York and Washington, suggest that broad adoption of even relatively ineffective face masks may meaningfully reduce community transmission of COVID-19 and decrease peak hospitalizations and deaths. Moreover, mask use decreases the effective transmission rate in nearly linear proportion to the product of mask effectiveness (as a fraction of potentially infectious contacts blocked) and coverage rate (as'}
    ]
}
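In a real pipeline, the `item` dictionary would usually be assembled from retriever output rather than typed by hand. A minimal sketch, assuming the retriever returns passages as plain strings; `build_item` is a hypothetical helper name chosen for this example and is not part of RankGPT:

```python
from typing import Any, Dict, List


def build_item(query: str, passages: List[str]) -> Dict[str, Any]:
    """Wrap a query and its retrieved passages in the
    {'query': ..., 'hits': [{'content': ...}, ...]} layout shown above."""
    return {'query': query, 'hits': [{'content': p} for p in passages]}


example = build_item(
    'How much impact do masks have on preventing the spread of the COVID-19?',
    ['Title: ... Content: ...', 'Title: ... Content: ...'],  # placeholder passages
)
print(len(example['hits']))
```

The order of `passages` is the retriever's initial ranking, which is the ranking the reranker then revises.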
Then pass the item to the permutation pipeline:
from rank_gpt import permutation_pipeline

new_item = permutation_pipeline(
    item,
    rank_start=0,
    rank_end=3,
    model_name='gpt-3.5-turbo',
    api_key='Your OPENAI Key!'
)
print(new_item)
This results in the following new order of documents:
{
    'query': 'How much impact do masks have on preventing the spread of the COVID-19?',
    'hits': [
        {'content': 'Title: Universal Masking is Urgent in the COVID-19 Pandemic: SEIR and Agent Based Models, Empirical Validation, Policy Recommendations Content: We present two models for the COVID-19 pandemic predicting the impact of universal face mask wearing upon the spread of the SARS-CoV-2 virus--one employing a stochastic dynamic network based compartmental SEIR (susceptible-exposed-infectious-recovered) approach, and the other employing individual ABM (agent-based modelling) Monte Carlo simulation--indicating (1) significant impact under (near) universal masking when at least 80% of a population is wearing masks, versus minimal impact when only 50% or less of the population is wearing masks, and (2) significant impact when universal masking is adopted early, by Day 50 of a regional outbreak, versus minimal impact when universal masking is adopted late. These effects hold even at the lower filtering rates of homemade masks. To validate these theoretical models, we compare their predictions against a new empirical data set we have collected'},
        {'content': 'Title: To mask or not to mask: Modeling the potential for face mask use by the general public to curtail the COVID-19 pandemic Content: Face mask use by the general public for limiting the spread of the COVID-19 pandemic is controversial, though increasingly recommended, and the potential of this intervention is not well understood. We develop a compartmental model for assessing the community-wide impact of mask use by the general, asymptomatic public, a portion of which may be asymptomatically infectious. Model simulations, using data relevant to COVID-19 dynamics in the US states of New York and Washington, suggest that broad adoption of even relatively ineffective face masks may meaningfully reduce community transmission of COVID-19 and decrease peak hospitalizations and deaths. Moreover, mask use decreases the effective transmission rate in nearly linear proportion to the product of mask effectiveness (as a fraction of potentially infectious contacts blocked) and coverage rate (as'},
        {'content': 'Title: Masking the general population might attenuate COVID-19 outbreaks Content: The effect of masking the general population on a COVID-19 epidemic is estimated by computer simulation using two separate state-of-the-art web-based softwares, one of them calibrated for the SARS-CoV-2 virus. The questions addressed are these: 1. Can mask use by the general population limit the spread of SARS-CoV-2 in a country? 2. What types of masks exist, and how elaborate must a mask be to be effective against COVID-19? 3. Does the mask have to be applied early in an epidemic? 4. A brief general discussion of masks and some possible future research questions regarding masks and SARS-CoV-2. Results are as follows: (1) The results indicate that any type of mask, even simple home-made ones, may be effective. Masks use seems to have an effect in lowering new patients even the protective effect of each mask (here dubbed"one-mask protection") is'}
    ]
}
For a more step-by-step implementation of the permutation pipeline, you can interact with RankGPT directly to create and process the permutation instruction, as follows:
from rank_gpt import (
    create_permutation_instruction,
    run_llm,
    receive_permutation
)

# Create permutation generation instruction
messages = create_permutation_instruction(
    item=item,
    rank_start=0,
    rank_end=3,
    model_name='gpt-3.5-turbo'
)
[{'role': 'system', 'content': 'You are RankGPT, an intelligent assistant that can rank passages based on their relevancy to the query.'},
 {'role': 'user', 'content': 'I will provide you with 3 passages, each indicated by number identifier []. \nRank the passages based on their relevance to query: How much impact do masks have on preventing the spread of the COVID-19?.'},
 {'role': 'assistant', 'content': 'Okay, please provide the passages.'},
 {'role': 'user', 'content': '[1] Title: Universal Masking is Urgent in the COVID-19 Pandemic: SEIR and Agent Based Models, Empirical Validation, Policy Recommendations Content: We present two models for the COVID-19 pandemic predicting the impact of universal face mask wearing upon the spread of the SARS-CoV-2 virus--one employing a stochastic dynamic network based compartmental SEIR (susceptible-exposed-infectious-recovered) approach, and the other employing individual ABM (agent-based modelling) Monte Carlo simulation--indicating (1) significant impact under (near) universal masking when at least 80% of a population is wearing masks, versus minimal impact when only 50% or less of the population is wearing masks, and (2) significant impact when universal masking is adopted early, by Day 50 of a regional outbreak, versus minimal impact when universal masking is adopted late. These effects hold even at the lower filtering rates of homemade masks. To validate these theoretical models, we compare their predictions against a new empirical data set we have collected'},
 {'role': 'assistant', 'content': 'Received passage [1].'},
 {'role': 'user', 'content': '[2] Title: Masking the general population might attenuate COVID-19 outbreaks Content: The effect of masking the general population on a COVID-19 epidemic is estimated by computer simulation using two separate state-of-the-art web-based softwares, one of them calibrated for the SARS-CoV-2 virus. The questions addressed are these: 1. Can mask use by the general population limit the spread of SARS-CoV-2 in a country? 2. What types of masks exist, and how elaborate must a mask be to be effective against COVID-19? 3. Does the mask have to be applied early in an epidemic? 4. A brief general discussion of masks and some possible future research questions regarding masks and SARS-CoV-2. Results are as follows: (1) The results indicate that any type of mask, even simple home-made ones, may be effective. Masks use seems to have an effect in lowering new patients even the protective effect of each mask (here dubbed"one-mask protection") is'},
 {'role': 'assistant', 'content': 'Received passage [2].'},
 {'role': 'user', 'content': '[3] Title: To mask or not to mask: Modeling the potential for face mask use by the general public to curtail the COVID-19 pandemic Content: Face mask use by the general public for limiting the spread of the COVID-19 pandemic is controversial, though increasingly recommended, and the potential of this intervention is not well understood. We develop a compartmental model for assessing the community-wide impact of mask use by the general, asymptomatic public, a portion of which may be asymptomatically infectious. Model simulations, using data relevant to COVID-19 dynamics in the US states of New York and Washington, suggest that broad adoption of even relatively ineffective face masks may meaningfully reduce community transmission of COVID-19 and decrease peak hospitalizations and deaths. Moreover, mask use decreases the effective transmission rate in nearly linear proportion to the product of mask effectiveness (as a fraction of potentially infectious contacts blocked) and coverage rate (as'},
 {'role': 'assistant', 'content': 'Received passage [3].'},
 {'role': 'user', 'content': 'Search Query: How much impact do masks have on preventing the spread of the COVID-19?. \nRank the 3 passages above based on their relevance to the search query. The passages should be listed in descending order using identifiers. The most relevant passages should be listed first. The output format should be [] > [], e.g., [1] > [2]. Only response the ranking results, do not say any word or explain.'}]
# Get ChatGPT predicted permutation
permutation = run_llm(
    messages,
    api_key='Your OPENAI Key!',
    model_name='gpt-3.5-turbo'
)
'[1] > [3] > [2]'
# Use permutation to re-rank the passage
item = receive_permutation(
    item,
    permutation,
    rank_start=0,
    rank_end=3
)
{
    'query': 'How much impact do masks have on preventing the spread of the COVID-19?',
    'hits': [
        {'content': 'Title: Universal Masking is Urgent in the COVID-19 Pandemic: SEIR and Agent Based Models, Empirical Validation, Policy Recommendations Content: We present two models for the COVID-19 pandemic predicting the impact of universal face mask wearing upon the spread of the SARS-CoV-2 virus--one employing a stochastic dynamic network based compartmental SEIR (susceptible-exposed-infectious-recovered) approach, and the other employing individual ABM (agent-based modelling) Monte Carlo simulation--indicating (1) significant impact under (near) universal masking when at least 80% of a population is wearing masks, versus minimal impact when only 50% or less of the population is wearing masks, and (2) significant impact when universal masking is adopted early, by Day 50 of a regional outbreak, versus minimal impact when universal masking is adopted late. These effects hold even at the lower filtering rates of homemade masks. To validate these theoretical models, we compare their predictions against a new empirical data set we have collected'},
        {'content': 'Title: To mask or not to mask: Modeling the potential for face mask use by the general public to curtail the COVID-19 pandemic Content: Face mask use by the general public for limiting the spread of the COVID-19 pandemic is controversial, though increasingly recommended, and the potential of this intervention is not well understood. We develop a compartmental model for assessing the community-wide impact of mask use by the general, asymptomatic public, a portion of which may be asymptomatically infectious. Model simulations, using data relevant to COVID-19 dynamics in the US states of New York and Washington, suggest that broad adoption of even relatively ineffective face masks may meaningfully reduce community transmission of COVID-19 and decrease peak hospitalizations and deaths. Moreover, mask use decreases the effective transmission rate in nearly linear proportion to the product of mask effectiveness (as a fraction of potentially infectious contacts blocked) and coverage rate (as'},
        {'content': 'Title: Masking the general population might attenuate COVID-19 outbreaks Content: The effect of masking the general population on a COVID-19 epidemic is estimated by computer simulation using two separate state-of-the-art web-based softwares, one of them calibrated for the SARS-CoV-2 virus. The questions addressed are these: 1. Can mask use by the general population limit the spread of SARS-CoV-2 in a country? 2. What types of masks exist, and how elaborate must a mask be to be effective against COVID-19? 3. Does the mask have to be applied early in an epidemic? 4. A brief general discussion of masks and some possible future research questions regarding masks and SARS-CoV-2. Results are as follows: (1) The results indicate that any type of mask, even simple home-made ones, may be effective. Masks use seems to have an effect in lowering new patients even the protective effect of each mask (here dubbed"one-mask protection") is'}
    ]
}
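To make the last step less of a black box, the sketch below shows one way a permutation string such as `'[1] > [3] > [2]'` could be turned into a new passage order by hand. It illustrates the idea only and is not RankGPT's own `receive_permutation` implementation; `parse_permutation` and `apply_permutation` are hypothetical names, and the handling of duplicate, out-of-range, or missing identifiers is an assumption of this example.

```python
import re
from typing import Any, Dict, List


def parse_permutation(permutation: str, n: int) -> List[int]:
    """Turn '[1] > [3] > [2]' into zero-based indices [0, 2, 1].

    Duplicate or out-of-range identifiers are ignored, and any passage
    the model failed to mention is appended in its original order.
    """
    order: List[int] = []
    for match in re.findall(r'\[(\d+)\]', permutation):
        idx = int(match) - 1
        if 0 <= idx < n and idx not in order:
            order.append(idx)
    missing = [i for i in range(n) if i not in order]
    return order + missing


def apply_permutation(item: Dict[str, Any], permutation: str) -> Dict[str, Any]:
    """Return a copy of item whose hits follow the given permutation."""
    hits = item['hits']
    order = parse_permutation(permutation, len(hits))
    return {**item, 'hits': [hits[i] for i in order]}


toy = {'query': 'q', 'hits': [{'content': 'a'}, {'content': 'b'}, {'content': 'c'}]}
print(apply_permutation(toy, '[1] > [3] > [2]')['hits'])
```

Falling back to the original order for anything the model omits keeps the reranked list the same length as the input, so a malformed LLM response cannot silently drop passages before they reach the generator.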