This tutorial demonstrates building a production-ready AI pull request reviewer using LLMOps best practices. The final application accepts a public PR URL and returns an AI-generated review.
Application Overview
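This tutorial covers the core review logic, observability with Agenta, an LLM playground, serving and evaluation, and deployment with a frontend.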
Core Logic
The AI assistant's workflow is simple: given a PR URL, it retrieves the diff from GitHub and submits it to an LLM for review.
GitHub diffs are accessed via:
<code>https://patch-diff.githubusercontent.com/raw/{owner}/{repo}/pull/{pr_number}.diff</code>
This Python function fetches the diff:
<code class="language-python">def get_pr_diff(pr_url):
    # ... (Code remains the same)
    return response.text</code>
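The body of get_pr_diff is elided above. As a rough illustration of the first step it needs, the sketch below maps a public PR page URL onto the raw diff URL pattern. The helper name build_diff_url and the regular expression are assumptions made for this example, not the tutorial's actual code, and only github.com URLs of the form /{owner}/{repo}/pull/{pr_number} are handled.

```python
import re

# Hypothetical helper (not from the tutorial): matches the start of a
# public PR page URL such as https://github.com/{owner}/{repo}/pull/{number}.
_PR_URL = re.compile(
    r"^https://github\.com/(?P<owner>[^/]+)/(?P<repo>[^/]+)/pull/(?P<number>\d+)"
)


def build_diff_url(pr_url: str) -> str:
    """Translate a PR page URL into the raw diff URL pattern shown above."""
    match = _PR_URL.match(pr_url.strip())
    if match is None:
        raise ValueError(f"Not a GitHub pull request URL: {pr_url!r}")
    return (
        "https://patch-diff.githubusercontent.com/raw/"
        f"{match['owner']}/{match['repo']}/pull/{match['number']}.diff"
    )
```

Trailing path segments such as /files are ignored because the pattern only anchors the start of the URL. The resulting URL can then be fetched with any HTTP client.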
LiteLLM facilitates LLM interactions, offering a consistent interface across various providers.
<code class="language-python">import litellm

prompt_system = """
You are an expert Python developer performing a file-by-file review of a pull request. You have access to the full diff of the file to understand the overall context and structure. However, focus on reviewing only the specific hunk provided.
"""

prompt_user = """
Here is the diff for the file:
{diff}

Please provide a critique of the changes made in this file.
"""

def generate_critique(pr_url: str):
    diff = get_pr_diff(pr_url)
    # Send the system and user prompts to the model through LiteLLM.
    response = litellm.completion(
        model="gpt-3.5-turbo",
        messages=[
            {"content": prompt_system, "role": "system"},
            {"content": prompt_user.format(diff=diff), "role": "user"},
        ],
    )
    return response.choices[0].message.content</code>
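The system prompt describes a file-by-file review, while generate_critique sends the whole diff in a single request. One possible way to get per-file input is to split the unified diff on its "diff --git" headers. The following is a minimal sketch under that assumption. The function name split_diff_by_file is hypothetical, and the path parsing is deliberately naive: it takes whatever follows the first " b/" in the header line.

```python
from typing import Dict, List


def split_diff_by_file(diff: str) -> Dict[str, str]:
    """Group the lines of a unified diff by the file they belong to."""
    chunks: Dict[str, List[str]] = {}
    current = None
    for line in diff.splitlines(keepends=True):
        if line.startswith("diff --git "):
            # Header looks like: diff --git a/path/to/file b/path/to/file
            current = line.rstrip("\n").split(" b/", 1)[-1]
            chunks[current] = []
        if current is not None:
            # Lines before the first header (if any) are dropped.
            chunks[current].append(line)
    return {path: "".join(lines) for path, lines in chunks.items()}
```

Each value could then be passed to the model separately, which may also help with diffs that are too large for a single request.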
Implementing Observability with Agenta
Agenta enhances observability, tracking inputs, outputs, and data flow for easier debugging.
Initialize Agenta and configure LiteLLM callbacks:
<code class="language-python">import agenta as ag
import litellm

ag.init()
litellm.callbacks = [ag.callbacks.litellm_handler()]</code>
Instrument functions with Agenta decorators:
<code class="language-python">@ag.instrument()
def generate_critique(pr_url: str):
    # ... (Code remains the same)
    return response.choices[0].message.content</code>
Set the AGENTA_API_KEY environment variable (obtained from Agenta) and, optionally, AGENTA_HOST for self-hosting.
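A missing API key is easier to diagnose before any tracing call is made. The sketch below checks the environment up front. The function name missing_agenta_settings is an assumption for illustration and is not part of the Agenta SDK. It treats only AGENTA_API_KEY as required, since AGENTA_HOST is described as optional.

```python
import os
from typing import List, Mapping


def missing_agenta_settings(env: Mapping[str, str] = os.environ) -> List[str]:
    """Return the names of required variables that are unset or empty."""
    required = ["AGENTA_API_KEY"]  # AGENTA_HOST is optional (self-hosting only)
    return [name for name in required if not env.get(name)]
```

A script could call this before ag.init() and stop with a clear message if the returned list is not empty.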
Creating an LLM Playground
Agenta's custom workflow feature provides an IDE-like playground for iterative development. The following code snippet demonstrates the configuration and integration with Agenta:
<code class="language-python">from pydantic import BaseModel, Field
from typing import Annotated
import agenta as ag
import litellm
from agenta.sdk.assets import supported_llm_models

# ... (previous code)

class Config(BaseModel):
    system_prompt: str = prompt_system
    user_prompt: str = prompt_user
    model: Annotated[str, ag.MultipleChoice(choices=supported_llm_models)] = Field(default="gpt-3.5-turbo")

@ag.route("/", config_schema=Config)
@ag.instrument()
def generate_critique(pr_url: str):
    diff = get_pr_diff(pr_url)
    config = ag.ConfigManager.get_from_route(schema=Config)
    response = litellm.completion(
        model=config.model,
        messages=[
            {"content": config.system_prompt, "role": "system"},
            {"content": config.user_prompt.format(diff=diff), "role": "user"},
        ],
    )
    return response.choices[0].message.content</code>
Serving and Evaluating with Agenta
Run <code>agenta init</code>, specifying the app name and API key, then run <code>agenta variant serve app.py</code>.
This makes the application accessible through Agenta's playground for end-to-end testing. LLM-as-a-judge is used for evaluation. The evaluator prompt is:
<code>You are an evaluator grading the quality of a PR review. CRITERIA: ... (criteria remain the same) ANSWER ONLY THE SCORE. DO NOT USE MARKDOWN. DO NOT PROVIDE ANYTHING OTHER THAN THE NUMBER</code>
The user prompt for the evaluator:
<code>https://patch-diff.githubusercontent.com/raw/{owner}/{repo}/pull/{pr_number}.diff</code>
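The evaluator prompt asks for a bare number, but a model may still wrap its answer in extra text. If the scores are post-processed in custom code, a strict parser makes such cases visible instead of silently mis-scoring them. The sketch below is one way to do that. The name parse_score is hypothetical, and no score range is enforced because the criteria are not shown here.

```python
import math


def parse_score(raw: str) -> float:
    """Convert the evaluator's reply into a number, rejecting anything else."""
    text = raw.strip()
    try:
        value = float(text)
    except ValueError:
        raise ValueError(f"Evaluator did not return a bare number: {raw!r}") from None
    if not math.isfinite(value):
        # float() accepts "nan" and "inf", which are not usable scores.
        raise ValueError(f"Evaluator returned a non-finite score: {raw!r}")
    return value
```

Replies that fail to parse can be logged and re-evaluated rather than counted as a score.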
Deployment and Frontend
Deployment is done through Agenta's UI.
A v0.dev frontend was used for rapid UI creation.
Next Steps and Conclusion
Future improvements include prompt refinement, incorporating full code context, and handling large diffs. This tutorial successfully demonstrates building, evaluating, and deploying a production-ready AI pull request reviewer using Agenta and LiteLLM.