강력한 LLM 애플리케이션 구축을 위한 필수 실습-파이썬 튜토리얼-php.cn

집

백엔드 개발

파이썬 튜토리얼

강력한 LLM 애플리케이션 구축을 위한 필수 실습

WBOYWBOYWBOYWBOYWBOYWBOYWBOYWBOYWBOYWBOYWBOYWBOYWB

Jul 28, 2024 am 11:22 AM

Essential Practices for Building Robust LLM Applications

소개

저는 클라우드에서 LLM 애플리케이션을 구축해 왔습니다. 또한 MVP나 프로토타입에 매우 적합하고 프로덕션 준비를 위해 약간의 작업이 필요한 LLM 앱을 만드는 많은 개발자를 보았습니다. 나열된 사례 중 하나 이상을 적용하면 애플리케이션을 효과적으로 확장하는 데 도움이 될 수 있습니다. 이 기사에서는 애플리케이션 개발의 전체 소프트웨어 엔지니어링 측면을 다루지 않고 LLM 래퍼 애플리케이션에 대해서만 다룹니다. 또한 코드 조각은 Python으로 되어 있으며 다른 언어에도 동일한 논리를 적용할 수 있습니다.

1. 유연성을 위해 미들웨어 활용

LiteLLM 또는 LangChain과 같은 미들웨어를 사용하면 공급업체 종속을 방지하고 진화하는 모델 간에 쉽게 전환할 수 있습니다.

파이썬:

from litellm import completion

response = completion(
    model="gpt-3.5-turbo", 
    messages=[{"role": "user", "content": "Hello, how are you?"}]
)

로그인 후 복사

LiteLLM 및 LangChain과 같은 미들웨어 솔루션은 애플리케이션과 다양한 LLM 제공자 간의 추상화 계층을 제공합니다. 이 추상화를 사용하면 핵심 애플리케이션 코드를 변경하지 않고도 다양한 모델이나 공급자 간에 쉽게 전환할 수 있습니다. AI 환경이 빠르게 발전함에 따라 향상된 기능을 갖춘 새로운 모델이 자주 출시됩니다. 미들웨어를 사용하면 성능, 비용 또는 기능 요구 사항에 따라 이러한 새로운 모델을 신속하게 채택하거나 공급자를 전환하여 애플리케이션을 최신 상태로 유지하고 경쟁력을 유지할 수 있습니다.

2. 재시도 메커니즘 구현

API 호출에 재시도 논리를 구현하여 속도 제한 문제를 방지하세요.

파이썬:

import time
from openai import OpenAI

client = OpenAI()

def retry_api_call(max_retries=3, delay=1):
    for attempt in range(max_retries):
        try:
            response = client.chat.completions.create(
                model="gpt-3.5-turbo",
                messages=[{"role": "user", "content": "Hello!"}]
            )
            return response
        except Exception as e:
            if attempt == max_retries - 1:
                raise e
            time.sleep(delay * (2 ** attempt))  # Exponential backoff

로그인 후 복사

LLM 제공업체는 남용을 방지하고 공정한 사용을 보장하기 위해 요율 제한을 적용하는 경우가 많습니다. 지수 백오프를 사용하여 재시도 메커니즘을 구현하면 애플리케이션이 일시적인 오류나 속도 제한 오류를 원활하게 처리하는 데 도움이 됩니다. 이 접근 방식은 실패한 요청을 자동으로 재시도하여 애플리케이션의 안정성을 높이고 일시적인 문제로 인한 서비스 중단 가능성을 줄입니다. 지수 백오프 전략(재시도 간 지연 증가)은 즉각적인 재요청으로 인해 API가 과부하되어 속도 제한 문제가 악화되는 것을 방지하는 데 도움이 됩니다.

3. LLM 공급자 대체 설정

단일 LLM 제공업체에만 의존하지 마세요. 할당량 문제 또는 서비스 중단을 처리하기 위해 대체를 구현합니다.

from litellm import completion

def get_llm_response(prompt):
    providers = ['openai/gpt-3.5-turbo', 'anthropic/claude-2', 'cohere/command-nightly']
    for provider in providers:
        try:
            response = completion(model=provider, messages=[{"role": "user", "content": prompt}])
            return response
        except Exception as e:
            print(f"Error with {provider}: {str(e)}")
            continue
    raise Exception("All LLM providers failed")

로그인 후 복사

단일 LLM 제공업체에 의존하면 해당 제공업체에 다운타임이 발생하거나 할당량 한도에 도달하는 경우 서비스가 중단될 수 있습니다. 대체 옵션을 구현하면 애플리케이션의 지속적인 작동이 보장됩니다. 또한 이 접근 방식을 사용하면 다양한 작업에 대해 다양한 공급자나 모델의 장점을 활용할 수 있습니다. LiteLLM은 여러 공급자를 위한 통합 인터페이스를 제공하여 이 프로세스를 단순화하고 공급자 간 전환이나 대체 논리 구현을 더 쉽게 만듭니다.

4. 관찰 가능성 구현

LLM 추적 및 디버깅을 위해 Langfuse 또는 Helicone과 같은 도구를 사용하세요.

from langfuse.openai import OpenAI

client = OpenAI(
    api_key="your-openai-api-key",
    langfuse_public_key="your-langfuse-public-key",
    langfuse_secret_key="your-langfuse-secret-key"
)

response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "Hello, AI!"}]
)

로그인 후 복사

관측성 구현의 장점:

향상된 디버깅: 대화를 쉽게 추적하고 재생하여 문제를 식별합니다.
성능 최적화: 응답 시간과 모델 성능에 대한 통찰력을 얻으세요.
비용 관리: 더 나은 예산 관리를 위해 토큰 사용량 및 관련 비용을 추적합니다.
품질 보증: 응답의 품질을 모니터링하고 개선이 필요한 영역을 식별합니다.
사용자 경험 분석: 사용자 상호 작용을 이해하고 이에 따라 프롬프트를 최적화합니다.
규정 준수 및 감사: 규정 준수 및 내부 감사에 대한 로그를 유지합니다.
이상 탐지: 비정상적인 패턴이나 행동을 빠르게 식별하고 대응합니다.

관찰 도구는 LLM 애플리케이션의 성능, 사용 패턴 및 잠재적인 문제에 대한 중요한 통찰력을 제공합니다. LLM과의 상호 작용을 실시간으로 모니터링 및 분석하여 프롬프트를 최적화하고 병목 현상을 식별하며 AI 생성 응답의 품질을 보장할 수 있습니다. 이러한 수준의 가시성은 시간이 지남에 따라 애플리케이션을 유지 관리, 디버깅 및 개선하는 데 필수적입니다.

5. 효과적인 프롬프트 관리

코드나 텍스트 파일에 프롬프트를 하드코딩하는 대신 버전 관리 기능이 있는 프롬프트 관리 도구를 사용하세요.

from promptflow import PromptFlow

pf = PromptFlow()

prompt_template = pf.get_prompt("greeting_prompt", version="1.2")
response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": prompt_template.format(name="Alice")}]
)

로그인 후 복사

LLM 지원서를 유지하고 개선하려면 효과적인 신속한 관리가 중요합니다. 전용 프롬프트 관리 도구를 사용하면 프롬프트 버전을 관리하고, 다양한 변형을 A/B 테스트하고, 애플리케이션 전체에서 쉽게 업데이트할 수 있습니다. 이 접근 방식은 애플리케이션 코드에서 프롬프트 논리를 분리하므로 핵심 애플리케이션을 변경하지 않고도 프롬프트에서 더 쉽게 반복할 수 있습니다. 또한 기술 전문가가 아닌 팀원도 신속한 개선에 기여하고 AI 상호 작용을 개선하는 데 있어 더 나은 협업을 가능하게 합니다.

6. Store Conversation History Persistently

Use a persistent cache like Redis for storing conversation history instead of in-memory cache which is not adapted for distributed systems.

from langchain.memory import RedisChatMessageHistory
from langchain.chains import ConversationChain
from langchain.llms import OpenAI

# Initialize Redis chat message history
message_history = RedisChatMessageHistory(url="redis://localhost:6379/0", ttl=600, session_id="user-123")

# Create a conversation chain with Redis memory
conversation = ConversationChain(
    llm=OpenAI(),
    memory=message_history,
    verbose=True
)

# Use the conversation
response = conversation.predict(input="Hi there!")
print(response)

# The conversation history is automatically stored in Redis

로그인 후 복사

Storing conversation history is essential for maintaining context in ongoing interactions and providing personalized experiences. Using a persistent cache like Redis, especially in distributed systems, ensures that conversation history is reliably stored and quickly accessible. This approach allows your application to scale horizontally while maintaining consistent user experiences across different instances or servers. The use of Redis with LangChain simplifies the integration of persistent memory into your conversational AI system, making it easier to build stateful, context-aware applications.

7. Use JSON Mode whenever possible

Whenever possible like extracting structured information, provide a JSON schema instead of relying on raw text output.

import openai

response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo-1106",
    response_format={"type": "json_object"},
    messages=[
        {"role": "system", "content": "Extract the name and age from the user's input."},
        {"role": "user", "content": "My name is John and I'm 30 years old."}
    ]
)

print(response.choices[0].message.content)
# Output: {"name": "John", "age": 30}

로그인 후 복사

Using JSON mode for information extraction provides a structured and consistent output format, making it easier to parse and process the LLM's responses in your application. This approach reduces the need for complex post-processing of free-form text and minimizes the risk of misinterpretation. It's particularly useful for tasks like form filling, data extraction from unstructured text, or any scenario where you need to integrate AI-generated content into existing data structures or databases.

8. Set Up Credit Alerts

Implement alerts for prepaid credits and per-user credit checks, even in MVP stages.

def check_user_credits(user_id, requested_tokens):
    user_credits = get_user_credits(user_id)
    if user_credits < requested_tokens:
        raise InsufficientCreditsError(f"User {user_id} has insufficient credits")

    remaining_credits = user_credits - requested_tokens
    if remaining_credits < CREDIT_ALERT_THRESHOLD:
        send_low_credit_alert(user_id, remaining_credits)

    return True

로그인 후 복사

Implementing credit alerts and per-user credit checks is crucial for managing costs and ensuring fair usage in your LLM application. This system helps prevent unexpected expenses and allows you to proactively manage user access based on their credit limits. By setting up alerts at multiple thresholds, you can inform users or administrators before credits are depleted, ensuring uninterrupted service. This approach is valuable even in MVP stages, as it helps you understand usage patterns and plan for scaling your application effectively.

9. Implement Feedback Loops

Create mechanisms for users to provide feedback on AI responses, starting with simple thumbs up/down ratings.

def process_user_feedback(response_id, feedback):
    if feedback == 'thumbs_up':
        log_positive_feedback(response_id)
    elif feedback == 'thumbs_down':
        log_negative_feedback(response_id)
        trigger_improvement_workflow(response_id)

# In your API endpoint
@app.route('/feedback', methods=['POST'])
def submit_feedback():
    data = request.json
    process_user_feedback(data['response_id'], data['feedback'])
    return jsonify({"status": "Feedback received"})

로그인 후 복사

Implementing feedback loops is essential for continuously improving your LLM application. By allowing users to provide feedback on AI responses, you can identify areas where the model performs well and where it needs improvement. This data can be used to fine-tune models, adjust prompts, or implement additional safeguards. Starting with simple thumbs up/down ratings provides an easy way for users to give feedback, while more detailed feedback options can be added later for deeper insights. This approach helps in building trust with users and demonstrates your commitment to improving the AI's performance based on real-world usage.

10. Implement Guardrails

Use prompt guards to check for prompt injection attacks, toxic content, and off-topic responses.

import re
from better_profanity import profanity

def check_prompt_injection(input_text):
    injection_patterns = [
        r"ignore previous instructions",
        r"disregard all prior commands",
        r"override system prompt"
    ]
    for pattern in injection_patterns:
        if re.search(pattern, input_text, re.IGNORECASE):
            return True
    return False

def check_toxic_content(input_text):
    return profanity.contains_profanity(input_text)

def sanitize_input(input_text):
    if check_prompt_injection(input_text):
        raise ValueError("Potential prompt injection detected")

    if check_toxic_content(input_text):
        raise ValueError("Toxic content detected")

    # Additional checks can be added here (e.g., off-topic detection)

    return input_text  # Return sanitized input if all checks pass

# Usage
try:
    safe_input = sanitize_input(user_input)
    # Process safe_input with your LLM
except ValueError as e:
    print(f"Input rejected: {str(e)}")

로그인 후 복사

Implementing guardrails is crucial for ensuring the safety and reliability of your LLM application. This example demonstrates how to check for potential prompt injection attacks and toxic content. Prompt injection attacks attempt to override or bypass the system's intended behavior, while toxic content checks help maintain a safe and respectful environment. By implementing these checks, you can prevent malicious use of your AI system and ensure that the content generated aligns with your application's guidelines and ethical standards. Additional checks can be added to detect off-topic responses or other unwanted behaviors, further enhancing the robustness of your application.

Conclusion

All the above listed points can be easily integrated into your application and they prepare you better for scaling in production. You may also agree or disagree on some of the above points. In any case, feel free to post your questions or comments.

위 내용은 강력한 LLM 애플리케이션 구축을 위한 필수 실습의 상세 내용입니다. 자세한 내용은 PHP 중국어 웹사이트의 기타 관련 기사를 참조하세요!

본 웹사이트의 성명

본 글의 내용은 네티즌들의 자발적인 기여로 작성되었으며, 저작권은 원저작자에게 있습니다. 본 사이트는 이에 상응하는 법적 책임을 지지 않습니다. 표절이나 침해가 의심되는 콘텐츠를 발견한 경우 admin@php.cn으로 문의하세요.

핫 AI 도구

Undresser.AI Undress

사실적인 누드 사진을 만들기 위한 AI 기반 앱

AI Clothes Remover

사진에서 옷을 제거하는 온라인 AI 도구입니다.

Undress AI Tool

무료로 이미지를 벗다

Clothoff.io

AI 옷 제거제

Video Face Swap

완전히 무료인 AI 얼굴 교환 도구를 사용하여 모든 비디오의 얼굴을 쉽게 바꾸세요!

뜨거운 도구

메모장++7.3.1

사용하기 쉬운 무료 코드 편집기

SublimeText3 중국어 버전

중국어 버전, 사용하기 매우 쉽습니다.

스튜디오 13.0.1 보내기

강력한 PHP 통합 개발 환경

드림위버 CS6

시각적 웹 개발 도구

SublimeText3 Mac 버전

신 수준의 코드 편집 소프트웨어(SublimeText3)

뜨거운 주제

자바 튜토리얼

1666

Cakephp 튜토리얼

1426

라라벨 튜토리얼

1328

PHP 튜토리얼

1273

C# 튜토리얼

1253

Related knowledge

파이썬 : 게임, Guis 등 Apr 13, 2025 am 12:14 AM

Python은 게임 및 GUI 개발에서 탁월합니다. 1) 게임 개발은 Pygame을 사용하여 드로잉, 오디오 및 기타 기능을 제공하며 2D 게임을 만드는 데 적합합니다. 2) GUI 개발은 Tkinter 또는 PYQT를 선택할 수 있습니다. Tkinter는 간단하고 사용하기 쉽고 PYQT는 풍부한 기능을 가지고 있으며 전문 개발에 적합합니다.

Python vs. C : 학습 곡선 및 사용 편의성 Apr 19, 2025 am 12:20 AM

Python은 배우고 사용하기 쉽고 C는 더 강력하지만 복잡합니다. 1. Python Syntax는 간결하며 초보자에게 적합합니다. 동적 타이핑 및 자동 메모리 관리를 사용하면 사용하기 쉽지만 런타임 오류가 발생할 수 있습니다. 2.C는 고성능 응용 프로그램에 적합한 저수준 제어 및 고급 기능을 제공하지만 학습 임계 값이 높고 수동 메모리 및 유형 안전 관리가 필요합니다.

파이썬과 시간 : 공부 시간을 최대한 활용 Apr 14, 2025 am 12:02 AM

제한된 시간에 Python 학습 효율을 극대화하려면 Python의 DateTime, Time 및 Schedule 모듈을 사용할 수 있습니다. 1. DateTime 모듈은 학습 시간을 기록하고 계획하는 데 사용됩니다. 2. 시간 모듈은 학습과 휴식 시간을 설정하는 데 도움이됩니다. 3. 일정 모듈은 주간 학습 작업을 자동으로 배열합니다.

Python vs. C : 성능과 효율성 탐색 Apr 18, 2025 am 12:20 AM

Python은 개발 효율에서 C보다 낫지 만 C는 실행 성능이 높습니다. 1. Python의 간결한 구문 및 풍부한 라이브러리는 개발 효율성을 향상시킵니다. 2.C의 컴파일 유형 특성 및 하드웨어 제어는 실행 성능을 향상시킵니다. 선택할 때는 프로젝트 요구에 따라 개발 속도 및 실행 효율성을 평가해야합니다.

Python Standard Library의 일부는 무엇입니까? 목록 또는 배열은 무엇입니까? Apr 27, 2025 am 12:03 AM

Pythonlistsarepartoftsandardlardlibrary, whileraysarenot.listsarebuilt-in, 다재다능하고, 수집 할 수있는 반면, arraysarreprovidedByTearRaymoduledlesscommonlyusedDuetolimitedFunctionality.

파이썬 : 자동화, 스크립팅 및 작업 관리 Apr 16, 2025 am 12:14 AM

파이썬은 자동화, 스크립팅 및 작업 관리가 탁월합니다. 1) 자동화 : 파일 백업은 OS 및 Shutil과 같은 표준 라이브러리를 통해 실현됩니다. 2) 스크립트 쓰기 : PSUTIL 라이브러리를 사용하여 시스템 리소스를 모니터링합니다. 3) 작업 관리 : 일정 라이브러리를 사용하여 작업을 예약하십시오. Python의 사용 편의성과 풍부한 라이브러리 지원으로 인해 이러한 영역에서 선호하는 도구가됩니다.

Python 학습 : 2 시간의 일일 연구가 충분합니까? Apr 18, 2025 am 12:22 AM

하루에 2 시간 동안 파이썬을 배우는 것으로 충분합니까? 목표와 학습 방법에 따라 다릅니다. 1) 명확한 학습 계획을 개발, 2) 적절한 학습 자원 및 방법을 선택하고 3) 실습 연습 및 검토 및 통합 연습 및 검토 및 통합,이 기간 동안 Python의 기본 지식과 고급 기능을 점차적으로 마스터 할 수 있습니다.

Python vs. C : 주요 차이점 이해 Apr 21, 2025 am 12:18 AM

Python과 C는 각각 고유 한 장점이 있으며 선택은 프로젝트 요구 사항을 기반으로해야합니다. 1) Python은 간결한 구문 및 동적 타이핑으로 인해 빠른 개발 및 데이터 처리에 적합합니다. 2) C는 정적 타이핑 및 수동 메모리 관리로 인해 고성능 및 시스템 프로그래밍에 적합합니다.

See all articles

강력한 LLM 애플리케이션 구축을 위한 필수 실습

소개

1. 유연성을 위해 미들웨어 활용

2. 재시도 메커니즘 구현

3. LLM 공급자 대체 설정

4. 관찰 가능성 구현

5. 효과적인 프롬프트 관리

6. Store Conversation History Persistently

7. Use JSON Mode whenever possible

8. Set Up Credit Alerts

9. Implement Feedback Loops

10. Implement Guardrails

Conclusion

핫 AI 도구

Undresser.AI Undress

AI Clothes Remover

Undress AI Tool

Clothoff.io

Video Face Swap

인기 기사

뜨거운 도구

메모장++7.3.1

SublimeText3 중국어 버전

스튜디오 13.0.1 보내기

드림위버 CS6

SublimeText3 Mac 버전

뜨거운 주제