Intermediate

Optimizing Context Handling with Gemini 1.5 Pro

Learn how to effectively use a context window of over 1 million tokens for large-scale document analysis.

18 min read

Gemini 1.5 Pro
# Optimizing Context Handling with Gemini 1.5 Pro

Gemini 1.5 Pro offers a groundbreaking context window of over 1 million tokens, opening up new possibilities for document analysis and retrieval.
## Understanding the Advantages of Long Context

Traditional large language models are typically limited to between 4K and 128K tokens. Gemini 1.5 Pro's 1M+ token context means you can:

- Process an entire codebase in one pass
- Analyze lengthy legal documents
- Review complete book manuscripts
- Compare multiple documents at once
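To get a sense of scale, the sketch below estimates how much of the window a document would occupy, assuming the common rule of thumb of roughly 4 characters per token for English text (the exact ratio depends on the tokenizer, so treat this only as a ballpark; the `count_tokens` API shown later gives exact numbers):

```python
def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Roughly estimate token count from character length.

    This is a heuristic only; use the tokenizer API for exact counts.
    """
    return int(len(text) / chars_per_token)

def window_utilization(text: str, window: int = 1_000_000) -> float:
    """Fraction of the context window a text would occupy."""
    return estimate_tokens(text) / window

# A 300-page book (~600,000 characters) uses roughly 15% of the window.
```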
## Quick Start

### Environment Setup

```python
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")
model = genai.GenerativeModel('gemini-1.5-pro')
```

### Basic Long-Context Queries
```python
def analyze_large_document(document_path: str, query: str) -> str:
    """Analyze a large document with a specific query."""
    with open(document_path, 'r') as f:
        content = f.read()

    prompt = f"""
Document:
{content}

Query: {query}

Please analyze the document and answer the query comprehensively.
"""
    response = model.generate_content(prompt)
    return response.text
```

## Context Window Strategies
### Structured Context Loading

Organize your context for better results:
```python
def create_structured_context(documents: list) -> str:
    """Create structured context from multiple documents."""
    context_parts = []
    for i, doc in enumerate(documents):
        context_parts.append(f"""
=== DOCUMENT {i+1}: {doc['title']} ===
Source: {doc['source']}
Date: {doc['date']}
Content:
{doc['content']}
=== END DOCUMENT {i+1} ===
""")
    return "\n\n".join(context_parts)
```

### Hierarchical Context Organization
```python
def build_hierarchical_context(data: dict) -> str:
    """Build hierarchical context for complex data structures."""
    template = """
# Project Overview
{overview}

## Architecture
{architecture}

## Components
{components}

## Dependencies
{dependencies}

## Recent Changes
{changes}
"""
    return template.format(**data)
```

## Efficient Token Usage
### Token Counting

```python
def count_tokens(text: str) -> int:
    """Count the tokens in a text using Gemini's tokenizer."""
    model = genai.GenerativeModel('gemini-1.5-pro')
    return model.count_tokens(text).total_tokens

def check_context_fit(documents: list, max_tokens: int = 1_000_000) -> bool:
    """Check whether the documents fit within the context window limit."""
    total = sum(count_tokens(doc) for doc in documents)
    return total < max_tokens
```

### Context Compression
When approaching the limit, compress strategically:

```python
def compress_context(content: str, target_ratio: float = 0.5) -> str:
    """Compress content while preserving key information."""
    prompt = f"""
Summarize the following content, preserving:
- Key facts and figures
- Important decisions and conclusions
- Technical specifications
- Action items

Target compression: {int(target_ratio * 100)}% of original

Content:
{content}
"""
    response = model.generate_content(prompt)
    return response.text
```

## Advanced Use Cases
### Codebase Analysis

```python
import os

def collect_code_files(repo_path: str, extensions=('.py', '.js', '.ts')) -> dict:
    """Minimal helper (illustrative): map file paths to contents
    for source files with the given extensions."""
    files = {}
    for root, _, names in os.walk(repo_path):
        for name in names:
            if name.endswith(extensions):
                path = os.path.join(root, name)
                with open(path, 'r', errors='ignore') as f:
                    files[path] = f.read()
    return files

def analyze_codebase(repo_path: str) -> str:
    """Analyze an entire codebase in a single pass."""
    files = collect_code_files(repo_path)

    context = "# CODEBASE ANALYSIS\n\n"
    for file_path, content in files.items():
        context += f"## File: {file_path}\n```\n{content}\n```\n\n"

    prompt = f"""
{context}

Analyze this codebase and provide:
1. Architecture overview
2. Main components and their responsibilities
3. Code quality assessment
4. Potential improvements
5. Security considerations
"""
    response = model.generate_content(prompt)
    return response.text
```

### Multi-Document Comparison
```python
def compare_documents(docs: list, comparison_criteria: list) -> str:
    """Compare multiple documents against specified criteria."""
    context = create_structured_context(docs)
    criteria_str = "\n".join(f"- {c}" for c in comparison_criteria)

    prompt = f"""
{context}

Compare all documents above based on these criteria:
{criteria_str}

Provide a detailed comparison table and analysis.
"""
    response = model.generate_content(prompt)
    return response.text
```

## Performance Optimization
### Caching Strategy

```python
import hashlib
from functools import lru_cache

# Registry mapping content hashes back to content, so the cached
# function can recover the text from its hash key.
_content_registry: dict = {}

@lru_cache(maxsize=100)
def cached_analysis(content_hash: str, query: str) -> str:
    """Cache analysis results keyed by content hash."""
    # Retrieve the content via its hash and analyze it.
    content = _content_registry[content_hash]
    response = model.generate_content(f"{content}\n\nQuery: {query}")
    return response.text

def analyze_with_cache(content: str, query: str) -> str:
    """Analyze content, reusing cached results for repeated queries."""
    content_hash = hashlib.md5(content.encode()).hexdigest()
    _content_registry[content_hash] = content
    return cached_analysis(content_hash, query)
```

### Streaming for Long Outputs
```python
def stream_analysis(context: str, query: str):
    """Stream analysis results for long outputs."""
    prompt = f"{context}\n\nQuery: {query}"
    response = model.generate_content(
        prompt,
        stream=True
    )
    for chunk in response:
        yield chunk.text
```

## Best Practices
- Structure your context: use clear delimiters and headings
- Front-load important information: put key material first
- Use explicit references: refer to specific sections by name
- Monitor token usage: track consumption to avoid truncation
- Implement fallback strategies: have a plan for contexts that are too large
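As a sketch of the last point, one possible fallback (an illustration, not part of the Gemini API) is to split an oversized context into chunks, summarize each with the `compress_context` function shown earlier, and answer the query over the combined summaries:

```python
def chunk_text(text: str, chunk_size: int = 500_000) -> list:
    """Split text into fixed-size character chunks."""
    return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]

def analyze_with_fallback(content: str, query: str, max_chars: int = 3_000_000) -> str:
    """Answer directly when the content fits; otherwise summarize chunks first.

    Relies on `compress_context` and `model` from the earlier examples;
    the size threshold here is an arbitrary placeholder.
    """
    if len(content) <= max_chars:
        prompt = f"{content}\n\nQuery: {query}"
    else:
        summaries = [compress_context(chunk) for chunk in chunk_text(content)]
        joined = "\n\n".join(summaries)
        prompt = f"{joined}\n\nQuery: {query}"
    response = model.generate_content(prompt)
    return response.text
```

Chunked summarization loses detail, so it is a degraded mode, not a substitute for fitting the full context when possible.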