
企业微信

飞书
选择您喜欢的方式加入群聊

扫码添加咨询专家
在构建企业级 LLM 应用时,Prompt 管理和可观测性是两个经常被忽视但至关重要的问题。AskTable 通过 Langfuse 实现了完整的 Prompt 生命周期管理和 LLM 调用追踪,让 AI 应用真正可维护、可优化。
很多团队在开发 LLM 应用时,Prompt 直接硬编码在代码中:
# ❌ 不推荐的做法 system_prompt = """ You are a helpful SQL assistant. Given the following database schema: {schema} Generate SQL query for: {question} """ response = openai.chat.completions.create( model="gpt-4", messages=[ {"role": "system", "content": system_prompt.format(schema=schema)}, {"role": "user", "content": question} ] )
这种方式的问题:
AskTable 使用 Langfuse 实现了 Prompt 的集中管理和 LLM 调用的完整追踪:
# ✅ 推荐的做法 from app.atserver.ai.llmops import get_prompt # 从 Langfuse 获取 Prompt prompt_client = get_prompt("agent/db_agent") system_prompt = prompt_client.compile(meta=db_meta) # 使用 Langfuse 包装的 OpenAI 客户端 from langfuse.openai import openai response = await openai.chat.completions.create( model="gpt-4", messages=[ {"role": "system", "content": system_prompt}, {"role": "user", "content": question} ] )
优势:
加载图表中...
上图展示了 Prompt 的完整迭代流程:在 Langfuse 平台创建新版本,通过 label 机制进行灰度测试,验证效果后推广到生产环境,发现问题可以一键回滚。整个过程无需修改代码或重启服务。
AskTable 实现了一个 Prompt 代理层,支持从 Langfuse 或本地文件加载 Prompt:
class PromptProxy: prompt_cache: dict[str, PromptClient] | None = None _from_local: bool def __init__(self, from_local: bool): self._from_local = from_local if not self._from_local: # 从 Langfuse 运行时获取 Prompt self.prompt_cache = None else: # 从本地 assets/prompts.json 加载 Prompt with open("assets/prompts.json") as f: prompts = json.load(f) prompts_typed = [Prompt_Text(**prompt) for prompt in prompts] self.prompt_cache = { prompt.name: TextPromptClient(prompt) for prompt in prompts_typed } def get_prompt(self, prompt_name: str) -> PromptClient: if not self._from_local: return langfuse.get_prompt(prompt_name, label=config.observer_prompt_label) else: assert self.prompt_cache is not None, "prompt cache is not initialized" if prompt_name in self.prompt_cache: return self.prompt_cache[prompt_name] else: raise errors.NotFound def list_prompts(self) -> list[str]: if not self._from_local: all_prompts = get_all_prompts() return [prompt.name for prompt in all_prompts] else: assert self.prompt_cache is not None, "prompt cache is not initialized" return list(self.prompt_cache.keys()) # 全局单例 _prompt_proxy = PromptProxy(config.at_prompt_local) def get_prompt(prompt_name: str) -> PromptClient: return _prompt_proxy.get_prompt(prompt_name)
设计亮点:
在线模式(生产环境):
from_local = False # 从 Langfuse 实时获取 Prompt # 优势:可以在线修改,无需重启服务
离线模式(开发/测试环境):
from_local = True # 从本地 JSON 文件加载 Prompt # 优势:不依赖外部服务,开发更快
langfuse.get_prompt(prompt_name, label=config.observer_prompt_label)
Label 的作用:
production、staging、developmentvariant_a、variant_b示例场景:
# 生产环境使用稳定版本 config.observer_prompt_label = "production" # 测试环境使用最新版本 config.observer_prompt_label = "latest" # A/B 测试 if user_id % 2 == 0: label = "variant_a" else: label = "variant_b"
Langfuse 支持 Jinja2 模板语法,可以动态注入变量:
# DBAgent 中的使用 system_prompt = get_prompt("agent/db_agent").compile(meta=self.db_meta)
Prompt 模板示例:
You are a database assistant. You have access to the following database schema: {% for schema in meta.schemas %} Schema: {{ schema.name }} Tables: {% for table in schema.tables %} - {{ table.name }} {% endfor %} {% endfor %} Your task is to help users query this database using natural language. Available tools: - show_table: Get detailed information about specific tables - search_metadata: Search for relevant tables and fields - execute_sql: Execute SQL queries Guidelines: 1. First use search_metadata to find relevant tables 2. Then use show_table to get detailed field information 3. Finally generate and execute SQL query 4. Always explain your reasoning
编译后的结果:
You are a database assistant. You have access to the following database schema: Schema: public Tables: - users - orders - products Schema: analytics Tables: - sales_summary - user_behavior Your task is to help users query this database using natural language. ...
优势:
AskTable 使用 Langfuse 包装的 OpenAI 客户端,自动追踪所有 LLM 调用:
from langfuse.openai import openai class AsyncLLMClient: @property def client(self) -> openai.AsyncOpenAI: client, _ = get_current_llm() return client async def create_completion( self, messages: list[dict], model: str = "gpt-4o-mini", tools: list[dict] | None = None, response_format: dict | None = None, temperature: float = 0, prompt_client: PromptClient | None = None, parallel_tool_calls: bool = False, ) -> ChatCompletion: kwargs = get_completion_kwargs( messages=messages, model=model, temperature=temperature, response_format=response_format, tools=tools, parallel_tool_calls=parallel_tool_calls, prompt_client=prompt_client, ) _s = time.perf_counter() response = await self.client.chat.completions.create(**kwargs) log.debug(f"LLM response: {response}") handle_completion_response(response) # 累计 LLM 调用时间 if at_llm_total_duration_var.get(None) is not None: acc = at_llm_total_duration_var.get() at_llm_total_duration_var.set(acc + time.perf_counter() - _s) return response
自动追踪的信息:
初始版本(准确率 70%):
Generate SQL query for: {{ question }} Database schema: {{ schema }}
优化版本 1(准确率 75%):
You are an expert SQL developer. Generate a SQL query to answer the following question: Question: {{ question }} Database schema: {{ schema }} Requirements: - Use proper JOIN syntax - Add appropriate WHERE clauses - Format the output clearly
优化版本 2(准确率 85%):
You are an expert SQL developer. Follow these steps to generate a SQL query: 1. Analyze the question: {{ question }} 2. Identify required tables and fields from the schema below 3. Determine the relationships between tables 4. Generate the SQL query Database schema: {{ schema }} Requirements: - Use explicit JOIN syntax (INNER JOIN, LEFT JOIN, etc.) - Add appropriate WHERE clauses for filtering - Use meaningful aliases for tables - Add comments to explain complex logic - Format the output for readability Example: Question: "Show me total sales by region" SQL: -- Calculate total sales grouped by region SELECT r.region_name, SUM(s.amount) AS total_sales FROM sales s INNER JOIN regions r ON s.region_id = r.id GROUP BY r.region_name ORDER BY total_sales DESC;
迭代流程:
testingproduction无需修改代码,无需重启服务!
目标:对比两种 Prompt 风格的效果
Variant A(简洁风格):
Generate SQL for: {{ question }} Schema: {{ schema }}
Variant B(详细风格):
You are an expert SQL developer... [详细的指导和示例]
实现:
# 根据用户 ID 分流 if hash(user_id) % 2 == 0: label = "variant_a" else: label = "variant_b" prompt_client = langfuse.get_prompt("agent/db_agent", label=label)
在 Langfuse 中对比:
决策:
问题:发现某个 Prompt 的 token 消耗过高
分析:
在 Langfuse 中查看 Prompt 的 token 分布:
优化:
# 优化前:包含所有字段的详细信息 {% for table in schema.tables %} Table: {{ table.name }} Description: {{ table.description }} Fields: {% for field in table.fields %} - {{ field.name }}: {{ field.type }} - {{ field.description }} Sample values: {{ field.sample_values }} Statistics: {{ field.statistics }} {% endfor %} {% endfor %} # 优化后:只包含必要信息 {% for table in schema.tables %} Table: {{ table.name }} Fields: {{ table.fields | map(attribute='name') | join(', ') }} {% endfor %} Use show_table tool to get detailed information when needed.
效果:
问题:用户反馈某个查询生成的 SQL 有错误
追踪流程:
发现问题:
# Trace 显示 User question: "上个月的销售额" Prompt version: v1.2.3 LLM output: "SELECT SUM(amount) FROM sales WHERE date > '2024-02-01'" Error: 生成的 SQL 没有考虑"上个月"的结束日期
修复:
# 在 Prompt 中添加时间处理指导 When handling time-related queries: - "上个月" means the previous calendar month - Use BETWEEN for date ranges - Example: WHERE date BETWEEN '2024-02-01' AND '2024-02-29'
验证:
在 Langfuse 中创建新版本,对比修复前后的效果。
性能指标:
质量指标:
业务指标:
完整的调用链追踪:
Trace: user_query_12345 ├─ Span: plan_generation (2.3s, $0.02) │ ├─ LLM Call: gpt-4o-mini (1.8s, $0.015) │ └─ Tool Call: search_metadata (0.5s) ├─ Span: sql_generation (3.1s, $0.03) │ ├─ Tool Call: show_table (0.3s) │ ├─ LLM Call: gpt-4o (2.5s, $0.025) │ └─ Tool Call: execute_sql (0.3s) └─ Span: result_formatting (0.5s, $0.005) └─ LLM Call: gpt-4o-mini (0.5s, $0.005) Total: 5.9s, $0.055
分析价值:
# 在 Langfuse 中设置告警规则 if avg_latency > 5.0: # 平均延迟超过 5 秒 send_alert("High latency detected") if error_rate > 0.1: # 错误率超过 10% send_alert("High error rate detected") if daily_cost > 1000: # 每日成本超过 $1000 send_alert("High cost detected")
命名规范:
agent/db_agent/v1.0.0 # 主版本.次版本.修订版本 agent/db_agent/v1.1.0 # 新增功能 agent/db_agent/v1.1.1 # Bug 修复
Label 策略:
production:生产环境稳定版本staging:预发布环境测试版本latest:最新开发版本variant_a、variant_b:A/B 测试版本清晰的角色定义:
You are an expert SQL developer with 10 years of experience. Your task is to generate accurate and efficient SQL queries.
明确的任务描述:
Given a natural language question and database schema, generate a SQL query that answers the question.
具体的要求和约束:
Requirements: - Use explicit JOIN syntax - Add appropriate indexes hints if needed - Limit results to 1000 rows by default - Handle NULL values properly
示例和模板:
Example: Question: "Show me top 10 customers by revenue" SQL: SELECT c.customer_name, SUM(o.amount) AS total_revenue FROM customers c INNER JOIN orders o ON c.id = o.customer_id GROUP BY c.customer_name ORDER BY total_revenue DESC LIMIT 10;
使用合适的模型:
# 简单任务用小模型 if task_complexity == "simple": model = "gpt-4o-mini" # 便宜 else: model = "gpt-4o" # 准确 # 或者使用级联策略 try: result = await llm_client.create_completion(model="gpt-4o-mini", ...) if not validate(result): result = await llm_client.create_completion(model="gpt-4o", ...) except: result = await llm_client.create_completion(model="gpt-4o", ...)
优化 Prompt 长度:
# 使用工具调用按需获取信息,而不是一次性传入所有信息 Initial context: {{ summary }} Use show_table tool to get detailed information when needed.
缓存策略:
# 对于相同的问题,缓存结果 cache_key = hash(question + schema_version) if cache_key in cache: return cache[cache_key]
角色分工:
工作流程:
testingproductionLangfuse 为 AskTable 提供了完整的 Prompt 管理和 LLM 可观测性能力:
这些能力让 LLM 应用从"黑盒"变成"白盒",从"不可维护"变成"可持续优化",是构建企业级 AI 应用的必备基础设施。
如果你正在构建 LLM 应用,强烈建议从第一天就集成 Langfuse 或类似的可观测性平台,而不是等到出现问题再补救。