[Knowledge Evolution] User feedback learning: automatically adjust Agent strategy after a thumbs-down or manual edit #64

admin · 2026-05-10T15:32:19+08:00

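## Goal

When a user gives an Agent output a thumbs-down or edits it manually, the system learns from that feedback automatically so the Agent does not repeat the same mistake in the same scenario.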

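## Feedback signal collection

- Thumbs-down
- The user manually edits the Agent output and saves it
- The user says "that's wrong" (不对), "start over" (重新来), or "try another way" (换个方式)
- The same question is asked repeatedly (which suggests the previous answer was unsatisfactory)
- A Task is rejected (approval rejected)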
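The signals listed above could be represented roughly as follows. This is a minimal sketch, not the contents of `feedback_record.py`: the names `FeedbackSignal`, `FeedbackRecord`, `REJECTION_PHRASES`, and `detect_verbal_rejection` are hypothetical, and plain substring matching is only an assumed way to detect the three phrases quoted in the issue.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from enum import Enum
from typing import Optional


class FeedbackSignal(str, Enum):
    """One member per signal listed in the issue (names are hypothetical)."""
    THUMBS_DOWN = "thumbs_down"
    MANUAL_EDIT = "manual_edit"
    VERBAL_REJECTION = "verbal_rejection"
    REPEATED_QUESTION = "repeated_question"
    APPROVAL_REJECTED = "approval_rejected"


# The three phrases quoted in the issue; a real list would likely be longer.
REJECTION_PHRASES = ("不对", "重新来", "换个方式")


@dataclass
class FeedbackRecord:
    """An (input, output, feedback) triple plus optional user correction."""
    agent_input: str
    agent_output: str
    signal: FeedbackSignal
    corrected_output: Optional[str] = None  # set when the user edited the output
    created_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )


def detect_verbal_rejection(user_message: str) -> bool:
    """Return True if the message contains any known rejection phrase."""
    return any(phrase in user_message for phrase in REJECTION_PHRASES)
```

Substring matching is cheap but will produce false positives, so a production version would probably need a classifier or at least some context checks.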

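## Learning mechanism

- Collect (input, output, feedback) triples
- Cluster similar scenarios by embedding similarity
- Generate strategy adjustments that avoid outputs similar to the rejected ones
- Update the Agent system prompt (append counterexamples)
- Or record in the knowledge base: "do not do this in this scenario"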
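The clustering step might look like the sketch below. The issue does not name a clustering algorithm, so this uses a simple greedy pass over cosine similarity as a stand-in. It assumes the embeddings are computed elsewhere and passed in as plain float vectors; `cluster_by_similarity` and the `threshold` default are hypothetical.

```python
import math
from typing import List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def cluster_by_similarity(
    embeddings: List[Sequence[float]], threshold: float = 0.85
) -> List[List[int]]:
    """Greedy single-pass clustering.

    Each item joins the first cluster whose seed (first member) is at least
    `threshold` similar to it; otherwise it starts a new cluster.
    Returns clusters as lists of indices into `embeddings`.
    """
    clusters: List[List[int]] = []
    for idx, emb in enumerate(embeddings):
        for cluster in clusters:
            if cosine_similarity(emb, embeddings[cluster[0]]) >= threshold:
                cluster.append(idx)
                break
        else:
            # No existing cluster was close enough.
            clusters.append([idx])
    return clusters
```

Each resulting cluster groups feedback triples from similar scenarios, so one counterexample or knowledge-base note can be generated per cluster instead of per record. The greedy pass is order-dependent, which may or may not matter at the expected data volume.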

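## Strategy adjustment methods

- Append negative examples to the prompt
- Adjust the Agent's temperature (more thumbs-downs → lower temperature)
- Replace the toolchain in use
- Recommend switching models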
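Two of these adjustments are easy to sketch. The issue only says that more thumbs-downs should lower the temperature, so the linear rule below, the `max_drop` and `floor` parameters, and the wording of the appended prompt section are all assumptions made for illustration.

```python
from typing import List


def adjusted_temperature(
    base: float,
    downvotes: int,
    total: int,
    max_drop: float = 0.5,
    floor: float = 0.0,
) -> float:
    """Lower the temperature in proportion to the thumbs-down rate.

    With no feedback at all the base temperature is returned unchanged.
    """
    if total <= 0:
        return base
    rate = downvotes / total
    return max(floor, base - max_drop * rate)


def append_negative_examples(
    system_prompt: str, examples: List[str], limit: int = 5
) -> str:
    """Append the most recent rejected outputs to the prompt as counterexamples.

    `limit` caps how many examples are added so the prompt does not grow
    without bound.
    """
    if not examples:
        return system_prompt
    bullet_lines = "\n".join(f"- {ex}" for ex in examples[-limit:])
    return (
        system_prompt.rstrip()
        + "\n\nAvoid outputs like the following rejected examples:\n"
        + bullet_lines
    )
```

For example, `adjusted_temperature(0.8, 5, 10)` drops the temperature by half of `max_drop`, because half of the responses were downvoted. Capping the number of appended examples matters because an ever-growing prompt raises cost and can dilute the original instructions.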

## New files

- backend/app/services/feedback_learner.py
- backend/app/models/feedback_record.py
- Celery schedule: aggregate feedback daily and generate strategy adjustment suggestions
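The daily aggregation could be a plain function that a Celery periodic task calls once a day. The sketch below leaves out the Celery wiring and shows only a hypothetical `summarize_feedback` helper. The `(agent_id, signal)` pair format, the `min_count` threshold, and the suggestion text are assumptions, not part of the issue.

```python
from collections import Counter
from typing import Dict, Iterable, List, Tuple


def summarize_feedback(
    records: Iterable[Tuple[str, str]], min_count: int = 3
) -> List[Dict[str, object]]:
    """Aggregate one day's feedback into strategy adjustment suggestions.

    `records` is an iterable of (agent_id, signal) pairs. A suggestion is
    emitted for every pair seen at least `min_count` times.
    """
    counts = Counter(records)
    suggestions: List[Dict[str, object]] = []
    # Sort for a stable, reproducible report order.
    for (agent_id, signal), n in sorted(counts.items()):
        if n >= min_count:
            suggestions.append(
                {
                    "agent_id": agent_id,
                    "signal": signal,
                    "count": n,
                    "suggestion": (
                        f"Review agent {agent_id}: "
                        f"{n} '{signal}' events in the last day"
                    ),
                }
            )
    return suggestions
```

Keeping the aggregation free of Celery imports makes it testable on its own; the scheduled task would then only need to load the day's records and pass them in.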

admin closed this issue

2026-05-10 17:58:13 +08:00

Reference: admin/aiagent#64