[Digital Twin] Behavioral fingerprint model: vectorization + preference weight learning #53

admin · 2026-05-10T15:28:05+08:00

admin commented

2026-05-10 15:28:05 +08:00

Goal

Turn raw behavior logs into a quantifiable "digital fingerprint" for each user.

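Technical approach

1. Behavior sequence vectorization

Use an embedding model to convert behavior sequences into vectors.
Each behavior record: {action, context, decision, timestamp} → sentence → embedding
Store the vectors in a vector database (pgvector or Qdrant).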
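
The record → sentence → embedding pipeline could be sketched roughly as follows. This is only an illustration under stated assumptions: `BehaviorEvent`, `to_sentence`, and `toy_embed` are hypothetical names, the sentence template is invented, and `toy_embed` is a hash-based stand-in for whichever embedding model is eventually chosen. Writing the vector to pgvector or Qdrant is left out.

```python
import hashlib
import math
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class BehaviorEvent:
    """One behavior record with the four fields named in the issue."""
    action: str
    context: str
    decision: str
    timestamp: datetime


def to_sentence(event: BehaviorEvent) -> str:
    """Render a record as a plain sentence (the template is an assumption)."""
    return (
        f"At {event.timestamp.isoformat()} the user performed '{event.action}' "
        f"in context '{event.context}' and decided '{event.decision}'."
    )


def toy_embed(text: str, dim: int = 8) -> list[float]:
    """Placeholder for a real embedding model.

    Hashes each whitespace-separated token into one of `dim` buckets and
    L2-normalizes the counts. It carries no semantic meaning.
    """
    vec = [0.0] * dim
    for token in text.lower().split():
        digest = hashlib.sha256(token.encode("utf-8")).digest()
        vec[digest[0] % dim] += 1.0
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else vec


event = BehaviorEvent(
    action="review_pr",
    context="repo=aiagent",
    decision="request_changes",
    timestamp=datetime(2026, 5, 10, 15, 28, 5),
)
sentence = to_sentence(event)
vector = toy_embed(sentence)
```

Keeping the sentence template in one function means the embedding model can be swapped without touching how records are serialized.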

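2. Preference weight model

Extract the user's preference weights from their decision history.
Example: PR review weights = {security: 0.4, performance: 0.3, readability: 0.2, style: 0.1}
Use a Bradley-Terry model or simple frequency counts.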
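
Both options can be sketched in a few lines. The input formats here are assumptions, not part of the issue: `frequency_weights` expects a list naming the criterion that decided each past review, and `bradley_terry` expects `(winner, loser)` pairs recording which criterion took priority when two conflicted. The Bradley-Terry fit uses the standard minorization-maximization update and assumes the two criteria in each pair are distinct.

```python
from collections import Counter


def frequency_weights(decisive_criteria: list[str]) -> dict[str, float]:
    """Weight of each criterion = share of decisions it determined."""
    counts = Counter(decisive_criteria)
    total = sum(counts.values())
    return {name: count / total for name, count in counts.items()}


def bradley_terry(
    comparisons: list[tuple[str, str]], iterations: int = 200
) -> dict[str, float]:
    """Fit Bradley-Terry strengths from (winner, loser) pairs.

    Strengths are normalized to sum to 1 so they read as weights.
    """
    wins = Counter(winner for winner, _ in comparisons)
    pair_counts: Counter = Counter()
    for winner, loser in comparisons:
        pair_counts[frozenset((winner, loser))] += 1
    items = sorted({name for pair in comparisons for name in pair})
    strength = {name: 1.0 / len(items) for name in items}

    for _ in range(iterations):
        updated = {}
        for i in items:
            # Sum over every opponent j that i was compared against.
            denom = sum(
                n / (strength[i] + strength[j])
                for pair, n in pair_counts.items()
                if i in pair
                for j in pair - {i}
            )
            updated[i] = wins[i] / denom if denom else strength[i]
        total = sum(updated.values())
        strength = {name: value / total for name, value in updated.items()}
    return strength


freq = frequency_weights(
    ["security"] * 4 + ["performance"] * 3 + ["readability"] * 2 + ["style"]
)

pairs = (
    [("security", "style")] * 3 + [("style", "security")]
    + [("security", "performance")] * 2 + [("performance", "security")]
    + [("performance", "style")] * 2 + [("style", "performance")]
)
bt = bradley_terry(pairs)
```

Frequency counts are simpler and need only one label per decision; Bradley-Terry needs pairwise data but can rank criteria that were never the single deciding factor.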

3. Decision rule extraction

Extract if-then rules from historical decisions.
Example: IF changed files > 10 AND no tests THEN request that tests be added
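
One way to check whether a candidate rule such as the example above is actually supported by the history is to measure its support and confidence. The sketch below assumes a hypothetical `ReviewRecord` shape and decision labels (`"request_tests"`, `"approve"`); how candidate rules are generated in the first place is not covered.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class ReviewRecord:
    """One historical review decision (field names are assumptions)."""
    files_changed: int
    has_tests: bool
    decision: str


@dataclass(frozen=True)
class Rule:
    """An if-then rule: when `condition` holds, expect `action`."""
    condition: Callable[[ReviewRecord], bool]
    action: str


def rule_stats(rule: Rule, history: list[ReviewRecord]) -> tuple[int, float]:
    """Return (support, confidence) of a rule over the history.

    support    = number of records where the condition holds
    confidence = fraction of those records whose decision equals the action
    """
    matched = [record for record in history if rule.condition(record)]
    if not matched:
        return 0, 0.0
    hits = sum(1 for record in matched if record.decision == rule.action)
    return len(matched), hits / len(matched)


# IF changed files > 10 AND no tests THEN request that tests be added
needs_tests = Rule(
    condition=lambda r: r.files_changed > 10 and not r.has_tests,
    action="request_tests",
)

history = [
    ReviewRecord(12, False, "request_tests"),
    ReviewRecord(15, False, "request_tests"),
    ReviewRecord(11, False, "approve"),
    ReviewRecord(3, False, "approve"),
    ReviewRecord(20, True, "approve"),
]
support, confidence = rule_stats(needs_tests, history)
```

A rule would then be kept only if both numbers clear thresholds chosen by the team; those thresholds are not specified in this issue.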

New files

backend/app/models/user_fingerprint.py
backend/app/services/fingerprint_engine.py
Scheduled task: update the fingerprint model weekly

admin closed this issue

2026-05-10 17:58:07 +08:00

Reference: admin/aiagent#53