
Experiment: 10-Repo Small-Scale Analysis

Small-scale experiment: validating the OpenClaw + sub-agent architecture

Goal: validate the end-to-end flow of an OpenClaw orchestrator driving a cluster of sub-agents to analyze repos

Scope: the 10 most important PingCAP repos

Estimated time: 1-2 hours

Estimated cost: <$0.10 (qwen3.5-plus)


Target Repos (10 Most Important)

Based on the earlier analysis, the following 10 core repos were selected:

| # | Repo | Stars | Language | Size | Priority | Rationale |
|---|------|------:|----------|-----:|----------|-----------|
| 1 | tidb | 39,859 | Go | 652 MB | P0 | Core database product |
| 2 | tiflow | 454 | Go | 163 MB | P0 | DM + TiCDC |
| 3 | tidb-operator | 1,322 | Go | 101 MB | P0 | K8s operations platform |
| 4 | ossinsight | 2,320 | TypeScript | 642 MB | P1 | OSS analytics platform |
| 5 | docs | 616 | Python | 411 MB | P1 | Official documentation |
| 6 | tidb-dashboard | 198 | TypeScript | 34 MB | P1 | Visualization console |
| 7 | tiup | 463 | Go | 15 MB | P1 | Package manager |
| 8 | autoflow | 2,740 | TypeScript | - | P2 | Graph RAG knowledge base |
| 9 | tidb-vector-python | 61 | Python | - | P2 | Python SDK |
| 10 | ticdc | 45 | Go | - | P2 | CDC tool |

Total: ~2 GB of code


Experiment Goals

Verification goals

✅ 1. OpenClaw orchestrator flow
   ├─ Spawn sub-agents (sessions_spawn)
   ├─ Collect results (sessions_send)
   └─ Progress tracking (SQLite + JSON)

✅ 2. Sub-agent analysis capability
   ├─ Repo metadata collection
   ├─ Code structure analysis
   ├─ Dependency mapping
   ├─ Quality assessment
   └─ Merge-recommendation generation

✅ 3. State persistence
   ├─ Checkpoint writes
   ├─ Progress updates
   └─ Recovery-mechanism verification

✅ 4. Dynamic scheduling
   ├─ Value scoring (0-100)
   ├─ Tiering (S/A/B/C)
   └─ Agent allocation adjustment

✅ 5. Cost verification
   └─ Actual token consumption vs. estimate
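The score-to-tier step in goal 4 can be sketched as a simple threshold mapping. This is a minimal illustration: the thresholds (90/65/50) are assumptions chosen here for concreteness, since the plan only names the four tiers.

```python
def tier(score: int) -> str:
    """Map a 0-100 value score to a tier.

    Thresholds are assumed for illustration; the plan itself
    only defines the tier labels S/A/B/C.
    """
    if score >= 90:
        return "S"
    if score >= 65:
        return "A"
    if score >= 50:
        return "B"
    return "C"
```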

Experiment Architecture

OpenClaw Orchestration

OpenClaw (Main Session)
   │
   ├─ 1. Create the .rd-os/ directory structure
   │
   ├─ 2. Initialize progress.db
   │
   ├─ 3. For each repo:
   │   │
   │   ├─ Spawn an analysis sub-agent (sessions_spawn)
   │   │   Task: "Analyze {repo_name}"
   │   │   Model: qwen3.5-plus
   │   │   Output: .rd-os/state/agent-states/{repo_id}.json
   │   │
   │   └─ Wait for completion (sessions_send)
   │
   ├─ 4. Collect results
   │   ├─ Read output files
   │   ├─ Update progress.db
   │   └─ Generate a synthesis report
   │
   └─ 5. Output the experiment report

Sub-Agent Task

Sub-Agent (qwen3.5-plus)
   │
   ├─ 1. Fetch repo metadata (GitHub API)
   │
   ├─ 2. Analyze code structure
   │   ├─ Directory layout
   │   ├─ Primary languages
   │   └─ Key files
   │
   ├─ 3. Map dependencies
   │   ├─ go.mod / package.json / requirements.txt
   │   └─ Internal vs. external dependencies
   │
   ├─ 4. Assess code quality
   │   ├─ Test coverage
   │   ├─ Documentation completeness
   │   └─ Coding standards
   │
   ├─ 5. Compute the value score
   │   ├─ Activity (25 pts)
   │   ├─ Impact (25 pts)
   │   ├─ Strategic importance (25 pts)
   │   ├─ Code quality (15 pts)
   │   └─ Migration feasibility (10 pts)
   │
   ├─ 6. Generate a merge recommendation
   │   ├─ P0/P1/P2/P3/Archive
   │   └─ Migration priority
   │
   └─ 7. Write results
       └─ .rd-os/state/agent-states/{repo_id}-analysis.json
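The rubric in step 5 amounts to a capped sum of five components. A minimal sketch (the component field names are assumptions matching the per-repo JSON shown later, not a fixed schema):

```python
# Maximum points per component, per the step-5 rubric (caps sum to 100).
CAPS = {
    "activity": 25,
    "impact": 25,
    "strategic": 25,
    "quality": 15,
    "feasibility": 10,
}

def value_score(components: dict) -> int:
    """Total a repo's value score, clamping each component to its rubric cap."""
    return sum(min(components.get(name, 0), cap) for name, cap in CAPS.items())
```

For example, components of 25/25/25/12/8 total 95.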

Execution Plan

Phase 1: Setup (10 minutes)

# 1. Create the .rd-os/ directory tree
mkdir -p 20260301-mono-repo/.rd-os/{state/agent-states,store/artifacts,config}

# 2. Initialize the SQLite database
sqlite3 20260301-mono-repo/.rd-os/store/progress.db <<EOF
CREATE TABLE repos (
    repo_id TEXT PRIMARY KEY,
    name TEXT NOT NULL,
    priority TEXT,
    category TEXT,
    created_at TIMESTAMP,
    updated_at TIMESTAMP
);

CREATE TABLE analysis_state (
    repo_id TEXT PRIMARY KEY,
    status TEXT,
    progress_percent INTEGER,
    started_at TIMESTAMP,
    completed_at TIMESTAMP,
    result_json TEXT,
    error_message TEXT
);

CREATE TABLE sub_agents (
    agent_id TEXT PRIMARY KEY,
    type TEXT,
    repo_id TEXT,
    status TEXT,
    spawned_at TIMESTAMP,
    completed_at TIMESTAMP
);
EOF

# 3. Create the target-repo list
cat > 20260301-mono-repo/.rd-os/config/target-repos.json <<EOF
[
  {"id": "tidb", "name": "pingcap/tidb", "priority": "P0"},
  {"id": "tiflow", "name": "pingcap/tiflow", "priority": "P0"},
  {"id": "tidb-operator", "name": "pingcap/tidb-operator", "priority": "P0"},
  {"id": "ossinsight", "name": "pingcap/ossinsight", "priority": "P1"},
  {"id": "docs", "name": "pingcap/docs", "priority": "P1"},
  {"id": "tidb-dashboard", "name": "pingcap/tidb-dashboard", "priority": "P1"},
  {"id": "tiup", "name": "pingcap/tiup", "priority": "P1"},
  {"id": "autoflow", "name": "pingcap/autoflow", "priority": "P2"},
  {"id": "tidb-vector-python", "name": "pingcap/tidb-vector-python", "priority": "P2"},
  {"id": "ticdc", "name": "pingcap/ticdc", "priority": "P2"}
]
EOF
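After setup, the orchestrator needs to register each target repo in progress.db before spawning agents. A sketch assuming the schema and file paths created above (not part of any official tooling):

```python
import json
import sqlite3
from datetime import datetime, timezone

def seed_progress_db(db_path: str, repos_path: str) -> int:
    """Load target-repos.json and register each repo as 'pending' in progress.db.

    Returns the number of repos registered. The 'pending' status value is an
    assumption; the plan does not fix the status vocabulary.
    """
    with open(repos_path) as f:
        repos = json.load(f)
    now = datetime.now(timezone.utc).isoformat()
    con = sqlite3.connect(db_path)
    with con:  # single transaction; commits on success
        for repo in repos:
            con.execute(
                "INSERT OR IGNORE INTO repos "
                "(repo_id, name, priority, created_at, updated_at) "
                "VALUES (?, ?, ?, ?, ?)",
                (repo["id"], repo["name"], repo["priority"], now, now),
            )
            con.execute(
                "INSERT OR IGNORE INTO analysis_state "
                "(repo_id, status, progress_percent) VALUES (?, 'pending', 0)",
                (repo["id"],),
            )
    con.close()
    return len(repos)
```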

Phase 2: Analysis (30-60 minutes)

Concurrency: 5 sub-agents running in parallel
Batches: 2 (5 repos per batch)

Batch 1 (P0 repos):
├─ tidb
├─ tiflow
├─ tidb-operator
├─ ossinsight
└─ docs

Batch 2 (P1/P2 repos):
├─ tidb-dashboard
├─ tiup
├─ autoflow
├─ tidb-vector-python
└─ ticdc
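The batching above (priority-ordered list split into groups of 5) can be sketched in one line:

```python
def make_batches(repos: list, batch_size: int = 5) -> list:
    """Split the priority-ordered repo list into fixed-size batches.

    With 10 repos and batch_size=5 this yields the two batches above:
    P0 repos first, then P1/P2.
    """
    return [repos[i:i + batch_size] for i in range(0, len(repos), batch_size)]
```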

Phase 3: Synthesis (15 minutes)

OpenClaw synthesizes all results:
├─ Compute overall statistics
├─ Generate the value-score ranking
├─ Create merge recommendations
└─ Output the experiment report

Expected Output

Per-Repo Analysis

{
  "repo_id": "tidb",
  "repo_name": "pingcap/tidb",
  "analysis_date": "2026-03-01",
  
  "metadata": {
    "stars": 39859,
    "forks": 6126,
    "language": "Go",
    "size_mb": 652,
    "created_at": "2015-09-06",
    "last_push": "2026-02-28"
  },
  
  "value_score": {
    "total": 95,
    "activity": 25,
    "impact": 25,
    "strategic": 25,
    "quality": 12,
    "feasibility": 8
  },
  
  "tier": "S",
  
  "code_structure": {
    "main_components": ["server", "storage", "query", "optimizer"],
    "test_coverage": 78.5,
    "documentation_score": 85
  },
  
  "dependencies": {
    "internal": 12,
    "external": 127,
    "circular": 0
  },
  
  "recommendation": {
    "action": "migrate",
    "priority": "P0",
    "effort": "high",
    "risk": "medium",
    "notes": "Core product, migrate first with dedicated team"
  }
}
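Before synthesis, the orchestrator should sanity-check each sub-agent's JSON against this shape. A minimal sketch (the required-field list and component names are taken from the example above; treat them as assumptions, not a frozen schema):

```python
REQUIRED_FIELDS = ("repo_id", "metadata", "value_score", "tier", "recommendation")
COMPONENTS = ("activity", "impact", "strategic", "quality", "feasibility")

def check_result(result: dict) -> list:
    """Return a list of consistency problems found in a per-repo analysis JSON.

    An empty list means the result passed: all required fields present and
    the value-score components sum to the reported total.
    """
    problems = [f"missing field: {f}" for f in REQUIRED_FIELDS if f not in result]
    score = result.get("value_score", {})
    if score and sum(score.get(c, 0) for c in COMPONENTS) != score.get("total"):
        problems.append("value_score components do not sum to total")
    return problems
```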

Experiment Report

# 10-Repo Experiment Report

## Summary
- Repos analyzed: 10
- Total time: 1.5 hours
- Total cost: $0.08
- Success rate: 100%

## Value Distribution
- S-tier: 1 (tidb: 95)
- A-tier: 3 (tiflow: 75, tidb-operator: 70, ossinsight: 66)
- B-tier: 4 (docs: 62, tiup: 58, tidb-dashboard: 55, autoflow: 52)
- C-tier: 2 (tidb-vector-python: 45, ticdc: 42)

## Recommendations
- P0 (migrate first): tidb, tiflow, tidb-operator
- P1 (migrate second): ossinsight, docs, tidb-dashboard, tiup
- P2 (migrate third): autoflow, tidb-vector-python, ticdc

## Lessons Learned
- [ ] What worked well
- [ ] What needs improvement
- [ ] Adjustments for 400-repo scale

Success Criteria

| Criterion | Target | Actual |
|-----------|--------|--------|
| Completion | 10/10 repos analyzed | TBD |
| Success rate | >90% | TBD |
| Time | <2 hours | TBD |
| Cost | <$0.20 | TBD |
| State persistence | Checkpoints written | TBD |
| Recovery | Can resume after restart | TBD |
| Quality | Actionable recommendations | TBD |

Risk Mitigation

| Risk | Mitigation |
|------|------------|
| API rate limit | Batch requests, add delays |
| Sub-agent failure | Checkpoint + retry |
| OpenClaw restart | Recovery from progress.db |
| Token overrun | Monitor usage, set limits |
| Poor-quality output | Human review, iterate on the template |
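The "OpenClaw restart" row can be sketched against the progress.db schema from Phase 1: on restart, query for repos whose analysis never finished and re-spawn only those. The status values ('pending'/'running'/'done') are assumptions, since the plan leaves the status vocabulary open.

```python
import sqlite3

def unfinished_repos(db_path: str) -> list:
    """List repos whose analysis has not completed, so the orchestrator
    can resume by re-spawning sub-agents only for those.

    A NULL status (never started) also counts as unfinished, hence the
    explicit IS NULL check alongside the != comparison.
    """
    con = sqlite3.connect(db_path)
    try:
        rows = con.execute(
            "SELECT repo_id FROM analysis_state "
            "WHERE status IS NULL OR status != 'done'"
        ).fetchall()
    finally:
        con.close()
    return [repo_id for (repo_id,) in rows]
```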

Next Steps After Experiment

If Successful (>=90% criteria met)

  1. Scale to 400 repos

    • Same architecture, more concurrency
    • Batch processing (50 repos/batch)
    • Estimated time: 8-16 hours
  2. Refine Process

    • Incorporate lessons learned
    • Optimize sub-agent templates
    • Tune value scoring
  3. Begin Migration Planning

    • Use analysis results for migration order
    • Create detailed migration runbook

If Issues (<90% criteria met)

  1. Identify Problems

    • Technical issues?
    • Template issues?
    • Architecture issues?
  2. Fix and Re-run

    • Address root causes
    • Re-run experiment
    • Validate fixes

Experiment Log

To be filled during execution

[2026-03-01 HH:MM] Experiment started
[2026-03-01 HH:MM] Setup complete
[2026-03-01 HH:MM] Batch 1 spawned (5 sub-agents)
[2026-03-01 HH:MM] Batch 1 complete (5/5)
[2026-03-01 HH:MM] Batch 2 spawned (5 sub-agents)
[2026-03-01 HH:MM] Batch 2 complete (5/5)
[2026-03-01 HH:MM] Synthesis complete
[2026-03-01 HH:MM] Experiment finished

Experiment designed for: Large-scale Agentic Engineering