Google Monorepo Lessons Learned

Key Insights from Google’s 2 Billion Line Monorepo

Research summary for TiDB Mono-Repo Consolidation Project


Scale Comparison

Metric        | Google    | TiDB Target
--------------|-----------|--------------
Lines of Code | 2 billion | ~39 GB (TBD)
Engineers     | 25,000+   | TBD
Commits/day   | 45,000    | TBD
Files         | 9 million | TBD
Storage       | 86 TB     | 39 GB

Key Insight: Google proves that a monorepo scales to extreme size, given the right tooling.


Core Principles (Google’s Playbook)

1. Single Source of Truth

✅ ONE repository for ~95% of the codebase
✅ No submodules
✅ No complex cross-repo dependency graphs
✅ No "which version should I use?" problems

TiDB Application: All 400 repos → 1 mono-repo


2. Trunk-Based Development

main (trunk)
  │
  ├── Developers commit directly to main
  ├── Code review BEFORE merge (pre-commit)
  ├── Release branches for deployment only
  └── Feature flags for incomplete features

Benefits:

  • No merge nightmares from long-lived branches
  • Early integration conflict detection
  • Continuous delivery enabled

TiDB Application: Adopt trunk-based from day 1
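Feature flags are what make committing incomplete work directly to trunk safe: the code merges early, but stays dark until the flag flips. A minimal sketch of the gating pattern (the flag names, config source, and planner functions are hypothetical):

```python
# Minimal feature-flag gate for trunk-based development.
# In practice FLAGS would come from a config service, not a dict.
FLAGS = {
    "new_query_planner": False,  # merged to main, not yet enabled
    "async_import": True,
}

def is_enabled(flag: str) -> bool:
    """Return the current state of a feature flag (unknown flags are off)."""
    return FLAGS.get(flag, False)

def plan_query(sql: str) -> str:
    """Route to the new code path only when its flag is on."""
    if is_enabled("new_query_planner"):
        return f"new-planner({sql})"
    return f"legacy-planner({sql})"
```

The incomplete planner ships on main from its first commit, so integration conflicts surface immediately rather than at branch-merge time.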


3. Code Ownership & Visibility

Default: OPEN ACCESS
  - All engineers can read all code
  - Traceability built-in
  - Exceptions: restricted files (security, legal)

Ownership: Workspace-based
  - Each directory has owning team
  - Responsible engineer identified
  - CODEOWNERS enforcement

TiDB Application:

  • Default open access within engineering
  • CODEOWNERS file for each component
  • Clear ownership boundaries

4. Build System: Bazel

Key Features:
  - Incremental builds (only changed targets)
  - Remote caching (share build artifacts)
  - Parallel execution
  - Dependency graph analysis
  - Hermetic builds (reproducible)

Why It Matters:

  • 2B LOC builds in minutes, not hours
  • Developers get fast feedback
  • CI/CD scales efficiently

TiDB Application:

  • Evaluate: Bazel vs Turborepo vs Nx
  • Depends on tech stack (Go/Java/TS?)
  • Must support incremental builds
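If Bazel is chosen, the incremental-build property comes from declaring every target and its dependencies explicitly, so only affected targets rebuild. A hypothetical `BUILD.bazel` for a shared Go library (paths, names, and the rules_go version are placeholders):

```starlark
# Hypothetical BUILD.bazel for a shared library in libs/.
# Bazel rebuilds a target only when its sources or deps change.
load("@io_bazel_rules_go//go:def.bzl", "go_library", "go_test")

go_library(
    name = "txnutil",
    srcs = ["txnutil.go"],
    importpath = "github.com/example/monorepo/libs/txnutil",  # placeholder
    deps = ["//libs/logging"],  # explicit edge in the dependency graph
)

go_test(
    name = "txnutil_test",
    srcs = ["txnutil_test.go"],
    embed = [":txnutil"],
)
```

Because every edge is declared, the same graph also powers impact analysis ("which tests must run for this change?").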

5. Dependency Management

Google's Approach:
  - All dependencies visible in one graph
  - No circular dependencies (enforced)
  - Breaking changes caught immediately
  - Automated dependency updates

Tooling:
  - Static analysis for dependency detection
  - Automated refactoring for API changes
  - Impact analysis before changes

TiDB Application:

  • Map all 400 repos’ dependencies
  • Identify circular dependencies early
  • Build dependency visualization tool
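The circular-dependency check above can be a simple depth-first search over the repo-level dependency graph. A sketch (repo names and edges are illustrative):

```python
# Detect circular dependencies among repos before migration.
def find_cycle(graph):
    """Return one dependency cycle as a list of nodes, or None."""
    WHITE, GRAY, BLACK = 0, 1, 2
    color = {n: WHITE for n in graph}
    stack = []

    def dfs(node):
        color[node] = GRAY
        stack.append(node)
        for dep in graph.get(node, []):
            if color.get(dep, WHITE) == GRAY:  # back edge -> cycle found
                return stack[stack.index(dep):] + [dep]
            if color.get(dep, WHITE) == WHITE:
                found = dfs(dep)
                if found:
                    return found
        stack.pop()
        color[node] = BLACK
        return None

    for n in list(graph):
        if color[n] == WHITE:
            found = dfs(n)
            if found:
                return found
    return None

# Illustrative inventory: tidb -> kv -> tidb is circular.
deps = {
    "tidb": ["libs/parser", "libs/kv"],
    "libs/parser": ["libs/kv"],
    "libs/kv": ["tidb"],
}
```

Running `find_cycle(deps)` surfaces the offending chain so it can be broken before the repos are merged.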

6. Automated Code Review

Pre-commit Review:
  - All changes reviewed before merge
  - Automated checks (lint, tests, security)
  - Human review for logic/approval
  - OWNERS file defines reviewers

Scale Solution:
  - Automated systems make 24,000 commits/day
  - Version-control infrastructure serves ~500,000 requests/second at peak
  - Most commits are automated (large-scale refactoring, cleanup)

TiDB Application:

  • Automated PR checks (CI/CD)
  • CODEOWNERS for review assignment
  • AI-assisted code review (future)
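The automated-checks layer can start with off-the-shelf tooling rather than custom infrastructure. A hypothetical `.pre-commit-config.yaml` (hook revisions are illustrative and would be pinned to current releases):

```yaml
# Hypothetical .pre-commit-config.yaml: machine checks run
# before any human review, mirroring Google's pre-commit gating.
repos:
  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v4.6.0
    hooks:
      - id: end-of-file-fixer
      - id: check-merge-conflict
  - repo: https://github.com/golangci/golangci-lint
    rev: v1.59.0
    hooks:
      - id: golangci-lint
```

Human reviewers then see only changes that already pass lint, formatting, and basic hygiene checks.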

7. Infrastructure: Piper + CitC

Piper (Version Control):
  - Custom distributed version-control system built on Google's storage infrastructure
  - Handles 86 TB of content efficiently
  - Supports ~40,000 commits/day

CitC (Clients in the Cloud):
  - Lightweight checkout
  - Downloads only modified files
  - Cloud-based browsing/editing

CodeSearch:
  - Fast search across entire codebase
  - Cross-workspace search
  - IDE integration (Eclipse, Emacs plugins)

TiDB Application:

  • Use Git (not custom VCS)
  • Shallow clones for agents
  • Implement fast code search (Sourcegraph/Zoekt)
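Stock Git can approximate CitC's partial checkout for agents using partial clone plus sparse checkout. A sketch with standard Git flags (repository URL and paths are placeholders):

```shell
# Blobless clone: fetch history metadata now, file contents on demand
git clone --filter=blob:none --no-checkout https://example.com/mono-repo.git
cd mono-repo

# Sparse checkout: materialize only the directories an agent needs
git sparse-checkout init --cone
git sparse-checkout set products/tidb libs
git checkout main
```

An agent working on one component downloads megabytes instead of the full ~39 GB working tree.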

Google’s Monorepo Challenges & Solutions

Challenge         | Google’s Solution                   | TiDB Application
------------------|-------------------------------------|--------------------------------
Download time     | CitC (partial checkout)             | Shallow clones, sparse checkout
Slow search       | CodeSearch engine                   | Sourcegraph / Zoekt
Build time        | Bazel (incremental)                 | Bazel / Turborepo / Nx
Dependency hell   | Single version, automated updates   | Dependency graph tooling
Code review scale | Automated pre-checks + OWNERS       | GitHub/GitLab CODEOWNERS
Merge conflicts   | Trunk-based, small commits          | Trunk-based development
Access control    | Default open, restricted exceptions | Directory-based permissions

AI-Specific Opportunities (Beyond Google)

Google built their system before AI was mainstream. We have an advantage:

What Google Does (Human-Centric)

Human engineers:
  - Write code
  - Review code
  - Fix dependencies
  - Run builds
  - Deploy services

Automation:
  - Code formatting
  - Dependency updates
  - Build optimization
  - Test execution

What We Can Do (AI-First)

AI agents:
  - Write code (feature development)
  - Review code (automated PR review)
  - Fix dependencies (automated refactoring)
  - Optimize builds (AI-driven caching)
  - Deploy services (auto-scaling decisions)

Humans:
  - Define problems
  - Set priorities
  - Review architecture
  - Handle edge cases

Key Difference: Google automated processes. We can automate decisions.


Recommended TiDB Mono-Repo Architecture

Layer 1: Repository Structure

mono-repo/
├── products/          # TiDB, TiDB Next-Gen
├── platform/          # Cloud SaaS, control plane
├── devops/            # Operations tools
├── libs/              # Shared libraries
├── tools/             # Build/dev tools
└── infra/             # Infrastructure as code

Layer 2: Build System

Recommendation: Evaluate based on tech stack
- Go: Bazel or Please
- TypeScript: Turborepo or Nx
- Java: Bazel or Gradle
- Mixed: Bazel (most flexible)

Layer 3: Code Ownership

CODEOWNERS file:

/products/tidb/      @tidb-core-team
/platform/cloud/     @cloud-platform-team
/devops/             @devops-team
/libs/               @platform-architects

Layer 4: CI/CD

Path-based triggering:
- Changes to products/tidb/* → Run TiDB tests
- Changes to platform/* → Run platform tests
- Changes to libs/* → Run all tests (shared code)
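Path-based triggering maps directly onto standard CI configuration. A hypothetical GitHub Actions workflow for the TiDB suite (file name, paths, and Go version are placeholders):

```yaml
# Hypothetical .github/workflows/tidb-tests.yml: runs only when
# files under products/tidb/ or the shared libs/ change.
name: tidb-tests
on:
  pull_request:
    paths:
      - "products/tidb/**"
      - "libs/**"   # shared code can break every downstream consumer

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-go@v5
        with:
          go-version: "1.22"
      - run: go test ./products/tidb/...
```

One such workflow per component keeps PR feedback fast while `libs/**` changes still fan out to everything that depends on them.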

Layer 5: AI Agent Integration

400+ Repo Agents:
- Each agent owns one legacy repo
- Agents analyze, recommend, migrate
- Post-migration: agents become component guardians

Orchestrator Agent:
- Coordinates agents
- Makes cross-component decisions
- Optimizes system-wide

Migration Strategy (Google-Inspired)

Phase 1: Analysis (Week 1-2)

  • Inventory all 400 repos
  • Map dependencies
  • Identify owners
  • Score by activity/usage
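The activity/usage scoring above can be a simple weighted sum over inventory data pulled from the Git hosting API. A sketch (weights and the sample inventory are illustrative assumptions):

```python
# Score legacy repos to decide migration order: active,
# widely depended-on repos migrate first.
def migration_score(repo):
    """Higher score = migrate earlier."""
    return (
        2.0 * repo["commits_90d"]    # recent activity
        + 5.0 * repo["dependents"]   # how many repos import it
        + 1.0 * repo["contributors"]
    )

inventory = [
    {"name": "libs/common", "commits_90d": 40, "dependents": 120, "contributors": 15},
    {"name": "tools/legacy-gen", "commits_90d": 0, "dependents": 1, "contributors": 1},
]

ranked = sorted(inventory, key=migration_score, reverse=True)
```

The bottom of the ranking also doubles as a candidate list for archiving rather than migrating.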

Phase 2: Infrastructure (Week 2-3)

  • Set up mono-repo structure
  • Configure build system
  • Set up CI/CD with path filtering
  • Implement CODEOWNERS

Phase 3: Pilot Migration (Week 3-4)

  • Migrate 10-20 repos (P0 priority)
  • Validate build/test/deploy
  • Refine process
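One way to run the pilot migrations while preserving each repo's commit history is `git subtree`, which is built into Git (URL, prefix, and branch name below are placeholders):

```shell
# Import a legacy repo into the mono-repo, keeping its history.
cd mono-repo
git subtree add --prefix=products/tidb https://example.com/legacy/tidb.git main
```

`git log products/tidb` then shows the legacy history in place, which matters for blame, traceability, and the per-repo agents' analysis.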

Phase 4: Bulk Migration (Week 4-8)

  • Migrate remaining repos in batches
  • Automated refactoring where possible
  • Archive old repos

Phase 5: AI Enablement (Week 8+)

  • Deploy agent infrastructure
  • Enable AI code review
  • Enable AI-driven refactoring
  • Enable AI deployment optimization

Success Metrics (Inspired by Google)

Metric                      | Target
----------------------------|-------------------------
Build time (incremental)    | < 5 minutes
Build time (full)           | < 30 minutes
PR review time              | < 4 hours
Merge conflicts/week        | < 10
AI-completed features       | 20% (6 mo), 50% (12 mo)
Automated refactorings/week | 100+

Key Takeaways

  1. Monorepo scales — Google proves 2B+ LOC is viable
  2. Tooling is critical — Can’t do this without proper build/search/review tools
  3. Culture matters — Trunk-based, open access, small commits
  4. Automation is key — Google’s automation does 24k commits/day
  5. AI is our advantage — We can go beyond Google’s human-centric model

Sources:

  • https://cacm.acm.org/research/why-google-stores-billions-of-lines-of-code-in-a-single-repository/
  • https://qeunit.com/blog/how-google-does-monorepo/
  • https://medium.com/@sohail_saifi/the-monorepo-strategy-that-scaled-google-to-2-billion-lines-of-code
  • https://bazel.build/