Google Monorepo Lessons Learned
Key Insights from Google's 2-Billion-Line Monorepo
Research summary for TiDB Mono-Repo Consolidation Project
Scale Comparison
| Metric | Google | TiDB Target |
|---|---|---|
| Lines of Code | 2 billion | TBD (codebase ~39 GB) |
| Engineers | 25,000+ | TBD |
| Commits/day | 45,000 | TBD |
| Files | 9 million | TBD |
| Storage | 86 TB | 39 GB |
Key Insight: Google proves a monorepo scales to extreme size with the right tooling.
Core Principles (Google’s Playbook)
1. Single Source of Truth
- ✅ ONE repository for 95% of the codebase
- ✅ No submodules
- ✅ No complex cross-repo dependency graphs
- ✅ No "which version should I use?" problems
TiDB Application: All 400 repos → 1 mono-repo
2. Trunk-Based Development
main (trunk)
│
├── Developers commit directly to main
├── Code review BEFORE merge (pre-commit)
├── Release branches for deployment only
└── Feature flags for incomplete features
Benefits:
- No merge nightmares from long-lived branches
- Early integration conflict detection
- Continuous delivery enabled
TiDB Application: Adopt trunk-based from day 1
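Feature flags are what let incomplete work merge to trunk safely: the new code path ships dark and is enabled later. A minimal Python sketch of the idea; the flag name and functions are illustrative, not a real TiDB API:

```python
# Minimal feature-flag sketch: incomplete code lands on trunk but stays
# dark until the flag is flipped. Names here are hypothetical examples.

FLAGS = {
    "new_query_planner": False,  # feature still under development
}

def is_enabled(flag: str) -> bool:
    """Check a flag; unknown flags default to off."""
    return FLAGS.get(flag, False)

def plan_query(sql: str) -> str:
    if is_enabled("new_query_planner"):
        return f"NEW plan for: {sql}"    # in-progress path, dark in production
    return f"legacy plan for: {sql}"     # stable path, always available
```

Flipping the flag (per environment, per tenant, or per percentage of traffic in a real system) activates the new path without any branch merge.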
3. Code Ownership & Visibility
Default: OPEN ACCESS
- All engineers can read all code
- Traceability built-in
- Exceptions: restricted files (security, legal)
Ownership: Workspace-based
- Each directory has owning team
- Responsible engineer identified
- CODEOWNERS enforcement
TiDB Application:
- Default open access within engineering
- CODEOWNERS file for each component
- Clear ownership boundaries
4. Build System: Bazel
Key Features:
- Incremental builds (only changed targets)
- Remote caching (share build artifacts)
- Parallel execution
- Dependency graph analysis
- Hermetic builds (reproducible)
Why It Matters:
- 2B LOC builds in minutes, not hours
- Developers get fast feedback
- CI/CD scales efficiently
TiDB Application:
- Evaluate: Bazel vs Turborepo vs Nx
- Depends on tech stack (Go/Java/TS?)
- Must support incremental builds
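The incremental-build requirement boils down to one idea: hash a target's inputs and skip the build on a cache hit. A toy Python model of that mechanism, not how Bazel is actually implemented:

```python
# Toy model of incremental builds via content hashing, the core idea
# behind Bazel/Turborepo remote caching. Target names are illustrative.
import hashlib

cache = {}  # (target, inputs-hash) -> build artifact

def inputs_hash(files: dict) -> str:
    """Stable hash over a target's input files (name + content)."""
    h = hashlib.sha256()
    for name in sorted(files):
        h.update(name.encode())
        h.update(files[name].encode())
    return h.hexdigest()

def build(target: str, files: dict):
    """Return (artifact, cache_hit); rebuild only if inputs changed."""
    key = (target, inputs_hash(files))
    if key in cache:
        return cache[key], True           # cache hit: skip compilation
    artifact = f"compiled:{target}"       # stand-in for real compilation
    cache[key] = artifact
    return artifact, False

_, hit1 = build("tidb-server", {"main.go": "package main // v1"})
_, hit2 = build("tidb-server", {"main.go": "package main // v1"})  # unchanged
_, hit3 = build("tidb-server", {"main.go": "package main // v2"})  # changed
```

Sharing `cache` across machines is what turns this into remote caching: one engineer's (or CI's) build output becomes everyone's cache hit.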
5. Dependency Management
Google's Approach:
- All dependencies visible in one graph
- No circular dependencies (enforced)
- Breaking changes caught immediately
- Automated dependency updates
Tooling:
- Static analysis for dependency detection
- Automated refactoring for API changes
- Impact analysis before changes
TiDB Application:
- Map all 400 repos’ dependencies
- Identify circular dependencies early
- Build dependency visualization tool
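Catching circular dependencies during the repo-mapping step is a standard graph problem: a DFS with three-color marking finds a cycle if one exists. A hedged sketch; the repo names and edges are hypothetical:

```python
# Detect one circular dependency in a repo-dependency graph via DFS
# with three-color marking (white = unvisited, gray = on current path,
# black = done). Repo names below are made-up examples.
def find_cycle(graph):
    """Return one cycle as a node list (first == last), or None."""
    WHITE, GRAY, BLACK = 0, 1, 2
    color = {n: WHITE for n in graph}

    def visit(node, path):
        color[node] = GRAY
        path.append(node)
        for dep in graph.get(node, []):
            if color.get(dep, WHITE) == GRAY:        # back edge: cycle found
                return path[path.index(dep):] + [dep]
            if color.get(dep, WHITE) == WHITE:
                found = visit(dep, path)
                if found:
                    return found
        path.pop()
        color[node] = BLACK
        return None

    for n in list(graph):
        if color[n] == WHITE:
            found = visit(n, [])
            if found:
                return found
    return None

deps = {  # hypothetical repo dependency edges
    "tidb": ["parser", "pd-client"],
    "parser": [],
    "pd-client": ["tidb"],  # circular: tidb -> pd-client -> tidb
}
```

Running this over the full 400-repo graph before migration flags every cycle that would have to be broken before the repos can coexist in one build graph.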
6. Automated Code Review
Pre-commit Review:
- All changes reviewed before merge
- Automated checks (lint, tests, security)
- Human review for logic/approval
- OWNERS file defines reviewers
Scale Solution:
- Automated systems make 24,000 commits/day
- Repository infrastructure serves ~500,000 requests/second, mostly from automated systems
- Most commits are automated (refactoring, cleanup)
TiDB Application:
- Automated PR checks (CI/CD)
- CODEOWNERS for review assignment
- AI-assisted code review (future)
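Reviewer assignment from a CODEOWNERS-style mapping can be sketched as longest-prefix matching over the changed paths. Illustrative only; the team names are assumptions, and real CODEOWNERS matching has richer glob rules:

```python
# Sketch of CODEOWNERS-style reviewer routing: each changed path is
# matched to the owning team with the longest matching path prefix.
# Paths and team handles are hypothetical examples.
OWNERS = {
    "products/tidb/": "@tidb-core-team",
    "platform/cloud/": "@cloud-platform-team",
    "libs/": "@platform-architects",
}

def reviewers_for(changed_files):
    """Return the set of teams that must review this change."""
    teams = set()
    for path in changed_files:
        best = None
        for prefix, team in OWNERS.items():
            if path.startswith(prefix) and (best is None or len(prefix) > len(best[0])):
                best = (prefix, team)    # keep the most specific owner
        if best:
            teams.add(best[1])
    return teams
```

A change touching both a product directory and a shared library then automatically requires sign-off from both owning teams.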
7. Infrastructure: Piper + CitC
Piper (Version Control):
- Custom version-control system built on distributed storage
- Handles 86TB efficiently
- Supports 40,000 commits/day
CitC (Clients in the Cloud):
- Lightweight checkout
- Downloads only modified files
- Cloud-based browsing/editing
CodeSearch:
- Fast search across entire codebase
- Cross-workspace search
- IDE integration (Eclipse, Emacs plugins)
TiDB Application:
- Use Git (not custom VCS)
- Shallow clones for agents
- Implement fast code search (Sourcegraph/Zoekt)
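Tools like Zoekt get their speed from trigram indexing: every 3-character substring maps to the files containing it, so a query only scans candidate files instead of the whole codebase. A toy sketch of the technique, not Zoekt's actual index format:

```python
# Toy trigram index, the technique behind Zoekt-style code search.
# File names and contents below are illustrative.
from collections import defaultdict

def trigrams(text):
    """All 3-character substrings of `text`."""
    return {text[i:i + 3] for i in range(len(text) - 2)}

class TrigramIndex:
    def __init__(self):
        self.index = defaultdict(set)  # trigram -> file names
        self.files = {}

    def add(self, name, content):
        self.files[name] = content
        for t in trigrams(content):
            self.index[t].add(name)

    def search(self, query):
        # A match must contain every trigram of the query, so intersect
        # the posting sets first, then verify with a real substring scan.
        cands = None
        for t in trigrams(query):
            cands = self.index[t] if cands is None else cands & self.index[t]
        cands = cands if cands is not None else set(self.files)
        return sorted(f for f in cands if query in self.files[f])
```

The intersection step prunes almost all files before any content is read, which is what makes whole-repo search feel instant.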
Google’s Monorepo Challenges & Solutions
| Challenge | Google’s Solution | TiDB Application |
|---|---|---|
| Download time | CitC (partial checkout) | Shallow clones, sparse checkout |
| Slow search | CodeSearch engine | Sourcegraph / Zoekt |
| Build time | Bazel (incremental) | Bazel/Turborepo/Nx |
| Dependency hell | Single version, automated updates | Dependency graph tooling |
| Code review scale | Automated pre-checks + OWNERS | GitHub/GitLab CODEOWNERS |
| Merge conflicts | Trunk-based, small commits | Trunk-based development |
| Access control | Default open, exceptions restricted | Directory-based permissions |
AI-Specific Opportunities (Beyond Google)
Google built its system before AI was mainstream. We have an advantage:
What Google Does (Human-Centric)
Human engineers:
- Write code
- Review code
- Fix dependencies
- Run builds
- Deploy services
Automation:
- Code formatting
- Dependency updates
- Build optimization
- Test execution
What We Can Do (AI-First)
AI agents:
- Write code (feature development)
- Review code (automated PR review)
- Fix dependencies (automated refactoring)
- Optimize builds (AI-driven caching)
- Deploy services (auto-scaling decisions)
Humans:
- Define problems
- Set priorities
- Review architecture
- Handle edge cases
Key Difference: Google automated processes. We can automate decisions.
Recommended Architecture for TiDB
Layer 1: Repository Structure
mono-repo/
├── products/ # TiDB, TiDB Next-Gen
├── platform/ # Cloud SaaS, control plane
├── devops/ # Operations tools
├── libs/ # Shared libraries
├── tools/ # Build/dev tools
└── infra/ # Infrastructure as code
Layer 2: Build System
Recommendation: Evaluate based on tech stack
- Go: Bazel or Please
- TypeScript: Turborepo or Nx
- Java: Bazel or Gradle
- Mixed: Bazel (most flexible)
Layer 3: Code Ownership
CODEOWNERS file (trailing slash covers the whole subtree; `dir/*` would match only direct children):
- products/tidb/ @tidb-core-team
- platform/cloud/ @cloud-platform-team
- devops/ @devops-team
- libs/ @platform-architects
Layer 4: CI/CD
Path-based triggering:
- Changes to products/tidb/* → Run TiDB tests
- Changes to platform/* → Run platform tests
- Changes to libs/* → Run all tests (shared code)
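The triggering rules above reduce to a prefix-to-suite mapping. A minimal Python model of what the CI config would encode; suite names are made up:

```python
# Path-based CI triggering: decide which test suites run from the set
# of changed paths. Prefixes and suite names are illustrative.
RULES = [
    ("products/tidb/", {"tidb-tests"}),
    ("platform/", {"platform-tests"}),
    ("libs/", {"tidb-tests", "platform-tests"}),  # shared code -> run everything
]

def suites_to_run(changed_files):
    """Union of test suites triggered by the changed paths."""
    suites = set()
    for path in changed_files:
        for prefix, targets in RULES:
            if path.startswith(prefix):
                suites |= targets
    return suites
```

In practice the same mapping is expressed declaratively (e.g. GitHub Actions `paths:` filters or Bazel's reverse-dependency query), but the selection logic is this.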
Layer 5: AI Agent Integration
400+ Repo Agents:
- Each agent owns one legacy repo
- Agents analyze, recommend, migrate
- Post-migration: agents become component guardians
Orchestrator Agent:
- Coordinates agents
- Makes cross-component decisions
- Optimizes system-wide
Migration Strategy (Google-Inspired)
Phase 1: Analysis (Week 1-2)
- Inventory all 400 repos
- Map dependencies
- Identify owners
- Score by activity/usage
Phase 2: Infrastructure (Week 2-3)
- Set up mono-repo structure
- Configure build system
- Set up CI/CD with path filtering
- Implement CODEOWNERS
Phase 3: Pilot Migration (Week 3-4)
- Migrate 10-20 repos (P0 priority)
- Validate build/test/deploy
- Refine process
Phase 4: Bulk Migration (Week 4-8)
- Migrate remaining repos in batches
- Automated refactoring where possible
- Archive old repos
Phase 5: AI Enablement (Week 8+)
- Deploy agent infrastructure
- Enable AI code review
- Enable AI-driven refactoring
- Enable AI deployment optimization
Success Metrics (Inspired by Google)
| Metric | Target |
|---|---|
| Build time (incremental) | <5 minutes |
| Build time (full) | <30 minutes |
| PR review time | <4 hours |
| Merge conflicts/week | <10 |
| AI-completed features | 20% (6mo), 50% (12mo) |
| Automated refactoring/week | 100+ |
Key Takeaways
- Monorepo scales: Google proves 2B+ LOC is viable
- Tooling is critical: you can't do this without proper build/search/review tools
- Culture matters: trunk-based development, open access, small commits
- Automation is key: Google's automation makes 24k commits/day
- AI is our advantage: we can go beyond Google's human-centric model
Sources:
- https://cacm.acm.org/research/why-google-stores-billions-of-lines-of-code-in-a-single-repository/
- https://qeunit.com/blog/how-google-does-monorepo/
- https://medium.com/@sohail_saifi/the-monorepo-strategy-that-scaled-google-to-2-billion-lines-of-code
- https://bazel.build/