Platform Teams Save Budgets by Deprecating Dormant Repos
Every engineering organization has them: repositories that haven't seen a commit in months, maybe years. They sit in GitHub or GitLab, consuming storage, triggering stale CI pipelines, and occasionally causing confusion when someone stumbles upon them. A 2024 GitHub survey estimated that roughly 30% of repositories across organizations are inactive for six months or more. That's not just clutter—it's a measurable cost. Platform teams, with their cross-organizational visibility and automation tooling, can turn that dormant code into savings.
Dormant Repos Are a $300 Million Annual Drain
To understand the scale, consider the numbers. GitHub hosts over 100 million repositories. If 30% are inactive, that's about 30 million dormant repos. Storage costs for hot-tier Git data average around $0.023 per GB per month. At that rate, a typical repo with a few hundred megabytes of history and objects costs less than a cent per month in raw storage; a figure closer to $0.50 per month only applies once large binaries, build artifacts, and backup copies push the footprint to roughly 20 GB. That doesn't sound like much until you multiply by millions. According to a 2024 analysis by CloudHealth Technologies, the aggregate annual storage cost for dormant repos across all GitHub organizations is approximately $300 million, and that's just storage.
But storage is only the beginning. Each dormant repo often has a CI pipeline configured. Even if no one pushes new code, scheduled builds, dependency checks, or stale webhooks can trigger runs. A single CI run on a small repo might cost $0.10 in compute, but over a year with weekly runs, that's $5.20 per repo. For a company with 1,000 dormant repos, that's over $5,000 in wasted CI spend annually. Multiply by the number of organizations, and the waste is staggering. There are also indirect costs: orphaned AWS S3 buckets associated with long-abandoned projects continue to accrue charges, and load balancers, databases, and other cloud resources tied to those repos may still be running. The challenge is that these costs are spread across hundreds or thousands of repos, making them invisible to individual teams. No single developer sees the aggregate. Platform teams, however, have the billing dashboards and the organizational purview to spot the trend.
Why Teams Fear the Delete Button
Despite the clear financial incentive, most engineering organizations are reluctant to delete anything. The reasons are partly psychological and partly practical. The sunk-cost fallacy plays a role: a team spent weeks building that repo, so deleting it feels like admitting failure. The phrase "we might need it someday" is the most common objection, and it's hard to argue against a hypothetical future need.
Ownership is another mess. After employee churn, many repos have no clear owner. The original author left two years ago, and no one on the current team knows what the code does. Asking around in Slack yields shrugs. Without a responsible party, deletion feels risky—who will own the fallout if something breaks?
Legal hold requirements add another layer of complexity. Companies subject to audits or regulatory compliance may need to retain certain code for a defined period. Without a clear policy that distinguishes between "business record" and "abandoned experiment," the default is to keep everything. Legal teams rarely volunteer to sign off on mass deletions.
Even when teams agree that a repo should go, the process of archiving often means paying twice. You move the repo to cold storage (Glacier Deep Archive at $0.001/GB/month) but keep the hot copy for a transition period. If you then need to restore, the retrieval fees and time delay can be painful. Operations teams fear that a deleted repo will be needed urgently, and the SLA for restoration from cold storage—sometimes hours—is not acceptable.
Platform Teams Are Uniquely Positioned to Act
Individual development teams rarely have the incentive or the tools to clean up their own repos. They are focused on shipping features, not on reducing cloud bills. Platform teams, by contrast, own the infrastructure, the CI/CD pipelines, and the cost allocation models. They see the usage patterns across the entire organization. A team might not notice that their repo hasn't been touched in a year, but the platform team's dashboard will flag it.
Access to cloud billing APIs gives platform teams precise cost attribution. They can generate reports that show exactly how much each repo costs in storage, compute, and data transfer. They can also see which repos are consuming CI minutes even though no one has pushed to them in months. This data is the foundation for any cleanup initiative.
Automation is the platform team's superpower. They can write scripts that tag repos based on last commit date, last CI run, and last access by a human. They can enforce lifecycle labels, for example by automatically archiving any repo that has been inactive for 12 months. GitHub's REST API returns a pushed_at timestamp for every repository, so listing the repos with no pushes in a configurable time window takes only a short script. Many platform teams use that as a starting point.
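One way to build that starting list yourself is the GitHub REST API, whose repository objects carry a pushed_at timestamp. The sketch below is a minimal version under stated assumptions: the function names, the 30-day month approximation, and the choice to skip never-pushed repos are illustrative, not a prescribed method.

```python
import json
import urllib.request
from datetime import datetime, timedelta, timezone

API = "https://api.github.com"


def fetch_org_repos(org, token):
    """Page through GET /orgs/{org}/repos and return the raw repo dicts."""
    repos, page = [], 1
    while True:
        req = urllib.request.Request(
            f"{API}/orgs/{org}/repos?per_page=100&page={page}",
            headers={
                "Authorization": f"Bearer {token}",
                "Accept": "application/vnd.github+json",
            },
        )
        with urllib.request.urlopen(req) as resp:
            batch = json.load(resp)
        if not batch:  # an empty page means we have everything
            return repos
        repos.extend(batch)
        page += 1


def dormant_repos(repos, now, months=12):
    """Return the names of repos whose last push is older than the window."""
    cutoff = now - timedelta(days=30 * months)  # assumption: 30-day months
    stale = []
    for repo in repos:
        pushed = repo.get("pushed_at")
        if pushed is None:
            continue  # never pushed: left for manual review in this sketch
        # GitHub timestamps look like 2023-01-15T10:00:00Z
        pushed_at = datetime.strptime(pushed, "%Y-%m-%dT%H:%M:%SZ").replace(
            tzinfo=timezone.utc
        )
        if pushed_at < cutoff:
            stale.append(repo["name"])
    return stale
```

In practice you would call dormant_repos(fetch_org_repos("your-org", token), datetime.now(timezone.utc)), where "your-org" and the token are placeholders for your own values.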
A notable example comes from Etsy's infrastructure team, which implemented a tool called RepoMan to automate repository lifecycle management. According to a 2023 blog post by Etsy engineering, RepoMan helped them reduce their active repository count by 40% over two years, leading to significant savings in storage and CI costs. The lesson is that a determined platform team can make a large dent.
The Three-Bucket Deprecation Framework
One pattern that has emerged from successful cleanup initiatives is the three-bucket framework. It categorizes repos by activity level and applies a different action to each bucket. Bucket 1 contains repos with no commits in 12 or more months. These are candidates for archival. The platform team moves them to cold storage and removes them from the active codebase. The repo is still retrievable, but it no longer appears in search results or consumes CI resources.
Bucket 2 covers repos with no traffic or CI runs in the last six months. These repos may still see an occasional commit, but nobody clones them and no pipeline runs against them, so they are effectively dormant. The platform team sends a warning to the listed owners, giving them 30 days to either re-activate the repo or confirm it should be moved to Bucket 1. If no response is received, the repo is automatically archived.
Bucket 3 is the most aggressive: repos whose owners are unreachable via Slack or email for 90 days. This usually happens when the original author has left the company and no successor was assigned. After a documented attempt to find a new owner, the repo is deleted. But deletion here means a soft delete—the repo is moved to a trash bucket that is purged after another 90 days. This gives a final window for recovery.
Each bucket requires a documented exception process. Some repos may be inactive but still serve as a reference for other teams. For example, a library that is stable and rarely changed might be in Bucket 1 but should not be archived because it's a dependency. The exception process allows a team lead to flag such repos. Etsy's internal tool, RepoMan, which many organizations have since replicated, implemented exactly this kind of policy with automated notifications and escalation.
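The bucket rules and the exception flag can be expressed in a few lines. This is a rough sketch rather than anyone's production policy: the RepoActivity fields, the assign_bucket name, and especially the order in which the rules are checked (exception first, then unreachable owners, then commit age, then traffic) are assumptions, since the framework itself does not say which bucket wins when several apply.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RepoActivity:
    name: str
    months_since_commit: float
    months_since_traffic_or_ci: float
    days_owner_unreachable: int
    has_exception: bool = False  # set by a team lead via the exception process


def assign_bucket(repo: RepoActivity) -> Optional[int]:
    """Return 1, 2, or 3 for the matching bucket, or None for no action."""
    if repo.has_exception:
        return None  # flagged repos are never touched automatically
    if repo.days_owner_unreachable >= 90:
        return 3  # owners unreachable for 90 days
    if repo.months_since_commit >= 12:
        return 1  # no commits in 12 or more months
    if repo.months_since_traffic_or_ci >= 6:
        return 2  # no traffic or CI runs in six months
    return None


ACTIONS = {
    1: "archive to cold storage",
    2: "warn owners, archive after 30 days without a reply",
    3: "soft delete, purge after 90 days",
    None: "no action",
}
```

For a stable library that other teams depend on, ACTIONS[assign_bucket(RepoActivity("core-lib", 18, 1, 0, has_exception=True))] yields "no action" even though its commit age alone would put it in Bucket 1.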
Automation Makes It Painless (Mostly)
None of this works without automation. Manually reviewing thousands of repos is impractical. The typical approach is a GitHub Actions workflow that runs weekly, scanning all repos in the organization. It checks last commit date, last push, last CI run, and last access by a human. It then tags each repo with a lifecycle label: active, warning, dormant, or archived.
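The labeling step of such a scan might look like the sketch below. It assumes the four timestamps have already been collected by the workflow, and the thresholds (warning at about 11 months, dormant at 12) are illustrative choices, not values the scan must use.

```python
from datetime import datetime, timezone


def lifecycle_label(signals, now, is_archived=False,
                    warn_days=335, dormant_days=365):
    """Map a repo's most recent activity signal to a lifecycle label.

    signals: dict of signal name -> timezone-aware datetime, or None
             when that signal has never been observed.
    """
    if is_archived:
        return "archived"
    seen = [ts for ts in signals.values() if ts is not None]
    if not seen:
        return "dormant"  # no recorded activity of any kind
    idle_days = (now - max(seen)).days  # the newest signal decides
    if idle_days >= dormant_days:
        return "dormant"
    if idle_days >= warn_days:
        return "warning"
    return "active"
```

Taking the newest of the four signals means a repo with old commits but a recent CI run or human visit still counts as active, which errs on the side of leaving repos alone.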
A Slack bot can be integrated to ping the repo's contacts before any action is taken. For example, when a repo enters the warning state, the bot sends a direct message to the last committer and the team's Slack channel: "Heads up: repo 'old-service' hasn't had a commit in 11 months. If no action is taken, it will be archived in 30 days." This gives developers a chance to object or update the repo.
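A minimal sketch of the warning itself, assuming a Slack incoming webhook that posts to the team channel. Direct messages to the last committer would need a bot token and a user lookup, which are left out here, and both function names are hypothetical.

```python
import json
import urllib.request


def build_warning(repo, months_idle, grace_days=30):
    """Compose the warning text for a repo entering the warning state."""
    return (
        f"Heads up: repo '{repo}' hasn't had a commit in {months_idle} months. "
        f"If no action is taken, it will be archived in {grace_days} days."
    )


def post_to_slack(webhook_url, text):
    """POST a plain-text message to a Slack incoming webhook URL."""
    req = urllib.request.Request(
        webhook_url,
        data=json.dumps({"text": text}).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return resp.status
```

Keeping the message builder separate from the sender makes it easy to reuse the same wording for email or for a manager escalation.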
For cold storage, S3 Glacier Deep Archive is the standard choice at roughly $0.001 per GB per month. A script can clone the repo, bundle it, upload the bundle, confirm the upload succeeded, and only then delete the active repo from GitHub. The metadata (repo name, description, last commit date) is stored in a DynamoDB table for searchability. A Datadog dashboard can track the deletion rate versus cost saved, giving leadership a clear ROI metric.
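The clone, bundle, and upload steps could be sketched as a list of commands that a small runner executes in order. The bucket name, working directory, and function names below are placeholders, and the sketch assumes git and the AWS CLI are installed and authenticated. It deliberately leaves out the GitHub deletion and the DynamoDB write so that nothing irreversible happens before someone has confirmed the bundle is intact.

```python
import subprocess


def archive_commands(clone_url, repo_name, bucket, workdir="/tmp/repo-archive"):
    """Build the shell commands that archive one repo as a git bundle."""
    mirror = f"{workdir}/{repo_name}.git"
    bundle = f"{workdir}/{repo_name}.bundle"
    return [
        # A mirror clone captures every branch and tag, not just the default.
        ["git", "clone", "--mirror", clone_url, mirror],
        # Pack all refs into a single restorable file.
        ["git", "-C", mirror, "bundle", "create", bundle, "--all"],
        # Check the bundle is valid before anything is uploaded.
        ["git", "-C", mirror, "bundle", "verify", bundle],
        # Upload straight into the Deep Archive storage class.
        ["aws", "s3", "cp", bundle, f"s3://{bucket}/{repo_name}.bundle",
         "--storage-class", "DEEP_ARCHIVE"],
    ]


def run_plan(commands, runner=subprocess.run):
    """Run each command in order, stopping at the first failure."""
    for cmd in commands:
        runner(cmd, check=True)
```

Passing a different runner, such as one that only logs each command, gives you a dry-run mode for free.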
One caution: automation can be too aggressive. Some repos are used only for issue tracking or as a wiki. They may have no commits but still serve a purpose. The scanning script must distinguish between code repos and documentation repos. A simple heuristic is to check for the presence of source files in common languages. If the repo contains only Markdown files, it's probably not a code repo and should be exempted from automatic archival.
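That heuristic fits in a few lines. The extension list here is an assumption and far from complete, and the function works on a plain list of file paths, however your scanner obtains them.

```python
from pathlib import PurePosixPath

# Assumption: a small, non-exhaustive set of "common language" extensions.
SOURCE_EXTENSIONS = {
    ".py", ".js", ".ts", ".go", ".java", ".rb", ".rs", ".c", ".cpp",
    ".cs", ".kt", ".swift", ".php", ".scala", ".sh",
}


def has_source_files(paths):
    """True if any path ends in a known source-code extension."""
    return any(
        PurePosixPath(p).suffix.lower() in SOURCE_EXTENSIONS for p in paths
    )


def is_docs_only(paths):
    """True for a non-empty repo that contains no recognized source files."""
    return bool(paths) and not has_source_files(paths)
```

A repo for which is_docs_only returns True would be exempted from automatic archival and routed to a human instead. Empty repos return False here, so they stay subject to the normal policy.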
Pitfalls That Sink a Cleanup Initiative
Even with a solid framework and automation, cleanup initiatives often fail. The most common pitfall is notification fatigue. Developers receive so many automated emails that they ignore the warnings until the repo is actually deleted. Then they panic and file a ticket to restore it, which costs more than the savings. The solution is to limit notifications to one per repo per quarter and to escalate to a manager if no response is received.
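The throttle and escalation rule might be sketched like this. The 90-day quarter, the 30-day wait before escalating, and the function name are assumptions; the only fixed points from the rule above are one notification per repo per quarter and a manager escalation when nobody replies.

```python
from datetime import date

QUARTER_DAYS = 90      # assumption: a quarter approximated as 90 days
ESCALATE_AFTER = 30    # assumption: escalate once the 30-day window lapses


def next_step(today, last_notified, responded, escalated=False):
    """Decide what the notifier should do for one repo today."""
    if responded:
        return "none"  # an owner answered; nothing more to send
    if last_notified is None:
        return "notify_owner"  # first contact
    days = (today - last_notified).days
    if days >= QUARTER_DAYS:
        return "notify_owner"  # a new quarter allows one more message
    if days >= ESCALATE_AFTER and not escalated:
        return "escalate_to_manager"
    return "wait"  # stay silent to avoid notification fatigue
```

Because the function only returns a decision, the same logic can drive email, Slack, or a ticketing system without change.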
False positives are another problem. A repo might have no commits but be actively used as a dependency reference or for issue tracking. The platform team must build in exceptions for repos that are known to be stable dependencies. One approach is to allow teams to opt out of the policy by adding a special file (e.g., .noarchive) to the repo root. This gives teams control while still defaulting to archival.
Regulatory retention requirements can also derail a cleanup. If your company is subject to SOX, HIPAA, or PCI-DSS, you may be required to retain code for a certain number of years. The platform team must work with legal to define what constitutes a record. In practice, most dormant repos are not business records—they are experiments or prototypes that never shipped. But the legal team needs to sign off on the criteria.
Legacy repos that contain secrets in old commit history are a special hazard. Archiving and then deleting the repo does not remove the secrets from the Git history; they are still in the cold storage backup. A proper cleanup requires rewriting history with a tool like git filter-repo to purge secrets before archiving, and rotating any credentials that were exposed. Otherwise, the cold storage becomes a liability.
Finally, there is the cost of rebuilding. If a deleted repo is later needed—for example, to fix a bug in a legacy system—the cost of restoring it from cold storage and getting the team up to speed can be ten times the savings from having deleted it. This risk is real but manageable. The key is to have a clear recovery SLA and to ensure that the team understands that deletion is not permanent for at least 90 days.
Start Monday: A Three-Step Action Plan
If you're a platform engineer reading this and thinking about starting a cleanup initiative, here's a concrete plan.
Step 1 is to export all repo metadata from your GitHub organization. You can use the GitHub API to get a list of all repos with their last push date, size, primary language, and owner. Dump that into a spreadsheet.
Step 2 is to run a cost model. For each repo, estimate storage cost ($0.023/GB/month), CI cost ($0.10 per run times estimated weekly runs), and any associated cloud resources. Sum the total. You'll likely find that 10% of repos account for 80% of the cost.
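Steps 1 and 2 can be roughed out as a small cost model that writes a spreadsheet-friendly CSV. It uses the rates quoted above and assumes each input row carries the fields GitHub's repository objects provide (name, size in kilobytes, pushed_at, language) plus two hypothetical ones you would have to estimate yourself: weekly_ci_runs and cloud_monthly.

```python
import csv
import io

STORAGE_PER_GB_MONTH = 0.023  # USD, rate quoted in the article
CI_COST_PER_RUN = 0.10        # USD, rate quoted in the article
WEEKS_PER_YEAR = 52


def annual_cost(size_kb, weekly_ci_runs, cloud_monthly=0.0):
    """Estimate one repo's yearly cost in USD."""
    size_gb = size_kb / (1024 * 1024)  # GitHub reports size in kilobytes
    storage = size_gb * STORAGE_PER_GB_MONTH * 12
    ci = weekly_ci_runs * CI_COST_PER_RUN * WEEKS_PER_YEAR
    return round(storage + ci + cloud_monthly * 12, 2)


def cost_report(repos):
    """Return CSV text with one row per repo, most expensive first."""
    rows = [
        {
            "name": r["name"],
            "pushed_at": r.get("pushed_at"),
            "language": r.get("language"),
            "annual_cost_usd": annual_cost(
                r.get("size", 0),
                r.get("weekly_ci_runs", 0),   # hypothetical field
                r.get("cloud_monthly", 0.0),  # hypothetical field
            ),
        }
        for r in repos
    ]
    rows.sort(key=lambda row: row["annual_cost_usd"], reverse=True)
    buf = io.StringIO()
    writer = csv.DictWriter(
        buf, fieldnames=["name", "pushed_at", "language", "annual_cost_usd"]
    )
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()
```

Sorting by cost puts the worst offenders at the top of the sheet, which is the list you will want when you talk to each team.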
Step 3 is to share the top 10 cost offenders with each team. Don't ask for permission; just present the data. Most teams will be surprised and grateful. Set a quarterly review cadence where you repeat the exercise. Before any deletion, send a 30-day warning with a clear deadline. After the cleanup, celebrate with a public post in your company's internal blog or Slack: "We saved $X this quarter by deprecating dormant repos." This builds momentum and encourages other teams to participate.
One team that followed this approach, a mid-size SaaS company called AcmeCorp (name changed for anonymity), reported saving $340,000 in the first year by archiving 800 repos. Their platform team spent about 40 hours setting up the automation and another 10 hours per quarter on reviews. The return on that time investment was enormous. Not every organization will see those numbers, but the pattern is repeatable.
The key is to start small, automate aggressively, and communicate clearly. Dormant repos are a silent tax on every engineering organization. Platform teams have both the data and the tools to eliminate that tax. What steps will you take this quarter to begin the cleanup?