The Old IT Playbook vs the New Automation Playbook
Technology teams tend to work from one of two playbooks. The "old IT playbook" emphasizes large, planned rollouts, heavy documentation, and rigid change control. The "new automation playbook" emphasizes short cycles, modular workflows, and using AI components where they fit best.
This post explains the practical differences, the trade-offs, and how to move toward modular automation without abandoning governance or reliability.
What the old playbook actually bought you
The classic enterprise approach focused on predictability and stability. Typical characteristics:
- Big design phases and long requirements cycles
- Large releases that change many systems at once
- Centralized testing and a gate-based approval process
- Heavy documentation and strict rollback plans
Why teams did this: when every change could break core systems, spending time up front reduced unexpected outages. The cost: slower time to value, larger coordination overhead, and brittle integrations when external conditions changed.
What the new automation playbook looks like
The modern approach breaks work into smaller, composable pieces and iterates quickly. Core ideas:
- Modular workflows (small services, connectors, or agents) that do one job well
- Fast feedback loops—deploy small changes, observe, and improve
- Human-in-the-loop checkpoints for sensitive decisions
- Observability and automated testing built into each module
This playbook aims for speed and adaptability. But faster doesn't mean careless—governance, reliability engineering, and clear contracts between modules remain essential.
Key contrasts: rollout speed, scope, and risk management
Rollout speed
- Old: months or quarters between releases.
- New: days to weeks for focused modules.
Scope
- Old: broad scope with many dependencies.
- New: narrow scope per module; integration via well-defined interfaces.
Risk management
- Old: reduce risk by delaying and batching changes.
- New: reduce risk by containing changes, monitoring, and limiting blast radius.
Governance and reliability—realities for teams adopting modular automation
Switching playbooks doesn't eliminate governance needs. Expect to address these practical areas:
Versioning and contracts
- Treat each module as a service with a clear API or data contract.
- Maintain backward compatibility where reasonable.
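A data contract can be as small as a typed record plus a validation step. The sketch below assumes a hypothetical invoice-triage module; the field names (`invoice_id`, `vendor`, `amount_cents`, `currency`) and the `schema_version` convention are illustrative assumptions, not taken from any particular system.

```python
from dataclasses import dataclass

SUPPORTED_VERSIONS = {1, 2}  # hypothetical: schema versions this module still accepts


@dataclass(frozen=True)
class InvoiceRequest:
    """Input contract for a hypothetical invoice-triage module."""
    invoice_id: str
    vendor: str
    amount_cents: int
    currency: str = "USD"    # assumed to be added in version 2; default keeps v1 callers working
    schema_version: int = 1


def parse_request(payload: dict) -> InvoiceRequest:
    """Validate a raw payload against the contract; raise ValueError on mismatch."""
    version = payload.get("schema_version", 1)
    if version not in SUPPORTED_VERSIONS:
        raise ValueError(f"unsupported schema_version: {version}")
    for name in ("invoice_id", "vendor", "amount_cents"):
        if name not in payload:
            raise ValueError(f"missing required field: {name}")
    if not isinstance(payload["amount_cents"], int) or payload["amount_cents"] < 0:
        raise ValueError("amount_cents must be a non-negative integer")
    return InvoiceRequest(
        invoice_id=str(payload["invoice_id"]),
        vendor=str(payload["vendor"]),
        amount_cents=payload["amount_cents"],
        currency=payload.get("currency", "USD"),
        schema_version=version,
    )
```

Giving the newer field a default is one way to stay backward compatible: callers that still send version 1 payloads keep working while version 2 callers can supply `currency`.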
Testing and validation
- Unit tests for business logic and regression tests for workflows.
- Staging environments that mirror production behavior for critical modules.
Observability and alerts
- Logs, metrics, and traces per module.
- Automated alerts tied to business-level KPIs, not just system errors.
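As a rough illustration of per-module observability, the sketch below emits one JSON log line per processed item and checks a business-level signal (the share of items escalated to humans) rather than only system errors. The module name, the outcome values, and the 20% threshold are assumptions made for the example.

```python
import json
import logging
import time

logger = logging.getLogger("invoice_triage")  # hypothetical module name


def format_event(module: str, outcome: str, duration_ms: float, **fields) -> str:
    """Build one JSON log line carrying both system and business fields."""
    event = {
        "module": module,
        "outcome": outcome,  # e.g. "auto_approved", "escalated", "error"
        "duration_ms": round(duration_ms, 2),
        **fields,
    }
    return json.dumps(event, sort_keys=True)


def should_alert(outcomes: list[str], max_escalation_rate: float = 0.2) -> bool:
    """Alert on a business-level signal: the share of items escalated to humans."""
    if not outcomes:
        return False
    escalated = sum(1 for o in outcomes if o == "escalated")
    return escalated / len(outcomes) > max_escalation_rate


start = time.perf_counter()
# ... the module's real work would happen here ...
elapsed_ms = (time.perf_counter() - start) * 1000
logger.info(format_event("invoice_triage", "escalated", elapsed_ms, vendor="Acme"))
```

In practice the outcome list would come from a metrics store or a log query over a time window; a plain list keeps the sketch self-contained.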
Human oversight
- Define where humans must review outputs (payments, compliance, large-value exceptions).
- Use human-in-the-loop patterns that are measurable and auditable.
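One way to make a human checkpoint measurable and auditable is to record every routing decision along with the reason for it. The following is a minimal sketch under assumed thresholds (0.9 confidence, a large-value cutoff in cents); `Decision` and `ReviewQueue` are hypothetical names, and the in-memory lists stand in for a real queue and audit store.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class Decision:
    item_id: str
    label: str
    confidence: float
    amount_cents: int


@dataclass
class ReviewQueue:
    """In-memory stand-in for a real review queue; keeps an audit trail."""
    pending: list = field(default_factory=list)
    audit_log: list = field(default_factory=list)

    def route(self, d: Decision, min_confidence: float = 0.9,
              max_auto_amount_cents: int = 500_000) -> str:
        """Return 'auto' or 'human' and record why, so the choice is auditable."""
        reasons = []
        if d.confidence < min_confidence:
            reasons.append("low_confidence")
        if d.amount_cents > max_auto_amount_cents:
            reasons.append("large_value")
        path = "human" if reasons else "auto"
        if path == "human":
            self.pending.append(d)
        self.audit_log.append({
            "item_id": d.item_id,
            "path": path,
            "reasons": reasons,
            "at": datetime.now(timezone.utc).isoformat(),
        })
        return path

    def human_review_rate(self) -> float:
        """Share of routed items sent to a human; 0.0 when nothing was routed."""
        if not self.audit_log:
            return 0.0
        humans = sum(1 for e in self.audit_log if e["path"] == "human")
        return humans / len(self.audit_log)
```

Tracking `human_review_rate()` over time gives a concrete number to watch as thresholds are tuned.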
Access and change controls
- Role-based access for deploying or modifying modules.
- Clear approval flows for high-impact changes.
Incident and rollback playbooks
- Small modules simplify rollback: disable or replace a module rather than reverting a whole release.
- Maintain incident runbooks that map service alerts to business impacts.
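Disabling a single module can be as simple as a flag checked at the call site, with a fallback path that keeps work flowing. This sketch assumes an in-process flag dictionary and two hypothetical handlers (`auto_triage`, `manual_queue`); a real deployment would more likely read the flag from configuration or a feature-flag service.

```python
from typing import Callable, Dict

# Hypothetical in-process flag store, used only to keep the example self-contained.
MODULE_ENABLED: Dict[str, bool] = {"invoice_triage": True}


def run_module(name: str, automated: Callable[[dict], dict],
               fallback: Callable[[dict], dict], payload: dict) -> dict:
    """Run the automated path only while its flag is on; otherwise fall back."""
    if not MODULE_ENABLED.get(name, False):
        return fallback(payload)
    try:
        return automated(payload)
    except Exception:
        # Contain the failure to this module instead of failing the whole flow.
        return fallback(payload)


def auto_triage(payload: dict) -> dict:
    """Hypothetical automated handler; raises KeyError on a malformed payload."""
    return {"invoice_id": payload["invoice_id"], "route": "auto"}


def manual_queue(payload: dict) -> dict:
    """Hypothetical fallback: hand the item to the manual process."""
    return {"invoice_id": payload.get("invoice_id"), "route": "manual"}
```

Rolling back then means setting `MODULE_ENABLED["invoice_triage"] = False` rather than reverting a release; the same fallback also absorbs unexpected errors while the module is on.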
Practical migration steps (start small, instrument everything)
- Pick a narrow, high-value use case
  - Example: automate inbound invoice triage rather than reworking the entire accounting system.
- Define clear inputs and outputs
  - Build a simple contract for the module: what it expects and what it yields.
- Implement with observability from day one
  - Emit structured logs and business metrics (e.g., processing time, error rate, % escalated).
- Add human checkpoints for risk areas
  - If confidence is low, route outputs to a human queue before full automation.
- Iterate and expand
  - Use telemetry to find failure modes, refine prompts or logic, and then widen the module's scope.
- Standardize governance patterns
  - Capture the deployment checklist, test suites, and rollback steps so other teams can reuse them.
Typical failure modes and how to prevent them
Agent or model drift
- Prevent with continuous validation and periodic re-training or rule adjustments.
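Continuous validation can start as a periodic comparison between the module's outputs and a small human-verified sample. The sketch below is one simple approach, assuming accuracy against labels is a reasonable proxy for quality; the function names and the 0.05 tolerance are illustrative, not a standard.

```python
def accuracy(predictions: list[str], labels: list[str]) -> float:
    """Fraction of predictions matching human-verified labels."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if not labels:
        raise ValueError("need at least one labeled example")
    correct = sum(1 for p, y in zip(predictions, labels) if p == y)
    return correct / len(labels)


def drift_detected(baseline_accuracy: float, current_accuracy: float,
                   tolerance: float = 0.05) -> bool:
    """Flag drift when accuracy falls more than `tolerance` below the baseline."""
    return (baseline_accuracy - current_accuracy) > tolerance
```

Running this on a fresh labeled sample each week (or each release) turns "periodic re-training or rule adjustments" into a decision triggered by data rather than by the calendar.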
Interface mismatch between modules
- Prevent by defining and testing contracts, and offering graceful fallbacks.
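On the consuming side, a contract check with a graceful fallback might look like the sketch below. The required fields and the `needs_human` default are assumptions for a hypothetical ticket-triage output; the point is that a mismatch degrades to a safe route instead of propagating bad data.

```python
# Hypothetical output contract for an upstream triage module.
REQUIRED_OUTPUT_FIELDS = {"ticket_id": str, "category": str, "confidence": float}


def contract_violations(output: dict) -> list[str]:
    """List every way the output departs from the expected contract."""
    problems = []
    for name, expected_type in REQUIRED_OUTPUT_FIELDS.items():
        if name not in output:
            problems.append(f"missing:{name}")
        elif not isinstance(output[name], expected_type):
            problems.append(f"wrong_type:{name}")
    return problems


def consume(output: dict) -> dict:
    """Accept conforming output; otherwise degrade to a safe default route."""
    problems = contract_violations(output)
    if problems:
        return {
            "ticket_id": output.get("ticket_id"),
            "category": "needs_human",
            "confidence": 0.0,
            "problems": problems,
        }
    return {**output, "problems": []}
```

The same `contract_violations` function can double as a contract test in each module's test suite, so producer and consumer are checked against one definition.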
Over-automation of edge cases
- Keep a queue for exceptions and track the exception rate to decide when automating more is worthwhile.
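To decide when an edge case is worth automating, it helps to count exceptions by reason. The sketch below assumes each exception is tagged with a short reason string; the reason names and the 5% share threshold are made up for illustration.

```python
from collections import Counter


def exception_report(exception_reasons: list[str], total_processed: int) -> dict:
    """Summarize how often items fall out of automation and why."""
    if total_processed <= 0:
        raise ValueError("total_processed must be positive")
    counts = Counter(exception_reasons)
    return {
        "exception_rate": len(exception_reasons) / total_processed,
        "by_reason": dict(counts),
    }


def candidates_for_automation(report: dict, total_processed: int,
                              min_share: float = 0.05) -> list[str]:
    """Reasons that recur often enough to be worth automating next."""
    return sorted(
        reason for reason, n in report["by_reason"].items()
        if n / total_processed >= min_share
    )
```

Reasons that stay below the threshold remain in the human queue, which keeps rare edge cases from driving complexity into the module.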
Blind trust in metrics
- Combine system metrics with business outcomes (e.g., customer satisfaction, revenue impact).
A few short, practical examples
Customer support triage
- Old: big IVR and scripting overhaul.
- New: a triage module sorts tickets and escalates complex cases to humans, with metrics on misclassifications.
Invoice processing
- Old: full OCR replacement across vendors.
- New: a connector for high-volume vendors first, with human review for uncertain matches.
Sales lead enrichment
- Old: long data pipeline projects.
- New: a small enrichment service that augments leads and leaves a record unchanged when confidence is low.
Lessons from tech history worth keeping
- Macros and RPA taught us that automation must handle exceptions; those early lessons still apply.
- Microservices taught us the value of small, independently deployable components with clear interfaces.
- The common thread: build small, observe fast, and plan for exceptions.
When to stick with the old playbook
There are scenarios where a slower, more centralized rollout makes sense:
- Regulatory environments that require extensive validation and audit trails
- Monolithic systems where partial changes increase systemic risk
- Situations where the cost of organizational change management outweighs the technical benefits
In those cases, borrow modular tooling and observability from the new playbook without forcing rapid deployment.
Final trade-offs to accept
- Speed vs. control: faster cycles mean more frequent changes to manage; invest in automated testing and deployment.
- Modularity vs. coordination: many small modules need governance and predictable contracts.
- Experimentation vs. stability: start experiments in low-risk areas and scale what proves reliable.
Practical takeaway
If your team wants faster automation outcomes, move incrementally: implement modular workflows for narrow use cases, instrument and monitor them carefully, and maintain human checkpoints where risk is highest. That approach preserves the stability goals of the old playbook while giving you the adaptability of the new one.
