Incident Response Plan Template (for SMBs)

Security readiness for Canadian SMBs

If a phishing email turns into an account takeover, or a ransomware event locks down your file shares, your team should not be deciding roles, tools, and communications in real time. Use the practical template in this guide to build a clear, testable incident response plan (IRP) for your business, then keep it current as your Microsoft 365 and vendor stack changes.

Reading time: ~12 minutes. Focus: incident response planning, Microsoft 365, ransomware, BEC.

Important: This article is educational and intended to help you draft a practical plan. It is not legal advice. If you operate in a regulated environment, review your breach notification and evidence handling steps with your privacy and legal counsel, plus your cyber insurance provider.

What an incident response plan should do (and why most SMB plans fail)

An incident response plan is a written, pre-approved playbook for how your organization detects, responds to, and recovers from incidents. The key word is pre-approved. During an event, your team is under pressure, information is incomplete, and every action can affect downtime, data loss, and liability. A good plan removes uncertainty by defining who decides, who executes, and how communications happen across IT, leadership, vendors, and insurers.

SMB plans often fail for simple reasons. They are too generic, they are hidden in a folder nobody opens, or they treat incident response as an IT-only activity. Real incidents hit multiple functions: operations needs systems restored, finance needs to understand exposure, HR may handle employee impact, compliance needs documented decisions, and executives must approve time-sensitive choices. When the plan is vague, teams either freeze or improvise, and both outcomes increase cost.

The goal is not a perfect binder. The goal is a usable system: a one-page summary for fast execution, plus deeper runbooks for the incidents you are most likely to face (phishing, business email compromise, ransomware, lost devices, and SaaS account takeover). If you build it that way, you can run tabletop exercises, improve weak links, and shorten recovery time.

Common SMB reality

Too many tools, too many notifications, unclear priorities, and patching or identity controls that lag behind. An incident response plan turns chaos into a queue: severity, owners, decisions, and actions.

What success looks like

Within the first 30 minutes, you know who the incident commander is, what the severity is, which containment steps are underway, who leads communications, and whether insurance or legal needs to be engaged.

The incident response lifecycle you can actually operationalize

Most practical incident response programs follow a simple lifecycle: preparation, detection and analysis, containment, eradication, recovery, and post-incident improvement. You do not need a massive enterprise program to use this structure. You need a consistent rhythm so incidents do not become one-off fire drills.

Preparation is where SMBs win. If you pre-stage access, logging, backups, escalation paths, and decision authority, you cut the time spent searching for information when it matters most. Detection and analysis is where you confirm what is happening, prioritize the incident, and decide the first containment actions. Containment, eradication, and recovery are where you stop the bleeding, remove the cause, and bring services back safely. Post-incident activity is how you convert pain into improved controls, better runbooks, and fewer repeats.

Canadian reference: For a Canada-specific overview of what an incident response plan includes, see the Canadian Centre for Cyber Security guidance on developing an incident response plan (ITSAP.40.003) and the CyberSecure Canada fillable template example for organizing documentation.

Quick start: the 1-page incident response plan (copy this first)

Before you write long runbooks, build a one-page IRP summary. This is the page you print, store in a shared location, and keep offline. It tells anyone what to do in the first moments of a suspected incident, even if key people are unavailable.

[ONE-PAGE INCIDENT RESPONSE PLAN SUMMARY]

Organization:
Plan owner:
Version:
Last updated:
Approved by:

1) Declare an incident (trigger)
- Examples: suspected account takeover, ransomware note, data exfiltration alert, vendor breach notice, unusual outbound traffic, repeated MFA prompts, missing devices.
- Who can declare: ______________________
- How to declare: Teams channel / phone tree / incident ticket / SMS group

2) Incident commander (IC)
- Primary: ______________________  Backup: ______________________
- IC responsibilities: assign roles, set severity, approve containment actions, run status cadence, document decisions.

3) Severity level (pick one)
- SEV-1: safety/legal risk, widespread outage, ransomware, confirmed data exfiltration
- SEV-2: limited outage or compromise, high-risk identity event
- SEV-3: suspicious activity, contained malware, single device compromise
- SEV-4: low-risk alert, false positive, routine security task

4) First containment actions (do not wait)
- Disable affected accounts and revoke sessions (identity team)
- Isolate impacted endpoint(s) from network (endpoint team)
- Preserve evidence: keep logs, snapshots, copies of emails (forensics lead)
- Notify leadership and communications lead (comms lead)

5) Critical contacts (24/7)
- MSP / IT partner: ______________________
- Cyber insurance hotline: ______________________
- Legal / breach coach: ______________________
- PR / communications support: ______________________
- Cloud vendor support (Microsoft, etc): ______________________
- Law enforcement (if applicable): ______________________

6) Status cadence
- SEV-1: every 30 minutes
- SEV-2: every 60 minutes
- SEV-3: twice daily
- SEV-4: weekly review

7) Recovery principle
- Restore clean first, then reconnect.
- Do not re-enable accounts or systems until containment and root cause are verified.

8) Post-incident review
- Hold within 5 business days.
- Assign remediation owners and due dates.
Fast win: If you only do one thing this week, complete the contact list, designate an incident commander and backup, and write down your first containment actions. That alone removes two of the most common delays: finding the right phone number and working out who can approve containment.
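
If you track incidents in a ticketing tool or a script, the status cadence from section 6 of the summary can be encoded so that nobody has to remember it under pressure. The following is a minimal sketch, not a finished tool. The function names are illustrative, and "twice daily" is assumed here to mean every 12 hours.

```python
from datetime import datetime, timedelta

# Status-update intervals copied from section 6 of the one-page summary.
STATUS_CADENCE = {
    "SEV-1": timedelta(minutes=30),
    "SEV-2": timedelta(minutes=60),
    "SEV-3": timedelta(hours=12),  # "twice daily", assumed to mean every 12 hours
    "SEV-4": timedelta(weeks=1),   # "weekly review"
}


def next_status_update(severity: str, last_update: datetime) -> datetime:
    """Return when the next status update is due for this severity."""
    if severity not in STATUS_CADENCE:
        raise ValueError(f"Unknown severity: {severity!r}")
    return last_update + STATUS_CADENCE[severity]


def is_update_overdue(severity: str, last_update: datetime, now: datetime) -> bool:
    """True when the cadence for this severity has already been missed."""
    return now > next_status_update(severity, last_update)
```

For example, `next_status_update("SEV-1", last_update)` returns a time 30 minutes after `last_update`, which the incident commander can use to set the next check-in.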

Define roles, responsibilities, and decision rights (RACI for SMBs)

The most common incident response failure is unclear ownership. Someone sees a suspicious alert, but no one knows who can approve containment actions like disabling accounts, isolating machines, or pausing email forwarding rules. Your plan should define a small incident response team with explicit decision rights. In SMBs, one person often holds multiple roles. That is fine, as long as you name backups.

Start with these core roles: incident commander, technical lead, identity lead (Microsoft 365 and Entra ID), communications lead, and executive sponsor. Add optional roles as needed: privacy or compliance lead, HR lead, vendor liaison, and finance lead (especially if cyber insurance is involved).

Legend: R = Responsible, A = Accountable, C = Consulted, I = Informed

Activity | Incident Commander | Technical Lead | Identity Lead (M365) | Comms Lead | Executive Sponsor
Declare incident and assign severity | R/A | C | C | C | I
Containment actions (isolation, access restrictions) | A | R | R | I | C
Engage cyber insurance, legal, external IR | R | C | C | C | A
Internal status updates and external messaging | C | C | C | R/A | A (final approval)
Evidence preservation and documentation | A | R | R | I | I
Post-incident review and remediation tracking | R/A | R | R | C | C
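
A RACI matrix is easy to break when you adapt it, for example by leaving an activity with nobody accountable. The sketch below transcribes the matrix above into a dictionary and checks each activity for at least one Responsible role and exactly one Accountable role. The shortened role keys (`IC`, `Tech`, `Identity`, `Comms`, `Exec`) and the function name are illustrative assumptions, and the "exactly one Accountable" rule is a common RACI convention rather than a requirement stated in this guide.

```python
# RACI matrix transcribed from the table above; role keys are shortened labels.
RACI = {
    "Declare incident and assign severity":
        {"IC": "R/A", "Tech": "C", "Identity": "C", "Comms": "C", "Exec": "I"},
    "Containment actions (isolation, access restrictions)":
        {"IC": "A", "Tech": "R", "Identity": "R", "Comms": "I", "Exec": "C"},
    "Engage cyber insurance, legal, external IR":
        {"IC": "R", "Tech": "C", "Identity": "C", "Comms": "C", "Exec": "A"},
    "Internal status updates and external messaging":
        {"IC": "C", "Tech": "C", "Identity": "C", "Comms": "R/A", "Exec": "A"},
    "Evidence preservation and documentation":
        {"IC": "A", "Tech": "R", "Identity": "R", "Comms": "I", "Exec": "I"},
    "Post-incident review and remediation tracking":
        {"IC": "R/A", "Tech": "R", "Identity": "R", "Comms": "C", "Exec": "C"},
}


def raci_gaps(matrix: dict) -> dict:
    """Return {activity: [issues]} for rows with no R, no A, or several A's."""
    problems = {}
    for activity, roles in matrix.items():
        responsible = [r for r, code in roles.items() if "R" in code.split("/")]
        accountable = [r for r, code in roles.items() if "A" in code.split("/")]
        issues = []
        if not responsible:
            issues.append("no Responsible role")
        if not accountable:
            issues.append("no Accountable role")
        if len(accountable) > 1:
            issues.append("multiple Accountable roles: " + ", ".join(accountable))
        if issues:
            problems[activity] = issues
    return problems
```

Run against the matrix as written, the check flags only the messaging row, where both the comms lead and the executive sponsor carry an A. That split may be intentional (the sponsor holds final approval), but it is worth stating explicitly in your own plan who has the last word on external statements.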

Create a severity matrix with response targets you can meet

Severity is how you turn a flood of alerts into a manageable queue. The goal is not to over-classify everything as urgent. The goal is to align severity to business impact and risk, then attach response targets that fit your staffing reality. If you have a 24/7 SOC or MDR service, you can hold faster response targets. If you do not, your plan should specify what happens after hours and who is on-call for true emergencies.

Below is a practical severity matrix you can adapt. Use examples that match your environment: Microsoft 365 account takeover attempts, suspicious mailbox rule creation, unusual outbound data transfer, and ransomware indicators on endpoints or servers.

Severity | Business impact | Examples | Target: acknowledge | Target: initial containment
SEV-1 | Major outage, confirmed ransomware, confirmed data exfiltration, safety or legal risk | Ransom note, widespread encryption, privileged account takeover, active lateral movement | 15 minutes | 60 minutes
SEV-2 | Material impact, limited compromise, high-risk identity event | Single server down, CFO mailbox compromised, suspicious OAuth app consent, repeated MFA fatigue attack | 60 minutes | 4 hours
SEV-3 | Contained threat, localized device issue, suspicious activity needing investigation | Malware on one endpoint, blocked phishing, abnormal sign-in pattern with no evidence of compromise | Same business day | 1 business day
SEV-4 | Low risk, noise, routine security task | False positive alert, benign scan, policy tuning request | Within 5 business days | N/A
Pitfall to avoid: If everything is SEV-1, nothing is. Over-escalation leads to fatigue, missed real incidents, and slower response. Calibrate severity monthly based on actual events and recovery effort.
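
To calibrate the matrix against real events, it helps to record whether each incident actually met its targets. The sketch below covers only the clock-based targets for SEV-1 and SEV-2. The business-day targets for SEV-3 and SEV-4 depend on your working calendar and are deliberately left out. Function and key names are illustrative assumptions.

```python
from datetime import datetime, timedelta

# Clock-based targets from the severity matrix (SEV-1 and SEV-2 only).
RESPONSE_TARGETS = {
    "SEV-1": {"acknowledge": timedelta(minutes=15), "contain": timedelta(minutes=60)},
    "SEV-2": {"acknowledge": timedelta(minutes=60), "contain": timedelta(hours=4)},
}


def target_deadlines(severity: str, detected_at: datetime) -> dict:
    """Return the acknowledge and containment deadlines for one incident."""
    targets = RESPONSE_TARGETS[severity]
    return {step: detected_at + delta for step, delta in targets.items()}


def missed_targets(severity: str, detected_at: datetime,
                   acknowledged_at: datetime, contained_at: datetime) -> list:
    """List which targets ('acknowledge', 'contain') were missed."""
    deadlines = target_deadlines(severity, detected_at)
    actual = {"acknowledge": acknowledged_at, "contain": contained_at}
    return [step for step, deadline in deadlines.items() if actual[step] > deadline]
```

If the monthly review shows the same target being missed repeatedly, that is a signal either to adjust the target to your staffing reality or to add after-hours coverage.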

Build your contact list and escalation tree (include insurers and vendors)

During an incident, people waste time hunting for phone numbers, support portals, and contract details. Your plan should include a single contact list with after-hours information. Keep a copy offline in case your email or file systems are impacted. If you have cyber insurance, confirm your policy requirements in advance. Many policies require that you contact their hotline quickly and use approved vendors, and delays can complicate claims.

Role / Vendor | Primary contact | Backup | 24/7 method | Notes
Incident Commander | Name, phone | Name, phone | Mobile, SMS | Approves containment and escalation
MSP / IT partner | Team, phone, email | Escalation manager | Support hotline | Include contract ID and portal link
Cyber insurance | Hotline, policy # | Broker contact | Hotline | Confirm breach coach requirements
Legal / privacy counsel | Name, phone | Name, phone | Mobile | Notification and privilege guidance
Microsoft support | Tenant admin contact | Backup admin | Support portal | Include tenant ID and admin roles
Backup provider | Name, phone | Name, phone | Emergency line | Document restore order and credentials
PR / communications | Name, phone | Name, phone | Mobile | External messaging approval workflow

Communications: scripts, approvals, and what not to say

Communications is a control, not a nice-to-have. Poor messaging can increase liability, confuse employees, and compromise an investigation. Your plan should specify who communicates internally, who communicates externally, and who approves each message type. It should also include pre-written scripts for common scenarios so you are not writing from scratch at 2 a.m.

For SMBs, you typically need four communication streams: employees, customers or partners, vendors, and leadership. In regulated industries, you may also need to notify a privacy office, regulator, or professional body depending on the nature of the incident. Your legal and privacy counsel should review this part of the plan so you do not accidentally admit fault, confirm details that are not verified, or share information that assists an attacker.

Internal employee script (short)

“We are investigating a security event affecting some systems. Please do not click unexpected links, do not approve MFA prompts you did not initiate, and report suspicious emails immediately. We will provide updates at [cadence].”

Customer or partner holding statement

“We identified an event and are working with specialists to contain and remediate it. We will share verified information as it becomes available and will notify affected parties as required.”

Communications rule: Confirm facts before communicating details. Keep a decision log and timestamp every statement, including who approved it and what evidence supported it.

Microsoft 365 focused runbooks (identity and email come first)

For Microsoft 365 centric SMBs, many incidents start with identity or email: phishing, MFA fatigue, OAuth consent abuse, mailbox rule manipulation, or credential reuse. That is why your incident response plan should include a Microsoft 365 runbook that emphasizes rapid containment, evidence preservation, and least-privilege admin access.

The goal is not to memorize every portal. The goal is to define a repeatable set of actions: identify impacted accounts, stop attacker sessions, block persistence mechanisms, and document what you did. If you use Microsoft Defender, Sentinel, or an MDR service, align your plan to how alerts are triaged and how escalations happen after hours. If you do not have 24/7 monitoring, your plan should specify triggers that wake someone up.

Microsoft 365 account takeover: first 30 minutes checklist

  • Confirm the signal: unusual sign-in location, impossible travel, multiple MFA prompts, mailbox rule creation, unusual forwarding, or risky OAuth consent.
  • Contain identity: disable sign-in for impacted user(s) or reset credentials, revoke sessions, and require re-authentication.
  • Block persistence: remove suspicious mailbox rules, disable external forwarding where possible, and review newly added MFA methods.
  • Preserve evidence: export relevant sign-in logs, audit events, message traces, and timestamps of actions taken.
  • Scope blast radius: check for admin role changes, consented apps, other users receiving MFA prompts, and unusual inbox activity.
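
If you script parts of this checklist, the containment and review steps map onto a handful of Microsoft Graph v1.0 requests. The sketch below only builds the request descriptions and sends nothing. Authentication, required permissions, and error handling are out of scope. It assumes `user_id` is the user's Entra ID object ID, and the function name and dictionary layout are illustrative.

```python
GRAPH = "https://graph.microsoft.com/v1.0"


def containment_plan(user_id: str) -> list[dict]:
    """Build (but do not send) Graph requests for first-pass identity containment."""
    user_url = f"{GRAPH}/users/{user_id}"
    return [
        # Contain identity: block sign-in for the impacted account.
        {"step": "disable sign-in", "method": "PATCH", "url": user_url,
         "json": {"accountEnabled": False}},
        # Contain identity: invalidate refresh tokens and session cookies.
        {"step": "revoke sessions", "method": "POST",
         "url": f"{user_url}/revokeSignInSessions", "json": None},
        # Block persistence: list inbox rules so suspicious ones can be reviewed.
        {"step": "list inbox rules", "method": "GET",
         "url": f"{user_url}/mailFolders/inbox/messageRules", "json": None},
        # Block persistence: list registered authentication methods for review.
        {"step": "list auth methods", "method": "GET",
         "url": f"{user_url}/authentication/methods", "json": None},
    ]
```

Keeping the plan as data makes it easy to print for the decision log before anything is executed, and to hand the same list to whichever HTTP client or admin tool your team already uses.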
Incident response is part of risk management. The plan is only valuable if it is exercised, measured, and continuously improved.

Technical playbooks for the incidents SMBs face most often

Your plan should include short playbooks for your top incident types. Each playbook should answer four questions: how do we detect it, how do we contain it, how do we eradicate it, and how do we recover? Keep the playbooks action-oriented, with decision points, not long theory. Below are three high-impact playbooks you can copy and tailor.

Ransomware

Prioritize isolation, evidence preservation, and restore validation. Do not reconnect systems until you confirm containment and root cause.

Business Email Compromise

Stop sessions, remove persistence (rules, forwarding, OAuth apps), and notify finance quickly if payments or invoices are involved.

Lost or stolen device

Remote wipe, revoke tokens, confirm disk encryption, and check for credential reuse. Treat executive devices as higher risk.

Playbook A: ransomware or suspected encryption activity

  • Contain: isolate impacted endpoints and servers from the network. Disable compromised accounts and remove admin privileges if needed.
  • Preserve: capture a snapshot of impacted systems where possible, export logs, and store copies offline. Do not wipe before collecting evidence.
  • Scope: identify patient zero, shared drives affected, backup exposure, and whether data exfiltration indicators exist.
  • Decide: engage cyber insurance and legal early. Define decision authority for downtime tradeoffs and restoration sequencing.
  • Recover: restore from known-good backups, validate integrity, patch root cause, rotate credentials, and monitor for reinfection.

Playbook B: business email compromise (BEC) and invoice fraud risk

  • Contain: disable sign-in or reset credentials, revoke sessions, and remove suspicious MFA methods.
  • Remove persistence: delete suspicious inbox rules, forwarding settings, and unknown OAuth app grants.
  • Search for impact: identify recipients of malicious emails, check for payment instruction changes, and lock down finance workflows.
  • Communicate: notify internal teams to validate payment changes out-of-band. If fraud occurred, contact your bank immediately.
  • Harden: enforce MFA, conditional access, and disable external forwarding if possible. Train targeted departments (finance, executive assistants).
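
When reviewing inbox rules during a BEC investigation, two patterns deserve attention first: rules that send mail to external addresses and rules that delete matching mail. The sketch below flags both in rule data shaped like Microsoft Graph `messageRule` objects (`displayName`, `actions.forwardTo`, `actions.redirectTo`, `actions.delete`, and so on). The function names and the `internal_domains` parameter are illustrative assumptions, and a flagged rule still needs a human to judge whether it is malicious.

```python
def external_forward_targets(rule: dict, internal_domains: set[str]) -> list[str]:
    """Return addresses outside internal_domains that a rule forwards or redirects to."""
    actions = rule.get("actions") or {}
    targets = []
    for key in ("forwardTo", "forwardAsAttachmentTo", "redirectTo"):
        for recipient in actions.get(key) or []:
            address = recipient.get("emailAddress", {}).get("address", "")
            domain = address.rsplit("@", 1)[-1].lower()
            if address and domain not in internal_domains:
                targets.append(address)
    return targets


def suspicious_rules(rules: list[dict], internal_domains: set[str]) -> list[tuple]:
    """Return (rule name, reasons) for rules that forward externally or delete mail."""
    flagged = []
    for rule in rules:
        actions = rule.get("actions") or {}
        reasons = []
        external = external_forward_targets(rule, internal_domains)
        if external:
            reasons.append("forwards externally: " + ", ".join(external))
        if actions.get("delete") or actions.get("permanentDelete"):
            reasons.append("deletes matching mail")
        if reasons:
            flagged.append((rule.get("displayName", "(unnamed)"), reasons))
    return flagged
```

Pass `internal_domains` in lowercase, for example `{"yourcompany.example"}`, and record the flagged rules in the evidence log before deleting them.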

Playbook C: lost or stolen laptop, phone, or tablet

  • Contain: revoke sessions and tokens, disable the account if compromise is suspected, and trigger remote lock or wipe.
  • Validate protections: confirm full disk encryption, device compliance status, and whether local data was accessible offline.
  • Scope: review sign-in logs for unusual access after the loss event.
  • Recover: issue replacement device, re-enroll, and rotate credentials for sensitive accounts.

Evidence handling and documentation (make it easy to prove what happened)

Evidence preservation is not only for large enterprises. For SMBs, good documentation protects you during insurance claims, audits, and customer inquiries, and it helps you avoid repeating the incident. Your plan should define a simple chain-of-custody process: what evidence you collect, who collects it, where it is stored, and who can access it.

Keep it practical: log exports, screenshots, message traces, and a timeline of actions taken. Do not store evidence only in systems that could be compromised during the incident. Maintain an offline or segregated storage location for incident artifacts. Assign one person as documentation lead if possible, or ensure the incident commander maintains a decision log.

[CHAIN OF CUSTODY / EVIDENCE LOG]

Incident ID:
Date opened:
Collected by:
Storage location (secure):

Item # | Evidence type | Source system | Time range | Hash/identifier | Access restrictions | Notes
1      | Sign-in logs  | Entra ID      |            |                 |                     |
2      | Audit logs    | Microsoft 365 |            |                 |                     |
3      | Email trace   | Exchange      |            |                 |                     |
4      | Disk image    | Endpoint      |            |                 |                     |
5      | Screenshots   | User report   |            |                 |                     |

Decision log:
Timestamp | Decision | Approved by | Evidence used | Follow-up action | Owner
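
The "Hash/identifier" column is easiest to fill consistently if the hash is computed at collection time. The following sketch hashes an exported evidence file with SHA-256 and appends a row to a CSV log. The column names, function names, and the choice of CSV are illustrative assumptions, not a prescribed format. Store the log in the same offline or segregated location as the evidence.

```python
import csv
import hashlib
from datetime import datetime, timezone
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hash a file in chunks so large disk images do not need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def log_evidence(log_path: Path, evidence_path: Path, evidence_type: str,
                 source_system: str, collected_by: str) -> dict:
    """Append one chain-of-custody row to a CSV log and return the row."""
    row = {
        "collected_at_utc": datetime.now(timezone.utc).isoformat(timespec="seconds"),
        "evidence_type": evidence_type,
        "source_system": source_system,
        "file": evidence_path.name,
        "sha256": sha256_of(evidence_path),
        "collected_by": collected_by,
    }
    is_new = not log_path.exists()
    with log_path.open("a", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=list(row))
        if is_new:
            writer.writeheader()  # write the header only when creating the log
        writer.writerow(row)
    return row
```

Re-running `sha256_of` on the stored file later and comparing it with the logged value is a simple way to show the evidence has not changed since collection.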

Recovery and restoration: how to avoid reinfection and repeat outages

Recovery is where many SMBs unintentionally prolong incidents. The urge to “get back online” can cause teams to reconnect systems before the root cause is removed. That is how reinfection happens, and why some incidents come in waves. Your incident response plan should define a recovery principle: restore clean first, validate, then reconnect.

Tie your incident response plan to your backup and disaster recovery approach. If you are not regularly testing restores, your backup strategy is a hope, not a control. Define which systems get restored first and why: identity services, critical business applications, file stores, and then user endpoints. Align recovery targets (RTO and RPO) to business needs, not guesses. If your current MSP or internal process cannot meet those targets, that is a signal to redesign.

Recovery checklist:
  • Validate backups are not accessible with the same credentials used during normal operations.
  • Test at least one restore path quarterly for a critical system and a random endpoint.
  • Document restore order and who can approve downtime tradeoffs.
  • After restore, increase monitoring and tighten identity controls for at least 14 days.
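
Restore tests are more useful when their results are compared against the recovery targets you agreed with the business. The sketch below does that arithmetic: the gap between the last good backup and the incident is checked against the RPO, and the measured restore duration is checked against the RTO. The function name and parameters are illustrative assumptions.

```python
from datetime import datetime, timedelta


def recovery_gaps(incident_at: datetime, last_good_backup_at: datetime,
                  restore_duration: timedelta, rpo: timedelta, rto: timedelta) -> list[str]:
    """Return a list of missed recovery targets (empty when both are met)."""
    gaps = []
    # Data created after the last good backup is lost on restore.
    data_loss_window = incident_at - last_good_backup_at
    if data_loss_window > rpo:
        gaps.append(f"RPO missed: {data_loss_window} of data at risk, target {rpo}")
    # Time the business waits for the system to come back.
    if restore_duration > rto:
        gaps.append(f"RTO missed: restore took {restore_duration}, target {rto}")
    return gaps
```

For example, with a 24-hour RPO, a 4-hour RTO, a backup taken 32 hours before the incident, and a 6-hour restore, both targets are reported as missed. That is the kind of finding that should feed the post-incident review.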

Testing and maintenance: the plan that is not exercised is not a plan

A plan that has never been exercised is likely to fail at the exact moment you need it. Testing does not need to be expensive. Start with a quarterly tabletop exercise: walk through a realistic scenario with leadership, IT, and operations. Focus on decision points, not technical detail. Ask: who declares the incident, who approves containment, how do we contact insurance, and what do we tell employees?

Then schedule an annual technical exercise that validates backups, account lockdown procedures, and evidence collection. Every major change is also a trigger: new Microsoft 365 security policies, a new backup vendor, a new line-of-business application, or a switch of MSP. Treat your IR plan like a living system that is versioned, reviewed, and approved.

Maintenance schedule you can copy

  • Monthly: review top security incidents and tune severity matrix and alert rules.
  • Quarterly: tabletop exercise, update contact list, validate incident commander backups.
  • Semi-annual at minimum (quarterly if you can staff it): restore test for a critical system, plus a random user endpoint.
  • Annual: full plan review and executive approval, vendor list validation, and training refresh for finance and leadership.

FAQ: incident response planning for SMBs

Do SMBs really need a formal incident response plan?

Yes, because speed and clarity matter. SMBs have fewer people and less redundancy, so confusion costs more. A short, realistic plan gives you roles, first actions, and a communications path that reduces downtime and protects customer trust.

What is the difference between an incident response plan and a disaster recovery plan?

Incident response focuses on detection, containment, and remediation of security events. Disaster recovery focuses on restoring business services after an outage. They should be connected, but they answer different questions and use different playbooks.

How often should we test our incident response plan?

A good baseline is quarterly tabletops and an annual technical exercise, plus updates after major environment changes. If your business is in a high-risk or regulated sector, increase frequency.

What are the most common first steps for Microsoft 365 incidents?

For identity-driven incidents, rapid containment usually means disabling sign-in or resetting credentials, revoking sessions, removing persistence mechanisms (rules, forwarding, app consents), and exporting audit evidence for documentation.

Should we contact cyber insurance before doing technical work?

In severe incidents, contact your insurer early so you understand policy requirements and approved vendors. However, do not delay immediate containment actions that prevent ongoing harm. Your plan should define which actions are safe to execute immediately and which require escalation.

When should we involve a managed detection and response (MDR) provider?

If you cannot reliably triage alerts after hours, or if alert fatigue is causing missed incidents, MDR can provide 24/7 coverage and clear escalation. It is especially valuable for Microsoft 365 and endpoint signals where speed matters.

About the author

Written by the MSP Corp Security Team. We help Canadian SMBs and mid-market organizations reduce alert fatigue, tighten Microsoft 365 identity controls, and build security-first operating models that support productivity and resilience.