64 Key Risk Indicators Examples with Definitions - KRIs for Technology Risk Management

A Comprehensive List of Key Risk Indicators with Definitions for Information Technology and Information Security

Technology risk in modern-day business can be seen in news headlines on a daily basis. A major data breach can drive a large corporation's stock price down sharply, sometimes within days. Recent big-headline data breaches of customer data include: Target in 2013, Equifax in 2017, and now Facebook in 2018. Implementing and closely tracking the right IT and IS key risk indicators can help reduce risk for your company.

What are Key Risk Indicators, or KRIs?

KRIs, or key risk indicators, are measurements used by an organization to manage current and potential exposure to various risks. KRIs can measure the potential risk related to a specific action that the organization is considering—as well as the risk inherent in the company’s day-to-day operations. KRIs can act as an early-warning system that alerts the company to financial issues (lost revenue), operational issues (lost productivity), or reputational issues (lost credibility).

KRIs are used to calculate the risk, usually measured in percentages, of potentially unfavorable events that can negatively affect a process, an activity, or an entire company. These measurements inform management of a company’s technology and business risk profile and can be used to help investigate and improve operations where attention is needed.

Just like key performance indicators, these metrics may vary based on the departments or processes being examined, or the target audience being considered (e.g., line manager vs. senior executive).
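Because most KRIs are expressed as percentages measured against a threshold, the early-warning idea above can be sketched in a few lines of code. This is a minimal illustration, not a prescribed implementation; the class name, field names, and sample figures are all hypothetical.

```python
from dataclasses import dataclass

@dataclass
class KeyRiskIndicator:
    name: str                     # e.g. "System Availability - All Systems"
    value: float                  # current measurement, as a percentage
    threshold: float              # level at which the KRI should trigger a review
    higher_is_worse: bool = True  # direction of risk for this metric

    def breached(self) -> bool:
        """Return True when the KRI has crossed its early-warning threshold."""
        if self.higher_is_worse:
            return self.value >= self.threshold
        return self.value <= self.threshold

# Hypothetical example: availability is a "lower is worse" KRI
availability = KeyRiskIndicator("System Availability", value=97.2,
                                threshold=99.5, higher_is_worse=False)
print(availability.breached())  # True - below the 99.5% target
```

In practice, each indicator in the list below would be computed from operational data on a schedule, and breached KRIs would be escalated to the audience (line manager vs. senior executive) appropriate for that metric.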

What are Key Risk Indicator examples and how can they be used?

Key risk indicator examples are previously used or researched illustrative measurements of risk that can be implemented and tracked to lower risk in a company or business process. These KRI examples should be used as a starting point to determine what gaps exist in your current risk measurement activities. Do you need help benchmarking your key risk indicators or implementing them? Contact us.

64 Key Risk Indicator (KRI) Examples with Definitions for Information Technology Risk Management

  1. Mean Time Between Failure (MTBF) – All Systems – The average amount of time (measured in days) elapsed between system failures, measured from the moment the system initially fails, until the time that the next failure occurs (including the time required to perform any repairs after the initial failure).
  2. Percent Difference in MTBF (Monthly) – The difference in Mean Time Between Failure (MTBF) from month-to-month for the group of systems being examined, measured as a percentage.
  3. Mean Time to Repair (MTTR) – All Systems – The average amount of time (measured in hours) required to repair a system or application to full functionality following a failure (i.e., a service interruption), measured from the time that the failure occurs until when the repair is completed and rolled out to all required locations (servers, devices, workstations, etc.).
  4. Percent Difference in MTTR (Monthly) – The difference in Mean Time to Repair (MTTR) from month-to-month for the group of systems being examined, measured as a percentage.
  5. System Availability – All Systems – The amount of time (measured in minutes) that ALL systems are online and available for use by all authorized users divided by the total amount of time those systems are scheduled to be available for use over the same period of time, as a percentage.
  6. System Availability During Trading Hours – All Systems – The amount of time (measured in minutes) that ALL systems are online and available for use during trading hours (10am-3pm, Sunday-Thursday) by all authorized users divided by the total amount of time those systems are scheduled to be available for use over the same period of time, as a percentage.
  7. Number of Instances Where Systems Exceeded Capacity Requirements – The total number of instances (i.e., a specific point in time) where systems exceeded the pre-defined capacity threshold, measured in transactions or requests per second, within the measurement period.
  8. Percentage of System/Application Downtime Caused by Inadequate Server Capacity – The amount of system downtime, or service interruption time, that was caused specifically by insufficient capacity (i.e., requests/transaction load directly caused failure) as a percentage of total unplanned downtime within the measurement period.
  9. Percentage of Downtime Due to Scheduled Activities – All Systems – The total amount of downtime, measured in minutes, that has been set aside and used by the IT function for planned system maintenance activities (as opposed to unplanned downtime) as a percentage of total downtime (planned and unplanned) during the measurement period.
  10. Percentage of Scheduled Maintenance Activities Missed – The number of scheduled maintenance activities related to company devices (workstations, network equipment, servers) that did not take place on or before their scheduled date as a percentage of all maintenance activities scheduled to occur over the same period of time.
  11. IT Service Desk – Mean Service Request Resolution Time (All Levels) – The average amount of time (measured in minutes) required for the IT support team to resolve, or close, an IT support request, measured from the time that the ticket or request is submitted by an employee until the issue has been resolved and formally closed.
  12. IT Service Desk – Total Number of Requests Opened (All Levels) – The total number of service requests, or tickets, received by the IT service desk team over a certain period of time. A service request is considered opened immediately upon reception (regardless of whether or not the request is acknowledged).
  13. IT Service Desk – Percentage of Requests Not Resolved within SLA (All Levels) – The number of IT service requests that are not resolved within the timeframe defined by the company’s SLA as a percentage of total issues resolved over the same period of time.
  14. Network Availability – The amount of time (measured in minutes) that the company’s network is available for use by all authorized users divided by the total amount of time the network is scheduled to be available for use over the same period of time, as a percentage.
  15. Mean Network Bandwidth Utilization Rate – Overall (30 Minute Intervals) – The average utilization rate (i.e., percentage of total available network bandwidth capacity being used), measured as a ratio of current network traffic to the total amount of traffic that the network, or port, being examined can handle.
  16. Number of Instances Where Network Bandwidth Utilization Exceeded Threshold – The total number of instances during the measurement period where network bandwidth utilization exceeded a defined threshold (identified through network testing and monitoring) at which the network begins to exhibit request delays, low transmission speeds, etc.
  17. Mean Network Hardware Utilization Rate – Overall (30 Minute Intervals) – The average utilization rate (i.e., percentage of total available network hardware capacity being used), measured as a ratio of the current load on the network hardware being examined to the total load that hardware can handle.
  18. Number of Instances Where Network Hardware Utilization Exceeded Threshold – The total number of instances during the measurement period where network hardware utilization exceeded a defined threshold (identified through network testing and monitoring) at which the network begins to exhibit request delays, low transmission speeds, etc.
  19. Percentage of Applications Running without a Current Service Level Agreement – The number of applications currently running on company workstations or devices that are NOT governed by an explicit, documented service level agreement (SLA), which states the parameters and standards of service to be delivered by the application, as a percentage of all applications currently running.
  20. IT Service Provider SLA Adherence – The number of IT vendor service level agreements where the vendor has met or exceeded the targets outlined in the corresponding Service Level Agreement (SLA) over the last 3 months as a percentage of all vendor, or service provider, SLAs that govern activities and performance levels.
  21. Internal IT Team SLA Adherence – The number of internal service level agreements where the IT team has met or exceeded the targets outlined in the corresponding Service Level Agreement (SLA) over the last 3 months as a percentage of all internal SLAs that govern IT team activities and performance levels.
  22. Percentage of Systems Running without Current Maintenance Contract – All Systems – The number of actively used systems or applications that do not have a current maintenance contract in place as a percentage of total systems/applications managed at the same point in time.
  23. Number of Disputes with IT Vendors – The total number of formal disputes that took place between the company and IT-related vendors over the last 3 months. Vendor disputes may arise due to poor vendor performance, payment issues and/or project scope misalignment (i.e., scope “creep”), among other things.
  24. Percentage of Systems Undergoing New Releases – All Systems – The total number of applications or systems where a new release was completed or attempted by the IT function during the measurement period as a percentage of total systems managed.
  25. Percentage of Unsuccessful Releases – The number of releases rolled out by the IT function to company devices or workstations that must be rolled back (i.e., affected systems are restored to pre-release state through version control, or similar) due to issues that occurred following the release as a percentage of total releases attempted (i.e., successful and failed) over the same period of time.
  26. Percentage of Systems Undergoing Changes – All Systems – The total number of applications or systems where a new change was completed or attempted by the IT function during the measurement period as a percentage of total systems managed.
  27. Percentage of Unsuccessful Changes – All Levels of Impact – The number of changes rolled out by the IT function to company devices or workstations that must be rolled back (i.e., affected systems are restored to pre-change state through version control, or similar) due to issues that occurred following the implementation of the change, as a percentage of total changes attempted over the same period of time.
  28. Percentage of Changes Considered Emergency Changes – The number of changes, or patches, to systems, devices and applications that are considered to be an emergency as a percentage of changes made over the same period of time. An emergency change is a previously unplanned change to systems or applications that must be implemented immediately, or as soon as possible, to avoid a serious security risk, productivity loss, and/or service interruption.
  29. Percentage of System Releases Not Mirrored on Backup Systems Within 24 Hours Following Launch – All Systems – The number of releases that were successfully launched to the live environment that were not mirrored on backup systems within 24 hours following the successful launch as a percentage of total releases successfully performed during the measurement period.
  30. Percentage of System Changes Not Mirrored on Backup Systems Within 24 Hours Following Launch – All Systems – The number of system changes that were successfully launched to the live environment that were not mirrored on backup systems within 24 hours following the successful launch as a percentage of total changes successfully performed during the measurement period.
  31. Percentage of Critical Systems without Up-to-Date Patches – The total number of critical systems (all deployed instances of the system or application running on each device/workstation) that do not currently have up-to-date patches installed and running as a percentage of total critical system end user devices/workstations. This metric may also be known as “Patch Coverage Rate.”
  32. Percentage of Critical System Backups that are Not Fully Automated – The number of critical systems without an automated (i.e., no manual work required) backup currently configured and running accurately as a percentage of total critical system backups (automated and manual).
  33. Total Number of Critical System Backup Failures – The total number of critical system backup processes that failed (i.e., did not run, were not captured in-full, were captured with errors, etc.) to complete or run properly during the measurement period.
  34. Percentage of Applications Requiring Functionality Upgrade Within the Last 90 Days – The total number of applications used by the company that required an upgrade related to user experience/usability within the last 90 calendar days as a percentage of total applications in use.
  35. Number of IT Projects Canceled After Kick-off Within Last 6 Months – The number of IT projects that were cancelled at some point following the initial project startup due to lack of alignment with corporate strategy or planning over the last 6 months.
  36. Percentage of IT Projects Delayed – The number of IT projects that are NOT completed on or before their initially planned completion date (i.e., delayed projects) as a percentage of total IT projects completed over the same period of time.
  37. Percentage of IT Projects That Exceeded Budget – The number of IT projects that exceeded the initially developed budget parameters as a percentage of total IT projects completed over the same period of time.
  38. Percentage of IT Projects Reworked Due to Misaligned Requirements Within the Last 90 Days – The number of IT projects that, within the last 90 days, required re-scoping or re-prioritization due to business requirements that were not clearly defined, or were not sufficiently reviewed by key stakeholders prior to project launch as a percentage of total IT projects running.
  39. IT Budget Variance (Actual vs. Budgeted) – The difference in planned (i.e., budgeted) versus actual IT expense for the entire IT department, or function, during the measurement period, measured as a percentage.
  40. Percent Change in Number of Website Visits – Month over Month (MoM) – The percent difference in the total number of users that visited the website through all channels (organic search, paid search, direct, referral, etc.) from month-to-month.
  41. Bounce Rate – The number of users that view only one web page when visiting the site before exiting (i.e., bouncing) as a percentage of total website visits over the same period of time. A high Bounce Rate can indicate that the website is not sufficiently designed to lead users to other locations around the website.
  42. Average Page Load Time – The average amount of time (in seconds) required for the user’s browser to fully load a web page within the company’s website, from the time the click occurs until the web browser has loaded the page in full.
  43. Average Page Views per Visit – The average number of individual web pages viewed by a website visitor during the course of a single visit, or session, during the measurement period.
  44. Average Time on Site – The average amount of time a website visitor spends on the website, from the time that the user lands on a page until they exit the website, during the course of a single visit, or session, during the measurement period.
  45. Percentage of IT Assets (Devices) Impacted by End-of-Life or End-of-Support – The number of devices managed by the IT Department that are slated to be impacted by upcoming end-of-life (EoL) or end-of-support (EoS) dates as a percentage of total devices managed at the same point in time.
  46. Total Number of IT Assets Currently Not in Use – The total number of IT assets owned by the organization that are currently (i.e., at the point of measurement) not used in any capacity by the organization.
  47. Percentage of Systems in Use that are No Longer Supported – The number of systems currently in use by the company that are no longer supported by the original developer as a percentage of total systems used by the organization at the same point in time. These non-supported systems may also be considered “legacy” systems.
  48. Percentage of Network Devices Not Meeting Configuration Standards – The total number of network devices (modems, routers, switches, etc.) that were found not to be in compliance with the company’s pre-defined configuration standards as a percentage of total network devices under management at the same point in time.
  49. Percentage of Workstations that have Not Received a Full Malware Scan Within Last 24 Hours – The number of workstations that have not undergone a full, successful virus scan within the last 24 hours as a percentage of total active workstations managed by the organization.
  50. Percentage of Servers that have Not Received a Full Malware Scan Within Last 24 Hours – The number of servers that have not undergone a full, successful virus scan within the last 24 hours as a percentage of total active servers managed by the organization.
  51. Percentage of Mobile Devices that have Not Received a Full Malware Scan Within Last 24 Hours – The number of mobile devices that have not undergone a full, successful virus scan within the last 24 hours as a percentage of total active mobile devices managed by the organization.
  52. Percentage of Devices Not Running Updated Anti-Malware Controls – The number of devices (workstations, servers, mobile devices) managed by the company that are not currently running fully up-to-date anti-malware protection as a percentage of total devices managed by the organization.
  53. Percentage of Workstations Not Running Updated Anti-Malware Controls – The number of workstations managed by the company that are not currently running fully up-to-date anti-malware protection as a percentage of active workstations managed by the organization.
  54. Percentage of Servers Not Running Updated Anti-Malware Controls – The number of servers managed by the company that are not currently running fully up-to-date anti-malware protection as a percentage of total active servers managed by the organization.
  55. Percentage of Mobile Devices Not Running Updated Anti-Malware Controls – The number of mobile devices managed by the company that are not currently running fully up-to-date anti-malware protection as a percentage of active mobile devices managed by the organization.
  56. Percentage of Firewall Rules Added or Changed Within Last 90 Days That Were Formally Documented – The number of changes to firewall rules that were applied to the company’s firewall (across all firewall applications/systems in use) that were formally documented according to the company’s policies/procedures as a percentage of total firewall rule changes applied within the last 90 calendar days.
  57. Percent Increase in Number of Attacks on Firewall (Weekly) – The percent difference in the number of attacks on the company’s firewall that were detected during the previous two calendar weeks.
  58. Number of Unused Firewall Rules – The total number of firewall rules (across all firewall applications/systems in use) that were found to no longer be in use during formal or informal firewall rule reviews conducted during the measurement period.
  59. Number of Firewall Reviews Conducted – The total number of formal firewall configuration reviews conducted by IT team members during the measurement period.
  60. Average Time Elapsed Between Formal Reviews of Firewall Rules – The average number of calendar days elapsed between formal firewall rules reviews conducted by the company to determine if rules must be added, removed or edited to meet current operating requirements.
  61. Number of Network Outages Attributed to Internet Service Provider – The number of network outages that can be attributed to the company’s Internet Service Provider (ISP), rather than an internal source, during the measurement period.
  62. Percentage of Servers Experiencing Hardware-related Performance Issues Within the Last 90 Days – The number of servers that have experienced hardware-related performance issues during the last 90 calendar days as a percentage of total servers operated by the company.
  63. Percentage of Workstations Experiencing Hardware-related Performance Issues Within the Last 90 Days – The number of individual workstations that have experienced hardware-related performance issues during the last 90 calendar days as a percentage of total workstations operated by the company.
  64. Deployed Hardware Utilization Ratio (DH-UR) – The ratio of the number of servers running live applications used by the organization to the total number of servers currently managed, or deployed, by the organization at the time of measurement.
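The reliability metrics at the top of the list (KRIs #1, #3, and #5) can be derived from a simple incident log. The sketch below follows the definitions given above; the incident timestamps and the 30-day window are hypothetical figures chosen for illustration.

```python
# Hypothetical incident log: (failure_start, repair_completed) in hours
# from the start of the measurement window.
incidents = [(100.0, 104.0), (300.0, 302.0), (650.0, 655.0)]
period_hours = 720.0  # 30-day measurement window

# MTTR (KRI #3): average time from failure to completed repair.
mttr = sum(end - start for start, end in incidents) / len(incidents)

# MTBF (KRI #1): average time between the start of one failure and the
# start of the next; per the definition above, this includes repair time.
gaps = [incidents[i + 1][0] - incidents[i][0] for i in range(len(incidents) - 1)]
mtbf = sum(gaps) / len(gaps)

# System Availability (KRI #5): uptime divided by scheduled time, as a percentage.
downtime = sum(end - start for start, end in incidents)
availability = (period_hours - downtime) / period_hours * 100

print(f"MTTR: {mttr:.1f} h, MTBF: {mtbf:.1f} h, availability: {availability:.2f}%")
```

The month-over-month variants (KRIs #2 and #4) then follow by computing these figures for consecutive windows and taking the percent difference.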
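Many of the remaining indicators reduce to a count divided by a population, so one helper covers them all. The counts and dollar amounts below are hypothetical, used only to show the arithmetic for three representative KRIs from the list.

```python
def pct(numerator: int, denominator: int) -> float:
    """Express a count-based KRI as a percentage, guarding against empty sets."""
    return 0.0 if denominator == 0 else numerator / denominator * 100

# KRI #31 - Percentage of Critical Systems without Up-to-Date Patches
unpatched = pct(12, 480)   # 12 of 480 critical-system endpoints lack patches

# KRI #39 - IT Budget Variance (Actual vs. Budgeted); positive = over budget
budgeted, actual = 1_000_000, 1_080_000
variance = (actual - budgeted) / budgeted * 100

# KRI #64 - Deployed Hardware Utilization Ratio (DH-UR)
dh_ur = 42 / 50            # 42 of 50 deployed servers run live applications

print(f"{unpatched:.1f}% unpatched, {variance:+.1f}% budget variance, DH-UR {dh_ur:.2f}")
```

Keeping the denominator explicit matters: most of the definitions above specify whether the population is total systems managed, total active devices, or totals measured over the same period, and changing the denominator changes the KRI.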