Auditing the Incident and Problem Management Process
Regular audits of the organization’s procedures for resolving IT problems can help prevent these issues from becoming even bigger trouble for the business.
The day-to-day running of IT operations generates many user queries and problems that could impact the efficient operation of IT systems and applications if they are not addressed. Incident and problem management often is overlooked from an IT audit perspective because it lacks the appeal of development or specificity of disaster recovery. But without a good understanding of the topic and an audit focus on the process, certain business operations can go off the rails quickly.
It is important for internal auditors to understand the difference between an incident and a problem:
The overall objective of both the incident and problem management processes is to ensure that IT systems run smoothly and support business operations. ISACA’s Control Objectives for Information and Related Technologies (COBIT) provides a good framework for auditing these processes.
COBIT identifies process steps covering the service desk, registration of customer queries, incident escalation and closure, and reporting and trend analysis.
Service Desk
The service desk is the IT department’s face to the business. Yet the team that supports this activity often receives surprisingly little staffing, training, or funding. In assessing this component, the auditor must understand:
Registration of Customer Queries
Typically, customer queries may arrive at a service desk via telephone or email. The auditor should ascertain how these queries are logged and tracked. Usually, when a query is received, service desk staff assign it a priority or severity level based on an agreed-upon definition. The assignment of priority is important, as it will determine how quickly a query will be resolved. A priority definition should consider:
Incident Escalation
The auditor needs to understand the procedures in place to escalate incidents that cannot be resolved immediately. These procedures may involve a level two and level three support structure; the escalation of queries between these levels will be determined by the resolution time limits, usually defined by a service level agreement, and the complexity of the problem. The auditor should choose a sample of escalated incidents to ensure the events have received the appropriate attention within the agreed-upon time frames.
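The sample test described above can be partly automated when incident records can be exported from the tracking system. The following is a minimal sketch, not a prescribed method: the priority labels, time limits, field names, and sample data are all hypothetical, and real limits would come from the organization's service level agreement.

```python
from datetime import datetime, timedelta

# Hypothetical SLA resolution limits per priority level; real values would
# come from the organization's service level agreement.
SLA_LIMITS = {
    "P1": timedelta(hours=4),
    "P2": timedelta(hours=8),
    "P3": timedelta(hours=24),
}


def find_sla_breaches(incidents, limits=SLA_LIMITS):
    """Return the IDs of incidents resolved later than their SLA limit."""
    breaches = []
    for inc in incidents:
        elapsed = inc["resolved"] - inc["opened"]
        if elapsed > limits[inc["priority"]]:
            breaches.append(inc["id"])
    return breaches


# Illustrative sample of escalated incidents (made-up data).
sample = [
    {"id": "INC-1", "priority": "P1",
     "opened": datetime(2024, 1, 8, 9, 0),
     "resolved": datetime(2024, 1, 8, 12, 30)},   # 3.5 hours, within limit
    {"id": "INC-2", "priority": "P1",
     "opened": datetime(2024, 1, 8, 9, 0),
     "resolved": datetime(2024, 1, 8, 15, 0)},    # 6 hours, exceeds limit
    {"id": "INC-3", "priority": "P2",
     "opened": datetime(2024, 1, 9, 10, 0),
     "resolved": datetime(2024, 1, 9, 18, 0)},    # exactly 8 hours, within limit
]

print(find_sla_breaches(sample))  # ['INC-2']
```

A check like this only flags timing exceptions. Whether a flagged incident received appropriate attention still requires the auditor to review the record itself.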
Incident Closure
Incident closure is an important, but mostly overlooked, aspect of the process. Once an incident is resolved, the assigned team member often moves on to the next incident without updating the status, system documentation, and resolution. This lack of information poses a significant issue for the problem management process because trends cannot be analyzed. During their testing, auditors should ensure that appropriate information is included in the incident record. Furthermore, auditors should ascertain whether documented criteria are in place that specify what information must be collected before an incident is classified as “closed.” Some of these criteria may include:
Reporting and Trend Analysis
Senior management should produce and review periodic reports detailing resolved incidents, service performance, and response times. The auditor should ensure the reported information is accurate and note action taken by management to address key issues.
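One way to test the accuracy of reported information is to recompute a few figures from the underlying incident records and compare them with the management report. The sketch below assumes hypothetical field names, metric names, and a 1 percent tolerance. It illustrates the idea only and is not a definitive audit procedure.

```python
from statistics import mean


def recompute_metrics(records):
    """Recompute summary figures directly from raw incident records."""
    resolved = [r for r in records if r["status"] == "resolved"]
    return {
        "resolved_count": len(resolved),
        "mean_response_minutes": mean(r["response_minutes"] for r in records),
    }


def compare_to_report(recomputed, reported, tolerance=0.01):
    """Return metrics whose reported value differs from the recomputed
    value by more than the given relative tolerance."""
    discrepancies = {}
    for name, reported_value in reported.items():
        actual = recomputed[name]
        if abs(actual - reported_value) > tolerance * max(abs(actual), 1):
            discrepancies[name] = (reported_value, actual)
    return discrepancies


# Made-up incident records and management report figures.
records = [
    {"status": "resolved", "response_minutes": 10},
    {"status": "resolved", "response_minutes": 20},
    {"status": "open", "response_minutes": 30},
    {"status": "resolved", "response_minutes": 60},
]
reported = {"resolved_count": 3, "mean_response_minutes": 25}

# Each discrepancy maps the metric name to (reported value, recomputed value).
print(compare_to_report(recompute_metrics(records), reported))
```

Here the resolved count agrees with the report, but the recomputed mean response time is 30 minutes against a reported 25, which the auditor would follow up with management.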
The key process steps identified by COBIT for problem management include identification and classification of problems, problem tracking and resolution, and problem closure.
Identification and Classification of Problems
Problems typically are identified through trend analysis of multiple incident reports and error logs. Alternatively, a high-severity incident also may be classified as a problem to enable detailed root cause analysis. Key aspects required at this stage include:
Problem Tracking and Resolution
A single tracking system, ideally interfaced with the incident management system, will assist in providing the audit trail and status required to monitor problems. In addition, communication to the impacted parties is critical at all stages to ensure there is an appropriate solution and timely resolution. Moreover, personnel must be trained to identify and track trends. In most instances, the root cause of the problem is identified only after a significant amount of analysis is undertaken. Tools such as Pareto charts, which apply the Pareto principle, and Ishikawa diagrams (also called fishbone diagrams) are useful in identifying trends and the cause-and-effect relationships behind problems.
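The Pareto analysis mentioned above can be illustrated with a short sketch. It ranks incident categories by frequency and marks the "vital few" that account for most of the volume. The category names, the counts, and the 80 percent threshold are assumptions chosen for illustration.

```python
from collections import Counter


def pareto_table(categories, threshold=0.8):
    """Rank categories by incident count and flag the 'vital few' needed
    to reach `threshold` of all incidents.

    Each row is (category, count, cumulative share, in vital few).
    """
    counts = Counter(categories)
    total = sum(counts.values())
    rows = []
    cumulative = 0
    for category, count in counts.most_common():
        # A category belongs to the vital few if the threshold had not
        # yet been reached before it was added.
        in_vital_few = cumulative / total < threshold
        cumulative += count
        rows.append((category, count, round(cumulative / total, 2), in_vital_few))
    return rows


# Made-up incident categories taken from closed incident records.
incident_categories = (
    ["password_reset"] * 10 + ["network"] * 7 + ["printer"] * 2 + ["other"] * 1
)

for row in pareto_table(incident_categories):
    print(row)
```

In this made-up data set, two of the four categories account for 85 percent of incidents, so they are the natural candidates for root cause analysis. The sketch also shows why complete incident records matter: without a category on each closed incident, this analysis cannot be performed.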
Problem Closure
Problem records should be closed when there is a successful resolution of the known error or if the business agrees to implement an alternate solution or workaround.
A STRONGER PROCESS
The underlying principle behind a successful problem and incident management process is communication among the various teams. The hand-offs between the teams are where most problems occur. Clearly defined procedures together with specific accountabilities can help strengthen the process. Moreover, continuing awareness and education, with internal audit’s support, can make the process more robust.
Shannon Buckley, CIA, CISA, CGEIT, CPA, is a senior auditor with Bupa International Markets in Sydenham, Victoria, Australia.