Picture This: A Visual Guide to Incident Response
The term incident carries different meanings in different industries. In banking and finance, it is very specific and refers to an event that involves the loss of money. You wouldn’t want to call an attempted intrusion an incident if you worked on a bank network, because that terminology would automatically trigger an entirely different type of investigation. Still, as you study for core security certifications, you will need to learn the proper procedure for documenting and responding to a security breach.
The next five sections of this article deal with the phases of a typical incident response process. The steps are generic in this example. Each organization will have a specific set of procedures, however, that will generally map to these steps.
NOTE: An important concept to keep in mind when working with incidents is the chain of custody, which covers how evidence is secured, where it is stored, and who has access to it. When you begin to collect evidence, you must keep track of that evidence at all times and show who has it, who has seen it, and where it has been. The evidence must always be within your custody, or you’re open to dispute about whether it has been tampered with. It is highly recommended that you use a log book to document every access, and that you record visuals (pictures and video) to show how the evidence is secured.
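As a rough illustration of the log book idea, the following Python sketch records who handled a piece of evidence, what they did, and where. It is not a forensic standard; the class names, field names, and the hash-chaining approach (each entry stores a hash of the previous one so that later alterations can be detected) are assumptions made for this example.

```python
import hashlib
import json
from dataclasses import dataclass, field, asdict
from datetime import datetime, timezone


@dataclass(frozen=True)
class CustodyEntry:
    """One line in a hypothetical chain-of-custody log book."""
    evidence_id: str
    action: str      # e.g. "collected", "transferred", "viewed"
    person: str
    location: str
    timestamp: str
    prev_hash: str   # digest of the previous entry, "" for the first one

    def digest(self) -> str:
        # Hash every field so that any later change alters the digest.
        payload = json.dumps(asdict(self), sort_keys=True).encode("utf-8")
        return hashlib.sha256(payload).hexdigest()


@dataclass
class CustodyLog:
    entries: list = field(default_factory=list)

    def record(self, evidence_id, action, person, location, timestamp=None):
        """Append an entry that is linked to the one before it."""
        prev = self.entries[-1].digest() if self.entries else ""
        ts = timestamp or datetime.now(timezone.utc).isoformat()
        entry = CustodyEntry(evidence_id, action, person, location, ts, prev)
        self.entries.append(entry)
        return entry

    def history(self, evidence_id):
        """Return every recorded access for one piece of evidence."""
        return [e for e in self.entries if e.evidence_id == evidence_id]

    def is_intact(self) -> bool:
        """Check that no entry was altered after it was written."""
        prev = ""
        for entry in self.entries:
            if entry.prev_hash != prev:
                return False
            prev = entry.digest()
        return True
```

A responder would call record() each time the evidence is collected, moved, or viewed, and an auditor could later call history() and is_intact() to review the trail. A paper log book signed by each handler serves the same purpose.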
Step One: Identifying the Incident
Incident identification is the first step in determining what has occurred in your organization. What you observe may be part of a larger internal or external attack that has just surfaced, or it may be a random probe or scan of your network. An event is signaled by a trigger from your intrusion detection system (IDS). Operations personnel will determine whether an event becomes an incident. An easy way to think of the two is that an event is anything that happens, while an incident is any event that endangers a system or network.
Figure One: For the purposes of this discussion, assume that step one is finding physical evidence suggesting that someone may have stolen equipment.
Many IDSs generate false positives when reporting events. False positives are events that aren’t really incidents. Remember that an IDS is based on established rules of acceptance (deviations from which are known as anomalies) and attack signatures. If the rules aren’t set up properly, normal traffic may set off the analyzer and generate an event. Be sure to double-check your results because you don’t want to declare a false emergency.
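To make the two kinds of rule concrete, here is a minimal Python sketch of a check that combines attack signatures with an anomaly rule. It does not reflect how any particular IDS product works; the signature strings, the classify function, and the max_rate parameter are all hypothetical and chosen only for illustration.

```python
# Hypothetical attack signatures: substrings that rarely appear in normal requests.
SIGNATURES = ("/etc/passwd", "' OR '1'='1")


def classify(request_path, requests_per_minute, max_rate):
    """Return the reasons (if any) that a request would raise an event.

    "signature" means the request matched a known attack pattern.
    "anomaly" means the request rate deviated from the accepted rule.
    """
    reasons = []
    if any(sig in request_path for sig in SIGNATURES):
        reasons.append("signature")
    if requests_per_minute > max_rate:
        reasons.append("anomaly")
    return reasons


# A rule set too tightly flags ordinary traffic: a false positive.
print(classify("/index.html", 120, max_rate=100))
# The same traffic passes once the rule reflects normal load.
print(classify("/index.html", 120, max_rate=500))
# A request containing a known signature is flagged regardless of rate.
print(classify("/download?file=../../etc/passwd", 5, max_rate=500))
```

The first call shows how a poorly chosen rule turns normal traffic into an event, which is why results should be double-checked before an emergency is declared.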
One problem with manual network monitoring is overload. A slow attack may build in intensity over time, and because personnel tend to adapt to changes that occur gradually, they may not notice the attack until it’s too late to stop it. An automated monitoring system, such as an IDS, does not adapt in this way; it sounds the alarm when a certain threshold or activity level is reached.
When a suspected incident arises, the first responders are the people who must determine whether it truly is an incident or a false alarm. Depending on your organization, the first responder may be the main security administrator alone or a team of network and system administrators.
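The difference between a fixed threshold and gradual adaptation can be sketched in a few lines of Python. The traffic figures, function names, and the 1.5x ratio below are invented for this example; the second function is only a crude stand-in for a person who compares each day with the one before.

```python
def first_threshold_alarm(counts, threshold):
    """Index of the first sample at or above a fixed threshold, or None."""
    for i, count in enumerate(counts):
        if count >= threshold:
            return i
    return None


def first_relative_alarm(counts, ratio):
    """Index of the first sample that jumps by more than `ratio` over the
    previous sample, or None. This models an observer who only notices
    sudden changes and adapts to gradual ones."""
    for i in range(1, len(counts)):
        if counts[i - 1] > 0 and counts[i] > ratio * counts[i - 1]:
            return i
    return None


# Hypothetical daily connection attempts that grow about 10% per day.
daily_attempts = [100, 110, 121, 133, 146, 161, 177, 195, 214, 236]

print(first_threshold_alarm(daily_attempts, threshold=200))  # fixed rule fires
print(first_relative_alarm(daily_attempts, ratio=1.5))       # never fires
```

With these numbers the fixed threshold raises an alarm on the ninth day, while the day-over-day comparison never does, even though traffic has more than doubled.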
After you’ve determined that you indeed have an incident on your hands, you need to consider how to handle it. This process, called escalation, involves reviewing policies, consulting appropriate management, and determining how best to conduct an investigation into the incident. Make sure that the methods you use to investigate the incident are consistent with corporate and legal requirements for your organization. Bring your Human Resources and Legal departments into the investigation early, and seek their guidance whenever questions involving their areas of expertise appear.
A key aspect, often overlooked by system professionals, involves information control. When an incident occurs, who is responsible for managing the communications about the incident? Employees in the company may naturally be curious about a situation. A single spokesperson needs to be designated. Remember, what one person knows, 100 people know.
Step Two: Investigating the Incident
The process of investigating an incident involves searching logs, files and any other sources of data about the nature and scope of the incident. If possible, you should determine whether it is part of a larger attack, a random event, or a false positive. False positives are common in an IDS environment and may be the result of unusual traffic in the network. It may be that your network is being pinged by a class of computer security students to demonstrate the return times, or it may be that an automated tool is launching an attack.
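As one small example of searching logs, the Python sketch below counts failed logins per source address to help separate an isolated event from a sustained attempt. The log line format, the regular expression, and the min_failures cutoff are assumptions made for this illustration; real logs will differ and the pattern would need to be adapted.

```python
import re
from collections import Counter

# Assumed log format, for illustration only:
#   2024-01-01T10:00:00 FAILED LOGIN user=admin src=203.0.113.7
FAILED_LOGIN = re.compile(r"FAILED LOGIN user=(\S+) src=(\S+)")


def failed_logins_by_source(lines):
    """Count failed login lines per source address."""
    counts = Counter()
    for line in lines:
        match = FAILED_LOGIN.search(line)
        if match:
            counts[match.group(2)] += 1
    return counts


def suspicious_sources(lines, min_failures=3):
    """Return sources with at least `min_failures` failed logins, sorted."""
    counts = failed_logins_by_source(lines)
    return sorted(src for src, n in counts.items() if n >= min_failures)


sample_log = [
    "2024-01-01T10:00:00 FAILED LOGIN user=admin src=203.0.113.7",
    "2024-01-01T10:00:02 FAILED LOGIN user=root src=203.0.113.7",
    "2024-01-01T10:00:04 FAILED LOGIN user=admin src=203.0.113.7",
    "2024-01-01T10:05:00 FAILED LOGIN user=jsmith src=198.51.100.20",
    "2024-01-01T10:05:30 LOGIN OK user=jsmith src=198.51.100.20",
]

print(suspicious_sources(sample_log))
```

In this sample, one address fails three times in a few seconds against different accounts, which looks automated, while the other fails once and then succeeds, which looks like a mistyped password.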
NOTE: It is sad but true: One reason administrators don’t put as much security on networks as they could is that they do not want to deal with the false positives. While this is a poor excuse, it is still often used by administrators. As a security administrator, you must seek a balance between being overwhelmed with too much unneeded information and knowing when something out of the ordinary is occurring. It is an elusive balance that is easier to talk about than find, but it’s one you must strive for.
Figure Two: Check for records indicating that someone took the equipment, whether authorized or unauthorized.
You might find that the incident doesn’t require a response because the attack can’t succeed. Your investigation might also conclude that a change in policies is required to deal with a new type of threat. These types of decisions should be documented, and if necessary, reconfigurations should be made to deal with the change.
Step Three: Repairing the Damage
One of your first considerations after an incident is to determine how to restore access to resources that have been compromised. Then, of course, you must reestablish control of the system. Most operating systems provide the ability to create a disaster-recovery process using distribution media or system state files.
Figure Three: Repairing the damage includes taking steps to stop any continuation of the problem.
After a problem has been identified, what steps will you take to restore service? In the case of a DoS attack, a system reboot may be all that is required. Your operating system manufacturer will typically provide detailed instructions or documentation on how to restore services in the event of an attack.
If a system has been severely compromised, as in the case of a worm, it might not be possible to repair it. It may need to be regenerated from scratch. Fortunately, antivirus software packages can repair most of the damage done by the viruses you encounter. But what if you come across something new? You might need to start over with a new system. In that case, you’re highly advised to do a complete disk drive format or repartition to ensure that nothing is lurking on the disk, waiting to infect your network again.
Step Four: Documenting and Reporting the Response
During the entire process of responding to an incident, you should document the steps you take to identify, detect, and repair the system or network. This information is valuable; it needs to be captured in case an attack like this occurs again. The documentation should be accessible by the people most likely to deal with this type of problem. Many help-desk software systems provide detailed methods you can use to record
Figure Four: It may be necessary to bring the authorities into the investigation.
If appropriate, you should report the incident to legal authorities and CERT (www.cert.org) so that others can be aware of the type of attack and help look for proactive measures to prevent it from happening again. While it is a bit dated, the CERT guide Handbook for Computer Security Incident Response Teams (CSIRTs) at http://resources.sei.cmu.edu/library/asset-view.cfm?assetID=6305 should be considered required reading.
You might also want to inform the software or system manufacturer of the problem and how you corrected it. Doing so might help them inform or notify other customers of the threat and save time for someone else.
Step Five: Adjusting Procedures
After an incident has been successfully managed, it’s a worthwhile step to revisit the procedures and policies in place in your organization to determine what changes, if any, need to be made.
Figure Five: Keeping all equipment visible and secured can be one adjustment to procedures.
Answering simple questions can sometimes be helpful when you’re resolving problems. The following questions might be included in a policy or procedure manual:
- How did the policies work or not work in this situation?
- What did we learn about the situation that was new?
- What should we do differently next time?
These simple questions can help you adjust procedures. This process is called a postmortem, and it’s the equivalent of an autopsy.
Summing It Up
An incident response policy explains how incidents will be handled, including notification, resources, and escalation. This policy drives the incident response process, and it gives the incident response team a plan in advance. It should include the steps addressed here.