
Applied Network Security Monitoring, the book!

I’m thrilled to announce my newest project: Applied Network Security Monitoring, the book, which I’m writing along with my co-authors Liam Randall and Jason Smith.

Better yet, I’m excited to say that 100% of the royalties from this book will be going to support some great charities, including the Rural Technology Fund, Hackers for Charity, Hope for the Warriors, and Lighthouse Youth Services.

You can read more about the book, including a full table of contents, at its companion site: http://www.appliednsm.com/.

Information Security Incident Morbidity and Mortality (M&M)

It may be a bit cliché, but encouraging the team dynamic within an information security group ensures mutual success over individual success. There are a lot of ways to do this, including items I’ve discussed before such as fostering the development of infosec superstars or encouraging servant leadership. Beyond these things, there is no better way to ensure team success within your group than to create a culture of learning. Creating this type of culture goes well beyond sending analysts to formalized courses or paying for certifications. It relies upon adopting the mindset that in every action an analyst takes, they should either be teaching or learning, with no exceptions. Once every analyst begins seeing every part of their daily job as an opportunity to learn something new or teach something new to their peers, then a culture of learning is flourishing.

A part of this type of organizational culture is learning from both successes and failures. The practices of Network Security Monitoring (NSM) and Incident Response (IR) are centered on technical investigations and cases and, when something bad eventually happens, incidents. This is not unlike medicine, which is also focused on medical investigations and patient cases and, when something bad eventually happens, death.

Medical M&M

When death occurs in medicine, it can usually be classified as either avoidable or inevitable, both from the patient’s standpoint and in relation to the medical care that was provided. Whenever a death is seen as something that might have been prevented or delayed through modifications to the care that was provided, the treating physician will often be asked to participate in a Morbidity and Mortality Conference, or M&M as they are casually known. In an M&M, the treating physician presents the case from the initial visit, including the presenting symptoms and the patient’s initial history and physical assessment. The presentation continues through the diagnostic and treatment steps that were taken, all the way through the patient’s eventual death.

The M&M presentation is given to an audience of peers, including any other physicians who participated in the care of the patient in question as well as physicians who had nothing to do with the patient. The general premise is that these peers will question the treatment process in order to uncover any mistakes that were made or processes that could be improved upon.

The ultimate goal of the medical M&M is for the team to learn from any complications or errors, to modify behavior and judgment based on the experience gained, and to prevent the repetition of errors that lead to complications. The practice has existed in medicine for over one hundred years and has proven wildly successful.

Information Security M&M

I’ve written about how information security can learn from the medical field on multiple occasions, including recently discussing the use of Differential Diagnosis for Network Security Monitoring. The concept of M&M is also something that I think transitions very well to information security.

As information security professionals, it is very easy to miss things. I’m a firm believer that prevention eventually fails, and as a result, we can’t be expected to live in a world free from compromise. Rather, we must be positioned so that when an incident does occur, it can be detected and responded to quickly. Once that is done, we can learn from whatever mistakes occurred that allowed the intrusion, and be better prepared the next time.

When an incident occurs, we want it to be because of something out of our hands, such as a very sophisticated attacker or an attacker using an unknown zero day. The truth of the matter is that not all incidents are that complex, and oftentimes there are ways in which detection, analysis, and response could have occurred faster. The information security M&M is a way to collect that information and put it to work. In order to understand how we can improve from mistakes, we have to understand why they are made. Uzi Arad summarizes this very well in the book “Managing Strategic Surprise”, a must-read for information security professionals. In it, he cites three problems that lead to failures in intelligence management, which also apply to information security:

  • The problem of misperception of the material, which stems from the difficulty of understanding the objective reality, or the reality as it is perceived by the opponent.
  • The problems stemming from the prevalence of pre-existing mindsets among the analysts that do not allow an objective professional interpretation of the reality that emerges from the intelligence material.
  • Group pressures, groupthink, or social-political considerations that bias professional assessment and analysis.

The information security M&M aims to provide a forum for overcoming these problems through strategic questioning of incidents that have occurred.

When to Convene an M&M

In an Information Security M&M, the conference should be initiated after an incident has occurred and been remediated. Selecting which incidents are appropriate for M&M is a task usually handled by a team lead or a member of management who can recognize when an investigation could have been handled better. The conference should occur reasonably soon after the incident, so that important details are fresh in the minds of those involved, but far enough out that those involved have had time to analyze the incident as a whole, post-mortem. About a week after the incident has occurred is usually an acceptable time frame.

M&M Presenter(s)

The presentation of the investigation will often involve multiple individuals. In medicine, this may include an initial treating emergency room physician, an operating surgeon, and a primary care physician. In information security, this could include an NSM analyst who detected the incident, the incident responder who contained and remediated the incident, the forensic investigator who performed an analysis of a compromised machine, or the malware analyst who reverse engineered the malware associated with the incident.

M&M Peers

The peers involved with the M&M should include at least one counterpart from each specialty. This means that for every NSM analyst directly involved with the case, there should be at least one other NSM analyst who had nothing to do with it. The aim is to get fresh, outside views that aren’t tainted by the need to defend actions taken during the investigation. In larger organizations and more ideal situations, it is nice to have at least two counterparts from each specialty: one with less experience than the presenting individual and one with more.

The Presentation

The presenting individual or group of individuals should be given at least a few days’ notice before their presentation. Although the M&M isn’t considered a formal affair, a reasonable presentation is expected to include a timeline overview of the incident along with any supporting data. The presenter should walk through the detection, investigation, and remediation of the incident chronologically, presenting new findings only as they were discovered during that progression. Once the chronological presentation is given, the incident can then be examined holistically.
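
The exact format is up to the team, but it helps to capture the timeline as discrete, timestamped entries before the conference. The sketch below is purely illustrative; the field names and sample data are hypothetical, and it simply shows one way to keep the chronological walkthrough organized so findings are revealed in the order they were discovered.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List

@dataclass
class TimelineEntry:
    """One step in the incident as the analysts experienced it."""
    timestamp: datetime   # when the finding was made, not necessarily when the event occurred
    phase: str            # e.g. "detection", "investigation", "remediation"
    analyst: str          # who made the observation or took the action
    observation: str      # what was seen or done
    supporting_data: List[str] = field(default_factory=list)  # alert IDs, PCAP files, log excerpts

# Entries are presented in the order they were discovered, so peers can question
# each decision using only the information that was available at the time.
timeline = [
    TimelineEntry(datetime(2013, 4, 1, 9, 15), "detection", "analyst1",
                  "IDS alert for outbound traffic to a known C2 domain",
                  ["alert-4231", "sensor2-20130401.pcap"]),
    TimelineEntry(datetime(2013, 4, 1, 10, 40), "investigation", "analyst1",
                  "Host 10.1.1.50 identified as the source; malware sample pulled for analysis"),
]

for entry in sorted(timeline, key=lambda e: e.timestamp):
    print(f"{entry.timestamp:%Y-%m-%d %H:%M} [{entry.phase}] {entry.analyst}: {entry.observation}")
```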

During the presentation, participating peers should be expected to ask questions as they arise. This should, of course, be done respectfully, by raising a hand while the presenter is speaking, but questions should NOT be saved for after the presentation. The goal is to frame each question at the point where a peer would have arrived at it during the investigation process.

Strategic Questioning

Questions should be posed to presenters in such a way as to determine why something was handled in a particular manner, or why it wasn’t handled in an alternative manner. As you might expect, it is very easy to offend someone with these types of questions. It is therefore critical that participants enter the M&M with an open mind, and that both presenters and peers ask and respond to questions professionally and with due respect.

Initially, it may be difficult for peers to develop questions that are entirely constructive and helpful in overcoming the three problems identified earlier. There are several methods that can be used to stimulate the appropriate type of questioning.

Devil’s Advocate

One method that Uzi Arad mentions in his contribution to “Managing Strategic Surprise” is the Devil’s Advocate method. In this method, peers attempt to oppose nearly every analytical conclusion made by the presenter. This is done by first determining which conclusions can be challenged, then collecting information from the incident that supports the alternative assertion. It is then up to the presenter to support their own conclusions and debunk the competing ones.

Alternative Analysis (AA)

R.J. Heuer presents several of these methods in his paper, “The Limits of Intelligence Analysis”. These methods are part of a set of analytic tools called Alternative Analysis (AA).

Group A / Group B

This analysis involves two groups of experts analyzing the incident separately, based upon the same information. This requires that the presenters (Group A) provide the supporting data related to the incident prior to the M&M so that the peers (Group B) can work collaboratively to come up with their own analysis, to be compared and contrasted during the M&M. The goal is to establish two independent centers of thought. Wherever the two groups reach different conclusions, additional discussion is required to find out why.

Red Cell Analysis

This method focuses on the adversarial viewpoint: peers assume the role of the adversary involved in the particular incident. In doing so, they question the presenter on how investigative steps were completed in reaction to the attacker’s actions. For instance, a typical defender may be focused solely on how to stop malware from communicating back to the attacker, while the attacker may be more concerned with whether the defender was able to decipher the communication that was occurring. This can lead to a very productive line of questioning that results in new analytic methods for assessing the attacker’s impact, which in turn benefits containment.

What If Analysis

This method is focused on the potential causes and effects of events that may not have actually occurred. During detection, a peer might ask how the attack would have been detected if the mechanism that did detect it had failed to do so. For the response to the event, a peer might ask what the presenter would have done had the attacker been caught during the data exfiltration process rather than after it had already occurred. These questions don’t always relate directly to the incident at hand, but they provide incredibly valuable, thought-provoking discussion that will better prepare your team for future incidents.

Analysis of Competing Hypotheses

This method is similar to what occurs during a differential diagnosis, where peers create an exhaustive list of alternative assessments of the symptoms that were presented. This is most effectively done by using a whiteboard to list every potential diagnosis and then ruling each out based upon testing and review of additional data. You can review my article on differential diagnosis of NSM events here for a more thorough discussion of this type of questioning.
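
For teams that want something more structured than a whiteboard, the competing hypotheses and supporting evidence can also be captured in a simple matrix. The sketch below is a minimal, hedged take on Heuer-style scoring; the hypotheses and evidence items are invented for illustration. Each piece of evidence is marked as consistent, inconsistent, or neutral against each hypothesis, and the hypothesis with the least evidence against it is usually the strongest candidate.

```python
# Minimal Analysis of Competing Hypotheses (ACH) matrix.
# Scores: +1 = evidence consistent with the hypothesis, -1 = inconsistent, 0 = neutral.
hypotheses = [
    "Commodity malware infection via drive-by download",
    "Targeted intrusion using stolen credentials",
    "False positive from a misconfigured vulnerability scanner",
]

evidence = {
    "Outbound traffic to a known exploit kit domain":  [+1, -1, -1],
    "Legitimate admin account logging in at 3 AM":     [ 0, +1, -1],
    "Identical alerts from dozens of unrelated hosts": [+1, -1, +1],
}

# Heuer's guidance is to focus on inconsistency: the hypothesis with the least
# evidence against it survives, not the one with the most evidence for it.
for i, hypothesis in enumerate(hypotheses):
    inconsistent = sum(1 for scores in evidence.values() if scores[i] < 0)
    net = sum(scores[i] for scores in evidence.values())
    print(f"{hypothesis}: {inconsistent} inconsistent item(s), net score {net:+d}")
```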

Key Assumptions Check

Most sciences make assumptions based upon generally accepted facts. This method of questioning is designed to challenge key assumptions and examine how they affect the investigation of a scenario. It most often pairs with the What If analysis method. As an example, in the spread of malware, a common assumption is that malware operating within a virtual machine doesn’t have the ability to escape to the host or to other virtual machines residing on it. Given an incident in which a virtual machine has been infected with malware, a peer might ask what action would be taken if the malware did indeed escape the virtual environment and infect other virtual machines on the host, or the host itself.

Outcome

During the M&M, all participants should actively take notes. Once the M&M is completed, the presenting individuals should combine their notes into a final report that accompanies their presentation materials and supporting data. This report should include a list of any points that could have been handled differently and any improvements that could be made to the organization as a whole, either technically or procedurally. The report should be attached to the case file associated with the investigation of the incident.

Additional Tips

Having organized and participated in several of these conferences and reviews of similar scope, I have a few other pointers that help in ensuring they provide value.

  • M&M conferences should be held only sporadically, with no more than one per week and no more than three per month.
  • It should be stressed that the purpose of the M&M isn’t to grade or judge an individual, but rather, to encourage the culture of learning.
  • M&M conferences should be moderated by someone at a team lead or lower management level to ensure that the conversation doesn’t get too heated and to steer questions in the right direction.
  • If you make the decision to institute M&M conferences, it should be a requirement that everybody participates at some point, either as a presenter or a peer.
  • The final report that is generated from the M&M should be shared with all technical staff, as well as management.
  • Information security professionals, not unlike doctors, tend to have big egos. The first several conferences might introduce some contention and heated debates. This is to be expected initially, but will work itself out over time with proper direction and moderation.
  • The M&M should be seen as a casual event. It is a great opportunity to provide food and coordinate other activities before and after the conference to take the edge off.
  • Be wary of inviting upper management into these conferences. Their presence will often inhibit open questioning and response and they often don’t have the appropriate technical mindset to gain or provide value to the presentation.

It is absolutely critical that these conferences be initiated with care. The medical M&M was started in the early 1900s by Dr. Ernest Codman, a surgeon at Massachusetts General Hospital in Boston. MGH was so appalled by Dr. Codman’s suggestion that the competence of surgeons should be evaluated that he eventually lost his staff privileges. Today, M&M is a mainstay of modern medicine, practiced in some of the best hospitals in the world. I’ve seen similar types of shunning occur in information security when these kinds of peer review opportunities are suggested. As information security practitioners, it is crucial that we accept this type of peer review and that we encourage group learning and the refinement of our skills.

References:

  • Campbell, W. (1988). “Surgical morbidity and mortality meetings”. Annals of the Royal College of Surgeons of England 70(6): 363–365. PMC 2498614. PMID 3207327.
  • Arad, Uzi (2008). “Intelligence Management as Risk Management”. In Paul Bracken, Ian Bremmer, and David Gordon (Eds.), Managing Strategic Surprise (pp. 43–77). Cambridge: Cambridge University Press.
  • Heuer, Richards J., Jr. (2005). “Limits of Intelligence Analysis”. Orbis 49(1).

NSM Collection vs. Detection

I was going back through some old bookmarks when I stumbled upon a post by Richard Bejtlich from 2007 entitled “NSM and Intrusion Detection Differences“. In that article, Richard discussed the concept of ‘immaculate collection’ versus ‘immaculate detection’: IDS developers tend to pursue immaculate detection, while NSM practitioners typically strive for immaculate collection. Given this, I posed the following question to several of my colleagues: which is more important, collection or detection?

The question itself is open to a bit of interpretation, but my group was split about 60/40 favoring collection over detection. I tend to agree with that majority, although the minority had some valid points as well.

Those favoring detection argued that a mountain of data, no matter how elegantly collected, is useless without some level of detection capability. Additionally, most in this camp agreed that your detection capability shapes how you perform collection. A few even made the point that they consider collection to be a function of network operations rather than of NSM. I can’t disagree with the first of these arguments, but I’m opposed to the other two. I’ll address the argument of whether detection shapes collection here.

When I think about NSM, I typically think of it in three phases: collection, detection, and analysis. Collection is the gathering and parsing of relevant network security data, and it is often performed by a combination of hardware and software. Detection is the process of finding anomalies in collected data that may represent a potential intrusion. Detection is most often done by software, but can be done by humans to a lesser extent. Analysis is the review and investigation of alert data generated during detection. Analysis is typically (and most effectively) done by humans.

[Figure: Phases of Network Security Monitoring]

The key takeaway from these three phases is that they form a cycle rather than a beginning to end process. Collected data feeds the detection capability, and the alert data generated from detection feeds the analysis process. What makes this process cyclical is that the investigation and research performed during the analysis process is used to define and shape what data you are collecting.
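
To make that feedback loop concrete, the cycle can be modeled as three functions where the output of analysis becomes new collection requirements. This is only a conceptual sketch; the function names and sample data are placeholders rather than a reference to any particular toolset.

```python
def collect(requirements):
    """Gather the data sources the team has decided it needs (PCAP, flow, logs, etc.)."""
    return {"sources": sorted(requirements)}  # placeholder for real collection

def detect(collected):
    """Run detection logic over the collected data and emit alert records."""
    return [{"alert": "example", "source": src} for src in collected["sources"]]  # placeholder

def analyze(alerts):
    """Human review of alerts; the outcome includes new or refined collection requirements."""
    # e.g. an investigation reveals that web server logs are needed to validate future alerts
    return {"web server logs"} if alerts else set()

# The cycle: collection feeds detection, detection feeds analysis,
# and analysis shapes what gets collected next.
requirements = {"egress pcap", "ids alerts"}
for cycle in range(3):
    alerts = detect(collect(requirements))
    requirements |= analyze(alerts)
    print(f"cycle {cycle}: now collecting {sorted(requirements)}")
```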

That said, I argue that collection is the most important phase of network security monitoring for a couple of reasons:

Detection Depends on Collection

Abraham Lincoln is often quoted as saying that if you gave him six hours to chop down a tree, he would spend the first four sharpening his ax. The analogy fits perfectly here, because no matter how much thought you put into your detection tools, they are utterly useless if they aren’t digesting the right data. That nice beefy Snort sensor might just be wasting cycles if you’ve placed it on the wrong side of your firewall. Detection fails if collection isn’t done well.
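
One quick way to check for that kind of placement problem is to look at what the sensor is actually collecting. The sketch below assumes Scapy is installed and uses a hypothetical capture file name; if the sensor sits on the outside of a NAT'ing firewall, the internal host addresses simply won't show up.

```python
import ipaddress
from collections import Counter

from scapy.all import rdpcap, IP  # assumes Scapy is available

packets = rdpcap("sensor-sample.pcap")  # hypothetical capture pulled from the sensor

addresses = Counter()
for pkt in packets:
    if IP in pkt:
        addresses[pkt[IP].src] += 1
        addresses[pkt[IP].dst] += 1

internal = [addr for addr in addresses if ipaddress.ip_address(addr).is_private]

# If everything appears to come from the firewall's public address, the sensor is
# blind to which internal host is actually involved, and detection suffers for it.
print(f"{len(internal)} private (RFC 1918 or similar) addresses seen out of {len(addresses)} total")
```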

Analysis Also Depends on Collection

I hate using the needle-in-a-haystack analogy, but if the hay is covered in manure, then you sure aren’t going to want to spend all of that time digging through it. A human analyst interprets alert data provided by a detection mechanism and then goes out and collects more data in an effort to support his or her investigation. If that data isn’t being collected in an easily retrievable and digestible format, then analysis fails. An IDS signature might tell me that a potential attacker is attempting SQL injection against my public-facing web server, but if I’m not collecting PCAP data and my web server and database logs aren’t accessible, then I’m going to have a really hard time finding out whether the attack was actually successful.
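
As a rough sketch of what that follow-up analysis looks like when the data is actually available, the script below pulls the requests a flagged source made around the time of the alert from a combined-format web server access log, so an analyst can judge whether the injection attempts returned anything useful to the attacker. The log path, IP address, and alert time are hypothetical.

```python
import re
from datetime import datetime, timedelta

LOG_PATH = "/var/log/apache2/access.log"    # hypothetical log location
SUSPECT_IP = "203.0.113.55"                 # source address from the IDS alert
ALERT_TIME = datetime(2013, 4, 1, 14, 22)   # timestamp from the IDS alert
WINDOW = timedelta(minutes=15)

# Combined log format: ip ident user [timestamp] "METHOD URI PROTO" status bytes ...
LINE_RE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<ts>[^\]]+)\] "(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+)'
)

with open(LOG_PATH) as log:
    for line in log:
        match = LINE_RE.match(line)
        if not match or match.group("ip") != SUSPECT_IP:
            continue
        ts = datetime.strptime(match.group("ts").split()[0], "%d/%b/%Y:%H:%M:%S")
        if abs(ts - ALERT_TIME) > WINDOW:
            continue
        # Large 200 responses to requests containing SQL syntax deserve a close look;
        # a string of 500s may mean the injection attempts were failing.
        print(match.group("status"), match.group("size"), match.group("request"))
```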

Analysis Feeds Collection More Than Detection Does

I’ve served in the role where I’m the person creating the detection tools and also in the role where I’m the person analyzing the alerts those tools generate. It is absolutely true that, in some cases, collection software and hardware is designed and configured in such a way that it provides data in the appropriate format to a detection tool. This might lead someone to conclude that detection shapes collection, but that argument takes a narrow view of the entire thought process. It is actually the analysis of previous alert data that typically identified the need for the detection tool being created. Remember that detection is most often a task performed by software, while analysis is performed by individuals. Software doesn’t identify needs; people do.

Again, I think this is one of those questions that may or may not have a right answer, but for my two cents, if you gave me six hours to find the bad guys, I’d spend the first four making sure I collected the right data.