I was going back through some old bookmarks when I stumbled upon on a post by Richard Bejtlich from 2007 entitled “NSM and Intrusion Detection Differences“. In this article, Richard discussed the concept of ‘immaculate collection’ versus ‘immaculate detection’. Richard’s article references IDS developers desiring immaculate detection while NSM practitioners typically vie for immaculate collection. Given this, I posed the following question to several of my colleagues: Which is more important, collection or detection?
The question itself is open to a bit of interpretation, but my group was split about 60/40 favoring collection over detection. I tend to agree with that majority, although the minority had some valid points as well.
Those favoring detection argued that a mountain of data, no matter how eloquently collected, is useless without some level of detection capability. Additionally, most in this camp agreed that your detection capability shapes how you perform collection. A few even made the point that they considered collection to be a function of network operations, and not NSM. I can’t disagree with the first of these arguments, but I’m opposed to the other two. I’ll address the argument of whether or not detection shapes collection here.
When I think about NSM, I typically think of it in three phases: collection, detection, and analysis. Collection is the gathering and parsing of relevant network security data, and it often performed by a combination of hardware and software. Detection is the process of finding anomalies in collected data that may represent a potential intrusion. Detection is most often done by software, but can be done by humans to a lesser extent. Analysis is the review and investigation of alert data generated during detection. Analysis is typically (and most effectively) done by humans.
Phases of Network Security Monitoring
The key takeaway from these three phases is that they form a cycle rather than a beginning to end process. Collected data feeds the detection capability, and the alert data generated from detection feeds the analysis process. What makes this process cyclical is that the investigation and research performed during the analysis process is used to define and shape what data you are collecting.
That said, I argue that collection is the most important phase of network security monitoring for a couple of reasons:
Detection Depends on Collection
Abraham Lincoln was quoted in saying that if you were to give him six hours to chop down a tree, he would spend the first four hours sharpening his ax. This analogy fits perfect here, because no matter how much thought you put into your detection tools, they are utterly useless if they aren’t digesting the right data. That nice beefy Snort sensor might just be wasting cycles if you’ve placed it on the wrong side of your firewall. Detection fails if collection isn’t done well.
Analysis Also Depends on Collection
I hate using the needle in the haystack analogy, but if the hay is covered in manure then you sure aren’t going to want to spend all of that time digging through it. A human analyst interprets alert data provided by a detection mechanism and then goes out and collects more data in an effort to support his/her investigation. If this data isn’t being collected in an easily retrievable and digestible format then analysis fails. An IDS signature might tell me that a potential attacker is attempting SQL injection on my public facing web server but if I’m not collecting PCAP data and my web server/database logs aren’t accessible then I’m going to have a really hard time finding out if the attack is actually successful.
Analysis Feeds Collection Moreso than Detection
I’ve served in the role where I’m the guy creating the detection tools and also in the role where I’m the guy analyzing the alerts generated by the detection tools. It is absolutely true that in some cases collection software/hardware is designed and configured in such a way that it provides data in the appropriate format to a detection tool. This might lead someone to the conclusion that it is detection shaping the collection, but that argument is only made seeing a narrow view of the entire thought process. It is actually the analysis of previous alert data that typically has identified the need for the detection tool that is being created. Remember that detection is most often a task performed by software and it is analysis that is performed by individuals. Software doesn’t identify needs, people do.
Again, I think this is one of those questions that may or may not have a right answer, but for my two cents, if you gave me six hours to find the bad guys, I’d spend the first four making sure I collected the right data.
Very interesting read. I will agree that you have to have the right data in order to come the right conclusion.