Investigating Like a Chef

Whenever I get the chance, I like to try to extract lessons from practitioners in other fields. This is important because the discipline of information security is so new, while more established professions have been around, in some cases, for hundreds of years. I’ve always had a keen interest in culinary studies, mostly because I come from an area of the country where people show that they love each other by preparing meals. I’m also a bit of a BBQ connoisseur myself, as those of you who know me can solemnly attest. While trying to enhance my BBQ craft I’ve had the opportunity to speak with and read about a few professional chefs and study how they operate. In this post I want to talk a little bit about some key lessons I took away from my observations.

If you have ever worked in food service, or have even prepared a meal for a large number of people, you know that repetition is often the name of the game. It’s not trimming one rack of ribs, it’s trimming a dozen of them. It’s not cutting one sweet potato, it’s cutting a sack of them. Good chefs strive to do these things in large quantities while still maintaining enough attention to detail so that the finished product comes out pristine. There are a lot of things that go into making this happen, but none more important than a chef mastering their environment. This isn’t too different from a security analyst who investigates hundreds of alerts per day while striving to pay an appropriate amount of attention to each individual investigation. Let’s talk about how chefs master their environment and how these concepts can be applied to information security.

Chefs minimize their body movement. If you are going to be standing up in a kitchen all day performing a bunch of repetitive and time-sensitive tasks, then you want to make sure every step or movement you make isn’t wasted. This prevents fatigue and increases efficiency.

As an example, take a look at Figure 1. In this image, you will see that everything the chef needs to prepare their dish is readily available without the chef having to take extra steps or turn around too often. Raw products can be moved from the grocery area, rinsed in the sink, sliced or cut on the cutting board, cooked on the stove, and plated without having to turn more than a few times or move more than a couple of feet.


Figure 1: A Chef’s Workspace is Optimized for Minimal Movement

Chefs learn the French phrase “mise en place” early on in their careers. This statement literally means, “put in place”, but it specifically refers to organizing and arranging all needed ingredients and tools required to prepare menu items during food service. Many culinary instructors will state that proper mise en place, or simply “mise” in shorthand, is the most important characteristic that separates a professional chef from a home cook.

There is a lot of room for mise in security investigations as well. Most analysts already practice this to some degree by making sure that their operating system is configured to their liking. They have their terminal windows configured with a font and colors that make them easy to read, they have common OSINT research sites readily accessible as browser favorites, and they have shortcut icons to all of their commonly used tools. At a higher level, some analysts even have custom scripts and tools they’ve written to minimize repetitive tasks. These things are highly encouraged.

While analysts don’t have to worry about physical movement as much, they do have to worry about mental movement. In an ideal situation an analyst can get to the end of an investigation with as few steps as possible, and a strategic organization of their digital workspace can help facilitate that. I’ve seen some organizations that seek to limit the flexibility analysts have in their workspace by enforcing consistent desktop environments or limiting access to additional tools. While policies that enforce good security and analysis practices are great, every analyst learns and processes information in a different way. Giving analysts the flexibility to configure their own operating environments isn’t just encouraged, it’s critical to helping them achieve success.

Beyond the individual analyst’s workstation, the organization can also help out by providing easy access to tools and data, along with processes that support them. If an analyst has to connect to five systems to retrieve the same data, that is too much mental movement that could be better spent formulating and answering questions about the investigation. Furthermore, if organizations limit access to raw data, it can force the analyst to make additional mental moves that slow down their progress.

Chefs make minimal trips to the fridge/pantry. When you are cooking dinner at home you likely make multiple trips to the fridge to get ingredients or to the pantry to retrieve spices during the course of your meal. That might look something like this:

“I think this soup needs a bit more tarragon, let me go get it. “

or…

“I forgot I need to add an egg to the carbonara at the end, I’ll go get it from the fridge.”

Building on the concept of mise en place, professional chefs minimize their trips to the fridge and pantry so that they always have the ingredients they need with as few trips as possible. This ensures they are focused on their task, and also minimizes prep and clean up time. They also ensure that they get an appropriate amount of each ingredient to minimize space, clean up, and waste.


Figure 2: Chefs Gather and Lay Out Ingredients for Multiple Dishes – Mise en Place

One of the most common tasks an analyst will perform during an investigation is retrieving data in an attempt to answer questions. This might include querying a NetFlow database, pulling full packet capture data from a sensor, or querying log data in a SIEM.

Inexperienced analysts often make two mistakes. The first is not retrieving enough data to answer their questions. This means that the analyst must continue to query the data source and retrieve more data until they get the answer they are looking for. This is equivalent to a chef not getting enough flour from the pantry when trying to make bread. On the flip side, another common pitfall is retrieving too much data, which is an even bigger problem. In these situations an analyst may not limit the time range of their query appropriately, or simply may not use enough filtering. The result is a mountain of data that takes a significant amount of time to wade through. This is equivalent to a chef walking back from the fridge with 100 eggs when they only intend to make a 3-egg omelet.

Learning how to efficiently query data sources during an investigation is a product of asking the right questions, understanding the data you have available, and having the data in a place that is easily accessible and reasonably consolidated. If you can do these things, you should be able to ensure you are making fewer trips back to the pantry.
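As an illustrative sketch, here is what scoping a query might look like in code. The helper and field names below are hypothetical and not tied to any particular flow analysis tool; the point is that explicit time bounds and a record cap keep the result set close to the size of the question being asked.

```python
from datetime import datetime, timedelta

def build_flow_query(src_ip, start, window_minutes=60, max_records=1000):
    """Build a scoped flow-data query. Explicit time bounds and a record
    cap prevent both the 'not enough flour' and the '100 eggs' problems."""
    end = start + timedelta(minutes=window_minutes)
    return {
        "filter": f"src_ip == {src_ip}",
        "start_time": start.isoformat(),
        "end_time": end.isoformat(),
        "limit": max_records,  # guard against retrieving a mountain of data
    }

# Scope the question to one host and a one-hour window
query = build_flow_query("192.0.2.15", datetime(2015, 6, 1, 14, 0))
```

The same discipline applies whether the backend is a NetFlow store, a SIEM, or raw logs: decide the question first, then bound the query to it.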

Chefs carefully select, maintain, and master their tools. Most chefs spend a great deal of time and money purchasing and maintaining their knives. They sharpen their knives before every use, and have them professionally refinished frequently. They also spend a great deal of time practicing different types of cuts. A dull or improperly used knife can result in inconsistently cut food, which can lead to poor presentation and even cause under or overcooked food if multiple pieces of food are cooked together but are sized differently. Of course, this could also lead to you accidentally cutting yourself. These concepts go well beyond knives; a bent whisk can result in clumped batter, and an unreliable broiler can burn food. Chefs have to select, maintain, and master a variety of tools to perform their job.


Figure 3: A Chef’s Travel Kit Provides Well-Cared For Essential Tools

In a security investigation, tools certainly aren’t everything, but they are critically important. In order to analyze network communication you have to understand the protocols involved at a fundamental level, but you also need tools to sort through them, generate statistics, and work towards decision points. Whether it is a packet analysis tool like Wireshark, a flow data analysis tool like SiLK, or an IDS like Snort, you have to understand how those tools work with your data. The more ambiguity placed between you and raw data, the greater the chance for assumptions that could lead to poor decisions. This is why it is critical to understand how to use your tools, and how they work.

Caring for tools goes well beyond purchasing hardware and ensuring you have enough servers to crunch data. At an organizational level, it requires hiring the right number of people in your SOC to help manage the infrastructure. Some organizations attempt to put that burden on the analysts, but this isn’t always scalable and often results in analysts being taken away from their primary duties. This is also the kind of “piling on” of responsibilities that results in analysts getting frustrated and leaving a job.

Beyond this, proper tool selection is important as well. I won’t delve into this too much here, but careful consideration should be given to free and open source tools, as well as the potential for developing in house tools. Enterprise solutions have their place, but that shouldn’t be the default go-to. The best work in information security in most cases is still done at the free and open source level. You should look for tools that support existing processes, and never let a tool alone dictate how you conduct an investigation.

Chefs can cook in any kitchen. When chefs master all of the previously mentioned concepts, they can apply those concepts in any location. If you watch professional cooking competitions, you will see that most chefs come with only their knife kit and are able to master the environment of the kitchen they are cooking in. For example, try watching “Chopped” sometime on Food Network. These chefs are given short time constraints and challenging random ingredients. They organize their workspace, assess their tools, make very few trips to get ingredients, and are able to produce five-star quality meals.


Figure 4: Professional Chefs Competing in an Unfamiliar Kitchen on Food Network’s Chopped

In security investigations, this is all about understanding the fundamentals. Yes, tools are important as I mentioned earlier, but you won’t always work in an environment that provides the same tools. If you only learn how to use Arcsight then you will only ever be successful in environments that use Arcsight. This is why understanding higher-level investigative processes that are SIEM-independent is necessary. Even at a lower level, understanding a tool like Wireshark is great, but you also need to understand how to work with packets using more fundamental and universal tools like tcpdump, as you may not always have access to a graphical desktop. Taking that step further, you should also understand TCP/IP and network protocols so that you can make better sense of the network data you are analyzing without relying on protocol dissectors. A chef’s fundamental understanding of food and cooking methods allows them to cook successfully in any kitchen. An analyst’s fundamental understanding of systems and networking allows them to investigate in any SOC.
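To make the fundamentals point concrete, here is a short sketch of my own (not from any particular tool) that parses the fixed 20-byte portion of an IPv4 header directly from raw bytes. An analyst who can do this isn’t dependent on any one dissector or graphical tool being available.

```python
import struct

def parse_ipv4_header(raw: bytes):
    """Unpack the fixed 20-byte IPv4 header from raw packet bytes.
    Format: version/IHL, TOS, total length, ID, flags/fragment offset,
    TTL, protocol, checksum, source address, destination address."""
    (version_ihl, tos, total_len, ident, flags_frag,
     ttl, proto, checksum, src, dst) = struct.unpack("!BBHHHBBH4s4s", raw[:20])
    return {
        "version": version_ihl >> 4,
        "header_len": (version_ihl & 0x0F) * 4,  # IHL is in 32-bit words
        "ttl": ttl,
        "protocol": proto,                        # 6 = TCP, 17 = UDP
        "src": ".".join(str(b) for b in src),
        "dst": ".".join(str(b) for b in dst),
    }

# A hand-built sample header: IPv4, TTL 64, TCP, 10.0.0.1 -> 10.0.0.2
sample = bytes([0x45, 0, 0, 40, 0, 0, 0, 0, 64, 6, 0, 0,
                10, 0, 0, 1, 10, 0, 0, 2])
fields = parse_ipv4_header(sample)
```

Wireshark and tcpdump are doing exactly this kind of decoding under the hood; knowing the layout yourself means you can verify what a dissector tells you, or work without one.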

Conclusion

Humans have been cooking food for thousands of years, and have been doing so professionally for far longer than computers have existed. While the skills needed to be a chef are dramatically different from those needed to investigate network breaches, there are certainly lessons to be learned here. Now, if you’ll excuse me, writing this has made me hungry.

* Figures 1-3 are from “The Four-Hour Chef” by Tim Ferriss. One of my favorite books.

Teaching Good Investigation Habits Through Reinforcement

The biggest responsibility that leaders and senior analysts in a SOC have is to ensure that they are providing an appropriate level of training and mentoring to younger and inexperienced analysts. This is how we better our SOCs, our profession, and ourselves. One problem that I’ve written about previously relates to the prevalence of tacit knowledge in our industry. The analysts who are really good at performing investigations often can’t describe what makes them so good at it, or what processes they use to achieve their goals. This lack of clarity and repeatability makes it exceedingly difficult to use any teaching method other than having inexperienced analysts learn through direct observation of those who are more experienced. While observation is useful, a training program that relies on it too much is flawed.

In this blog post I want to share some thoughts related to recent research I’ve done on learning methods as part of my study in cognitive psychology. More specifically, I want to talk a bit about one specific way that humans learn, and how we might frame our investigative processes to improve the investigative skills of our fellow analysts and ourselves.

Operant Conditioning

When most people think of conditioning they think of Pavlov and how he trained his dogs to salivate at the sound of a tone. That is what is referred to as learning by classical conditioning, but that isn’t what I want to talk about here. In this post, I want to instead focus on a different form of learning called operant conditioning. While classical conditioning is learning focused on a stimulus that occurs prior to an involuntary response, operant conditioning is learning related to voluntary responses and is achieved through reinforcement or punishment.

An easy example of operant conditioning would be to picture a rat in a box. This box contains a button the rat can push with its body weight, and doing so releases a treat. This is an example of positive reinforcement that allows the rat to learn the association that pressing the button results in a treat. The relationship is positively reinforced because a positive stimulus is used.

Another type of operant conditioning reinforcement is negative reinforcement. Consider the same rat in a different box with a button. In this box, a mild electrical charge is passed to the rat through the floor of the box. When the rat presses the button, the electrical charge stops for several minutes. In this case, negative reinforcement is being used because it teaches the rat a behavior by removing a negative stimulus. The key takeaway here is that negative reinforcement is still reinforcing a behavior, but in a different way. Some people confuse negative reinforcement with punishment.

Punishment is the opposite of reinforcement because it reduces the probability of a behavior being expressed. Consider the previous scenario with the rat in the electrified box, but instead, the box is only electrified when the rat presses the button. This is an example of a punishment that decreases the likelihood of the rat pressing the button.

Application to Security Investigation

I promise that all of this talk about electrifying rats is going somewhere other than the BBQ pit (I live in the deep south, what did you expect?). Earlier I spoke about the challenge we have because of tacit knowledge. This is made worse in many environments where you have access to a mountain of data but an ambiguous workflow that can allow an input (alert) to be taken down hundreds of potential paths. I believe that you can take advantage of a fundamental construct like operant conditioning to help better your analysts. In order to make this happen, I believe there are three key tasks that must occur.

Identify Unique Investigative Domains

First, you must designate domains that lend themselves to specific cognitive functions and specializations. For instance, triage requires different skill sets and cognitive processes than hunting. Thus, those are two separate domains with different workflows. Furthermore, incident response requires yet another set of skills and cognitive processes, making it a third domain of investigation. Some organizations don’t really distinguish between these domains, but they certainly should. I think there is work to be done to fully establish investigative domains (I expect lots of continued research here on my part), and more importantly, criteria for defining these domains. But at a minimum you can easily pick out a few domains relevant to your SOC, like the ones I’ve mentioned above.

Define Key Workflow Characteristics and Approaches

Once you’ve established domains you can attempt to define their characteristics. This isn’t something you do in an afternoon, but there are a few clear wins. For instance, triage is heavily suited to divergent thinking and differential diagnosis techniques. On the other hand, hunting is equally reliant on convergent and divergent thinking and is well suited to relational (link) analysis. These are characteristics you can key on in your workflows moving on to the next step.

Apply Positive and Negative Reinforcement in Tools and Processes

Once you know what paths you want analysts to take, how do you reinforce their learning so that they are compelled to do so? While some of us would like to consider a mechanism that provides punishment via electrified keyboards, positive and negative reinforcement are a bit more appropriate. Of course, you can’t give an analyst a treat when they make good decisions, but you can provide reinforcement in other ways.

For an investigation, there is no better positive stimulus than providing easy and immediate access to relevant data. When training analysts, you want to ensure they are smart about what data they gather to support their questioning. Ideally, an analyst only gathers the amount of information they need to get the answer they want. More skilled analysts are able to do this quickly without spending too much time re-querying data sources for more data or whittling excess away from data sets that are too large. Whenever an analyst has a question and your tool or process helps them answer it in a timely manner, you are positively reinforcing the use of that tool or process. Furthermore, when the answer to that question helps them solve an investigation, you are reinforcing the questions the analyst is putting forth, which helps that analyst learn what questions are most likely to help them achieve results.

Negative reinforcement can be used advantageously here as well. In many cases analysts arrive at points in an investigation where they simply don’t know what questions to ask next. With no questions to ask, the investigation can stall or prematurely end. When chasing a hot lead, this can result in frustration, despair, and hopelessness. If the tools and processes used in your SOC can help facilitate the investigation by helping analysts determine their next logical set of questions, then that can serve as negative reinforcement by removing the negative stimuli of frustration, despair, and hopelessness. At this point you aren’t only helping the analyst further a single investigation, you are once again reinforcing questions that help them learn how to further every subsequent investigation they will conduct.
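As a toy sketch of what that might look like in a tool, consider a simple lookup that surfaces suggested next questions keyed on the type of observable the analyst is examining. The categories and questions below are hypothetical examples of mine, not a prescription for any particular SOC.

```python
# A minimal "next question" playbook a SOC tool could surface when an
# analyst stalls -- removing the negative stimulus of not knowing what
# to ask next.
NEXT_QUESTIONS = {
    "suspicious_domain": [
        "Which internal hosts resolved this domain, and when?",
        "What other domains resolve to the same IP address?",
    ],
    "suspicious_host": [
        "What outbound connections has this host made recently?",
        "Are other hosts communicating with the same external peers?",
    ],
}

def suggest_next_questions(observable_type):
    """Return prompts for the analyst's next pivot, with a generic
    fallback when the observable type isn't in the playbook."""
    return NEXT_QUESTIONS.get(
        observable_type,
        ["What is the earliest related event you can find?"],
    )
```

Even a simple mechanism like this reinforces good questioning habits: each suggestion the analyst follows successfully strengthens their own internal playbook.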

Other Thoughts

While the previous sections identified some structured approaches you can take towards bettering your analysts, I had a few less structured thoughts I wanted to share in bullet points. These are ways that I think SOCs can help achieve teaching goals in everyday decisions:

  • How can you continually provide positive reinforcement to help analysts learn to make good decisions?
  • If you are making a decision for analysts, let them know. Little things like data normalization and timestamp assumptions can make a difference. Analyst knowledge of these things helps them understand their own data and how we manipulate it for their (hopeful) betterment. Less abstraction from data is critical to understanding the intricacies of complex systems.
  • You must be aware of when you punish your analysts. This occurs when a tool or process prevents the user from getting data they need, takes liberties with data, fails to produce consistent results, etc. If a process or tool is frustrating for a user, then that punishment decreases the likelihood that they will use it, even if it represents a good step in the investigation. You want to avoid, at all costs, tools and processes that steer your analysts away from good analytic practices.

Conclusion

This is another post that is pretty heavy in theory, but it isn’t so far away from reality that it doesn’t have the potential for real impact in the way you make decisions about the processes and tools used in your SOC, and how you train your analysts. As our industry continues to work on developing workflows and technologies, we have to think beyond what looks good and what feels right, and grasp the underlying cognitive processes that are occurring and the mental challenges we want to help solve. One method for doing this is a thoughtful use of operant conditioning as a teaching tool.

Theory of Multiple Intelligences for Security Analysts – Initial Thoughts

One of the more interesting concepts I’ve come to study recently is the theory of multiple intelligences, which was originally proposed in the 1980s by Dr. Howard Gardner, a developmental psychologist. The Theory of Multiple Intelligences (MI) simply states that rather than humans having a singular intelligence, we have a set of different intelligences that are independent and entirely unique. While his theory does have some detractors and competing schools of thought, it has generally been met with great intrigue and is a popular area of study for developmental, cognitive, and industrial psychology scholars alike. In this post I want to discuss the theory of MI and how I think it relates to security investigations. While you might be expecting a concise post full of conclusions with a nice bow on it, this article is more about raising questions and getting some of my notes on paper for further research.

Multiple Intelligences

We often think of intelligence as a measure of how much someone knows about something, but that more accurately describes aptitude than intelligence. An intelligence is actually a computational capacity. This is why true intelligence tests that result in intelligence quotient (IQ) scores are much more about measuring someone’s ability to learn than what they have learned. Traditionally, intelligence was viewed as a single biological construct. The theory of MI pluralizes this concept of computational capacity such that more than one of them exists, and that they exist as independent intelligences.

There are several criteria for what qualifies as a core intelligence. These include the intelligence being universal to the entire human species, having an identifiable set of core operations, and being susceptible to encoding in a symbol system where meaning can be captured. I don’t want to delve too far into these criteria here, but if you are interested you can read more in Dr. Gardner’s books mentioned at the end of the post. Gardner’s study of MI resulted in the formulation of seven intelligences, with the assertion that all humans have the full range of these intelligences. I’ll give a basic outline of those now.

  • Musical-Rhythmic: Has to do with sensitivity to sounds, rhythms, tones, and music. People with high musical intelligence can often recognize and match pitch well, and are able to sing, play instruments, and compose music. These people usually excel in careers as musicians, composers, singers, or producers.
  • Bodily-Kinesthetic: Relates to control of one’s bodily motions and the capacity to handle objects skillfully, to include a sense of timing and muscle memory. People with high bodily-kinesthetic intelligence are generally good at physical activities like working out, sports, dancing, or craftsmen activities. These people usually excel in careers as athletes, dancers, and various types of builders.
  • Logical-Mathematical: Has to do with logic, reasoning, numbers, and critical thinking. People with high logical-mathematical intelligence excel at problem solving, thinking about abstract ideas, solving complex computations, and conducting scientific experiments. These people usually excel in careers as scientists, programmers, engineers, and accountants.
  • Verbal-Linguistic: Deals with the ability to process, interpret, and form words. People with high verbal-linguistic intelligence are good at reading, writing, telling stories, and memorizing words and dates. These people usually excel in careers as writers, lawyers, journalists, and teachers.
  • Visual-Spatial: Has to do with the ability to visualize things in the mind. People with high visual-spatial intelligence excel at navigating, doing jigsaw puzzles, reading maps, recognizing patterns, interpreting graphs and charts, and daydreaming. These people usually excel in careers as architects, artists, and engineers.
  • Interpersonal: Focused on interaction with others and the ability to recognize and be sensitive to others’ moods, feelings, temperaments, and motivations. People with high interpersonal intelligence communicate effectively and empathize well with others. They often enjoy debates and excel at verbal and nonverbal communication. These people usually excel in careers as psychologists, counselors, politicians, and salespeople.
  • Intrapersonal: Focused on introspective and self-reflective capacities. People with high intrapersonal intelligence have a strong ability to assess their own strengths and weaknesses and predict their own reactions and emotions. These people usually excel in careers as writers, scientists, and philosophers.

A key takeaway under MI theory is that every human is born with each of these intelligences, but no two people have the same level of every intelligence. Even identical twins will have varying levels of each intelligence because we know that intelligence is shaped by nature and nurture. Additionally, we know that just because someone has a high level of a particular intelligence doesn’t mean that they will use that intelligence in a smart manner. For instance, someone with high logical-mathematical intelligence might choose to use their intelligence to guess lottery numbers for a living instead of applying it to one of the sciences, accounting, etc.

Intelligence and Security Investigations

Whether or not you subscribe to MI theory, it does provide an interesting approach towards viewing how and why certain people excel in different types of security investigations. There are multiple types of security investigation domains, including event-driven (triage) analysis, NSM hunting, malware analysis, and forensic response. I hold that each of these domains requires a specific balance and emphasis of abilities and computational capacity. With that in mind, it brings about an interesting question of which intelligences are most suited to particular types of security investigations.

The first thing that must be considered is whether each investigative domain is more suited to a laser or search light intellectual profile. These terms define the manner in which people typically excel in certain intelligences. A laser is a person who generally has a high elevation in one or two intelligences. A search light is a person who has an equal level of moderately elevated intelligence in three to four intelligences, but does not have a very high elevation in any one intelligence. Lasers tend to focus on one specific area or task, whereas search lights tend to work in areas that require constant surveying of multiple elements to form a bigger picture. Once each investigative domain is tied to a laser or search light profile, the individual intelligences that are most applicable can be determined.

I don’t have a lot of concrete thoughts yet related to which intellectual profiles and intelligences are suited to each investigative domain, and I certainly don’t have a thorough accounting of every relevant domain for information security. However, I do have some initial thoughts that warrant more research. I could postulate on this for quite some time, but a few things that initially come to mind include the following:

  • Most traditional computer scientists would probably think that security investigations are almost exclusively related to logical-mathematical intelligence. I’d challenge this for some investigative domains. In a lot of cases I believe visual-spatial intelligence is much more important.
  • Malware analysis tends to lean more towards a laser profile. It also requires a great deal of logical-mathematical intelligence due to the need to interpret and reverse engineer source code during static analysis.
  • Triage analysis and forensic response require visual-spatial intelligence because of all the moving parts that must be assimilated into a bigger picture. This is a product of the reliance on divergent thinking during these processes, and the need to rapidly shift to convergent thinking once a critical mass of ideas and knowledge has been reached.
  • Forensic response requires a greater degree of interpersonal intelligence due to the reliance on communication with various new and unfamiliar stakeholders. The ability to empathize and gauge moods is critical. I would guess that a search light profile would be most desired here.
  • Intelligence analysis requires an elevated interpersonal intelligence due to the need to assess motivations.
  • Analysis across most domains in a team setting requires some level of intrapersonal intelligence so that practitioners can identify their own deficiencies along the lines of alternative analysis methods.

If we can identify investigative domains and determine which intelligences are most suited to those, we can be a lot more successful in identifying the right people for those roles and educating them appropriately so that they are successful.  This is another step along the way towards converting tacit knowledge to explicit knowledge and gaining a better advantage in security analysis scenarios.

References:

Multiple Intelligences: New Horizons (2008), Howard Gardner

Frames of Mind: The Theory of Multiple Intelligences (1983), Howard Gardner

Perception, Cognition, and the Notion of “Real Time” Detection and Analysis

Preface

As a lot of folks who know me are aware, one of the areas of security that I spend the majority of my time researching is the analytic process and how the human component of an investigation works. I’ve written and spoken on this topic quite a bit, and I’ve dedicated myself to it enough that I’ve actually elected to go back to school to work on a second master’s degree focused on cognitive psychology. My hope is that I can learn more about the cognitive functions of the brain and psychological research so that I can work towards taking a lot of the tacit knowledge that is security investigation (NSM, IR, Malware RE, etc.), and turning it into codified information that can help shape how we as an industry look at the analysis of security compromises. This article (and hopefully many more to come) is related to that study.

————- Post Starts Here ————-

I’ve never been a fan of declaring concepts, theories, or ideas to be “dead”. We all know how that went when Gartner declared IDS to be dead several years ago. Last I checked, intrusion detection is still widely used and relatively successful at catching intruders when done right. Even more, the numbers don’t lie as Cisco bought Sourcefire, makers of the world’s most popular IDS technology Snort, for 2.7 BILLION dollars last year. However, I do think it’s worth closely examining ideas that may have never really had a lot of life from inception. The concept I want to discuss here is the notion of “real time detection” as it relates to detecting the activity of structured threat actors.

I’m not going to get into the semantics of what constitutes “real time” versus “near real time”, as that isn’t really the point of this article. Suffice it to say that when we talk about real time detection, we are referring to the act of investigating alerts (typically generated from some type of IDS or other detection mechanism) and making a decision about whether something bad has occurred and whether escalation to incident response is necessary. This concept relies on event-driven inputs from detection mechanisms to mark the start of the analysis process, followed by a brief investigation and quick decision-making immediately after the receipt of that input.

With a real time detection and analysis approach there is a tremendous amount of pressure to make decisions rapidly near the beginning of the analysis process. Many security operation centers (SOCs) even track metrics related to the duration of time between an alert being generated and that alert being closed or escalated (often called “dwell time”). It isn’t uncommon for these SOCs to judge the performance of analysts, or groups of analysts as a whole, based on these types of metrics. The problem with this approach is that there are fundamental psychological barriers working against the analyst in this model. In order to understand these barriers, we need to examine how the mind works and is influenced.
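As a rough illustration of the kind of per-alert metric such a SOC might track (all names and timestamps here are hypothetical, not from any specific product), the calculation itself is trivial, which is part of why it is so tempting to manage by it:

```python
from datetime import datetime, timedelta


def dwell_time(alert_generated: datetime, alert_resolved: datetime) -> timedelta:
    """Time between an alert firing and it being closed or escalated --
    the per-alert duration some SOCs use to judge analyst speed."""
    return alert_resolved - alert_generated


# Hypothetical alert handled in 47 minutes.
generated = datetime(2014, 3, 1, 9, 0)
resolved = datetime(2014, 3, 1, 9, 47)
print(dwell_time(generated, resolved))  # 0:47:00
```

The metric is easy to compute and compare across analysts, but note that nothing in it measures whether the decision reached was actually correct, which is exactly the tension this article explores.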

Limitations of the Mind

The investigation or analysis process is based around cognition, which is the process by which humans bridge the gap between perception and reality. In this sense, perception is a situation as we individually interpret it, and reality is a situation as it actually exists. Cognition can be nearly instant, such as looking at a shirt and recognizing that it is blue. In other situations, like security analysis, cognition can take quite some time. Furthermore, even after a lengthy cognition process an analyst may never fully arrive at a complete version of reality.

The thing that makes everyone different in terms of cognitive psychology is the mindset. A mindset is, essentially, the lens we view the world through. A mindset is something we are born with, and something that is constantly evolving. Any time we perceive something new, it is assimilated into our mindset and affects the next thing we will perceive. While we do have control over certain aspects of our mindset, it is impossible to be aware of or in control of every portion of it. This is especially true of the subconscious portions of the mindset that are formed early in our development. Ultimately, the mindset is a combination of nature and nurture. It takes into account where we were born, where we grew up, the values of our parents, the influence of our friends, life experiences, work experiences, relationships, and even our health.


Figure 1: Our Mindset and Perception are Related

A mindset is a good thing because it is what allows us all to think differently and be creative in unique ways. In information security, this is what allows some analysts to excel in unique areas of our craft. However, the limitations imposed by our mindset result in a few scenarios that negatively affect our perception and ability for rapid cognition.

Humans Perceive What They Expect to Perceive

The expectancy theory states that an individual will decide to behave or act in a certain way because they are motivated to select a specific behavior over other behaviors. While we often think of motivation as something overt and identifiable, that isn’t the case in most situations. Instead, these expectations and patterns of expectations are a product of our mindset, both the conscious and subconscious part of it. As an example, read the text in Figure 2.


Figure 2: An Example of Expected Perception

Now, read the text in Figure 2 again. Did you notice that the article in each of the triangles was repeated? In this example, you probably didn’t because these phrases are common vernacular that you’ve come to expect to be presented a specific way. Beyond that, additional ambiguity was introduced by forming the words in a triangle such that they are interpreted in a manner that is not conducive to spotting the anomaly (but more on that later). The key takeaway here is that we are rarely in control of how we initially perceive something.

Mindsets are Quick to Form but Resistant to Change

If we are not in full control of our mindset, then it is fair to say that we cannot be in full control of our perception. This becomes a problem because aspects of our mindset are quick to form, but resistant to change. Cognitive psychology tells us that we don’t learn by creating a large number of independent concepts; rather, we form a few base concepts or images and assimilate new information to them.

This is why we rarely notice gradual changes such as weight gain in friends and family that we see often. You are very unlikely to notice if a coworker you see every day gains twenty pounds over six months. However, if you see a friend you haven’t seen in six months, you are much more likely to notice the added weight. Even if it isn’t very obvious, you are likely to think “Something looks different about Frank.”

Highly Ambiguous Scenarios are Difficult to Overcome

A scenario that is highly ambiguous is one that is open to multiple interpretations. This is the product of a large number of potential outcomes combined with a limited amount of data from which to form a hypothesis about which outcome is most likely. A common experiment referenced to demonstrate this relationship concerns interference in visual recognition, sometimes referred to as the “blur test.” In this experiment, a subject is exposed to a blurry image that slowly comes into focus until it becomes clear enough to be identified. The independent variable in this experiment was the initial amount of blur in the image, and the dependent variable was the amount of blur remaining when the subject was able to determine what was being visually represented. The psychologists conducting the experiment presented a set of images to subjects with varying degrees of initial blur, and measured the amount of blur remaining when the subject was able to identify what the image was.

The results were really interesting, because they showed with statistical significance that when an image with a higher initial amount of blur was presented to a subject, the image had to get much clearer in order for them to identify what it actually was. Conversely, when an image was presented with a lower initial amount of blur, subjects could identify what the image represented much sooner and well before the image had come fully into focus.

The amount of initial blur in this experiment represents a simple example of varying the level of ambiguity in a scenario, which can lead us to infer that higher initial ambiguity can lengthen the amount of time required to bridge the gap between perception and reality.

Applications to Security Investigation

When we consider the nature of the investigative process, we know that it is based on being presented with a varying amount of initial data in a scenario where there are hundreds or thousands of possible outcomes. This yields a situation where a dozen analysts could come up with a dozen initial hypotheses regarding what is actually happening on a network given a single input. Each analyst forms their initial perception, and as they continue to collect more data they assimilate that data to the initial perception.

In a real world scenario where analysts often handle investigations “cradle to grave” or “alert to escalation”, this presents a scenario where the evidence that has been gathered over time is never viewed from a clear perspective that is free from initial perception (and bias). Given that network traffic, malicious binaries, and log data can be incredibly deceiving, this is a limiting factor in the success of an investigation as our initial perception in an investigation is increasingly likely to be wrong. This speaks to the limitation I previously discussed regarding how mindsets are quick to form but resistant to change, and how a large initial amount of ambiguity, which is very common in security investigations, can lead to flawed investigations.

Ultimately, the scenarios in which security investigations are conducted contain many of the characteristics in which the mind is limited in its ability to bridge the gap between perception and reality.

Alternative Approaches

Identifying problems with the approaches we often take with security analysis is quite a bit easier than figuring out how to overcome those challenges. While I don’t think there is a fix-all, and I don’t claim to present any panacea-like solutions, I think we are entering an era where analysis is becoming an incredibly important part of the security landscape, which justifies rethinking some of the ways we approach security investigations. Focusing on the cognitive problems of analysis, I think there are three things that we, as an industry, can do to improve how we get from alert to resolution. While I don’t think these three things encompass a complete paradigm shift in how alerts are investigated, I do believe they will be part of it.

Separation of Triage vs. Investigation

While multiple definitions may exist, triage in terms of event-driven security analysis typically refers to the process of initially reviewing an alert to determine whether more investigation is required, and what priority that investigation should have relative to other investigations. In the past, the triage process has been treated as part of the broader investigative process; however, they are fundamentally different activities requiring different skill sets. They are also subject to varying types and degrees of the biases and cognitive limitations I discussed earlier. Those biases and limitations are often a lesser concern during initial triage, and of much more concern to investigations that require a larger amount of time to complete.

Faster and less ambiguous analysis scenarios are still subject to bias and other limitations of the human mind to some extent, but real-world application tells us that there are many scenarios where a quick triage of an event to determine whether more investigation is required can be done on an individual basis, as long as that individual has adequate experience and is using a structured and repeatable technique. That means it is acceptable for a single human to handle the investigation of things like unstructured threats and relatively simple malware infections. After all, these things are often very clear-cut, and can usually be validated or invalidated by simply comparing network or host data to the signature which generated the alert.

On the other hand, investigations associated with structured threats, complex malware, or that are generally more initially ambiguous require a different and more lengthy approach, which is the scenario I will focus on exclusively in the next two items I will discuss. The key takeaway here is that we should treat triage and investigation as two separate but related processes.
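One way to picture triage and investigation as separate but related processes is as two distinct stages with different owners. The sketch below is purely illustrative (the classes, signature names, and priority scheme are hypothetical, not drawn from any real SOC tooling): triage ends with a close-or-escalate decision, and an escalated event becomes a new investigation assigned to someone other than the triage analyst.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Alert:
    id: str
    signature: str
    matched_data: bool  # did network/host data actually match the signature?


@dataclass
class Investigation:
    alert: Alert
    priority: int
    assigned_to: Optional[str] = None  # deliberately not the triage analyst


def triage(alert: Alert) -> Optional[Investigation]:
    """Quick, structured decision: close the alert, or open an
    investigation with a priority. Triage ends here; the deeper
    investigation is a separate process owned by a different analyst."""
    if not alert.matched_data:
        return None  # invalidated at triage: closed, no investigation opened
    # Hypothetical priority scheme: structured threats outrank everything else.
    priority = 1 if "structured-threat" in alert.signature else 3
    return Investigation(alert=alert, priority=priority)
```

The important design point is the boundary: the triage function never performs the investigation itself, it only decides whether one should exist and how urgent it is.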

Graduated Analysis

Although a single person can often perform triage-based analysis, this is not the case for more involved investigations. As evidence suggests, the analyst who performs the initial triage of an event is at a disadvantage when it comes to forming an accurate perception of what has really occurred once new data becomes available. Just as the subjects in the “blur test” were less successful in identifying an image when a larger amount of initial blur was present, analysts who are investigating a security event are less likely to identify the full chain of events if they start at a point where only minimal context details are available.

Because cognitive limitations prevent us from efficiently reforming our perceptions when new data becomes available, there is a case for performing hand-offs to other analysts at specific points during the investigation. The primary investigator shifts over time, so that the investigation gradually receives more clarity and the cognition gap narrows. Determining when these hand-offs should occur is hard to predict, since organizational structures can vary. However, at baseline it is reasonable to estimate that a hand-off should occur at least after the initial triage. Beyond this, it may make sense for hand-offs to occur at points when there is a dramatic influx of new and relevant information, or when the scope of the investigation broadens widely.

This approach creates an interesting byproduct. If all significant investigations are handed off after triage, this essentially creates an analyst who is exclusively focused on alert triage. Considering this is its own workflow requiring a unique set of skills, it can be seen as a benefit of a graduated approach. While a graduated approach doesn’t necessarily require a graduated skill level in analysts (such as level 1, 2, and 3 analysts), logic would suggest that this might be beneficial from a resource staffing perspective. In this model, only more skilled analysts examine “higher tier” investigations encompassing a great deal more data. On the other hand, some might suggest that the triage analyst should be one of the more skilled analysts, as they will be defining additional data points for collection over time that will shape the course of the investigation. There is not yet enough data to determine which approach yields the greatest benefit.

“Realistic Time Detection” in Favor of “Real Time Detection”

The nature of traditional analysis dictates that an analyst is presented with some input data and is asked to make a rapid decision about whether a breach has occurred, and to what extent. I believe that the immense pressure to make a quick and final decision is not based on the needs of the situation at hand, but rather on the unrealistic expectations we have placed on the role of the analyst. It is logically unreasonable to expect to detect and ascertain all of the pertinent details of a potential compromise in any manner that resembles real time or near real time. Even if you are able to determine that malware has been installed and C2 communication is present, you still don’t know how the attacker got in, what other machines they are interacting with, the nature of the attacker (structured or unstructured), or whether an attack is ongoing.

Research has shown that the average attacker spends 244 days on a network. With that large time range working against us, it is not entirely reasonable to shoot for detecting and explaining the presence of an attacker in anything resembling real time. Most individuals who have researched structured attackers, or have pretended to be them, will tell you that these information campaigns are focused on objectives that require quite a bit of effort to find the desired data, and that they require a long-term, persistent campaign in order to continually collect data and achieve the campaign goals. Thus, detecting, containing, and extricating an attacker from your network at day 15 isn’t a failure. It isn’t ideal, but we are dealing with circumstances that sometimes call for less-than-ideal solutions. Ultimately, I would rather focus on strategic “realistic time” detection and catch an adversary on day +12 than focus on “real time” detection and miss an adversary on day 0 due to a flawed investigative approach, only to be notified by a third party on day +200 that the attacker has been in my network for quite some time.

Focusing on a slower, more methodical approach to analysis isn’t easy, and to be honest, I don’t claim to know what that whole picture looks like. I can deduce that it contains some of the following characteristics, in addition to the notions of segregated triage and graduated analysis mentioned above:

  • Case Emphasis – The investigative process should be treated not unlike a medical case. First, symptoms are evaluated. Next, data is gathered and tests are performed. Finally, observations are made over time that are driven by desired data points. These things build until a conclusion can be made, and may take quite some time. A lack of data doesn’t warrant ignoring symptoms, but rather, a deeper focus on collecting new data.
  • Analytic Technique – Analysts should be able to identify with multiple types of analytic techniques that are well suited to their strengths and weaknesses, and also to different scenarios. There has been little study into this area, but we have a lot to learn from other fields here with techniques like relation investigation and differential diagnosis.
  • Analysis as a Study of Change – While traditional investigations focus almost exclusively on attempting to correlate multiple static data points, analysis also needs to include a focus on changes in anticipated behavior. This involves taking a baseline, taking multiple additional measurements at later points in time, and then comparing those measurements. This is a foundational approach practiced frequently in many types of scientific analysis. While some may confuse this with “anomaly-based detection”, it is a different concept, more closely associated with “anomaly-based analysis”. Currently, the industry lacks technology that supports this and other aspects of friendly intelligence collection.
  • Post-Exploitation Focus – The industry tends to focus heavily on the initial exploitation and command and control functions of an attack life cycle. We do this because it supports a real time detection model, but if we are truly to focus on realistic time detection and the study of change, we must focus on things that can more easily be measured against normal behavior and communication sequences. This lends itself towards focusing on post-exploitation activities more closely tied to attackers’ actions on objectives.
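The “study of change” idea above can be sketched very simply. This is a minimal illustration, not a real detection system: the hosts, byte counts, and the factor-of-two threshold are all hypothetical, and a real implementation would need far more careful baselining and statistics.

```python
def flag_changes(baseline: dict, current: dict, threshold: float = 2.0) -> list:
    """Compare a later measurement against a baseline measurement and
    flag hosts whose observed value grew by at least `threshold` times.
    This is anomaly-based *analysis*: comparing two points in time,
    not matching a signature against a single static data point."""
    flagged = []
    for host, base_value in baseline.items():
        observed = current.get(host, 0)
        if base_value > 0 and observed / base_value >= threshold:
            flagged.append(host)
    return flagged


# Hypothetical daily outbound bytes per host.
baseline = {"10.0.0.5": 1_200_000, "10.0.0.9": 800_000}
today = {"10.0.0.5": 1_300_000, "10.0.0.9": 4_000_000}
print(flag_changes(baseline, today))  # ['10.0.0.9']
```

Here the second host’s outbound volume has grown fivefold against its baseline, which is the kind of behavioral change, rather than a signature match, that a change-oriented analyst would investigate further.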

Conclusion

The thoughts presented here are hardly conclusive, but they are founded in scientific study that I think warrants some type of change. While I’ve suggested some major shifts that I think need to take place in order to shore up some of the deficiencies in cognition, these are merely broad ideas that I’ll be the first to admit haven’t been fully thought out or tested. My hope is that this article will serve to raise more questions, as these are concepts I’ll continue to pursue in my own research of investigative techniques and cognitive psychology.

References