Category Archives: Investigations

Research Call for Security Investigators

I’m currently seeking security investigators for a research study I’m conducting on cognition and reasoning related to the investigative process. I need individuals who are willing to sit down with me over the phone and participate in an interview focused on individual investigations they’ve worked. The interviews will be focused on describing the flow of the investigation, your thought process during it, and challenges you encountered. I’ll ask you to describe what happened and how you made specific decisions. Specifically, I’m looking for investigations related to the following areas:

  • Event Analysis: You received some kind of alert and investigated it to determine whether it was a true positive or false positive.
  • Incident Response: You received notification of a breach and performed incident response to locate and/or remediate affected machines.

Ideally, these should be scenarios where you felt challenged to employ a wide range of your skills. In either domain, the scenario doesn’t have to lead to a positive confirmation of attacker activity. Failed investigations that led to a dead end are also applicable here.

A few other notes:

  • You will be kept anonymous
  • Any affected organization names are not needed, and you don’t have to give specifics there. Even if you do, I won’t use them in the research.
  • You will be asked to fill out a short (less than five minute) demographic survey
  • The phone interview will be recorded for my review
  • The phone interview should take no longer than thirty minutes
  • If you have multiple scenarios you’d like to walk through, that’s even better
  • At most, the scenario will be generalized and described at a very high level in a research paper, but it will be done in a generic manner that is not attributable to any person or organization.

If you’d like to help, please e-mail me at with the subject line “Investigation Case Study.”

Inattentional Blindness in Security Investigations

*Disclaimer: Psychology Related Blog Post*

Joshua woke up on a frigid Friday morning in Washington, DC and put on a black baseball cap. He walked to the L’Enfant metro station terminal and found a nice visible spot right near the door where he could expect a high level of foot traffic. Once positioned, he opened his violin case, seeded it with a handful of change and a couple of dollar bills, and then began playing for about 45 minutes.

During this time thousands of people walked by and very few paid attention to Joshua. He received several passing glances while a small handful stopped and listened for a moment. Just a couple lingered for more than a minute or two. When he finished playing, Joshua had earned about twenty-three dollars beyond the money he put into the case himself. As luck would have it, twenty of those dollars came from one individual who recognized Joshua.

Joshua Bell is not just an ordinary violin player. He is a true virtuoso who has been described as one of the best modern violinists in the world, and he has a collection of performances and awards to back it up. Joshua walked into that metro terminal, pulled out a three-hundred-year-old Stradivarius violin, and played some of the most beautiful music that most of us will hear in our lifetime. That leaves the glaring question: why did nobody notice?

Inattentional Blindness

Inattentional blindness (IB) is an inability to recognize something in plain sight, and it is responsible for the scenario we just described. You may have heard this term before if you’ve had the opportunity to subject yourself to this common selective attention test:

As humans, the ability to focus our attention on something is a critical skill. You focus when you’re driving to work in the morning, when you are performing certain aspects of your job, and when you are shopping for groceries. If we didn’t have the ability to focus our attention, we would have a severely limited ability to perceive the world around us.

The tricky thing is that we have limited attention spans. We can generally only focus on a few things at a time, and the more things we try to focus on, the less overall focus can be applied to any one thing. Because of this, it is easy to miss things that are right in front of us when we aren’t focused on finding them. In addition, we also tend to perceive what we expect to perceive. These factors combine to produce situations that allow us to miss things right in front of our eyes. This is why individuals in the metro station walked right by Joshua’s performance. They were focused on getting to work, and did not expect a world-class performer to be playing in the middle of the station on a Friday morning.

Manifestations in Security

As security investigators, we must deal with inattentional blindness all the time. Consider the output shown in Figure 1. This screenshot shows several TCP packets. At first glance, these might appear normal. However, an anomaly exists here. You might not see it because it exists in a place that you might not expect it to be, but it’s there.


Figure 1: HTTP Headers

In a profession where we look at data all day it is quite easy to develop expectations of normalcy. As you perform enough investigations you start to form habits based on what you expect to see. In the case of investigating the TCP packets above, you might expect to find unexpected external IP addresses, odd ports, or weird sequences of packets indicating some type of scan. As you observe and experience these occurrences and form habits related to how you discover them, you are telling your mind to build cognitive shortcuts so that you can analyze data faster. This means that your attention is focused on examining these fields and sequences, and other areas of these packets lose part of your attention. While cognitive shortcuts like these are helpful they can also promote IB.

In the example above, if you look closely at other parts of the packets, you will notice that the third packet, a TCP SYN packet initiating the communication between and actually has a data length value of 5. This is peculiar because it isn’t customary to see data present in a TCP SYN packet whose purpose is simply to establish stateful communication via the three-way handshake process. In this case, the friendly host in question was infected with malware and was using these extra 5 bytes of data in the TCP SYN to check in to a remote host and provide its status. This isn’t a very common technique, but the data is right in front of our face. You might have noticed the extra data in the context of this article because the nature of the article made you expect something weird to be there, but in practice, many analysts fail to notice this data point.
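A check like this can be automated so it doesn't depend on an analyst happening to glance at the length field. The following is a minimal sketch, assuming raw IPv4 packets are available as bytes. The function names and the sample addresses are hypothetical, and the packet builder exists only to make the example self-contained (it leaves checksums at zero). Keep in mind that data in a SYN is unusual but not always malicious; TCP Fast Open, for example, legitimately carries it.

```python
import struct
from typing import Optional

TCP_PROTOCOL = 6
FLAG_SYN = 0x02
FLAG_ACK = 0x10


def syn_payload_length(packet: bytes) -> Optional[int]:
    """Return the TCP payload length of an initial SYN, or None for other packets."""
    if len(packet) < 20 or packet[0] >> 4 != 4:
        return None  # too short, or not IPv4
    ip_header_len = (packet[0] & 0x0F) * 4
    total_len = struct.unpack("!H", packet[2:4])[0]
    if packet[9] != TCP_PROTOCOL or len(packet) < ip_header_len + 20:
        return None
    tcp = packet[ip_header_len:]
    tcp_header_len = (tcp[12] >> 4) * 4
    flags = tcp[13]
    # Only the first packet of the handshake: SYN set, ACK clear.
    if not flags & FLAG_SYN or flags & FLAG_ACK:
        return None
    return total_len - ip_header_len - tcp_header_len


def build_ipv4_tcp(flags: int, payload: bytes) -> bytes:
    """Build a bare IPv4/TCP packet for illustration (checksums left at zero)."""
    tcp = struct.pack("!HHIIBBHHH", 49152, 80, 1, 0, 5 << 4, flags, 8192, 0, 0)
    total_len = 20 + len(tcp) + len(payload)
    ip = struct.pack(
        "!BBHHHBBH4s4s",
        0x45, 0, total_len, 0, 0, 64, TCP_PROTOCOL, 0,
        bytes([192, 0, 2, 10]),     # documentation address, not from the article
        bytes([198, 51, 100, 20]),  # documentation address, not from the article
    )
    return ip + tcp + payload


suspicious = build_ipv4_tcp(FLAG_SYN, b"hello")
length = syn_payload_length(suspicious)
if length:
    print(f"SYN carrying {length} bytes of data")
```

Running the same function across an entire capture turns a detail that is easy to overlook into a short list that is hard to ignore.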

Let’s look at one more example. In Figure 2 we see a screen populated with alerts from multiple sources fed into the Sguil console. In this case, we have a screen full of anomalies waiting to be investigated. There is surely evil to be found while digging into these alerts, but one alert in particular provides a unique anomaly that we can derive immediately. Do you see it?


Figure 2: Alerts in Sguil

Our investigative habits tell us that the thing we really need to focus on when triaging alerts is the name of the signature that fired. After all, it tells us what is going on and can relay some sense of priority to our triage process. However, take a look at Alert 2.84. If you observe the internal (RFC1918) addresses reflected in all of the other alerts, they all relate to devices in the range. Alert 2.84 was generated for a device in the range. This is a small discrepancy, but if this is not on a list of approved network ranges then there is a potential for a non-approved device on the network. Of course, this could just be a case of someone plugging a cheap wireless access point into the network, but it could also be a hijacked virtual host running a new VM spun up by an attacker, or a Raspberry Pi someone plugged into a hidden wall jack to use as an entry point on to your network. Regardless of the signature name here, this alert is now something that warrants more immediate attention. This is another item that might not be spotted so easily, even by the experienced analyst.
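The range comparison described above is simple to script. Here is a minimal sketch using Python's standard ipaddress module. The approved network and the sample addresses are hypothetical placeholders, since the real ranges depend on your own environment.

```python
import ipaddress

# RFC 1918 private address space.
RFC1918_NETWORKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]

# Hypothetical list of ranges the organization has actually approved.
APPROVED_NETWORKS = [ipaddress.ip_network("10.10.0.0/16")]


def unapproved_internal_addresses(addresses, approved=APPROVED_NETWORKS):
    """Return the RFC 1918 addresses that fall outside every approved range."""
    flagged = []
    for text in addresses:
        addr = ipaddress.ip_address(text)
        is_internal = any(addr in net for net in RFC1918_NETWORKS)
        if is_internal and not any(addr in net for net in approved):
            flagged.append(text)
    return flagged


# Hypothetical source addresses pulled from a screen of alerts.
alert_sources = ["10.10.4.7", "192.168.1.50", "8.8.8.8", "10.10.200.1"]
print(unapproved_internal_addresses(alert_sources))
```

A check like this does not replace the analyst's judgment, but it brings the odd address to the surface regardless of which signature name is drawing attention.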

Everyone is susceptible to IB, and it is something we battle every day. How can we try to avoid missing things that are right in front of our eyes?

Diminishing the Effects

The unfortunate truth is that it isn’t possible to eliminate IB because it is a product of attention. As long as we have the ability to focus our attention in one area, we will become blind to things outside of that area. With that said, there are things we can do to diminish some of these effects, improve our ability to investigate security incidents, and ensure we don’t miss as much.


The easiest way to diminish some of the effects of IB is through expertise in the subject matter. In our leading example we mentioned that there were a few people who stopped to listen to Joshua play his violin in the station. It is useful to know that at least two of those people were professional musicians themselves. Hearing the music as they walked through the station triggered the right mechanisms in their brains to allow them to notice what was occurring, compelling them to stop. This was because they are experts in the field of music and probably maintain a state of awareness related to the sound of expert violin playing. Amongst the hustle and bustle of the metro station, their brains allowed them not to miss the thing that people without that expertise had missed.

In security investigations it’s easy to see IB at work in less experienced analysts. Without a higher level of expertise, these junior analysts have not learned how to focus their attention in the right areas so that they don’t miss important things. If you hand a junior analyst a packet capture and ask them where they would look to find evil, chances are their list of places to look would be much shorter than a senior analyst’s, or it would have a number of extraneous items that aren’t worth including. They simply haven’t tuned their ability to focus attention in the right places.

More senior analysts have developed the skill to selectively apply their attention, but they rarely have the ability to codify it or explain it to another person. The better experienced analysts get at identifying and teaching this information, the better the chance that junior analysts gain the necessary expertise faster.

Directed Focus

While analysts spend most of their time looking at data, that data is often examined through the lens of tools like SIEMs, packet sniffers, and command line data manipulation utilities. Because ours is a young industry, many of these tools are very minimal and don’t provide a lot of visual cues related to where attention should be focused. This is beneficial in some ways because it leaves the interpretation fully open to the analyst, but software that isn’t opinionated in this way promotes IB. As an example, consider the output of tcpdump below. Tcpdump is one of the tools I use the most, but it provides no visual cues for the analyst.


Figure 3: Tcpdump provides little in the way of visual cues to direct the focus of attention

We can compare Tcpdump to a tool like Wireshark, which has all sorts of visual cues that give you an idea of things you need to look at first. This is done primarily via color coding, highlighting, and segmenting different types of data. Note that the packet capture shown in Figure 3 is the same one shown in Figure 4. Which is easier to visually process?


Figure 4: Wireshark provides a variety of visual cues to direct attention.

It is for this reason that tools developed by expert analysts are desirable. This expertise can be incorporated into the tool, and the tool can be opinionated such that it directs users towards areas where attentional focus can be beneficial. Taking this one step further, tools that really excel in this area allow analyst users to place their own visual cues. In Wireshark, for example, analysts can add packet comments, create custom packet coloring rules, and mark packets that are of interest. These things can direct attention to the right places and serve as an educational tool. Developing tools in this manner is no easy task, but as our collective experience in this industry evolves this has to become a focus.

Peer Review

One last mechanism for diminishing the effects of IB that warrants mention is the use of peer review. I’ve written about the need for peer review and tools that can facilitate it multiple times. IB is ultimately a limitation that is a product of an analyst’s training, experience, biases, and mindset. Because of this, every analyst is subject to his or her own unique blind spots. Sometimes we can correlate these across multiple analysts who have worked in the same place for a period of time or were trained by the same person, but in general everyone is their own special snowflake. Because of this, simply putting another set of eyes on the same set of data can result in findings that vary from person to person. This level of scrutiny isn’t always feasible for every investigation, but certainly for incident response and investigations beyond the triage process, putting another analyst in the loop is probably one of the most effective ways to diminish potential misses as a result of IB.


Inattentional blindness is one of many cognitive enemies of the analyst. As long as the human analyst is at the center of the investigative process (and I hope they always are), the biggest obstacle most will have to overcome is their own self-imposed biases and limitations. While we can never truly overcome these limitations without stripping away everything that makes us human, an awareness of them has been scientifically proven to increase performance in similar fields. Combining this increased level of metacognitive awareness with an arsenal of techniques we can use to minimize the effect of cognitive limitations will go a long way towards making us all better investigators and helping us catch more bad guys.

Investigations and Prospective Data Collection

One of the problems we face while trying to detect and respond to adversaries is the sheer amount of data we have to collect and parse. Twenty years ago it wasn’t as difficult to place multiple sensors in a network, collect packet and log data, and store that data for quite some time. In modern networks, that is becoming less and less feasible. Many others have written about this at length, but I want to highlight two main points.

Attackers play the long game. The average time from breach to discovery is over two hundred days. Despite media jargon about “millions of attacks a day” or attacks happening “at the speed of light”, the true nature of breaches is that they are not speedy endeavors from the attacker’s side. Gaining a foothold in a network, moving laterally within that network, and strategically locating and retrieving target data can take weeks or months. Structured attackers don’t win when they gain access to a network. They win once they accomplish their objective, which typically comes much later.

Long term storage isn’t economical. While some organizations are able to store PCAP or verbose log data in terms of months, that is typically reserved for incredibly well funded organizations or the gov/mil, and is becoming less common. Even on smaller networks, most can only store this data in terms of hours, or at most a few days. I typically only see long term storage for aggregate data (like flow data) or statistical data. The amount of data we generate has dramatically outgrown our capability to store and parse through that data, and this issue is only going to worsen for security purposes.

Medicine and Prospective Collection

The problem of having far too much data to collect and analyze is not unique to our domain. As I often do, let’s look towards the medical field. While the mechanics are a lot different, medical practitioners rely on a lot of the same cognitive skills to investigate afflictions to the human condition that we do to investigate afflictions to our networks. Things like fluid ability, working memory, and source monitoring accuracy all work in the same ways to help practitioners get from a disparate set of symptoms to an underlying diagnosis, and hopefully, remediation.

Consider a doctor treating a patient experiencing undesirable symptoms. Most of the time a doctor can’t look back at the evolution of a person’s health over time. They can’t take a CAT scan of a brain as it was six months ago. They can’t do an ultrasound of a pancreas as it was two weeks ago. For the most part, they have to take what they have in front of them now or what tests can tell them from very recent history.

If what is available in the short term isn’t enough to make a diagnosis, the physician can determine criteria for what data they want to observe and collect next. They can’t perform constant CAT scans, ultrasounds, or blood tests that look for everything. So, they apply their skills and define the data points they need to make decisions regarding the symptoms and the underlying condition they believe they are dealing with. This might include something like a blood test every day looking at white blood cell counts, continual EKG readings looking for cardiac anomalies, or twice daily neurological response tests. Medical tests are expensive and the amount of data can easily be overwhelming for the diagnostic process. Thus, physicians selectively collect only the data needed to support a hypothesis. Physicians call this a clinical test-based approach, but I like to conceptualize it as prospective data collection. While retrospective data analysis looks at things that have previously been collected up until a point in time, prospective data collection relies on specific criteria for what data should be collected moving forward from a fixed point in time, for a set duration. Physicians use a clinical strategy with a predominant lean towards effective use of prospective data collection because they can’t feasibly collect enough retrospective data to meet their needs. Sound familiar?

Investigating Security Incidents Clinically

As security investigators, we typically use a model based solely on past observations and retrospective data analysis. The prospective collection model is rarely leveraged, which is surprising since our field shares many similarities with medicine. We all have the same data problems, and we can all use the same clinical approach.

The symptoms our patients report are alerts. We can’t go back and look at snapshots of a device’s health over the retrospective long-term because we can’t feasibly store that data. We can look back in the near term and find certain data points based on those observations, but that is severely time limited. We can also generate a potential diagnosis and observe more symptoms to find and treat the underlying cause of what is happening on our networks.

Let’s look at a scenario using this approach.

Step 1

An alert is generated for a host (System A). The symptom is that multiple failed login attempts were made on the device’s administrator account from another internal system (System B).

Step 2

The examining analyst performs an initial triage and comes up with a list of potential diagnoses. He attempts to validate or invalidate each diagnosis by examining the retrospective data that is on hand, but is unable to find any concrete evidence that a compromise has occurred. The analyst determines that System B was never able to successfully log in to System A, and finds no other indication of malicious activity in the logs. More analysis is warranted, but no other data exists yet. In other scenarios, the investigation might stop here barring any other alerting.

Step 3

The analyst adds his notes to the investigation and prunes his list of diagnoses to a few plausible candidates. Using these hypothesized diagnoses as a guide, the analyst generates a list of prospective collection criteria. These might include:

  • System A: All successful logins, newly created user accounts, flow data to/from System B.
  • System B: File downloads, attempted logins to other internal machines, websites visited, flow data to/from System A.

This is all immensely useful data in the context of the investigation, but it doesn’t break the bank in terms of storage or processing costs if the organization needs to store the data for a while in relation to this small scope. The analyst tasks these collections to the appropriate sensors or log collection devices. 
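To make the idea concrete, here is a rough sketch of how collection criteria like these might be represented and applied. Everything in it is a hypothetical illustration rather than a feature of any particular sensor or SIEM: the `Collection` class, the event type names, the investigation identifier, and the dates are all assumptions. Each collection is scoped to a host, a set of event types, a start time, and a duration, and matching events are routed only to the investigation that requested them.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Collection:
    """One prospective collection task tied to a single investigation."""
    investigation_id: str
    host: str
    event_types: frozenset
    start: datetime
    duration: timedelta

    def matches(self, event: dict) -> bool:
        # Collect only events from the scoped host, of a requested type,
        # that occur inside the collection window.
        in_window = self.start <= event["time"] < self.start + self.duration
        return (in_window
                and event["host"] == self.host
                and event["type"] in self.event_types)


def route(events, collections):
    """Deliver each matching event to its investigation's container."""
    containers = {}
    for event in events:
        for collection in collections:
            if collection.matches(event):
                containers.setdefault(collection.investigation_id, []).append(event)
    return containers


start = datetime(2024, 1, 1)
week = timedelta(days=7)
collections = [
    Collection("INV-1", "System A",
               frozenset({"login_success", "account_created", "flow"}), start, week),
    Collection("INV-1", "System B",
               frozenset({"file_download", "login_attempt", "web_request", "flow"}),
               start, week),
]

events = [
    {"host": "System A", "type": "login_success", "time": datetime(2024, 1, 2)},
    {"host": "System B", "type": "file_download", "time": datetime(2024, 1, 3)},
    {"host": "System A", "type": "file_download", "time": datetime(2024, 1, 2)},  # not requested
    {"host": "System B", "type": "login_attempt", "time": datetime(2024, 1, 10)},  # window closed
    {"host": "System C", "type": "login_success", "time": datetime(2024, 1, 2)},  # out of scope
]

containers = route(events, collections)
print(len(containers["INV-1"]))
```

The fixed window is the important design choice here. Because each collection expires on its own, the storage cost stays bounded by the scope of the investigation rather than the size of the network.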

Step 4

The prospective collections record the identified data points and deliver them exclusively to the investigation container they are assigned to. The analyst collects these data points for several days, and perhaps refines them or adds new collections as data is analyzed.

Step 5

The analyst revisits and reviews the details of the investigation and the returned data, and either defines additional or refined collections, or makes a decision regarding a final diagnosis. This could be one of the following:

  • System B appears to be compromised and lateral movement to System A was being attempted.
  • No other signs of malicious activity were detected, and it was likely an anomaly resulting from a user who lost their password. 

In a purely retrospective model the later steps of this investigation might be skipped, and may lead the analyst to miss the ground truth of what is actually occurring. In this case, the analyst plays the long game and is rewarded for it.

Additional Benefits of Prospective Collection

In addition to the benefits of making better use of storage resources, a model that leverages prospective collection has a few other immediate benefits to the investigative process. These include:

Realistic-Time Detection. As I’ve written previously, when the average time from breach to detection is greater than two hundred days, attempting to discover attackers on your network the second they gain access is overly ambitious. For that matter, it doesn’t acknowledge the fact that attackers may already be inside your network. Detection is often hardest at the time of initial compromise because attackers are typically more stealthy at this point, and because less data exists to indicate they are present on the network. This difficulty can decrease over time as attackers get sloppier and generate more data that can indicate their presence. Catching an attacker +10 days from initial compromise isn’t as sexy as “real time detection”, but it is a lot more realistic. The goal here is to stop them from completing their mission. Prospective collection supports the notion of realistic-time detection.

Cognitive Front-Loading. Research shows us that people are able to solve problems a lot more efficiently when they are aware of concepts surrounding metacognition (thinking about thinking) and are capable of applying that knowledge. This boils down to having an investigative philosophy, a strategy for generating hypotheses, and multiple approaches for working towards a final conclusion. Using a prospective collection approach forces analysts to form hypotheses early on in the process, promoting the development of metacognition and investigation strategy.

Repeatability and Identified Assumptions. One of the biggest challenges we face is that investigative knowledge is often tacit and great investigators can’t tell others why they are so good at what they do. Defining prospective collection criteria provides insight towards what great investigators are thinking, and that can be codified and shared with less experienced analysts to increase their abilities. This also allows for more clear identification of assumptions so those can be challenged using structured analytic techniques common in both medicine and intelligence analysis. I wrote about this some here, and spoke about it last year here.


The purpose of this post isn’t to go out and tell everyone that they should stop storing data and refocus their entire SOC towards a model of prospective collection. Certainly, more research is needed there. As always, I believe there is value in examining the successes and failures of other fields that require the same level of critical thinking that security investigations also require. In this case, I think we have a lot to learn from how medical practitioners manage to get from symptoms to diagnosis while experiencing data collection problems similar to what we deal with. I’m looking forward to more research in this area.

On the Importance of Questions in an Investigation

I spend a large part of my day studying cognition related to security investigations, which can ultimately be boiled down to thinking about how we learn and process information during and around our investigative processes. As part of my research, one of my professors recently pointed me towards a TEDx video by Dan Rothstein entitled “Did Socrates Get it Wrong?”. In this fourteen minute talk Rothstein questions whether Socrates’ approach of expert-led questioning, commonly referred to as the Socratic method, was wrong. He brings up quite a few fascinating points, but ultimately concludes that Socrates was both right and wrong: strategic questioning is of the utmost importance, but it can also be an entirely student-led exercise. The key here is that asking the right question is critical for exploration, and of course, for getting to the right answer.

This has quite a few implications for security investigations. Strategic questioning as a means of finding and eliminating bias is something that immediately comes to mind, but that is not what I want to talk about here.

At a more fundamental level is questioning as the essence of the investigation process. I tend to believe that an investigation itself is simply a question. Usually something like this:

  • What happened here?
  • Did we get compromised?
  • Did APT[x] access any of our information assets?

Going one step further, I would also hypothesize that every action we take during the course of an investigation can be distilled down into a question, like these:

  • Does the activity identified in this alert match what the signature was trying to detect?
  • Did internal Host A communicate with external Host B?
  • Did the device download and execute the stage two payload of this malware family?
  • Is there a log indicating that a specific file was accessed?

Most of the time these questions don’t materialize in this form. Typically, they develop in our subconscious and analysts go forth looking for answers before they’ve articulated the question fully. I may not actually ask myself “Does the data in this PCAP match what the signature was looking for in the appropriate context?” before I go look at the signature to see what it was attempting to detect, but subconsciously that is exactly what I’m doing. Research suggests that a lot of this can be attributed to the formulation of habits or intuition (potentially in a brain structure known as the precuneus) that help us be more cognitively efficient. While this type of intuition can help us get things done faster, there is immense value in ripping these things from our subconscious into our conscious thought so that they can be articulated.

A couple things come to mind immediately when assessing the value of articulating questions consciously. First, if all of an investigation can be based on questions, we must ensure we are asking the right questions. This requires us to be consciously aware of those questions before we seek to solve them. Second, if we hope to successfully train the next generation of analysts then we have to teach them to ask the right questions, again requiring us to be consciously aware of what they are.

If you are a security investigator or are responsible for training them, consider creating a culture of articulated questions in your SOC. Before acting, attempt to determine what question you are trying to answer and share that information with your peers. I would bet that you will find this type of strategic questioning will help you ask better questions and more effectively guide your investigation towards an appropriate goal.


Dan Rothstein, “Did Socrates Get it Wrong”, TEDx Somerville – 

Working Memory and the Visual Investigative Hypothesis

Late last year I wrote a blog post focused on what I have perceived as a coming evolution of focus for security investigations. This evolution will push us into an era where the human analyst takes center stage in a security investigation, and where tools and processes will shift to augment human cognitive ability. In this article, I want to expand on some of those thoughts and describe my research on how human analysts solve investigations. This is summarized as a concept I refer to as visual investigative theory.

I want to begin by revisiting the KSU ethnographic study I called out in a previous article. When several KSU sociologists spent time performing an ethnographic study of a security operations center they had some very interesting findings. Based on those findings, I drew the following conclusions:

Investigative process knowledge is tacit. While experienced analysts have the ability to quickly solve investigations, they almost never have the ability to accurately describe what makes them so effective.

Fundamental skills and domains aren’t well established. We have an inability to identify the fundamental cognitive (not platform or technology specific) skills that are required to successfully detect and respond to compromises. Further, we have not clearly identified subdomains of the broader security investigation domain, and differentiated the cognitive skills necessary to define and excel at each of them.

Knowledge transfer is limited. Without identified skills and domains, or adequate explicit process knowledge, our ability to train less experienced analysts is hampered. Most SOCs rely exclusively on “over the shoulder” training where less experienced investigators simply watch experienced investigators work. While this has its place, a training program founded exclusively on this type of instruction is fundamentally flawed and lacks proper fundamental building blocks.

Investigations rely on intuition. The aforementioned findings lead to the conclusion that the investigative process relies heavily on intuition. Beyond tool and technology specific processes, investigators rely almost exclusively on what they might refer to as “gut feeling” to determine which steps they should take to connect the dots and solve the investigation at hand.

Examining Intuition

Intuition typically refers to the ability to understand something immediately without the need for conscious reasoning. The concept of intuition isn’t new, but its acceptance in the world of psychological research is. Psychology itself is a fairly young field, having only existed since the late 1800s and becoming exponentially more popular around the mid 1900s. Most founding fathers of psychology dismissed intuition. Even Sigmund Freud was famous for saying that “it is an illusion to expect anything from intuition.” However, that has changed in recent years with the development of more sophisticated brain imaging techniques.

If you’ve ever had a head injury where you’ve scrambled your eggs a bit, then there is a chance that you’ve been the beneficiary of an MRI scan. A newer and more advanced form of this is something called an fMRI scan, which allows doctors and researchers to measure the response level from certain areas of the brain when specific stimuli are introduced.

A group of researchers recently wanted to better understand the science behind intuition. To do this, they utilized fMRI technology to measure the response of different areas of the brain while presenting chess players of varying degrees of expertise with match scenarios designed to draw upon their sense of intuition. While chess is very different from investigating security incidents, participants in each of these tasks claim to be successful thanks in part to unexplainable, tacit intuition.

In this scenario, researchers selected two groups of chess players. The first group consisted of journeyman chess players who were familiar with the game, but would not be considered professionals or experts. The second group consisted of professional chess players with high global rankings. Both groups were presented with an image of a chessboard showing a game in progress for a short period of time. They were then asked questions relating to what moves they thought would be best next, while their neural response was measured using fMRI technology.

The results of this experiment were exciting because they identified a specific area of the brain where the chess experts showed significantly more activity than the inexperienced players. This area, called the precuneus, showed 2.1x more activity in the chess experts. This indicates that there is a biological basis for the unconscious thought that we’ve previously only been able to refer to as intuition. Because of this, many psychologists have begun to shift their beliefs such that they recognize the existence of intuition.


Figure 1: The precuneus is related to what we think of as intuition

This gets really interesting when you consider that the precuneus is also known to be responsible for portions of our working memory, and our capacity to form and manipulate mental images. Before we dive into that, let’s have a quick primer on how human memory works.

Modeling Memory

There are multiple theories and models related to how memory is organized, but the most widely accepted model breaks it down into three distinct categories.

Sensory Information Store (SIS) is the most volatile form of memory, and is associated with the lingering sensations that follow a stimulus. For instance, if you are staring at an object and close your eyes, you may still “see” the object for a brief period as though it’s printed onto the inside of your eyelids. This is an example of SIS.

Short-term Memory (STM) is volatile memory that exists in conscious thought. When you are actively thinking about something, you are using STM to do so. This is why STM is often referred to as working memory (WM). Things that we perceive and only contemplate for a short period of time that aren’t worthy of storing permanently are processed by STM. In computing terms, STM is akin to RAM.

Long-term Memory (LTM) is our most resilient form of memory. Once something gets encoded into LTM it is stored for a very long time. For input into LTM, some theorize that we only encode certain things into LTM while others propose that we encode most everything. For output from LTM, some propose that we store everything but simply can’t recall it all, while others propose that some things that are encoded eventually decay out over time. In computer terms, LTM is similar to the concept of disk storage.

For the purposes of this article, we are most concerned with short term / working memory. As with memory in general, there are multiple models for how STM is organized, but one of the most widely accepted is Baddeley’s Model of Working Memory.


Figure 2: Baddeley’s model of working memory

In Baddeley’s model, there are three components of WM that are all controlled by a central executive.

The Phonological Loop stores audible information and prevents it from decaying by continuously repeating its contents. For example, it allows you to use working memory to remember a phone number by repeating it over and over again in your head.

The Episodic Buffer holds representations that integrate multiple types of information to form a single unified representation of memory. It was a later addition to the model.

The Visuospatial Sketchpad (VSSP) allows us to mentally picture and manipulate visual information about objects. For example, if you picture a multi-colored cube rotating so that different colors face you as time advances, you are using the VSSP. It is this portion of working memory we are most concerned about for the purpose of this discussion.

Visual Investigative Hypothesis

We can apply what we just learned about working memory to the earlier discussion about intuition. As we discovered, intuition is strongly related to the precuneus. Examination of other psychology and neurology research tells us that the precuneus is involved in several different things, including (surprise!) working memory and visuospatial processing. While not definitive, this does lead us to believe that the visuospatial sketchpad and the mental visualization and manipulation of objects may be related to intuition and how humans solve complex problems.

Of course, I’m not a neuroscientist and there is still quite a bit of ongoing research here. However, I think there are many cases when this theory makes sense. For example, prolific and prodigious musicians have been known to say that they can literally “see” the music as they are composing or playing it. Individuals who practice stock trading will also speak about how they can see trends forming before they actually happen, allowing them to execute smart orders and make a sizable profit. Even going back to our earlier discussion of chess, expert chess players will state that a reason they excel at competition is their ability to “see” the board and picture future situations better than their opponents.

It would truly appear that humans excel at processing information when it’s possible for them to visualize it, so why wouldn’t the same apply to security investigations? I’ve been an analyst myself for quite some time, and I’ve also had the pleasure of working with and speaking to a lot of other analysts, and I think this does apply. It’s important to realize that in a lot of cases, people may visualize things like this subconsciously without actually realizing that they are solving problems visually. I believe that individuals who excel at solving information security investigations also solve problems visually. In fact, I think that many subconsciously see data or attackers moving through a network as they assimilate various data points from system logs, packet captures, and IDS alerts. I’ve summed this theory up into something I call the visual investigative hypothesis.

In short, the visual investigative hypothesis states that security analysts are more efficient, and more likely to arrive at a conclusion based on an accurate representation of events that occurred when they are able to visualize the relationships that represent a network compromise and build a mental picture of an attacker moving through a network.

In psychology, most principles exist as either hypotheses or theories because our understanding of the brain, while advancing, is still very limited. Many highly probable concepts and others considerably less probable will likely never advance to being considered confirmed truths, so while I do expect to mold my research into a more sound theory, I certainly don’t expect to ever definitively and quantifiably prove it as a ground truth. A great deal of my doctoral coursework will be geared towards development of the visual investigative hypothesis into a more formal theory, which will involve continued efforts interviewing security analysts and conducting case studies regarding their investigative habits, failures, and successes.

Maximizing Working Memory Effectiveness

While there is still much work to do, if you subscribe to the visual investigative hypothesis there are a few ways you can begin shifting your investigative technique towards something that is much more visual. When considering working memory, it’s important to understand that it is a finite resource. Humans only have so much capacity in working memory, just like computers have only so much RAM. Some people have a larger WM capacity while some have less. In addition, external factors like tiredness and stress can negatively affect the situational capacity of WM. Knowing WM is a finite resource can guide us towards ideas for optimizing our investigative habits and the tools we use to perform our work.

As an example, consider the magic number seven, a theory developed by Princeton psychologist George Miller. This theory states that an average person can hold seven objects in working memory, plus or minus two. This means that if I were to list twenty random objects, you are likely to only remember five to nine of them. This is the result of biology, and most likely something that can’t really be changed person to person.

This applies to the investigative process when you think about all of the various pieces of information that an analyst has to store in WM when attempting to describe an anomalous event or breach. At any given point an analyst might need to consider a pair of IP addresses, a port number, protocol, two system roles, a detection signature, a file name, a portion of a file hash, a system name, a start time, and an end time. No wonder investigations push the limits of WM capacity.
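To make the arithmetic concrete, here is a rough tally of the items listed above against Miller’s range. It assumes each value counts as one object in working memory, which is a simplification since people often chunk related values together. The names in the snippet are made up for illustration.

```python
# Items from the paragraph above and how many values each contributes.
# Assumption: every value occupies one slot in working memory.
ITEMS = {
    "IP addresses": 2,
    "port number": 1,
    "protocol": 1,
    "system roles": 2,
    "detection signature": 1,
    "file name": 1,
    "partial file hash": 1,
    "system name": 1,
    "start time": 1,
    "end time": 1,
}

MILLER_LOW, MILLER_HIGH = 5, 9  # seven, plus or minus two

total = sum(ITEMS.values())
print(total)                # 12
print(total - MILLER_HIGH)  # 3 more than even the upper bound
```

Even under this generous count, a single event asks the analyst to juggle more objects than the top of the range allows.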

Overcoming magic number seven and limitations of working memory is all about making the right information available at the right time, and in the right way. Some ways that we can do this during an investigation include:

Data Scoping: Analysts should only retrieve the information they need for the time duration required. Having too little data is a bad thing, but having too much data can be just as bad. This can be achieved by formulating concise questions before seeking data, and making sure your data sources can be queried flexibly.

Focusing on Relationships: Humans remember things better if they can associate them with existing schemas in long-term memory. If I were to tell you ten random objects and ask you to recall them an hour later, you would have trouble doing so. If I repeated the same experiment with related items like breakfast foods, your recall would be much better. We can force objects in an investigation into similar schemas by describing entities as nouns and their interactions as verbs, building graph/link representations that help us conceptualize a potential attacker’s movement through a network. One of the bigger gaps between network attackers and defenders is that attackers often think in this type of relationship-centric manner, and defenders don’t.

Rethinking Search: Searching through data itself should be less of an iterative process of querying a data source, viewing a response, and repeating. It should be more of an exploration where the analyst anchors themselves to a point in the data and they explore outward from there. This supports a relationship-centric view of security.

Visualizing Events over Time: The activities of a suspected adversary typically lend themselves well to groupings of major and minor events occurring at specific times. Using timelines to represent these groupings of events with pointers back to the source data can provide a visual construct that is useful for easing pressure on WM.

Easy to Remember Names: Long strings of characters like MD5/SHA1 hashes or even IP addresses take up valuable space in WM. Oftentimes analysts will try to remember sections of these objects, such as the last octet of an IP address or the last few characters of a file hash. Another strategy here is to assign common names to various unique hosts and files for quick reference during the investigation. I’ve done this with animals or food in the past. Thus, f527fe6879ae8bf31cbb1e5c32d0fc33 becomes Fennel, and a host’s IP address becomes Puma. This is made easier when the tools used facilitate it. Of course, protocols like DNS can make this easier too, but it’s important to remember that a DNS name simply represents a pointer to a host, and not a host in itself.
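A few of the ideas above (friendly names, noun-verb-noun relationships, and exploring outward from an anchor) can be sketched together in a few lines of Python. This is only an illustration, not a real tool: the Investigation class, the alias pool, and the IP addresses (taken from the reserved documentation ranges) are all made up for the example.

```python
from collections import defaultdict, deque

# Hypothetical pool of friendly names; any memorable word list would do.
ALIAS_POOL = ["Fennel", "Puma", "Otter", "Basil", "Heron"]


class Investigation:
    """Tracks friendly names for indicators and the relationships between them."""

    def __init__(self):
        self.aliases = {}               # raw indicator -> friendly name
        self.edges = defaultdict(list)  # friendly name -> [(verb, friendly name)]

    def alias(self, indicator):
        """Return a stable friendly name for a hash, IP address, or host."""
        if indicator not in self.aliases:
            index = len(self.aliases)
            if index < len(ALIAS_POOL):
                name = ALIAS_POOL[index]
            else:
                name = f"Entity{index}"  # fall back once the pool runs out
            self.aliases[indicator] = name
        return self.aliases[indicator]

    def relate(self, subject, verb, obj):
        """Record that one entity (a noun) acted on another (via a verb)."""
        source = self.alias(subject)
        target = self.alias(obj)
        self.edges[source].append((verb, target))

    def explore(self, anchor, max_hops=2):
        """Follow outgoing relationships breadth first, up to max_hops away."""
        start = self.alias(anchor)
        seen = {start}
        queue = deque([(start, 0)])
        found = []
        while queue:
            node, hops = queue.popleft()
            if hops == max_hops:
                continue
            for verb, target in self.edges[node]:
                found.append((node, verb, target))
                if target not in seen:
                    seen.add(target)
                    queue.append((target, hops + 1))
        return found


case = Investigation()
case.relate("f527fe6879ae8bf31cbb1e5c32d0fc33", "executed on", "192.0.2.10")
case.relate("192.0.2.10", "connected to", "198.51.100.7")

for subject, verb, obj in case.explore("f527fe6879ae8bf31cbb1e5c32d0fc33"):
    print(subject, verb, obj)
# Fennel executed on Puma
# Puma connected to Otter
```

The analyst only ever has to hold “Fennel executed on Puma” in working memory, while the raw hash and addresses stay in the lookup table for when they are needed.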


The concepts surrounding the visual investigative hypothesis aren’t new. Most of us know that the right visualizations can help us find evil better, but beyond that we don’t collectively have a lot of solid science that we can use to apply it to security investigations or how we train analysts. While I think there are some practical takeaways we can draw from this immediately, there is still much work to be done. I’m looking forward to continuing my research here and applying cognitive psychology concepts to the security investigation process.