Announcing the Investigation Theory Online Course

I’m excited to announce my newest training course, with a portion of the proceeds supporting multiple charities.

Register Here

When I first started out, learning how to investigate threats was challenging because there was no formal training available. Even in modern SOCs today, most training is centered on specific tools and relies too heavily on on-the-job learning. There has never been a course dedicated exclusively to the fundamental art and science of the investigation process…until now.

If you’re a security analyst responsible for investigating alerts, performing forensics, or responding to incidents, then this is the course that will help you gain a deep understanding of how to most effectively catch bad guys and kick them out of your network. Investigation Theory is designed to help you overcome the challenges commonly associated with finding and catching bad guys.

  • I’ve got so many alerts to investigate and I’m not sure how to get through them quickly.
  • I keep getting overwhelmed by the amount of information I have to work with in an investigation.
  • I’m constantly running into dead ends and getting stuck. I’m afraid I’m missing something.
  • I want to get started threat hunting, but I’m not sure how.
  • I’m having trouble getting my management chain to understand why I need the tools I’m requesting to do my job better.
  • Some people just seem to “get” security, but it just doesn’t seem to click for me.

Course Format

Investigation Theory is not like any online security training you’ve taken. It is modeled after a college course and consists of two parts: lecture and lab. The course is delivered on demand so you can proceed through it at your convenience. However, it’s recommended that you follow either the standard 10-week completion path or an accelerated 5-week path. Either way, there are ten modules in total, and each module typically consists of the following components:

  • 1 Core Lecture: Theory and strategy are discussed in a series of video lectures. Each lecture builds on the previous one.
  • 1 Bonus Lecture: Standalone content to address specific topics is provided in every other module.
  • 1 Reading Recommendation: While not meant to be read on pace with the course, I’ve provided a curated reading list along with critical questions to consider to help develop your analyst mindset.
  • 1 Quiz: The quiz isn’t meant to test your knowledge, but rather, to give you an opportunity to apply it to reinforce learning through critical thinking and knowledge retrieval.
  • 1 Lab Exercise: The Investigation Ninja system is used to provide labs that simulate real investigations for you to practice your skills.

Investigation Ninja Lab Environment

This course utilizes the Investigation Ninja web application to simulate real investigation scenarios. By taking a vendor-agnostic approach, Investigation Ninja provides real-world inputs and allows you to query various data sources to uncover evil and decide whether an incident has occurred and what happened. You’ll look through real data and solve unique challenges that will test your newly learned investigation skills. A custom set of labs has been developed specifically for this course. No matter what toolset you work with in your SOC, Investigation Ninja will prepare you to excel in investigations using a data-driven approach.


Get stuck in a lab? I’m just an e-mail away and can help point you in the right direction. Enjoy the labs and want to go farther? You can purchase additional access to more labs, including our upcoming “Story Mode” where you create a character and progress through eight levels of investigation scenarios while trying to attain the rank of Investigation Ninja!

Instructor Q&A

This isn’t a typical online course where we just give you a bunch of videos and you’re on your own. The results of your progress, quizzes, and labs are reviewed by me, and I provide real-time feedback as you progress. I’m available as a resource to answer questions throughout the course.

Syllabus

  1. Metacognition: How to Approach an Investigation
  2. Evidence: Planning Visibility with a Compromise in Mind
  3. Investigation Playbooks: How to Analyze IPs, Domains, and Files
  4. Open Source Intel: Understanding the Unknown
  5. Mise en Place: Mastering Your Environment with Any Toolset
  6. The Timeline: Tracking the Investigation Process
  7. The Curious Hunter: Finding Investigation Leads without Alerts
  8. Your Own Worst Enemy: Recognizing and Limiting Bias
  9. Reporting: Effective Communication of Breaches and False Alarms
  10. Case Studies in Thinking Like an Analyst

Plus, several bonus lectures!

Cost

The course and lab access are $497 for a single user license. Discounts are available for multiple user licenses where at least 10 seats are purchased (please contact me to discuss payment). A significant portion of the purchase price will go to support multiple charities including the Rural Technology Fund, the Against Malaria Foundation, and others.

You’ll receive:

  • 1-yr Access to Course Videos and Content
  • 1-yr Access to Investigation Ninja
  • A Certification of Course Completion
  • Continuing Education Credits (CPEs/CEUs)

Sign Up Now!

This course is only taught periodically and space is limited.

Spring 2017 Session 1 – Begins January 9th (SOLD OUT)

Spring 2017 Session 2 – Begins March 20

  • Register now for the March session, as pricing will increase by $100 after January 1st

Making an Impact with Local Security Conferences

Students Using Donated OSMO Coding Kits

Running a non-profit is really tough sledding. It requires a complex balance of spending just enough to raise awareness while ensuring that the donations you bring in are substantial enough to make a positive impact on the world. The absolute best way to ensure success is to partner with other people who are like-minded and willing to help.

I’m excited to announce a recent effort resulting from a partnership between the Rural Tech Fund and the great folks who run the Archcon security conference. One of the organizers, Paul, contacted me a month or so ago and asked if the RTF could use the funds generated from the conference to do some positive things in rural and low-income areas of Missouri. We made the commitment (as I do with all donations to the RTF) to use 100% of the donation to put equipment into school districts in the area. With that money, we were able to do the following:

  • Robotics and Arduino kits in the St. Louis / Mehlville area
  • Programming kits in Gladstone, MO
  • Robotics and Arduino kits in Essex, MO
  • Electronics kits in Saint Charles, MO
  • Chromebooks for programming classes in the St. Louis / Jennings area
  • Circuitry and robotics kits in the St. Louis / Mason area
  • Raspberry Pi kits in Independence, MO
  • Robotics and coding kits in El Dorado Springs, MO

With a relatively small amount of money, we were able to make donations that will directly impact around 600 students across Missouri. By utilizing giving networks like Donors Choose and matching funds from organizations like the Ewing Marion Kauffman Foundation, the value of the money was maximized to reach the greatest number of students.

Don’t just hear about the impact from me, though; take it from a couple of teachers in these classrooms:

“I am humbled and grateful for the generous donation from the Rural Technology Fund. It will be thrilling to watch the students interact with their new technology and enhance their creative potential. Expressing my thanks does not relay the full measure of emotions at this moment. I am incredibly appreciative…to the point of tears.” – Dr. Flynn (St. Louis)

“Thank you so much for seeing my vision for my students. Your contribution to my class will forever impact the students. To know that one person’s generosity can change the lives of others is the greatest gift ever. Your contribution will bring STEM to life in my class.” – Ms. Jefferson (Jennings, MO)

While Archcon already had a tangible impact on the security community, this ensures that the conference will also have a lasting impact that pays dividends for underprivileged students in the state, as well as for the state’s overall economy.

It’s sometimes hard to find massive wins like this, but this is one I’m very proud to be a part of. I want to thank Paul Jaramillo and the folks who organized and participated in Archcon. It’s a fine conference and I plan to attend myself next year.

If you run a security conference and want to help connect your conference to your community and make a similar impact, please reach out to me. Your donation is tax deductible, and I’ll commit to using 100% of it to support technology education. The RTF is a volunteer-led organization, so nothing will be eaten up by administrative costs.

Three Useful SOC Dashboards

I worked in security operations centers for a long time, and I really grew to hate dashboards. Most of them were specially designed vendor pages meant to impress folks who don’t know any better when they stroll through the SOC and glance at the wall of low-end plasmas. They didn’t really help me catch bad guys any better, and worse yet, my bosses made me ensure they were always functional. Fast forward a few years, and I end up working for a vendor who builds security products. Much to my dismay, while planning features we end up having to build these same dashboards because, despite my best efforts to persuade otherwise, CISOs consistently ask for eye candy, even while admitting that it doesn’t have anything to do with the goal of the product. Some of them even tell us, straight up, that they won’t purchase our product if it doesn’t have eye-catching visuals.

I provide that backstory to give some insight into my long, torturous relationship with useless dashboards. I talk about this enough at work that I feel like I’ve almost created a support group for people who have stress triggers associated with dashboards. If you’ve ever attended a conference talk from my good friend Martin Holste, you may know he hates dashboards even more than I do. Alas, I’m not here just to rant. I actually believe that dashboards can be useful if they focus less on looking like video games and more on helping analysts do their jobs better. So, in this post I’m going to talk about three dashboard metrics you can collect right now that are actually useful. They won’t look pretty, but they will be effective.

Data Availability

The foundation of any investigation is rooted in asking questions, forming hypotheses, and seeking answers that either prove or disprove your educated guesses. Your questioning and answer seeking will both be driven, in part, by the data you have available. If you have PCAP data, then you know you can seek answers about the context within network communication, and if you have Sysmon configured on your Windows infrastructure, you know you can look for file hashes in process execution logs.
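To make that second example concrete, here’s a minimal Python sketch of what searching Sysmon process execution logs for a file hash might look like. It assumes the events have been exported to newline-delimited JSON by a log shipper; the field names are assumptions and will vary depending on your pipeline.

```python
import json

def find_hash_in_sysmon(log_path, file_hash):
    """Search exported Sysmon process creation events (Event ID 1) for a file hash."""
    matches = []
    with open(log_path) as f:
        for line in f:
            event = json.loads(line)
            if event.get("EventID") != 1:        # process creation events only
                continue
            hashes = event.get("Hashes", "")     # e.g. "SHA256=...,MD5=..."
            if file_hash.lower() in hashes.lower():
                matches.append({
                    "time": event.get("UtcTime"),
                    "image": event.get("Image"),
                    "command_line": event.get("CommandLine"),
                })
    return matches

# Example: check whether a suspicious hash ever executed on this endpoint
# print(find_hash_in_sysmon("sysmon.ndjson", "e3b0c44298fc1c14..."))
```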

While the existence of a data source is half the battle, the other half is retention. Some sources might have a specific time window. You might store PCAP for 3 days and flow data for 90 days, for example. Other data sources will probably use a rolling window, like most logs on Windows endpoints, which are given a disk quota and roll over when that quota is met. In both cases, the ability to quickly ascertain the availability of the data you have to work with is critical for an analyst. In short, if the data isn’t there, you don’t want to waste time trying to look for it. I contend that any time spent gathering data is wasted time, because the analyst should spend most of their time in the question and answer process or drawing conclusions based on data they’ve already retrieved.

A data availability section on a live dashboard helps optimize this part of the analyst workflow by providing a list of every data source and the earliest available data.

Data availability dashboard example

In the example above, I’ve created a series of tiles representing five different data types common to a lot of SOCs. Each tile boldly displays the name of the data source and the earliest available date and time of data for it. In this example, I’ve also chosen to color code certain tiles. Data sources with a fixed retention period are green, while sources with a rolling retention period based on a disk quota are yellow or red. I’ve chosen to highlight endpoint logs in red because those are not centralized and are more susceptible to a security event causing the logs to roll faster. The idea here is to convey a sense of urgency to the analyst if they need to gather data from a particular source. While PCAP, flow, and firewall logs are likely to still be there a few hours later, things can happen that will purge domain auth and Windows endpoint logs.

Ideally, this dashboard component is updated quickly and in an automated fashion. At minimum, someone updating this manually once a day will still save a lot of time for the individual analyst or collective group.
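Here’s a rough sketch of how that automation might look in Python. The directory paths and the “oldest file on disk” heuristic are placeholders; in practice each callable would wrap whatever query actually answers “what is the earliest record you hold?” for that source.

```python
import os
from datetime import datetime, timezone

def oldest_file_mtime(path):
    """Modification time of the oldest file under a directory; a rough stand-in
    for 'earliest available data' in a file-backed store."""
    mtimes = [
        os.path.getmtime(os.path.join(root, name))
        for root, _, names in os.walk(path)
        for name in names
    ]
    if not mtimes:
        return None
    return datetime.fromtimestamp(min(mtimes), tz=timezone.utc)

# Hypothetical data sources; swap each callable for whatever answers the
# retention question in your environment (Elasticsearch aggregation, SiLK
# query, a stat of the oldest PCAP file, and so on).
DATA_SOURCES = {
    "PCAP": lambda: oldest_file_mtime("/data/pcap"),
    "Flow": lambda: oldest_file_mtime("/data/flow"),
    "Firewall Logs": lambda: oldest_file_mtime("/data/firewall"),
    "Domain Auth Logs": lambda: oldest_file_mtime("/data/domain_auth"),
    "Windows Endpoint Logs": lambda: oldest_file_mtime("/data/endpoint"),
}

def data_availability():
    """Build the tile data: one (source, earliest timestamp) pair per source."""
    tiles = {}
    for name, earliest in DATA_SOURCES.items():
        try:
            tiles[name] = earliest()
        except OSError:
            tiles[name] = None  # source unreachable; worth flagging on the board
    return tiles

if __name__ == "__main__":
    for source, earliest in data_availability().items():
        print(f"{source:<25} earliest data: {earliest or 'UNKNOWN'}")
```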

Open Case Status

Most SOCs use some form of case tracking or management system. While there aren’t a lot of really great options designed with the SOC in mind, people find ways to make tools like RTIR, Remedy, Archer, and JIRA work. If integrated properly, the case management system can be a powerful tool for facilitating workflow when you assign users to cases and track states properly. This can be a tremendous tool for helping analysts stay organized, either through self-organization or peer accountability.

 

Open case status dashboard example

In this example, I’ve gone with a simple table displaying the open cases. They are sorted and color coded by alive time, which is the time since the case was opened. As you might expect, things that have been pending for quite some time are given the more severe color, as they require action. This could, of course, be built around SLAs or internal guidelines you use for required response and closure times.

The important thing here is that this dashboard component shows the information the analyst needs to know. It provides the ability to determine what is open (case number), who they can talk to about it (owner), how serious it is (status), what it’s waiting on (pending), and how long we’ve known about the issue (alive).
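Generating those rows is straightforward. The Python sketch below sorts open cases by alive time and maps each one to a color; the thresholds and case fields are hypothetical and should be replaced with your own SLAs and whatever your case management system exposes.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical thresholds; tune these to your own SLAs or internal guidelines.
SEVERITY_THRESHOLDS = [
    (timedelta(hours=4), "green"),
    (timedelta(hours=24), "yellow"),
]
OVERDUE_COLOR = "red"

def alive_time(case, now=None):
    """Time since the case was opened."""
    now = now or datetime.now(timezone.utc)
    return now - case["opened"]

def severity_color(case, now=None):
    """Map alive time onto a color for the dashboard row."""
    age = alive_time(case, now)
    for threshold, color in SEVERITY_THRESHOLDS:
        if age <= threshold:
            return color
    return OVERDUE_COLOR

def open_case_rows(cases):
    """Sort open cases oldest-first and attach the display fields."""
    rows = []
    for case in sorted(cases, key=lambda c: c["opened"]):
        rows.append({
            "case": case["case"], "owner": case["owner"],
            "status": case["status"], "pending": case["pending"],
            "alive": str(alive_time(case)).split(".")[0],
            "color": severity_color(case),
        })
    return rows

# Example input pulled from whatever case management API you use.
cases = [
    {"case": "2017-0142", "owner": "kim", "status": "investigating",
     "pending": "memory image", "opened": datetime(2017, 1, 9, 3, 0, tzinfo=timezone.utc)},
]
for row in open_case_rows(cases):
    print(row)
```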

Unsolved Mysteries

On any given day an analyst will run into things that appear to be suspicious, but for which there is no evidence to confirm that suspicion. These unsolved mysteries are usually tied to a weird external IP address or domain name, or perhaps an internal user or system. In a single-analyst SOC this is easily manageable, because if that analyst runs across the suspicious thing again it is likely to draw attention. That is a tougher proposition in a larger SOC, however, because there is a chance that a completely different analyst is the one who runs across the suspicious entity the second time. In truth, you could have half a dozen analysts who encounter the same suspicious thing in different contexts without any of them knowing about the others’ findings. Each encounter could hold a clue that will unravel the mystery of what’s going on, but without the right way to facilitate that knowledge transfer, something could be missed.

As a dashboard component, using watch lists to spread awareness of suspicious entities is an effective strategy. To use it, analysts must have a mechanism for adding things to a watch list, which is displayed on a screen for reference. Any time an analyst runs across something that looks suspicious but can’t quite pin it down, they first check the screen, and if it’s not on there, they add it. Everything that shows up on this list is automatically cycled off every 24-48 hours unless someone else puts it back on there.

Suspicious entity watch list example

In this component, I’ve once again chosen a simple table. It provides the thing that is weird (item), who to talk to about it (observer), when it was observed in the data (date), and where you can go to find out the context of the scenario in which it was found (case), if there is any.
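A watch list like this is simple enough to prototype. The sketch below keeps entries in memory and expires them after a fixed TTL; the 48-hour window and the field names mirror the table described above, but a shared SOC would back this with a small database or key-value store so every analyst sees the same list.

```python
from datetime import datetime, timedelta, timezone

WATCHLIST_TTL = timedelta(hours=48)  # anything not re-added within this window drops off

# In-memory store for illustration only.
_watchlist = {}

def add_observation(item, observer, case=None, when=None):
    """Add (or refresh) a suspicious entity. Re-adding resets its expiry."""
    _watchlist[item] = {
        "item": item,
        "observer": observer,
        "date": when or datetime.now(timezone.utc),
        "case": case,
    }

def current_watchlist(now=None):
    """Return non-expired entries for display, newest first."""
    now = now or datetime.now(timezone.utc)
    live = [e for e in _watchlist.values() if now - e["date"] <= WATCHLIST_TTL]
    return sorted(live, key=lambda e: e["date"], reverse=True)

# Example usage with made-up entities
add_observation("203.0.113.45", observer="sanders", case="2017-0142")
add_observation("weird-domain.example", observer="kim")
for entry in current_watchlist():
    print(entry)
```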

Conclusion

A dashboard doesn’t have to use a fancy chart type or have lasers to be useful. In this post I described three types of information that are useful in a SOC when displayed on a shared dashboard. The goal is to use group dashboards to help analysts save time and be more efficient in their investigations. If you have the capacity to display this information, you’ll be well on your way to doing both of those things.

 

Do you have a really useful dashboard idea that you think is relevant in most SOCs? Let me know and I might blog about it down the road in a follow up.

Interested in learning more about the investigation process and how these dashboards fit in? Sign up for my mailing list to get first shot at my upcoming course focused entirely on the human aspect of security investigations.

The Effects of Opening Move Selection on Investigation Speed

What follows is a shortened version of a longer paper that will be released at a later time. You can also learn more about this research by watching my recent Security Onion Conference 2016 video where I discuss these results and other similar experiments.

Background

The core construct of computer network defense is the investigation. It’s here where human analysts receive anomalous signals and pursue evidence that might prove the existence of an information system breach. While little formalized methodology related to the investigation process exists, research in this area is beginning to emerge.

Existing research in human thought and knowledge acquisition is applicable to information security and computer network defense. Daniel Kahneman provided modern research in support of a dual-process theory of thinking that defines two separate processes governing human thought. The first process, called intuitive or system 1 thinking, is automatic and usually kicks in without you directly acknowledging it. You don’t have to think about brushing your teeth or starting your car; you just do it. The second process, called reflective or system 2 thinking, is deliberate and requires attentive focus. You have to think about math problems and your next move when playing checkers. Both of these systems are in play during the investigation process.

In an investigation, analysts use intuitive thought when pursuing data related to an alert. This is most often the case when the analyst makes their opening move. The analyst’s opening move is the first data query they execute after receiving an alert. By virtue of it being the first move, the analyst doesn’t apply a lot of explicit reflective thought to the issue; they simply jump to the place they believe will provide the quickest and most definitive answer. It’s assumed that in these cases, the analyst’s first move is probably the data source they perceive as being the most valuable.

The goal of the present research is to determine which common data source analysts are most likely to use as their opening move, and to assess the impact of that first move on the speed of the investigation.

Methods

The foundation of this research was a purpose-built investigation simulator. The simulator was built to recreate the investigation environment in a tool-agnostic manner, such that individual scenarios could be loaded for a sample population and the variables could be tightly controlled.

A pool of security analysts was selected based on their employment history. Every analyst selected was currently, or had recently been, in a role where they were responsible for investigating security alerts to determine if a security compromise had occurred. Demographic information was collected, and analysts were placed into three skill groups based on their qualifications and level of experience: novice, intermediate, or expert.

 

Group A – Exploit Kit Infection

The primary experiment group was asked to connect to the investigation simulator remotely and work through the investigation scenario provided to arrive at a conclusion as to whether an infection or compromise had successfully occurred.

The scenario presented the user with a Suricata IDS alert indicating that an internal host visited an exploit kit landing page.

 


Figure 1: The Suricata IDS alert that initiated the simulation

The following data sources were provided for investigation purposes:

Data Source | Query Format | Output Style
Full packet capture (PCAP) | Search by IP or port | TCPDump
Network flow/session data | Search by IP or port | SiLK IPFIX
Host file system | Search by filename | File path location
Windows Logs | Option: User authentication / process create logs | Windows event log text
Windows Registry | Option: Autoruns / System restore / application executions (MUI cache) | Registry keys and values
Antivirus Logs | Search by IP | Generic AV log text
Memory | Option: Running process list / Shim cache | Volatility
Open Source Intelligence | Option: IP or domain reputation / file hash reputation / Google search | Text output similar to popular intelligence providers

Table 1: Data sources provided for Group A

Subjects were instructed that they should work towards a final disposition of true positive (an infection occurred), or false positive (no infection occurred). Whenever they had enough information to reach a conclusion, they were to indicate their final disposition in the tool, at which point the simulation exited.

The simulator logged every query each analyst made during the experiment, along with its timestamp and the investigation start and end times. This produced a timeline of the analyst’s entire investigation, which was used to evaluate the research questions.

 

Group B – PCAP Data Replaced with Bro Data

Based on the results achieved with Group A, a second non-overlapping sample group of analysts was selected to participate in another experiment. Since Group A indicated a preference for higher context PCAP data, the second scenario removed the PCAP data option and replaced it with Bro data, another high context data source that is more structured and organized. The complete list of data sources provided to this group was:

Data Source | Query Format | Output Style
Bro | Search by IP | Bro
Network flow/session data | Search by IP or port | SiLK IPFIX
Host file system | Search by filename | File path location
Windows Logs | Option: User authentication / process create logs | Windows event log text
Windows Registry | Option: Autoruns / System restore / application executions (MUI cache) | Registry keys and values
Antivirus Logs | Search by IP | Generic AV log text
Memory | Option: Running process list / Shim cache | Volatility
Open Source Intelligence | Option: IP or domain reputation / file hash reputation / Google search | Text output similar to popular intelligence providers

Table 2: Data sources provided for Group B

All experiment procedures and investigation logging measures remained in place, consistent with group A.

Group C – Survey Group

A third, semi-overlapping group was selected at random to collect self-reported statistics, assessing which opening move analysts said they would be most likely to make given a generic investigation scenario.

Using a combination of manual polling and responses collected from Twitter polls, analysts were asked the following question:

In a normal investigation scenario, what data source would you look at first?

The multiple-choice options presented were:

  1. PCAP
  2. Flow
  3. Open Source Intelligence
  4. Other

Results

The first item evaluated was the distribution of opening moves. Simply put, what data source did analysts look at first?

In Group A, 72% of analysts chose PCAP as their first move, 16% chose flow data, and the remaining 12% chose OSINT. The observed numbers differ significantly from the numbers analysts reported during informal polling. In the Group C polling, 49% of analysts reported PCAP would be their first move, 28% chose flow data, and 23% chose OSINT.


Chart 1: Opening move selection observed for Group A

The mean time to disposition (MTTD) metric was calculated for each first-move group by determining the difference between the investigation start and end times for each analyst and averaging the results of all analysts within the group. Analysts who chose PCAP had a MTTD of 16 minutes, those who chose flow had a MTTD of 10 minutes, and those who chose OSINT had a MTTD of 9 minutes.


Chart 2: Time to disposition for Group A
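For clarity, here is roughly how the MTTD calculation works when applied to the simulator’s query log. The session records below are made-up examples in a simplified format, not the actual experiment data.

```python
from collections import defaultdict
from statistics import mean

# Each record is one simulator session: the ordered list of data sources the
# analyst queried, plus investigation start/end times in minutes. This format
# is illustrative; the real simulator logged per-query timestamped records.
sessions = [
    {"analyst": "a01", "queries": ["pcap", "osint"], "start": 0, "end": 14},
    {"analyst": "a02", "queries": ["flow", "pcap"], "start": 0, "end": 11},
    {"analyst": "a03", "queries": ["osint", "flow"], "start": 0, "end": 9},
]

def mttd_by_opening_move(sessions):
    """Group sessions by the first data source queried and average the durations."""
    durations = defaultdict(list)
    for s in sessions:
        opening_move = s["queries"][0]  # the analyst's first query
        durations[opening_move].append(s["end"] - s["start"])
    return {move: mean(times) for move, times in durations.items()}

print(mttd_by_opening_move(sessions))  # e.g. {'pcap': 14, 'flow': 11, 'osint': 9}
```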

 

In Group B, where PCAP data was replaced with Bro data, 46% of analysts chose Bro data as their first move, 29% chose OSINT, and 25% chose flow.


Chart 3: Comparison of group A and B opening moves

Analysts who chose Bro had a MTTD of 10 minutes, while those who chose flow and OSINT had MTTDs of 10 minutes and 11 minutes, respectively.


Chart 4: Comparison of group A and B average time to close

Discussion

While not entirely conclusive, the data gained from this research does provide several suggestions. First, given that an overwhelming 72% of people chose to begin their investigation with PCAP data, it’s clear that analysts prefer a higher context data source when it’s available, even if other lower context data sources are also available. In these simulations there were multiple ways to come to the correct conclusion, and PCAP data did not have to be examined at all to reach it.

The data also suggests that an opening move to a high context but relatively unorganized data source can negatively affect the speed at which an analyst reaches an appropriate conclusion. The MTTD for analysts whose opening move was PCAP in Group A was significantly higher than for those who started with the lower context flow and OSINT data sources. This is likely because PCAP data contains extraneous data that isn’t beneficial to the investigator, and it takes much longer to visually parse and interpret. Examining the results of the Group B experiment further supports this finding. PCAP was replaced with Bro log data, which generally contains most of the same useful information that PCAP provides, but organizes it in a much more intuitive way that makes it easier to sift through. Analysts who chose Bro data for their opening move had a MTTD that was much lower than PCAP and comparable to the flow and OSINT data sources.

The comparison between observed and reported opening moves highlights another finding: analysts often don’t understand their own tendencies during an investigation. There was a significant difference between the number of people who reported they would choose to investigate an anomaly with PCAP and those who actually did. Opening move selection is somewhat situational, however, so the present study did not introduce enough unique simulations to truly validate the statistics supporting that finding.

Possible limitations for this study mostly center on the limited number of trials, as only one simulation (albeit modified for one group) was used. More trials would serve to strengthen the findings. In addition, there is some selection bias towards analysts who are more specialized in network forensics than host forensics, which likely accounts for why no first moves targeted host-based data. Additionally, in the simulations conducted here, access to all data sources took an equal amount of time. In a real-world scenario, some data sources take longer to access. However, since PCAP and other higher context data sources are usually larger on disk, the added time to retrieve this data would only strengthen the finding that PCAP data negatively affects investigation speed.

Conclusion

Overall, this research provides insight into the value of better organized, higher context data sources. While PCAP data contains an immense level of context, it is also unorganized, hard to filter and sift through compared to other data types, and full of extraneous data not useful for furthering the investigation. To improve investigation efficiency, it may be better to make opening moves that start with lower context data sources so that a smaller net can be cast when it comes time to query higher context sources. Furthermore, when more organized higher context data sources are available, they should be used.

While the present research isn’t fully conclusive due to its sample size and a limited number of simulation trials, it does provide unique insight into the investigation process. The methods and strategy used here should be applicable for additional research to further confirm the things this data suggests.

 

Interested in learning more about the investigation process, choosing the right data sources to look at, and becoming a better analyst? Sign up here to be the first to hear about my new analyst training course being released later this year. Mailing list subscribers will get the first opportunity to sign up for the exclusive web-based course, and space is limited. Proceeds from this course will go to benefit charity.

Video: Tracking Investigations with Timelines

As humans, we rely on visualizing things to solve problems, even when we don’t realize it. In this video, I want to talk about how you can use timelines to visualize investigations. This is useful for tracking active investigations, retracing your steps and identifying gaps in your analysis, and relaying investigation output to management.

If you like this video, you’ll enjoy the course it’s part of, which I’m releasing in a few months. You can learn more about the course by signing up for my mailing list.

In this thirty-minute video I illustrate the complexity of investigations and describe why visualizations are important. From there, I explain how timelines can fill this gap and which types of events are worth tracking on a timeline. Finally, I use VisJS to provide an example of how you can create simple timelines to track your investigations.
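If you want to experiment before watching, here is a small Python sketch that writes an investigation timeline out as a self-contained HTML page using the vis-timeline (VisJS) standalone build. The events are made up, and the CDN paths are illustrative and may differ from the version used in the sample resources listed below.

```python
import json

# Hypothetical investigation events; in practice these would come from your
# case notes or the query log of whatever tools you used.
events = [
    {"id": 1, "content": "IDS alert received", "start": "2017-01-09T08:12:00"},
    {"id": 2, "content": "Queried flow data", "start": "2017-01-09T08:20:00"},
    {"id": 3, "content": "Suspicious EXE identified", "start": "2017-01-09T09:05:00"},
    {"id": 4, "content": "Disposition: true positive", "start": "2017-01-09T10:30:00"},
]

# Minimal page using the vis-timeline standalone build; CDN paths may vary by version.
PAGE = """<!DOCTYPE html>
<html>
<head>
  <script src="https://unpkg.com/vis-timeline/standalone/umd/vis-timeline-graph2d.min.js"></script>
  <link href="https://unpkg.com/vis-timeline/styles/vis-timeline-graph2d.min.css" rel="stylesheet"/>
</head>
<body>
  <div id="timeline"></div>
  <script>
    var items = new vis.DataSet(%s);
    new vis.Timeline(document.getElementById("timeline"), items, {});
  </script>
</body>
</html>
"""

with open("investigation-timeline.html", "w") as f:
    f.write(PAGE % json.dumps(events))
```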

I’ve also included the following resources:

  • A sample timeline using VisJS
  • A directory structure and HTML page for managing timelines

You can download these resources here.