Step 1 – Brainstorm / Generate Hypothesis / Think Steps
The second stairway begins with a brainstorming session. Does your CISO ever ask whether you are susceptible to the new threat they saw on the news? This stairway is for those moments too, because you will most likely brainstorm the likelihood by researching the threat. This step is also where you generate a hypothesis and your Think Steps.
Step 2 – Determine Scope
The scope, or goals, of the analysis need to be determined, along with the environment and data sources you will need for your analysis. Remember, the weather forecast workflow taught us it is important to understand what environment you are collecting observables from.
Step 3 – KAC / Devil’s Advocate
A Key Assumptions Check (KAC) is an analysis technique where you list and review every assumption you hold about a topic, then determine the likelihood of each. It is a great way to flush out any biases you might have regarding the hypothesis or topic. It is also a great time to apply the Devil's Advocate technique, in which you deliberately try to think of every possible alternative to the topic at hand, or in this case, the hypothesis.
Step 4 – Compile Data / QoI Check
Once you know what data sources you need, you can start to compile the data. With the data compiled, you can perform what is called a Quality of Information Check, or QoI. A QoI evaluates the completeness of the available information as well as the data sources. This check is important because it can identify information gaps; if you discover one, a new information or intelligence requirement can be created. It can also help boost confidence levels in analytic decisions.
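A QoI check can be as simple as measuring how complete each field in your compiled dataset is. The sketch below is a minimal, hypothetical illustration: the field names and sample records are made up for this example, not drawn from any real dataset.

```python
# Hypothetical Quality of Information (QoI) check: field names and
# records below are illustrative assumptions, not a real dataset.

REQUIRED_FIELDS = ["timestamp", "src_ip", "hostname", "office"]

def qoi_check(records):
    """Return the completeness ratio (0.0-1.0) for each required field.

    A ratio below 1.0 flags an information gap, which may justify
    creating a new information or intelligence requirement.
    """
    total = len(records)
    report = {}
    for field in REQUIRED_FIELDS:
        present = sum(1 for r in records if r.get(field) not in (None, ""))
        report[field] = present / total if total else 0.0
    return report

records = [
    {"timestamp": "2023-05-01T10:00", "src_ip": "10.0.0.5",
     "hostname": "ws01", "office": "SD"},
    {"timestamp": "2023-05-01T10:05", "src_ip": "10.0.0.9",
     "hostname": "", "office": "San Diego"},  # hostname missing: a gap
]

gaps = {f: r for f, r in qoi_check(records).items() if r < 1.0}
print(gaps)  # {'hostname': 0.5}
```

Here the hostname field is only half populated, so an analyst would either track down the missing values or note the gap when stating confidence in any conclusion that depends on hostnames.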
Step 5 – Clean Data / Omit Useless Data
When I say clean the data, I mean ensure the dataset is organized under a common taxonomy. When this does not happen it can be extremely irritating and can result in an incomplete dataset. For example, if a location field is listed as San Diego in a few records, SD in a few others, and SDCA in still others, things get confusing quickly: if you go to query all the logs from the SD office, you will not get the results from the records with San Diego or SDCA listed. This is also the time to omit, or get rid of, any useless data that is not relevant to your investigation.
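The San Diego example above can be sketched in code. This is a minimal illustration of normalizing one field to a common taxonomy and omitting useless records; the mapping table and record layout are assumptions made for the example.

```python
# Sketch of cleaning a dataset to a common taxonomy, using the
# San Diego / SD / SDCA example. The mapping is illustrative.

LOCATION_MAP = {
    "san diego": "SD",
    "sd": "SD",
    "sdca": "SD",
}

def clean_records(records):
    """Normalize the office field; drop records with no usable office."""
    cleaned = []
    for r in records:
        office = LOCATION_MAP.get(str(r.get("office", "")).strip().lower())
        if office is None:
            continue  # omit data useless to this investigation
        cleaned.append({**r, "office": office})
    return cleaned

records = [
    {"host": "ws01", "office": "San Diego"},
    {"host": "ws02", "office": "SD"},
    {"host": "ws03", "office": "SDCA"},
    {"host": "ws04", "office": ""},  # no location: omitted
]

sd_records = [r for r in clean_records(records) if r["office"] == "SD"]
print(len(sd_records))  # 3 -- all three spellings now match one query
```

After cleaning, a single query for the SD office returns all three matching records instead of only the one that happened to use that exact spelling.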
Step 6 – EDA / Visualize and Regression
Exploratory Data Analysis (EDA) is a form of analysis where you are given a dataset, but not necessarily a hypothesis or data model to match it to; you explore the data in order to generate a hypothesis. This is also the time to visualize the data and perform Regression Analysis, in which you attempt to find relationships between variables in a dataset. Since you already have a hypothesis, this step is OPTIONAL, and you could move straight on to Confirmatory Analysis. Personally, I still like to ensure I am correct by looking over the data once more, but that is a case-by-case opinion; sometimes you just know. However, if you do decide to perform EDA and you find discrepancies that disprove your hypothesis, you can go back to square one or update your hypothesis as needed before moving on to Confirmatory Analysis.
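To make Regression Analysis concrete, the sketch below fits a least-squares line between two illustrative variables. The variable names and numbers are invented for the example; the point is only to show how a slope exposes a relationship worth exploring further.

```python
# Hedged sketch of simple linear regression during EDA. The variables
# (failed logins vs. alerts raised) and values are made up.

def linear_regression(xs, ys):
    """Return (slope, intercept) of the least-squares line y = slope*x + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    slope = cov / var
    return slope, mean_y - slope * mean_x

failed_logins = [1, 2, 3, 4, 5]
alerts_raised = [2, 4, 6, 8, 10]  # perfectly linear, for illustration

slope, intercept = linear_regression(failed_logins, alerts_raised)
print(slope, intercept)  # 2.0 0.0 -- a strong positive relationship
```

A non-zero slope like this suggests the two variables move together, which could become (or reinforce) a hypothesis; a slope near zero would suggest no linear relationship and might send you back to revise the hypothesis.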
Step 7 – Confirmatory
Confirmatory Analysis is when you put your hypothesis or hypotheses to the test using the Think Steps. If you are unable to validate your hypothesis, you can return to Step 6 and perform further Exploratory Analysis.
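A confirmatory check can often be reduced to a yes/no test of the hypothesis against the data. The sketch below is hypothetical: the hypothesis ("the SD office accounts for the majority of alerts"), the records, and the threshold are all assumptions invented for illustration.

```python
# Hypothetical confirmatory check. Hypothesis, data, and threshold
# are illustrative assumptions, not from the text.

def confirm_hypothesis(alerts, office, threshold=0.5):
    """Return True if `office` accounts for more than `threshold` of alerts."""
    matching = sum(1 for a in alerts if a["office"] == office)
    return matching / len(alerts) > threshold

alerts = [
    {"office": "SD"}, {"office": "SD"},
    {"office": "SD"}, {"office": "NY"},
]

if confirm_hypothesis(alerts, "SD"):
    print("hypothesis supported")   # proceed to dissemination
else:
    print("hypothesis not supported")  # return to Exploratory Analysis
```

If the check fails, that is the cue to loop back to further exploratory work rather than disseminating an unsupported conclusion.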
Step 8 – Disseminate
This is the single most important step in the stairway: dissemination, the end goal. Here you conclude your analysis and interpret your results. This can be in the form of a report or just a note in a ticket describing what your findings are and how you came to that conclusion.