Analytics Camp 2013 – Documenting Mathematical Models
Session 4 I attended at Analytics Camp 2013 was on documenting mathematical models.
Steve Burnett and Melinda Thielbar were the presenters. I moved to the much smaller room.
MT: example of a new employee asking for the documentation of the models her new banking employer uses. They gave her a stack of SAS programs with no instructions on what to run first.
MT: most analysis works on data in rows and columns. Are the columns continuous, categorical, predictor(?), outcome(?)?
The matrix goes into the magic that flows to the answer.
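One way to write down what each column is before it goes into the model is a small data dictionary kept next to the code. This is only a sketch of the idea; the column names and descriptions are made up for illustration and are not from the session.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ColumnDoc:
    name: str
    kind: str         # "continuous" or "categorical"
    role: str         # "predictor" or "outcome"
    description: str


# Hypothetical columns, for illustration only.
DATA_DICTIONARY = [
    ColumnDoc("income", "continuous", "predictor", "Annual income in USD"),
    ColumnDoc("region", "categorical", "predictor", "Sales region code"),
    ColumnDoc("defaulted", "categorical", "outcome", "1 if the loan defaulted, else 0"),
]


def undocumented_columns(header, dictionary):
    """Return the column names in `header` that have no dictionary entry."""
    documented = {col.name for col in dictionary}
    return [name for name in header if name not in documented]


def render(dictionary):
    """Produce one plain-text line of documentation per column."""
    return "\n".join(
        f"{c.name}: {c.kind} {c.role}. {c.description}" for c in dictionary
    )
```

Running `undocumented_columns` against the header of the real table would flag any column nobody has described yet.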
SB: “raw” data is where you start. There might be multiple data tables that are joined together. There might be a filter that turns the combined tables into a shorter table.
Find the TEDx talk about the outliers. Part of the documentation would be how to deal with outliers: remove dirty data, outliers, and impossible data points. Part of the documentation would be defining terms.
Data cleaning should be documented. What processes are used to clean the data of dirty values, outliers, and impossible data points?
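A rough sketch of how the join and filter steps SB describes could document themselves: each step appends a line to a log with the row count it produced. The table names, the `customer_id` key, and the `balance` field are assumptions for the example, not anything shown in the session.

```python
def inner_join(left, right, key):
    """Join two lists of dict rows on `key`, keeping only matching rows."""
    index = {}
    for row in right:
        index.setdefault(row[key], []).append(row)
    joined = []
    for lrow in left:
        for rrow in index.get(lrow[key], []):
            joined.append({**lrow, **rrow})
    return joined


def prepare(customers, accounts, min_balance, log):
    """Join, then filter, recording what each step did in `log`."""
    log.append(f"raw: {len(customers)} customers, {len(accounts)} accounts")
    joined = inner_join(customers, accounts, "customer_id")
    log.append(f"joined on customer_id: {len(joined)} rows")
    kept = [r for r in joined if r["balance"] >= min_balance]
    log.append(f"filter balance >= {min_balance}: {len(kept)} rows")
    return kept
```

The log then doubles as the first draft of the data preparation section, since it lists the steps in the order they ran.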
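One possible way to keep the cleaning process documented is to write each cleaning rule as a named entry with a description, and to count how many rows each rule removes. This is a minimal sketch; the rules, thresholds, and field names are hypothetical.

```python
# Each rule: (name, description, function returning True if the row is dropped).
# Rules are checked in order and the first match wins, so the missing-value
# rule runs before any rule that compares the value.
CLEANING_RULES = [
    ("missing_age", "age is missing", lambda r: r["age"] is None),
    ("impossible_age", "age outside 0 to 120", lambda r: not (0 <= r["age"] <= 120)),
    ("outlier_income", "income above 1,000,000 (hypothetical cap)",
     lambda r: r["income"] > 1_000_000),
]


def clean(rows, rules):
    """Return the kept rows and a count of rows dropped by each rule."""
    report = {name: 0 for name, _, _ in rules}
    kept = []
    for row in rows:
        for name, _, should_drop in rules:
            if should_drop(row):
                report[name] += 1
                break
        else:
            # No rule matched, so the row survives cleaning.
            kept.append(row)
    return kept, report
```

The rule descriptions plus the report give a reader both what was removed and why.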
Steve Burnett and Melinda Thielbar Tag Teaming the Session
What does she mean by magic?
Adjacency matrix – discussion within the session.
Melinda does math. “Description of variation in y_i”
Documenting your model is one of the ways you can scale instead of explaining it one on one.
Glossary of terms. You don’t have to define something that readers can look up. Overview. Documentation of data preparation. Filtering criteria. Definitions for rows and columns.
Document how to use the answer you get.
Document how you verify the answers you get.