
Analytics Camp 2013-05-04 Data Built to Last

1st Session: Data Built to Last with Melinda Thielbar

She’s a data scientist in the RTP area.


Three things: Actionable, Verifiable, Repeatable.

Action:
1. What do I wish I knew, and what would I do if I knew it?
Example: Who is fraudulent? How can I stop them?

2. What do I know now?
The technical process is taking the thing you wish you knew and turning it into what to do now.
You need a feedback loop between what you wish you knew and what you know now. (Agile process methods?)

Verify:
Take action and see if it worked. The first verification should be cheap, before you go wild.
A/B testing is a cheap way to run experiments.
Build a process for the end user so you don’t have to babysit them.
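
To make the "cheap first verification" idea concrete, here is a minimal sketch of how an A/B test result might be checked. It was not shown in the session. It assumes the outcome is a simple yes/no count per group and uses a pooled two-proportion z-test, which is one common choice among several. The function name and the numbers in the usage lines are hypothetical.

```python
import math


def two_proportion_z_test(success_a, n_a, success_b, n_b):
    """Compare success rates of group A (control) and group B (treatment).

    Returns (z, p_value), where p_value is two-sided and based on the
    normal approximation. Hypothetical helper, not from the talk.
    """
    if n_a <= 0 or n_b <= 0:
        raise ValueError("both groups need at least one observation")

    rate_a = success_a / n_a
    rate_b = success_b / n_b

    # Pooled rate under the null hypothesis that both groups behave the same.
    pooled = (success_a + success_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if std_err == 0:
        raise ValueError("no variation in outcomes; the test is undefined")

    z = (rate_b - rate_a) / std_err
    # Two-sided p-value: 2 * (1 - Phi(|z|)) == erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


if __name__ == "__main__":
    # Made-up counts: 100 of 1000 successes in A, 150 of 1000 in B.
    z, p = two_proportion_z_test(100, 1000, 150, 1000)
    print(f"z = {z:.2f}, p = {p:.4f}")
```

A small p-value suggests the action made a difference. The normal approximation is only reasonable when each group has a fair number of successes and failures, so treat this as a first cheap check rather than a final answer.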

Repeatable:
Use programmers to build something that is repeatable.

Resources:
The Endeavor Blog is a good resource.
A learning resources page is on the wiki. There is a Coursera class on data analytics.
Follow Hilary Mason (@hmason) on Twitter.
openIntro.org/stat is a starting point for statistics.
More resources and web sites are on the Analytics Camp wiki.

Categories: Conferences, Meeting Notes
  1. rickpack
    May 4, 2013 at 2:56 pm

    Great notes. To expand a bit, here are a few that I also took:
    Agile versus Waterfall development of statistical methodology and software: it sounds like Waterfall is more sequentially constrained and therefore requires more careful initial planning (vulnerable to the unpredicted?), whereas Agile requires more frequent reconsideration.
    I liked Melinda’s idea of converting a customer’s data questions into a multidimensional question with a numeric answer (hope I heard that right).

    While analysts sometimes need more information to be collected, Melinda feels, after her many years of experience, that we are at a point at which the needed data can be found somewhere (you might have to search for the source, and the data may be dirty).

    Analysts spend most of their time on the repeatable dimension (60%).

    Iterative deliverables with suggestions guard against scope creep.

    Futureproofing segmentation (do not know what this meant – I think Bruce Conner mentioned it)

    Discussion: MapReduce / Hadoop might be solving a p-dimensional problem with a linear or otherwise mathematically untenable solution, but it seems to predict some solutions well. It might use appropriately complex algorithms.

