Saturday, April 21, 2012

Lecture 25 & 26 Visualization and social media monetization

Key takeaways from the video "Hans Rosling shows the best stats you've ever seen"

  • Observe how data moves along different dimensions over time; record the paths to understand past trends and project future patterns from them
  • Visualize data using a reasonable measure, such as population
  • Show graphs using both aggregate data and split data
    • Relying only on averages is dangerous: the details can differ greatly from the average, so the whole picture stays hidden. An average can also be skewed by outliers and misrepresent the truth.
  • Clearly show patterns over time and across regions
  • Link data with the internet to draw the whole picture and tell a story about the data
  • Explain the behavior of the data - not just how data is moving but why
  • The volume of data makes a difference: "big data" is needed for the data to be representative
  • How to access data and how to clean it are both important
  • Present data in a visually appealing way
  • Data is not always in or about the company; data from outside the company, such as industry trends, is also important for predicting the future.
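
The point about averages can be illustrated with a small sketch. The regions and numbers below are hypothetical, invented only to show how one outlier skews an aggregate mean and how splitting the data reveals it.

```python
from statistics import mean, median

# Hypothetical values, invented for illustration only.
income_by_region = {
    "Region A": [20, 22, 21, 23],
    "Region B": [20, 21, 22, 400],  # contains one extreme outlier
}

# Aggregate view: pool every value into a single list.
all_values = [v for values in income_by_region.values() for v in values]
overall_mean = mean(all_values)      # pulled upward by the outlier
overall_median = median(all_values)  # much less sensitive to the outlier

# Split view: one mean per region shows where the distortion comes from.
region_means = {region: mean(values) for region, values in income_by_region.items()}

print(f"overall mean:   {overall_mean:.2f}")
print(f"overall median: {overall_median:.2f}")
for region, value in region_means.items():
    print(f"{region} mean: {value:.2f}")
```

The overall mean sits far above almost every individual value, while the per-region breakdown and the median make the outlier visible.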

Takeaway from GOMC Competition

From a company's perspective, campaign performance is the most important thing. It is therefore critical to identify the goal of the campaign and to estimate its costs and benefits in order to measure the overall ROI. When evaluating performance, compare the campaign period not only with the pre-campaign period but also with the post-campaign period, to show the campaign's contribution. If conversions during the campaign are much higher than in both the pre-campaign and post-campaign periods, the campaign did increase conversions and helped the organization achieve its goal.
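
A minimal sketch of this evaluation, assuming ROI is defined as (benefit - cost) / cost. All figures and names below are hypothetical and are not taken from any actual GOMC campaign.

```python
# Hypothetical figures, invented for illustration only.
campaign_cost = 200.0     # assumed ad spend
campaign_benefit = 500.0  # assumed value of the conversions gained
conversions = {"pre": 40, "during": 90, "post": 45}


def roi(benefit: float, cost: float) -> float:
    """Return ROI as a fraction, using (benefit - cost) / cost."""
    if cost <= 0:
        raise ValueError("cost must be positive")
    return (benefit - cost) / cost


campaign_roi = roi(campaign_benefit, campaign_cost)

# The campaign period should beat both the pre- and post-campaign periods
# before the increase is credited to the campaign.
beats_both_periods = conversions["during"] > max(conversions["pre"], conversions["post"])

print(f"ROI: {campaign_roi:.0%}")
print(f"Campaign period beats pre and post: {beats_both_periods}")
```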

For the pre-campaign comparison, we should compare the campaign period not only with the previous month but also with the same month in the previous year. Comparing with the same month in a different year captures seasonality that cannot be seen when comparing only months within the same year, so we know whether the increase in conversions is due to the campaign or just to the high season. On the other hand, comparing with the months just prior to the campaign gives us the current overall scale of visits and conversions, so we can tell whether the year-over-year increase is actually attributable to the campaign rather than just to business expansion.
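
The two baselines can be put side by side in a short sketch. The monthly conversion counts are hypothetical, and the campaign is assumed to run in April 2012 purely for illustration.

```python
# Hypothetical monthly conversion counts keyed by (year, month).
conversions = {
    (2011, 3): 50,   # month before the same month last year
    (2011, 4): 80,   # same month last year, no campaign
    (2012, 3): 60,   # month just before the campaign
    (2012, 4): 120,  # campaign month (assumed)
}


def pct_change(new: float, old: float) -> float:
    """Relative change from old to new, as a fraction."""
    return (new - old) / old


# Change against the month just before the campaign.
month_over_month = pct_change(conversions[(2012, 4)], conversions[(2012, 3)])
# Change against the same month in the previous year.
year_over_year = pct_change(conversions[(2012, 4)], conversions[(2011, 4)])
# How much the same month rose last year without a campaign (seasonality).
seasonal_rise_last_year = pct_change(conversions[(2011, 4)], conversions[(2011, 3)])
# How much the business grew between the two pre-campaign months (expansion).
baseline_growth = pct_change(conversions[(2012, 3)], conversions[(2011, 3)])

print(f"month over month:        {month_over_month:.0%}")
print(f"year over year:          {year_over_year:.0%}")
print(f"seasonal rise last year: {seasonal_rise_last_year:.0%}")
print(f"baseline growth:         {baseline_growth:.0%}")
```

In this made-up data, part of the month-over-month jump is seasonal and part of the year-over-year jump is general growth; only the remainder is plausibly due to the campaign.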

Friday, April 20, 2012

Lecture 23 & 24 Dimensional modeling and balanced scorecard


  • Dimensional modeling: miscellaneous details
    • Slowly changing dimension
      • Purpose: handle attribute values (e.g. brand, department, …) that change over time, but only once in a while rather than frequently
      • Type 0: nothing is ever going to change (this is not typical)
      • Type 1: update the attribute directly; simple, but history is completely lost
      • Type 2: preserve history by keeping a separate row for each version
        • Every time the value of an attribute changes, add a new tuple with a new surrogate key, and add a start time and an end time to the new tuple
      • Type 3: change the table itself: add a new column/attribute, e.g. "new" territory, and relabel the previous one as "old" territory
    • Role-playing dimension
    • Junk dimension
      • Put miscellaneous attributes together in a single dimension
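    • Semi-additive: can be added along some dimensions but not others, e.g. a bank account balance can be summed across accounts but not over time
    • Non-additive: cannot be added together, e.g. average, minimum, maximum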
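
The Type 2 slowly changing dimension described in the list above can be sketched in a few lines. This is only an illustration, not a definitive implementation: the `ProductRow` class, the `apply_type2_change` helper, and the sample values are all hypothetical. Closing the previous row by setting its end date is an assumption based on common practice; the notes only mention start and end times on the new tuple.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class ProductRow:
    """One version of a product in a hypothetical dimension table."""
    surrogate_key: int
    product_id: str                  # natural (business) key
    department: str                  # the slowly changing attribute
    start_date: date
    end_date: Optional[date] = None  # None marks the current version


def apply_type2_change(rows: List[ProductRow], product_id: str,
                       new_department: str, change_date: date) -> ProductRow:
    """Add a new row for the changed attribute instead of overwriting it."""
    # Find the current version of this product (the one with no end date).
    current = next(r for r in rows
                   if r.product_id == product_id and r.end_date is None)
    # Assumption: the old version is closed on the change date.
    current.end_date = change_date
    new_row = ProductRow(
        surrogate_key=max(r.surrogate_key for r in rows) + 1,
        product_id=product_id,
        department=new_department,
        start_date=change_date,
    )
    rows.append(new_row)
    return new_row


# Hypothetical example: product P100 moves from Grocery to Health.
rows = [ProductRow(1, "P100", "Grocery", date(2011, 1, 1))]
apply_type2_change(rows, "P100", "Health", date(2012, 4, 1))

for row in rows:
    print(row)
```

Both versions of the product remain in the table, so facts recorded before the change still point at the old department through the old surrogate key.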