Learn to Code for Data Analysis

Michel Wermelinger | Saturday 12:30 | Ferrier Hall

The 4-week MOOC "Learn to Code for Data Analysis" is a hands-on course that introduces programming and shows how to access open data, clean and analyse it, and produce simple visualisations. The course uses Python, the pandas data analysis library, and browser-based Jupyter Notebooks as the programming environment. The notebook style allows us to weave explanations, code, and the corresponding results into an interactive document in which students work through the many exercises. Each weekly project (based on real data from the WHO, UN, World Bank and Weather Underground) is written up in a notebook that learners can modify and share publicly.

In this talk we summarise the pedagogical approach taken and compare it to Merrill's First Principles of Instruction. We comment on the advantages and disadvantages of the software used, in particular the notebook environment, and on the difficulties learners experienced. We muse on the highlights and low points of forum discussions. We synthesise lessons learned, and reflect on the limitations of MOOCs and on the power of data for teaching programming.