04. Course structure

The course is structured around activity theory: we work as a community with data, determine analytical objectives, and carry out those objectives using appropriate statistical and machine learning tools, in adherence with appropriate community rules.

The course is divided into two quarters, the first and last seven weeks of the semester:

First seven weeks of semester (January 22 – March 7)

Data for the first seven weeks of semester comes from the Armed Conflict Location and Event Data organization (ACLED) and provides information on political and other violence in the United States and India.

You can download this data as a .csv file here.

Last seven weeks of semester (March 18 – April 30)

Data for the last seven weeks of semester comes from the Washington Post and provides information on police shootings in the United States.

You can download this data as a .csv file here.

Your work journal

You must keep an individual work journal in which you record your work, thoughts, and ideas for data analysis.

Your work journal should be set up as a WordPress site at sites.umassd.edu and your URL should be of the form


where, of course, “YourName” is your name.

To gain full credit (15%), you must make individual entries in your work journal by Friday of each week. Please make these entries as posts (not pages).

Exploratory data analysis

Productive data analysis begins with description:

  • What is the distribution of each of your variables?
  • Are your variables approximately normally distributed?
  • What are the descriptive statistics?
    • Minimum
    • Maximum
    • Median
    • Mean
    • Standard deviation
    • Skewness
    • Kurtosis
  • What correlations exist between variables?
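
As one possible starting point, the questions above can be sketched in Python using only the standard library. The function names describe and pearson, and the toy numbers at the bottom, are illustrative assumptions and are not taken from the course data. Skewness and kurtosis here use population moments, with kurtosis reported as excess kurtosis (0 for a normal distribution); other software may use slightly different conventions.

```python
import math
import statistics


def describe(values):
    """Descriptive statistics for one numeric variable.

    Assumes at least two values that are not all identical.
    """
    n = len(values)
    mean = statistics.fmean(values)
    # Population central moments, used for skewness and kurtosis.
    m2 = sum((x - mean) ** 2 for x in values) / n
    m3 = sum((x - mean) ** 3 for x in values) / n
    m4 = sum((x - mean) ** 4 for x in values) / n
    return {
        "min": min(values),
        "max": max(values),
        "median": statistics.median(values),
        "mean": mean,
        "sd": statistics.stdev(values),  # sample standard deviation (n - 1)
        "skewness": m3 / m2 ** 1.5,
        "excess_kurtosis": m4 / m2 ** 2 - 3.0,
    }


def pearson(xs, ys):
    """Pearson correlation between two equal-length numeric variables."""
    mx = statistics.fmean(xs)
    my = statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(sxx * syy)


if __name__ == "__main__":
    # Made-up toy values, for illustration only.
    a = [1, 2, 3, 4, 5]
    b = [2, 4, 5, 4, 5]
    for name, value in describe(a).items():
        print(f"{name}: {value:.3f}")
    print(f"correlation(a, b): {pearson(a, b):.3f}")
```

Skewness near 0 and excess kurtosis near 0 are consistent with an approximately normal distribution, though a histogram of each variable is usually a more informative first check.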

Focus on describing your data until you feel you have gained a deep(er) understanding of the data.