04.1 Activity theory
In this course, the focus is on raw data, the motivation to transform those data (the objective), and the tools used to effect those transformations. From an activity theory perspective, tools, such as analytical methods or the software and algorithms that carry out those methods, are introduced to enable students to transform raw data into something more useful.
Our focus is on activity motivated by students’ desire to transform data, using tools they have at hand or are capable of developing. The goal of that transformation constitutes an “object” (or “objective”) in activity theory. Data are transformed through activity directed at an objective, which usually comes from a naturally arising question about the data.
This contrasts with a scenario in which techniques of data analysis are taught with artificially designed, relatively simple toy data, so that students practice tool use in the absence of appropriate or realistic data. Becoming a useful and skillful data scientist requires addressing the full complexity of data and finding appropriate tools to effect insightful transformations on those data. This is the central reason why activity theory drives so much of our thinking in course design for data science. From an activity theory perspective, the context of data science for students consists of the raw data, the questions posed about those data, the agreed objectives, and the transformation of the data through activity that uses analytical tools, all in a cyclic process.
This process can be encapsulated in the following activity diagram:
The community is the class of students and the instructor, but may also include an audience other than the instructor for whom the students build a data product, such as a predictive model, and write a report. For example, the data may have come from someone who wants to know certain things about them; in that case students write reports for that person, who is also part of the community.
The subjects are individual students or small groups of students working together to produce an outcome, typically a written report of an analysis or a data product.
The tools are usually software tools, such as the Python programming language, and conceptual tools, such as regression or clustering techniques, that students can bring to bear on the objective.
The object, or objective, is determined through discussion by students, the instructor and any external client.
The rules vary from activity to activity, and may include such general things as avoidance of plagiarism, appropriate referencing of sources, cooperation within and between student teams, sharing of findings, ethical behavior, and responsibility for meeting deadlines.
Division of labor can work in several ways: different students within a team may take charge of different aspects of the analysis, or different teams may focus on different aspects of an objective with the aim of pooling their findings.
The same data set may, and usually does, generate a number of different activities and objectives as students ask further questions about the data and set out to pursue the objectives they have settled on. When this happens, the objective changes; the subjects may change, in that students may form new groups, spontaneously or at the instructor’s direction; the division of labor may change; and the tools will most likely need to be modified, with new tools brought to bear on the objective.
References:
Leontiev, A. N. (1978), Activity, Consciousness, and Personality (originally published in Russian in 1975), Englewood Cliffs, NJ: Prentice-Hall.
Nardi, B. (1996), Context and Consciousness: Activity Theory and Human-Computer Interaction, Cambridge, MA: MIT Press.
Wikipedia. Activity Theory. https://en.wikipedia.org/wiki/Activity_theory