Shin Alexandre Koseki
PhD, MArch

@sjinko



© Avalanche Studio

Data Think!

Thinking Urban Research With Data

The mastery of urban data ecosystems is an essential asset for designing, planning and studying cities and extended territories. Since the mid-90s, urban environments have been at the center of social, political, ecological and technical innovation. In achieving a just and sustainable society, the data produced in cities, villages and remote areas may hold a key solution.

Data Think! provides training in multimodal data study and analysis, with a focus on both theory and practice. Over the course of a week, participants learn about state-of-the-art approaches in data-centric research from world-leading experts. Structured around the five phases of the data cycle (collection, curation, processing, visualization and archiving), the course strengthens participants’ capability to think and work with data in spatial design, planning and research.

At the core of Data Think! is a collaborative project: the development of a multimodal data processing pipeline. In interdisciplinary teams, participants conceive, design, and develop one of the five segments of the pipeline, which follow the five phases of the data cycle. Under the supervision of leading experts in data-centric research, this hands-on approach equips participants with the digital skills needed today in research, design and industry.
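As an illustration only, the five segments of such a pipeline can be sketched as five small Python functions chained together, one per phase of the data cycle. This is a minimal sketch under stated assumptions, not the pipeline built in the course: the sample records, the function names and the choice of a plain-text chart and JSON output are all hypothetical.

```python
import json
from collections import Counter

# Hypothetical sample: invented observations, not course data.
RAW_RECORDS = [
    {"place": "Lausanne ", "mode": "walk"},
    {"place": "lausanne", "mode": "bike"},
    {"place": "Renens", "mode": None},
    {"place": "Renens", "mode": "walk"},
]


def collect():
    """Collection: return raw records (here, a hard-coded sample)."""
    return list(RAW_RECORDS)


def curate(records):
    """Curation: drop incomplete records and normalise place names."""
    return [
        {"place": r["place"].strip().title(), "mode": r["mode"]}
        for r in records
        if r["mode"] is not None
    ]


def process(records):
    """Processing: count the remaining observations per place."""
    return Counter(r["place"] for r in records)


def visualize(counts):
    """Visualization: render the counts as a plain-text bar chart."""
    return "\n".join(
        f"{place:<10}{'#' * n}" for place, n in sorted(counts.items())
    )


def archive(counts):
    """Archiving: serialise the result as JSON with sorted keys."""
    return json.dumps(dict(counts), sort_keys=True)


def run_pipeline():
    """Chain the five phases and return the chart and the archived JSON."""
    counts = process(curate(collect()))
    return visualize(counts), archive(counts)


if __name__ == "__main__":
    chart, archived = run_pipeline()
    print(chart)
    print(archived)
```

Keeping each phase in its own function mirrors the division of work between teams: each segment can be developed and replaced independently, provided it accepts what the previous phase returns.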

In addition to learning about data management, collection, curation, processing, visualization and archiving, participants also learn about data ontology, digital risks, data power, data-centric design, open data, version control and programming with Python 3.

By joining Data Think!, participants acquire the skills necessary to address future urban challenges with today’s data, and to make an impact on cities and their extended territories for future generations.


In a previous edition of Data Think!, Emeritus Professor and Philosopher Walther Zimmerli reminded us that “data is just noise from which we make sense”: making sense of data is the researcher’s active act of creation. The current abundance of data therefore results from a sustained effort to make sense of the world. With other actors who produce and analyze data, scientists therefore share the great responsibility of defining the social understanding of the world we live in. Making sense of the world is not a mundane exercise; it is, by definition, the core mission of the scientific odyssey.

Research data has historically been produced by researchers, by state agencies whose mission is to generate public information, or by service companies specialized in data collection. The development of scientific practices and disciplines has been deeply intertwined with the methods, approaches, and frameworks given to the production of data. Starting in the 1970s, the development of the Internet and the mass digitization of information have changed the status of data in all aspects of society, including research. Today, most data come from “traces” that people leave behind unknowingly. When people surf the web, make phone calls or walk in the street, their every movement, interaction and choice is recorded one way or another. This has fueled the growth of a “digital ecosystem” made of bits and pieces of information about pretty much everything and nothing at the same time. These bits and pieces can then be harvested to fuel the social, political, economic and operational aspirations of states and organizations, public or private.

As a result, many have called for an inversion of the data-follows-design-follows-theory paradigm, towards a “data-centric” science. This new approach to scientific research rests on three normative pillars: care, openness and universality. These same pillars lie at the core of the reproducibility crisis and the notion of open science. “Data-centric” heralds a new scientific know-how, one that places theory after the experiment and gives all aspects of the scientific pipeline a similar level of importance.

Data Think! offers an introduction to data-centric research for a deep and acute understanding of this paradigm shift in science. Combining theoretical input and hands-on exploration, participants develop a collaborative research pipeline and address all current aspects of data, from gathering to management, analysis and valorization. We also dedicate a large part of the course to a critical understanding of data in the age of digitization, and to the specificity of social and urban data in today’s research, governance and economy.

Data Think! is given as part of the EPFL Doctoral Program in Architecture and Science of the City (EDAR). This year, Data Think! is developed in partnership with the EPFL Doctoral Program in Digital Humanities (EDDH), the UNIL Doctoral School in Digital Studies (PDEN), the Joint Center for Digitization and Computation in the Social Science and the Humanities of EPFL and UNIL (dhCenter), the Joint Center for Digital Visual Studies of the University of Zurich and the Max-Planck Gesellschaft (CDVS), and the West Switzerland University Conference (CUSO).

Course Form

Learning Outcomes

At the end of the course, participants are able to:

Main objective: mixed learning outcomes

  • Conceive a data-centric, interdisciplinary research pipeline (collection, curation, analysis, visualization, archiving);
  • Work collaboratively on the implementation of a research pipeline using Jupyter Notebook and version control.

Sub-objective 1: Theoretical learning outcomes

  • Discuss the epistemology of their data;
  • Discuss the ontology of their data.

Sub-objective 2: Strategic learning outcomes

  • Define a data cycle strategy in accordance with Open Science;
  • Conceive and produce a data management plan;
  • Conceive a technological valorization strategy.

Sub-objective 3: Technical learning outcomes

  • Use basic Linux-like terminal commands;
  • Use basic functions of a version control system (GitHub, GitLab, etc.);
  • Write basic code in Python 3 (beginner to intermediate levels);
  • Use collaborative online research notebooks (Jupyter Notebook).

Format summary

Data Think! takes place over five consecutive days (Monday to Friday). With the help of lecturers and experts, participants explore the use of data in research protocols, from design to archiving. To improve their research skills and computational thinking, participants work directly with Jupyter Notebooks and GitHub. Lectures and group debates complement in-class exercises and collaborative workshops, in which participants learn to build a digital research pipeline centered on their data cycle in the spirit of Open Science and Open Data.

Lectures

Each day, two keynote lectures open the session and set the thematic and methodological tone for the day. Each keynote elaborates on a particular topic, combining addresses from more senior, experienced researchers (morning) and from younger researchers (see Schedule and Themes).

Research Pipeline

Central to Data Think! is the collaborative development of a “Research Pipeline.” Together, the participants learn about the challenges and implications of data-centric research in social science, design and the humanities by conducting research. The pipeline is set up using Renku, the collaborative learning notebook developed at EPFL and based on the Jupyter Notebook Project.

In creating the Research Pipeline, participants also learn the principles of version control with Git and programming in Python. Although programming skills are useful to participants, they are not required, since the research pipeline can be described in text and drawings. Likewise, data visualization can be achieved directly by programming, but also through drawings, with which the participants develop a conceptual understanding of the potential and limitations of visualizing research data as part of the research process.

Intra- and Interinstitutional Collaborations

Primarily registered at the EPFL Doctoral Program in Architecture and Science of the City (EPFL EDOC EDAR), Data Think! benefits from the support and contribution of many Swiss-based academic initiatives focused on digital studies and methods.

The dhCenter UNIL-EPFL (EPFL ENT-R CHD/UNIL LETTRES CHD), the Joint Center for digital studies and computational methods in the social science and the humanities of EPFL and UNIL, is the main institutional home of Data Think!, since both Dr. Koseki and Dr. Mazel-Cabasse are employed at the dhCenter. The course stems from the dhCenter’s mission to equip young researchers with interdisciplinary technical and critical digital skills.

Naturally, the course ties in with the UNIL Programme doctoral en études numériques (UNIL PDEN), which sends UNIL participants to Data Think! and provides financial support for its production. Additional funding may be acquired through PDEN for hiring external lecturers and covering their transportation and accommodation fees (decision pending).

Likewise, the course benefits from the support of the EPFL Doctoral Program in Digital Humanities (EPFL EDOC EDDH), which provides scientific, logistical and communications support to the course. This support consists mostly of helping define the content of the course and advertising its registration across the EDDH network.

Through its Digital Habitat Initiative, the EPFL Habitat Research Center (EPFL ENAC HRC) is involved in Data Think! by providing expertise on the organization of the block course. The previous edition of Data Think! was organized through the Digital Habitat Initiative.

Additional guidance on the setup of the course and the selection of lecturers should be provided by collaborators of the EPFL Collège des Humanités (EPFL CDH) who are responsible for the development of courses on digitalization between UNIL and EPFL.

The Center for Digital Visual Studies (UZH-MPG CDVS) is a new joint research center between the University of Zurich and the Max-Planck Society, scheduled to launch in the spring of 2020. As part of a preliminary cooperation agreement between the CDVS UZH-MPG and the dhCenter UNIL-EPFL, the former has agreed to send funded participants to Data Think! The scientific collaborators of the CDVS will also give some input on the content of the course, especially regarding visual data analysis and representation.

The EPFL Renku Project team and the EPFL Library provide additional guidance on the use of Renku, a platform for collaborative education based on the Jupyter Notebook Project.

The Centre d’aide à l’apprentissage (EPFL CAPE) provides an expert evaluation of the course based on the feedback it received from the participants in the 2018 edition.

We are currently in discussion with collaborators of a large telecommunication company to use their data and their data infrastructure as part of the course.

Excerpt from the syllabus


Course:

Data Think! Thinking Urban Research With Data

Host institutions:

EPFL

Year:

2018–2020

Main lecturers:

Shin Alexandre Koseki

Assistants:

With the participation of:

EPFL, KU Leuven, University of Graz, TU Delft, Humboldt University Berlin, Lund University, Mendrisio, ETH Zurich, University of Bern, and University of Lucerne