The Case of the Missing Data
This is the third research note by Onkar Hoysala, one of the researchers who received the Social Media Research grant for 2016.
In the previous post, I broadly discussed the primary findings from the five pilot interviews I conducted – issues related to data, and contextualisation of models. Here, I take the case study of Vinay’s team to examine the issue of data, and its implications for practice, in greater detail. Vinay is a professor of civil engineering from a reputed university in Bangalore, whose group works on transportation simulation and policies. I examine the process by which Vinay and his group overcome the lack of access to data, and the resulting practices that emerge. In this project, I use the lens of practice, borrowing concepts of Communities of Practice, and technology enactment – and in this post I lay the foundations of the same.
First, a note on what Practice and Community of Practice (CoP) are, and why I use these lenses. The history of the term CoP goes back to Lave & Wenger (1991), where they discuss how learning is not an individual process, but a social one. Practice, here, is “everyday activity of the participants”. It is these everyday activities that are at the core of shaping what people do, according to theories of practice (Lave & Wenger, 1991; Wenger, 1999). How do people learn to do what they do, and build their identities? Theories of social learning explain that a newcomer learns by actively being a peripheral participant in a Community of Practice, and over time moves to become an old-timer of the practice. These everyday activities, further, are shaped by both structure (be it technological structures, or social structures) and by the agency of the participant (for example, the social position of the participant). In the words of Orlikowski (2000), this results in technologies being enacted within the contexts in which they are used.
In my interviews with Vinay, we spent close to a third of our time discussing the issues of access and availability of data. The remainder of the post presents details from the interviews.
Case study – Vinay and his group of transportation simulation designers
Vinay is a civil engineer by training. Over the course of his Masters and Doctoral education, he became interested in developing computational tools for planning, operation, and traffic flow management, in the context of sustainable transportation. His group works on both micro-level aspects of traffic modelling, as well as system-level modelling and planning.
Vinay engages with stakeholders from the government both formally and informally. Informally, his team sends reports of their work to the bureaucrats in charge. He also gives interviews to newspapers, and participates in different non-government consortiums, for example one which advocates for better public bus transport. His group also engages with municipal government agencies for funding his work, through the university he is affiliated with.
Some of his projects in the area of big data analytics and urban mobility are also funded by international funding agencies, and are in collaboration with universities and industries in European countries. Some of these multi-national projects involve direct collaboration with municipal government bodies as well. For example, in one of the projects funded by a government funding agency of a European nation, two people from the state-level body for land transportation policy will be working with Vinay’s group on their doctoral degrees. However, such collaborations have not been easy to form. As he explains:
It is very rare. And it’s been so difficult in an Indian ecosystem to do it, which is not favourable for these kinds of collaboration, unfortunately. But it took us a lot of time. We have spent almost 2.5 years in conceptualising and developing these kind of project proposals, and bring them on board. It took so long. Eventually it happened.
Vinay envisions that such collaborations will lead to more synergy between government, industries and academia. However, according to Vinay, there simply isn’t adequate data available to carry out such work. Speaking about the data that is already being collected by the Bangalore Traffic Police, he says:
If you go to the traffic management centre, the traffic police have deployed an excellent set of technologies. They should be applauded for doing that. But it is all minus intelligence.
This lack of “intelligence”, which algorithms and models bring in, is one of the reasons for ineffective collection of data. Further, in many cases data is simply non-existent. Vinay and his group work on activity-based demand modelling and travel choices of consumers. To carry out this work, data on what activities people perform and how they relate to their travel choices is needed. However, as Vinay says:
We don’t get this kind of data easily. Often in many cases we have to bank upon doing our own primary survey. Specifically for this activity based modelling, we have designed a new travel diary, on which we have also written a paper in Current Science, having a more effective diary design for collecting data in the Indian context.
Using this survey instrument, his group has conducted a survey of around 10,000 individuals in 2,500 households regarding their travel activities.
While one interview is too little to ascertain how simulation designers overcome the lack of access to data, one can draw some inferences that will help in designing the detailed interviews, and that hint at areas to keep in mind during ethnographic observation.
Engaging with the State for data or for other purposes is what many researchers in different fields do as part of their work. Vinay’s group is no different, though in this case what we see is an instance of learning socially how to overcome issues of data through newer forms of engagement with the State. For example, Vinay and his group established, over the course of 2.5 years, a relationship with different government agencies, which they could then leverage when applying for the joint project with the above-mentioned European funding agency, which Vinay says is a rare and unique opportunity. What is interesting here is the mode of engagement with the State, where, in addition to the necessary support being provided through data or policy advice, two employees from the state-level transportation policy body will enroll in a doctoral program supervised by Vinay.
Further, in cases where there is no data available, such as in the case of activity-based demand data, Vinay and his group have designed new methods of collecting such data for Indian contexts. The travel diary designed by Vinay’s group, reified now by the different publications by the group on the topic, can form the basis for other researchers to collect similar data, thus influencing the field. The data collected by them is now being used to develop a detailed meso-level simulation for Bangalore.
These are practices which have evolved over time, through engagement with different stakeholders. While this requires further examination, here we see an instance of how structures (in this case the organisational structures of the data disseminating organisations) influence practice.
The overall question of the lack of data, however, leads us to a larger question of what ought to be the nature of data collected by government agencies. A full examination of the same is beyond the scope of this post, and I just touch upon it here. While Vinay’s group has had some success in accessing data from the traffic police or the local bus transport agency, these have required a constant push from Vinay to convince them about the usefulness of their work, or to “make them understand the value such tools would have”, in order to gain access to the data – data which proponents of open data say ought to be publicly available.
In terms of my overall study, while these interviews provided some insight into the nature of the reflexive relationship between technology and practice, they have thrown open new questions that need to be addressed. For instance, what is the nature of the influence that these practices have on the field of transportation simulation? Vinay observed that the data is being collected without “intelligence”. However, technologies such as the Area Traffic Control Systems (ATCS) are currently being introduced in Bangalore, which bring in some form of intelligence – so what does that mean for the practice of simulation and modelling designers like Vinay? I will explore these in the next post.
Lave, J., & Wenger, E. (1991). Situated learning: Legitimate peripheral participation. Cambridge University Press.
Orlikowski, W. J. (2000). Using Technology and Constituting Structures: A Practice Lens for Studying Technology in Organizations. Organization Science, 11(4), 404–428.
Wenger, E. (1999). Communities of practice: Learning, meaning, and identity. Cambridge University Press.
 Name of the agency not provided for purposes of anonymity.
 Activity-based demand modelling is a method of transport modelling and simulation that bases the demand for transportation on the different activities that people do in their everyday lives.
 Through these 2.5 years, Vinay’s group has engaged with the State in different projects, including one where the group used the bus route and bus stop data to develop a hub and spoke model for the city, which Vinay says was not taken up as the bureaucrats involved were not too keen. This raises other questions about how tools like simulations and models are perceived by the decision-making authorities, but is out of scope for this post.
 Provincial state of Karnataka.
 Bangalore traffic may turn smoother as signals set to turn intelligent. Last accessed on 5th June 2016. http://economictimes.indiatimes.com/news/politics-and-nation/bengaluru-traffic-may-turn-smoother-as-signals-set-to-turn-intelligent/articleshow/52532728.cms