Design study of MovementSlicer: an interactive visualization of patterns and group meetings in 2D movement data

As modern devices capable of streaming data become more prevalent in all aspects of life, enormous amounts of digital data are being recorded. For example, everyday use of devices such as smartphones, GPS receivers and RFID tags has resulted in the collection of massive amounts of movement (or GPS) data. The increased availability of such large movement datasets has led to much research on analytical and visualization techniques [Rinzivillo et al. (2008); Wood et al. (2010); Andrienko and Andrienko (2011)]. Analyzing these datasets can provide insight by discovering patterns in the motion of objects, finding outlier activities, or discovering relationships between objects. These patterns can be useful in a number of ways, for example, recommending places to visit based on a user’s past and current positions, or helping police departments monitor suspicious activity patterns [Zheng et al. (2010b); Yan et al. (2007)]. Visualization lies at the forefront of exploratory analysis of such datasets.

Although useful, visualizing such data is challenging for a few reasons:
• There are multiple variables involved: latitude and longitude as a function of time and object id, where object id identifies a person, vehicle, or other moving entity.
• Movements often cross each other and may repeatedly travel over the same pathways to the same locations, causing occlusion (overplotting) when drawn on a 2D geographic map.
• Movements may occur at widely varying physical scales, e.g., ranging from tens of meters (moving between two buildings) to hundreds of kilometers (traveling between cities) in the case of a single GPS dataset, and zooming out on a 2D map will leave only the largest-scale movements salient.
• The data may cover long spans of time containing thousands of events.

Many previous approaches to visualizing movement data have involved the display or analysis of the shapes of movement trajectories [Kapler and Wright (2004); Crnovrsanin et al. (2009); Hurter et al. (2009); Guo et al. (2011)], or have proposed ways of dealing with large numbers of moving objects [Andrienko and Andrienko (2011); Zeng et al. (2013)], or both [Liu et al. (2011); Buchin et al. (2013); Krüger et al. (2013); Wang et al. (2013)], often making use of aggregation [Elmqvist and Fekete (2010)].

In the current work, we are instead interested in understanding a small number (< 20) of moving objects. In such a case, we would like to avoid aggregating groups of objects, so that the history of each entity is visible. At the same time, we are not interested in the detailed shape of movement trajectories. Instead, we are interested in the discrete locations (e.g., rooms, workstations in a factory, buildings, addresses, cities) visited by the moving entities. We are interested in where the people or objects have been (in terms of discrete locations), when they were there, how many times they visited different locations, and in what order. We are also interested in meetings that occur between the objects or persons. Such information is difficult to convey in a single 2D geographic map, due to overplotting or the need for animation.
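
The questions above (where, when, how often, in what order, and who met whom) can be illustrated with a minimal sketch. It assumes the raw coordinates have already been reduced to "stays" at named discrete locations; the `Stay` record, the function names, and the integer time units are hypothetical and chosen only for illustration, not taken from the tool described in this work.

```python
from collections import Counter
from itertools import combinations
from typing import NamedTuple


class Stay(NamedTuple):
    """One visit of a moving object to a discrete location (hypothetical record type)."""
    object_id: str
    location: str
    start: int  # assumed unit: minutes since the start of the dataset
    end: int


def visit_counts(stays, object_id):
    """Count how many times the given object visited each location."""
    return Counter(s.location for s in stays if s.object_id == object_id)


def visit_order(stays, object_id):
    """List the locations visited by the given object in chronological order."""
    own = sorted((s for s in stays if s.object_id == object_id), key=lambda s: s.start)
    return [s.location for s in own]


def meetings(stays):
    """Find pairs of different objects whose stays at the same location overlap in time."""
    found = []
    for a, b in combinations(stays, 2):
        if a.object_id != b.object_id and a.location == b.location:
            start, end = max(a.start, b.start), min(a.end, b.end)
            if start < end:  # the two stays share a non-empty time interval
                pair = tuple(sorted((a.object_id, b.object_id)))
                found.append((a.location, pair, start, end))
    return found


# Made-up example data: two people, two locations.
stays = [
    Stay("alice", "office", 0, 60),
    Stay("alice", "cafe", 70, 100),
    Stay("alice", "office", 110, 200),
    Stay("bob", "cafe", 80, 120),
]
```

Here `visit_counts(stays, "alice")` reports two visits to the office and one to the cafe, `visit_order` recovers the sequence office, cafe, office, and `meetings(stays)` reports one meeting of alice and bob at the cafe. The pairwise comparison is quadratic in the number of stays, which is acceptable only because the setting assumes a small number of moving objects.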

There are several scenarios where we may wish to understand the movements of a small number of objects or persons over a set of discrete locations. These include analyzing meetings and activities of suspected criminals (gang members, terrorists) whose cell phones are tracked; monitoring offenders on probation; understanding movements of a team of workers and their equipment in a factory, to improve workflow processes; analyzing movements of groups of visitors in a museum; understanding movements of health care professionals, patients, and equipment within a hospital, to optimize changes to floorplans or reduce the risk of pathogens spreading; or movements of several customers within a large store, to optimize merchandise displays and layout. Such scenarios may also arise if the user begins with an overview of a large, aggregated dataset, and selects a small subset of moving entities to analyze in more detail.

Devices capable of recording data are entering every aspect of life. Modern devices can record data ranging from your location history (e.g., GPS) to your sleep cycles (e.g., FitBit). This data can be leveraged to find patterns and meaningful relations or trends. Such analysis helps both users and industry make informed decisions. For example, insights from movement data can be useful in making better travel recommendations (Zheng et al. (2010b)) or in detecting suspicious activity (Yan et al. (2007); Guo et al. (2011)).

Analyzing such immense amounts of movement data has its own problems. Raw movement data consists of long sequences of time-stamped coordinates. Limitations of human working memory make it extremely difficult to find significant patterns in such large datasets (Baddeley (1992)). Computers can retain data indefinitely and are thus efficient at finding trends in a dataset. However, using algorithms alone to discover patterns or outliers yields only superficial information about the dataset. To make insightful decisions based on patterns, human reasoning is required. Heer and Shneiderman (2012) refer to this logic-based reasoning as domain-specific knowledge. Hollands et al. (2004) state that information visualization helps improve human perception, allowing people to work with larger datasets and improving their reasoning capacity. Visualization also allows data to be presented from different perspectives and supports interactivity, both of which further increase people’s ability to understand the data. For example, with geo-spatial data, plotting the data on a map may help infer regional relations, whereas animating it on a timeline allows temporal trends to be discovered.

A visualization can be designed in two ways. Either the user starts with a task or a hypothesis in mind and designs a visualization that supports those specific tasks using the data, or the user starts with a way to visualize the data and finds patterns that help him or her formulate hypotheses. In both cases, whichever part comes first, task formation and visualization design are equally important for providing meaningful insights into the data. In this thesis, we follow the former path. We suggest a handful of tasks that we aim to support and then design visualizations to facilitate performing those tasks.

Taxonomy

Taxonomy of tasks

Tasks are fundamental for effective visual analysis. As mentioned earlier, in this thesis we first define tasks, which we then use to guide the process of visualization design. Defining tasks for movement data is complex for several reasons: multiple variables are involved, such as space, time and object ids; the physical scale varies, i.e., data can be specific to one city or may span multiple countries; and the granularity of time may also vary, since people might want to understand data over a month or over many years.

A taxonomy helps alleviate the problem. Dodge et al. (2008) state several advantages of defining a taxonomy: accurate task definitions and categorization help in designing focused visualizations, and they aid designers in generalizing visualizations across tasks. Accurate definitions also help in evaluating the visualization during user studies.

Bertin (1967) uses task types and reading levels to define a framework for task typology. He defines task types according to the variables in the dataset. For example, in the case of GPS data, the variables are time, location and object id. Examples of possible tasks are: "Given a particular time, what is the location of an object?" and "Given an object and its location, when was it there?" For each task type, Bertin further defines three levels of reading: elementary (single object), intermediate (multiple objects) and overall (all objects). While Bertin talks about data in general, Peuquet (1994) describes task types specifically for spatio-temporal data. She states that there are three basic kinds of tasks you can perform on a given movement dataset:

• when + where → what : gives information about which object or group of objects were present at a given location or set of locations at any given time or set of times.
• when + what → where : gives information about the location or set of locations occupied by an object or a group of objects at a given time or set of times.
• where + what → when : gives information about the time or set of times at which a given object or group of objects occupied a given location or set of locations.
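
The three task types above can be read as three queries over the same set of (when, where, what) records, each fixing two components and returning the third. The following is a minimal sketch under that reading; the `Observation` record, the function names, and the sample data are assumptions made for illustration and do not come from Peuquet (1994).

```python
from typing import NamedTuple


class Observation(NamedTuple):
    """One record of an object at a location at a time (hypothetical record type)."""
    when: int   # discrete time step
    where: str  # discrete location
    what: str   # object id


def what_at(observations, when, where):
    """when + where -> what: objects present at the given locations and times."""
    return {o.what for o in observations if o.when in when and o.where in where}


def where_of(observations, when, what):
    """when + what -> where: locations occupied by the given objects at the given times."""
    return {o.where for o in observations if o.when in when and o.what in what}


def when_at(observations, where, what):
    """where + what -> when: times at which the given objects occupied the given locations."""
    return {o.when for o in observations if o.where in where and o.what in what}


# Made-up example data: two objects observed over three time steps.
observations = [
    Observation(1, "lab", "p1"),
    Observation(1, "lab", "p2"),
    Observation(2, "cafe", "p1"),
    Observation(2, "lab", "p2"),
    Observation(3, "cafe", "p1"),
]
```

Each argument is a set, so the same functions cover both a single value and "a set of" values as in the definitions above. For instance, `what_at(observations, {1}, {"lab"})` returns both objects, while `when_at(observations, {"cafe"}, {"p1"})` returns time steps 2 and 3.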

Andrienko et al. (2003) use a mixture of Bertin’s and Peuquet’s theories with slight modifications. They merge Bertin’s intermediate and overall levels into a single level, which they call the set level or the general level. They divide the types of tasks into two categories instead of the three proposed by Peuquet:

• when → what + where : gives information about an object or a set of objects and their location or set of locations at any given time or set of times.
• where + what → when : gives information about the time or set of times, given an object or a set of objects and their location or set of locations.

Table of contents

INTRODUCTION
CHAPTER 1 LITERATURE REVIEW
1.1 Introduction
1.2 Taxonomy
1.2.1 Taxonomy of tasks
1.2.2 Taxonomy of visualization
1.3 Visualizations of movement data
CHAPTER 2 OBJECTIVES, METHODOLOGY AND DESIGN DECISIONS
2.1 Objectives and methods
2.2 Design: Gantt Charts
2.2.1 Why Gantt Charts?
2.2.2 Design Choices
2.2.2.1 Depiction of Meetings
2.2.2.2 World Lines in Location-Centric Views
CHAPTER 3 MOVEMENTSLICER
3.1 Clustering
3.2 Description
3.2.1 Matrix view
3.2.2 Map view
3.2.3 Gantt chart
3.2.4 Coordinating views
3.3 Implementation
3.4 Case Studies
3.4.1 Case Study 1: One individual over 6 months
3.4.2 Case Study 2: Six people over 1 month
3.4.3 Case Study 3: GeoLife data
CONCLUSION