Take-home Exercise 4

In this article, we will reveal the daily routines of two selected participants from the city of Engagement, Ohio, USA. To address the purpose of this take-home exercise, I will be using ViSiElse and other appropriate visual analytics methods.

Li Minqi https://www.linkedin.com/in/minqi-li/ (School of Computing and Information System) https://example.com/spacelysprokets
2022-05-22

1. Overview

1.1 Introduction

In this take-home exercise, we will pick two participants from the study and compare their daily routines on weekdays, such as sleep time, working hours, and dining schedule.

The data used in this exercise is from VAST Challenge 2022 and is processed in R using RStudio. It is assumed that the volunteer participants are representative of the city’s population.

2. Data Preparation

2.1 R packages installation

The following code chunk installs the required R packages if they are missing and loads them into the R environment.

packages = c('scales', 'viridis', 
             'lubridate', 'ggthemes', 
             'gridExtra', 'tidyverse', 
             'readxl', 'knitr',
             'data.table', 'ViSiElse','zoo','hrbrthemes')

for (p in packages) {
  # Install the package only if it cannot be loaded
  if (!require(p, character.only = TRUE)) {
    install.packages(p)
  }
  library(p, character.only = TRUE)
}

2.2 Import Datasets

This article uses the dataset provided by VAST Challenge 2022. The data files are imported from the folder into R with list.files() and saved as a tibble data frame. Since the data is too big to upload to GitHub, we first filter the rows needed for this study and then save them in rds format.
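The import step described above might look like the following minimal sketch. The helper name read_participant, the participantId column name, and the temporary demo folder are assumptions for illustration, and the two tiny CSV files are toy stand-ins for the real activity logs.

```r
library(tidyverse)

# Hypothetical helper: read every CSV file in `dir`, stack the rows into one
# tibble, and keep only the rows of one participant.
# `participantId` is an assumed column name.
read_participant <- function(dir, id) {
  files <- list.files(path = dir, pattern = "\\.csv$", full.names = TRUE)
  lapply(files, read_csv, show_col_types = FALSE) %>%
    bind_rows() %>%
    filter(participantId == id)
}

# Toy demonstration: two small files in a temporary folder (made-up values).
demo_dir <- file.path(tempdir(), "ActivityLogs_demo")
dir.create(demo_dir, showWarnings = FALSE)
write_csv(tibble(participantId = c(4, 18),
                 currentMode   = c("AtHome", "AtWork")),
          file.path(demo_dir, "log1.csv"))
write_csv(tibble(participantId = c(4, 7),
                 currentMode   = c("Transport", "AtHome")),
          file.path(demo_dir, "log2.csv"))

# Filter first, then save the much smaller result in rds format.
demo4 <- read_participant(demo_dir, 4)
write_rds(demo4, file.path(demo_dir, "participant4.rds"))
```

Filtering before saving keeps the rds file small enough to store alongside the article, which is the motivation stated above.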

2.3 Data Wrangling

Participant

The participant with ID 4 has the lowest average available balance in March, while the participant with ID 18 has the highest. Therefore, we will compare these two participants' daily routines in March.

[Participant 4]

participant4 <- read_rds("data/ActivityLogs/participant4.rds")

[Participant 18]

participant18 <- read_rds("data/ActivityLogs/participant18.rds")

2.4 Data Transformation

# Dates = day of the month; Timestamp = minutes since midnight (hour * 60 + minute)
transform_par4 <- participant4 %>%
  mutate(Dates = as.numeric(format(as.Date(timestamp,format="%m/%d/%Y"), format = "%d"))) %>%
  mutate(Timestamp = as.numeric(format(timestamp,format="%H"))*60+as.numeric(format(timestamp,format="%M"))) %>%
  select(Dates,Timestamp,currentMode,hungerStatus,sleepStatus)

# Same transformation for participant 18
transform_par18 <- participant18 %>%
  mutate(Dates = as.numeric(format(as.Date(timestamp,format="%m/%d/%Y"), format = "%d"))) %>%
  mutate(Timestamp = as.numeric(format(timestamp,format="%H"))*60+as.numeric(format(timestamp,format="%M"))) %>%
  select(Dates,Timestamp,currentMode,hungerStatus,sleepStatus)
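As a worked example of the transformation above, the sketch below converts a single timestamp into a day of the month and a minute of the day. The timestamp value and the helper to_clock are made up for illustration and are not part of the original analysis.

```r
# One example timestamp (made-up value), parsed as UTC.
ts <- as.POSIXct("2022-03-02 07:35:00", tz = "UTC")

# Day of the month, as in the Dates column.
day_of_month <- as.numeric(format(as.Date(ts), format = "%d"))

# Minutes since midnight, as in the Timestamp column: 7 * 60 + 35.
minute_of_day <- as.numeric(format(ts, format = "%H")) * 60 +
  as.numeric(format(ts, format = "%M"))

# Hypothetical helper: turn minutes since midnight back into a clock time.
to_clock <- function(m) {
  m <- as.integer(m)
  sprintf("%02d:%02d", m %/% 60L, m %% 60L)
}

day_of_month             # 2
minute_of_day            # 455
to_clock(minute_of_day)  # "07:35"
```

Storing times as minutes since midnight makes them plain numbers, which is the form ViSiElse expects for each action.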

Sleep time

Sleep_time <- transform_par4 %>%
  filter(sleepStatus %in% c('Awake','PrepareToSleep')) %>%
  group_by(Dates) %>%
  summarise(Wake_time = min(Timestamp),Gotobed_time = max(Timestamp))
Sleep_time18 <- transform_par18 %>%
  filter(sleepStatus %in% c('Awake','PrepareToSleep')) %>%
  group_by(Dates) %>%
  summarise(Wake_time = min(Timestamp),Gotobed_time = max(Timestamp))

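March 11 is an outlier for participant 4, because the participant went to sleep after 00:00, which distorts the minimum and maximum timestamps for that day. We will exclude this day from the dataset. The code also drops March 1 for both participants.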
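To see why a bedtime after midnight distorts the summary, consider the toy example below. The values are made up and are not the participant's records; only the column names follow the analysis.

```r
library(dplyr)

# Toy log: on day 11 the person is still awake shortly after midnight
# (minutes 0 and 20), sleeps, and then wakes up at minute 430.
toy <- tibble(
  Dates       = c(10, 10, 11, 11, 11, 11),
  Timestamp   = c(420, 1435, 0, 20, 430, 1370),
  sleepStatus = c("Awake", "Awake", "Awake", "PrepareToSleep",
                  "Awake", "PrepareToSleep")
)

toy_summary <- toy %>%
  filter(sleepStatus %in% c("Awake", "PrepareToSleep")) %>%
  group_by(Dates) %>%
  summarise(Wake_time = min(Timestamp), Gotobed_time = max(Timestamp))

toy_summary
# Day 11 gets Wake_time = 0 instead of 430, because the earliest "awake"
# record of that day belongs to the previous evening.
```

Dropping such a day is the simplest remedy; a more careful alternative would be to shift the day boundary, but that is outside the scope of this exercise.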

# Drop rows (note the trailing comma) for the excluded days
new_transform_par4 <- transform_par4[!transform_par4$Dates %in% c(1, 11), ]
transform_par18 <- transform_par18[transform_par18$Dates != 1, ]
Sleep_time <- new_transform_par4 %>%
  filter(sleepStatus %in% c('Awake','PrepareToSleep')) %>%
  group_by(Dates) %>%
  summarise( Wake_time = min(Timestamp),Gotobed_time = max(Timestamp) )
unique(new_transform_par4$currentMode)
[1] "AtHome"       "Transport"    "AtWork"       "AtRestaurant"
[5] "AtRecreation"

Work time

Work_time <- new_transform_par4 %>%
  filter(currentMode == "AtWork" ) %>%
  group_by(Dates) %>%
  summarise(Start_work = min(Timestamp),End_work = max(Timestamp))
Work_time18 <- transform_par18 %>%
  filter(currentMode == "AtWork" ) %>%
  group_by(Dates) %>%
  summarise(Start_work = min(Timestamp),End_work = max(Timestamp))

Transport time

Transport <- new_transform_par4 %>%
  filter(currentMode == "Transport" ) %>%
  group_by(Dates) %>%
  summarise(Leave_home = min(Timestamp),Go_home = max(Timestamp))
Transport18 <- transform_par18 %>%
  filter(currentMode == "Transport" ) %>%
  group_by(Dates) %>%
  summarise(Leave_home = min(Timestamp),Go_home = max(Timestamp))
unique(new_transform_par4$hungerStatus)
[1] "BecomingHungry" "Hungry"         "Starving"      
[4] "JustAte"        "BecameFull"    

Dining time

Dining <- new_transform_par4 %>%
  filter(hungerStatus == "JustAte" ) %>%
  group_by(Dates) %>%
  summarise(Ate_Breakfast = min(Timestamp),Ate_dinner = max(Timestamp))
Dining18 <- transform_par18 %>%
  filter(hungerStatus == "JustAte" ) %>%
  group_by(Dates) %>%
  summarise(Ate_Breakfast = min(Timestamp),Ate_dinner = max(Timestamp))
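The sleep, work, transport, and dining summaries above all follow one pattern: filter on a status, group by day, then take the first and last minute. A possible way to factor this out is sketched below. The helper first_last is hypothetical and the toy data is made up.

```r
library(dplyr)

# Hypothetical helper (not in the original analysis): first and last
# minute of the day, per date, for the rows matching `condition`.
first_last <- function(data, condition, first_name, last_name) {
  data %>%
    filter({{ condition }}) %>%
    group_by(Dates) %>%
    summarise(!!first_name := min(Timestamp),
              !!last_name  := max(Timestamp))
}

# Toy data with the same column names as the transformed logs.
toy <- tibble(
  Dates       = c(2, 2, 2, 3, 3, 3),
  Timestamp   = c(400, 480, 1020, 410, 490, 1030),
  currentMode = c("AtHome", "AtWork", "AtWork",
                  "AtHome", "AtWork", "AtWork")
)

work_toy <- first_last(toy, currentMode == "AtWork",
                       "Start_work", "End_work")
work_toy
```

With such a helper, each of the summaries would become a single call with a different condition and pair of column names.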

2.5 Data Concatenation

Daily_Routine <- full_join(Sleep_time, Work_time, by = "Dates")
Daily_Routine <- full_join(Daily_Routine, Transport, by = "Dates")
Daily_Routine <- full_join(Daily_Routine, Dining, by = "Dates")
Daily_Routine18 <- full_join(Sleep_time18, Work_time18, by = "Dates")
Daily_Routine18 <- full_join(Daily_Routine18, Transport18, by = "Dates")
Daily_Routine18 <- full_join(Daily_Routine18, Dining18, by = "Dates")

From the joined table, we can infer that the participants do not work on weekends, since those days have no work records. We only analyse the daily routine on weekdays, so we will remove all weekends and days off.

Daily_Routine <- Daily_Routine[complete.cases(Daily_Routine), ]
Daily_Routine18 <- Daily_Routine18[complete.cases(Daily_Routine18), ]
# Convert every column to non-negative numeric values
Daily_Routine <- lapply(Daily_Routine, as.numeric)
Daily_Routine <- lapply(Daily_Routine, abs)

# One row per day; the column order sets the order of actions in the plot
X <- data.frame(id = Daily_Routine$Dates,
                Daily_Routine$Wake_time, Daily_Routine$Ate_Breakfast,
                Daily_Routine$Start_work, Daily_Routine$Leave_home,
                Daily_Routine$End_work, Daily_Routine$Go_home,
                Daily_Routine$Ate_dinner, Daily_Routine$Gotobed_time)

Daily_Routine18 <- lapply(Daily_Routine18, as.numeric)
Daily_Routine18 <- lapply(Daily_Routine18, abs)

Y <- data.frame(id = Daily_Routine18$Dates,
                Daily_Routine18$Wake_time, Daily_Routine18$Ate_Breakfast,
                Daily_Routine18$Start_work, Daily_Routine18$Leave_home,
                Daily_Routine18$End_work, Daily_Routine18$Go_home,
                Daily_Routine18$Ate_dinner, Daily_Routine18$Gotobed_time)
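The join-then-filter step in this section can be illustrated on toy data. In the sketch below, all values are made up: a day without a work record gets NA after full_join(), and complete.cases() then removes it.

```r
library(dplyr)

# Toy summaries: day 5 has a wake-up time but no work record.
sleep_toy <- tibble(Dates = c(4, 5, 7), Wake_time = c(420, 540, 425))
work_toy  <- tibble(Dates = c(4, 7), Start_work = c(480, 485))

# full_join keeps every day from both tables and fills gaps with NA.
joined_toy <- full_join(sleep_toy, work_toy, by = "Dates")

# Keep only the days where every action was recorded.
working_days <- joined_toy[complete.cases(joined_toy), ]

joined_toy
working_days
```

Note that this filter removes any day with a missing action, not only weekends, so a weekday with an incomplete log would be dropped as well.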

3. Visualisation

v2 <- visielse(X, 
               informer = "mean", 
               doplot = FALSE, 
               pixel = 30)
plot(v2, 
     vp0w = 0.7, 
     unit.tps = "min", 
     scal.unit.tps = 30, 
     main = "Participant 4")

v18 <- visielse(Y, 
                informer = "mean", 
                doplot = FALSE, 
                pixel = 30)
plot(v18, 
     vp0w = 0.7, 
     unit.tps = "min", 
     scal.unit.tps = 30, 
     main = "Participant 18")
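With informer = "mean", ViSiElse summarises each action by its mean and standard deviation. As a rough numeric companion to the plots, the sketch below computes the same two statistics in base R and prints the mean as a clock time. The data frame toy_X and the helper to_hhmm are hypothetical, and the values are made up.

```r
# Toy stand-in for X: one row per day, times in minutes since midnight.
toy_X <- data.frame(
  id         = c(2, 3, 4),
  Wake_time  = c(420, 430, 410),
  Start_work = c(480, 485, 475)
)

# Hypothetical helper: minutes since midnight -> "HH:MM".
to_hhmm <- function(m) {
  m <- as.integer(round(m))
  sprintf("%02d:%02d", m %/% 60L, m %% 60L)
}

# Mean and standard deviation of every action (all columns except id).
action_means <- sapply(toy_X[-1], mean)
action_sds   <- sapply(toy_X[-1], sd)

routine_summary <- data.frame(
  action     = names(action_means),
  mean_time  = to_hhmm(action_means),
  sd_minutes = round(action_sds, 1)
)
routine_summary
```

A small standard deviation for an action indicates that it happens at nearly the same time every day, which is the kind of regularity the conclusion refers to.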

4. Conclusion

Participant 18, who has the higher balance, is well disciplined

According to the data, participant 18 does almost everything at the same time every weekday.

Both participants have regular wake-up and working schedules

Both participants wake up and go to work punctually every day.

Participant 4 spends a lot of time outside after work

According to the data, participant 4 usually does not go home directly after work.