demo_EyeTrackingData.ipynb
demo_EyeTrackingData.Rmd
This notebook presents the application of the BRIL algorithm to real-world datasets, as complementary material of the article :
Brilhault A, Neuenschwander S, Rios RA, “A New Robust Multivariate Mode Estimator for Eye-tracking Calibration”, 2021, Behavior Research Methods (submitted)
The data consists of eye-tracking calibrations recorded in the context of visual studies in capuchins (carried out at the Brain Institute of the Federal University of Rio Grande de Norte, Brazil). Eye positions were recorded using the eye-tracking setup developed by Matsuda Keiji et al.. The system relies on a PointGrey Grasshopper3 infrared camera (GS3-U3-41C6NIR-C). Image acquisition was set at a frequency of 235Hz. The eye coordinate signals generated by the system were recorded through an acquisition board (National Instruments E-Series board), at a sampling frequency of 1000 samples/s.
We demonstrate the robustness of our algorithm, BRIL, in estimating the reference coordinates for each calibration target despite the highly contaminated data resulting from the subjects lack of attention or cooperation. Note that the three datasets presented in the article as Set1, Set2, and Set3 (with low, medium, and high contamination) correspond to the sessions juj011a00, ded00800, and ded005a01, respectivly.
To run this notebook interactively, follow the link below:
The following R packages are required for the execution of the notebook:
install.package("tidyverse")
)install.package("remotes")
, then the BRIL package with
remotes::install_github("adrienbrilhault/BRIL", subdir = "pkg")
)The horizontal and vertical eye-positions recorded during the
calibrations are stored in the variables x
and
y
, respectively. Each consist of a matrix, rows
corresponding to single trials, and columns to the successive
measurements within the trials (1 per millisecond, since the data was
acquired at 1000hz). Metadata
, having the same number of
rows, contains the information relative to each trial in x
and y
, namely the session, trial number, target presented,
total the number of trials and targets for the corresponding session, as
well as the durations of the recording (in ms). Note that as the
duration varies from a session to another, the tables x
and
y
were filled with NA
until the end of the
rows in shorter trials.
# Load libraries
suppressPackageStartupMessages(library(BRIL))
suppressPackageStartupMessages(library(tidyverse))
## Warning: package 'tidyverse' was built under R version 4.0.5
## Warning: package 'tibble' was built under R version 4.0.5
## Warning: package 'tidyr' was built under R version 4.0.5
## Warning: package 'dplyr' was built under R version 4.0.5
## Load data
rm(list=ls())
#load("Data/eyeTrackingCalibrations.Rdata")
load(gzcon(url('https://github.com/adrienbrilhault/BRIL/raw/master/data/eyeTrackingCalibrations.Rdata')))
## Print preview of the data
cat("Variables:\n")
## Variables:
## [1] "groundTruth" "metadata" "x" "y"
##
## metadata: data.frame ( list ), dim 240 6
print(metadata[1:5, ])
## session trial nTrials duration target nTargets
## 1 ded005a01 1 50 3000 4 5
## 2 ded005a01 2 50 3000 2 5
## 3 ded005a01 3 50 3000 5 5
## 4 ded005a01 4 50 3000 2 5
## 5 ded005a01 5 50 3000 4 5
cat("\nSessions: \n")
##
## Sessions:
## [1] "ded005a01" "ded006a00" "ded00800" "juj00300" "juj011a00"
##
## x: matrix array ( integer ), dim 240 3000
print(x[0:10])
## [1] 14199 21703 22505 22514 17203 18172 22400 21573 -32768 20824
##
## y: matrix array ( integer ), dim 240 3000
print(y[0:10])
## [1] -2467 -10002 1037 -9699 -6936 607 834 -9940 -32768 -8358
Since one calibration point is drawn randomly within the 5 calibration targets at each trial, the number of trials per target within one session may vary. The table below shows the characteristics of each session, with the number of samples collected for each of the targets, as well as the number of valid samples, once incorrect positions have been filtered out (these being the result of eye blinks, artifacts, or saturated values from the analog acquisition board).
summary <- metadata %>%
select(session, duration, nTrials, nTargets, target) %>%
arrange(session, target) %>%
distinct_all()
summary$nTrialsTarget <- NA
summary$nSamples <- NA
summary$nValidSamples <- NA
for (i in 1:nrow(summary)) {
selectedIndices <- metadata$session == summary$session[i] &
metadata$target == summary$target[i]
df <- data.frame(
X = as.vector(t(x[selectedIndices, ])),
Y = as.vector(t(y[selectedIndices, ]))
)
# Remove trailing NA fill
df <- df[!is.na(df$X) | !is.na(df$Y), ]
summary$nSamples[i] <- nrow(df)
# Remove overflowed values and error codes from eyetracker
df <- df[df$X != -32768 & df$Y != -32768, ]
df <- df[df$X != 32767 & df$Y != 32767, ]
summary$nTrialsTarget[i] <- sum(selectedIndices)
summary$nValidSamples[i] <- nrow(df)
}
rownames(summary) <- NULL
colnames(summary) <- c(
"Session", "Duration", "TotalTrials", "TotalTargets",
"Target", "TargetTrials", "Samples", "ValidSamples"
)
data.frame(summary)
## Session Duration TotalTrials TotalTargets Target TargetTrials Samples
## 1 ded005a01 3000 50 5 1 6 18000
## 2 ded005a01 3000 50 5 2 8 24000
## 3 ded005a01 3000 50 5 3 10 30000
## 4 ded005a01 3000 50 5 4 19 57000
## 5 ded005a01 3000 50 5 5 7 21000
## 6 ded006a00 2200 40 5 1 9 19800
## 7 ded006a00 2200 40 5 2 4 8800
## 8 ded006a00 2200 40 5 3 8 17600
## 9 ded006a00 2200 40 5 4 8 17600
## 10 ded006a00 2200 40 5 5 11 24200
## 11 ded00800 1200 50 5 1 12 14400
## 12 ded00800 1200 50 5 2 10 12000
## 13 ded00800 1200 50 5 3 7 8400
## 14 ded00800 1200 50 5 4 11 13200
## 15 ded00800 1200 50 5 5 10 12000
## 16 juj00300 1700 50 5 1 7 11900
## 17 juj00300 1700 50 5 2 8 13600
## 18 juj00300 1700 50 5 3 12 20400
## 19 juj00300 1700 50 5 4 12 20400
## 20 juj00300 1700 50 5 5 11 18700
## 21 juj011a00 1700 50 5 1 7 11900
## 22 juj011a00 1700 50 5 2 14 23800
## 23 juj011a00 1700 50 5 3 12 20400
## 24 juj011a00 1700 50 5 4 6 10200
## 25 juj011a00 1700 50 5 5 11 18700
## ValidSamples
## 1 15607
## 2 21318
## 3 26602
## 4 46419
## 5 18536
## 6 16047
## 7 7353
## 8 14841
## 9 14875
## 10 20252
## 11 11122
## 12 11016
## 13 7637
## 14 12127
## 15 11022
## 16 11850
## 17 13077
## 18 20258
## 19 20371
## 20 18670
## 21 11900
## 22 23175
## 23 20400
## 24 8951
## 25 18700
Process one sample distribution, pooling all the trials for a specific session and target. Filter the incorrect sample, then apply the BRIL algorithm to estimate the main mode.
set.seed(123)
# Select a session and target number
session <- "ded005a01"
target <- 1
# Filter the corresponding trials
selectedIndices <- metadata$session == session & metadata$target == target
df <- data.frame(
X = as.vector(t(x[selectedIndices, ])),
Y = as.vector(t(y[selectedIndices, ]))
)
# Remove trailing NA fill
df <- df[!is.na(df$X) | !is.na(df$Y), ]
# Remove overflowed values and error codes from eyetracker
df <- df[df$X != -32768 & df$Y != -32768, ]
df <- df[df$X != 32767 & df$Y != 32767, ]
# Draw 1000 random samples and estimate the mode with the BRIL method
df <- sample_n(df, 1000)
res <- bril(df, method = "Projection", testNormal = "Mardia", maxIterations = 5)
# Plot the result
options(jupyter.plot_mimetypes = "image/svg+xml")
theme_set(theme_bw(base_size = 16))
ggplot(df, aes(x = X, y = Y)) +
geom_point(size = 2, shape = 1) +
geom_point(aes(x = res$mode[1], y = res$mode[2]),
colour = "red",
shape = 3, size = 3, stroke = 2
) +
coord_fixed(ratio = 1, expand = TRUE)
Apply the same procedure as above for each of the targets of one or several sessions, plotting the final result with each sample displayed in a color corresponding to the target that was presented at the time of the recording.
# Select one or several sessions to process
sessions <- c("ded005a01")
# Analyze each session
for (session in sessions) {
dfCalibration <- data.frame()
dfModes <- c()
# Iterate through each target within the session
for (target in sort(unique(metadata$target[metadata$session == session]))) {
# Filter the corresponding trials
selectedIndices <- metadata$session == session & metadata$target == target
df <- data.frame(
X = as.vector(t(x[selectedIndices, ])),
Y = as.vector(t(y[selectedIndices, ]))
)
# Remove trailing NA fill
df <- df[!is.na(df$X) | !is.na(df$Y), ]
# Remove overflowed values and error codes from eye-tracker
df <- df[df$X != -32768 & df$Y != -32768, ]
df <- df[df$X != 32767 & df$Y != 32767, ]
# Draw 1000 random samples and estimate the mode
df <- sample_n(df, 1000)
res <- bril(df, method = "L2", testNormal = "Mardia", maxIterations = 5)
# Update structures
df$Target <- paste("Target", target)
dfCalibration <- rbind(dfCalibration, df)
dfModes <- rbind(dfModes, res$mode)
}
# Plot samples and mode
dfModes <- data.frame(dfModes)
colnames(dfModes) <- c("X", "Y")
rownames(dfModes) <- NULL
g <- ggplot(dfCalibration, aes(x = X, y = Y, color = Target)) +
geom_point(size = 1, shape = 1) +
geom_point(
data = dfModes, aes(x = X, y = Y),
colour = "black", shape = 3, size = 2, stroke = 2
) +
coord_fixed(ratio = 1, expand = TRUE) +
ggtitle(paste("Session",session))
print(g)
# save image
if (FALSE) {
ggsave(g, filename = paste0("calibration_", session, ".jpg"), width = 7, height = 6)
}
}