This notebook presents the application of the BRIL algorithm to real-world datasets, as complementary material of the article :

Brilhault A, Neuenschwander S, Rios RA, “A New Robust Multivariate Mode Estimator for Eye-tracking Calibration”, 2021, Behavior Research Methods (submitted)

The data consists of eye-tracking calibrations recorded in the context of visual studies in capuchins (carried out at the Brain Institute of the Federal University of Rio Grande de Norte, Brazil). Eye positions were recorded using the eye-tracking setup developed by Matsuda Keiji et al.. The system relies on a PointGrey Grasshopper3 infrared camera (GS3-U3-41C6NIR-C). Image acquisition was set at a frequency of 235Hz. The eye coordinate signals generated by the system were recorded through an acquisition board (National Instruments E-Series board), at a sampling frequency of 1000 samples/s.

We demonstrate the robustness of our algorithm, BRIL, in estimating the reference coordinates for each calibration target despite the highly contaminated data resulting from the subjects lack of attention or cooperation. Note that the three datasets presented in the article as Set1, Set2, and Set3 (with low, medium, and high contamination) correspond to the sessions juj011a00, ded00800, and ded005a01, respectivly.

To run this notebook interactively, follow the link below:

Binder

Requirements

The following R packages are required for the execution of the notebook:

Libraries and dataset loading

The horizontal and vertical eye-positions recorded during the calibrations are stored in the variables x and y, respectively. Each consist of a matrix, rows corresponding to single trials, and columns to the successive measurements within the trials (1 per millisecond, since the data was acquired at 1000hz). Metadata, having the same number of rows, contains the information relative to each trial in x and y, namely the session, trial number, target presented, total the number of trials and targets for the corresponding session, as well as the durations of the recording (in ms). Note that as the duration varies from a session to another, the tables x and y were filled with NA until the end of the rows in shorter trials.

## Warning: package 'tidyverse' was built under R version 4.0.5
## Warning: package 'tibble' was built under R version 4.0.5
## Warning: package 'tidyr' was built under R version 4.0.5
## Warning: package 'dplyr' was built under R version 4.0.5
## Load data
rm(list=ls())
#load("Data/eyeTrackingCalibrations.Rdata")
load(gzcon(url('https://github.com/adrienbrilhault/BRIL/raw/master/data/eyeTrackingCalibrations.Rdata')))

## Print preview of the data
cat("Variables:\n")
## Variables:
## [1] "groundTruth" "metadata"    "x"           "y"
cat("\nmetadata: ", class(metadata), "(", typeof(metadata), "), dim", dim(metadata), " \n")
## 
## metadata:  data.frame ( list ), dim 240 6
print(metadata[1:5, ])
##     session trial nTrials duration target nTargets
## 1 ded005a01     1      50     3000      4        5
## 2 ded005a01     2      50     3000      2        5
## 3 ded005a01     3      50     3000      5        5
## 4 ded005a01     4      50     3000      2        5
## 5 ded005a01     5      50     3000      4        5
cat("\nSessions: \n")
## 
## Sessions:
print(sort(unique(metadata$session)))
## [1] "ded005a01" "ded006a00" "ded00800"  "juj00300"  "juj011a00"
cat("\nx: ", class(x), "(", typeof(x), "), dim", dim(x), " \n")
## 
## x:  matrix array ( integer ), dim 240 3000
print(x[0:10])
##  [1]  14199  21703  22505  22514  17203  18172  22400  21573 -32768  20824
cat("\ny: ", class(y), "(", typeof(y), "), dim", dim(y), " \n")
## 
## y:  matrix array ( integer ), dim 240 3000
print(y[0:10])
##  [1]  -2467 -10002   1037  -9699  -6936    607    834  -9940 -32768  -8358

Summary statistics

Since one calibration point is drawn randomly within the 5 calibration targets at each trial, the number of trials per target within one session may vary. The table below shows the characteristics of each session, with the number of samples collected for each of the targets, as well as the number of valid samples, once incorrect positions have been filtered out (these being the result of eye blinks, artifacts, or saturated values from the analog acquisition board).

summary <- metadata %>%
  select(session, duration, nTrials, nTargets, target) %>%
  arrange(session, target) %>%
  distinct_all()

summary$nTrialsTarget <- NA
summary$nSamples <- NA
summary$nValidSamples <- NA

for (i in 1:nrow(summary)) {
  selectedIndices <- metadata$session == summary$session[i] &
    metadata$target == summary$target[i]

  df <- data.frame(
    X = as.vector(t(x[selectedIndices, ])),
    Y = as.vector(t(y[selectedIndices, ]))
  )

  # Remove trailing NA fill
  df <- df[!is.na(df$X) | !is.na(df$Y), ]
  summary$nSamples[i] <- nrow(df)
  # Remove overflowed values and error codes from eyetracker
  df <- df[df$X != -32768 & df$Y != -32768, ]
  df <- df[df$X != 32767 & df$Y != 32767, ]

  summary$nTrialsTarget[i] <- sum(selectedIndices)
  summary$nValidSamples[i] <- nrow(df)
}
rownames(summary) <- NULL
colnames(summary) <- c(
  "Session", "Duration", "TotalTrials", "TotalTargets",
  "Target", "TargetTrials", "Samples", "ValidSamples"
)

data.frame(summary)
##      Session Duration TotalTrials TotalTargets Target TargetTrials Samples
## 1  ded005a01     3000          50            5      1            6   18000
## 2  ded005a01     3000          50            5      2            8   24000
## 3  ded005a01     3000          50            5      3           10   30000
## 4  ded005a01     3000          50            5      4           19   57000
## 5  ded005a01     3000          50            5      5            7   21000
## 6  ded006a00     2200          40            5      1            9   19800
## 7  ded006a00     2200          40            5      2            4    8800
## 8  ded006a00     2200          40            5      3            8   17600
## 9  ded006a00     2200          40            5      4            8   17600
## 10 ded006a00     2200          40            5      5           11   24200
## 11  ded00800     1200          50            5      1           12   14400
## 12  ded00800     1200          50            5      2           10   12000
## 13  ded00800     1200          50            5      3            7    8400
## 14  ded00800     1200          50            5      4           11   13200
## 15  ded00800     1200          50            5      5           10   12000
## 16  juj00300     1700          50            5      1            7   11900
## 17  juj00300     1700          50            5      2            8   13600
## 18  juj00300     1700          50            5      3           12   20400
## 19  juj00300     1700          50            5      4           12   20400
## 20  juj00300     1700          50            5      5           11   18700
## 21 juj011a00     1700          50            5      1            7   11900
## 22 juj011a00     1700          50            5      2           14   23800
## 23 juj011a00     1700          50            5      3           12   20400
## 24 juj011a00     1700          50            5      4            6   10200
## 25 juj011a00     1700          50            5      5           11   18700
##    ValidSamples
## 1         15607
## 2         21318
## 3         26602
## 4         46419
## 5         18536
## 6         16047
## 7          7353
## 8         14841
## 9         14875
## 10        20252
## 11        11122
## 12        11016
## 13         7637
## 14        12127
## 15        11022
## 16        11850
## 17        13077
## 18        20258
## 19        20371
## 20        18670
## 21        11900
## 22        23175
## 23        20400
## 24         8951
## 25        18700

Filter data for a given session and target

Process one sample distribution, pooling all the trials for a specific session and target. Filter the incorrect sample, then apply the BRIL algorithm to estimate the main mode.

set.seed(123)

# Select a session and target number
session <- "ded005a01"
target <- 1

# Filter the corresponding trials
selectedIndices <- metadata$session == session & metadata$target == target
df <- data.frame(
  X = as.vector(t(x[selectedIndices, ])),
  Y = as.vector(t(y[selectedIndices, ]))
)

# Remove trailing NA fill
df <- df[!is.na(df$X) | !is.na(df$Y), ]

# Remove overflowed values and error codes from eyetracker
df <- df[df$X != -32768 & df$Y != -32768, ]
df <- df[df$X != 32767 & df$Y != 32767, ]

# Draw 1000 random samples and estimate the mode with the BRIL method
df <- sample_n(df, 1000)
res <- bril(df, method = "Projection", testNormal = "Mardia", maxIterations = 5)

# Plot the result
options(jupyter.plot_mimetypes = "image/svg+xml")
theme_set(theme_bw(base_size = 16))


ggplot(df, aes(x = X, y = Y)) +
  geom_point(size = 2, shape = 1) +
  geom_point(aes(x = res$mode[1], y = res$mode[2]),
    colour = "red",
    shape = 3, size = 3, stroke = 2
  ) +
  coord_fixed(ratio = 1, expand = TRUE)

Process a complete calibration procedure

Apply the same procedure as above for each of the targets of one or several sessions, plotting the final result with each sample displayed in a color corresponding to the target that was presented at the time of the recording.

# Select one or several sessions to process
sessions <- c("ded005a01")

# Analyze each session
for (session in sessions) {
  dfCalibration <- data.frame()
  dfModes <- c()

  # Iterate through each target within the session
  for (target in sort(unique(metadata$target[metadata$session == session]))) {

    # Filter the corresponding trials
    selectedIndices <- metadata$session == session & metadata$target == target
    df <- data.frame(
      X = as.vector(t(x[selectedIndices, ])),
      Y = as.vector(t(y[selectedIndices, ]))
    )

    # Remove trailing NA fill
    df <- df[!is.na(df$X) | !is.na(df$Y), ]

    # Remove overflowed values and error codes from eye-tracker
    df <- df[df$X != -32768 & df$Y != -32768, ]
    df <- df[df$X != 32767 & df$Y != 32767, ]

    # Draw 1000 random samples and estimate the mode
    df <- sample_n(df, 1000)
    res <- bril(df, method = "L2", testNormal = "Mardia", maxIterations = 5)

    # Update structures
    df$Target <- paste("Target", target)
    dfCalibration <- rbind(dfCalibration, df)
    dfModes <- rbind(dfModes, res$mode)
  }

  # Plot samples and mode
  dfModes <- data.frame(dfModes)
  colnames(dfModes) <- c("X", "Y")
  rownames(dfModes) <- NULL

  g <- ggplot(dfCalibration, aes(x = X, y = Y, color = Target)) +
    geom_point(size = 1, shape = 1) +
    geom_point(
      data = dfModes, aes(x = X, y = Y),
      colour = "black", shape = 3, size = 2, stroke = 2
    ) +
    coord_fixed(ratio = 1, expand = TRUE) +
    ggtitle(paste("Session",session))
  print(g)

  # save image
  if (FALSE) {
    ggsave(g, filename = paste0("calibration_", session, ".jpg"), width = 7, height = 6)
  }
}