• On the right-hand side of equation (1), S should be replaced by S_t, where S_t is the feature vector of frame t.
• All accuracies reported on the MSRC-12 dataset were obtained using the RBF kernel, not the linear kernel as stated in the paper. For this set of experiments, the authors mistakenly assumed that libsvm's default kernel was linear and did not set the kernel type explicitly; libsvm in fact defaults to the RBF kernel. The public code for these experiments already reflects this. Using the linear kernel reduces the reported accuracies by about 1%. We apologize for this mistake.
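
As a minimal sketch of how to avoid this kind of mistake, the kernel type can be stated explicitly instead of being left to the default. libsvm's `-t` option selects the kernel (0 linear, 1 polynomial, 2 RBF, 3 sigmoid, 4 precomputed) and defaults to 2. The helper `svm_train_options` and its `cost` parameter are hypothetical names used only for illustration; they are not taken from the authors' public code.

```python
# Kernel codes as documented for libsvm's "-t" option (default is 2, RBF).
LIBSVM_KERNELS = {
    "linear": 0,
    "polynomial": 1,
    "rbf": 2,
    "sigmoid": 3,
    "precomputed": 4,
}


def svm_train_options(kernel, cost=1.0):
    """Hypothetical helper: build a libsvm option string that always
    names the kernel explicitly, so the default is never relied on."""
    if kernel not in LIBSVM_KERNELS:
        raise ValueError(f"unknown kernel: {kernel!r}")
    return f"-t {LIBSVM_KERNELS[kernel]} -c {cost:g}"


# The two settings discussed in the erratum above.
linear_opts = svm_train_options("linear")
rbf_opts = svm_train_options("rbf")
print(linear_opts)
print(rbf_opts)
```

The resulting strings could be passed to `svm-train` on the command line or to a libsvm binding that accepts an option string. Omitting `-t` altogether gives the same kernel as `rbf_opts`.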