Friday, April 3, 2015

below-chance classification accuracy: meaningful here?

Below-chance accuracy ... always exciting. It showed up today in an interesting way in a set of control analyses. For framing, this is a bit of the working memory task data from the HCP (Human Connectome Project): 0-back and 2-back task blocks, using pictures of faces or places as the stimuli. This gives four examples (volumetric parameter estimate images) per person: 0-back with faces, 0-back with places, 2-back with faces, and 2-back with places.

The classification shown here was with a linear SVM, two classes (all balanced, so chance is 0.5), with leave-16-subjects-out cross-validation. The cross-validation is a bit unusual, since we're aiming for generalizability across people: I trained the classifiers on one pair of stimulus types in the training people, then tested on a different pair in the left-out people. For example, one cross-validation fold is training 0-back vs 2-back with the face stimuli in 144 people, then testing 0-back vs 2-back with the place stimuli in the 16 left-out people.
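To make the fold structure concrete, here's a minimal sketch of one such fold in Python with scikit-learn. The array names, sizes, and random stand-in data are mine for illustration, not the actual HCP images or analysis code:

```python
import numpy as np
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)
n_subs, n_vox = 160, 500     # 144 training + 16 test people; ROI size is made up

# one parameter estimate image per person per condition; random stand-in data
face_0back = rng.normal(size=(n_subs, n_vox))
face_2back = rng.normal(size=(n_subs, n_vox))
place_0back = rng.normal(size=(n_subs, n_vox))
place_2back = rng.normal(size=(n_subs, n_vox))

test_subs = np.arange(16)                        # the 16 left-out people
train_subs = np.setdiff1d(np.arange(n_subs), test_subs)

# train 0-back vs 2-back on the face images of the training people ...
X_train = np.vstack([face_0back[train_subs], face_2back[train_subs]])
y_train = np.repeat([0, 1], len(train_subs))     # 0 = 0-back, 1 = 2-back

# ... then test 0-back vs 2-back on the place images of the left-out people
X_test = np.vstack([place_0back[test_subs], place_2back[test_subs]])
y_test = np.repeat([0, 1], len(test_subs))

clf = LinearSVC(C=1.0).fit(X_train, y_train)
print("fold accuracy:", clf.score(X_test, y_test))
```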

Anyway, I ran an ROI-based classification analysis on six anatomic clusters (C1:C6), which are along the x-axis in the graphs. The left-side graph shows a positive control-type analysis: as we expected, face vs place is classified extremely well with these ROIs, but 0-back vs 2-back is classified at chance. The right-side graph shows some nonsensical, negative control-type analyses, all of which we expected to classify around chance. These are nonsense because we're training and testing on different classifications: for example, training a classifier to distinguish face vs place, then testing with all face stimuli, some of which were from 0-back blocks and others from 2-back blocks.


The striking pattern is that the blue and green lines are quite far below chance, particularly in clusters C1 and C2, which classified face vs place nearly perfectly (i.e., in the left-side graph).

These ROIs classify face vs place very well. When trained on face vs place but tested with 0-back vs 2-back (red and purple lines), they classified basically at chance. This makes sense: the classifiers learned meaningful ways to distinguish face and place during training, but were then tested with all face or all place stimuli, which they presumably classified according to picture type, not n-back. This gives a confusion matrix like the one shown here: all face test examples properly classified as face, for 10/20 = 0.5 accuracy.
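In code, that at-chance confusion matrix works out like this. The counts of 10 test examples per class and the mapping of the "face" output to the 0-back label slot are my illustration, not the actual fold sizes or scoring code:

```python
import numpy as np

# rows: true test label (0-back face, 2-back face)
# cols: predicted class ("face" maps to the 0-back slot, "place" to the 2-back slot)
conf = np.array([[10, 0],    # 0-back face: labeled "face", scored correct
                 [10, 0]])   # 2-back face: also labeled "face", scored wrong

accuracy = np.trace(conf) / conf.sum()
print(accuracy)   # 10/20 = 0.5, exactly chance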

Now, the classifiers did not learn to classify 0-back vs 2-back in these ROIs (left-side graph above). To get the observed below-chance accuracies, the confusion matrices would need to be something like this. Why? It is sort of plausible that the classifier could "know" that the place test examples are not face, and so split them equally between the two classes. But if the classifier properly classifies all the 2-back face examples here (as needed to get the below-chance accuracy), why wasn't 0-back vs 2-back properly classified before?
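I can't reproduce that figure here, but one matrix consistent with the logic above (the numbers are mine, purely illustrative) would have the place examples split evenly while the 2-back face examples all land in the trained "2-back" class, which maps to the wrong label slot in this test:

```python
import numpy as np

# rows: true test label; cols: predicted class
# 2-back face examples all assigned to the trained "2-back" class, which is
# scored as the wrong label here; place examples split evenly between classes
conf = np.array([[0, 10],
                 [5,  5]])

accuracy = np.trace(conf) / conf.sum()
print(accuracy)   # 5/20 = 0.25, well below chance
```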

I'll keep looking at this, and save some of the actual confusion matrices to see exactly how the below-chance accuracies are being generated. It's not quite clear to me yet, but the striking pattern in the below-chance accuracies here makes me think that they might actually carry some meaning in this case, and perhaps give some more general insights. Any thoughts?
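For the record-keeping, here's a sketch of how the per-fold confusion matrices could be saved, extending the single-fold example above into a loop; as before, the data and sizes are hypothetical stand-ins:

```python
import numpy as np
from sklearn.metrics import confusion_matrix
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)
n_subs, n_vox = 160, 500                       # hypothetical sizes, as above
face_0b, face_2b, place_0b, place_2b = rng.normal(size=(4, n_subs, n_vox))

fold_confusions = []
for fold in range(n_subs // 16):               # leave-16-subjects-out folds
    test = np.arange(16) + 16 * fold
    train = np.setdiff1d(np.arange(n_subs), test)

    # train 0-back vs 2-back on faces, test 0-back vs 2-back on places
    X_train = np.vstack([face_0b[train], face_2b[train]])
    y_train = np.repeat([0, 1], len(train))
    X_test = np.vstack([place_0b[test], place_2b[test]])
    y_test = np.repeat([0, 1], len(test))

    y_pred = LinearSVC(C=1.0).fit(X_train, y_train).predict(X_test)
    fold_confusions.append(confusion_matrix(y_test, y_pred, labels=[0, 1]))

# the summed matrix shows which cells produce any below-chance accuracy
print(np.sum(fold_confusions, axis=0))
```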
