Sunday, December 8, 2013

Are juries good at spotting liars? (Part One)

A scene from Lie to Me (Fox)

Regular readers of this blog will know that I’m interested in the use of brain-scanning technologies in forensic settings, and in particular in the use of brain-based lie detection tests and memory detection tests. As I have mentioned before, there has been a long-standing judicial reluctance to admit the results of such tests as evidence in a legal trial. This is most noticeable in the United States where, from its earliest incarnation, the polygraph lie detector test has been deemed inadmissible before federal courts (despite its consistent popularity among private and government employers).

For the most part, this judicial reluctance has been well-justified: lie detector tests have serious problems, not least of which is the high proportion of false positives to which they give rise. Still, the reluctance has been unfortunate in other respects, the main one being its failure to appreciate the distinction between different types of test format: the “classic” control question test vs. the concealed information test. The latter is a much more reliable test format with altogether more impressive results, particularly in its recent P300-based incarnation. (If you want to know what this is, I suggest reading one of my articles on the topic, but rest assured nothing in this blog post depends on knowing more about the P300 test.)

In light of this, it is no great surprise to find that certain authors and theorists have recently begun questioning the continuing judicial reluctance to admit this kind of evidence. I’ve discussed one of them before on the blog (Frederick Schauer). Today I want to discuss another: John B. Meixner. Meixner is both a psychophysiologist and a lawyer, involved in some of the leading empirical research into the P300 concealed information test, as well as in the debate about the legal ramifications of this research. In a recent paper entitled “Liar, Liar, Jury’s the Trier?”, Meixner argues that, although the P300 concealed information test may not yet be ready for use in legal trials, courts should not develop legal rules that would always block its use. This is because the P300 test might be far better than a typical jury in determining the credibility of a witness.

Over the next two posts, I want to consider the evidence Meixner marshals in support of this thesis. On the whole, I am inclined to agree with him about the unreliability of the typical jury in determining credibility. Consequently, these posts are really just an excuse to summarise all the interesting empirical studies Meixner discusses in his paper. Nevertheless, I will start with some critical reflections on the argumentative framework lurking behind his central thesis.


1. Comparative Truthiness and the Admissibility of Evidence
Meixner presents his central thesis in a particular dialectical context. One of the peculiarities of the legal trial and its associated evidential rules is that expert evidence (which includes scientific forensic evidence) is only supposed to be used when the court strays beyond its areas of competence and needs expert help in resolving factual questions. This means that there are certain factual questions that remain within the exclusive prerogative of the jury. A recent trend in the jurisprudence on brain-based lie detection (see the Wilson case) is to deny the admissibility of this evidence on the grounds that it tries to answer one of those questions. The jury, it is argued, should have the exclusive prerogative to determine whether a witness is credible or not. Meixner challenges this trend by arguing that juries aren’t very good at determining witness credibility.

This dialectic interests me. One way of deciding which kinds of evidence should be admitted before a court of law is to look at their comparative truthiness. I borrow the term from Stephen Colbert. “Truthiness” means, roughly, the likelihood of the evidence getting you to the truth of the factual matter being disputed before the court. “Comparative truthiness”, then, refers to whether the admission of the evidence is more likely to get you to the truth than the alternative (e.g. leaving the matter to the jury). If comparative truthiness were the sole criterion by which to determine the admissibility of evidence, then Meixner’s argument would make a lot of sense. After all, if that were true, leaving the matter within the exclusive prerogative of the jury would only be justified if the jury were more likely to get at the truth than the lie detection evidence. In other words:


  • (1) An evidential rule can only be justified at the expense of an alternative rule if it is more likely to get at the truth than the alternative rule.
  • (2) The evidential rule that leaves credibility assessments as the exclusive prerogative of the jury is not more likely to get at the truth than the alternative rule that (possibly) admits lie detection evidence.
  • (3) Therefore, the evidential rule that leaves credibility assessments to the jury cannot be justified.


[Interpretive note: you may wonder why I have phrased the argument in this awkward “negative” fashion (i.e. in terms of what cannot be justified rather than what can be justified). The reason for this is that phrasing it in a more positive way would incline us to the conclusion that lie detection evidence should be admitted. But that’s not quite what Meixner argues. He only argues that the courts shouldn’t create a rule that always and everywhere prevents the admission of lie detection evidence. Furthermore, Meixner doesn’t even offer a complete defence of premise (2) in his article; he simply offers evidence that could be used to defend it.]
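
For readers who like to see the logical skeleton, here is a minimal Lean 4 sketch of the argument. It only checks that the conclusion follows from the two premises (it is a simple modus tollens); it says nothing about whether the premises are true. The names `Rule`, `Justified`, `MoreTruthful`, `juryOnly` and `admitLD` are my own illustrative labels, not anything taken from Meixner’s paper.

```lean
-- Hypothetical formalisation: all names are illustrative assumptions.
-- `Justified r alt`    : rule r is justified at the expense of rule alt.
-- `MoreTruthful r alt` : rule r is more likely to get at the truth than alt.
theorem jury_rule_not_justified
    (Rule : Type)
    (Justified MoreTruthful : Rule → Rule → Prop)
    (juryOnly admitLD : Rule)
    -- Premise (1): justification requires greater comparative truthiness.
    (p1 : ∀ r alt : Rule, Justified r alt → MoreTruthful r alt)
    -- Premise (2): the jury-only rule is not more truthful than the alternative.
    (p2 : ¬ MoreTruthful juryOnly admitLD) :
    -- Conclusion (3): the jury-only rule cannot be justified.
    ¬ Justified juryOnly admitLD :=
  fun h => p2 (p1 juryOnly admitLD h)
```

Since the inference itself is valid, any disagreement has to target one of the premises.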

The problem with the comparative truthiness argument is that it’s not clear that truthiness should be the sole criterion against which evidential rules are assessed. A trial can be justified in terms of its outputs, or its processes, or some combination of both. It could be that leaving credibility assessments up to the jury has additional virtues (e.g. it is more participative, less technocratic) that should be weighed against truthiness. This isn’t something that Meixner considers in any depth in his article (he mentions the problem in fn 79), but it seems like it would be necessary before we decide on the admissibility question. I’ll say no more about this now, but it might be worth bearing in mind as we proceed to consider the meat of Meixner’s argument.


2. Juries and Demeanour Evidence
It is commonly believed that credibility can be inferred from demeanour. The stereotype is usually that the calm, confident and assured witness is the truth-teller, whereas the shifty-eyed, twitchy and perspiring witness is the liar. More recently, the belief that microexpressions and other behavioural cues can be used to effectively determine credibility has crept into the collective consciousness, thanks in no small part to the work of Paul Ekman and the television series Lie to Me. Given all this, we might be inclined to think that juries could reliably assess credibility based solely on demeanour, either because everyone has the knack for doing this anyway, or because they can be trained to do so in a relatively short period of time.

But as Meixner argues, the empirical evidence does not seem to support this view. Time after time, experimental studies of the ability to use demeanour cues to determine credibility reveal that most people aren’t very good at it, even if they have been trained in the basic techniques of behavioural lie detection. Consider the following four studies (which provide a small snapshot of what is out there):

Ekman and O’Sullivan 1991: This study tested subjects’ ability to determine credibility based on demeanour cues. It compared the performance of students, psychiatrists, judges, robbery investigators, federal polygraphers and Secret Service agents. Subjects had to watch ten one-minute videos in which college-aged women described their feelings about a nature film they were watching. Half of the women were actually watching a nature film; the other half were watching a gruesome film and were forced to lie about their feelings. It was found that for all groups, except Secret Service agents, accuracy was just above chance (ranging from 52% to 57%). Secret Service agents achieved accuracy rates of 64%. The latter result suggested that individuals could be trained to interpret demeanour cues.
Kassin and Fong 1999: This was the first in a series of studies, each of which had similar results. Here, eight student subjects were instructed to commit a mock crime, while another eight were instructed to commit a related innocent act. The students then underwent a five-minute interrogation and all of them were instructed to lie during the interrogation. These interrogations were videotaped and played to other students who had to guess who was telling the truth and who was lying. Half of those who watched the tapes were trained for thirty minutes in something known as the Reid Technique (the leading method for law enforcement officials) prior to viewing them. The results were fascinating. Those who had not been trained achieved an accuracy rate of 52%, whereas those who had been trained achieved an accuracy rate of 45%. This was despite the fact that those who had been trained claimed to be much more confident about their guesses than those who had not been.
Meissner and Kassin 2002: This was one of the follow-up studies to the previous one. It used the same protocol, but this time played the videos to highly trained and experienced police investigators. Interestingly, it was found that the police investigators were, like the students, no better than chance in determining who was telling the truth and who was lying. Again, this was despite the fact that the investigators were much more confident about their guesses than even the trained students had been in the previous study. One other point that emerged in this study (and indeed in the previous one) was that training seemed to lead to more false positives. In other words, trainees seemed to be biased in favour of thinking that people were lying to them.
Bond and DePaulo 2006: This is the most comprehensive meta-analysis of behavioural deception detection to date. It looked at the results from 206 studies, covering a total of 24,000 subjects. It found that the average accuracy rate for determining credibility based on demeanour cues was 54%. The meta-analysis did not, however, include tests involving subjects who had been trained.
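
The point about false positives can be made concrete with a little arithmetic. The following is a minimal Python sketch, under the strong simplifying assumption (mine, not the studies’) that a judge’s verdicts are completely unrelated to the truth and that half of the witnesses are lying. The function name `judge_outcomes` and the bias values are hypothetical and are not taken from any of the papers above.

```python
def judge_outcomes(lie_bias, share_liars=0.5):
    """Expected outcomes for a judge whose verdicts are unrelated to the truth.

    lie_bias:    probability the judge calls any given witness a liar
                 (hypothetical parameter).
    share_liars: proportion of witnesses who are actually lying
                 (assumed to be one half by default).
    """
    # A verdict is correct when "liar" is said of a liar
    # or "truthful" is said of a truth-teller.
    accuracy = lie_bias * share_liars + (1 - lie_bias) * (1 - share_liars)
    # A false positive is a truth-teller who is called a liar; since the
    # verdict ignores the truth, this happens with probability lie_bias.
    false_positive_rate = lie_bias
    return accuracy, false_positive_rate


# Illustrative bias values only.
for bias in (0.3, 0.5, 0.7):
    accuracy, fpr = judge_outcomes(bias)
    print(f"lie bias {bias:.0%}: accuracy {accuracy:.0%}, false positives {fpr:.0%}")
```

On these assumptions, accuracy stays at chance whatever the bias, while the share of truth-tellers wrongly labelled as liars rises in step with it. That is one way in which training could make things worse without making anyone visibly less accurate.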

The results here are pretty stark. Assuming that the average jury will not be extensively trained in behavioural deception detection, we can be pretty confident that they will be little better than chance in determining whether a witness is credible or not. Furthermore, even if they are trained, this may not lead to greater accuracy. They may simply become more confident in their determinations, which would be a worrying development. This seems especially likely if the amount of training they receive is limited (like the 30 minutes in the Kassin and Fong study), which is probably all we could expect given the time and resource constraints involved in conducting a legal trial.

All of which would seem to support Meixner’s central thesis: there is no reason to create an evidential rule that leaves credibility assessment entirely up to the jury.

That said, these studies only looked at accuracy based on demeanour cues. It is possible that juries could determine credibility by looking at other factors, e.g. the consistency of a witness’ story. We’ll look at evidence in relation to factors of this sort in part two.
