Tuesday, 2 January 2018

Perceptions of Probability in Genealogy

What do people understand when they're told that something is probably, or possibly, or likely true?
This Perceptions of Probability graphic is based on the responses of 46 people on the Reddit social network to the question: "What probability would you assign to the phrase '[phrase]'?" The phrases are on the vertical axis, the probabilities on the horizontal.

Compare these to a logical hierarchy of levels of confidence suggested by Elizabeth Shown Mills in Evidence Explained (2nd Edition, 2012, pages 19-20): certainly, probably, possibly, likely, apparently, and perhaps.

Possibly, apparently, and perhaps do not appear in the graphic. "Certainly", at the top of Mills' list, is comparable to "almost certainly" at the top of the Perceptions of Probability graphic. "Likely" and "probably" appear in reverse order in the two lists. The Reddit sample shows wider variation in the understanding of "probably" than of "likely", and the mode of the distribution falls at a higher assigned probability for "probably" than for "likely".
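For readers who want to check a comparison like this against raw poll answers, here is a minimal sketch in Python. It assumes the answers for each phrase are available as a list of percentages. The numbers below are made-up placeholders, not the Reddit data, and the helper name summarize is hypothetical. It reports the most common answer (the mode) and the interquartile range as one simple measure of how widely the answers vary.

```python
from statistics import multimode, quantiles


def summarize(responses):
    """Return (modes, interquartile range) for a list of percentage answers."""
    q1, _, q3 = quantiles(responses, n=4)  # quartile cut points
    return multimode(responses), q3 - q1


# Made-up placeholder answers (percent). These are NOT the Reddit poll data.
sample = {
    "probably": [60, 70, 75, 75, 80, 90],
    "likely": [65, 70, 70, 70, 75, 80],
}

for phrase, answers in sample.items():
    modes, iqr = summarize(answers)
    print(f"{phrase}: mode(s) = {modes}, interquartile range = {iqr}")
```

A larger interquartile range for one phrase than another would indicate less agreement among respondents about what that phrase means.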

In view of the evident disparity in understanding of the descriptive terms, clients of professional genealogists should be aware, and proactively made aware, that there is no consistently agreed hierarchy. It would be a contribution if some competent authority for genealogy would define such a set of terms, with equivalent probability ranges, aligned as closely as possible with common understanding.


Toni said...

But only the participating genealogy community would understand/use it. I think the general population would keep their own opinion of the word.

Elizabeth Shown Mills said...

John, just to clarify, EE's "Levels of Confidence" don't quite match the way you cite them.

You state: "Compare these to a logical hierarchy of levels of confidence suggested by Elizabeth Shown Mills in Evidence Explained ... certainly, probably, possibly, likely, apparently, and perhaps."

Under no conditions would EE (or I) say that "likely" carries less weight than "possibly." EE 1.6 provides this hierarchy and explanation:

1.6 Levels of Confidence:

"In sound historical studies, statements about dates, events, identities, places, relationships, etc., are frequently prefaced by qualifiers. ... The use of these terms adheres to no universal scheme. Rather, the terms take on whatever sense writers create with their supporting details and interpretations. The following offers one set of parameters that can be applied in a logical hierarchy:

"Certainly: The author has no reasonable doubt about the assertion, based upon sound research and good evidence.

"Probably: The author feels the assertion is more likely than not, based upon sound research and good evidence.

"Likely: The author feels some evidence supports the assertion, but the assertion is far from proved.

"Possibly: The author feels the odds weigh at least slightly in favor of the assertion.

"Apparently: The author has formed an impression or presumption, typically based upon common experience, but has not tested the matter. (A presumption is not a blank check, however. In law, for example, Federal Rule 301* holds that the author of a presumption is still expected to produce evidence to meet or rebut the presumption.)

"Perhaps: The author suggests that an idea is plausible, although it remains to be tested."

On one other point, I'm confused by your Reddit list. Your first seven are: Almost certainly, highly likely, very good chance, PROBABLE, likely, "we believe," and PROBABLY. How are you defining the difference between "probable" and "probably"? What would account for two other levels of confidence coming between them?

Debbie Kennett said...


This chart is actually based on a study by the CIA and not a survey of Reddit users. For background information see the link here which includes a link to the original Reddit thread where the research was discussed:


The link was originally shared on Twitter by Simon Kuestenmacher.

JDR said...

Debbie: At https://www.reddit.com/r/dataisbeautiful/comments/3hi7ul/oc_what_someone_interprets_when_you_say_probably/ it says it's a poll inspired by a CIA study. As I read it, it's not the CIA data but data you can examine at https://pastebin.com/byPieqz0/.

Debbie Kennett said...

Thanks John. You're right. I shouldn't look at these things in a hurry late at night!

JDR said...

Elizabeth: Thank you for the clarification. As for the Reddit terms, I have no explanation, just reporting on what they did. I do share your puzzlement.