Monday, 23 January 2017

Search Aid: Initial Letter of Surnames

With 26 letters in the alphabet, halfway through is between M and N, right?

Not if you're looking at an alphabetical list of surnames, where some initial letters are far more common than others.

In our research we often find ourselves looking for a particular name in an alphabetically ordered list with no index, just a block of pages or images, and have to guess how far in to go to find the name we want.

Here's a table you may find useful. It shows, as a cumulative percentage, how far through a list you need to go to reach the start of each initial surname letter.

It varies depending on nationality. Wales loves Jones and Jenkins; Scotland has a lot of names beginning with Mc. The national distributions are calculated from a table at the Surname Studies website. The Ottawa figures are based on the 2009 phone book.

For example, if you're looking for Smith in a Scottish database, the table shows that the S surnames start 81% of the way into the database.

Initial  England    Wales Scotland       UK   Ottawa
           Cum %    Cum %    Cum %    Cum %    Cum %
A              0        0        0        0        0
B              3        2        3        3        4
C             14        8       11       14       14
D             22       12       19       21       22
E             26       20       25       26       29
F             29       25       26       28       30
G             32       27       30       32       33
H             37       31       35       36       39
I             46       39       40       45       44
J             47       40       41       46       44
K             49       51       43       49       46
L             51       52       45       51       49
M             56       57       49       55       58
N             63       63       70       64       69
O             65       64       71       66       70
P             66       67       72       67       72
Q             72       73       76       72       77
R             72       73       76       72       78
S             77       80       81       78       83
T             86       85       90       86       91
U             90       90       93       91       94
V             90       90       93       91       94
W             91       91       94       91       96
X            100      100      100      100       99
Y            100      100      100      100       99
Z            100      100      100      100      100
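
To turn a percentage into a page guess, multiply it by the number of pages in the list. Here is a minimal Python sketch using the Scotland column. The function name `estimated_start_page` and the 500-page list are made up for illustration, and the guess is only as good as the assumption that your list follows the table's distribution.

```python
# Cumulative percent at which each initial letter starts
# (Scotland column of the table).
SCOTLAND_CUM_PCT = {
    "A": 0, "B": 3, "C": 11, "D": 19, "E": 25, "F": 26, "G": 30,
    "H": 35, "I": 40, "J": 41, "K": 43, "L": 45, "M": 49, "N": 70,
    "O": 71, "P": 72, "Q": 76, "R": 76, "S": 81, "T": 90, "U": 93,
    "V": 93, "W": 94, "X": 100, "Y": 100, "Z": 100,
}


def estimated_start_page(surname, total_pages, cum_pct=SCOTLAND_CUM_PCT):
    """Guess the page on which surnames with this initial begin.

    Hypothetical helper: assumes pages are numbered from 1 and that
    every page holds about the same number of names.
    """
    pct = cum_pct[surname[0].upper()]
    # Skip pct% of the pages, then clamp so we never run past the end.
    return min(total_pages, 1 + total_pages * pct // 100)


# A made-up 500-page list: where might the S surnames begin?
print(estimated_start_page("Smith", 500))
```

Treat the result as a first jump, then page forward or back from there.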

Often you don't have the whole range to work with. Wouldn't it be nice if there were an app where you could enter the page numbers and surnames at the beginning and end of the range and get an estimate of the page for a particular surname? Hey RootsTech, that would be an innovation.
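
In the meantime, here's a rough Python sketch of how such an estimate might work. It is not a finished app. It uses the UK column of the table and assumes, crudely, that second letters are spread evenly within each initial. The names `position` and `estimate_page` and the example page numbers are invented for illustration.

```python
import string

LETTERS = string.ascii_uppercase

# Cumulative percent at which each initial starts (UK column of the table).
UK_CUM_PCT = dict(zip(LETTERS, [
    0, 3, 14, 21, 26, 28, 32, 36, 45, 46, 49, 51, 55,
    64, 66, 67, 72, 72, 78, 86, 91, 91, 91, 100, 100, 100,
]))


def position(surname):
    """Approximate percent of the way through a full A-Z list."""
    name = [c for c in surname.upper() if c in LETTERS]
    if not name:
        raise ValueError("surname needs at least one letter A-Z")
    i = LETTERS.index(name[0])
    start = UK_CUM_PCT[LETTERS[i]]
    end = UK_CUM_PCT[LETTERS[i + 1]] if i < 25 else 100
    # Assumption: second letters are evenly spread within an initial.
    second = name[1] if len(name) > 1 else "A"
    return start + (end - start) * LETTERS.index(second) / 26


def estimate_page(target, first_page, first_name, last_page, last_name):
    """Interpolate a page number between the two known ends of a range."""
    lo, hi = position(first_name), position(last_name)
    if hi <= lo:
        return first_page  # range too narrow for this table to help
    frac = (position(target) - lo) / (hi - lo)
    frac = min(1.0, max(0.0, frac))  # keep the guess inside the range
    return first_page + round(frac * (last_page - first_page))


# Hypothetical range: page 100 starts at Davies, page 300 ends at Morgan.
print(estimate_page("Hughes", 100, "Davies", 300, "Morgan"))
```

A name outside the range is pinned to the nearest end, so the guess never points at a page you don't have.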

6 comments:

Ken Hanson said...

Can you explain more clearly?
Thank you

JDR said...

I added an example which should help.

Celia Lewis said...

Very interesting, John.
My Gillespie surnames will be usually 1/3rd of the way through, while the Pettygrove/Pettigrew surnames are around 2/3rds of the way. Cool. I tend to jump back and forth in such databases or indexes, but this might make my jumps a bit more logical and useful. Thanks for the post.

Chad said...

The table might be a bit clearer if you were to round to the nearest tenth of a percent instead of just to the nearest percent, at least for the last few rows. As the table is now, nearly all columns round to 100% by "X".

JDR said...

A good point Chad, to be weighed against giving a deceptive impression of accuracy.

Anonymous said...

When searching for a surname ending in W through Z, I've always found the easiest way to find it is to start at the last page and work backwards.