21 August 2009

Determining what to digitize

Jill Hurst-Wahl posted a short item on her Digitization 101 blog, "Can you use Zipf's law to determine what to digitize?" It reports on a conference presentation that draws the conclusion that "by digitizing representative portions of 20% of our collections, we could adequately serve 70% of our users."
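For anyone curious about the arithmetic behind a claim like that, here is a minimal sketch in Python. It assumes that demand for the item ranked k in popularity is proportional to 1/k (the textbook form of Zipf's law) and computes what share of all requests falls on the most popular 20% of items. The function name, the collection sizes and the exponent are my own illustrative assumptions, not figures from the presentation, and a share of requests is not quite the same thing as a share of users.

```python
def zipf_share(n_items: int, top_fraction: float, exponent: float = 1.0) -> float:
    """Share of total demand that falls on the most-requested items.

    Assumes demand for the item of popularity rank k is proportional
    to 1 / k**exponent (an idealized Zipf distribution, not real data).
    """
    if n_items < 1 or not 0.0 < top_fraction <= 1.0:
        raise ValueError("need n_items >= 1 and 0 < top_fraction <= 1")
    # Relative demand for each item, from most to least popular.
    weights = [1.0 / rank ** exponent for rank in range(1, n_items + 1)]
    # Number of items in the "digitize first" group (at least one).
    top_n = max(1, round(n_items * top_fraction))
    return sum(weights[:top_n]) / sum(weights)


if __name__ == "__main__":
    # Hypothetical collection sizes, chosen only for illustration.
    for n in (1_000, 100_000):
        print(n, round(zipf_share(n, 0.20), 3))
```

Under these assumptions the answer shifts with the size of the collection and the exponent chosen, so the 20% and 70% figures are better read as a rule of thumb than as a fixed law.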

Adequately serving 70% of users seems like a pretty modest ambition. You could throw away 80% of the physical library content if that's the level of satisfaction you aim to achieve.

Where the analysis does help is in setting priorities for digitization. The sheer magnitude of the digitization task can throw organizations into seemingly terminal paralysis, or at least motivate them to address other less thorny issues.

Take newspaper digitization. In looking at the impacts of a 1900 storm in Ontario, I found that newspapers across the province carried much the same information, down to the death of a man in Windsor carried in an Ottawa newspaper. You don't need to look at each newspaper in the province to get a good (70%!) handle on the storm impacts. You do have to go to local papers to find detail: for example, that heavy rain fell with the storm in a community near Parry Sound, which was very helpful in suppressing an extreme forest fire threat.

Digitizing major papers across the country, using the latest image-to-text (OCR) technology, which has improved significantly over the years, would be an excellent start.

Having achieved that, you could start filling in gaps and painting a more detailed picture of our heritage.

2 comments:

M. Diane Rogers said...

I don't think I can quite agree, although I agree with your point about setting priorities. Most of the Canadian newspaper digitization projects are not national ones - in BC, for instance, these are being done by commercial companies, universities and small museums and archives who have their own priorities.

After all, we do have some of the 'major papers' available. Just as some examples: for Ontario, issues of the Toronto Star from 1894-2006 are already on-line and have been for a long time, thanks to the Star and to Cold North Wind. There is the Manitoba Free Press (and also free Manitoba newspapers at Manitobia.ca) and, for BC, the British Colonist, 1858-1910 (free), and Bibliothèque et Archives nationales du Québec (BAnQ) did make newspaper digitization a priority. Yes, we need more from right across Canada, but we need money and the 'will' for this too.

And for me, it's not the major Canadian papers that would always be so high on my priority list - many are well known and easily available through Inter Library Loan on microfilm.

Canada's 'ethnic' and labour papers are higher on my list. A good start at digitizing ethnic papers has been made at Simon Fraser University, though more $ is needed there now. Some of the historical labour papers are available on microfilm (if you know what to look for), and since many are long defunct, and many of the filming projects seem to have been good quality, these might fairly easily be digitized.

WJM said...

At least having some newspapers available provides some "hooks" to sink into your research subject.

Given the profusion of wire stories and "filler", you will often run across useful name and date references that can help you narrow (or broaden) your research, even if you are searching in the "wrong" geographical location. For example, I found a reference in the Winnipeg Free Press from the 1890s to a letter that had been received (without naming the sender or recipient) describing an earthquake in Labrador. It gave a location and date, and sure enough, a cross-check of the local Hudson's Bay Company Post Journal corroborated the story quite neatly.