Tuesday, 23 February 2010

More on The Ottawa Citizen digitized

Since Kelly Egan's article in the Citizen of 3 Feb. 2010 (p. B1) announced that archives of the Ottawa Citizen are available online through Google, folks have been trying it out. This posting draws heavily, with permission, on material from Prof. Bruce Elliott of Carleton University's History Department, as well as on my own trials.

Go to www.google.ca; click on NEWS, then on ADVANCED NEWS SEARCH (at top right), then on ARCHIVE SEARCH (for articles more than 30 days ago), then on ADVANCED NEWS SEARCH again. Under SOURCE type Ottawa Citizen, and enter your search terms under FIND RESULTS. There are four options: all words, exact phrase, at least one of the words, and without the words. All words seems to work best. Remember to be creative in coming up with keywords. The more letters in the words you search for, the more likely you are to miss a hit owing to OCR errors.
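The point about word length can be made concrete with a rough calculation. The sketch below assumes, purely for illustration, that the OCR reads each letter correctly with a fixed probability and independently of its neighbours. The 98% figure is an invented number, not a measurement of Google's scans, and the function name is my own.

```python
def chance_word_is_intact(word_length: int, per_letter_accuracy: float = 0.98) -> float:
    """Chance that every letter of a word was read correctly.

    Assumes each letter is read independently with the same accuracy;
    the default accuracy is illustrative only.
    """
    return per_letter_accuracy ** word_length


# Compare short, medium and long search words.
for length in (4, 8, 12):
    print(f"{length:2d} letters: {chance_word_is_intact(length):.0%} chance the word survived OCR")
```

Whatever the true accuracy is, the chance shrinks with every extra letter, which is why a short, distinctive keyword can find an article that a long one misses.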

The rationale for the order in which results appear is obscure, perhaps based on the number of times the search word occurs in the article or how close to the top of the article it occurs.

You can click on the graphs at the top of the results screen to narrow the date, or click on SEARCH OTHER DATES. Try the date range from 1800 to the present. Although the first issue of The Citizen under that title was supposedly published on 22 February 1851, Google's approach to assigning dates to papers has created earlier editions!

Just as you may have gone to a handy newspaper to check the date, Google used dates found in the paper itself as the date under which it is indexed. The OCR can cause problems, and you will, for example, find issues from 1920 indexed in 1820 and articles from 1959 in 1950. I found one case where the original paper itself printed an incorrect year, which Google then used for its index.
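If you keep a list of your hits, a quick check against the date of the paper's first issue will flag the most obvious indexing errors. This is a minimal sketch, assuming you have noted the indexed dates yourself; the sample dates are invented for illustration. It only catches dates that are too early, so a slip such as 1959 indexed as 1950 would pass unnoticed.

```python
from datetime import date

# First issue under the Citizen title, as given in this posting.
FIRST_ISSUE = date(1851, 2, 22)


def is_suspect(indexed: date) -> bool:
    """True if the indexed date falls before the paper existed."""
    return indexed < FIRST_ISSUE


# Hypothetical indexed dates copied from a results list.
hits = [date(1820, 5, 14), date(1920, 5, 14), date(1851, 2, 22)]
for indexed in hits:
    label = "suspect, check the page image" if is_suspect(indexed) else "plausible"
    print(indexed.isoformat(), label)
```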

I also stumbled across a mirror image copy of two successive pages.

There are major gaps in the survival of the newspaper in the 1850s and 1870s, but the omissions in the Google news version are more extensive than that. For example, Ottawa Valley Marble Works in Arnprior ran an ad in every issue for several years 1859-61, with the name of the town appearing in each ad twice, once in upper case and once in lower. A search on MARBLE turns up a number of occurrences; a search on ARNPRIOR turns up none.

OCR problems are common. One needs to experiment with various search terms. There is a button "Flag this edition as unreadable". It would be great if Google adopted the National Library of Australia system for adding corrections.

Once you have a hit you can manoeuvre around the page by manipulating the blue rectangle in the thumbnail at top right, and browse back and forth within an issue of the paper. You can enlarge and reduce the image using the buttons on the toolbar. The FULL SCREEN button is the box with four arrows in the toolbar; it allows you to see a little more of the page at once.

"Archive Search Help" explains some of the subtleties and provides a "Get in touch" button that allows you to ask questions: a good step forward as it has been difficult to direct comments or questions to Google in the past.

There are inevitable gaps but it is a free resource, so we should be grateful for it as a 10% full glass rather than 90% empty.
