Friday, 5 June 2009

Paper of Record content on Google News Archive Search

Following much frustration, Google have finally posted a clarification of the situation regarding the content the company purchased from Paper of Record (Cold North Wind). The good news is that much of the content is available through Google News Archive Search. Some of the material requires further processing, which is promised within the next three months. The bad news is that some titles, including the Sporting News, Temiskaming Speaker and The Perth Courier, will not be coming to Google.

The substantive part of the announcement is copied below.


Many of you have asked about our specific plans regarding newspapers originally digitized by Paper of Record. We wanted to give you an update on our progress, and to clear up some misunderstandings. As part of our ongoing efforts to make more old newspapers accessible and searchable online, we acquired a number of titles from Paper of Record. Most of the titles that we acquired from Paper of Record are online and fully searchable. In fact, in many cases we digitized the content again to improve the quality of the images and the OCR.

However, there are some titles provided by Paper of Record that currently are not live on Google News Archive Search. For those titles we have the right to display, we're in the process of bringing them online and making them viewable via Google News Archive, along with other sources we've digitized or crawled from the web. That means that if you're looking for material originally acquired by Paper of Record, it likely falls into one of three groups:

*4.91M articles representing 522 titles obtained from Paper of Record are now live on Google News Archive search. This includes previously live content as well as content added as of this week from Paper of Record, all free of charge. Please note that all articles from these titles may not be comprehensively available, but will otherwise be made available in browse-only mode within 3 months. The full list is here [2].

*~0.5M pages representing 381 titles obtained from Paper of Record will be made available in browse-only mode within 3 months, also free of charge. The full title list is here [3]. Many of the images we obtained were of low quality, and we were therefore unable to get quality text after following the OCR process. We are working to put up content from these titles so that they can be browsed.

*Finally, for these 10 titles here [4], we don't have the rights to display these newspapers. We've reached out to the publishers who hold rights to these papers, but not all want to participate in Google's programs. To access these, you may need to travel to a library if you can't find an online source, or contact the publisher directly.

Click here [5] for more information on how to find specific titles in the archive. We will also be soon rolling out direct search which will allow the user to search and browse through newspaper titles directly.

We have heard the concerns voiced by members of the research community, and know how important this content is to users. We apologize for any inconvenience you may have experienced. We're committed to providing a comprehensive index of archived newspaper content and to making the content acquired by Paper of Record available to our users.



M. Diane Rogers said...

Still frustrating and even baffling, although I do appreciate Google finally giving us more info. My searches still don't bring up material I'm sure 'was there before' (at Paper of Record), for example from the "Manitoban" and the "Winnipeg Times" or the New Westminster BC papers. But Manitoba searches do quickly bring up links to clippings/pages at the pay site, even for a 1922 Manitoba directory. No news there!

Anonymous said...

Not sure which papers they are not giving us access to, as Paper of Record did, but I heard there were three. One of them is the Temiskaming Speaker, New Liskeard, Ontario. The good news is that the library in New Liskeard has the paper on microfilm, and anyone can access it for FREE. I suggest looking into what resources your local library has.