Saturday, 17 November 2018

Co-Lab CEF First World War Diaries Challenge: would OCR help?

60 images comprising the 1st Canadian Division – General Staff war diary for the month of November 1918 are now available for transcription through Library and Archives Canada Co-Lab at https://co-lab.bac-lac.gc.ca/eng/Challenges/Details/1014.

As you can see from the example image, these are typed records, so reading them is not much of an issue. Transcription will make them machine searchable.

Transcription is 6% complete, but mostly for the pages with the least text.

I wondered how well OCR would do on this text, so I tried the text grab feature of the TechSmith product Snagit on part of the image. Here's what I got.

Training «ras oontlcued by all Units«
Th« G *0 «C • held a Con fere no« of Brigade Commanders at Dlrlalonal Headquarters at 1000 hours and in the Afternoon attended a Coofertooe of Dlrlalonal Commanders at Oorpa Headquarters. Information vrae received from Corps Headquarters that the Ca da disc Corps would beooma part of the Seoond Army to participate lo the General Advance of the Allied Analeo to the RHIIB. whlo^ advanoe would oomaenoe on or about Borember 17th« lo view of the position of the Division 1a the Hear Areas, it was therefore neoessary that the Division move forward at oooe In order to arrive In the Concentration Area West of M0I3 by the 17th Inst. Orders were accordingly Issued for the move to oomaaoe on the morning of the 13th Inst by route maroh and to oontlnue on the two days following*
Weather - Fine and warm«
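
One rough way to put a number on "how well OCR would do" is to compare an OCR line with a corrected reading of the same line. The sketch below uses Python's standard difflib module for that. The two OCR lines are copied from the Snagit output above; the reference readings are my own guesses at what the typed page says, not a checked transcription, and the similarity function name is just for illustration.

```python
from difflib import SequenceMatcher


def similarity(ocr_text: str, reference: str) -> float:
    """Return a 0..1 similarity ratio between an OCR line and a reference reading."""
    return SequenceMatcher(None, ocr_text, reference).ratio()


# OCR lines as produced by Snagit, paired with assumed corrected readings.
# The corrected readings are guesses made for this example only.
pairs = [
    ("Training «ras oontlcued by all Units«", "Training was continued by all Units."),
    ("Weather - Fine and warm«", "Weather - Fine and warm."),
]

for ocr, ref in pairs:
    # A ratio of 1.0 would mean the OCR line matches the reference exactly.
    print(f"{similarity(ocr, ref):.2f}  {ocr}")
```

A ratio close to 1.0 suggests a line needs only light correction, while a lower ratio points to a line a volunteer would largely have to retype.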

UPDATE:  SADLY THE POLL BELOW HAS A GLITCH.

If LAC provided it, would a rough machine transcription be helpful if made available without any corrections?
 

With the machine transcription as a guide, would you be more likely to help with this Co-Lab challenge?
 
