Japanese Language Text Mining Workshop
University of Chicago, July 22 & 30, 2021
Paula R. Curtis, Hoyt Long, Mark Ravina
Welcome! Below you will find handout resources for our workshop, Japanese Language Text Mining. Please note that these materials are only accessible to participants signed up for the course and therefore the links below will not work for anyone else. Participants should not circulate them in any form beyond our workshop. We hope to make them open access in the future. We appreciate your understanding!
Resource Handouts
Session 1: Introduction to OCR
Session 1: Demos: ABBYY, KuroNet, & Tesseract
Session 2: Introduction to RegEx, Text Editors, & Corpus Prep
Session 2: Demo: OpenRefine; Intro to Metadata Structuring & Organization
Session 2: Demo: Tesseract: An integrated workflow with analysis