Autoindexing: a solution or a sin?
Surprisingly, questions about using automated or semi-automated methods that employ machine learning and AI to index books, especially those of a more complex or scientific nature, were not well received. The chair remarked, ‘If a machine can replace me, I won’t have a job’. She argued that the subtleties of language in a text such as a biography, where co-referencing is commonly used (e.g. Louise Corti, the Associate Director, her), could not be easily indexed. This suggested a lack of awareness of the sophisticated algorithms used in Natural Language Processing to successfully identify co-references.
Having just worked with my colleague Jeannine Beeken on a recent CLARIN workshop, where we introduced linguistic approaches and tools for oral history data, I advised them to try the free Stanford CoreNLP for named entity recognition and coreference resolution.
Curious? Try this for yourself by pasting the following into the search box in the demo version: Louise is a Service Director at the UK Data Service. She has been working there for 30 years. She often enjoys her role, especially when she gets invited to China.
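For readers who would rather script the same experiment than use the demo page, here is a minimal Python sketch. It assumes you have downloaded CoreNLP and started its bundled server locally on the default port 9000; the helper names (build_request, annotate, coref_chains) are my own for illustration and are not part of CoreNLP. The chains you get back will depend on the CoreNLP version and models you have installed.

```python
import json
import urllib.parse
import urllib.request

# Assumption: a CoreNLP server is already running locally on its default port.
SERVER_URL = "http://localhost:9000"

TEXT = ("Louise is a Service Director at the UK Data Service. "
        "She has been working there for 30 years. "
        "She often enjoys her role, especially when she gets invited to China.")


def build_request(text, server_url=SERVER_URL):
    """Build a POST request asking the server for NER and coreference output."""
    props = {
        "annotators": "tokenize,ssplit,pos,lemma,ner,parse,coref",
        "outputFormat": "json",
    }
    # The server reads its settings from a 'properties' query parameter (JSON).
    query = urllib.parse.urlencode({"properties": json.dumps(props)})
    return urllib.request.Request(
        f"{server_url}/?{query}",
        data=text.encode("utf-8"),
        method="POST",
    )


def annotate(text, server_url=SERVER_URL):
    """Send the text to the server and return the parsed JSON annotation."""
    with urllib.request.urlopen(build_request(text, server_url)) as response:
        return json.loads(response.read().decode("utf-8"))


def coref_chains(annotation):
    """Return each coreference chain as a list of mention strings."""
    return [
        [mention["text"] for mention in mentions]
        for mentions in annotation.get("corefs", {}).values()
    ]


# With a server running, print the chains found in the sample text:
#     for chain in coref_chains(annotate(TEXT)):
#         print(chain)
```

Each printed list groups the mentions that the coreference annotator believes refer to the same entity, which is the kind of grouping an index entry for a person would need.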
Further, a tale was recounted – more than once – of their community being promised an autoindexing system by IBM some 20 years ago, which never materialised. I was somewhat disappointed in this fear of the machine, yet it renewed my own aim to get the UK Data Service question and variable autoindexing project started soon!