Theme
The 20th CODH seminar welcomes Gilles-Maurice de Schryver, research professor of African linguistics at Ghent University and former president of the European Association for Lexicography, and David Joffe, the creator of the dictionary writing system TLex. In this talk, they will discuss ChatGPT, the chatbot recently released by OpenAI, and its impact on lexicography, especially on the process of dictionary compilation, its future, and beyond.
Archives
The YouTube video of the event is available on the ROIS-DS CODH YouTube channel.
The suggested citation for the content of the event is as follows.
de Schryver, Gilles-Maurice & David Joffe. 2023. The end of lexicography, welcome to the machine: On how ChatGPT can already take over all of the dictionary maker's tasks. Paper presented at the 20th CODH Seminar, ROIS-DS Center for Open Data in the Humanities, Tokyo, Japan, 27 February 2023. http://codh.rois.ac.jp/seminar/lexicography-chatgpt-20230227/
Program
Title | The end of lexicography, welcome to the machine: On how ChatGPT can already take over all of the dictionary maker's tasks |
Speaker | Gilles-Maurice de Schryver (Ghent University / University of Pretoria); David Joffe (TshwaneDJe HLT) |
Date | 17:30-19:00 JST (UTC+9), February 27 (Monday), 2023 |
Schedule | 17:30-18:00 Introduction by Gilles-Maurice de Schryver; 18:00-18:30 Live demonstration by David Joffe; 18:30-19:00 Q & A (and probably further demos based on prompts from the audience) |
Venue | Online: Zoom Webinar; Onsite: Room 2005 at the National Institute of Informatics. Registration is required for both online and onsite attendance. |
Language | English. The talk will be given in English only; no Japanese interpretation will be provided. |
Abstract
With the release of ChatGPT just two months ago, the world has been in a frenzy. Few of us ever really thought that we’d see an AI system of this level in our lifetime, but here it is. Our standpoint is that we need to adapt and be part of the revolution. The arrival of ChatGPT (and soon its rivals) is a major change for society in general. As a case study, we will present its impact on lexicography. Broadly speaking, we will illustrate how ChatGPT can already be brought in for all major steps in modern lexicography: (1) corpus creation aspects; (2) dictionary compilation proper; (3) publishing to various formats; (4) dictionary sales, marketing and customer service; and (5) metalexicography.
The main focus during this talk will be on level (2). More specifically, we will illustrate how one can ask for entries in structured XML, how dictionaries may be authored by ChatGPT via our prompts, and how we can achieve more direct OpenAI GPT integration into a dictionary writing system like TLex (using their APIs). Reformulated, we will illustrate how we can feed ChatGPT a list of headwords and have it automatically create a list of articles in a sort of crude batch mode. In the process, we will show how it can also start making helper lists of meanings about words. Given that ChatGPT also seems to ‘know’ TEI (that slightly horrible XML standard) as well as TMX, TBX and XLIFF (more important formats in lexicography), it is possible to ask it either to create data in these formats, or, given some simple instructions, to rearrange existing data into these formats.
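As a rough illustration of the "crude batch mode" described above, the following is a minimal Python sketch, not the speakers' actual TLex integration. It builds one prompt per headword, passes it to a text-completion function, and keeps only the replies that parse as well-formed XML. The prompt wording, the entry layout (`entry`, `headword`, `pos`, `sense`), and the names `build_prompt`, `compile_entries` and `fake_complete` are all assumptions made for this example; `fake_complete` is a stand-in that would be replaced by a real call to a language model API.

```python
import xml.etree.ElementTree as ET
from typing import Callable, Dict, Iterable

# Hypothetical prompt; the XML layout requested here is an assumption,
# not a format prescribed by the talk or by TLex.
PROMPT_TEMPLATE = (
    "Write a dictionary entry for the headword '{headword}'. "
    "Return only well-formed XML of the form "
    "<entry><headword>...</headword><pos>...</pos>"
    "<sense><definition>...</definition></sense></entry>."
)


def build_prompt(headword: str) -> str:
    """Fill the prompt template for a single headword."""
    return PROMPT_TEMPLATE.format(headword=headword)


def compile_entries(
    headwords: Iterable[str],
    complete: Callable[[str], str],
) -> Dict[str, ET.Element]:
    """Ask `complete` for one entry per headword and keep the valid ones.

    `complete` is any function mapping a prompt string to a reply string,
    for example a thin wrapper around a language model API.
    """
    entries: Dict[str, ET.Element] = {}
    for headword in headwords:
        reply = complete(build_prompt(headword))
        try:
            root = ET.fromstring(reply.strip())
        except ET.ParseError:
            # The reply was not well-formed XML; skip it.
            continue
        # Keep the entry only if it has the expected root and headword.
        if root.tag == "entry" and root.findtext("headword") == headword:
            entries[headword] = root
    return entries


def fake_complete(prompt: str) -> str:
    """Stand-in for a model call (assumption): returns canned XML."""
    headword = prompt.split("'")[1]
    if headword == "broken":
        # Simulate a truncated, malformed reply.
        return "<entry><headword>broken</headword>"
    return (
        f"<entry><headword>{headword}</headword><pos>noun</pos>"
        "<sense><definition>placeholder definition</definition></sense>"
        "</entry>"
    )


if __name__ == "__main__":
    result = compile_entries(["cat", "broken", "dog"], fake_complete)
    for headword, entry in result.items():
        print(headword, ET.tostring(entry, encoding="unicode"))
```

Validating each reply before accepting it is a deliberate choice in this sketch: model output is not guaranteed to be well-formed, so malformed entries are skipped and left for a human editor rather than imported silently.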
Apart from showcasing all this technological wizardry for the field of lexicography, we will also devote some time to the ethical implications of ‘using’ it.
At the end of it all, the Preface to the dictionary may look like this:
Bio
Gilles-Maurice de Schryver
Gilles-Maurice de Schryver (°1971) has been research professor of African Linguistics at Ghent University since 2015, and extraordinary professor at the University of Pretoria since 2014. He holds an MSc in Microelectronic Engineering (1995), as well as an MA (1999) and PhD (2005) in African Languages and Cultures. In 2002 he co-founded TshwaneDJe HLT to develop lexicographic software, and in 2006 he was a founding member of the African Language Technology group. He is the author or co-author of close to 400 books, book chapters, journal articles and conference papers, mainly on Bantu corpus linguistics and lexicography in general. His publications also include award-winning dictionaries for Northern Sotho, Zulu and Xhosa, published with Oxford University Press, as well as various online dictionaries, amongst others the most popular one for Swahili. He is a two-term past President of Afrilex (2009-2013) and past President of Euralex (2018-2021). Earlier, he also served in other capacities on the executive boards of Afrilex (2001-2009), Euralex (2006-2014), Asialex (2007-2013) and Australex (2008-2013). Most recently, he has been co-facilitating the creation of Americalex (2019-2023), and currently sits on the board of Globalex (2022-2023).
David Joffe
David Joffe is the co-founder and owner of TshwaneDJe HLT, and the original creator of the industry-leading TLex dictionary writing system. He has a BSc Computer Science degree (Univ. of Pretoria, South Africa) and over 20 years' software development and project management experience. He previously built, amongst others, a flight simulator visualisation system for the South African Air Force, and mining training simulators for Anglo Platinum. He currently manages the development of TLex, in use by major publishers including Oxford University Press and Pearson, as well as tlTerm (terminology management software), tlTranslate (Translation Memory software), tlCorpus (concordance software), and tlDatabase (database software).
Registration
Past CODH Seminars
2024-06-06
2024-03-04
21st CODH Seminar - Digital History: Concepts and Practices
2023-02-27
2023-03-01
2023-01-22
2022-07-01
2022-03-28
2021-07-29
2021-02-18
14th CODH Seminar - 100 Recipes for IIIF Curation Platform
2021-01-22
13th CODH Seminar - Present and Future of Historical Big Data Research
2020-08-05
12th CODH Seminar (Online) - AI for Culture: From Japanese Art to Anime
2020-02-21
12th CODH Seminar - AI for Culture: From Japanese Art to Anime
2019-09-25
2019-03-11
10th CODH Seminar - Document Analysis and Character Recognition
2019-01-08
9th CODH Seminar - Computer Vision with Limited Labeled Data
2018-11-22
2018-07-31
7th CODH Seminar - Manifold Mixup: Encouraging Meaningful On-Manifold Interpolation as a Regularizer