Geografie 2025, 130, 271-297

https://doi.org/10.37040/geografie.2025.014

Digital innovations in historical climatology: Classifying weather and climatic extremes and their impacts on societies using machine learning on written documents

Michael Kahle, Rüdiger Glaser

University of Freiburg, Faculty of Environment and Natural Resources, Department of Physical Geography, Germany

Received April 2025
Accepted August 2025

This article explores how digital innovations – particularly machine learning and natural language processing – can streamline and enhance workflows in historical climatology. Traditionally reliant on time-consuming manual analysis of historical documents, the field now benefits from modern digital tools at each research stage, from source discovery to publication. Focusing on the classification of large, unstructured textual data, the study examines methods ranging from manual keyword searches and Bayesian models to advanced large language models. Using the tambora.org corpus, it extracts and categorizes references to weather extremes, such as thunderstorms and heavy rainfall, and to their impacts on mobility. The paper compares these approaches in terms of accuracy, resource demands (runtime and memory), and their ability to interpret historical language. It argues that digital methods – especially AI – can transform the extraction and classification of climate data from historical texts and substantially assist researchers in historical climatology.

Funding

This work was supported in part by the DEMUR Project within the DFG Priority Program “On the Way to the Fluvial Anthroposphere” (DFG-SPP 2361). This paper benefited from participation in the Climate Reconstruction and Impacts from the Archives of Societies (CRIAS) working group of the Past Global Changes (PAGES) project.
