Big data

Aims and scope

Big Data are becoming increasingly available in many fields such as business, computer science, government, and the social and behavioral sciences, including psychology.

Four key characteristics may qualify data as Big Data: Volume, Velocity, Variety, and Veracity. Volume refers to the size of the dataset; a dataset that is too large can cause problems with storage and analysis. High-velocity data are received at a high rate and/or have to be processed within a short period of time (e.g., real-time and interactive processing).

High-variety data consist of many types of structured and unstructured data, containing a mixture of text, pictures, videos, and numbers. Veracity concerns the quality (or truthfulness) of the data. Big Data with potentially high relevance for psychology include social media data, health/physiological tracker data, geolocation data, dynamic public records, travel route data, and behavioral and genetic data.

The overall aim of this research area is to address methods and applications using Big Data in psychology. In line with the scientometric research tradition at ZPID, the current focus is on Metascience using Big Data methods. Research topics include:

  • Identification of research topics and trends in large text corpora
  • Investigating the scientific communication of psychology researchers on Twitter and Mastodon
  • Bibliometric studies using text mining and network analysis
  • Methodological and statistical issues in collecting, handling, processing, and analyzing Big Data in psychology
  • Leveraging machine learning to automate systematic reviews
  • Implications for research supporting infrastructure services in psychology (see PsychTopics)
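
As a rough illustration of the first topic above, identifying trends in a text corpus, the following minimal sketch ranks terms by how quickly their share of documents grows per year. It is a deliberately simplified stand-in and not the topic-modeling approach used in the publications listed on this page. The toy corpus, the stopword list, and all function names are hypothetical and chosen only for this example.

```python
from collections import Counter, defaultdict
import re

# Hypothetical toy corpus: (publication year, abstract text) pairs.
CORPUS = [
    (2019, "Survey study on memory and learning in children"),
    (2019, "Memory performance and attention in older adults"),
    (2020, "Machine learning for text mining of survey responses"),
    (2020, "Memory and machine learning models of attention"),
    (2021, "Text mining and machine learning on social media"),
    (2021, "Machine learning and text mining in psychology research"),
]

# Tiny illustrative stopword list (an assumption, not a standard resource).
STOPWORDS = {"and", "in", "of", "on", "for", "the", "a"}


def tokenize(text):
    """Lowercase a text and return its word tokens without stopwords."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS]


def yearly_shares(corpus):
    """Return {term: {year: share of that year's documents containing the term}}."""
    docs_per_year = Counter(year for year, _ in corpus)
    doc_counts = defaultdict(Counter)
    for year, text in corpus:
        for term in set(tokenize(text)):  # count each term once per document
            doc_counts[term][year] += 1
    return {
        term: {year: counts[year] / docs_per_year[year] for year in docs_per_year}
        for term, counts in doc_counts.items()
    }


def slope(series):
    """Ordinary least-squares slope of document share regressed on year."""
    years = sorted(series)
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(series[y] for y in years) / n
    num = sum((y - mean_x) * (series[y] - mean_y) for y in years)
    den = sum((y - mean_x) ** 2 for y in years)
    return num / den if den else 0.0


def trending_terms(corpus, top_n=3):
    """Return the top_n (term, slope) pairs with the steepest upward trend."""
    shares = yearly_shares(corpus)
    # Round before sorting so floating-point noise cannot reorder tied terms.
    slopes = {term: round(slope(series), 6) for term, series in shares.items()}
    ranked = sorted(slopes, key=lambda term: (-slopes[term], term))
    return [(term, slopes[term]) for term in ranked[:top_n]]


if __name__ == "__main__":
    for term, trend in trending_terms(CORPUS):
        print(f"{term}: {trend:+.2f} share of documents per year")
```

On this toy corpus, "machine", "mining", and "text" rise fastest, while "memory" declines. A real analysis would work on topics rather than single words and would need far more documents per year for the slopes to be meaningful.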


Central publications

  • Bittermann, A., McNamara, D., Simonsmeier, B. A., & Schneider, M. (2023). The Landscape of Research on Prior Knowledge and Learning: A Bibliometric Analysis. Educational Psychology Review, 35, 58.

  • Bittermann, A. & Rieger, J. (2022). Finding Scientific Topics in Continuously Growing Text Corpora. In A. Cohan et al. (Eds.), Proceedings of the Third Workshop on Scholarly Document Processing (pp. 7–18), Gyeongju, Republic of Korea. Association for Computational Linguistics.

  • Bittermann, A., Batzdorfer, V., Müller, S. M., & Steinmetz, H. (2021). Mining Twitter to detect hotspots in psychology. Zeitschrift für Psychologie, 229(1), 3–14.

  • Batzdorfer, V., Steinmetz, H., Biella, M., & Alizadeh, M. (2021). Conspiracy theories on Twitter: Emerging motifs and temporal dynamics during the COVID-19 pandemic. Journal of Data Science and Analytics.

  • Bittermann, A. & Fischer, A. (2018). How to identify hot topics in psychology using topic modeling. Zeitschrift für Psychologie, 226, 3–13.

Cooperation projects


Digital Prompting Interventions

(with Dr. Jasmin Breitwieser, Dr. Maria Theobald, & Prof. Dr. Garvin Brod, DIPF | Leibniz Institute for Research and Information in Education)


We contribute bibliometric analyses to the Hessian collaborative project for translational psychotherapy research.
(with Prof. Dr. Winfried Rief, Philipps-Universität Marburg; Dr. Viktoria Ritter, Goethe-Universität Frankfurt)

Smartphone Sensing Panel Study

(with Prof. Dr. Markus Bühner & Dr. Ramona Schödel, Ludwig-Maximilians-University Munich)


The landscape of research on prior knowledge and learning

(with Prof. Dr. Michael Schneider & Dr. Bianca Simonsmeier, University of Trier; Prof. Dr. Danielle McNamara, Arizona State University, Tempe, USA)

Identification of research topics in continuously growing text corpora

(with Jonas Rieger, Dortmund University of Technology)

Using machine learning to automate systematic reviews

(with Dr. Tanja Burgard, ZPID)

The relationship between lay interests and research topics in psychology

(with Mark Jonas, Dr. Anita Chasiotis & Dr. Tom Rosman, ZPID)

Conspiracy theories on Twitter (2021)

(with Dr. Marco Biella, Eberhard Karls Universität Tübingen; Dr. Meysam Alizadeh, Harvard University, Cambridge, USA)

The research contribution of German-speaking clinical psychology (2021)

(with Dr. Jan Richter, Universität Greifswald; Prof. Dr. Hanna Christiansen, Philipps-Universität Marburg; Dr. Lena Krämer, Albert-Ludwigs-Universität Freiburg; Dr. Veronika Kuhberg-Lasson, ZPID; Prof. Dr. Silvia Schneider, Ruhr-Universität Bochum)

Research interests of doctoral students (2020)

(with Dr. Andreas Fischer, Forschungsinstitut betriebliche Bildung, Nürnberg)

Flight and migration as a research topic in psychology (2019)

(with Dr. Eva Klos, Hochschule Trier)

Hot topics in psychology (2018)

(with Dr. Andreas Fischer, Forschungsinstitut betriebliche Bildung, Nürnberg)


4th Symposium on Big Data and Research Syntheses in Psychology (with a special focus on Machine Learning & Open Science), May 8-10, 2023:
Videos and presentations

Research Synthesis and Big Data in Psychology, May 17-21, 2021, online:
Videos and presentations

Research Synthesis incl. Pre-Conference Symposium: Big Data in Psychology, May 27-31, 2019, Dubrovnik, Croatia:
Videos and presentations

Big Data in Psychology 2018, June 7-9, 2018, Trier, Germany:
Videos and presentations

Edited volumes and series

Call for Papers: Zeitschrift für Psychologie, Special Issue "Text Mining in Psychology" (2024)

Zeitschrift für Psychologie, Special Issue "Hotspots in Psychology" (2022)

Social Science Computer Review, Special Issue "Big Data in the Behavioral Social Sciences" (2021)

Visiting researchers

  • Dr. Jonas Rieger, Technische Universität Dortmund / Leibniz-Institut für Medienforschung | Hans-Bredow-Institut (HBI)
    (December 2022)
  • Prof. Dr. Mike Cheung, National University of Singapore
    (June 2018)