Job ID: 3091
Institution: Human-Centered Data Science group
Contact person: Prof. Lisa Beinborn | lisa.beinborn(at)uni-goettingen.de | 0551 3921262
Starting date: 1 September 2025
Application deadline: 6 June 2025
The Human-Centered Data Science group led by Prof. Lisa Beinborn at the University of Göttingen (Public Law Foundation) invites applications for an open position:
Researcher (PhD candidate), all genders welcome
Pay grade (Entgeltgruppe) 13 TV-L, 100 %
Starting Date: September 2025 (open until filled)
Duration: 3 years
Research Field: Multilingual Natural Language Processing
Location: Göttingen, Germany
The Human-Centered Data Science group (https://www.uni-goettingen.de/en/691017.html) is affiliated with the Institute of Computer Science and the Campus Institute Data Science (CIDAS) at the University of Göttingen. Our research is interdisciplinary at its core, and we cooperate closely with colleagues from other faculties (e.g., psychology, linguistics). We take a human-centered perspective on natural language processing and focus on cross-lingual and cognitively inspired research questions.
Project: The Unit of Representation in Multilingual Language Modelling (URML)
Language technology is becoming increasingly influential as a tool for everyday tasks. Unfortunately, the field of natural language processing has historically operated with a strong bias towards the English language. While increasing efforts are being made to widen access to language technology, the performance of NLP tools remains vastly unequal across the languages of the world. Because models are first optimized for English and then applied to other languages, modelling choices do not necessarily transfer well to typologically distant languages. A fundamental modelling choice is the tokenizer, which determines the central unit of representation for language processing. While English word boundaries can be roughly approximated by whitespace, many languages allow for more complex morphological compositions. In this project, we plan to systematically compare different choices for the representational unit of multilingual language models (characters, bytes, pixels, phonemes) for a set of typologically diverse languages. We aim to develop new approaches to typologically informed multilingual modelling that are more adaptable to new languages, and to work towards increasing cross-lingual fairness.
The project is a collaboration with Miryam de Lhoneux's group (https://people.cs.kuleuven.be/~miryam.delhoneux/) at the University of Leuven and is funded by the DFG (see the call for the related postdoc position).
Profile
In this position, you have the chance to pursue a PhD degree. You are expected to:
- conduct innovative research in the context of the project.
- collaborate with the project partners in Belgium, both digitally and by organizing research meetings in Leuven and Göttingen.
- communicate research results in peer-reviewed proceedings and journals and present them at international research conferences.
- take an active role in the co-supervision of student theses related to the project.
- engage in the activities and events of the research group.
The ideal candidate
- has obtained a very good master’s degree in computational linguistics, computer science, cognitive science, machine learning, or a related discipline.
- has gained experience with natural language processing research and demonstrates a strong interest in the project outlined above.
- can independently acquire and process new knowledge.
- is a team player with good communication skills and an interdisciplinary mindset.
- has obtained strong analytical and programming skills and is committed to further developing them (experience with large-scale experiments on GPUs is beneficial).
- has a very good command of written and spoken English (knowledge of German and other languages is beneficial).
The University of Göttingen is an equal opportunities employer and places particular emphasis on fostering career opportunities for women. Qualified women are therefore strongly encouraged to apply in fields in which they are underrepresented. The University has committed itself to being a family-friendly institution and supports its employees in balancing work and family life. The University is particularly committed to the professional participation of severely disabled employees and therefore welcomes applications from severely disabled people. In the case of equal qualifications, applications from people with severe disabilities will be given preference. To protect the applicant's interests, a disability or equivalent status should be stated in the application.
Application Documents
Send the following information to lisa.beinborn(at)uni-goettingen.de as a single PDF:
- Letter of motivation (1 page, including a clear indication of why this particular position interests you and what makes you a qualified candidate)
- Detailed CV
- Certificates (including a transcript of grades)
- A page containing a link to an example of your work (your most recent thesis, a publication, a code repository) and a short explanation of how this work represents your profile. If the document is not publicly available, please attach it to the PDF.
Application Deadline: 6 June 2025
Please note:
By submitting your application, you accept the processing of your applicant data in accordance with data protection law. Further information on the legal basis and data usage is provided at https://uni-goettingen.de/GDPR
This job advertisement is also available online at: https://www.uni-goettingen.de/de/644546.html?filters=%7B%22vollzeit%22:[],%22befristet%22:[],%22gruppe%22:[],%22besoldGrp%22:[]%7D&details=3091