Yifan Wang

Hi! I'm Yifan Wang, a third-year PhD student at Saarland University and a member of the Research Training Group (RTG) Neuroexplicit Models of Language, Vision and Action. I am co-supervised by Prof. Dr. Vera Demberg and Prof. Dr. Isabel Valera.

Prior to this, I obtained my master's degree in Language Science and Technology at Saarland University and my bachelor's degree in German Studies at Shanghai Jiao Tong University. I also spent an exchange semester at Heidelberg University in 2018/19.

My current research focuses on the societal impacts of NLP models, with a particular emphasis on detecting and mitigating hate speech and social biases in large language models. I am also interested in advancing the interpretability of deep neural networks and enabling effective personalization in AI systems.

In my free time, I enjoy walking around the city. As probably the worst cook in Germany, I am always willing to join a food adventure. You can check out some failed fried eggs here.

Publications

Yifan Wang, Jinyi Mu, Mayank Jobanputra, Yu Wang, Ji-Ung Lee, Soyoung Oh, Isabel Valera, Vera Demberg. Sparse Mixture-of-Experts Reward Models Learn Interpretable and Specialized Experts for Personalized Preference Modeling. arXiv preprint (2026). paper
Yifan Wang, Mayank Jobanputra, Ji-Ung Lee, Soyoung Oh, Isabel Valera, Vera Demberg. Bridging Fairness and Explainability: Can Input-Based Explanations Promote Fairness in Hate Speech Detection? International Conference on Learning Representations (ICLR 2026). paper
Yifan Wang, Sukrut Rao, Ji-Ung Lee, Mayank Jobanputra, Vera Demberg. B-cos LM: Efficiently Transforming Pre-trained Language Models for Improved Explainability. Transactions on Machine Learning Research (TMLR 2025). paper
Mayank Jobanputra, Alisa Kovtunova, Brisca Balthes, Fedor Grigoryevich Pogulskiy, Yifan Wang, Stefan Borgwardt, Vera Demberg. ProofTeller: Exposing recency bias in LLM reasoning and its side effects on communication. Proceedings of the International Joint Conference on Natural Language Processing & Asia-Pacific Chapter of the Association for Computational Linguistics (IJCNLP-AACL 2025). paper
Mayank Jobanputra, Nils Philipp Walter, Maitrey Mehta, Blerta Veseli, Evan Parker Kelly Chapple, Sneha Chetani, Yifan Wang, Ellie Pavlick, Antonio Vergari, Vera Demberg. Can LLMs subtract numbers? Proceedings of the 3rd Workshop on Mathematical Natural Language Processing (MathNLP 2025). paper
Yifan Wang, Vera Demberg. RSA-Control: A Pragmatics-Grounded Lightweight Controllable Text Generation Framework. Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing (EMNLP 2024). paper
Yifan Wang, Vera Demberg. A Parameter-Efficient Multi-Objective Approach to Mitigate Stereotypical Bias in Language Models. Proceedings of the 5th Workshop on Gender Bias in Natural Language Processing (GeBNLP 2024). paper
Dongqi Liu, Yifan Wang, Jia Loy, Vera Demberg. SciNews: From Scholarly Complexities to Public Narratives--A Dataset for Scientific News Report Generation. Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024). paper
Dongqi Liu, Yifan Wang, Vera Demberg. Incorporating Distributions of Discourse Structure for Long Document Abstractive Summarization. Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (ACL 2023). paper

Academic Services

Conference reviewer: ACL ARR 2025, 2026 (Outstanding Reviewer, Oct 2025)
Journal reviewer: IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI)