Curriculum Vitae
Kun SUN
Full Tenured Professor of Computational Linguistics
Director, Institute of AI and Language Science · Tongji University, Shanghai
160K+
HuggingFace Downloads
15+
Corpora (11+ Languages)
Academic Positions
Director, Institute of AI and Language Science
Tongji University, Shanghai
11.2025 – present
Full Tenured Professor of Computational Linguistics
Tongji University, Shanghai
09.2024 – present
Adjunct Professorial Research Fellow
Fudan University, Shanghai
03.2024 – present
Senior Scientist / Assistant Professor (Akademischer Rat)
University of Tübingen, Germany
10.2017 – 02.2025
Associate Professorial Research Fellow
Zhejiang University, Hangzhou
08.2016 – 08.2018
Associate Professor
Zhejiang International Studies University, Hangzhou
08.2013 – 08.2016
Lecturer
Taizhou University, Taizhou
08.2007 – 08.2013
Education
Habilitation of Computational Linguistics (Professor Qualification)
University of Tübingen, Germany
10.2017 – 06.2024
Ph.D., (Computational) Linguistics
East China Normal University, Shanghai
09.2009 – 06.2012
M.A., Corpus Linguistics
East China Normal University, Shanghai
09.2004 – 07.2007
B.A., English Linguistics (minor: Statistics)
Anhui Normal University, Wuhu
09.2000 – 07.2004
Other Academic Training:
- 10th Fall School in Computational Linguistics, Stuttgart University (Sept 2019)
- European Summer School in Logic, Language and Information, University of Latvia (Aug 2019)
- Summer School of Tech & Media Communication, Open University & Cranfield University, U.K. (2014)
- Visiting PhD Student, Center for Chinese Languages, Peking University (02.2011 – 01.2012)
- Visiting Scholar, Dept. of Linguistics, University of Manchester, U.K. (09.2010 – 01.2011)
Research Interests
Digital Humanities & Computational Text Analysis
Formal and computational models of discourse structure; computational measurement of coherence, cohesion, and information flow; graph-based and network representations of discourse; discourse dependency and cross-framework conversion (RST, PDTB, dependency); multilingual discourse parsing; distant reading and large-scale diachronic analysis of literary and scholarly texts; computational modeling of lexical semantic change and language evolution.
Computational Linguistics & NLP
Fine-tuning and adaptation of LLMs for philological and humanistic research; neural-symbolic prompting & reasoning; hyper-dimensional & structured semantic representations; emotion and personality detection (text + multimodal); automatic essay scoring and feedback systems; machine translation evaluation for literary and scholarly texts.
AI Methods, Ethics & Didactics
Efficient training & inference in LLMs; evaluation and benchmarking of AI models; cognitively inspired reasoning in LLMs; critical assessment of AI biases, cultural tendencies, and societal impacts; development of DH-related curricula integrating AI literacy and ethical reflection.
Computational Cognitive Science
Computational theories of cognition and learning; mechanistic and predictive models of human cognition; computational modeling of multilingual and multimodal language processing; neuro-computational models linking behavior, eye-tracking, EEG, MEG, and fMRI signals; discriminative and error-driven learning models.
Computational Social Science
Sociolinguistic stratification and historical corpus analysis; sentiment analysis; AI-aided social network modeling; AI safety, ethics, and social impacts.
Data Analysis & Statistical Methods
Large-scale data collection, curation, and mining; advanced statistical modeling (mixed-effects regression, GAM/GAMM, Bayesian hierarchical modeling); time-series and longitudinal models; causal inference; model comparison and robustness analysis.
Grants & Funding
As Principal Investigator:
China National Science Grant
Attention-aware computational metrics for human multimodal language processing · No. 62512398 · 2025
¥500K
National Social Science Fund of China
Investigating Chinese discourse structure through topic chain and event knowledge · No. 15YY038 · 2015–2020
¥200K
China Postdoctoral Science Foundation, Outstanding Fund
Computational-cognitive studies on run-on sentences in Chinese · No. 2018T110581 · 2017–2020
¥150K
Social Science Fund of Zhejiang Province
The textual function of topic chains and its application in translation · No. 13NDYB145 · 2013–2015
¥30K
Zhejiang Provincial Social Science Association
A computational approach to commas in Chinese texts · No. 2012N067 · 2012–2014
¥9K
Education Fund of Zhejiang Province
"One-stop" English teaching platform via multimedia network · No. 2014SCG090 · 2012–2015
¥5K
As Co-PI / Member:
National Social Science Fund of China
The complex structure of news texts in Chinese · No. 18BYY184 · Co-PI
¥200K
European Research Council (ERC) Advanced Project
Wide Incremental learning with Discrimination nEtworks · No. 742545 · Member
€2.5M
Selected Recent Publications
* indicates corresponding author · Full list: Google Scholar
Peer-Reviewed Journal Papers (2022–2026):
[J1] Journal of Artificial Intelligence Research (2026)
Sun, K. & Wang, R. A novel dependency framework for enhancing discourse data analysis.
[J2] IEEE Transactions on Cognitive and Developmental Systems (2026)
Sun, K. & Wang, R. The roles of contextual semantic relevance metrics in human visual processing.
[J3] Linguistics (2026)
Sun, K., Wang, R., & Baayen, H. Semantic coherence predicts reading fixation durations across languages beyond surprisal and lexical factors.
[J4] Language and Cognition (2026)
Sun, K. The ebb and flow of discourse connectives: Stylistic change or cognitive decline?
[J5] Neurocomputing (2025)
Sun, K. & Wang, R. Breaking myths in LLM scaling and emergent abilities with a comprehensive statistical analysis.
[J6] Neural Networks (2025)
Wang, R. & Sun, K.* A pipeline of neural-symbolic integration to enhance spatial reasoning in large language models.
[J7] Cognitive Science (2024)
Sun, K. & Wang, R. Computational sentence-level metrics for predicting human sentence comprehension.
[J8] Cognition (2024)
Sun, K. & Liu, H. Attention-aware semantic relevance predicting Chinese sentence reading.
[J13] PNAS (2022) · Featured by MIT Technology Review
Sun, K. Colloquialization as a key factor in historical changes of rational and emotional words.
Technical Skills
Programming:
Python (advanced) · R (advanced) · PyTorch (advanced) · LaTeX (advanced) · Linux Shell (advanced) · JavaScript (intermediate) · HTML (intermediate)
Experimentation:
Eye-tracker · EEG · Electromagnetic Articulography · Online experiments · E-Prime
Natural Languages:
Chinese (native) · English (fluent) · Japanese (intermediate) · German (intermediate) · Latin (preliminary)
Software, Databases & Corpora Developed
Software & Tools
- Automatic converter of discourse dependency from discourse corpora
- Analyzer of discourse complexity (syntactic and discourse levels)
- Unified annotation tool for mining PDTB and RST corpora
- Toolkit for visualizing discourse networks
- Toolkit for computing attention-aware computational measures across multiple languages
- AI agent for ZH-EN & EN-ZH literary and scholarly translation
- Synthesis pipeline for automatic linguistic annotations applicable to philological research
Databases
- Historical frequencies for discourse connectives across languages (1820–2010)
- Norms of historical psychosemantic dimensions in English
- Sentiment properties of onomatopoeia in 28 languages
Corpora & LLMs
- Corpus of English hyphenated compounds
- Balanced corpus of discourse dependency in 11 languages
- Corpus of Chinese textual "run-on" sentences with multi-layer annotations
- Fine-tuned LLMs for personality detection (160K downloads)
- Fine-tuned LLMs for mental health detection
- LLMs for grading English compositions for IELTS and L2 (120K downloads)
- Generative LLMs for diagnosing English writing for IELTS
- Synthesis software for automatic linguistic analysis using LLMs
Teaching
Tongji University (2025/2026 Winter & Summer Semester)
Introduction to Computational Linguistics · Advanced Natural Language Processing · Methods in Language Sciences · Project-based Digital Humanities
University of Tübingen (2019–2023)
Text Linguistics and Discourse Processes · Computational Models in Linguistic Research · Language Changes & Variations · Quantitative Methods in Experimental Linguistics · Transformer-Based Language Models
Zhejiang International Studies University (2013–2016)
Contrastive Analysis of Chinese and English · Computer-Aided Translation · Corpus Linguistics · Language and Society · English Reading & Writing
Taizhou University (2007–2009)
Introduction to Linguistics · English Reading & Writing · L2 Learning Strategies
Awards & Achievements
2016 "Zhijiang Young Scholar of Social Sciences", Zhejiang Province
2015–17 "Advanced Research Worker" Award, Zhejiang International Studies University
2016 Second Prize, "Teaching Achievement", Zhejiang International Studies University
2014 "Outstanding Teacher", Zhejiang International Studies University
2010–12 "National First Class Scholarship", East China Normal University
Journal Editorial Roles & Academic Service
Editorial Boards:
- Review Editor, Frontiers in Psychology
- Editorial Board, Scientific Reports
- Editorial Board, BMC Psychology
Reviewing: Reviewer for 80+ international SSCI/SCI journals, including Psychological Science, Nature Human Behaviour, Cognitive Science, Linguistics, Big Data & Society, PLoS ONE, and IEEE Access. Reviewer for top AI conferences: ACL, EMNLP, ICML. Reviewer for the National Social Science Fund of China and National Science Funding of Poland.
Service:
- Supervised approx. 50 bachelor's and master's theses since 2010
- Co-supervised two PhD dissertations (Huiyuan Jin, Yalan Wang), 07.2017–12.2021
- Organized the International Morphological Processing Conference, Tübingen (Nov 2019)
- Organizer, Academic Forum, Zhejiang International Studies University (2013–2016)
- Temporary Coordinator, Collaborative Innovation Center, Zhejiang University (2016–2017)
Last updated: March 2026