Publications
Research spanning computational linguistics, cognitive computation, discourse analysis, NLP, and digital humanities. Published in top-tier venues including PNAS, Cognition, Cognitive Science, Neural Networks, and Linguistics.
40+ International Journals
10+ Chinese CSSCI
2 Book Chapters
5+ Keynote & Conf.
* indicates corresponding author
Selected Recent Publications
2026
[1] Sun, K., & Wang, R.* (2026). The ebb and flow of discourse connectives: Stylistic change or cognitive decline? Language and Cognition.
[2] Sun, K.*, Wang, R.*, & Baayen, H. (2026). Breaking myths in LLM scaling and emergent abilities with a comprehensive statistical analysis. Neurocomputing.
2025
[3] Wang, R., & Sun, K.* (2025). DSPy-based neural-symbolic pipeline to enhance spatial reasoning in LLMs. Neural Networks.
[4] Sun, K.*, & Wang, R.* (2025). A novel dependency framework for enhancing discourse data analysis. Data Intelligence.
[5] Sun, K., Wang, R.*, & Baayen, H. (2025). Attention-aware measures of semantic relevance for predicting human reading behavior. Linguistics.
[6] Sun, K.*, & Wang, R.* (2025). Computational sentence-level metrics of reading speed and its ramifications for sentence comprehension. Cognitive Science, e70092.
[7] Sun, K.*, & Liu, H. (2025). Attention-aware semantic relevance predicting Chinese sentence reading. Cognition, 105991.
2023
[8] Sun, K.*, Wang, Q., & Lu, X. (2023). An interpretable measure of semantic similarity for predicting eye movements in reading. Psychonomic Bulletin & Review, 30, 1227–1242.
[9] Liu, Y., Yan, Y., Xia, H., & Sun, K. (2023). Analysing the longitudinal course selection panel data (2014-2020) of K-12 teachers from Zhejiang province: A comprehensive study on in-service training needs. Professional Development in Education, 1–21.
2022
[10] Sun, K.*, & Wang, R. (2022). The role of mutual information and semantic similarity in sentence processing: The case of dangling construction in Chinese. Journal of Cognitive Psychology, 35(2), 142–165.
[11] Sun, K. (2022). Colloquialization as a key factor in historical changes of rational and emotional words. Proceedings of the National Academy of Sciences (PNAS), 119(26), e2205563119.
Featured by MIT Technology Review
[12] Sun, K.*, & Lu, X. (2022). Predicting Chinese readers' perception of sentence boundaries in written Chinese. Reading & Writing, 35, 1889–1910.
[13] Sun, K.*, & Wang, R. (2022). Constructing a corpus of Chinese textual "run-on" sentences (CCTRS): Discourse corpus benchmark with multi-layer annotations. International Conference on Natural Language and Speech Processing, ACL, 265–276.
[14] Wang, J., Tang, C., Wan, Z., Zhang, W., Sun, K., & Zomaya, A.Y. (2022). Efficient and effective one-step multi-view clustering. IEEE Transactions on Neural Networks and Learning Systems.
2021
[15] Sun, K.*, & Wang, R.* (2021). Using the relative entropy of linguistic complexity to assess L2 language proficiency development. Entropy, 23(8), 1080.
[16] Sun, K.*, Xiong, W., & Wang, R. (2021). Investigating genre distinctions through discourse distance and discourse network. Corpus Linguistics and Linguistic Theory, 17(3), 599-624.
[17] Sun, K.*, Liu, H., & Xiong, W. (2021). The evolutionary pattern of language in scientific writings: A case study of Philosophical Transactions of Royal Society (1665-1869). Scientometrics, 1695–1724.
2020
[18] Sun, K.*, & Baayen, H. (2020). Hyphenation as an efficient compounding strategy in English. Language Sciences, 83(1), 101326.
2019
[19] Sun, K.*, & Xiong, W. (2019). A computational model for measuring discourse complexity. Discourse Studies, 21(6), 690-712.
[20] Sun, K. (2019). Teaching English-Chinese textual translation strategies: A topic-chain approach. Babel: International Journal of Translation, 65(2), 286–315.
[21] Sun, K., & Wang, R.* (2019). Frequency distributions of punctuation marks in English: Evidence from large-scale corpora. English Today, 35(4), 23-35.
[22] Sun, K. (2019). The integration functions of topic chains in Chinese discourse. Acta Linguistica Asiatica, 9(1), 29-57.
2018
[23] Sun, K. (2018). Approaching the double-nominal construction in Mandarin Chinese through the semantic-cognitive interaction. Studia Linguistica, 72(3), 687–724.
[24] Sun, K., & Zhang, L.* (2018). Quantitative aspects of PDTB-style discourse relations across languages. Journal of Quantitative Linguistics, 25(4), 342-371.
Chinese Publications (Selected)
[25] Sun, K. (2015). Characteristics of classical Chinese and the mechanism of punctuation creation: A comparison with the European punctuation tradition. 中国语文 (CSSCI), 2015, No. 6. 人大复印资料 2016; 中国社会科学文摘 2016
[26] Sun, K. (2015). Chinese topic chains: Category, structure, and discourse function. 语言教学与研究 (CSSCI), 2015, No. 5. 人大复印资料 2016
[27] Sun, K. (2014). The characteristics and nature of Chinese topic chains. 汉语学习 (CSSCI), 2014, No. 5.
[28] Sun, K. (2013). Applying topic chains to English-Chinese translation: A study of models and strategies. 外语与外语教学 (CSSCI), 2013, No. 1.
[29] Sun, K. (2012). Reflections on the "linguistic turn" in the social sciences: On the predicament, crisis, and countermeasures of the 'social sciences' and the 'humanities'. 华南理工大学学报, 2012, No. 5. 人大复印资料 2013
[30] Sun, K. (2011). A preliminary study of the English translation of ancient Chinese weapons: The case of the English versions of Romance of the Three Kingdoms. Translation Quarterly (翻译季刊), 59, 51–83.
[31] Sun, K. (2010). Contemporary research on punctuation marks outside China. 当代语言学 (CSSCI), 2010, No. 2.
[32] Sun, K. (2007). [...] and translation. 上海翻译 (CSSCI), 2007, No. 2.
Important Conference Papers
[C1] Sun, K., & Wang, R. (2025). Enhancing Personality Detection Models with Continuous Outputs Through Mixed Strategy Training. Submitted to the 39th AAAI 2025 Conference (passed first round), Feb, Philadelphia, US.
arXiv.2406.16223
[C2] Sun, K., & Wang, R. (2023). A groundbreaking dependency framework for streamlining discourse corpora. The 18th Linguistic Annotation Workshop, Dec, Malta.
[C3] Sun, K., & Wang, R. (2023). Attention-aware sentence-level metrics predicting human sentence comprehension. The 5th China-Germany Intelligent Robotics Conference, Nov, Tübingen. (Keynote Speaker)
[C4] Sun, K., & Wang, R. (2022). Constructing a corpus of Chinese textual "run-on" sentences (CCTRS). 5th International Conference on Natural Language and Speech Processing, December, Trento, Italy.
ACL Anthology
[C5] Sun, K., & Nixon, J. (2020). Surprisal and semantic information in the prediction of language processing: Evidence from EEG data. AMLaP-Asia 2020, Hong Kong.
Selected Preprints & Under Review
[P1] Sun, K., & Wang, R. (2024). The roles of contextual semantic relevance metrics in human visual processing. Under review.
arXiv.2403.19233
[P2] Sun, K., & Wang, R. (2024). Tracking neural dynamics of language comprehension: Semantic integration and lexical expectation during naturalistic discourse reading across extensive EEG channels. Under revision.
[P3] Sun, K., & Wang, R. (2024). Differential contributions of machine learning and statistical analysis to language and cognitive sciences. Revised version submitted.
arXiv.2404.14052
[P4] Sun, K., & Wang, R. (2024). Computational sentence-level metrics for predicting human sentence comprehension. Minor revisions required by Cognitive Science.
arXiv.2403.15822
[P5] Sun, K., & Wang, R. (2025). Automatic essay multi-dimensional scoring with fine-tuning and multiple regression. Submitted to AI Conference.
arXiv.2406.01198
[P6] Sun, K., & Wang, R. (2025). Textual similarity as a key metric in machine translation quality estimation. Under review by Journal of Big Data.
arXiv.2406.07440
[P7] Sun, K. (2025). The ebb and flow of discourse connectives: Stylistic change or cognitive decline? Minor revision.
bioRxiv
[P8] Sun, K., Wang, R., & Søgaard, A. (2025). Comprehensive reassessment of large-scale evaluation outcomes in LLMs: A multifaceted statistical approach. Minor revisions.
arXiv.2403.15250
Book Chapters
[B1] Sun, K., & Wang, R. (2023). Decoding Chinese discourse: An exploration through the multi-layer annotated 'run-on' sentences corpus. In Signals and Communication Technology (Springer). (to appear).
Springer Series
[B2] Sun, K. (2021). An investigation of the cognitive and linguistic factors influencing Chinese readers' perception of sentence boundaries in Mandarin. In Comparative Punctuation, pp. 215-235. Berlin: De Gruyter.
Selected Other Publications
[S1] Wang, R., & Sun, K. (2020). Review of sensory linguistics: Language, perception and metaphor. Folia Linguistica, 54(1), 269–275.
[S2] Sun, K. (2020). The opposition of surprisal and semantic information in the prediction of language processing: Evidence from eye-tracking data. The 5th Usage-Based Linguistics Conference, Tel Aviv University, Israel.
[S3] Sun, K. (2019). A regression model for simulating and predicting the use of periods by Chinese natives. Conference of Punctuation Seen Internationally, Regensburg, Germany. (Invited Keynote)
[S4] Sun, K. (2015). The complexity of zero anaphora in Chinese discourse. The 19th International Conference on Asian Language Processing, Suzhou, China.
[S5] Sun, K. (2009). The Review of Contrastive Linguistics: History and Philosophy. Languages in Contrast, 2, 291-295.