Publications

Research spans computational linguistics, cognitive computation, discourse analysis, NLP, and digital humanities, with work published in top-tier venues including PNAS, Cognition, Cognitive Science, Neural Networks, and Linguistics.

40+ International Journals
10+ Chinese CSSCI
2 Book Chapters
5+ Keynote & Conf.

* indicates corresponding author

📄 Selected Recent Publications

2026
[1] Sun, K., & Wang, R.* (2026). The ebb and flow of discourse connectives: Stylistic change or cognitive decline? Language and Cognition.
[2] Sun, K.*, Wang, R.*, & Baayen, H. (2026). Breaking myths in LLM scaling and emergent abilities with a comprehensive statistical analysis. Neurocomputing.
2025
[3] Wang, R., & Sun, K.* (2025). DSPy-based neural-symbolic pipeline to enhance spatial reasoning in LLMs. Neural Networks.
[4] Sun, K.*, & Wang, R.* (2025). A novel dependency framework for enhancing discourse data analysis. Data Intelligence.
[5] Sun, K., Wang, R.*, & Baayen, H. (2025). Attention-aware measures of semantic relevance for predicting human reading behavior. Linguistics.
[6] Sun, K.*, & Wang, R.* (2025). Computational sentence-level metrics of reading speed and its ramifications for sentence comprehension. Cognitive Science, e70092.
[7] Sun, K.*, & Liu, H. (2025). Attention-aware semantic relevance predicting Chinese sentence reading. Cognition, 105991.
2023
[8] Sun, K.*, Wang, Q., & Lu, X. (2023). An interpretable measure of semantic similarity for predicting eye movements in reading. Psychonomic Bulletin & Review, 30, 1227–1242.
[9] Liu, Y., Yan, Y., Xia, H., & Sun, K. (2023). Analysing the longitudinal course selection panel data (2014–2020) of K-12 teachers from Zhejiang province: A comprehensive study on in-service training needs. Professional Development in Education, 1–21.
2022
[10] Sun, K.*, & Wang, R. (2022). The role of mutual information and semantic similarity in sentence processing: The case of dangling construction in Chinese. Journal of Cognitive Psychology, 35(2), 142–165.
[11] Sun, K. (2022). Colloquialization as a key factor in historical changes of rational and emotional words. Proceedings of the National Academy of Sciences (PNAS), 119(26), e2205563119. ★ Featured by MIT Technology Review
[12] Sun, K.*, & Lu, X. (2022). Predicting Chinese readers' perception of sentence boundaries in written Chinese. Reading & Writing, 35, 1889–1910.
[13] Sun, K.*, & Wang, R. (2022). Constructing a corpus of Chinese textual "run-on" sentences (CCTRS): Discourse corpus benchmark with multi-layer annotations. International Conference on Natural Language and Speech Processing, ACL, 265–276.
[14] Wang, J., Tang, C., Wan, Z., Zhang, W., Sun, K., & Zomaya, A.Y. (2022). Efficient and effective one-step multi-view clustering. IEEE Transactions on Neural Networks and Learning Systems.
2021
[15] Sun, K.*, & Wang, R.* (2021). Using the relative entropy of linguistic complexity to assess L2 language proficiency development. Entropy, 23(8), 1080.
[16] Sun, K.*, Xiong, W., & Wang, R. (2021). Investigating genre distinctions through discourse distance and discourse network. Corpus Linguistics and Linguistic Theory, 17(3), 599–624.
[17] Sun, K.*, Liu, H., & Xiong, W. (2021). The evolutionary pattern of language in scientific writings: A case study of Philosophical Transactions of Royal Society (1665–1869). Scientometrics, 1695–1724.
2020
[18] Sun, K.*, & Baayen, H. (2020). Hyphenation as an efficient compounding strategy in English. Language Sciences, 83(1), 101326.
2019
[19] Sun, K.*, & Xiong, W. (2019). A computational model for measuring discourse complexity. Discourse Studies, 21(6), 690–712.
[20] Sun, K. (2019). Teaching English-Chinese textual translation strategies: A topic-chain approach. Babel: International Journal of Translation, 65(2), 286–315.
[21] Sun, K., & Wang, R.* (2019). Frequency distributions of punctuation marks in English: Evidence from large-scale corpora. English Today, 35(4), 23–35.
[22] Sun, K. (2019). The integration functions of topic chains in Chinese discourse. Acta Linguistica Asiatica, 9(1), 29–57.
2018
[23] Sun, K. (2018). Approaching the double-nominal construction in Mandarin Chinese through the semantic-cognitive interaction. Studia Linguistica, 72(3), 687–724.
[24] Sun, K., & Zhang, L.* (2018). Quantitative aspects of PDTB-style discourse relations across languages. Journal of Quantitative Linguistics, 25(4), 342–371.

🇨🇳 Chinese Publications (Selected)

[25] 孙坤. (2015). Features of classical Chinese texts and the mechanism of punctuation creation: A comparison with the European punctuation tradition [中国古文特征与标点创造机理—与欧洲标点传统对比]. 中国语文 (CSSCI), 2015, No. 6. 人大复印资料 2016; 中国社会科学文摘 2016
[26] 孙坤. (2015). The category, structure, and discourse functions of Chinese topic chains [汉语话题链范畴、结构与篇章功能]. 语言教学与研究 (CSSCI), 2015, No. 5. 人大复印资料 2016
[27] 孙坤. (2014). The characteristics and nature of Chinese topic chains [汉语话题链的特点与本质]. 汉语学习 (CSSCI), 2014, No. 5.
[28] 孙坤. (2013). Applying topic chains to English-Chinese translation: A study of models and strategies [话题链应用于英汉翻译模式与策略研究]. 外语与外语教学 (CSSCI), 2013, No. 1.
[29] 孙坤. (2012). Reflections on the "linguistic turn" in the social sciences: With remarks on the predicament, crisis, and countermeasures of the "social sciences" and the "humanities" [对社会科学"语言转向"现象的思考——兼论'社会科学'和'人文学科'的困境、危机与对策]. 华南理工大学学报, 2012, No. 5. 人大复印资料 2013
[30] 孙坤. (2011). A preliminary study of the English translation of ancient Chinese weapons: The case of English translations of Romance of the Three Kingdoms [中国古代兵器英译初探：以《三国演义》英译本为例]. Translation Quarterly (翻译季刊), 59, 51–83.
[31] 孙坤. (2010). Contemporary research on punctuation outside China [当代国外标点符号研究]. 当代语言学 (CSSCI), 2010, No. 2.
[32] 孙坤. (2007). Lao She and translation [老舍与翻译]. 上海翻译 (CSSCI), 2007, No. 2.

🎤 Important Conference Papers

[C1] Sun, K., & Wang, R. (2025). Enhancing Personality Detection Models with Continuous Outputs Through Mixed Strategy Training. Submitted to the 39th AAAI 2025 Conference (passed first round), Feb, Philadelphia, US.
arXiv.2406.16223 →
[C2] Sun, K., & Wang, R. (2023). A Groundbreaking dependency framework for streamlining discourse corpora. The 18th Linguistic Annotation Workshop, Dec, Malta.
[C3] Sun, K., & Wang, R. (2023). Attention-aware sentence-level metrics predicting human sentence comprehension. The 5th China-Germany Intelligent Robotics Conference, Nov, Tübingen. Keynote Speaker
[C4] Sun, K., & Wang, R. (2022). Constructing a corpus of Chinese textual "run-on" sentences (CCTRS). 5th International Conference on Natural Language and Speech Processing, December, Trento, Italy.
ACL Anthology →
[C5] Sun, K., & Nixon, J. (2020). Surprisal and semantic information in the prediction of language processing: Evidence from EEG data. AMLaP-Asia 2020, Hong Kong.

🔬 Selected Preprints & Under Review

[P1] Sun, K., & Wang, R. (2024). The roles of contextual semantic relevance metrics in human visual processing. Under review.
arXiv.2403.19233 →
[P2] Sun, K., & Wang, R. (2024). Tracking neural dynamics of language comprehension: Semantic integration and lexical expectation during naturalistic discourse reading across extensive EEG channels. Under revision.
[P3] Sun, K., & Wang, R. (2024). Differential contributions of machine learning and statistical analysis to language and cognitive sciences. Revised version submitted.
arXiv.2404.14052 →
[P4] Sun, K., & Wang, R. (2024). Computational sentence-level metrics for predicting human sentence comprehension. Minor revisions required by Cognitive Science.
arXiv.2403.15822 →
[P5] Sun, K., & Wang, R. (2025). Automatic essay multi-dimensional scoring with fine-tuning and multiple regression. Submitted to AI Conference.
arXiv.2406.01198 →
[P6] Sun, K., & Wang, R. (2025). Textual similarity as a key metric in machine translation quality estimation. Under review by Journal of Big Data.
arXiv.2406.07440 →
[P7] Sun, K. (2025). The ebb and flow of discourse connectives: Stylistic change or cognitive decline? Minor revision.
bioRxiv →
[P8] Sun, K., Wang, R., & Søgaard, A. (2025). Comprehensive reassessment of large-scale evaluation outcomes in LLMs: A multifaceted statistical approach. Minor revisions.
arXiv.2403.15250 →

📖 Book Chapters

[B1] Sun, K., & Wang, R. (2023). Decoding Chinese discourse: An exploration through the multi-layer annotated 'run-on' sentences corpus. In Signals and Communication Technology (Springer, to appear).
Springer Series →
[B2] Sun, K. (2021). An investigation of the cognitive and linguistic factors influencing Chinese readers' perception of sentence boundaries in Mandarin. In Comparative Punctuation, pp. 215–235. Berlin: De Gruyter.

📝 Selected Other Publications

[S1] Wang, R., & Sun, K. (2020). Review of sensory linguistics: Language, perception and metaphor. Folia Linguistica, 54(1), 269–275.
[S2] Sun, K. (2020). The opposition of surprisal and semantic information in the prediction of language processing: Evidence from eye-tracking data. The 5th Usage-Based Linguistics Conference, Tel Aviv University, Israel.
[S3] Sun, K. (2019). A regression model for simulating and predicting the use of periods by Chinese natives. Invited keynote, Conference of Punctuation Seen Internationally, Regensburg, Germany.
[S4] Sun, K. (2015). The complexity of zero anaphora in Chinese discourse. The 19th International Conference on Asian Language Processing, Suzhou, China.
[S5] Sun, K. (2009). The Review of Contrastive Linguistics: History and Philosophy. Languages in Contrast, 2, 291–295.