Frequency Analysis of Complex Sentence Structures in English Written Texts: A Corpus-Based Study Using Brown and LOB Corpora
DOI: https://doi.org/10.31185/eduj.Vol62.Iss1.4737
Keywords: Quantitative analysis; language variation; complex sentence; Brown and LOB corpus
Abstract: Complex sentences constitute a universal feature of human language, characterised by their intricate syntactic structures and diverse types of embedded clauses, which makes them a focal point in linguistic research. This study employs the Brown and LOB written English corpora to conduct a quantitative analysis of complex sentence frequency across four dimensions: overall usage, genre, language variety (American versus British English), and the types and numbers of subordinate clauses. Key findings reveal that genre significantly influences complex sentence frequency: the highest usage is observed in religious texts, literature, biographies, and essays, which is attributed to their formal or serious tone, whilst fiction exhibits the lowest frequency. In contrast, language variety has no statistically significant effect. Notably, both the type and the number of subordinate clauses within complex sentences significantly affect their frequency. Contrary to prior studies, relative clauses (in both American and British English) do not rank as the least frequent clause type, thereby challenging existing linguistic assumptions.
License
Copyright (c) 2026 م.م. رفاه مزبان بادان

This work is licensed under a Creative Commons Attribution 4.0 International License.
