Frequency Analysis of Complex Sentence Structures in English Written Texts: A Corpus-Based Study Using Brown and LOB Corpora

Authors

  • Ass. Lect. Rafah Mazban Badan, Misan Education Directorate

DOI:

https://doi.org/10.31185/eduj.Vol62.Iss1.4737

Keywords:

Quantitative analysis; language variation; complex sentence; Brown and LOB corpus

Abstract

Complex sentences are a universal feature of human language, and their intricate syntactic structures and diverse types of embedded clauses make them a focal point of linguistic research. This study uses the Brown and LOB corpora of written English to conduct a quantitative analysis of complex sentence frequency along four dimensions: overall usage, genre, language variety (American versus British English), and the types and numbers of subordinate clauses. The findings show that genre significantly influences complex sentence frequency: usage is highest in religious texts, literature, biographies, and essays, which is attributed to their formal or serious tone, whilst fiction exhibits the lowest frequency. In contrast, language variety has no statistically significant effect. Both the type and the number of subordinate clauses within complex sentences significantly affect their frequency. Contrary to prior studies, relative clauses do not rank as the least frequent clause type in either American or British English, which challenges existing linguistic assumptions.

Published

2026-02-10

Section

Articles

How to Cite

Ass.Lect Rafah Mazban Badan. (2026). Frequency Analysis of Complex Sentence Structures in English Written Texts: A Corpus-Based Study Using Brown and LOB Corpora. Journal of College of Education, 62(1), 505-518. https://doi.org/10.31185/eduj.Vol62.Iss1.4737