Complex(ity) picture emerges: Indices of linguistic complexity in the writing of heritage learners of Russian

by Olesya Kisselev (University of Texas, San Antonio)

The present paper reports the results of a corpus-based study that investigates the concept of linguistic complexity (LC) in the writing of learners of Russian as a heritage language. In the applied linguistics literature, LC has been defined as the range of basic and sophisticated structures (lexical and syntactic) available and accessible to language users (Wolfe-Quintero et al. 1998; Ortega 2003). Operationalized through various indices, such as sentence length; normalized counts of sentences, clauses, and T-units; specific syntactic structures (e.g., subordination and nominalization); counts of unique words, word forms, and lemmas; and the ratio of less frequent to more frequent vocabulary, the construct of LC has been well established as a developmental index in second language writing and speaking and, as such, as a valid assessment measure (Wolfe-Quintero et al. 1998; Ortega 2003; Osborne 2011; Lu 2011; Yang et al. 2015; Lu and Ai 2015). However, the construct has not been applied to heritage language production, and it remains unclear whether LC indices correlate reliably with such important parameters of heritage language development as language proficiency level or literacy abilities.

The current study seeks to address this gap; it attempts to establish which automatically extracted LC indices correlate with independently established writing proficiency levels in Russian heritage writing. To this end, 82 essays written by heritage learners of Russian from across the U.S. were first subjected to the Writing Proficiency Test and then to a set of corpus-based analyses that yielded lexical complexity indices (specifically, average word length per essay; counts of words, word types, and lemmas; counts of sophisticated word types by frequency measures as well as lexical standards; type/token and type/lemma ratios; and number of misspelled words) and syntactic complexity measures (specifically, number of sentences per essay, sentence length, number of clauses per sentence, types of clauses, number of phrases per sentence, and sentence types).
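As a rough illustration of how a few of the simpler indices named above might be extracted automatically, the following Python sketch computes word and type counts, the type/token ratio, average word length, sentence count, and mean sentence length for a single essay. It is not the study's actual pipeline: the function name `complexity_indices`, the regex-based tokenization, and the punctuation-based sentence splitting are all assumptions made for the example. Lemma-based and clause-based measures are omitted because they would require a morphological analyzer and a syntactic parser.

```python
import re


def complexity_indices(text: str) -> dict:
    """Compute a few surface-level complexity indices for one essay.

    Hypothetical helper for illustration only; tokenization and
    sentence splitting are deliberately naive.
    """
    # Assumption: sentences end at ., !, ? or an ellipsis character.
    sentences = [s for s in re.split(r"[.!?…]+", text) if s.strip()]

    # Assumption: a word is a run of letters (any script), optionally
    # joined by internal hyphens; case is ignored when counting types.
    tokens = re.findall(r"[^\W\d_]+(?:-[^\W\d_]+)*", text.lower())
    types = set(tokens)

    n_tokens = len(tokens)
    n_sentences = len(sentences)

    return {
        "tokens": n_tokens,
        "types": len(types),
        "type_token_ratio": len(types) / n_tokens if n_tokens else 0.0,
        "avg_word_length": (
            sum(len(t) for t in tokens) / n_tokens if n_tokens else 0.0
        ),
        "sentences": n_sentences,
        "avg_sentence_length": n_tokens / n_sentences if n_sentences else 0.0,
    }
```

Calling `complexity_indices(essay_text)` on each essay would give one row of indices per text, which could then be compared against the independently assigned proficiency ratings. Because the raw type/token ratio is sensitive to text length, a real analysis would likely need a length-corrected variant.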

The presentation will review the results of the analyses and discuss a complex picture of correlations between specific indices of LC and writing proficiency levels.


Published: Sunday, May 24, 2020