1. Introduction
Language development, which refers to characteristics of a learner’s output that reveal some
point or stage along a developmental continuum (Wolfe-Quintero, Inagaki & Kim, 1998), moves
along three dimensions: fluency, accuracy and complexity. As opposed to the other two
dimensions, linguistic complexity, consisting of lexical complexity (also called lexical richness)
and syntactic complexity, is most relevant to change and the opportunities for development and
growth in the interlanguage system and thus will be the research focus of the current study.
Need for the study
Though a great number of studies have been carried out to investigate lexical richness or
syntactic complexity separately at home and abroad (Hunt, 1970; Crowhurst, 1980, 1983; Laufer,
1991, 1994, 1995, 1998; Bardovi-Harlig, 1992; Vermeer, 2000; Wu Xudong & Chen Xiaoqing,
2000; Ortega, 2003; Liu Donghong, 2003; Yu Hua, 2004; Wen Qiufang, 2006a, b; Qin Xiaoqing,
2007), studies on the developmental tendency of the lexical richness and syntactic complexity
from a longitudinal perspective as well as the interaction between lexical richness and
syntactic complexity (Morris & Crump, 1982) are scarce and far from conclusive. Moreover, as Wen
(2006a) notes, lexical and syntactic characteristics have been heavily explored in
EFL writing (Engber, 1995; James, 2002; Laufer, 1991, 1998; Shaw & Liu, 1998; Li Jingquan &
Cai Jingting, 2001; Ni Lan, 2000; Wen Qiufang et al., 2003, 2004), while similar studies of
the spoken data of EFL learners are much rarer (Vermeer, 2000; Wen Qiufang, 2006a, b).
Accordingly, a longitudinal study on the changes in Chinese L2 learners’ vocabulary and syntax
is necessary.
This line of study should be undertaken in the Chinese context also because its findings
will have significant practical implications for L2 lexis and syntax instruction. It is known that
Chinese L2 teachers lay more emphasis on grammatical accuracy than on complexity both in
instructing and assessing writing, which leads L2 learners to rely on simple
vocabulary and syntactic structures, to the detriment of their language development. At present, we
still lack a clear picture of the developmental patterns of lexical richness and syntactic
complexity for Chinese L2 learners; such a picture would shed light on Chinese L2 teaching.
Research purpose
This study is undertaken with the aim of exploring the developmental patterns of L2
learners’ lexical richness and syntactic complexity. Specifically, the purpose of the present study
is three-fold: firstly, to reveal the developmental patterns of L2 learners’ lexical richness and
syntactic complexity across three years; secondly, to compare the growth rates of lexical richness
and syntactic complexity in their oral output at the two intervals; thirdly, to examine the
relationship between the L2 learners’ lexical richness and their syntactic complexity in each of
the three years.
2. Literature review
In the field of Second Language Acquisition (SLA) research, language competence can be
studied from different aspects. As for productivity, language competence can move along two
dimensions: lexical complexity (also called lexical richness) and syntactic complexity.
Additionally, according to Wolfe-Quintero et al. (1998), complexity means that a wide variety or
a wide range of both basic and sophisticated structures and words are available and can be
accessed quickly. In Wolfe-Quintero’s definition, the first half refers to syntactic complexity
while the latter refers to lexical richness. This chapter consists of three parts. The first part
focuses on lexical richness, the second part on syntactic complexity and the third part on
problems in the previous studies.
Lexical richness
Many scholars (Linnarud, 1986; Nihanani, 1981; Hyltenstam, 1988; Engber, 1995) have
investigated lexical richness. Laufer (1994) defined lexical richness as consisting of
lexical variance, lexical density, lexical sophistication and lexical originality.
Several types of ratio measures have been utilized in research on second language lexical
development in writing. Lexical variance was measured by a type/token ratio (Laufer, 1991).
Lexical density was calculated by dividing the number of types by the number of lexical tokens
(Engber, 1995). Lexical sophistication was measured by the ratio of the advanced lexemes to the
total number of words, as done in Engber (1995). Lexical originality was calculated by dividing
the number of tokens unique to a writer by the total number of tokens (Linnarud, 1986).
Among these measures, the lexical variation and lexical sophistication measures are
most frequently used. Wolfe-Quintero et al. (1998) noted that lexical complexity was manifest in
writing primarily in terms of the range (lexical variation) and size (lexical sophistication) of a
second language writer’s productive vocabulary. They concluded that measures of lexical
variation and sophistication appeared to best relate to second language development. Although
lexical variation and sophistication measures have not been systematically investigated in many
studies or for many program levels, they did offer promise as indicators of language
development. This thesis aims to review lexical variance and lexical sophistication as two
indicators of lexical richness.
Lexical variance
In Linnarud’s (1986) study, lexical variance was defined as the total number of different
lexical items or word types divided by the total number of lexical words in a text. The subjects
fell into two groups: the L2 learner group - 17-year-old Swedish learners (L2 high school
juniors), and the native speaker group at the same school level. They were asked to write a
picture description essay in 40 minutes. Linnarud (1986) compared the compositions in lexical
variance between the two groups. She found a clear difference in lexical variance between the L2
learners and the native speakers: the L2 learners lacked lexical variation. She also had each
composition holistically scored in order to examine whether there was a significant relationship
between lexical variance and L2 writing quality. However, no relationship was found between
the holistic scores and this measure for either the L2 learner group or the native speaker group.
In Nihanani’s (1981) study, lexical variance was defined as the total number of different
lexical items divided by the total number of lexical words in a text. Nihanani (1981) collected the
take-home essays written by L2 university students. She counted each lexical variance score
based on the given definition and had each essay holistically scored. The same result as
Linnarud’s (1986) was found: there was no significant relationship between the holistic scores
and lexical variance.
In Hyltenstam’s (1988) study, the L2 learners were second year high school students. They
were asked to write a summary and response to a 20-minute film without time limit. Unlike
Nihanani (1981) and Linnarud (1986), Hyltenstam (1988) controlled for the text length when
calculating a lexical variance score. However, Hyltenstam (1988) found a similar result: there
was no relationship between lexical variance and L2 writing quality.
In Engber (1995) and Linnarud (1986), lexical variance was defined in the same way.
However, Engber (1995) found a different result. In her study, the L2 learners were students at
intermediate to high-intermediate levels of language proficiency. They were required to write
on the same topic within 35 minutes. The topic was chosen from a pool of topics that had been
proven to be suitable for eliciting responses at different levels. She used a holistic scoring
scheme to measure the quality of each composition. The quality scores were then compared with
the quantitative measures of lexical variance. Her calculation of a lexical variance score was
unique: she divided every essay into 126-word segments and treated each segment as a separate
unit. An average lexical variance score for the essay was then calculated as the ratio of the
sum of the different words per segment to the sum of the total number of lexical words per
segment. She calculated the measure of lexical variance first with lexical errors included and
then with errors eliminated, and found moderately high, statistically significant correlations
between writing quality and each of the two measures. A comparison of the means for these
two measures showed a higher correlation for lexical variation without error (r = ) than for
that with error (r = ).
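Engber's segment-based calculation can be illustrated with a minimal Python sketch. It assumes that the essay has already been reduced to a list of lexical words; the function name `segmented_lexical_variance` is hypothetical, and keeping a shorter final segment is our assumption rather than a documented part of her procedure.

```python
def segmented_lexical_variance(lexical_words, segment_size=126):
    """Sum of different words per segment divided by sum of lexical words per segment.

    Assumption: `lexical_words` already contains only lexical words, and a
    final segment shorter than `segment_size` is still counted.
    """
    if segment_size <= 0:
        raise ValueError("segment_size must be positive")
    total_types = 0
    total_tokens = 0
    # Each segment is treated as a separate unit, so a word repeated in two
    # different segments is counted as a type in both of them.
    for start in range(0, len(lexical_words), segment_size):
        segment = lexical_words[start:start + segment_size]
        total_types += len(set(segment))
        total_tokens += len(segment)
    return total_types / total_tokens if total_tokens else 0.0


# Toy example with 3-word segments instead of 126-word segments.
words = ["a", "b", "a", "a", "b", "c"]
score = segmented_lexical_variance(words, segment_size=3)
```

In the toy example the two segments contribute 2 and 3 types respectively, so the segmented score (5/6) is higher than the whole-text type/token ratio (3/6), which shows how segmentation dampens the effect of text length.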
Wolfe-Quintero et al. (1998, p. 109) held that this measure captured the intuition that
second language writers at a higher proficiency level will command a larger vocabulary size and
will be able to use significantly more lexical word types than writers at a lower proficiency level.
Lexical sophistication
A number of researchers (Laufer, 1991; Linnarud, 1986; Liu Donghong, 2003) used lexical
sophistication to measure how many low frequency or advanced words were used in a text.
Linnarud (1986) defined lexical sophistication as the number of sophisticated lexical words
divided by the total number of lexical words in a text and sophisticated lexical words as those
English words that were generally introduced at grade 9 and above in the Swedish educational
system. She found that native language writers used significantly more sophisticated words than
second language writers ( versus ), but found a low correlation between the ratio of
sophisticated words and the holistic ratings of the compositions. The low correlation may be
understandable, since the students were at a lower language proficiency level and had no
command of a large active vocabulary.
Laufer (1994) defined lexical sophistication as the ratio of the total number of sophisticated
word types divided by the total number of word types. She analyzed four different measures of
sophistication on pre- and post-compositions by two advanced university classes. In two of the
analyses, she counted sophisticated words as words not on a 2000-word frequency list and words
on a university-level word list, and found the measures significant for both groups. In the other
two analyses, she counted sophisticated words as words not on any of her frequency lists, and
found no significant effect.
Liu Donghong (2003) used the Lexical Frequency Profile in calculating lexical
sophistication scores. Unlike Linnarud (1986), she defined lexical sophistication as the number
of sophisticated words divided by the total number of word tokens in a text. In her study,
advanced words were defined as words in the AWL and Off-list (beyond 2,000). Her subjects were
57 second-year college students at a Chinese university. They were required to write on a given
topic within 30 minutes. After the compositions were collected, holistic rating was used on a
15-point scale, according to the criteria of College English Test Band Four in China. Before
obtaining advanced words by running VocabProfile (Nation and Heatley, 1994), software for
word frequency statistics, she deleted misspelled words from advanced words, for the
VocabProfile package counts misspelled words as off-list words. In addition, she counted
different inflected forms of a sophisticated word as one word type and so repetitive counting of
the same words (lexemes) was avoided. As a result, Liu Donghong (2003) found that lexical
sophistication did not affect L2 writing quality. Liu Donghong’s (2003) result seemed to be
justifiable, too, since her students were second-year non-English majors, who could not freely
use a lot of advanced words and so displayed little difference in using sophisticated words.
Besides, Laufer (1991) defined lexical sophistication as the percentage of "advanced words" in
the text.
To conclude, lexical sophistication explains lexical richness in terms of the size of a
learner’s productive vocabulary (Wolfe-Quintero et al., 1998, p. 101). The size is reflected by
the use of advanced (low-frequency) words in a text: high-frequency words, used by
both low- and high-level learners, cannot show the “size” difference between them, whereas
low-frequency words are not shared equally by learners of different proficiency levels, i.e.,
high-level students tend to use more low-frequency words than low-level students.
Syntactic complexity
In Ortega’s (2003) study, syntactic complexity (also called syntactic maturity or linguistic
complexity) referred to the range of forms that surfaced in language production and the degree of
sophistication of such forms. This construct is important in second language research because of
the assumption that language development entails, among other processes, the growth of an L2
learner’s syntactic repertoire and her or his ability to use that repertoire appropriately in a variety
of situations.
Syntactic complexity measures are of two types: those that analyze the clauses, sentences,
or T-units in terms of each other (e.g., clauses per sentence, dependent clauses per T-unit, T-units
per sentence); and those that analyze the presence of specific grammatical structures in relation
to clauses, T-units, or sentences (e.g., passives per sentence, Kameen, 1979; complex nominals
per T-unit, Cooper, 1976).
In the past two decades, these various measures of syntactic complexity have been used by many
researchers (Cragg & Nation, 2006; Nippold, Hesketh, & Duthie, 2005; Nippold, Mansfield, &
Billow, 2007; Ortega, 2003; Wolfe-Quintero et al., 1998). Wolfe-Quintero et al. (1998) looked
cumulatively at the strength of the T-unit, mean length of clause, clauses per T-unit, dependent
clauses per clause and many other indices of syntactic complexity and concluded that clauses per
T-unit (C/T) and dependent clauses per clause or per T-unit (DC/C or DC/T) were the most
satisfactory measures, because they were associated linearly and consistently with their programs
or proficiency levels. However, compared with dependent clauses per T-unit (DC/T), dependent
clauses per clause (DC/C) was more frequently applied in previous experimental studies.
Therefore, in this study, we adopt an advanced T-unit complexity ratio, the clauses plus verb
phrases per T-unit measure ((C+VP)/T), which was derived from C/T, and dependent clauses per
clause (DC/C) as two indices of syntactic complexity.
T-unit complexity ratio
Hunt (1965) first developed the T-unit as a measure of children’s syntactic maturity in
writing, defining the T-unit as a minimal terminable unit consisting of a main (independent)
clause plus whatever subordinate clauses and phrases that happen to be attached to and
embedded within it. Following Hunt (1965, 1970), T-unit is used as the production unit in this
study.
The T-unit complexity ratio measures how grammatically complex the writing of a
learner is, under the assumption that the more clauses per T-unit there are, the more complex the
writing is (Wolfe-Quintero et al., 1998). However, previous studies based on it have found mixed
results. Some of them found a significant relationship between proficiency and the T-unit
complexity ratio while others did not.
Hirano (1991) found a relationship between program level and clauses per T-unit, but not
between CELT scores and clauses per T-unit. Cooper (1976) and Monroe (1975) found a
relationship between school level and clauses per T-unit. Flahive and Snow (1980) found a
relationship between holistic ratings and clauses per T-unit for the first, second, third, and sixth
program levels, but not for the fourth or fifth levels. Bardovi-Harlig and Bofman (1989) and
Perkins (1980) did not find a relationship between clauses per T-unit and pass/fail ratings of
advanced learners, nor did Ishikawa (1995) find a relationship between clauses per T-unit and
pre- and post-tests with two groups of beginning learners. Casanave (1994) found an overall
increase in clauses per T-unit after three semesters of journal writing, but did not test the
differences statistically. Neither Kameen (1979) nor Sharma (1980) found a relationship between
clauses per T-unit and low-intermediate versus advanced groups. Beers & Nagy (2007)
examined the relationship of clauses per T-unit with rated quality for two genres of text produced
by middle school students. A sample of 41 seventh and eighth grade students composed a
narrative and a persuasive essay. Texts were rated for quality and coded for clauses per T-unit.
Clauses per T-unit was positively correlated with quality for narratives, but negatively correlated
with quality for essays.
Generally speaking, T-unit complexity ratio (C/T) is a comparatively reliable index of
syntactic complexity among all of the developmental indices. However, it
neglects verb phrases, another kind of grammatical structure that reflects syntactic complexity.
As a consequence, an advanced T-unit complexity ratio, (C+VP)/T, is proposed and will be
adopted in the present study to measure syntactic complexity.
DC/C
The dependent clause ratio is a measure that examines the degree of embedding in a text,
by counting the number of dependent clauses as a percentage of the total number of clauses
(DC/C). It should be pointed out that few researchers defined clearly what they meant by
dependent clauses in their studies except Kameen (1979), who implied in his discussion that they
included adverbial, adjective, and nominal clauses.
Among previous related studies, Hirano’s (1991) study found that this measure significantly
differentiated all three program levels based on CELT score ranges, but only weakly correlated
with CELT scores themselves. Such a result was found for many measures, which means that the
actual scores were not directly related to a measure such as this but that writers with the same
proficiency range did have something in common on this and other measures. Her three groups
ranged from an average of .18 (low) to .25 (mid) to .33 (high) dependent clauses per T-unit.
However, Kameen (1979) did not find a significant difference between two groups based on
holistic ratings of their writing (.40 dependent clauses per clause for the good writers and .37 for
the poor writers). Kameen (1979) suggests that good writers produce longer T-units as a result of
using more words rather than more clauses, most likely because they reduce clauses to
prepositional, infinitive and participle phrases.
Problems in the previous studies
Although research on lexical richness and syntactic complexity has increased in volume and
produced many interesting results, there are still some problems in the previous studies.
First of all, most of the extant studies on lexical richness and/or syntactic complexity are
cross-sectional ones (Crowhurst, 1980, 1983; Bardovi-Harlig, 1992; Wu Xudong & Chen
Xiaoqing, 2000; Liu Donghong, 2003; Yu Hua, 2004; Qin Xiaoqing, 2007) and longitudinal ones
are much rarer (Wen Qiufang 2006a, b).
Moreover, in recent years, researchers at home and abroad have shown an increasing interest in
L2 learners’ writing performance (Engber, 1995; James, 2002; Laufer, 1991, 1998; Shaw & Liu,
1998; Li Jinquan & Cai Jinting, 2001; Ni Lan, 2000), but only a few of them (Altman, 1997; Wen
Qiufang, 2003, 2004) focus on the oral performance of L2 learners.
Additionally, Wolfe-Quintero et al.’s (1998) synthesis of the previous
studies of developmental indexes concluded that C/T and DC/C are two discriminant
indicators of syntactic complexity with high construct validity. However, both mainly
focus on the degree of subordination and entirely neglect verbal phrases, including
participles, gerunds and infinitives, which could also reflect the complexity of syntactic
constructions in oral or written data. Thus, a better developmental index, such as (C+VP)/T,
may be preferable for analyzing L2 learners’ syntactic complexity.
Lastly, quite a few studies investigate the relationship among three dimensions of language
development: fluency, accuracy and complexity or the relationship between any two of them (Yu
Hua, 2004; Qin Xiaoqing, 2007), or compare the lexical richness and syntactic complexity of
Chinese L2 learners with those of international L2 learners (Li Changsheng, 2007) or with native
speakers (Wen Qiufang, 2006a; Zhang Ping, 2007), and yet the dynamic and interactive research
on the developmental patterns of lexical and syntactic complexity and the interaction between
them from a longitudinal perspective is still non-existent, whether at home or abroad.
To sum up, the previous empirical studies are rather fragmentary, making it hard to draw
consistent general conclusions, which justifies the need for the present study.
3. Methodology
Research questions
The current study investigates the developmental patterns of L2 learners’ lexical richness
and syntactic complexity over three years of learning, their growth rates, and the
relationship between them in the three years. The specific research questions are as follows:
(1) Do the L2 learners increase their lexical richness and syntactic complexity in three
years?
(2) Are there any great differences in the growth rates of the L2 learners’ lexical richness
and syntactic complexity at the first interval (from Year One to Year Two) and the second
interval (from Year Two to Year Three)?
(3) Is there any relationship between the L2 learners’ lexical richness and their syntactic
complexity in each year?
Variables and operational definitions
Lexical richness
Lexical richness is measured in this study in terms of its two most revealing indices: lexical
variance (LV) and lexical sophistication (LS).
Lexical variance (LV) is defined as the type/token ratio (TTR), i.e., the ratio in percentage
between the different lexemes (types) in the text and the total number of words (tokens) (Laufer,
1991; 1994a, b). When types were counted, the different inflectional forms of a word were
regarded as one lexeme; for instance, ‘run, runs, running and ran’ were counted as the same
lexeme ‘run’. For this purpose, an online lemmatizer was adopted to process all the transcribed
spoken data. However, a few words with the same form but different meanings were
lemmatized incorrectly by the online lemmatizer, so they were corrected
manually. For example, the word “means” may be the third person singular of the
verb “mean”, “to convey or denote some facts or opinions”, or a noun that
refers to “a method or way of doing something”. Such cases require careful
manual checking. Finally, the TTR values of each sample were standardized on a 100-word basis
(the minimal length of the transcribed monologues is 119 running words). This procedure was
followed to level out the effect of text length on the type/token ratio.
The formulas are:
LV = No. of types / No. of tokens
LS = No. of advanced lexemes (types) / No. of tokens
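The two ratios can be sketched in a few lines of Python. This is only an illustration under stated assumptions: a small hand-made lemma table and advanced-word set stand in for the lemmatizer and frequency lists used in the study, all function and variable names are hypothetical, and the 100-word standardization is not reproduced.

```python
def lemmatize(tokens, lemma_map):
    """Map each token to its lexeme; unknown tokens are simply lower-cased."""
    return [lemma_map.get(token.lower(), token.lower()) for token in tokens]


def lexical_variance(tokens, lemma_map):
    """LV = number of types (distinct lexemes) / number of tokens."""
    if not tokens:
        return 0.0
    return len(set(lemmatize(tokens, lemma_map))) / len(tokens)


def lexical_sophistication(tokens, lemma_map, advanced_lexemes):
    """LS = number of advanced lexemes (types) / number of tokens."""
    if not tokens:
        return 0.0
    types = set(lemmatize(tokens, lemma_map))
    return len(types & advanced_lexemes) / len(tokens)


# Hypothetical toy data, not taken from the study's corpus.
lemma_map = {"runs": "run", "ran": "run", "running": "run"}
advanced = {"facilitate"}
sample = "He runs and she ran to facilitate running".split()

lv = lexical_variance(sample, lemma_map)                  # 6 types / 8 tokens
ls = lexical_sophistication(sample, lemma_map, advanced)  # 1 type / 8 tokens
```

Because 'runs', 'ran' and 'running' collapse into the single lexeme 'run', the eight tokens yield six types, so inflected forms do not inflate the type count.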
Syntactic complexity
Syntactic complexity is defined as great length and subordination of T-units. The measures of
syntactic complexity in this study are ratios rather than frequencies, for it has been
pointed out that frequency measures may be doubtful because of the lack of a fixed delimiter, and
quite a few related experimental studies have failed to support them. Therefore, based on
the literature review, the modified T-unit complexity ratio ((C + VP) / T) and dependent clause
ratio (DC/C) with high construct validity were used as measures of complexity in syntactic
development. The formulas are shown as follows:
CV/T = (C + VP) / T
DC/C = DC/C
Notes: T= T-units; C=clauses; VP= verbal phrases; DC=Dependent clauses.
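As a minimal sketch, the two syntactic ratios can be computed from hand-tagged counts as follows. The function names and the example counts are hypothetical and serve only to make the formulas concrete.

```python
def cv_per_t(clauses, verb_phrases, t_units):
    """CV/T = (C + VP) / T, the modified T-unit complexity ratio."""
    if t_units <= 0:
        raise ValueError("a sample must contain at least one T-unit")
    return (clauses + verb_phrases) / t_units


def dc_per_c(dependent_clauses, clauses):
    """DC/C = dependent clauses / total clauses."""
    if clauses <= 0:
        raise ValueError("a sample must contain at least one clause")
    if dependent_clauses > clauses:
        raise ValueError("dependent clauses cannot exceed total clauses")
    return dependent_clauses / clauses


# Hypothetical counts for one tagged monologue:
# 10 T-units, 15 clauses (5 of them dependent), 5 verbal phrases.
cv_t = cv_per_t(clauses=15, verb_phrases=5, t_units=10)
dc_c = dc_per_c(dependent_clauses=5, clauses=15)
```

Since every T-unit contains exactly one independent clause, the 15 clauses here consist of 10 independent and 5 dependent clauses, which is why DC/C is one third while CV/T is 2.0.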
The terms in the formulas need explanation. T-unit is used as the basic unit of ratio analysis
of syntactic complexity in the present study. T-units rather than C-units are used because the task
performance is monologic and contains few elided utterances (See Foster, Tonkyn &
Wigglesworth, 2000, for a discussion of the relative merits of using T-units or C-units).
Following Hunt (1970), a T-unit is seen as one main or independent clause plus whatever
subordinate or dependent clauses are attached to or embedded within it. The number of
T-units can be thought of as nothing more than the number of independent clauses in
each written or spoken sample, since each T-unit consists of one independent clause (Hunt,
1965).
A clause is operationalized as a structure with an overt subject and a finite verb (Hunt,
1965) in this study. This definition of clause includes independent / main clauses, as well as
three types of dependent/subordinate clauses: adverbial clauses, adjective/relative clauses, and
nominal clauses. Following Bardovi-Harlig and Bofman’s (1989) definition, verb phrases (VP)
are classified into three types in this study: participles, gerunds, and infinitives. Dependent
clauses (DC) are defined as adverbial, adjectival, and nominal clauses (Kameen, 1979).
In counting these units, this study made one modification. The oral samples contain some
repeated fillers and false starts caused by hesitation, self-correction, etc., which may affect the
measurement of syntactic complexity, so the researcher excluded them from each oral sample when
tagging the transcribed texts.
Data collection
The participants in this study were 50 English majors who were enrolled in a key university
in 2001 and asked to complete an oral task by producing a three-minute monologue after three
minutes’ preparation in a language lab. Their spoken English data were collected three times, in
December of 2001, 2002 and 2003, and then transcribed for further analysis. The
topics for their oral tasks were all argumentative, similar in nature and related to their
college life. The reason for not repeating exactly the same topic over a long
longitudinal study is that the potential for diminished interest (and even demotivation or
boredom), as well as practice effects, among participants would be a clear danger to the validity
of the data (Ortega & Iberri-Shea, 2005). The collecting time, topics and running words of
the oral data in each year are described in the table below.
Table: Description of the oral data
Year (N) | Collecting time | Topic for oral test | Overall tokens (running words) of oral data
Year 1 (N = 50) | December, 2001 | Is it appropriate for a college student to rent an apartment and live outside the campus? | 13,681
Year 2 (N = 50) | December, 2002 | Make critical comments on the use of electronic dictionaries among college students. | 13,985
Year 3 (N = 50) | December, 2003 | Do you think it is appropriate for college students to get married? Give your opinions and reasons. | 14,016
Data analysis
Analysis of the transcribed oral data consisted of four stages: applying Wordsmith to
calculate the value of lexical variance in each transcript and Range 32 to obtain the number of
advanced lexemes and the overall tokens for lexical sophistication in the same transcript; tagging
the units relevant to syntactic complexity, including T-units, clauses, verb phrases and dependent
clauses; computing the lexical sophistication and syntactic complexity measures according to the
corresponding formulae; and calculating the growth rates of the four developmental indices, i.e.,
dividing the value of each index in one year by that of the preceding year.
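The growth-rate step can be sketched as below. The helper name `growth_rates` is hypothetical, and the input values are invented for illustration rather than taken from any learner in the study.

```python
def growth_rates(yearly_values):
    """Divide the value of an index in each year by that of the preceding year.

    For three yearly values this returns two rates: one for the first
    interval (Year 1 to Year 2) and one for the second (Year 2 to Year 3).
    """
    rates = []
    for earlier, later in zip(yearly_values, yearly_values[1:]):
        if earlier == 0:
            raise ValueError("growth rate is undefined when the earlier value is 0")
        rates.append(later / earlier)
    return rates


# Invented index values for one learner in Years 1, 2 and 3.
first_interval, second_interval = growth_rates([2.0, 3.0, 4.5])
```

A rate above 1 indicates growth over the interval, a rate of exactly 1 indicates no change, and a rate below 1 indicates a decline.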
After obtaining all the lexical richness and syntactic complexity indices of the data sets, the
researcher applied a multivariate analysis and a t-test in SPSS to compare the differences in the
L2 learners’ lexical and syntactic complexity across the three years and in their growth rates over
the two consecutive periods (Year 1 to Year 2; Year 2 to Year 3). Then a Pearson correlation
analysis was run to find out whether there was a significant relationship between the L2 learners’
lexical richness and their syntactic complexity in each of the three years.
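Although the study itself relied on SPSS, the Pearson coefficient and its associated t statistic can be sketched in plain Python for readers who wish to check the logic. The function names are hypothetical, and the p-value lookup against the t distribution with n - 2 degrees of freedom is deliberately left out.

```python
import math


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length samples."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two samples of equal length (at least 2)")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return covariation / math.sqrt(ss_x * ss_y)


def t_statistic(r, n):
    """t = r * sqrt((n - 2) / (1 - r^2)), used to test whether r differs from 0."""
    if n < 3 or abs(r) >= 1:
        raise ValueError("need n >= 3 and |r| < 1")
    return r * math.sqrt((n - 2) / (1 - r * r))


# Invented scores: the second list is exactly twice the first, so r is 1.
r_perfect = pearson_r([1, 2, 3], [2, 4, 6])
```

In practice each list would hold one index per learner (for example, the 50 LV values and the 50 CV/T values from the same year), and the resulting t statistic would be compared against the t distribution to obtain the two-tailed significance.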
4. Results and discussion
Results
The present study attempted to answer the three questions raised in the methodology
part: (1) Do the L2 learners increase their lexical richness and syntactic complexity across three
years? (2) Are there any great differences in the growth rates of the L2 learners’ lexical richness
and syntactic complexity at the first interval (from Year One to Year Two) and the second
interval (from Year Two to Year Three)? (3) Is there any relationship between the L2 learners’
lexical richness and their syntactic complexity in each year? To this end, this
study collected the L2 learners’ oral data in three successive years and computed the
developmental indexes, which were processed with SPSS.
Differences in lexical richness and syntactic complexity across three years
The first question was answered by presenting the descriptive statistics and making multiple
comparisons. Descriptive statistics are shown in the table below.
As indicated in the table, the means of each variable in the three years are close to one another.
Comparatively speaking, however, the mean of each variable in Year 3 is a little larger than that
in the other two years.
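Table: Descriptive statistics
Index | Grade | Mean | Std. Deviation | N
LV | 1 | .5335 | .04594 | 50
LV | 2 | .5329 | .04390 | 50
LV | 3 | .5712 | .03708 | 50
LS | 1 | .0481 | .01297 | 50
LS | 2 | .0510 | .01298 | 50
LS | 3 | .0599 | .01605 | 50
CV/T | 1 |  | .39003 | 50
CV/T | 2 |  | .32082 | 50
CV/T | 3 |  | .48132 | 50
DC/C | 1 | .3344 | .08688 | 50
DC/C | 2 | .3282 | .08877 | 50
DC/C | 3 | .4044 | .09522 | 50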
Notes: LV stands for lexical variance, LS for lexical sophistication, CV/T for syntactic complexity measured by
T-unit plus verbal phrase complexity ratio, and DC/C for syntactic complexity measured by dependent clause
ratio.
A GLM multivariate analysis was made to examine further whether there was a significant
difference in the four developmental indexes in the case of the same L2 learners in different
years. The results are displayed in the table below.
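Table: Multiple comparisons of the different indexes
Index | Year (I) | Year (J) | Mean difference (I-J) | Std. Error | Sig. | 95% CI Lower Bound | 95% CI Upper Bound
LV | 1 | 2 |  | .84957 | .998 |  | 
LV | 1 | 3 | * | .84957 | .000 |  | 
LV | 2 | 1 |  | .84957 | .998 |  | 
LV | 2 | 3 | * | .84957 | .000 |  | 
LV | 3 | 1 | * | .84957 | .000 |  | 
LV | 3 | 2 | * | .84957 | .000 |  | 
LS | 1 | 2 |  | .28145 | .589 |  | .4060
LS | 1 | 3 | * | .28145 | .000 |  | 
LS | 2 | 1 |  | .28145 | .589 |  | .9860
LS | 2 | 3 | * | .28145 | .009 |  | 
LS | 3 | 1 | * | .28145 | .009 | .4755 | 
LS | 3 | 2 | .8816* | .28145 | .009 | .1856 | 
CV/T | 1 | 2 | .0378 | .08056 | .896 |  | 
CV/T | 1 | 3 | * | .08056 | .000 |  | 
CV/T | 2 | 1 |  | .08056 | .896 |  | .1614
CV/T | 2 | 3 | * | .08056 | .000 |  | 
CV/T | 3 | 1 | .6466* | .08056 | .000 | .4474 | .8458
CV/T | 3 | 2 | .6844* | .08056 | .000 | .4852 | .8836
DC/C | 1 | 2 | .0062 | .01807 | .943 |  | .0509
DC/C | 1 | 3 | * | .01807 | .001 |  | 
DC/C | 2 | 1 |  | .01807 | .943 |  | .0385
DC/C | 2 | 3 | * | .01807 | .000 |  | 
DC/C | 3 | 1 | .0762* | .01807 | .001 | .0253 | .1147
DC/C | 3 | 2 | .0762* | .01807 | .000 | .0315 | .1209
Note: * The mean difference is significant at the .05 level.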
As the multiple comparisons indicate, there is no significant difference in any of the four
developmental indexes between Year 1 and Year 2 (p>.05), while all four developmental indexes show a
significant difference between Year 3 and either of the other two years (p<.05). In other words, the
L2 learners did not increase their lexical richness and syntactic complexity linearly as they
progressed from Year 1 through Year 2 to Year 3. Their lexical and syntactic development occurred
mainly in Year 3, an advanced L2 learning stage.
Differences in the growth rates of lexical richness and syntactic complexity at the two intervals
The second question was answered by comparing the respective growth rates of lexical
richness and syntactic complexity (represented by four indexes) at the first and second
intervals. The table below displays the results, which are explained with reference to
paired-samples t-test statistics.
Table A comparison of growth rates of lexical richness and syntactic complexity
Notes: LV stands for lexical variance, LS for lexical sophistication, CV/T for syntactic complexity
measured by T-unit plus verbal phrase complexity ratio, and DC/C for syntactic complexity
measured by dependent clause ratio.
Although the growth rates of the four indexes do not differ significantly at the first
interval (p>.05), there is a tendency for the L2 learners’ lexical sophistication to increase
fastest and for their lexical variance and CV/T to show no sign of growth, with DC/C in
between. The obvious growth occurs at the second interval, though the growth rates of the
indexes differ widely. Syntactic complexity represented by CV/T and DC/C grows much
faster than lexical variance (p<.05). Moreover, as the table illustrates, syntactic
complexity tends to grow faster than lexical sophistication at the second interval, though
the difference between them is not statistically significant (p>.05). Overall, syntactic
complexity therefore increases more than lexical richness in the second period, between
Year Two and Year Three. Viewed globally, the growth of lexical richness and syntactic
complexity exhibits the following features. Firstly, although lexical variance, like the other
three developmental indices, exhibits discontinuity from the first period to the second, it
invariably progresses slowest. By contrast, lexical sophistication shows a steadily high
growth rate. In addition, although lexical sophistication tends to grow fastest in the first
period, the two indices representing syntactic development outperform it in the second period.
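A minimal sketch of such a comparison follows. It assumes that a learner's growth rate on an index is the relative change between two adjacent years, (later - earlier) / earlier, which this section does not spell out, and it computes only the paired-samples t statistic and its degrees of freedom. All names and scores are invented for illustration.

```python
from math import sqrt
from statistics import mean, stdev


def growth_rate(earlier, later):
    """Per-learner relative growth between two years (assumed definition)."""
    return [(b - a) / a for a, b in zip(earlier, later)]


def paired_t(x, y):
    """Paired-samples t statistic and degrees of freedom for x versus y."""
    d = [a - b for a, b in zip(x, y)]
    n = len(d)
    t = mean(d) / (stdev(d) / sqrt(n))
    return t, n - 1


# Invented scores for four learners (NOT the study's data).
lv_year2, lv_year3 = [5.0, 4.0, 5.0, 4.0], [5.5, 4.4, 5.0, 4.4]
cvt_year2, cvt_year3 = [2.0, 2.0, 2.5, 2.0], [2.6, 2.8, 3.0, 2.6]

lv_growth = growth_rate(lv_year2, lv_year3)
cvt_growth = growth_rate(cvt_year2, cvt_year3)
t, df = paired_t(cvt_growth, lv_growth)
print(f"t({df}) = {t:.2f}")
```

A large positive t here would indicate that CV/T grew faster than lexical variance for the same learners; the p value would then be read from the t distribution with the printed degrees of freedom.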
Relationship between lexical richness and syntactic complexity across three grades
The third question was answered by examining whether there was any relationship between
the L2 learners’ lexical richness and their syntactic complexity in the three years or grades.
Specifically, a correlation analysis was run to find out whether there was a significant
relationship between the L2 learners’ lexical richness and their syntactic complexity in each
of the three years. The results are displayed in the table below.
Table Correlations between lexical richness and syntactic complexity
                           Grade 1         Grade 2         Grade 3
                           CV/T    DC/C    CV/T    DC/C    CV/T    DC/C
LV  Pearson Correlation    .068    .012    .147    .074
    Sig. (2-tailed)        .640    .936    .307    .609    .448    .921
LS  Pearson Correlation    .156    .125    .070
    Sig. (2-tailed)        .279    .385    .631    .592    .398    .124
As the table above reveals, there is no significant correlation between lexical richness
(represented by lexical variance and lexical sophistication) and syntactic complexity
(represented by CV/T and DC/C) in any grade (p>.05). It is concluded that the L2 learners’
development of lexical richness is independent of their development of syntactic complexity.
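The correlation analysis can be sketched as follows. The code computes Pearson's r for paired scores and converts a correlation into the t statistic on which the two-tailed significance test is based. The sample size of 50 used in the last line is an assumption made only for illustration, since the number of learners is not stated in this section, and the paired scores are invented.

```python
from math import sqrt
from statistics import mean


def pearson_r(x, y):
    """Pearson product-moment correlation between two paired samples."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def t_for_r(r, n):
    """t statistic (df = n - 2) for testing whether a correlation differs from zero."""
    return r * sqrt(n - 2) / sqrt(1 - r * r)


# Invented paired scores (NOT the study's data).
lexical = [1, 2, 3, 4, 5]
syntactic = [2, 1, 4, 3, 5]
r = pearson_r(lexical, syntactic)
print(f"r = {r:.3f}, t = {t_for_r(r, len(lexical)):.3f}")

# A weak correlation such as r = .068 yields a small t under an assumed n of 50.
print(f"t = {t_for_r(0.068, 50):.3f}")
```

A t statistic this close to zero is far from any conventional critical value, which is consistent with the non-significant coefficients in the table.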
Discussion
The preceding section reported the statistical results, which can be summarized in three
points. To start with, the development of the L2 learners’ lexical richness and syntactic
complexity is non-linear. Specifically, neither differs significantly between the learners’
first two years, whereas both are significantly greater in the third year than in either of
the first two. Furthermore, the growth rates of the lexical richness and syntactic complexity
indexes are not the same at the learners’ developmental stages. At the first interval, there
is no significant difference between the growth rate of lexical richness and that of syntactic
complexity. At the second interval, however, syntactic complexity grows faster than lexical
variance, but not significantly faster than lexical sophistication. Finally, across the three
years, the L2 learners’ lexical richness develops independently of their syntactic complexity,
and vice versa. This section offers tentative explanations.
Non-linear development of lexical richness and syntactic complexity
Although the L2 learners were expected to increase their lexical richness and syntactic
complexity as the years progressed, the statistics show no sign of lexical or syntactic
development from Year 1 to Year 2; greater lexical richness and syntactic complexity appeared
in Year 3 alone. There might be several reasons for this.
To start with, the learners have a smaller vocabulary in the first and second years, two
elementary learning stages, than in the third year, an advanced learning stage. Their word
choices in the first two years could therefore be rather limited, which partly explains why
lexical variance does not differ between the first and second years but differs greatly
between the third year and either of the first two.
Secondly, non-basic vocabulary is more difficult in the sense that it is acquired only in
later stages of the language acquisition process (Larsen-Freeman, 2006). The use of non-basic
items in spontaneous speech reveals high proficiency. This may plausibly explain why the L2
learners’ lexical sophistication grew significantly in the second period (Year 2-Year 3) but
not in the first (Year 1-Year 2).
Thirdly, the learners produced more simple sentences in both the first and second years than
in the third year, which contributes to the absence of a difference in syntactic complexity
between the first and second years and to the great difference between the third year and
either of the first two.
Finally, the non-linear developmental patterns of L2 learners’ lexical richness and syntactic
complexity in this study were also found in Wen’s (2006) study, where the period between Year
2 and Year 3 saw the most noticeable progress. This could possibly be accounted for by the
“prime period for learning” hypothesis (Wen, 2006), which states that the period between
Year 2 and Year 3 is an optimal stage for language development during the four years of
undergraduate study. As the teaching methods and learning environment at university are quite
different from those in high school, students need the first year at university to adapt to
them. As a consequence, the obvious growth of lexical richness and syntactic complexity
occurs in the later period.
Lexical richness unrelated to syntactic complexity
With regard to the relationship between lexical richness and syntactic complexity, the
present study yields a surprising yet enlightening result: they progress independently of
each other across the three years. This is consistent with Li’s (2007) study of written data.
It may indicate that Chinese L2 learners do not develop their lexical richness and syntactic
complexity simultaneously or in balance, whether in written or in spoken English. However, Li
(2007) also concluded that, compared with Swedish L2 learners, Chinese L2 learners exhibited
comparable lexical richness but quite a large gap in syntactic complexity in writing, which
indicates that more attention should be paid to syntactic development. By contrast, the
present study shows that in L2 learners’ spoken performance lexical sophistication shows the
fastest growth in the first period, while syntactic complexity measured by (C+VP)/T and DC/C
outperforms it in the second interval. Therefore, L2 learners’ growth of lexical richness and
syntactic complexity in oral output has its own features, which are quite distinct from
those in their written production.
Generally speaking, in spoken performance, L2 learners put more weight on lexical
development in the first interval, in particular the expansion of advanced vocabulary, while
in the second interval they apparently turn more attention to the growth of syntactic
complexity. This indicates separate developmental trajectories of lexical richness and
syntactic complexity in different periods.
Concerning lexical development, lexical sophistication consistently displays relatively fast
and steady growth, while lexical variance progresses slowest among the four developmental
indices. This corresponds to the fact that Chinese L2 learners usually attach much importance
to acquiring a large quantity of words, in particular less frequent words, while ignoring
the diversity of lexical choices. Besides, the finding that lexical sophistication and lexical
variance do not develop in tandem may also suggest that the growth of lexical sophistication
does not necessarily mean varied word choices, especially in spoken English. It is a common
phenomenon that L2 learners with abundant advanced vocabulary have only a limited active
word repertoire and thus cannot diversify their use of L2 vocabulary.
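The distinction between the two lexical measures can be made concrete with a small sketch. The operationalizations below are assumptions made for illustration only, not the formulas used in this study: lexical variance is approximated by the type-token ratio, and lexical sophistication by the share of word types that fall outside a list of basic words. The function name and the basic-word list are hypothetical.

```python
def lexical_profile(tokens, basic_words):
    """Return (type-token ratio, share of types outside the basic word list)."""
    tokens = [t.lower() for t in tokens]
    types = set(tokens)
    variation = len(types) / len(tokens)
    advanced_types = {t for t in types if t not in basic_words}
    sophistication = len(advanced_types) / len(types)
    return variation, sophistication


# Invented utterance and a hypothetical basic-word list.
utterance = "the cat saw the dog and the dog saw the feline".split()
basic = {"the", "cat", "saw", "dog", "and"}

variation, sophistication = lexical_profile(utterance, basic)
print(f"variation = {variation:.3f}, sophistication = {sophistication:.3f}")
```

Because the two ratios are computed from different counts, a speaker can raise one without raising the other, for example by adding a few rare words while still repeating the same small set of items.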
As for the development of syntactic complexity represented by (C+VP)/T and DC/C, this study
offers a further observation. Though the developmental curves of the two syntactic indices in
the table do not differ widely, DC/C still grows at a faster pace than (C+VP)/T in the first
period, while (C+VP)/T surpasses it in the second interval. This indicates that the
development of syntactic complexity is characterized mainly by subordination in the first
period, while verbal phrases, including participles, gerunds and infinitives, show faster and
more noticeable growth in the second interval: both (C+VP)/T and DC/C measure complexity
through subordination, but (C+VP)/T also draws upon a great variety of verbal phrases.
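The difference between the two syntactic indices can be seen in a short sketch that computes both from raw counts. It follows the definitions given in the notes to the growth-rate table: (C+VP)/T is the number of clauses plus verbal phrases divided by the number of T-units, and DC/C is the number of dependent clauses divided by the number of clauses. The function name and the counts are invented for illustration.

```python
def syntactic_indexes(clauses, dependent_clauses, verbal_phrases, t_units):
    """Return ((C+VP)/T, DC/C) computed from raw counts in a speech sample."""
    cvt = (clauses + verbal_phrases) / t_units
    dcc = dependent_clauses / clauses
    return cvt, dcc


# Invented counts: 20 T-units containing 30 clauses, 10 of them dependent,
# plus 6 verbal phrases (participles, gerunds, infinitives).
cvt, dcc = syntactic_indexes(clauses=30, dependent_clauses=10,
                             verbal_phrases=6, t_units=20)
print(f"(C+VP)/T = {cvt:.2f}, DC/C = {dcc:.2f}")

# More verbal phrases raise (C+VP)/T but leave DC/C unchanged.
cvt_more_vp, dcc_more_vp = syntactic_indexes(30, 10, 12, 20)
print(f"(C+VP)/T = {cvt_more_vp:.2f}, DC/C = {dcc_more_vp:.2f}")
```

The second call shows why the two indices can diverge: growth in verbal phrases alone is registered by (C+VP)/T but not by DC/C.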
5. Conclusion
Major findings of the study
The findings of the statistical analyses in Chapter Four can be summarized as follows:
First, both the lexical richness and the syntactic complexity of L2 learners’ oral output
progress non-linearly over their three years of learning. Specifically, only the third year
witnesses significant growth in lexical richness and syntactic complexity.
Second, lexical richness and syntactic complexity develop at different paces in the two
periods. In the first interval, between Year One and Year Two, there is no significant
difference between the growth rate of lexical richness and that of syntactic complexity. In
the second interval, between Year Two and Year Three, syntactic complexity grows at a faster
rate than lexical variance, but not significantly faster than lexical sophistication.
Lastly, there is no significant correlation between lexical richness and syntactic complexity
in any year. It is concluded that the L2 learners’ development of lexical richness is
independent of their development of syntactic complexity.
Implications
Theoretical and pedagogical implications drawn from the findings of this study are
discussed in the following two sections.
Theoretical implications
This study enriches the research on the development of L2 lexical richness and syntactic
complexity in speaking from two perspectives.
Firstly, few longitudinal studies in this area have been carried out in the Chinese context,
especially on oral output (Wen, 2006a). This study therefore furthers our understanding of
the dynamic patterns of lexical richness and syntactic complexity in Chinese L2 learners’
oral performance.
Secondly, building on C/T, this study proposes a new developmental index for measuring
syntactic complexity, (C+VP)/T, which takes verbal phrases into account in addition to the
degree of subordination, and which proved valid and reliable in the present study. It opens
up a new way of assessing syntactic complexity in future research.
Pedagogical implications
The present study also provides insights into the acquisition of vocabulary and syntax for
learners in China as well as the teaching and testing of oral English.
One implication relates to the result that lexical richness and syntactic complexity develop
neither simultaneously nor at the same rate. This result indicates that L2 learners should
pay equal attention to lexical and syntactic development and balance the growth of the two.
In particular, at the beginning stage, L2 learners should emphasize the growth of their
syntactic complexity as well as that of their lexical richness.
Another implication comes from the fact that lexical variance progresses at the slowest pace
among the four developmental indices (LV, LS, (C+VP)/T, DC/C) in both the first and the
second periods. The efficiency and precision of the students’ acquisition process may be
improved by raising class and/or individual student awareness of lexical choices.
Students should also learn to increase their stock of lexical choices and try to use more
varied words in oral production.
It is also worth mentioning that, in developing syntactic complexity, L2 learners should
attach importance to the acquisition and use of verbal phrases in addition to that of varied
subordinate clauses.
The last implication is for teachers. This study indicates that better spoken performance is
characterized by greater diversity of lexical choices and more use of relatively sophisticated
words and syntactic constructions. Teachers are therefore advised to devote more attention to
these aspects of vocabulary and syntax in L2 learners’ oral production through activities
such as commenting promptly on students’ oral performance, giving them more opportunities to
communicate with native speakers, and introducing them to diversified syntactic constructions
and vocabulary.
References
Bardovi-Harlig, K., & Bofman, T. (1989). Attainment of syntactic and morphological
accuracy by advanced language learners. Studies in Second Language Acquisition, 11,
17-34.
Bardovi-Harlig, K. (1992). A second look at T-unit analysis: Reconsidering the sentence.
TESOL Quarterly, 26, 390-395.
Beers, S. F., & Nagy, W. E. (2007). Syntactic complexity as a predictor of adolescent writing
quality: Which measures? Which genre? (online), pp. 1-16. Netherlands: Springer.
Casanave, C. (1994). Language development in students' journals. Journal of Second
Language writing, 3, 179-201.
Cooper, T. C. (1976). Measuring written syntactic patterns of second language learners of
German. The Journal of Educational Research, 69, 176-183.
Cragg, L., & Nation, K. (2006). Exploring written narrative in children with poor reading
comprehension. Educational Psychology, 26, 55–72.
Crowhurst, M. (1980). Syntactic complexity and teachers’ quality ratings of narrations and
arguments. Research in the Teaching of English, 14, 223–231.
Crowhurst, M. (1983). Syntactic complexity and writing quality: A review. Canadian
Journal of Education, 8, 1–16.
Engber, C. A. (1995). The relationship of lexical proficiency to the quality of L2
compositions. Journal of Second Language Writing, 4 (2), 139-155.
Flahive, D. E., & Snow, B. G. (1980). Measures of syntactic complexity in evaluating ESL
compositions. In J. W. Oller & K. Perkins (Eds.), Research in language testing,
-176. Rowley, MA: Newbury House.
Foster, P., Tonkyn, A., & Wigglesworth, G. (2000). Measuring spoken language: A unit
for all reasons. Applied Linguistics, 21, 354-375.
Gitsaki, C. (1999). Second language lexical acquisition: a study of the development of
collocational knowledge. San Francisco-London-Bethesda: International Scholars
Publications.
Hirano, K. (1991). The effect of audience on the efficacy of objective measures of EFL
proficiency in Japanese university students. Annual Review of English Language
Education in Japan, 2, 21-30.
Hunt, K. W. (1965). Grammatical structures written at three grade levels. Champaign, IL:
National Council of Teachers of English.
Hunt, K. W. (1970). Syntactic maturity in school children and adults. Monographs of the
Society for Research in Child Development, 35 (134), pp. 1-67.
Hyltenstam, K. (1988). Lexical characteristics of near-native second-language learners of
Swedish. Journal of Multilingual and Multicultural Development, 9, 67-84.
Ishikawa, S. (1995). Objective measurement of low-proficiency EFL narrative writing.
Journal of Second Language Writing, 4, 51-70.
James, M. (2002). Process writing and vocabulary development: Comparing Lexical
Frequency Profiles across drafts. System, 30, 225-235.
Kameen, P. T. (1979). Syntactic skill and ESL writing quality. In C. Yorio, K. Perkins, & J.
Schachter (Eds.), On TESOL ’79: The learner in focus, pp. 343-364. Washington, D.C.:
TESOL.
Larsen-Freeman, D. (2006). The Emergence of Complexity, Fluency, and Accuracy in the
Oral and Written Production of Five Chinese Learners of English. Applied Linguistics,
27 (4), 590-619.
Laufer, B. (1991). The Development of L2 Lexis in the Expression of the Advanced
Learner. The Modern Language Journal, 75, 440-448.
Laufer, B. (1994). The lexical profile of second language writing: Does it change over time?
RELC Journal, 25, 21-33.
Laufer, B., & Nation, P. (1995). Vocabulary size and use: lexical richness in L2 written
production. Applied Linguistics, 16 (3), 307-322.
Laufer, B. (1998). The development of passive and active vocabulary in a second language:
same or different? Applied Linguistics, 19 (2), 255-271.
Linnarud, M. (1986). Lexis in Composition: A Performance Analysis of Swedish L2
learner’s Written English. Malmö: CWK Gleerup.
Morris, T. N., & Crump, . (1982). Syntactic and Vocabulary Development in the
Written Language of Learning Disabled and Non-Learning Disabled Students at Four
Age Levels. Learning Disability Quarterly, 5(2), 163-172.
Nation, P., & Heatley, A. (1994). VocabProfile: A program for analyzing vocabulary in
texts. Wellington, New Zealand: Victoria University of Wellington.
Nation, P., & Coxhead, A. (2002). RANGE.
Nihalani, N. K. (1981). The quest for the L2 index of development. RELC Journal, 12,
50-56.
Nippold, M. A., Hesketh, L. J., & Duthie, J. K. (2005). Conversational versus expository
discourse: A study of syntactic development in children, adolescents, and adults.
Journal of Speech, Language, and Hearing Research, 48, 1048–1064.
Nippold, M. A., Mansfield, T. C., & Billow, J. L. (2007). Peer conflict explanations in
children, adolescents, and adults: Examining the development of complex syntax.
American Journal of Speech-Language Pathology, 16, 179–188.
Ortega, L. (2003). Syntactic Complexity Measures and their Relationship to L2 Proficiency:
A Research Synthesis of College-level L2 Writing. Applied Linguistics, 24 (4),
492-518.
Ortega, L., & Iberri-Shea, G. (2005). Longitudinal research in second language acquisition:
recent trends and future directions. Annual Review of Applied Linguistics, 25, 26-45.
Perkins, K. (1980). Using objective methods of attained writing proficiency to discriminate
among holistic evaluation. TESOL Quarterly, 14, 61-69.
Sharma, A. (1980). Syntactic maturity: Assessing writing proficiency in a second language.
In R. Silverstein (Ed.), Occasional Papers in Linguistics, 6, pp. 318-325. Carbondale:
Southern Illinois University.
Shaw, P., & Liu, E. (1998). What develops in the development of second-language writing?
Applied Linguistics, 19, 225-254.
Vermeer, A. (2000). Coming to grips with lexical richness in spontaneous speech data.
Language Testing, 17 (1), 65-83.
Wolfe-Quintero, K., Inagaki, S., & Kim, H.-Y. (1998). Second Language Development in
Writing: Measures of Fluency, Accuracy, & Complexity. Manoa: University of Hawai’i.
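Li Changsheng (2007). Do Chinese college students really need lexical complexity? A study
based on the SWECCL learner corpus. Jiangsu Foreign Language Teaching and Research
[江苏外语教学研究], 1, 26-30.
Liu Donghong (2003). The role of vocabulary size in English writing. Modern Foreign
Languages [现代外语], 2, 180-197.
Li Jingquan, & Cai Jinting (2001). Article misuse in Chinese students’ English writing: A
corpus-based study. Journal of PLA University of Foreign Languages
[解放军外国语学院学报], 6, 58-62.
Ni Lan (2000). A study of the writing vocabulary of second-year English majors. Foreign
Language Teaching Abroad [国外外语教学], 2, 38-41.
Qin Xiaoqing, & Wen Qiufang (2007). A study of the developmental patterns and
characteristics of Chinese college students’ English writing ability. Beijing: China
Social Sciences Press [中国社会科学出版社].
Wen Qiufang (2006a). Trends and characteristics of change in the spoken vocabulary of
English majors. Foreign Language Teaching and Research [外语教学与研究], 3, 189-195.
Wen Qiufang (2006b). A study of progress patterns in the spoken vocabulary of English
majors. Computer-Assisted Foreign Language Education [外语电化教学], 4, 3-8.
Wu Xudong, & Chen Xiaoqing (2000). The development of lexical competence among Chinese
learners of English in a classroom setting. Modern Foreign Languages [现代外语], 4,
349-360.
Yu Hua (2004). Syntactic use in Chinese students’ English writing: A corpus-based analysis.
Master’s thesis, Guangdong University of Foreign Studies [广东外语外贸大学].
Zhang Ping (2007). A corpus-based comparative study of lexical complexity among different
L2 learners. Foreign Languages in China [中国外语], 4 (3), 54-59.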