Beyond Corpora: Elicitation as a tool in second language word formation studies

Greg Lessard
French Studies, Queen's University
lessardg@qsilver.queensu.ca

Michael Levison
Computing and Information Service, Queen's University at Kingston
levison@cs.queensu.ca

ACH/ALLC 1999, University of Virginia, Charlottesville, VA
Editor/encoder: Sara A. Schmidt

Background

In this paper, we will be concerned with the study of second language (L2)
linguistic creativity as it manifests itself on the lexical level,
particularly with respect to word formation. More specifically, we want to
measure the extent to which anglophone learners of French of varying degrees
of experience are capable of judging the relative productivity of different
suffixes, and how this performance compares to that of native speakers.

The measures of productivity of word-formation devices which have been used
to date, such as lexicographical data (Dubois 1962) and corpus data (Baayen
and Renouf 1996, Baayen and Lieber 1991), do not lend themselves well to the
study of L2 productivity. Among other things, L2 corpora show relatively
small amounts of productivity (Lessard, Levison, Maher and Tomek 1994,
Broeder et al. 1993). Despite issues of reliability and interpretation (see
Birdsong 1989, Gass 1994, Coppieters 1987), elicitation appears to offer a
potentially useful measure.

However, to our knowledge, little research has been done on elicitation to
test lexical productivity in French, particularly among L2 learners. Hawkins
(1985) forms one exception, but he was concerned primarily with the class of
past participle markers, which are at the borderline of affixation and
derivation.

The research described here draws upon and extends previous work on native
speaker judgements of lexical productivity, including Aronoff and
Schvaneveldt 1978, Gorska 1982, Anshen and Aronoff 1981, Lessard and
Levison 1995, Fowler and Liberman 1995 and Keane and Costello 1997. It
should be noted, however, that very little previous work used a computational
environment, whereas such an environment is central to the work presented
here, which is based on the VINCI natural language generation environment.

Experiments and results

The results discussed in the paper represent the third of three stages. In the first, judgements of acceptability were elicited from native speakers
of French for derived forms in -able, -age, -ment, -tion and -ure. In a
nutshell, a French lexicon sorted by frequency was used to provide verbs of
the first conjugation. Suffixes were added automatically, and at random. A
randomized subset of the resulting derived forms was presented to each
subject, along with the base form of each verb. Subjects were required to
identify which verbs were known to them (almost all, in the case of the
native speakers) and which derived forms they found acceptable. Results
defined a continuum of relative acceptability with -able at the top of the list, followed by -age, -ment, -tion and -ure in that order. It is
important to note that the variable being measured was neither the
correctness of the judgements (whether an existing derived form corresponded
to those seen as acceptable) nor the individual lexical items, but rather
the ranking of suffixes in terms of the number of derived forms they were
seen to be capable of producing.

In the second stage, the same protocol was applied to non-native speakers
ranging from some with high school training in French to some with
significant university level studies in French. Results of these tests
showed that as knowledge of verb bases decreased, non-native speakers showed
increasing discrepancy in their judgements with respect to native speaker
rankings.

The third stage addressed problems found in the second: the absence of an
external measure on which to rank the linguistic skills of the subjects
tested, and the relatively high level of linguistic competence of almost all
the subjects tested. In response to these problems, the experiment was
repeated using the same protocols. Subjects tested were students in oral
French classes at Queen's University. Five levels were represented, ranging
from 016 to 320, where 016 represented those with essentially no previous
knowledge of French, while 320 represented those with near-native
proficiency. Placement in these classes had been done on the basis of an
oral interview.

As an illustration, the results for three of the five suffixes are reproduced
in the following table. In the table, the columns suffix ok and suffix not ok
represent the average of all responses for each class on the basis of 20
questions. In principle, each row should sum to 20; however, because some
students responded to fewer than 20 questions, some small variations are
found. Verb known and verb not known represent subjects' claimed knowledge
of the base verb.

Suffix  Course  Verb known                    Verb not known
                Suffix ok      Suffix not ok  Suffix ok      Suffix not ok
-able   016     2              2.25           2.75           12.75
-able   017     3.75           1.5            8.75           6
-able   118     6.5            2.5            7              4
-able   219     9.4            3.7            2.4            3.8
-able   320     9.1            4.1            2.1            3.7
-ment   016     2              3              1.5            13
-ment   017     4.75           3.25           8.5            3.5
-ment   118     5              4.5            9.5            1
-ment   219     8.7            4.3            2.7            3.6
-ment   320     4.6            8.1            2.1            4.6
-ure    016     1.5            4.25           2.5            11.75
-ure    017     1.75           4.75           3.5            9.75
-ure    118     2.5            9              2              6
-ure    219     1.3            10.9           1.3            5.7
-ure    320     2.9            10             1              5.4

The table shows that in all cases, the number of base verbs known rises with
level, suggesting that knowledge of verbal base and placement interview
results are measuring comparable things. As well, bearing in mind that the
acceptance rates by native speakers for derived forms based on -able, -ment and -ure were 77%, 39% and 10% respectively, we see in
the non-native speaker data a gradual convergence on native speaker-like
judgements. Thus, in the case of -able, while
016 students find little to choose between accepting or rejecting derived
forms for verbs they know (2 acceptances and 2.25 rejections) and strongly
reject derived forms for verbs they don't know (2.75 acceptances versus
12.75 rejections), more advanced students tend to accept derived forms for
verbs they know (in the case of 219, 9.4 acceptances versus 3.7 rejections)
while only mildly rejecting derived forms in -able
for verbs they don't know.

Conclusions and future directions

This data hides considerable variation which will be elaborated on during the
presentation. However, it confirms that the measure is robust, even with
students with relatively lower skill levels in French. In the paper, more
detailed analyses will be presented. As well, an extended range of measuring
instruments will be discussed, based on contextualizing examples to be
evaluated in a sentence generated on the fly, other types of measures (see
Feldman 1995 for examples) and scales (see Bard, Robertson and Sorace
1996).

References

Anshen, F. and M. Aronoff. 1981. Morphological Productivity and Phonological Transparency. Canadian Journal of Linguistics 26(1): 63-72.

Aronoff, M. and R. Schvaneveldt. 1978. Testing morphological productivity. Annals of the New York Academy of Sciences 318: 106-114.

Baayen, H. and R. Lieber. 1991. Productivity and English derivation: a corpus-based study. Linguistics 29(4): 801-844.

Baayen, H. and A. Renouf. 1996. Chronicling the Times: Productive Lexical Innovations in an English Newspaper. Language 72(1): 69-96.

Bard, E., D. Robertson and A. Sorace. 1996. Magnitude Estimation of Linguistic Acceptability. Language 72(1): 32-67.

Birdsong, D. 1989. Metalinguistic Performance and Interlinguistic Competence. Berlin: Springer-Verlag.

Broeder, P., G. Extra, R. van Hout and K. Voionmaa. 1993. Word formation processes in talking about entities. In C. Perdue (ed.), Adult language acquisition: cross-linguistic perspectives, vol. 2. Cambridge: Cambridge University Press, 41-72.

Coppieters, R. 1987. Competence Differences between Native and Near-Native Speakers. Language 63(3): 544-573.

Dubois, J. 1962. Étude sur la dérivation suffixale en français moderne et contemporain; essai d'interprétation des mouvements observés dans le domaine de la morphologie des mots construits. Paris: Larousse.

Feldman, L.B. (ed.). 1995. Morphological aspects of language processing. Hillsdale, NJ: Lawrence Erlbaum.

Fowler, A. and I. Liberman. 1995. The Role of Phonology and Orthography in Morphological Awareness. In L.B. Feldman (ed.), Morphological aspects of language processing. Hillsdale, NJ: Lawrence Erlbaum, 157-188.

Gass, S. 1994. The Reliability of Second-Language Grammaticality Judgements. In E. Tarone, S. Gass and A. Cohen (eds.), Research Methodology in Second-Language Acquisition. Hillsdale, NJ: Lawrence Erlbaum, 303-322.

Gorska, E. 1982. A way of testing the productivity of word formation rules (WFRs)? Studia Anglica Posnaniensia 14(1): 169-174.

Hawkins, R. 1985. Errors in the Use of French Past Participles By Foreign Speakers and Their Implications for a Model of Morphology. Lingua 67: 171-188.

Keane, M. and F. Costello. 1997. Where Do "Soccer Moms" Come From? Cognitive Constraints on Noun-Noun Compounding in English. Proceedings, Computational Models of Creative Cognition Conference, Dublin.

Lessard, G. and M. Levison. In press. Lexical Creativity in L2 French. International Review of Applied Linguistics.

Lessard, G., M. Levison, D. Maher and I. Tomek. 1994. Modelling Second Language Learner Creativity. Journal of Artificial Intelligence in Education 5(4): 455-480.

Lessard, G. and M. Levison. 1995. Experiments in Word Creation. ACH/ALLC 95, Santa Barbara Conference Abstracts, 74-77.
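
A note on reading the results table. The following minimal Python sketch is not part of the original study; it simply recomputes, from the class means in the table, the two quantities the discussion relies on: the mean number of base verbs claimed as known, and the acceptance rate for derived forms of known verbs. The names `TABLE`, `NATIVE_RATE`, `verbs_known`, `acceptance_rate` and `report` are hypothetical, and the definition of acceptance rate used here (accepted forms divided by all judged forms for verbs claimed as known) is an assumption, since the text does not state how the native speaker percentages were calculated.

```python
# Mean responses per class (out of 20 items), transcribed from the results table.
# Tuple order: (verb known & suffix ok, verb known & suffix not ok,
#               verb not known & suffix ok, verb not known & suffix not ok)
TABLE = {
    "-able": {
        "016": (2, 2.25, 2.75, 12.75),
        "017": (3.75, 1.5, 8.75, 6),
        "118": (6.5, 2.5, 7, 4),
        "219": (9.4, 3.7, 2.4, 3.8),
        "320": (9.1, 4.1, 2.1, 3.7),
    },
    "-ment": {
        "016": (2, 3, 1.5, 13),
        "017": (4.75, 3.25, 8.5, 3.5),
        "118": (5, 4.5, 9.5, 1),
        "219": (8.7, 4.3, 2.7, 3.6),
        "320": (4.6, 8.1, 2.1, 4.6),
    },
    "-ure": {
        "016": (1.5, 4.25, 2.5, 11.75),
        "017": (1.75, 4.75, 3.5, 9.75),
        "118": (2.5, 9, 2, 6),
        "219": (1.3, 10.9, 1.3, 5.7),
        "320": (2.9, 10, 1, 5.4),
    },
}

# Native speaker acceptance rates quoted in the text.
NATIVE_RATE = {"-able": 0.77, "-ment": 0.39, "-ure": 0.10}


def verbs_known(row):
    """Mean number of base verbs a class claimed to know."""
    return row[0] + row[1]


def acceptance_rate(row):
    """Assumed definition: share of derived forms accepted, known verbs only."""
    return row[0] / (row[0] + row[1])


def report():
    """One summary line per suffix and course level."""
    lines = []
    for suffix, rows in TABLE.items():
        for course, row in rows.items():
            lines.append(
                f"{suffix:6} {course}  known={verbs_known(row):5.2f}  "
                f"accept={acceptance_rate(row):.0%}  "
                f"native={NATIVE_RATE[suffix]:.0%}"
            )
    return lines


if __name__ == "__main__":
    print("\n".join(report()))
```

Under this assumed definition, the 219 class accepts about 72% of derived forms in -able for verbs it claims to know, against the 77% reported for native speakers, whereas the 016 class accepts about 47%.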