Corpora and Semantic Analysis

Suzanne Kemmer
Rice University

Corpora have been used for various purposes in linguistics and language teaching. In this talk I discuss how linguistic corpora can be used to arrive at semantic analyses of specific linguistic units and of larger networks of units.

I begin with an introduction to the usage-based conception of language (Langacker 1988, 2000) and contrast its essential nature with that of other approaches. An overview of methodologies for uncovering semantic properties of linguistic units follows, including structuralist methods, generativist methods, and what I term "Stage I" Cognitive Linguistic analyses (polysemy and metaphor analyses of the 1980s-90s).

The use of corpus frequency data is then introduced, and its theoretical importance in the usage-based model is explained. Brief examples are given of the use of frequency data in modern "Stage II" Cognitive Linguistic analyses (2000s) of various types: the semantic analysis of lexical items (e.g. deep vs. shallow); the semantics of constructions (e.g. the English make causative); and the interaction of lexical items and constructions (e.g. English motion verbs and prepositions in motion constructions).

The basic relation between frequency in a corpus and semantic acceptability judgments is described within the usage-based theory, and some predictions are offered. The phenomenon of constructional coercion is interpreted in terms of this framework. After a concluding summary of the main points, some advantages of integrating corpus linguistic methodology with cognitive semantics are briefly sketched.