Multidimensional scaling for typological analysis and its implications for typology, cognition and the brain

William Croft

University of Manchester


A fundamental fact about grammatical structure is that it is highly variable both across languages and within languages. Typological analysis has drawn language universals from grammatical variation through implicational universals, implicational hierarchies, and more recently, semantic maps. With larger-scale crosslinguistic studies and high levels of grammatical variation, these methods are inadequate, and the most sophisticated of them, the semantic map model, while theoretically well-motivated in typology, is not mathematically well-defined. Multidimensional scaling (MDS) offers a powerful, formalized tool that allows linguists to infer language universals from highly complex and large-scale datasets. The Optimal Classification nonparametric unfolding algorithm is applied to crosslinguistic data (Keith Poole, to appear; this work is a collaboration). MDS is compared to previous work, including Haspelmath's semantic map analysis of indefinite pronouns, Levinson et al.'s MDS analysis of spatial adpositions, and Dahl's analysis of tense and aspect.

The success of MDS analysis in uncovering language universals has major consequences for understanding the relationship between language universals, cognition and the brain. MDS works best with large-scale datasets, implying the centrality of grammatical variation in inferring language universals and the importance of examining as wide a range of grammatical behavior as possible both within and across languages. Our results demonstrate that neither an extreme absolutist position - all languages are cut from the same cloth - nor an extreme relativist position - languages are organized in radically different ways - is tenable. Nevertheless, the relationship between language and cognition implied by our MDS analyses is very indirect. The clusters revealed by MDS are universal cognitive categories, not universal linguistic categories. Linguistic categories are constrained by the structure of cognition but vary widely within that structure.

The MDS analysis also demonstrates that complex linguistic behavior can be analyzed successfully by low-dimensional models. This is also true of psychological, political and economic behavior. Human beings are able to reduce the immense complexity of the world, including their languages, to a small, manageable number of conceptual dimensions and configurations. The convergence of these different types of behavior on low-dimensional models implies that this is probably a fundamental cognitive ability based in brain structure.

Finally, the MDS analyses imply a plausible model of language learning. A child develops a low-dimensional model of (dis)similarities between situations, presumably through a combination of innate abilities and interaction with her environment. As the child comprehends linguistic expressions used to describe these situations, she begins to approximate the cutting lines for the words and constructions of her language. As the child is exposed to more and more linguistic expressions and the situations they describe, the cutting line defining the word or construction category is more precisely placed in the conceptual space. The structure of the space and the positioning of the cutting line allow the child to use the word or construction productively for new situations that are similar in the right ways to the known points on the right side of the word or construction's cutting line.
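The learning scenario just described can be illustrated with a minimal sketch. It is only an illustration under stated assumptions, not the model or algorithm used in the analyses: the conceptual space is taken to be a fixed two-dimensional plane, the situations and their coordinates are invented, and the cutting line is adjusted with a simple perceptron-style update, which is a stand-in chosen for brevity. The names `learn_cutting_line` and `word_applies` are hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, float]          # a situation located in a 2-D conceptual space
Exposure = Tuple[Point, bool]        # (situation, was the word used for it?)
Line = Tuple[List[float], float]     # cutting line w[0]*x + w[1]*y + b = 0


def learn_cutting_line(exposures: List[Exposure], epochs: int = 50) -> Line:
    """Adjust a cutting line one exposure at a time (perceptron-style update).

    Assumption: the conceptual space itself is already fixed; only the
    position of the word's cutting line is learned.
    """
    w = [0.0, 0.0]
    b = 0.0
    for _ in range(epochs):
        mistakes = 0
        for (x, y), used in exposures:
            target = 1 if used else -1
            score = w[0] * x + w[1] * y + b
            if target * score <= 0:
                # The situation falls on the wrong side: shift the line.
                w[0] += target * x
                w[1] += target * y
                b += target
                mistakes += 1
        if mistakes == 0:
            break  # every known situation is on the correct side
    return w, b


def word_applies(line: Line, situation: Point) -> bool:
    """Productive use: is a new situation on the word's side of the line?"""
    w, b = line
    return w[0] * situation[0] + w[1] * situation[1] + b > 0


# Invented exposures: True means the word was heard describing the situation.
exposures: List[Exposure] = [
    ((2.0, 2.0), True), ((0.0, 0.0), False),
    ((3.0, 1.0), True), ((1.0, 0.0), False),
    ((2.5, 3.0), True), ((0.0, 1.0), False),
    ((3.0, 3.0), True), ((0.5, 0.5), False),
]

line = learn_cutting_line(exposures)
print(word_applies(line, (2.8, 2.5)))  # new situation near the known uses
print(word_applies(line, (0.2, 0.3)))  # new situation far from the known uses
```

Each additional exposure can only move the line when it contradicts the current placement, which mirrors the idea that more input places the cutting line more precisely while the underlying space stays constant.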