Word Grammar: New Perspectives on a Theory of Language Structure
edited by Kensei Sugayama and Richard Hudson
Continuum
The Tower Building, 11 York Road, London SE1 7NX
15 East 26th Street, New York, NY 10010
© Kensei Sugayama and Richard Hudson 2005

All rights reserved. No part of this publication may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopying, recording, or any information storage or retrieval system, without prior permission in writing from the publishers.

First published 2006

British Library Cataloguing-in-Publication Data
A catalogue record for this book is available from the British Library.

ISBN: 0-8264-8645-2 (hardback)

Library of Congress Cataloguing-in-Publication Data
To come

Typeset by BookEns Ltd, Royston, Herts.
Printed and bound in Great Britain by MPG Books Ltd, Bodmin, Cornwall
The problem of the word has worried general linguists for the best part of a century. - P. H. Matthews
Contents

Contributors xi
Preface Kensei Sugayama xiii

Introduction 1

1. What is Word Grammar? Richard Hudson 3
   1. A Brief Overview of the Theory 3
   2. Historical Background 5
   3. The Cognitive Network 7
   4. Default Inheritance 12
   5. The Language Network 13
   6. The Utterance Network 15
   7. Morphology 18
   8. Syntax 21
   9. Semantics 24
   10. Processing 27
   11. Conclusions 28

Part I Word Grammar Approaches to Linguistic Analysis: Its explanatory power and applications 33

2. Case Agreement in Ancient Greek: Implications for a theory of covert elements Chet Creider and Richard Hudson 35
   1. Introduction 35
   2. The Data 35
   3. The Analysis of Case Agreement 41
   4. Non-Existent Entities in Cognition and in Language 42
   5. Extensions to Other Parts of Grammar 46
   6. Comparison with PRO and pro 49
   7. Comparison with Other PRO-free Analyses 50
   8. Conclusions 52

3. Understood Objects in English and Japanese with Reference to Eat and Taberu: A Word Grammar account Kensei Sugayama 54
   1. Introduction 54
   2. Word Grammar 56
   3. Eat in English 58
   4. Taberu in Japanese 60
   5. Conclusion 63

4. The Grammar of Be To: From a Word Grammar point of view Kensei Sugayama 67
   1. Introduction and the Problem 67
   2. Category of Be 68
   3. Modal Be in Word Grammar 69
   4. Morphological Aspects 70
   5. Syntactic Aspects 71
   6. Semantics of the Be To Construction 72
   7. Should To be Counted as Part of the Lexical Item? 75
   8. A Word Grammar Analysis of the Be To Construction 77
   9. Conclusion 81

5. Linking in Word Grammar Jasper Holmes 83
   1. Linking in Word Grammar: The syntax semantics principle 83
   2. The Event Type Hierarchy: The framework; event types; roles and relations 103
   3. Conclusion 114

6. Word Grammar and Syntactic Code-Mixing Research Eva Eppler 117
   1. Introduction 117
   2. Constituent Structure Grammar Approaches to Intra-Sentential Code-Mixing 118
   3. A Word Grammar Approach to Code-Mixing 121
   4. Word Order in Mixed and Monolingual 'Subordinate' Clauses 128
   5. Summary and Conclusion 139

7. Word Grammar Surface Structures and HPSG Order Domains Takafumi Maekawa 145
   1. Introduction 145
   2. A Word Grammar Approach 146
   3. An Approach in Constructional HPSG: Ginzburg and Sag 2000 154
   4. A Linearization HPSG Approach 160
   5. Concluding Remarks 165

Part II Towards a Better Word Grammar 169

8. Structural and Distributional Heads Andrew Rosta 171
   1. Introduction 171
   2. Structural Heads 172
   3. Distributional Heads 172
   4. That-Clauses 174
   5. Extent Operators 174
   6. Surrogates versus Proxies 177
   7. Focusing Subjuncts: just, only, even 179
   8. Pied-piping 181
   9. Degree Words 181
   10. Attributive Adjectives 182
   11. Determiner Phrases 182
   12. The type of Construction 184
   13. Inside-out Interrogatives 185
   14. 'Empty Categories' 187
   15. Coordination 189
   16. Correlatives 191
   17. Dependency Types 191
   18. Conclusion 199

9. Factoring Out the Subject Dependency Nikolas Gisborne 204
   1. Introduction 204
   2. Dimensions of Subjecthood 205
   3. The Locative Inversion Data 210
   4. Factored Out Subjects 216
   5. Conclusions 222

Conclusion Kensei Sugayama 225

Author Index 227
Subject Index 229
Contributors

RICHARD HUDSON is Professor Emeritus of Linguistics at University College London. His research interest is the theory of language structure; his main publications in this area are about the theory of Word Grammar, including Word Grammar (1984, Oxford: Blackwell), English Word Grammar (1990, Oxford: Blackwell) and a large number of more recent articles. He has also taught sociolinguistics and has a practical interest in educational linguistics. Website: www.phon.ucl.ac.uk/home/dick/home.htm Email: dick@linguistics.ucl.ac.uk

KENSEI SUGAYAMA, Professor of English Linguistics at Kobe City University of Foreign Studies. Research interests: English syntax, Word Grammar, lexical semantics and general linguistics. Major publications: 'More on unaccusative Sino-Japanese complex predicates in Japanese' (1991), UCL Working Papers in Linguistics 3; 'A Word-Grammatic account of complements and adjuncts in Japanese' (1994), Proceedings of the 15th International Congress of Linguists; 'Speculations on unsolved problems in Word Grammar' (1999), The Kobe City University Journal 50.7; Scope of Modern Linguistics (2000, Tokyo: Eihosha); Studies in Word Grammar (2003, Kobe: Research Institute of Foreign Studies, KCUFS). Email: ken@inst.kobe-cufs.ac.jp

CHET CREIDER, Professor and Chair, Department of Anthropology, University of Western Ontario, London, Ontario, Canada. Research interests: morphology, syntax, African languages. Major publications: Structural and Pragmatic Factors Influencing the Acceptability of Sentences with Extended Dependencies in Norwegian (1987, University of Trondheim Working Papers in Linguistics 4); The Syntax of the Nilotic Languages: Themes and variations (1989, Berlin: Dietrich Reimer); A Grammar of Nandi (1989, with J. T. Creider, Hamburg: Helmut Buske); A Grammar of Kenya Luo (1993, ed.); A Dictionary of the Nandi Language (2001, with J. T. Creider, Köln: Rüdiger Köppe). Email: creider@uwo.ca

ANDREW ROSTA, Senior Lecturer, Department of Cultural Studies, University of Central Lancashire, UK. Research interests: all aspects of English grammar. Email: a.rosta@v21.me.uk

NIKOLAS GISBORNE is a lecturer in the Department of Linguistics and English Language at the University of Edinburgh. His research interests are in lexical semantics and syntax, and their interaction in argument structure.
Website: www.englang.ed.ac.uk/people/nik.html Email: n.gisborne@ed.ac.uk

JASPER W. HOLMES is a self-employed linguist who has worked with many large organizations on projects in lexicography, education and IT. Teaching and research interests include syntax and semantics, lexical structure, corpora and other IT applications (linguistics in computing, computing in linguistics), language in education and in society, the history of English and English as a world language. His publications include 'Synonyms and syntax' (1996, with Richard Hudson, And Rosta and Nik Gisborne), Journal of Linguistics 32; 'The syntax and semantics of causative verbs' (1999), UCL Working Papers in Linguistics 11; 'Re-cycling in the encyclopedia' (2000, with Richard Hudson), in B. Peeters (ed.), The Lexicon-Encyclopedia Interface (Amsterdam: Elsevier); 'Constructions in Word Grammar' (2005, with Richard Hudson), in Jan-Ola Östman and Mirjam Fried (eds), Construction Grammars: Cognitive Grounding and Theoretical Extensions (Amsterdam: Benjamins). Email: jasper.holmes@gmail.com

EVA EPPLER, Senior Lecturer in English Language and Linguistics, School of Arts, University of Roehampton, UK. Research interests: morpho-syntax of German and English, syntax-pragmatics interface, code-mixing, bilingual processing and production, sociolinguistics of multilingual communities. Recent main publication: '"... because dem Computer brauchst' es ja nicht zeigen.": because + German main clause word order', International Journal of Bilingualism 8.2 (2004), pp. 127-44. Email: evieppler@hotmail.com

TAKAFUMI MAEKAWA, PhD student, Department of Language and Linguistics, University of Essex. Research interests: Japanese and English syntax, Head-Driven Phrase Structure Grammar and lexical semantics. Major publication: 'Constituency, Word Order and Focus Projection' (2004), The Proceedings of the 11th International Conference on Head-Driven Phrase Structure Grammar, Center for Computational Linguistics, Katholieke Universiteit Leuven, August 3-6. Email: maekawa@btinternet.com
Preface

This volume comes from a three-year (April 2002-March 2005) research project on Word Grammar supported by the Japan Society for the Promotion of Science, whose goal was to bring together Word Grammar linguists whose research has been carried out in this framework but whose approaches to it reflect differing perspectives on Word Grammar (henceforth WG). I gratefully acknowledge support for my work in WG from the Japan Society for the Promotion of Science (grant-in-aid Kiban-Kenkyu C (2), no. 14510533, April 2002-March 2005). The collection of papers was planned both to introduce readers to the theory and to cover a diversity of languages to which the theory is shown to be applicable, along with critique from different theoretical orientations. In September 1994 Professor Richard Hudson, the founder of Word Grammar, visited Kobe City University of Foreign Studies to give a lecture on WG as part of his lecture tour of Japan. His talks centred on advances in WG at that time, and refreshed our understanding of the theory. Professor Hudson has been writing in a very engaging and informative way for the best part of half a century on the world linguistics scene. Word Grammar is a theory of language structure which Richard Hudson, now Emeritus Professor of Linguistics at University College London, has been building since the early 1980s. It is still changing and improving in detail, yet the main ideas remain the same. These ideas themselves developed out of two other theories that he had tried: Systemic Grammar (now known as Systemic Functional Grammar), due to Michael Halliday, and then Daughter-Dependency Grammar, his own invention. Word Grammar fills a gap in the study of dependency theory. Dependency theory may not belong to the mainstream in the Western world, especially not in America, but it is gaining more and more attention, which it certainly deserves.
In Europe, dependency has been better known since the French linguist Lucien Tesnière's work in the 1950s (cf. Hudson, this volume); France, Belgium, Germany and Finland deserve particular mention. Dependency theory now also flourishes in Japan in the shape of WG. Moreover, the notion of head, the central idea of dependency, has been introduced into virtually all modern linguistic theories. In most grammars, dependency and constituency are used simultaneously. However, this runs the risk of making these grammars too powerful. WG's challenge is to eliminate constituency from grammar except in coordinate structures, although certain dependency grammars, especially the German ones, refuse to accept constituency even for coordination. Richard Hudson's first book was the first attempt to write a generative (explicit) version of Systemic Grammar (English Complex Sentences: An
Introduction to Systemic Grammar, North Holland, 1971); and his second book was about Daughter-Dependency Grammar (Arguments for a Non-transformational Grammar, University of Chicago Press, 1976). As the latter title indicates, Chomsky's transformational grammar was very much 'in the air', and both books accepted his goal of generative grammar but offered other ideas about sentence structure as alternatives to his mixture of function-free phrase structure plus transformations. In the late 1970s, when Transformational Grammar was immensely influential, Richard Hudson abandoned Daughter-Dependency Grammar (in spite of its drawing a rave review by Paul Schachter in Language 54, 348-76). His exploration of various general ideas that hadn't yet come together crystallized into a coherent alternative theory called Word Grammar, first described in the 1984 book Word Grammar and subsequently improved and revised in the 1990 book English Word Grammar. Since then the details have been worked out much more fully, and there is now a workable notation and an encyclopedia available on the internet (cf. Hudson 2004). The newest version of Word Grammar is now on its way (Hudson, in preparation). The time span between the publication of Richard Hudson's Word Grammar (1984) and this volume is more than two decades (21 years, to be precise). The intervening years have seen impressive developments in this theory by WG grammarians, as well as in other competing linguistic theories such as the Minimalist Programme, Head-driven Phrase Structure Grammar (HPSG), Generalized Phrase Structure Grammar (GPSG), Lexical Functional Grammar (LFG), Construction Grammar and Cognitive Grammar. Here are the main ideas, most of which come from the latest version of the WG homepage (Hudson 2004), together with an indication of where they came from: • It is monostratal - only one structure per sentence, no transformations. (From Systemic Grammar.) • It uses word-word dependencies - e.g. a noun is the subject of a verb.
(From John Anderson and other users of Dependency Grammar, via Daughter-Dependency Grammar; a reaction against Systemic Grammar, where word-word dependencies are mediated by the features of the mother phrase.) • It does not use phrase structure - e.g. it does not recognize a noun phrase as the subject of a clause, though these phrases are implicit in the dependency structure. (This is the main difference between Daughter-Dependency Grammar and Word Grammar.) • It shows grammatical relations/functions by explicit labels - e.g. 'subject' and 'object'. (From Systemic Grammar.) • It uses features only for inflectional contrasts - e.g. tense and number, but not transitivity. (A reaction against excessive use of features in both Systemic Grammar and current Transformational Grammar.) • It uses default inheritance as a very general way of capturing the contrast between 'basic' or 'underlying' patterns and 'exceptions' or 'transformations' - e.g. by default, English words follow the word they depend on, but exceptionally subjects precede it; particular cases 'inherit' the default pattern
unless it is explicitly overridden by a contradictory rule. (From Artificial Intelligence.) • It views concepts as prototypes rather than 'classical' categories that can be defined by necessary and sufficient conditions. All characteristics (i.e. all links in the network) have equal status, though some may for pragmatic reasons be harder to override than others. (From Lakoff and early Cognitive Linguistics, supported by work in sociolinguistics.) • It presents language as a network of knowledge, linking concepts about words, their meanings, etc. - e.g. twig is linked to the meaning 'twig', to the form /twig/, to the word-class 'noun', etc. (From Lamb's Stratificational Grammar, now known as Neurocognitive Linguistics.) • In this network there are no clear boundaries between different areas of knowledge - e.g. between 'lexicon' and 'grammar', or between 'linguistic meaning' and 'encyclopedic knowledge'. (From early Cognitive Linguistics - and the facts.) • In particular, there is no clear boundary between 'internal' and 'external' facts about words, so a grammar should be able to incorporate sociolinguistic facts - e.g. that the speaker of jazzed is an American. (From Sociolinguistics.) In this theory, word-word dependency is a key concept, upon which the syntax and semantics of a sentence build. Dependents of a word are subcategorized into two types, i.e. complements and adjuncts. These two types of dependents play an important role in this theory of grammar. Let me give you a flavour of the syntax and semantics in WG, as shown in Figure 1:
Figure 1
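Default inheritance, listed above among WG's main ideas, can be illustrated with a small sketch. The following Python fragment is purely illustrative and not part of the WG formalism; the class name, the attribute names and the word-order facts encoded in it are simplified assumptions of mine.

```python
# Illustrative sketch of default inheritance: a subtype inherits its
# supertype's properties unless it stores an explicit override.

class Node:
    """A concept with locally stored facts; other facts are inherited."""
    def __init__(self, isa=None, **facts):
        self.isa = isa      # the supertype (the 'isa' link)
        self.facts = facts  # locally stored (possibly overriding) facts

    def get(self, attr):
        node = self
        while node is not None:
            if attr in node.facts:
                return node.facts[attr]  # a local fact beats the default
            node = node.isa              # otherwise climb the isa chain
        raise KeyError(attr)

# By default, an English dependent follows the word it depends on...
dependent = Node(position='after head')
# ...but subjects exceptionally precede it (an explicit override),
subject = Node(isa=dependent, position='before head')
# while objects simply inherit the default.
obj = Node(isa=dependent)

print(subject.get('position'))  # before head
print(obj.get('position'))      # after head
```

The point of the sketch is that the 'exception' (subjects) is stated just once, locally, and every fact not overridden is recovered by following the isa chain.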
Contributors to this volume are primarily WG grammarians from across the world who participated in the research I organized, and I am also grateful to be able to include critical work by Maekawa of the University of Essex, who works in a different paradigm. All the papers here manifest what I would characterize as the theoretical potential of WG, exploring how powerful WG is in offering analyses of linguistic phenomena in various languages. The papers we have collected come from varying perspectives (formal, lexical-semantic, morphological, syntactic, semantic) and include work on a number of languages, including English, Ancient Greek, Japanese and German. The phenomena studied include verbal inflection, case agreement, extraction, constructions, code-mixing, etc. The papers in this volume span a variety of topics, but a common thread runs through them: the claim that word-word dependency is fundamental to our analysis and understanding of language. The collection starts with a chapter on WG by Richard Hudson which serves to introduce the newest version of WG. The subsequent chapters are organized into two sections: Part I: Word Grammar Approaches to Linguistic Analysis: its explanatory power and applications Part II: Towards a Better Word Grammar Part I contains six chapters, which contribute to recent developments in WG and explore how powerful WG is in analysing linguistic phenomena in various languages. They deal with formal, lexical, morphological, syntactic and semantic matters. In this way, these papers give a varied picture of the possibilities of WG. In Chapter 2, Creider and Hudson provide a theory of covert elements, a hot issue in linguistics. Since WG has hitherto denied the existence of any covert elements in syntax, it has to deal with claims such as the one that covert case-bearing subjects are possible in Ancient Greek.
As the authors themselves say, their solution is tantamount to an acceptance of some covert elements in syntax, though in every case the covert element can be predicted from the word on which it depends. The analysis given is interesting because the argument is linked to dependency. It is more sophisticated than the simple and undefined Chomskyan notion of the PRO element. In Chapter 3, Sugayama joins Creider and Hudson in detailing an analysis of understood objects in English and Japanese, albeit at the level of semantics rather than syntax. He studies an interesting contrast between English and Japanese concerning understood objects. Unlike English and most other European languages, Japanese is unusual in allowing its verbs to omit their complements on condition that the speaker assumes they are known to the addressee. The reason seems to be that in the semantic structure of the sentences there has to be a semantic argument which should be, but is not, mapped onto syntax as a syntactic complement. The author offers a WG solution that improves on Hudson's (1990) account. Sugayama shares with the preceding chapter an in-depth lexical-semantic
analysis in order to address the relation between a word and the construction. In Chapter 4, he attempts to characterize the be to construction within the WG framework. He shows that a morphological, syntactic and semantic analysis of be in the be to construction provides evidence for the category of be in this construction. Namely, be is an instance of a modal verb in terms of morphology and syntax, while the sense of the whole construction is determined by the sense of to. In Chapter 5, Holmes, in a very original approach, develops an account of the linking of syntactic and semantic arguments in the WG framework. Under the WG account, both thematic and linking properties are determined at both the specific and the general level, which is an obvious advantage. In Chapter 6, Eppler draws on empirical studies of code-mixing and successfully extends WG to an original and interesting area of research. Constituent-based models have difficulty accounting for mixing between SVO and SOV languages like English and German; a dependency (WG) approach is imperative here. A word's requirements do not project to larger units like phrasal constituents. The Null Hypothesis, then, formulated in WG terms, assumes that each word in a switched dependency satisfies the constraints imposed on it by its own language. The material is taken from English/German conversations of Jewish refugees in London. Maekawa continues the sequence in this collection towards more purely theoretical studies. In Chapter 7, he looks at three different approaches to the asymmetries between main and embedded clauses with respect to elements in the left periphery of the clause: the dependency-based approach within WG, the Constructional HPSG approach, and the Linearization HPSG analysis.
Maekawa, an HPSG linguist, argues that the WG and Constructional HPSG approaches have problems in dealing with the relevant facts, but that Linearization HPSG provides a straightforward account of them. Maekawa's analysis suggests that linear order should be independent, to a considerable extent, of combinatorial structure such as dependency or phrase structure. Following these chapters are more theoretical chapters which help to improve the theory and clarify what research questions must be undertaken next. Part II contains two chapters that examine two key theoretical concepts in WG, head and dependency. They are intended to help us progress a few steps in revising and improving the current WG, together with Hudson (in preparation). The notion of head is a central one in most grammars, so it is natural that it is discussed and challenged by WG and other theorists. In Chapter 8, Rosta distinguishes between two kinds of head and claims that every phrase has both a distributional head and a structural head, although he agrees that normally the same word is both the distributional and the structural head of a phrase. Finally, Gisborne's Chapter 9 challenges Hudson's classification of dependencies. The diversification of heads (different kinds of dependency) plays a role in WG as well. Gisborne is in favour of a more fine-grained account of
dependencies than Hudson's 1990 model. He focuses on a review of the subject-of dependency, distinguishing between two kinds of subject, which seems promising. Gisborne's thesis is that word order is governed not only by syntactic information but also by discourse-presentational facts. I hope this short overview suggests to the prospective reader that our attempt at introducing a dependency-based grammar has been successful. By means of this volume, we hope to contribute to the continuing cooperation between linguists working in WG and those working in other theoretical frameworks. We look forward to future volumes that will further develop this cooperation. The editors gratefully acknowledge the work and assistance of all those contributors whose papers are incorporated in this volume, including one non-WG linguist who contributed a paper from his own theoretical viewpoint and helped shape the volume you see here. Last but not least, neither the research in WG nor the present volume would have been possible without the general support of both the Japan Society for the Promotion of Science and the Daiwa Anglo-Japanese Foundation, whose assistance we gratefully acknowledge here. In addition, we owe a special debt of gratitude to Jenny Lovel for assisting with the preparation of this volume in her usual professional manner. We alone accept responsibility for all errors in the presentation of data and analyses in this volume.

Kensei Sugayama

References

Hudson, R. A. (1971), English Complex Sentences: An Introduction to Systemic Grammar. Amsterdam: North Holland.
— (1976), Arguments for a Non-transformational Grammar. Chicago: University of Chicago Press.
— (1984), Word Grammar. Oxford: Blackwell.
— (1990), English Word Grammar. Oxford: Blackwell.
— (2004, July 1 - last update), 'Word Grammar', Available: www.phon.ucl.ac.uk/home/dick/wg.htm (Accessed: 18 April 2005).
— (in preparation), Advances in Word Grammar. Oxford: Oxford University Press.
Pollard, C. and Sag, I. A. (1987), Information-Based Syntax and Semantics. Stanford: CSLI.
Schachter, P. (1978), 'Review of Arguments for a Non-transformational Grammar'. Language, 54, 348-76.
Sugayama, K. (ed.) (2003), Studies in Word Grammar. Kobe: Research Institute of Foreign Studies, KCUFS.
Tesnière, L. (1959), Éléments de Syntaxe Structurale. Paris: Klincksieck.
Introduction
1 What is Word Grammar? RICHARD HUDSON
Abstract
The chapter summarizes the Word Grammar (WG) theory of language structure under the following headings: 1. A brief overview of the theory; 2. Historical background; 3. The cognitive network: 3.1 Language as part of a general network; 3.2 Labelled links; 3.3 Modularity; 4. Default inheritance; 5. The language network; 6. The utterance network; 7. Morphology; 8. Syntax; 9. Semantics; 10. Processing; and 11. Conclusions.

1. A Brief Overview of the Theory
Word Grammar (WG) is a general theory of language structure. Most of the work to date has dealt with syntax, but there has also been serious work in semantics and some more tentative explorations of morphology, sociolinguistics, historical linguistics and language processing. The only areas of linguistics that have not been addressed at all are phonology and language acquisition (but even here see van Langendonck 1987). The aim of this article is breadth rather than depth, in the hope of showing how far-reaching the theory's tenets are. Although the roots of WG lie firmly in linguistics, and more specifically in grammar, it can also be seen as a contribution to cognitive psychology; in terms of a widely used classification of linguistic theories, it is a branch of cognitive linguistics (Lakoff 1987; Langacker 1987; 1990; Taylor 1989). The theory has been developed from the start with the aim of integrating all aspects of language into a single theory which is also compatible with what is known about general cognition. This may turn out not to be possible, but to the extent that it is possible it will have explained the general characteristics of language as 'merely' one instantiation of more general cognitive characteristics. The overriding consideration, of course, is the same as for any other linguistic theory: to be true to the facts of language structure. However, our assumptions make a great deal of difference when approaching these facts, so it is possible to arrive at radically different analyses according to whether we assume that language is a unique module of the mind, or that it is similar to other parts of cognition. The WG assumption is that language can be analysed and explained in the same way as other kinds of knowledge or behaviour unless there is clear evidence to the contrary. So far this strategy has proved productive and largely successful, as we shall see below.
As the theory's name suggests, the central unit of analysis is the word, which is central to all kinds of analysis: • Grammar. Words are the only units of syntax (section 8), as sentence structure consists entirely of dependencies between individual words; WG is thus clearly part of the tradition of dependency grammar dating from Tesnière (1959; Fraser 1994). Phrases are implicit in the dependencies, but play no part in the grammar. Moreover, words are not only the largest units of syntax, but also the smallest. In contrast with Chomskyan linguistics, syntactic structures do not, and cannot, separate stems and inflections, so WG is an example of morphology-free syntax (Zwicky 1992: 354). Unlike syntax, morphology (section 7) is based on constituent structure, and the two kinds of structure are different in other ways too. • Semantics. As in other theories, words are also the basic lexical units where sound meets syntax and semantics, but in the absence of phrases, words also provide the only point of contact between syntax and semantics, giving a radically 'lexical' semantics. As will appear in section 9, a rather unexpected effect of basing semantic structure on single words is a kind of phrase structure in the semantics. • Situation. We shall see in section 6 that words are the basic units for contextual analysis (in terms of deictic semantics, discourse or sociolinguistics). Words, in short, are the nodes that hold the 'language' part of the human network together. This is illustrated by the word cycled in the sentence I cycled to UCL, which is diagrammed in Figure 1.
Figure 1
Table 1 Relationships in cycled

related concept C          relationship of C to cycled   notation in diagram
the word I                 subject                       's'
the word to                post-adjunct                  '>a'
the morpheme {cycle}       stem                          straight downward line
the word-form {cycle+ed}   whole                         curved downward line
the concept 'ride-bike'    sense                         straight upward line
the concept 'event e'      referent                      curved upward line
the lexeme CYCLE           cycled isa CYCLE              triangle resting on CYCLE
the inflection 'past'      cycled isa 'past'             triangle resting on 'past'
me                         speaker                       'speaker'
now                        time                          'time'
As can be seen in this diagram, cycled is the meeting point for ten relationships, which are detailed in Table 1. These relationships are all quite traditional (syntactic, morphological, semantic, lexical and contextual), and traditional names are used where they exist, but the diagram uses notation which is peculiar to WG. It should be easy to imagine how such relationships can multiply to produce a rich network in which words are related to one another as well as to other kinds of element, including morphemes and various kinds of meaning. All these elements, including the words themselves, are 'concepts' in the standard sense; thus a WG diagram is an attempt to model a small part of the total conceptual network of a typical speaker.

2. Historical Background
The theory described in this article is the latest in a family of theories which have been called 'Word Grammar' since the early 1980s (Hudson 1984). The present theory is very different in some respects from the earliest one, but the continued use of the same name is justified because we have preserved some of the most fundamental ideas - the central place of the word, the idea that language is a network, the role of default inheritance, the clear separation of syntax and semantics, the integration of sentence and utterance structure. The theory is still changing and further changes are already identifiable (Hudson, in preparation). As in other theories, the changes have been driven by various forces - new data, new ideas, new alternative theories, new personal interests; and by the influence of teachers, colleagues and students. The following brief history may be helpful in showing how the ideas that are now called 'Word Grammar' developed during my academic life. The 1960s. My PhD analysis of Beja used the theory being developed by Halliday (1961) under the name 'Scale-and-Category' grammar, which later turned into Systemic Functional Grammar (Butler 1985; Halliday 1985). I spent the next six years working with Halliday, whose brilliantly wide-ranging analyses impressed me a lot. Under the influence of Chomsky's generative
grammar (1957, 1965), reinterpreted by McCawley (1968) as well-formedness conditions, I published the first generative version of Halliday's Systemic Grammar (Hudson 1970). This theory has a very large network (the 'system network') at its heart, and networks also loomed large at that time in the Stratificational Grammar of Lamb (1966; Bennett 1994). Another reason why Stratificational Grammar was important was that it aimed to be a model of human language processing - a cognitive model. The 1970s. Seeing the attractions of both valency theory and Chomsky's subcategorization, I produced a hybrid theory which was basically Systemic Grammar, but with the addition of word-word dependencies under the influence of Anderson (1971); the theory was called 'Daughter-Dependency Grammar' (Hudson 1976). Meanwhile I was teaching sociolinguistics and becoming increasingly interested in cognitive science (especially default inheritance systems and frames) and the closely related field of lexical semantics (especially Fillmore's Frame Semantics 1975, 1976). The result was a very 'cognitive' textbook on sociolinguistics (Hudson 1980a, 1996a). I was also deeply influenced by Chomsky's 'Remarks on nominalization' paper (1970), and in exploring the possibilities of a radically lexicalist approach I toyed with the idea of 'pan-lexicalism' (1980b, 1981): everything in the grammar is 'lexical' in the sense that it is tied to word-sized units (including word classes). The 1980s. All these influences combined in the first version of Word Grammar (Hudson 1984), a cognitive theory of language as a network which contains both 'the grammar' and 'the lexicon' and which integrates language with the rest of cognition.
The semantics follows Lyons (1977), Halliday (1967-8) and Fillmore (1976) rather than formal logic, but even more controversially, the syntax no longer uses phrase structure at all in describing sentence structure, because everything that needs to be said can be said in terms of dependencies between single words. The influence of continental dependency theory is evident, but the dependency structures were richer than those allowed in 'classical' dependency grammar (Robinson 1970) - more like the functional structures of Lexical Functional Grammar (Kaplan and Bresnan 1982). Bresnan's earlier argument (1978) that grammar should be compatible with a psychologically plausible parser also suggested the need for a parsing algorithm, which has led to a number of modest Natural Language Processing (NLP) systems using WG (Fraser 1985, 1989, 1993; Hudson 1989; Shaumyan 1995). These developments provided the basis for the next book-length description of WG, 'English Word Grammar' (EWG, Hudson 1990). This attempts to provide a formal basis for the theory as well as a detailed application to large areas of English morphology, syntax and semantics.

The 1990s. Since the publication of EWG there have been some important changes in the theory, ranging from the general theory of default inheritance, through matters of syntactic theory (with the addition of 'surface structure', the virtual abolition of features and the acceptance of 'unreal' words) and morphological theory (where 'shape', 'whole' and 'inflection' are new), to details of analysis, terminology and notation. These changes will be described below. WG has also been applied to a wider range of topics than previously:
WHAT IS WORD GRAMMAR?
• lexical semantics (Gisborne 1993, 1996, 2000, 2001; Holmes 2004; Hudson and Holmes 2000; Hudson 1992, 1995, forthcoming; Sugayama 1993, 1996, 1998);
• morphology (Creider 1999; Creider and Hudson 1999);
• sociolinguistics (Hudson 1996a, 1997b; Eppler 2005);
• language processing (Hudson 1993a, b, 1996b; Hiranuma 1999, 2001).

Most of the work done since the start of WG has applied the theory to English, but it has also been applied to the following languages: Tunisian Arabic (Chekili 1982); Greek (Tzanidaki 1995, 1996a, b); Italian (Volino 1990); Japanese (Sugayama 1991, 1992, 1993, 1996; Hiranuma 1999, 2001); Serbo-Croatian (Camdzic and Hudson 2002); and Polish (Gorayska 1985). The theory continues to evolve, and at the time of writing a 'Word Grammar Encyclopedia', which can be downloaded via the WG website (www.phon.ucl.ac.uk/home/dick/wg.htm), is updated in alternate years.

3. The Cognitive Network

3.1 Language as part of a general network

The basis for WG is an idea which is quite uncontroversial in cognitive science:

The idea is that memory connections provide the basic building blocks through which our knowledge is represented in memory. For example, you obviously know your mother's name; this fact is recorded in your memory. The proposal to be considered is that this memory is literally represented by a memory connection,... That connection isn't some appendage to the memory. Instead, the connection is the memory.... all of knowledge is represented via a sprawling network of these connections, a vast set of associations. (Reisberg 1997: 257-8)
In short, knowledge is held in memory as an associative network (though we shall see below that the links are much more precisely defined than the unlabelled 'associations' of early psychology and modern connectionist psychology). What is more controversial is that, according to WG, the same is true of our knowledge of words, so the sub-network responsible for words is just a part of the total 'vast set of associations'. Our knowledge of words is our language, so our language is a network of associations which is closely integrated with the rest of our knowledge. However uncontroversial (and obvious) this view of knowledge may be in general, it is very controversial in relation to language. The only part of language which is widely viewed as a network is the lexicon (Aitchison 1987: 72), and a fashionable view is that even here only lexical irregularities are stored in an associative network, in contrast with regularities which are stored in a fundamentally different way, as 'rules' (Pinker and Prince 1988). For example, we have a network which shows for the verb come not only that its meaning is
'come' but that its past tense is the irregular came, whereas regular past tenses are handled by a general rule and not stored in the network. The WG view is that exceptional and general patterns are indeed different, but that they can both be accommodated in the same network because it is an 'inheritance network' in which general patterns and their exceptions are related by default inheritance (which is discussed in more detail in section 4). To pursue the last example, both patterns can be expressed in exactly the same prose:

(1) The shape of the past tense of a verb consists of its stem followed by -ed.
(2) The shape of the past tense of come consists of came.

The only difference between these rules lies in two places: 'a verb' versus come, and 'its stem followed by -ed' versus came. Similarly, they can both be incorporated into the same network, as shown in Figure 2 (where the triangle once again shows the 'isa' relationship by linking the general concept at its base to the specific example connected to its apex):
Figure 2

Once the possibility is accepted that some generalizations may be expressed in a network, it is easy to extend the same treatment to the whole grammar, as we shall see in later examples. One consequence, of course, is that we lose the formal distinction between 'the lexicon' and 'the rules' (or 'the grammar'), but this conclusion is also accepted outside WG in Cognitive Grammar (Langacker 1987) and Construction Grammar (Goldberg 1995). The only parts of linguistic analysis that cannot be included in the network are the few general theoretical principles (such as the principle of default inheritance).

3.2 Labelled links

It is easy to misunderstand the network view because (in cognitive psychology) there is a long tradition of 'associative network' theories in which all links have just the same status: simple 'association'. This is not the WG view, nor is it the view of any of the other theories mentioned above, because links are
classified and labelled - 'stem', 'shape', 'sense', 'referent', 'subject', 'adjunct' and so on. The classifying categories range from the most general - the 'isa' link - to categories which may be specific to a handful of concepts, such as 'goods' in the framework of commercial transactions (Hudson forthcoming). This is a far cry from the idea of a network of mere 'associations' (such as underlies connectionist models). One of the immediate benefits of this approach is that it allows named links to be used as functions, in the mathematical sense of Kaplan and Bresnan (1982: 182), which yield a unique value - e.g. 'the referent of the subject of the verb' defines one unique concept for each verb. In order to distinguish this approach from the traditional associative networks we can call these networks 'labelled'. Even within linguistics, labelled networks are controversial because the labels themselves need an explanation or analysis. Because of this problem some theories avoid labelled relationships, or reduce labelling to something more primitive: for example, Chomsky has always avoided functional labels for constituents such as 'subject' by using configurational definitions, and the predicate calculus avoids semantic role labels by distinguishing arguments in terms of order. There is no doubt that labels on links are puzzlingly different from the labels that we give to the concepts that they link. Take the small network in Figure 2 for past tenses. One of the nodes is labelled 'COME: past', but this label could in fact be removed without any effect because 'COME: past' is the only concept which isa 'verb: past' and which has came as its shape. Every concept is uniquely defined by its links to other concepts, so labels are redundant (Lamb 1996, 1999: 59). But the same is not true of the labels on links, because a network with unlabelled links is a mere associative network which would be useless in analysis.
For example, it is no help to know that in John saw Mary the verb is linked, in some way or other, to the two nouns and that its meaning is linked, again in unspecified ways, to the concepts 'John' and 'Mary'; we need to know which noun is the subject, and which person is the see-er. The same label may be found on many different links - for example, every word that has a sense (i.e. virtually every word) has a link labelled 'sense', every verb that has a subject has a 'subject' link, and so on. Therefore the function of the labels is to classify the links as same or different, so if we remove the label we lose information. It makes no difference whether we show these similarities and differences by means of verbal labels (e.g. 'sense') or some other notational device (e.g. straight upwards lines); all that counts is whether or not our notation classifies links as same or different. Figure 3 shows how this can be done using first conventional attribute-value matrices and second, the WG notation used so far. This peculiarity of the labels on links brings us to an important characteristic of the network approach which allows the links themselves to be treated like the concepts which they link - as 'second-order concepts', in fact. The essence of a network is that each concept should be represented just once, and its multiple links to other concepts should be shown as multiple links, not as multiple copies of the concept itself. Although the same principle applies generally to attribute-value matrices, it does not apply to the attributes themselves. Thus
there is a single matrix for each concept, and if two attributes have the same value this is shown (at least in one notation) by an arc that connects the two value-slots. But when it comes to the attributes themselves, their labels are repeated across matrices (or even within a single complex matrix). For example, the matrix for a raising verb contains within it the matrix for its complement verb; an arc can show that the two subject slots share the same filler but the only way to show that these two slots belong to the same (kind of) attribute is to repeat the label 'subject'. In a network approach it is possible to show both kinds of identity in the same way: by means of a single node with multiple 'isa' links. If two words are both nouns, we show this by an isa link from each to the concept 'noun'; and if two links are both 'subject' links, we put an isa link from each link to a single general 'subject' link. Thus labelled links and other notational tricks are just abbreviations for a more complex diagram with second-order links between links. These second-order links are illustrated in Figure 4 for car and bicycle, as well as for the sentence Jo snores.
Figure 4
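To make the idea of labelled links concrete, here is a minimal sketch in Python. All node names, labels and the dictionary encoding are my own illustrative assumptions, not WG notation: it shows labels behaving as mathematical functions that yield a unique value, and a concrete link carrying its own isa to a general link type.

```python
# A minimal sketch of a WG-style labelled network (all node and label
# names are illustrative assumptions, not taken from the source).
# Each labelled link maps a (node, label) pair to exactly one node, so a
# label behaves like a mathematical function yielding a unique value.

links = {
    ("w1", "subject"):  "w0",        # w0 = Jo, w1 = snores, as in 'Jo snores'
    ("w1", "sense"):    "snoring",
    ("w0", "referent"): "Jo",
    ("snoring", "snorer"): "Jo",
}

# links themselves can be classified: each concrete link isa a general one
link_isa = {("w1", "subject"): "subject"}

def follow(node, *labels):
    """Compose labelled links: follow('w1', 'subject', 'referent') reads as
    'the referent of the subject of w1' and returns one unique concept."""
    for label in labels:
        node = links[(node, label)]
    return node
```

Here `follow("w1", "subject", "referent")` picks out 'Jo', and the `link_isa` entry is the second-order 'isa' link from the concrete subject link to the general 'subject' relation.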
This kind of analysis is too cumbersome to present explicitly in most diagrams, but it is important to be clear that it underlies the usual notation because it allows the kind of analysis which we apply to ordinary concepts to be extended to the links between them. If ordinary concepts can be grouped into larger classes, so can links; if ordinary concepts can be learned, so can links. And if the labels on ordinary concepts are just mnemonics which could, in principle, be removed, the same is true of the labels on almost all kinds of link. The one exception is the 'isa' relationship itself, which reflects its fundamental character.
3.3 Modularity

The view of language as a labelled network has interesting consequences for the debate about modularity: is there a distinct 'module' of the mind dedicated exclusively to language (or to some part of language such as syntax or inflectional morphology)? Presumably not if a module is defined as a separate 'part' of our mind and if the language network is just a small part of a much larger network. One alternative to this strong version of modularity is no modularity at all, with the mind viewed as a single undifferentiated whole; this seems just as wrong as a really strict version of modularity. However there is a third possibility. If we focus on the links, any such network is inevitably 'modular' in the much weaker (and less controversial) sense that links between concepts tend to cluster into relatively dense sub-networks separated by relatively sparse boundary areas. Perhaps the clearest evidence for some kind of modularity comes from language pathology, where abilities are impaired selectively. Take the case of Pure Word Deafness (Altmann 1997: 186), for example. Why should a person be able to speak and read normally, and to hear and classify ordinary noises, but not be able to understand the speech of other people? In terms of a WG network, this looks like an inability to follow one particular link-type ('sense') in one particular direction (from word to sense). Whatever the reason for this strange disability, at least the WG analysis suggests how it might apply to just this one aspect of language, while also applying to every single word: what is damaged is the general relationship 'sense', from which all particular sense relationships are inherited. A different kind of problem is illustrated by patients who can name everything except one category - e.g. body-parts or things typically found indoors (Pinker 1994: 314).
Orthodox views on modularity seem to be of little help in such cases, but a network approach at least explains how the non-linguistic concepts concerned could form a mental cluster of closely-linked and mutually defining concepts with a single super-category. It is easy to imagine reasons why such a cluster of concepts might be impaired selectively (e.g. that closely related concepts are stored close to each other, so a single injury could sever all their sense links), but the main point is to have provided a way of unifying them in preparation for the explanation. In short, a network with classified relations allows an injury to apply to specific relation types so that these relations are disabled across the board. The
approach also allows damage to specific areas of language which form clusters with strong internal links and weak external links. Any such cluster or shared linkage defines a kind of 'module' which may be impaired selectively, but the module need not be innate: it may be 'emergent', a cognitive pattern which emerges through experience (Karmiloff-Smith 1992; Bates et al. 1998).

4. Default Inheritance

Default inheritance is just a formal version of the logic that linguists have always used: true generalizations may have exceptions. We allow ourselves to say that verbs form their past tense by adding -ed to the stem even if some verbs don't, because the specific provision made for these exceptional cases will automatically override the general pattern. In short, characteristics of a general category are 'inherited' by instances of that category only 'by default' - only if they are not overridden by a known characteristic of the specific case. Common sense tells us that this is how ordinary inference works, but default inheritance only works when used sensibly. Although it is widely used in artificial intelligence, researchers treat it with great caution (Luger and Stubblefield 1993: 386-8). The classic formal treatment is Touretzky (1986). Inheritance is carried by the 'isa' relation, which is another reason for considering this relation to be fundamental. For example, because snores isa 'verb' it automatically inherits all the known characteristics of 'verb' (i.e. of 'the typical verb'), including, for example, the fact that it has a subject; similarly, because the link between Jo and snores in Jo snores isa 'subject' it inherits the characteristics of 'subject'. As we have already seen, the notation for 'isa' consists of a small triangle with a line from its apex to the instance.
The base of the triangle which rests on the general category reminds us that this category is larger than the instance, but it can also be imagined as the mouth of a hopper into which information is poured so that it can flow along the link to the instance. The mechanism whereby default values are overridden has changed during the last few years. In EWG, and also in Fraser and Hudson (1992), the mechanism was 'stipulated overriding', a system peculiar to WG; but since then this system has been abandoned. WG now uses a conventional system in which a fact is automatically blocked by any other fact which conflicts and is more specific. Thus the fact that the past tense of COME is came automatically blocks the inheritance of the default pattern for past tense verbs. One of the advantages of a network notation is that this is easy to define formally: we always prefer the value for 'R of C' (where R is some relationship, possibly complex, and C is a concept) which is nearest to C (in terms of intervening links). For example, if we want to find the shape of the past tense of COME, we have a choice between came and comed, but the route to came is shorter than that to comed because the latter passes through the concept 'past tense of a verb'. (For detailed discussions of default inheritance in WG, see Hudson 2000a, 2003b.) Probably the most important question for any system that uses default inheritance concerns multiple inheritance, in which one concept inherits
from two different concepts simultaneously - as 'dog' inherits, for example, both from 'mammal' and from 'pet'. Multiple inheritance is allowed in WG, as in unification-based systems and the programming language DATR (Evans and Gazdar 1996); it is true that it opens up the possibility of conflicting information being inherited, but this is a problem only if the conflict is an artefact of the analysis. There seem to be some examples in language where a form is ungrammatical precisely because there is an irresoluble conflict between two characteristics; for example, in many varieties of standard English the combination *I amn't is predictable, but ungrammatical. One explanation for this strange gap is that the putative form amn't has to inherit simultaneously from aren't (the negative present of BE) and am (the I-form of BE); but these models offer conflicting shapes (aren't, am) without any way for either to override the other (Hudson 2000a). In short, WG does allow multiple inheritance, and indeed uses it a great deal (as we shall see in later sections).
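The 'nearest value wins' mechanism and the amn't gap can both be captured in one toy sketch. The dictionary encoding, the deliberately simplified treatment of aren't and am as bare concepts, and all function names are my assumptions, not WG machinery: the search proceeds outward level by level along isa links, so the value fewest links away wins (came blocks comed), while two conflicting values at the same distance leave no winner.

```python
# Toy default inheritance (encoding and names assumed, not WG notation):
# each concept lists its isa parents; search proceeds outward level by
# level, so the value fewest isa links away wins, while two conflicting
# values at the SAME distance - as with amn't - yield no unique winner,
# i.e. ungrammaticality.

concepts = {
    "verb":      {"isa": []},
    "verb:past": {"isa": ["verb"], "shape": "stem+ed"},        # default rule
    "COME":      {"isa": ["verb"], "stem": "come"},
    "COME:past": {"isa": ["COME", "verb:past"], "shape": "came"},  # exception
    "WALK":      {"isa": ["verb"], "stem": "walk"},
    "WALK:past": {"isa": ["WALK", "verb:past"]},               # nothing stored
    # the amn't gap: two models offer shapes at equal distance
    "aren't":    {"isa": [], "shape": "aren't"},
    "am":        {"isa": [], "shape": "am"},
    "amn't":     {"isa": ["aren't", "am"]},
}

def inherit(concept, attribute):
    """Value for 'attribute of concept' nearest to the concept, or None
    when equally near values conflict (an irresoluble conflict)."""
    level = [concept]
    while level:
        found = {concepts[c][attribute] for c in level if attribute in concepts[c]}
        if len(found) == 1:
            return found.pop()
        if len(found) > 1:
            return None            # conflict at the same distance
        level = [p for c in level for p in concepts[c]["isa"]]
    return None

def shape(concept):
    s = inherit(concept, "shape")
    # the default rule refers to the stem, which is itself inherited
    return inherit(concept, "stem") + "ed" if s == "stem+ed" else s
```

On this sketch `shape("COME:past")` yields came, `shape("WALK:past")` falls through to the default and yields walked, and `shape("amn't")` yields no value at all, mirroring the ungrammaticality of *I amn't.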
5. The Language Network

According to WG, then, language is a network of concepts. The following more specific claims flesh out this general idea. First, language is part of the same general conceptual network which contains many concepts which are not part of language. What distinguishes the language area of this network from the rest is that the concepts concerned are words and their immediate characteristics. This is simply a matter of definition: concepts which are not directly related to words would not be considered to be part of language. As explained in section 3.3, language probably qualifies as a module in the weak sense that the links among words are denser than those between words and other kinds of concept, but this does not mean that language is a module in the stronger sense of being 'encapsulated' or having its own special formal characteristics. This is still a matter of debate, but we can be sure that at least some of the characteristics of language are also found elsewhere - the mechanism of default inheritance and the isa relation, the notion of linear order, and many other formal properties and principles. As we saw in Table 1, words may have a variety of links to each other and to other concepts. This is uncontroversial, and so are most of the links that are recognized. Even the traditional notions of 'levels of language' are respected in as much as each level is defined by a distinct kind of link: a word is linked to its morphological structure via the 'stem' and 'shape' links, to its semantics by the 'sense' and 'referent' links, and to its syntax by dependencies and word classes. Figure 5 shows how clearly the traditional levels can be separated from one another. In WG there is total commitment to the 'autonomy' of levels, in the sense that the levels are formally distinct. The most controversial characteristic of WG, at this level of generality, is probably the central role played by inheritance (isa) hierarchies.
Inheritance hierarchies are the sole means available for classifying concepts, which means that there is no place for feature-descriptions. In most other theories, feature-descriptions are used to name concepts, so that instead of
Figure 5 (semantics, syntax, morphology, phonology/graphology)

'verb' we have '[+V, -N]' or (changing notation) '[Verb: +, Noun: -, SUBCAT: ]' or even 'S/NP'. This is a fundamental difference because, as we saw earlier, the labels on WG nodes are simply mnemonics and the analysis would not be changed at all if they were all removed. The same is clearly not true where feature-descriptions are used, as the name itself contains crucial information which is not shown in any other way. In order to classify a word as a verb in WG we give it an isa link to 'verb'; we do not give it a feature-description which contains that of 'verb'. The most obviously classifiable elements in language are words, so in addition to specific, unique, words we recognize general 'word-types'; but we can refer to both simply as 'words' because (as we shall see in the next section) their status is just the same. Multiple inheritance allows words to be classified on two different 'dimensions': as lexemes (DOG, LIKE, IF, etc.) and as inflections (plural, past, etc.). Figure 6 shows how this cross-classification can be incorporated into an isa hierarchy. The traditional word classes are shown on the lexeme dimension as classifications of lexemes, but they interact in complex ways with inflections. Cross-classification is possible even among word-classes; for example, English gerunds (e.g. Writing in Writing articles is fun.) are both nouns and verbs (Hudson 2000b), and in many languages participles are probably both adjectives and verbs. Unlike other theories, the classification does not take words as the highest category of concepts - indeed, it cannot do so if language is part of a larger network. WG allows us to show the similarities between words and other kinds
of communicative behaviour by virtue of an isa link from 'word' to 'communication', and similar links show that words are actions and events. This is important in the analysis of deictic meanings which have to relate to the participants and circumstances of the word as an action. This hierarchy of words is not the only isa hierarchy in language. There are two more for speech sounds ('phonemes') and for letters ('graphemes'), and a fourth for morphemes and larger 'forms' (Hudson 1997b; Creider and Hudson 1999), but most important is the one for relationships - 'sense', 'subject' and so on. Some of these relationships belong to the hierarchy of dependents which we shall discuss in the section on syntax, but there are many others which do not seem to comprise a single coherent hierarchy peculiar to language (in contrast with the 'word' hierarchy). What seems much more likely is that relationships needed in other areas of thought (e.g. 'before', 'part-of') are put to use in language. To summarize, the language network is a collection of words and word-parts (speech-sounds, letters and morphemes) which are linked to each other and to the rest of cognition in a variety of ways, of which the most important is the 'isa' relationship which classifies them and allows default inheritance.

6. The Utterance Network

A WG analysis of an utterance is also a network; in fact, it is simply an extension of the permanent cognitive network in which the relevant word tokens comprise a 'fringe' of temporary concepts attached by 'isa' links, so the
utterance network has just the same formal characteristics as the permanent network. For example, suppose you say to me 'I agree.' My task, as hearer, is to segment your utterance into the two words I and agree, and then to classify each of these as an example of some word in my permanent network (my grammar). This is possible to the extent that default inheritance can apply smoothly; so, for example, if my grammar says that I must be the subject of a tensed verb, the same must be true of this token, though as we shall see below, exceptions can be tolerated. In short, a WG grammar can generate representations of actual utterances, warts and all, in contrast with most other kinds of grammar which generate only idealized utterances or 'sentences'. This blurring of the boundary between grammar and utterance is very controversial, but it follows inevitably from the cognitive orientation of WG. The status of utterances has a number of theoretical consequences both for the structures generated and for the grammar that generates them. The most obvious consequence is that word tokens must have different names from the types of which they are tokens; in our example, the first word must not be shown as I if this is also used as the name for the word-type in the grammar. This follows from the fact that identical labels imply identity of concept, whereas tokens and types are clearly distinct concepts. The WG convention is to reserve conventional names for types, with tokens labelled 'w1', 'w2' and so on through the utterance. Thus our example consists of w1 and w2, which isa 'I' and 'AGREE: pres' respectively. This system allows two tokens of the same type to be distinguished; so in I agree I made a mistake, w1 and w3 both isa 'I'. (For simplicity WG diagrams in this chapter only respect this convention when it is important to distinguish tokens from types.
) Another consequence of integrating utterances into the grammar is that word types and tokens must have characteristics such that a token can inherit them from its type. Obviously the token must have the familiar characteristics of types - it must belong to a lexeme and a word class, it must have a sense and a stem, and so on. But the implication goes in the other direction as well: the type may mention some of the token's characteristics that are normally excluded from grammar, such as characteristics of the speaker, the addressee and the situation. This allows a principled account of deictic meaning (e.g. I refers to the speaker, you to the addressee and now to the time of speaking), as shown in Figure 1 and Table 1. Perhaps even more importantly, it is possible to incorporate sociolinguistic information into the grammar, by indicating the kind of person who is a typical speaker or addressee, or the typical situation of use. Treating utterances as part of the grammar has two further effects which are important for the psycholinguistics of processing and of acquisition. As far as processing is concerned, the main point is that WG accommodates deviant input because the link between tokens and types is guided by the rather liberal 'Best Fit Principle' (Hudson 1990: 45ff): assume that the current token isa the type that provides the best fit with everything that is known. The default inheritance process which this triggers allows known characteristics of the token to override those of the type; for example, a misspelled word such as mispelled
can isa its type, just like any other exception, though it will also be shown as a deviant example. There is no need for the analysis to crash because of an error. (Of course a WG grammar is not in itself a model of either production or perception, but simply provides a network of knowledge which the processor can exploit.) Turning to learning, the similarity between tokens and types means that learning can consist of nothing but the permanent storage of tokens minus their utterance-specific content. These remarks about utterances are summarized in Figure 7, which speculates about my mental representation for the (written) 'utterance' Yous mispelled it. According to this diagram, the grammar supplies two kinds of utterance-based information about w1:

• that its referent is a set whose members include its addressee;
• that its speaker is a 'northerner' (which may be inaccurate factually, but is roughly what I believe to be the case).

It also shows that w2 is a deviant token of the type 'MISSPELL: past'. (The horizontal line below 'parts' is short-hand for a series of lines connecting the individual letters directly to the morpheme, each with a distinct part name: part 1, part 2 and so on.)
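The Best Fit Principle is not spelled out algorithmically in this chapter, so the following is only a crude stand-in; the similarity score and all names are my assumptions. The point it illustrates is that a token is classified as the stored type it matches best, and a mismatch is recorded as deviance rather than crashing the analysis.

```python
# Crude stand-in for the Best Fit Principle (scoring and names assumed):
# classify a token as the stored type whose shape it matches best, even
# if the match is imperfect - deviant tokens still get a type.

types = {
    "MISSPELL:past": "misspelled",
    "SPELL:past":    "spelled",
}

def best_fit(token):
    """Return (type, deviant?) for the best-matching stored type."""
    def score(name):
        # shared characters position-by-position: a toy similarity measure
        return sum(a == b for a, b in zip(token, types[name]))
    best = max(types, key=score)
    return best, token != types[best]
```

So the deviant token `best_fit("mispelled")` still isa 'MISSPELL:past', flagged as deviant, just as in the discussion of Figure 7.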
Figure 7
7. Morphology
As explained earlier, the central role of the word automatically means that the syntax is 'morphology-free'. Consequently it would be fundamentally against the spirit of WG to follow transformational analyses in taking Jo snores as Jo 'tense' snore. A morpheme for tense is not a word in any sense, so it cannot be a syntactic node. The internal structure of words is handled almost entirely by morphology. (The exception is the pattern found in clitics, which we return to at the end of this section.) The WG theory of inflectional morphology has developed considerably in the last few years (Creider and Hudson 1998; Hudson 2000a) and is still evolving. In contrast with the views expressed in EWG, I now distinguish sharply between words, which are abstract, and forms, which are their concrete (visible or audible) shapes; so I now accept the distinction between syntactic words and phonological words (Rosta 1997) in all but terminology. The logic behind this distinction is simple: if two words can share the same form, the form must be a unit distinct from both. For example, we must recognize a morpheme {bear} which is distinct from both the noun and the verb that share it (BEARnoun and BEARverb). This means that a word can never be directly related to phonemes and letters, in contrast with the EWG account where this was possible (e.g. Hudson 1990: 90: 'whole of THEM = '). Instead, words are mapped to forms, and forms to phonemes and letters. A form is the 'shape' of a word, and a phoneme or letter is a 'pronunciation' or 'spelling' of a form. In Figure 7, for example, the verb MISSPELL has the form {misspell} as its stem (a kind of shape), and the spelling of {misspell} is <misspell>. In traditional terms, syntax, form and phonology define different 'levels of language'.
As in traditional structuralism, their basic units are distinct words, morphemes and phoneme-type segments; and as in the European tradition, morphemes combine to define larger units of form which are still distinct from words. For example, {misspell} is clearly not a single morpheme, but it exists as a unit of form which might be written {mis+spell} - two morphemes combining to make a complex form - and similarly for {mis+spell+ed}, the shape of the past tense of this verb. Notice that in this analysis {...} indicates forms, not morphemes; morpheme boundaries are shown by '+'. Where does morphology, as a part of the grammar, fit in? Inflectional morphology is responsible for any differences between a word's stem - the shape of its lexeme - and its whole - the complete shape. For example, the stem of misspelled is {misspell}, so inflectional morphology explains the extra suffix. Derivational morphology, on the other hand, explains the relations between the stems of distinct lexemes - in this case, between the lexemes SPELL and MISSPELL, whereby the stem of one is contained in the stem of the other. The grammar therefore contains the following 'facts':

• the stem of SPELL is {spell};
• the stem of MISSPELL is {mis+spell};
• the 'mis-verb' of a verb has a stem which contains {mis} + the stem of this verb;
WHAT IS WORD GRAMMAR?
19
• the whole of MISSPELL: past is {mis+spell+ed};
• the past tense of a verb has a whole which contains its stem + {ed}.

In more complex cases (which we cannot consider here) the morphological rules can handle vowel alternations and other departures from simple combination of morphemes. A small sample of a network for inflectional morphology is shown in Figure 8. This diagram shows the default identity of whole and stem, and the default rule for plural nouns: their shape consists of their stem followed by {s}. No plural need be stored for regular nouns like DUCK, but for GOOSE the irregularity must be stored. According to the analysis shown here, geese is doubly irregular, having no suffix and having an irregular stem whose vowel positions (labelled here simply '1' and '2') are filled by (examples of) <e> instead of the expected <o>. In spite of the vowel change the stem of geese isa the stem of GOOSE, so it inherits all the other letters, but had it been suppletive a completely new stem would have been supplied.
Figure 8
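The default-and-override logic that Figure 8 encodes as a network can be mimicked procedurally. This is only a toy sketch: the dictionary layout and function names are invented for illustration, and a real WG network stores far more than stems and plurals:

```python
# Sketch of default inheritance for noun plurals: by default a noun's
# whole is its stem plus {s}; GOOSE stores an exception that both drops
# the suffix and overrides the stem vowels. Names are illustrative.

DEFAULTS = {"plural": lambda stem: stem + "s"}   # whole = stem + {s}

LEXEMES = {
    "DUCK":  {"stem": "duck"},                   # fully regular: nothing stored
    "GOOSE": {"stem": "goose",
              # doubly irregular: no suffix, and the two vowel slots
              # are filled by e instead of the expected o
              "plural": lambda stem: stem.replace("oo", "ee")},
}

def whole(lexeme, inflection):
    """Use the lexeme's own stored rule if any; otherwise inherit the default."""
    entry = LEXEMES[lexeme]
    rule = entry.get(inflection, DEFAULTS[inflection])
    return rule(entry["stem"])

print(whole("DUCK", "plural"))   # ducks (inherited default)
print(whole("GOOSE", "plural"))  # geese (stored exception)
```

Nothing at all needs to be stored for DUCK; the lookup simply falls through to the default, which is exactly the economy that default inheritance buys.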
20
WORD GRAMMAR: PERSPECTIVES ON LANGUAGE STRUCTURE
This analysis is very similar to those which can be expressed in terms of 'network morphology' (Brown et al. 1996), which is also based on multiple default inheritance. One important difference lies in the treatment of syncretism, illustrated by the English verb's past participle and passive participle, which are invariably the same. In network morphology the identity is shown by specifying one and cross-referring to it from the other, but this involves an arbitrary choice: which is the 'basic' one? In WG morphology, in contrast, the syncretic generalizations are expressed in terms of 'variant' relations between forms; for example, the past participle and passive participle both have as their whole the 'en-variant' of their stem, where the en-variant of {take} is {taken} and that of {walk} is {walked}. The en-variant is a 'morphological function' which relates one form (the word's stem) to another, allowing the required combination of generalization (by default a form's en-variant adds {ed} to a copy of the form) and exceptionality.

As derivational morphology is responsible for relationships between lexemes, it relates one lexeme's stem to that of another by means of exactly the same apparatus of morphological functions as is used in inflectional morphology - indeed, some morphological functions may be used both in inflection and in derivation (for example, the one which is responsible for adding {ing} is responsible not only for present participles but also for nominalizations such as flooring). Derivational morphology is not well developed in WG, but the outlines of a system are clear. It will be based on abstract lexical relationships such as 'mis-verb' (relating SPELL to MISSPELL) and 'nominalization' (relating it to SPELLING); these abstract relations between words are realized, by default, by (relatively) concrete morphological functions, so, for example, a verb's nominalization is typically realized by the ing-variant of that verb's stem.
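A 'morphological function' such as the en-variant can be sketched the same way: a named form-to-form mapping with a default and stored exceptions. The function and table names below are invented for illustration:

```python
# Sketch of a WG-style morphological function: the en-variant maps a
# stem to another form, adding {ed} by default, with stored exceptions
# such as {take} -> {taken}. Names are illustrative assumptions.

EN_VARIANT_EXCEPTIONS = {"take": "taken"}

def en_variant(stem):
    """By default the en-variant adds {ed} to a copy of the form."""
    return EN_VARIANT_EXCEPTIONS.get(stem, stem + "ed")

# Past participle and passive participle both point at the same
# function, so their syncretism involves no arbitrary 'basic' member:
def past_participle(stem):
    return en_variant(stem)

def passive_participle(stem):
    return en_variant(stem)

print(past_participle("walk"))     # walked
print(passive_participle("take"))  # taken
```

Because both participles refer to the same function, an exception stored once (for {take}) automatically covers both uses.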
Of course, not all lexical relationships are realized by derivational morphology, in which related lexemes are partly similar in morphology; the grammar must also relate lexemes where morphology is opaque (e.g. DIE - KILL, BROTHER - SISTER). The network approach allows us to integrate all these relationships into a single grammar without worrying about boundaries between traditional sub-disciplines such as derivational morphology and lexical semantics. I said at the start of this section that clitics are an exception to the generally clear distinction between morphology and syntax. A clitic is a word whose realization is an affix within a larger word. For example, in He's gone, the clitic 's is a word in terms of syntax, but its realization is a mere affix in terms of morphology. Clitics are atypical because typical words are realized by an entire word-form; but the exceptionality is just a matter of morphology. In the case of 's, I suggest that it isa the word 'BE: present, singular' with the one exceptional feature that its whole isa the morpheme {s} - exactly the same morpheme as we find in plural nouns, other singular verbs and possessives. As in other uses, {s} needs to be part of a complete word-form, so it creates a special form called a 'host-form' by combining with a suitable word-form to the left. In more complex cases ('special clitics' - Zwicky 1977) the position of the clitic is fixed by the morphology of the host-form and conflicts with the
demands of syntax, as in the French example (3), where en would follow deux (*Paul mange deux en) if it were not attached by cliticization to mange, giving a single word-form en mange.

(3) Paul en      mange deux.
    Paul of-them eats  two
    'Paul eats two of them.'
Once again we can explain this special behaviour if we analyze en as an ordinary word EN whose shape (whole) is the affix {en}. There is a great deal more to be said about clitics, but not here. For more detail see Hudson (2001) and Camdzic and Hudson (2002).
8. Syntax

As in most other theories, syntax is the best developed part of WG, which offers explanations for most of the 'standard' complexities of syntax such as extraction, raising, control, coordination, gapping and agreement. However, the WG view of syntax is particularly controversial because of its rejection of phrase structure. WG belongs to the family of 'dependency-based' theories, in which syntactic structure consists of dependencies between pairs of single words. As we shall see below, WG also recognizes 'word-strings', but even these are not the same as conventional phrases. A syntactic dependency is a relationship between two words that are connected by a syntactic rule. Every syntactic rule (except for those involved in coordination) is 'carried' by a dependency, and every dependency carries at least one rule that applies to both the dependent and its 'parent' (the word on which it depends). These word-word dependencies form chains which link every word ultimately to the word which is the head of the phrase or sentence; consequently the individual links are asymmetrical, with one word depending on the other for its link to the rest of the sentence. Of course in some cases the direction of dependency is controversial; in particular, published WG analyses of noun phrases have taken the determiner as head of the phrase, though this analysis has been disputed and may turn out to be wrong (Van Langendonck 1994; Hudson 2004). The example in Figure 9 illustrates all these characteristics of WG syntax. A dependency analysis has many advantages over one based on phrase structure. For example, it is easy to relate a verb to a lexically selected preposition if they are directly connected by a dependency, as in the pair consists of in Figure 9; but it is much less easy (and natural) to do so if the preposition is part of a prepositional phrase.
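Since a dependency analysis is just a set of word-word links, it can be represented directly as a map from each dependent to its head, and each word's phrase can then be read off as the word together with its dependents' phrases. The sentence and head indices below are an invented stand-in for a WG diagram:

```python
# Sketch: reading phrases off a dependency analysis. Each phrase is a
# word together with the phrases of all the words that depend on it.
# The sentence and its head assignments are illustrative assumptions.

words = ["short", "sentences", "make", "good", "examples"]
head = {0: 1, 1: 2, 3: 4, 4: 2}   # dependent index -> head index; 2 is the root

def phrase(i):
    """Indices of the phrase headed by word i, in sentence order."""
    members = {i}
    for dep, hd in head.items():
        if hd == i:
            members |= set(phrase(dep))
    return sorted(members)

print([words[j] for j in phrase(4)])  # ['good', 'examples']
print([words[j] for j in phrase(2)])  # the whole sentence
```

Because each word heads exactly one phrase, the structure stays totally flat: nothing in this representation can emulate a VP node or unary branching.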
Such lexical interdependencies are commonplace in language, so dependency analysis is particularly well suited to descriptions which focus on 'constructions' - idiosyncratic patterns not covered by the most general rules (Holmes and Hudson 2005). A surface dependency analysis (explained below) can always be translated into a phrase structure by building a phrase for each word consisting of that word plus the phrases of all the words that depend on it (e.g. a sentence; of a sentence; and so on); but
Figure 9

dependency analysis is much more restrictive than phrase-structure analysis because of its total flatness. Because one word can head only one phrase it is impossible to build a dependency analysis which emulates a VP node or 'unary branching'. This restrictiveness is welcome, because it seems that such analyses are never needed. In contrast, the extra richness of dependency analysis lies partly in the labelled dependency links, and partly in the possibility of multiple dependencies. In a flat structure, in contrast with phrase structure, it is impossible to distinguish co-dependencies (e.g. a verb's subject and object) by configuration, so labels are the only way to distinguish them. There is clearly a theoretical trade-off between phrase structure and labelled functions: the more information is given in one, the less needs to be given in the other. The general theory of WG is certainly compatible with phrase structure - after all, we undoubtedly use part-whole structures in other areas of cognition, and they play an important role in morphology - but it strongly favours dependency analysis because labelled links are ubiquitous in the cognitive network, both in semantics and elsewhere. If knowledge is generally organized in terms of labelled links, why not also in syntax? But if we do use labelled links (dependencies) in syntax, phrase structure is redundant. Syntactic structures can be much more complex than the example in Figure 9. We shall briefly consider just three kinds of complication: structure-sharing, coordination and unreal words. Structure-sharing is found when one word depends on more than one other word - i.e. when it is 'shared' as a dependent. The notion is familiar from modern phrase-structure analyses, especially Head-driven Phrase Structure Grammar (HPSG) (Pollard and Sag 1994: 19), where it is described as 'the central explanatory mechanism', and it is the main device in WG which allows phrases to be discontinuous. (In
recognizing structure-sharing, WG departs from the European tradition of dependency analysis, which generally allows only strictly 'projective', continuous structures such as Figure 9.) Figure 10 illustrates two kinds of structure-sharing - in raising (you shared by have and been) and in extraction (what shared by have, been, looking and at). The label 'x<' means 'extractee', and 'r' means 'sharer' (otherwise known as 'xcomp' or 'incomplement').
Figure 10

This diagram also illustrates the notion 'surface structure' mentioned above. Each dependency is licensed by the grammar network, but when the result is structure-sharing, just one of these dependencies is drawn above the words; the totality of dependencies drawn in this way constitutes the sentence's surface structure. In principle any of the competing dependencies could be chosen, but in general only one choice is compatible with the 'geometry' of a well-formed surface structure, which must be free of 'tangling' (crossing dependencies - i.e. discontinuous phrases) and 'dangling' (unintegrated words). There are no such constraints on the non-surface dependencies. (For extensive discussion of how this kind of analysis can be built into a parsing algorithm, see Hudson 2000c; for a comparison with phrase-structure analyses of extraction, see Hudson 2003c.) The second complication is coordination. The basis of coordination is that conjuncts must share their 'external' dependencies - dependencies (if any) to words outside the coordination. The structure of the coordination itself (in terms of 'conjuncts' and 'coordinators') is analyzed in terms of 'word-strings', simple undifferentiated strings of words whose internal organization is described in terms of ordinary dependencies. A word-string need not be a phrase, but can consist of two (or more) mutually independent phrases as in the example of Figure 11, where the coordination and conjuncts are bounded by brackets: {[...] [...]}. Unreal words are the WG equivalent of 'empty categories' in other theories. Until recently I rejected such categories for lack of persuasive evidence; for example, my claim has always been that verbs which appeared to have no subject really didn't have any subject at all. So an imperative (Hurry!) had no subject, rather than some kind of covert subject. However, I am now convinced that, for at least some languages, this is wrong.
The evidence comes from case-agreement between subjects and predicatives (WG sharers) in
Figure 11
languages such as Icelandic and Ancient Greek (Hudson 2003a); and the conclusion is that some words have no realization (Creider and Hudson, this volume). In this new analysis, therefore, an imperative verb does have a subject: the word you. This is the ordinary word you, with its ordinary meaning, but exceptionally, it is unrealized because this is what imperative verbs require of their subjects. As Creider and I show, unrealized words may explain a wide range of syntactic facts. This discussion of syntax merely sets the scene for many other syntactic topics, all of which now have reasonably well-motivated WG treatments: word order, agreement, features, case-selection, 'zero' dependents. The most important point made is probably the claim that in syntax the network approach to language and cognition in general leads naturally to dependency analysis rather than to phrase structure.

9. Semantics
As in any other theory, WG has a compositional semantics in which each word in a sentence contributes some structure that is stored as its meaning. However, these meanings are ordinary concepts which, like every other concept, are defined by a network of links to other concepts. This means that there can be no division between 'purely linguistic' meaning and 'encyclopedic' meaning. For instance the lexemes APPLE and PEAR have distinct senses, the ordinary concepts 'apple' and 'pear', each linked to its known characteristics in the network of general knowledge. It would be impossible to distinguish them merely by the labels 'apple' and 'pear' because (as we saw in section 3.2) labels on concepts are just optional mnemonics; the true definition of a concept is provided by its various links to other concepts. The same is true of verb meanings: for example, the sense of EAT is defined by its relationships to other concepts such as 'put', 'mouth', 'chew', 'swallow' and 'food'. The underlying view of meaning is thus similar to Fillmore's Frame Semantics, in which lexical meanings are defined in relation to conceptual 'frames' such as the one for 'commercial transaction' which is exploited by the definitions of 'buy', 'sell' and so on. (See Hudson forthcoming for a WG analysis of commercial transaction verbs.) Like everything else in cognition, WG semantic structures form a network with labelled links like those that are widely used in Artificial Intelligence. As in Jackendoff's Conceptual Semantics (1990), words of all word classes contribute the same kind of semantic structure, which in WG is divided into 'sense' (general categories) and 'referent' (the most specific individual or category
referred to). The contrast between these two kinds of meaning can be compared with the contrast in morphology (section 7) between stem and whole: a word's lexeme provides both its stem and its sense, while its inflection provides its whole and its referent. For example, the word dogs is defined by a combination of the lexeme DOG and the inflection 'plural', so it is classified as 'DOG: plural'. Its lexeme defines the sense, which is 'dog', the general concept of a (typical) dog, while its inflection defines the referent as a set with more than one member. As in other theories the semantics cannot identify the particular set or individual which a word refers to on a particular occasion of use, and which we shall call simply 'set-s'; this identification process must be left to the pragmatics. But the semantics does provide a detailed specification for what that individual referent might be - in this case, a set, each of whose members is a dog. One WG notation for the two kinds of meaning parallels that for the two kinds of word-form: a straight line for the sense and the stem, which are both retrieved directly from the lexicon, and a curved line for the referent and the shape, which both have to be discovered by inference. The symmetry of these relationships can be seen in Figure 12.
Figure 12

The way in which the meanings of the words in a sentence are combined is guided by the syntax, but the semantic links are provided by the senses themselves. Figure 13 gives the semantic structure for Dogs barked, where the link between the word meanings is provided by 'bark', which has an 'agent' link (often abbreviated 'er' in WG) to its subject's referent. If we call the particular
act of barking that this utterance refers to 'event-e', the semantic structure must show that the agent of event-e is set-s. As with nouns, verb inflections contribute directly to the definition of the referent, but a past-tense inflection does this by limiting the event's time to some time ('t1') that preceded the moment of speaking ('now'). Figure 13 shows all these relationships, with the two words labelled 'w1' and 'w2'. For the sake of simplicity the diagram does not show how these word tokens inherit their characteristics from their respective types.
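The structure just described can be rendered as a plain labelled graph. The dictionary layout below is an illustrative assumption; the node and link names ('event-e', 'set-s', 'agent', 't1') follow the text:

```python
# Sketch of the semantic network for 'Dogs barked': event-e is a barking
# whose agent is set-s (a set of dogs) and whose time t1 precedes 'now'.
# The dict-of-dicts layout is an illustrative assumption, not WG notation.

network = {
    "w1": {"lexeme": "DOG",  "inflection": "plural",
           "sense": "dog",  "referent": "set-s"},
    "w2": {"lexeme": "BARK", "inflection": "past",
           "sense": "bark", "referent": "event-e"},
    "set-s":   {"isa": "set",  "member-type": "dog"},
    "event-e": {"isa": "bark", "agent": "set-s", "time": "t1"},
    "t1":      {"before": "now"},
}

# The agent ('er') link connects the verb's referent to its subject's:
assert network["event-e"]["agent"] == network["w1"]["referent"]
```

Every fact in the diagram becomes one labelled link, which is all a network representation needs.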
Figure 13

The analysis of Dogs barked illustrates an important characteristic of WG semantic structures. A word's 'basic' sense - the one that is inherited from its lexeme - is modified by the word's dependents; and the result of this modification is a second sense, more specific than the basic sense but more general than the referent. This intermediate sense contains the meaning of the head word plus its dependent, so in effect it is the meaning of that phrase. In contrast with the syntax, therefore, the semantic structure contains a node for each phrase, as well as nodes for the individual words - in short, a phrase structure. Moreover, this phrase structure must be strictly binary because there are reasons for believing that dependents modify the head word one at a time, each defining a distinct concept, and that the order of combining may correspond roughly to the bracketing found in conventional phrase structure. For example, although subjects and objects are co-dependents, subjects seem to modify the concepts already defined by objects, rather than the other way round, so Dogs chase cats defines the concepts 'chase cats' and 'dogs chase cats', but not 'dogs chase' - in short, a WG semantic structure contains something like a VP node. This step-wise composition of word meanings is called 'semantic phrasing'. This brief account of WG semantics has described some of the basic ideas,
but has not been able to illustrate the analyses that these ideas permit. In the WG literature there are extensive discussions of lexical semantics, and some explorations of quantification, definiteness and mood. However, it has to be said that the semantics of WG is much less well researched than its syntax.

10. Processing

The main achievements on processing are a theory of parsing and a theory of syntactic difficulty; but current research is focused on a general theory of cognitive processing in which language processing falls out as a particular case (Hudson, in preparation). In this theory, processing is driven by a combination of spreading activation, default inheritance and binding. Like any other psychological model, it needs to be tested, and one step towards this has been taken by building two computer systems called WGNet++ (see www.phon.ucl.ac.uk/home/WGNet/wgnet++.htm) and Babbage (www.babbagenet.org) for experimenting with complex networks. The most obvious advantage of WG for a parser, compared with transformational theories, is the lack of freely-occurring 'invisible' words (in contrast with the unrealized words discussed above, which can always be predicted from other realized words such as imperative verbs); but the dependency basis also helps by allowing each incoming word to be integrated with the words already processed without the need to build (or rebuild) higher syntactic nodes. A very simple algorithm guides the search for dependencies in a way that guarantees a well-formed surface structure (in the sense defined in section 8): the current word first tries to 'capture' the nearest non-dependent word as its dependent, and if successful repeats the operation; then it tries to 'submit' as a dependent to the nearest word that is not part of its own phrase (or, if unsuccessful, to the word on which this word depends, and so on recursively up the dependency chain); and finally it checks for coordination.
(More details can be found in Hudson 2000c.) The algorithm is illustrated in the following sequence of 'snapshots' in the parsing of Short sentences make good examples, where the last word illustrates the algorithm best. The arrows indicate syntactic dependencies without the usual labels; and it is to be understood that the semantic structure is being built simultaneously, word by word. The structure after ':-' is the output of the parser at that point.

(4) a. w1 = short.     No progress :- w1.
    b. w2 = sentences. Capture     :- w1 w2.
    c. w3 = make.      Capture     :- w1 w2 w3.
    d. w4 = good.      No progress :- w1 w2 w3 w4.
    e. w5 = examples.  Capture     :- w1 w2 w3 (w4 <-) w5.
    f.                 Submit      :- w1 w2 w3 w4 w5.
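The capture/submit loop can be sketched as follows. The grammar network is reduced to an invented can_depend(head, dependent) oracle, and coordination and non-surface dependencies are omitted, so this is a toy illustration of the control flow rather than Hudson's (2000c) parser:

```python
# Toy sketch of the incremental capture-then-submit loop described above.
# The can_depend oracle and the ALLOWED table are invented stand-ins for
# the grammar network; coordination is not handled.

def phrase_start(heads, i):
    """Leftmost word inside word i's phrase (i plus its dependents' phrases)."""
    start = i
    for d, h in heads.items():
        if h == i:
            start = min(start, phrase_start(heads, d))
    return start

def parse(words, can_depend):
    heads = {}  # dependent index -> head index (the surface structure)
    for i in range(len(words)):
        # Capture: take the nearest headless word to the left, repeating
        # while successful; stop at the first refusal.
        while True:
            cand = next((j for j in range(i - 1, -1, -1) if j not in heads), None)
            if cand is not None and can_depend(words[i], words[cand]):
                heads[cand] = i
            else:
                break
        # Submit: offer to depend on the nearest word outside i's own
        # phrase, climbing the dependency chain after each refusal.
        j = phrase_start(heads, i) - 1
        while j is not None and j >= 0:
            if can_depend(words[j], words[i]):
                heads[i] = j
                break
            j = heads.get(j)
    return heads

WORDS = "short sentences make good examples".split()
ALLOWED = {("sentences", "short"), ("make", "sentences"),
           ("examples", "good"), ("make", "examples")}
deps = parse(WORDS, lambda head, dep: (head, dep) in ALLOWED)
print(deps)  # {0: 1, 1: 2, 3: 4, 4: 2}; 'make' (w3) is the root
```

Tracing the last word reproduces the snapshots above: examples first captures good, then submits to make.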
The familiar complexities of syntax are mostly produced by discontinuous patterns. As explained in section 8, the discontinuous phrases are shown by dependencies which are drawn beneath the words, leaving a straightforward
surface structure. For example, subject-raising in He has been working is shown by non-surface subject links from both been and working to he. Once the surface structure is in place, these extra dependencies can be inferred more or less mechanically (bar ambiguities), with very little extra cost to the parser. The theory of syntactic complexity (Hudson 1996b) builds on this incremental parsing model. The aim of the parser is to link each word as a dependent to some other word, and this link can most easily be established while both words are still active in working memory. Once a word has become inactive it can be reconstructed (on the basis of the meaning that it contributed), but this is costly. The consequence is that short links are always preferred to long ones. This gives a very simple basis for calculating the processing load for a sentence (or even for a whole text): the mean 'dependency distance' (calculated as the number of other words between a word and the word on which it depends). Following research by Gibson (1998) the measure could be made more sophisticated by weighting intervening words, but even the simple measure described here gives plausible results when applied to sample texts (Hiranuma 2001). It is also supported by a very robust statistic about English texts: that dependency links tend to be very short. (Typically 70 per cent of words are adjacent to the word on which they depend, with 10 per cent variation in either direction according to the text's difficulty.)

11. Conclusions

WG addresses questions from a number of different research traditions. As in formal linguistics, it is concerned with the formal properties of language structure; but it also shares with cognitive linguistics a focus on how these structures are embedded in general cognition. Within syntax, it uses dependencies rather than phrase structure but also recognizes the rich structures that have been highlighted in the phrase-structure tradition.
In morphology it follows the European tradition which separates morphology strictly from syntax, but also allows exceptional words which (thanks to cliticization) contain the forms of smaller words. And so on through other areas of language. Every theoretical decision is driven by two concerns: staying true to the facts of language, and providing the simplest possible explanation for these facts. The search for new insights is still continuing, and more cherished beliefs may well have to be abandoned; but the most general conclusion so far seems to be that language is mostly very much like other areas of cognition.

References

Aitchison, Jean (1987), Words in the Mind: An Introduction to the Mental Lexicon. Oxford: Blackwell.
Altmann, Gerry (1997), The Ascent of Babel: An Exploration of Language, Mind and Understanding. Oxford: Oxford University Press.
Anderson, John (1971), 'Dependency and grammatical functions'. Foundations of Language, 7, 30-7.
Bates, Elizabeth, Elman, Jeffrey, Johnson, Mark, Karmiloff-Smith, Annette, Parisi,
Domenico and Plunkett, Kim (1998), 'Innateness and emergentism', in William Bechtel and George Graham (eds), A Companion to Cognitive Science. Oxford: Blackwell, pp. 590-601.
Bennett, David (1994), 'Stratificational Grammar', in Ronald Asher (ed.), Encyclopedia of Language and Linguistics. Oxford: Elsevier, pp. 4351-56.
Bresnan, Joan (1978), 'A realistic transformational grammar', in Morris Halle, Joan Bresnan, and George Miller (eds), Linguistic Theory and Psychological Reality. Cambridge, MA: MIT Press, pp. 1-59.
Brown, Dunstan, Corbett, Greville, Fraser, Norman, Hippisley, Andrew and Timberlake, Alan (1996), 'Russian noun stress and network morphology'. Linguistics, 34, 53-107.
Butler, Christopher (1985), Systemic Linguistics: Theory and Application. London: Arnold.
Camdzic, Amela and Hudson, Richard (2002), 'Clitics in Serbo-Croat-Bosnian'. UCL Working Papers in Linguistics, 14, 321-54.
Chekili, Ferid (1982), 'The Morphology of the Arabic Dialect of Tunis' (Unpublished doctoral dissertation, University of London).
Chomsky, Noam (1957), Syntactic Structures. The Hague: Mouton.
— (1965), Aspects of the Theory of Syntax. Cambridge, MA: MIT Press.
— (1970), 'Remarks on nominalization', in Roderick Jacobs and Peter Rosenbaum (eds), Readings in Transformational Grammar. London: Ginn, pp. 184-221.
Creider, Chet (1999), 'Mixed categories in Word Grammar: Swahili infinitival nouns'. Linguistica Atlantica, 21, 53-68.
Creider, Chet and Hudson, Richard (1999), 'Inflectional morphology in Word Grammar'. Lingua, 107, 163-87.
— (this volume), 'Case agreement in Ancient Greek: implications for a theory of covert elements'.
Eppler, Eva (2005), 'The Syntax of German-English code-switching' (Unpublished doctoral dissertation, UCL).
Evans, Roger and Gazdar, Gerald (1996), 'DATR: a language for lexical knowledge representation'. Computational Linguistics, 22, 167-216.
Fillmore, Charles (1975), 'An alternative to checklist theories of meaning'. Proceedings of the Berkeley Linguistics Society, 1, 123-31.
— (1976), 'Frame semantics and the nature of language'. Annals of the New York Academy of Sciences, 280, 20-32.
Fraser, Norman (1985), 'A Word Grammar Parser' (Unpublished doctoral dissertation, University of London).
— (1989), 'Parsing and dependency grammar'. UCL Working Papers in Linguistics, 1, 296-319.
— (1993), 'Dependency Parsing' (Unpublished doctoral dissertation, UCL).
— (1994), 'Dependency Grammar', in Ronald Asher (ed.), Encyclopedia of Language and Linguistics. Oxford: Elsevier, pp. 860-4.
Fraser, Norman and Hudson, Richard (1992), 'Inheritance in Word Grammar'. Computational Linguistics, 18, 133-58.
Gibson, Edward (1998), 'Linguistic complexity: locality of syntactic dependencies'. Cognition, 68, 1-76.
Gisborne, Nikolas (1993), 'Nominalisations of perception verbs'. UCL Working Papers in Linguistics, 5, 23-44.
— (1996), 'English Perception Verbs' (Unpublished doctoral dissertation, UCL).
— (2000), 'The complementation of verbs of appearance by adverbs', in Ricardo Bermudez-Otero, David Denison, Richard Hogg and C. McCully (eds), Generative
Theory and Corpus Studies: A Dialogue from 10 ICEHL. Berlin: Mouton de Gruyter, pp. 53-75.
— (2001), 'The stative/dynamic contrast and argument linking'. Language Sciences, 23, 603-37.
Goldberg, Adele (1995), Constructions: A Construction Grammar Approach to Argument Structure. Chicago: University of Chicago Press.
Gorayska, Barbara (1985), 'The Semantics and Pragmatics of English and Polish with Reference to Aspect' (Unpublished doctoral dissertation, UCL).
Halliday, Michael (1961), 'Categories of the theory of grammar'. Word, 17, 241-92.
— (1967-8), 'Notes on transitivity and theme in English'. Journal of Linguistics, 3, 37-82, 199-244; 4, 179-216.
— (1985), An Introduction to Functional Grammar. London: Arnold.
Hiranuma, So (1999), 'Syntactic difficulty in English and Japanese: A textual study'. UCL Working Papers in Linguistics, 11, 309-21.
— (2001), 'The Syntactic Difficulty of Japanese Sentences' (Unpublished doctoral dissertation, UCL).
Holmes, Jasper (2004), 'Lexical Properties of English Verbs' (Unpublished doctoral dissertation, UCL).
Holmes, Jasper and Hudson, Richard (2005), 'Constructions in Word Grammar', in Jan-Ola Ostman and Mirjam Fried (eds), Construction Grammars: Cognitive Grounding and Theoretical Extensions. Amsterdam: Benjamins, pp. 243-72.
Hudson, Richard (1964), 'A Grammatical Analysis of Beja' (Unpublished doctoral dissertation, University of London).
— (1970), English Complex Sentences: An Introduction to Systemic Grammar. Amsterdam: North-Holland.
— (1976), Arguments for a Non-transformational Grammar. Chicago: University of Chicago Press.
— (1980a), Sociolinguistics. Cambridge: Cambridge University Press.
— (1980b), 'Constituency and dependency'. Linguistics, 18, 179-98.
— (1981), 'Panlexicalism'. Journal of Literary Semantics, 10, 67-78.
— (1984), Word Grammar. Oxford: Blackwell.
— (1989), 'Towards a computer-testable Word Grammar of English'. UCL Working Papers in Linguistics, 1, 321-39.
— (1990), English Word Grammar. Oxford: Blackwell.
— (1992), 'Raising in syntax, semantics and cognition', in Iggy Roca (ed.), Thematic Structure: Its Role in Grammar. The Hague: Mouton, pp. 175-98.
— (1993), 'Do we have heads in our minds?', in Greville Corbett, Scott McGlashen and Norman Fraser (eds), Heads in Grammatical Theory. Cambridge: Cambridge University Press, pp. 266-91.
— (1995), Word Meaning. London: Routledge.
— (1996a), Sociolinguistics (2nd edition). Cambridge: Cambridge University Press.
— (1996b), 'The difficulty of (so-called) self-embedded structures'. UCL Working Papers in Linguistics, 8, 283-314.
— (1997a), 'The rise of auxiliary DO: verb non-raising or category-strengthening?'. Transactions of the Philological Society, 95, 41-72.
— (1997b), 'Inherent variability and linguistic theory'. Cognitive Linguistics, 8, 73-108.
— (1998), English Grammar. London: Routledge.
— (2000a), '*I amn't'. Language, 76, 297-323.
— (2000b), 'Gerunds and multiple default inheritance'. UCL Working Papers in Linguistics, 12, 303-35.
— (2000c), 'Discontinuity'. Traitement Automatique des Langues, 41, 15-56.
— (2001), 'Clitics in Word Grammar'. UCL Working Papers in Linguistics, 13, 293-4.
— (2003a), 'Case-agreement, PRO and structure-sharing'. Research in Language, 1, 7-33.
— (2003b), 'Mismatches in default inheritance', in Elaine Francis and Laura Michaelis (eds), Linguistic Mismatch: Scope and Theory. Stanford: CSLI, pp. 269-317.
— (2003c), 'Trouble on the left periphery'. Lingua, 113, 607-42.
— (2004), 'Are determiners heads?'. Functions of Language, 11, 7-43.
— (forthcoming), 'Buying and selling in Word Grammar', in Jozsef Andor and Peter Pelyvas (eds), Empirical, Cognitive-Based Studies in the Semantics-Pragmatics Interface. Oxford: Elsevier.
— (in preparation), Advances in Word Grammar. Oxford: Oxford University Press.
Hudson, Richard and Holmes, Jasper (2000), 'Re-cycling in the Encyclopedia', in Bert Peeters (ed.), The Lexicon/Encyclopedia Interface. Oxford: Elsevier, pp. 259-90.
Jackendoff, Ray (1990), Semantic Structures. Cambridge, MA: MIT Press.
Karmiloff-Smith, Annette (1992), Beyond Modularity: A Developmental Perspective on Cognitive Science. Cambridge, MA: MIT Press.
Kaplan, Ron and Bresnan, Joan (1982), 'Lexical-functional Grammar: a formal system for grammatical representation', in Joan Bresnan (ed.), The Mental Representation of Grammatical Relations. Cambridge, MA: MIT Press, pp. 173-281.
Kreps, Christian (1997), 'Extraction, Movement and Dependency Theory' (Unpublished doctoral dissertation, UCL).
Lakoff, George (1987), Women, Fire and Dangerous Things: What Categories Reveal about the Mind. Chicago: University of Chicago Press.
Lamb, Sidney (1966), An Outline of Stratificational Grammar. Washington, DC: Georgetown University Press.
— (1999), Pathways of the Brain: The Neurocognitive Basis of Language. Amsterdam: Benjamins.
Langacker, Ronald (1987), Foundations of Cognitive Grammar I: Theoretical Prerequisites. Stanford: Stanford University Press.
— (1990), Concept, Image and Symbol: The Cognitive Basis of Grammar. Berlin: De Gruyter.
Luger, George and Stubblefield, William (1993), Artificial Intelligence: Structures and Strategies for Complex Problem Solving. Redwood City, CA: Benjamin/Cummings.
Lyons, John (1977), Semantics. Cambridge: Cambridge University Press.
McCawley, James (1968), 'Concerning the base component of a transformational grammar'. Foundations of Language, 4, 243-69.
Pinker, Steven (1994), The Language Instinct. Harmondsworth: Penguin Books.
Pinker, Steven and Prince, Alan (1988), 'On language and connectionism: Analysis of a Parallel Distributed Processing model of language acquisition'. Cognition, 28, 73-193.
Pollard, Carl and Sag, Ivan (1994), Head-driven Phrase Structure Grammar. Chicago: University of Chicago Press.
Reisberg, Daniel (1997), Cognition: Exploring the Science of the Mind. New York: W. W. Norton.
Robinson, Jane (1970), 'Dependency structures and transformational rules'. Language, 46, 259-85.
Rosta, Andrew (1994), 'Dependency and grammatical relations'. UCL Working Papers in Linguistics, 6, 219-58.
— (1996), 'S-dependency'. UCL Working Papers in Linguistics, 8, 387-421.
— (1997), 'English Syntax and Word Grammar Theory' (Unpublished doctoral dissertation, UCL).
Shaumyan, Olga (1995), 'Parsing English with Word Grammar' (MSc thesis, Imperial College London).
Sugayama, Kensei (1991), 'More on unaccusative Sino-Japanese complex predicates in Japanese'. UCL Working Papers in Linguistics, 3, 397-415.
— (1992), 'A word-grammatic account of complements and adjuncts in Japanese (interim report)'. Kobe City University Journal, 43, 89-99.
— (1993), 'A word-grammatic account of complements and adjuncts in Japanese'. Proceedings of the 15th International Congress of Linguists, Vol. 2. Université Laval, pp. 373-6.
— (1996), 'Semantic structure of eat and its Japanese equivalent taberu: a Word-Grammatic account'. Translation and Meaning, 4, 193-202.
Taylor, John (1989), Linguistic Categorisation: An Essay in Cognitive Linguistics. Oxford: Oxford University Press.
Tesnière, Lucien (1959), Éléments de Syntaxe Structurale. Paris: Klincksieck.
Touretzky, David (1986), The Mathematics of Inheritance Systems. Los Altos, CA: Morgan Kaufmann.
Tzanidaki, Dimitra (1995), 'Greek word order: towards a new approach'. UCL Working Papers in Linguistics, 7, 247-77.
— (1996a), 'Configurationality and Greek clause structure'. UCL Working Papers in Linguistics, 8, 449-84.
— (1996b), 'The Syntax and Pragmatics of Subject and Object Position in Modern Greek' (Unpublished doctoral dissertation, UCL).
van Langendonck, Willy (1987), 'Word Grammar and child grammar'. Belgian Journal of Linguistics, 2, 109-32.
— (1994), 'Determiners as heads?'. Cognitive Linguistics, 5, 243-59.
Volino, Max (1990), 'Word Grammar, Unification and the Syntax of Italian Clitics' (Unpublished doctoral dissertation, Edinburgh University).
Zwicky, Arnold (1977), On Clitics. Bloomington: IULC.
— (1992), 'Some choices in the theory of morphology', in Robert Levine (ed.), Formal Grammar: Theory and Implementation. Oxford: Oxford University Press, pp. 327-71.
Part I
Word Grammar Approaches to Linguistic Analysis: Its explanatory power and applications
2. Case Agreement in Ancient Greek: Implications for a theory of covert elements CHET CREIDER AND RICHARD HUDSON
Abstract

In Ancient Greek a predicative adjective or noun agrees in case with the subject of its clause, even if the latter is covert. This provides compelling evidence for 'empty' (i.e. covert) elements in syntax, contrary to the tradition in WG theory. We present an analysis of empty elements which exploits a feature unique to WG, the separation of 'existence' propositions from propositions dealing with other properties; an empty word has the property of not existing (or, more technically, a quantity of 0). We contrast this theory with the Chomskyan PRO and with the 'potential' SUBCAT list of Head-driven Phrase Structure Grammar (HPSG).

1. Introduction
Case agreement in Ancient Greek1 has attracted a small but varied set of treatments in the generative tradition (Andrews 1971; Lecarme 1978; Quicoli 1982). In this literature the problems were framed and solved in transformational frameworks. In the present chapter we wish to consider the data from the point of view of the problems they pose for a theory of case assignment and phonologically empty elements in a modern, declarative framework - Word Grammar (WG; Hudson 1990). We present an analysis of empty elements which exploits a feature unique to WG, the separation of existence propositions from propositions dealing with other properties; and we contrast it with earlier WG analyses in which these 'empty' elements are simply absent, and with Chomskyan analyses in terms of PRO, a specific linguistic item which is always covert. The proposed analysis is similar in some respects to the one proposed by Pollard and Sag (1994) for HPSG.

2. The Data
We confine our attention to infinitival constructions. The infinitive in Ancient Greek is not inflected for person, number or case and hence, when predicate adjectives and predicate nominals appear as complements of infinitives, it is necessary to account for the case of these elements. One purpose of this discussion is to show that traditional grammars are right to explain the case of predicates in terms of agreement with the subject, but this analysis works most
naturally if we also assume some kind of 'null' subject for some infinitives. The examples that support the null subject take the accusative case and are discussed in section 2.1; there are well-known exceptions, traditionally described in terms of 'attraction', which are discussed in section 2.2.

2.1 Accusative subjects

Traditional grammars of Greek state that the subject of an infinitive takes the accusative case. Examples are usually given of the accusative plus infinitive construction, as in the following:
(1) ekeleuon autous poreuesthai
    they-ordered them(acc) to-proceed
    'they ordered that they should proceed' (Smyth 1956: 260, X. A. 4.2.12)

(2) apelthein tous andras phe:si
    to-go-away the(acc) men(acc) s/he-says
    's/he says that the men went away' (Goodwin 1930: 196)
A partial syntactic analysis of (2) is shown in Figure 1. In this analysis the infinitive is a dependent (object) of the main verb, and it has a dependent (subject) which bears the accusative case. We assume a standard analysis in which the definite determiner is a head with respect to a 'determiner phrase'.
Figure 1

Since the subject is accusative, elements (predicate nouns and adjectives) which agree with it are accusative (in contrast with the nominative case found when the subject is nominative):

(3) a. phugas e:n Klearkhos
       exile(nom) was Clearchus(nom) (contrast phugada, 'exile(acc)')
       'Clearchus was an exile.' (X. A. 1.1.9)
    b. nomizo: gar huma:s emoi einai kai patrida kai philous
       I-think for you(acc) me(dat) to-be and fatherland(acc) and friends(acc)
       'for I think you are to me both fatherland and friends' (X. A. 1.3.6)
The agreement of a predicative with the subject can be conveniently diagrammed as in Figure 2. (In words: whatever a verb's subject and predicative may be, their case must be the same.)
Figure 2

However, note that the predicative may be accusative even when the accusative subject is itself absent:

(4) philanthro:pon einai dei
    humane(acc) to-be must
    '(one) must be humane' (I. 2.15)

(5) oud' ara po:s e:n en pantess' ergoisi dae:mona pho:ta genesthai
    not then in-any-way was in all works skilled(acc) man(acc) to-become
    '(one) could not then in any way become a man skilled in all works' (H. Il. 23.670-71)

This can also be true even when there is a coreferential element in the higher clause:

(6) exarkesei soi turannon genesthai
    it-will-suffice you(dat) king(acc) to-become
    'it will be enough for you to become king' (Liddell and Scott 1971, P. Alc. 2.141.a)
Figure 3
A partial structure for (6) is presented in Figure 3. Such examples raise urgent questions about the status of 'understood' subjects. If an understood subject is simply one which is 'understood' in the semantics but entirely absent from the syntax, it is hard to explain the case of the predicative in these examples. We return to these questions below. When the subject of the infinitive is identical to that of the main verb, it is normally not expressed:

(7) all' hod' ane:r ethelei peri panto:n emmenai allo:n
    but this man(nom) he-wishes above all(gen) to-be others(gen)
    'but this man wishes to be above all others' (H. Il. 1.287)
Agreeing elements may nevertheless appear in the accusative:

(8) enth' eme men pro:tisth' hetaroi lissonto
    then me(acc) then first-of-all companions(nom) they-begged
    epeesin turo:n ainumenous ienai palin
    words(dat) cheeses(gen) taking(acc-pl) to-go back
    'thereupon my companions then first of all begged me with words to take (i.e. that they might take) some of the cheeses (and) to depart' (H. Od. 9.224-5)
Examples like (8) are hard to explain without assuming some kind of accusative subject for the infinitive with which the predicative participle ('taking') can agree, as hinted at in Figure 4; but of course there is no overt accusative subject.
Figure 4

In situations of emphasis, an infinitival subject may be expressed (Chantraine 1953: 312; Kühner and Gerth 1955, 2: 30-31; Smyth 1956: 439). When expressed it appears in the accusative case:

(9) ego: men toinun eukhomai prin tauta epidein huph' humo:n
    I(nom) as-for therefore pray (that) before these-things to-see by you
    genomena murias eme ge kata te:s ge:s orguias
    having-become 10,000 me(acc) indeed under the earth fathoms
    genesthai
    to-become
    'for my part, therefore, I pray that before I see these things having been brought about by you, I may be ten thousand fathoms under the earth' (Kühner and Gerth 1955: 31, X. A. 7.1.30)
(10) hoi Aiguptioi enomizon heo:utous pro:tous genesthai
     the(nom) Egyptians(nom) they-thought themselves(acc) first(acc) to-be
     panto:n anthro:po:n
     all(gen) human-beings(gen)
     'the Egyptians used to think they were the first of all human beings' (Kühner and Gerth 1955: 31, Hdt. 2.2)

(11) to:n d' allo:n eme phe:mi polu propheresteron einai
     of-those others me(acc) I-say by-far better(acc) to-be
     'but of those others I say I am better by far' (H. Od. 8.221)

(Other examples for Homeric Greek in Il. 7.198, 13.269, 20.361 - Chantraine 1953: 312.)

The emphasis need not be strong, as the following example, with unstressed clitic pronoun, shows:

(12) kai te me phe:si makhe: Tro:essi are:gein
     and in-fact me(acc) s/he-says battle(dat) Trojans(dat) to-help
     'and she says that I help the Trojans in battle' (H. Il. 1.521)

When the infinitive is used in exclamations with an overt subject, the latter appears in the accusative:

(13) eme pathein tade
     me(acc) to-suffer this
     'That I should suffer this!' (A. Eum. 837)

These examples with overt accusative subjects strongly support the traditional rule that infinitives have accusative subjects, so the question is how to allow this generalization to extend to infinitives which appear to have no subject at all, in order to explain the accusative cases found on predicatives in infinitival clauses in examples such as (4) and (5).

2.2 Non-accusative subjects

Greek provides a number of interesting alternatives to the possibilities given in section 2.1. These are traditionally discussed under two headings, although the process is the same in both cases:

Sehr viele der Verben, die den Infinitiv zu sich nehmen, haben daneben noch ein persönliches Objekt bei sich, welches in dem Kasus steht, den das Verb erfordert... Wenn zu dem Infinitive adjektivische oder substantivische Prädikatsbestimmungen treten, so stehen dieselben entweder vermittelst einer Attraktion mit dem persönlichen Objekte in gleichem Kasus oder mit Vernachlässigung der Attraktion im Akkusative (Kühner and Gerth 1955: 24)3

Wenn aber das Subjekt des regierenden Verbs zugleich auch das Subjekt des Infinitivs ist, so wird das Subjekt des Infinitivs... weggelassen, und wenn adjektivische oder substantivische Prädikatsbestimmungen bei dem Infinitive stehen, so werden diese vermittelst der Attraktion in den Nominativ gesetzt (ibid.: 29)4
In short, the predicative of an infinitive may have a case which is 'attracted' to that of a nominal in the higher clause, whether its object or its subject. Examples:

(14) emoi de ke kerdion eie: seu aphamartouse: khthona dumenai
     me(dat) but would better it-be you(gen) losing(dat) earth(acc) to-go(beneath)
     'but for me it would be better losing you to die' (H. Il. 6.410-11)

(15) dokeo: he:mi:n Aigine:teo:n deesthai ton theon khre:sai
     I-think us(dat) Aeginetans(gen) to-beg the(acc) god(acc) to-advise
     timo:re:te:ro:n genesthai
     helpers(gen) to-become
     'I think the god has advised us to beg the Aeginetans to become (our) helpers' (Hdt. 5.80)

Examples of attraction show that some infinitives do not have accusative subjects, but they do not undermine the generalization that many do. The analysis of attraction is tangential to our present concern, but is easily accomplished via 'structure-sharing',5 where the higher nominal doubles up as subject of the infinitival clause - for example, in (15) the genitive noun 'Aeginetans' is not only the complement of the higher verb 'beg' but also subject of the lower infinitive. The proposed structure for this remarkably complicated sentence is shown in Figure 5.
Figure 5

This analysis easily explains why the lower nominal predicate 'helpers' has the same case as this shared nominal, but it does not help with examples where even the higher nominal is merely implicit, as in (16). (We give an explanation for examples of this type in section 6.)

(16) ethelo: de toi e:pios einai
     I-wish but you(dat) kind(nom) to-be
     'but I wish to be kind to you' (H. Il. 8.40)
According to Chantraine (1953: 313) the relative frequency of attraction increased from Homer forward into Attic authors, and in the Attic period it appears to have been obligatory in cases like (16), i.e. there are no examples like (8) in Attic Greek. This may have been the reason traditional grammars discuss attraction under two headings, one for nominative cases and the other for oblique cases.

3. The Analysis of Case Agreement
First, note that morphological case, unlike gender or number, is a purely morpho-syntactic property, so it is available to words and not to their meanings. One consequence is that it is independent of reference.

(17) I saw him yesterday while he was on his way to the beach.
In (17) the three pronouns share a common set of semantic features (gender and number) and have a common referent, but occur respectively in the 'objective', 'subjective' and 'possessive' cases (to use the terms of traditional grammar). So far as we know, semantic co-reference between a nominal and a pronoun never triggers case agreement in the latter, though it often requires agreement for number and gender. A further consequence is that the only possible 'target' for case agreement is a word (or other syntactic element); this rules out a semantic account of case agreement, however attractive such an account might be for number and gender. Thus, faced with examples such as (18=6), where an infinitive has an accusative predicative but no overt subject, we cannot explain the predicative's case merely by postulating a subject argument in the semantic structure without a supporting syntactic subject.

(18) exarkesei soi turannon genesthai
     it-will-suffice you(dat) king(acc) to-become
     'it will be enough for you to become king'
The argument X in the semantic structure 'X becoming a king' cannot by itself carry the accusative case; there must also be some kind of accusative subject nominal in the syntactic structure. Nor does a control analysis help, because the controlling nominal is the pronoun soi, 'to you', which is dative; as expected, its case has nothing to do with that of coreferential nominals. The analysis seems to show that a specifically syntactic subject must be present in order to account for the accusative case seen on the predicate nominal. We accept this conclusion, though somewhat reluctantly because it conflicts with the stress on concreteness which we consider an important principle of Word Grammar. We are sceptical about the proliferation of empty nodes in Chomskyan theories, and have always felt that the evidence for such nodes rested heavily on theory-internal assumptions which we did not share. In contrast, the evidence from case agreement strikes us as very persuasive, so we now believe that syntactic structure may contain some 'null' elements which are
not audible or visible, such as a case-bearing subject in Ancient Greek infinitival clauses; this is the analysis that we shall develop in the rest of this chapter. (For a fuller statement of this argument and conclusion, see Hudson 2003.) Fortunately, it seems that this evidence is supported by completely independent data from other languages. For example, Welsh soft mutation and agreement are much easier to explain if we allow syntactic subjects to be inaudible (Borsley 2005). Welsh mutation applies to verb dependents which are separated from the verb. Normally these are objects, as in (19):

(19) Gweles (i) gi.
     saw-1SG (I) dog
     'I saw a dog.'
Here gi is the mutated form of ci, 'dog', whose form shows it to be object rather than subject even when the optional subject i is omitted. Conversely, however, subjects are also mutated if they are delayed, as in (20): (20)
(20) a. Mae ci yn yr ardd.
        is dog in the garden
     b. Mae yn yr ardd gi.
     'A dog is in the garden.'
In sentence (a), ci is in the unmutated form expected of a subject, but it is mutated in (b) because it has been separated from the verb mae. The generalization seems to be that if a subject or object dependent is separated from the verb, it is mutated; but this generalization presupposes that there is always a syntactic subject in examples like (19), even when no subject is audible. The same conclusion also simplifies the treatment of verb agreement, which is confined to verbs whose subject is a personal pronoun. In (19), the suffix {es} on gweles can be said to agree with a first-person singular subject even when this is covert. In short, inaudible subjects make the grammar of Welsh simpler and more explanatory, a possibility which we assume occurs to naive learners of the language as well as to linguists.

In the rest of this chapter we explore the notion 'null element' within the theoretical framework of Word Grammar. What exactly does it mean to say that an element is 'null' in a cognitive theory of language which maximizes the similarities between language and other kinds of cognition? Having introduced the relevant ideas we shall contrast our view of null elements with the more familiar ideas about elements such as PRO and pro, as well as with other proposals from the WG and HPSG traditions.

4. Non-Existent Entities in Cognition and in Language
One of the rather obvious facts about everyday knowledge is that we know things about entities which we know not to exist. For example, we know that Father Christmas brings presents, wears a red coat and has a beard; but we also know that he doesn't exist. How can we know the characteristics of a non-existent object? The answer must be that 'existence' is somehow separable from other kinds of characteristic. However, there is a serious danger of an internal contradiction, because it is also clear that the concept of Father Christmas does exist, complete with the links to beards, red coats and presents, even for those of us who know he does not exist. How can this contradiction be avoided?

One possible answer follows from a basic assumption of Word Grammar: that tokens and types are distinct concepts, each with a separate representation in the mind (Hudson 1984: 24; Hudson 1990: 31-2). Tokens exist in the realm of ongoing experience, while types exist in permanent knowledge; in other words, roughly speaking, tokens are represented in working memory and types in long-term memory. For example, when we see a bird, we assume that we introduce a new concept to represent it in our minds, a token concept which is distinct from every permanent concept we have for birds or bird species. Having introduced this distinct concept we can then classify it (e.g. it's a robin), notice unusual features and remember it. None of this is possible if tokens and types share the same mental nodes.

Another difference between tokens and types is that tokens are part of our ongoing experience; in short, they are 'real', whereas types are merely memories and may even be fictional. For example, we have a permanent concept for Father Christmas complete with a list of attributes. This concept is just like those for other people except that we know he's not real; in other words, we know we will never meet a token of Father Christmas (even if we do meet tokens of people pretending to be tokens of Father Christmas). This contrast between real and unreal types can be captured in WG by an attribute which we call 'quantity' (the 'quantitator' of Hudson 1990: 23).
If every token of experience has a quantity of 1, then real and unreal types can be distinguished by their quantities - 1 for real and 0 for unreal. If Father Christmas has a quantity of 0, then any putative token of Father Christmas would be (highly) exceptional.

The example of Father Christmas is rather isolated, because most of our concepts are based firmly on experience. However, there is an even more important role for the quantity variable, which is in inherited attributes. For example, the default person has a foot on each leg, but some unfortunate individuals have lost a foot. The quantity variable provides a mechanism for stating the default (each leg has one foot), and then for overriding it in the case of specified individuals (e.g. the pantomime character Long John Silver has no foot on his left leg). Potentially inheritable attributes of this kind are a common part of experience and provide an important role for the quantity variable.

Strictly speaking, quantity is a function like any other, but it plays such a basic role that it is convenient to abbreviate it as a mere number on the node itself. Using this notation (together with the triangles standardly used in WG notation for the 'is-a' relation) the facts about feet may be shown in a diagram such as Figure 6. In prose, a typical person has one left leg, which has one foot; but although Long John Silver has the expected number of left legs (shown as a mere dot, which inherits the default quantity), this leg has zero feet.
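The mechanism just described (default inheritance of attributes, with a quantity value that a more specific node can override) can be sketched in a few lines of code. The sketch below is ours, not part of the WG formalism; the class and attribute names are invented for illustration.

```python
# Illustrative sketch (not part of the WG formalism): default inheritance
# along 'is-a' links, with locally stated attributes overriding defaults.

class Node:
    def __init__(self, name, isa=None, **attrs):
        self.name = name
        self.isa = isa          # the more general node this one 'is-a'
        self.attrs = attrs      # locally stated attributes (overrides)

    def get(self, attr):
        """A local value overrides an inherited one; otherwise inherit."""
        if attr in self.attrs:
            return self.attrs[attr]
        if self.isa is not None:
            return self.isa.get(attr)
        return None

# The default person: one left leg, and that leg has one foot.
person = Node('person', left_leg_quantity=1, left_foot_quantity=1)

# Long John Silver inherits the default leg but overrides the foot's
# quantity to 0, just as in Figure 6.
ljs = Node('Long John Silver', isa=person, left_foot_quantity=0)

print(person.get('left_foot_quantity'))  # 1 (default)
print(ljs.get('left_leg_quantity'))      # 1 (inherited default)
print(ljs.get('left_foot_quantity'))     # 0 (overridden)
```

The same lookup rule serves both cases: the unmarked token simply inherits, while the exceptional one carries its own value.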
Figure 6

Returning to the analysis of grammatical structure, we see no reason to recognize fictional words as such - after all, how could one learn them? (Notice that we learn about Father Christmas via verbal and visual representations, for which there is no parallel in language.) In other words, we see no justification for lexical items such as the Chomskyan PRO which are inherently inaudible. However, we do see an important role for dependents that are merely potential (like a missing left foot). For example, take syntactic objects. On the one hand, we know that they are typically nouns, that they typically express the patient or theme of the action, that they follow the verb, and so on; it is essential to state all these properties just once, at the highest level (i.e. as a characteristic of the typical verb or even of the typical word). But on the other hand, we also know that some verbs require an object, others merely allow one, and others again refuse one. It is essential to be able to separate these statements of 'existence' from all the other properties of objects, and the obvious mechanism is the quantity variable introduced above. The default object may have a quantity which is compatible with either 1 or 0, but this default is overridden for individual verbs which are obligatorily transitive or intransitive.

Similar remarks apply to subjects, the topic of this chapter, but first we must distinguish two different kinds of 'null' dependent. On the one hand, there are dependents that are optional in the semantics as well as in the syntax; for example, many verbs allow a beneficiary (e.g. make her a cake), but in the absence of a syntactic beneficiary there is no reason to assume a semantic one. In a sentence such as She made a cake, therefore, there is no beneficiary dependent although one is possible. In this case, therefore, the dependent itself has quantity 0.
In many other cases, on the other hand, the null dependent does contribute to the semantics; for instance, even when DRINK has no object (He drank till he fell asleep), its semantic structure certainly includes some liquid which by default is alcohol. In this case, we assume that the syntax does contain an object noun, an ordinary noun such as ALCOHOL complete with its ordinary meaning; but, exceptionally, it has no realization in form - no stem or fully inflected form. Null subjects in English are always of this second type: specific words which have their usual meaning but which are deprived of any form because their form's quantity is 0. This quantity varies with the verb's inflectional category;
e.g. finite verbs generally have a subject, but imperatives (a kind of finite verb) normally have an unrealized YOU. (See section 5.3 for more discussion.) A notation for unrealized words in a written example would help to distinguish them conceptually from PRO; we suggest square brackets round the ordinary orthography for the missing word. Using this notation, we might write an English imperative as follows:

(21) [You] hurry up!
The relevant grammar is in Figure 7, which includes the very general default tendency for words to have a realization.
Figure 7

In the case of Ancient Greek infinitives and participles, what distinguishes those with overt subjects from those without is simply the quantity of the subject's realization. Even if an infinitive has no overt subject, it still has a subject, and this subject is an ordinary word (probably a personal pronoun) which has the full range of inheritable syntactic and semantic properties. And, crucially, it has a case which may trigger case agreement in a predicate. The relevant part of the grammar is sketched in Figure 8. According to this diagram, a verb's subject is normally nominative and has optional realization - in other words, Greek is a (so-called) pro-drop language. (We discuss this point further in the next section.) However, infinitives override the default pattern by demanding the accusative case, so even 'null' subjects of infinitives have the accusative case.
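As a rough computational analogue of this analysis (ours, not a piece of the WG formalism; all names are invented for illustration), a word token can carry a case attribute independently of whether its realization has quantity 1 or 0, so an agreement rule in the spirit of Figure 2 can copy the case of a subject that is never pronounced:

```python
# Hypothetical sketch: an unrealized subject still bears case, so
# predicative agreement can read it. Names and API are ours.

class Word:
    def __init__(self, lexeme, case=None, realization_quantity=1):
        self.lexeme = lexeme
        self.case = case
        # quantity 0 = present in the syntax but with no audible form
        self.realization_quantity = realization_quantity

def agree_predicative(subject, predicative):
    """Whatever a verb's subject and predicative are, their case matches."""
    predicative.case = subject.case
    return predicative

# Infinitives override the nominative default: their subject is accusative,
# and its realization may have quantity 0 (nothing is pronounced).
null_subject = Word('pronoun', case='accusative', realization_quantity=0)
predicative = Word('turannon')            # 'king', as in example (18)
agree_predicative(null_subject, predicative)
print(predicative.case)                   # 'accusative'
```

The point of the sketch is simply that case lives on the syntactic word, not on its audible form, so setting the realization's quantity to 0 removes the sound without removing the case.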
Figure 8

To summarize, then, we are proposing an attribute 'quantity' which controls the way in which we map stored concepts to items of experience, as types to tokens. Any item of experience has the value 1 for this attribute, so it will only match stored concepts which have the same value. In the case of words, what allows us to experience them is their realization, so by definition the quantity for the realization of a word token is 1. However, the grammar of a language allows the default 1 for realization to be overridden in the case of dependents of specific types of words, such as infinitives. But although these words have no realization, they do have all the other properties expected of them, including grammatical properties such as case. In Greek, it is the case of unrealized subjects that explains the agreement patterns described in the first section.

5. Extensions to Other Parts of Grammar
Since the proposed system applies equally to our knowledge of Father Christmas and to the subjects of Greek infinitives, it would not be surprising if it turned out to be relevant in other parts of grammar as well. The following list suggests a number of other areas where 'understood' elements can be handled in the same way.

5.1 Null subjects of tensed verbs in 'pro-drop languages'

Whereas English requires tensed verbs to have a realized subject, pro-drop languages allow it to be unrealized. This is helpful in Ancient Greek, where predicatives have the nominative case in tensed clauses even when the subject is unrealized, and similarly a virtual subject is as likely as an overt one to 'attract' the predicative of a lower clause to its nominative case. A relevant example is (22)=(16), which we noted above as an outstanding problem for a 'structure-sharing' analysis of attraction. If we assume that the main verb ethelo:, 'I wish', has a nominative (but unreal) pronoun as its subject, the nominative on the lower predicative is as expected, because this unreal pronoun is also the subject of the lower clause:
(22) ethelo: de toi e:pios einai
     I-wish but you(dat) kind(nom) to-be
     'but I wish to be kind to you' (H. Il. 8.40)
The subject-verb agreement on the verb is easy to explain if there is always a subject, real or unreal. Without this assumption, however, a rule of agreement does not extend easily to examples where there is no overt higher subject.

5.2 'Object pro-drop'

(23) ou dei tois paidotribais enkalein oud' ekballein ek
     not necessary the(dat) trainers(dat) to-accuse nor to-banish from
     to:n poleo:n
     the(gen) cities(gen)
     'it is not necessary to accuse the trainers nor to banish them from the cities' (P. G. 460d)
This phenomenon, less common than 'subject pro-drop' but very common in Greek, is traditionally analyzed under the rubric of 'object-sharing' and has no agreed modern analysis. Treating the 'omitted' object as unrealized provides a natural and simple account. Note that the traditional shared object analysis would incorrectly associate the dative case with the object of ekballein (normally accusative in this context).

5.3 Subjects of imperatives

In languages where these are usually absent, such as English, the identity of the unreal subject is very clear: as we assumed above, it must be the pronoun you for second-person imperatives, and we for first-person plural ones. This is clear not only from the meaning but also from the choice of pronoun in the tag question:

(24) Hurry up, will you?
(25) Let's go now, shall we?
Moreover, where a language offers a choice between intimate and distant second-person pronouns (such as the French pair tu and vous), the same choice applies, with the same social consequences, to imperatives even though there is no overt pronoun (e.g. Viens! or Venez! for 'Come!'). Without unrealized pronouns as subject it is hard to extend the rule for choosing pronouns so that it applies to the choice of imperative forms as well; but with unreal pronouns the choice of pronoun automatically triggers the correct agreement on the verb.

5.4 Complements of certain definite pronouns in English

The argument here rests on the assumption that 'pluralia tantum' such as trousers and scales are singular in meaning but plural in syntax; the assumption
has been challenged (Wierzbicka 1988), but we still find it especially plausible for examples such as scales (plural) contrasting with balance (singular). The relevant datum is that the choice between this and these matches the syntactic number when the complement noun is overt (so this balance but these scales), but the same choice is made even when there is no overt complement:

(26) I need some scales to weigh myself on, but these (*this) are (*is) broken.
If pluralia tantum really are singular in meaning, we cannot explain this choice in terms of meaning, and the most attractive explanation is that the choice is forced in the same way as in the overt case, by the presence of an unrealized example of the noun scales (or trousers or whatever). We might also consider extending this explanation to another curious fact about the demonstrative pronouns, which is that the singular form can only refer to a thing:

(27) Do you take this *(woman) to be your lawfully wedded wife?
The explanation would be that only one unrealized noun is possible in the singular: the noun thing. The analyses that we are suggesting are of course controversial and may be wrong, but if they are correct then they show that the unrealized word may be a specific lexical noun rather than a general-purpose pronoun as in the earlier examples.

5.5 Complements of certain verbs such as auxiliaries in English

This covers the territory of so-called 'VP deletion' but also other kinds of anaphoric ellipsis:

(28) I don't know whether I'm going to finish my thesis this year, but I may.
(29) I may finish my thesis this year, but I don't know for sure.
If the complement of may is allowed to be unrealized, then may in (28) actually has a complement verb, whose properties are (more or less) copied from its antecedent (namely, (I) finish my thesis this year); and similarly know in (29) has an object which would have been realized as whether I'll finish my thesis this year. This analysis combines the flexibility of a purely semantic analysis with the ability of a syntactic analysis to accommodate syntactic detail such as extraction out of an elided complement:

(30) OK, you didn't enjoy that book, but here's one which you will.
If will has no syntactic complement at all, the extraction of which in (30) is very hard to explain; but if its complement is an unrealized enjoy, the rest of the syntactic structure can be exactly as for here's one which you will enjoy.6

In all these examples the omitted element is redundant and easy to recover, so the option of leaving it unsaid obviously helps both the speaker and the
CASE AGREEMENT IN ANCIENT GREEK
49
hearer. The familiar functional pressure to minimize effort thus explains why the choice between 'realized' and 'unrealized' exists in the grammar. On the other hand, it does not explain why languages allow it in different places - e.g. why some languages allow tensed verbs to have a null subject while others do not. This variation must be due to different ways of resolving the conflict between this functional pressure and others which push in the opposite direction, such as the pressure to make syntactic relations reliably identifiable (whether by word order as in English or by inflectional morphology as in Greek).

6. Comparison with PRO and pro
The analysis that we are proposing is different from the more familiar ones which invoke null pronouns such as PRO and pro, and we believe that the differences are important:

• PRO and pro are special pronouns which combine the peculiarity of always being covert with the equally exceptional property of covering all persons, numbers and genders. The fact that they are exceptional in two such major respects should arouse suspicion. In contrast, our unrealized pronouns are the ordinary pronouns - he, me, us, and so on - which just happen to be unrealized. Even if we count this as an exceptional feature it is their only exceptional feature, in contrast with the double exceptionality of PRO and pro.
• In our account, a pronoun may be realized for emphasis in contexts where it would normally be unrealized; this accounts in a simple way for the examples in (9) to (12) where the pronoun is emphatic. If the covert pronoun is always PRO, why should it always alternate with an ordinary pronoun?
• Our unrealized words need not be pronouns, unlike PRO and pro. As explained in the previous section, this allows us to extend the same explanation to other kinds of unexpressed words, such as unrealized common nouns acting as complement of a pronoun/determiner such as this, or virtual complements of verbs such as auxiliary verbs. In other words, our proposal subsumes null subjects under a much broader analysis which covers ellipsis in general.
• Unrealized words are identified by the quantity feature which applies outside language (e.g. to Father Christmas and feet) as well as inside. In contrast, the difference between PRO or pro and other words is specific to language, involving (presumably) the absence of a phonological entry. Any explanation which involves machinery that is available independently of language is preferable to one which involves special machinery.
• In the standard analysis with PRO and pro the difference between these two is important because both abstract 'Case' and surface case are supposed to be impossible for PRO but obligatory for pro; this contrast is also claimed to correlate with the contrast between subjects of non-finite and finite verbs. More recently, PRO has been claimed to have a special 'null' case (Chomsky and Lasnik 1993). The empirical basis for these claims was undermined long ago (e.g. Sigurdsson 1991), and our analysis does not recognize the distinction between PRO and pro. Unrealized pronouns all take case (or lack it) just like realized ones in the language concerned.
These differences between our proposal and the PRO/pro system all seem to favour our proposal.

7. Comparison with Other PRO-free Analyses
In this section we compare our proposal with two other approaches to null elements, neither of which invokes a 'covert' element such as PRO. The first approach is in the WYSIWYG spirit of earlier versions of Word Grammar, where it was assumed that null elements were simply absent. This assumption was only workable because of the possibility of structure-sharing. In this analysis, the missing subject is specified as (i.e. supplied by) the subject of the higher verb (see Hudson 1990: 235ff for details). For Greek, as we indicated in section 2, this approach is adequate for the cases traditionally described under the rubric of 'attraction', but it fails for the default situation, where the subject of the infinitive (and other elements dependent on the lower verb) display accusative case. On the early WG assumptions, the only possible analysis is simply to stipulate, for the case where there is no infinitival subject, that predicates of infinitives are accusatives. But this approach fails to be explanatory: why should these elements bear the accusative case rather than the general default nominative? (Contrast the principled explanation of Figure 8 and accompanying text.) Moreover, this no-null-element, stipulative analysis suffers from an even graver defect: the relation 'subject' is a collecting point for a large number of different patterns in semantics and morphology as well as syntax (Keenan 1976).
A verb's subject is the nominal that has the following properties (among others):

• its referent is the 'active argument' of the verb as defined by the latter's lexical entry - for instance, with RUN/TREKHO: it is the runner, with FALL/PIPTO: it is the faller, with LIKE/PHILEO: it is the liker, and so on;
• in English, it typically stands before the verb;
• it is the typical antecedent of a reflexive object of the same verb;
• the verb agrees with it;
• in English, it is obligatory if the verb is tensed;
• it is also the verb's object if the verb is passive;
• in Greek, its case is typically nominative.

As soon as some nominal is defined as the verb's subject, it immediately inherits all these characteristics en bloc. But in the absence of a subject there is
nothing to bring them all together. For example, if himself is the object of hurt it is tied anaphorically to the hurter via the 'subject' link, but if there is no subject this link disappears. And yet the fact is that the anaphoric relations are exactly the same regardless of whether or not there is an overt subject; for example, the 'understood' subject of hurt in Don't hurt yourself! binds yourself in exactly the same way as the overt one in You may hurt yourself. The analysis that we are proposing solves these problems by moving towards the standard view that every verb does indeed have a subject, whether or not this is overt. Similar problems face the earlier WG approach in other areas of grammar, and can be solved in the same way. In section 5 we outlined a range of phenomena that seem to call for analysis in terms of unrealized words, and which more traditional WG analyses have treated in terms of dependents that are simply absent.

Another attempt to handle null subjects without invoking PRO is proposed by Pollard and Sag (1994: 123-45) in the framework of HPSG. As with the early WG analysis just described, this proposal applies only where syntactic structure-sharing is not possible. They propose a structure for 'Equi' verbs such as the following for try (ibid.: 135):

[CAT | SUBCAT ⟨ NP₁, VP[inf, SUBCAT ⟨ NP₁ ⟩] ⟩]

The infinitive's subject is the italicized 'NP' in its 'SUBCAT' (valency) list. This NP merely indicates the need for a subject, and would normally be 'cancelled' (satisfied) by unification with an NP outside the VP; for example, in They worked hard the verb needs a subject, which is provided by they. However, at least the intention of this entry is to prevent the need from being satisfied, so that the infinitive's subject remains unrealized, as in our proposed WG analysis.
Moreover, this unrealized subject in the SUBCAT list may carry other characteristics which are imposed both by the infinitive and by try; for example, the subscripts in the entry for try show that it must be coreferential with the subject of try - i.e. a sentence such as They try to work hard has only one meaning, in which they are the workers as well as the try-ers. Most importantly for the analysis of Greek, the unrealized subject can carry whatever case may be imposed on it by the infinitive (Henniss 1989). Consequently it can be the target of predicative case agreement, so Ancient Greek case-agreement would be no problem. This approach is clearly very similar to ours. In both theories:

• the infinitive's subject is an ordinary noun(-phrase) rather than a special pronominal (PRO or pro);
• the subject's status (overt or covert) is handled by a separate mechanism from its other properties;
• the subject's properties include those inherited from the infinitive;
• the possibility of null realization is determined by the head, rather than inherent in the unrealized nominal.
However there are also significant differences between the two proposals.

• The null NP in HPSG is purely schematic, so all null subjects have the same syntax (bar any specific syntactic demands imposed by the infinitive). They are also schematic in their semantics, in spite of the coreference restriction, because reference is distinct from semantics (e.g. the winner may be coreferential with someone I met last night, but these phrases obviously have different semantic structures). In contrast, WG null subjects are ordinary lexical nouns and pronouns.
• So far as we can see, the HPSG machinery for distinguishing overt and covert valents does not appear to generalize beyond language; and indeed, many advocates of HPSG might argue that it should not do so. In contrast, we showed in section 4 that our proposal does; it can explain the 'non-occurrence' of Father Christmas in just the same way as that of the subject of an infinitive.
Whether or not the proposals differ in terms of specifically linguistic analyses remains to be seen.

8. Conclusions
The most important conclusion is that where there is strong empirical evidence for null elements, they can easily be included even in a 'surfacist' grammar such as WG. This can be done by exploiting the existing WG machinery for determining 'quantity', a variable which guides the user in applying knowledge to experience; for example, one of the properties that we attribute to Father Christmas is zero quantity - i.e. we expect no tokens in experience. In these terms, a 'null word' is an ordinary word whose realization has the quantity 0 - an unrealized word. This (or something like it) is generally available in cognition both for distinguishing fact and fiction and for cases where an expected attribute is exceptionally absent, so it comes 'for free', and it is preferable to inventing special linguistic inaudibilia such as PRO or pro.

References to classical works

Aeschylus, Eumenides (A. Eum.)
Herodotus (Hdt)
Homer, Iliad (H. Il.)
Homer, Odyssey (H. Od.)
(LSJ: see Liddell & Scott in References)
Isocrates (I.)
Plato, Alcibiades (P. Alc.)
Plato, Gorgias (P. G.)
Xenophon, Anabasis (X. A.)
References

Andrews, A. (1971), 'Case agreement of predicate modifiers in Ancient Greek'. Linguistic Inquiry, 2, 127-51.
Borsley, R. (2005), 'Agreement, mutation and missing NPs in Welsh'. Available: http://privatewww.essex.ac.uk/~rborsley/Agreement-paper.pdf (Accessed: 19 April 2005).
Chantraine, P. (1953), Grammaire Homérique. Vol 2. Paris: Klincksieck.
Chomsky, N. and Lasnik, H. (1993), 'The theory of principles and parameters', in J. Jacobs, A. v. Stechow, W. Sternefeld and T. Venneman (eds), Syntax: An International Handbook of Contemporary Research. Berlin: Walter de Gruyter, pp. 506-69.
Goodwin, W. W. (1930), Greek Grammar (rev. Charles Burton Gulick). Boston: Ginn.
Henniss, K. (1989), '"Covert" subjects and determinate case: evidence from Malayalam', in J. Fee and K. Hunt (eds), Proceedings of the West Coast Conference on Formal Linguistics. Stanford: CSLI, pp. 167-75.
Hudson, R. (1984), Word Grammar. Oxford: Blackwell.
— (1990), English Word Grammar. Oxford: Blackwell.
— (2003), 'Case-agreement, PRO and structure sharing'. Research in Language, 1, 7-33.
Keenan, E. L. (1976), 'Towards a universal definition of "subject"', in Charles Li (ed.), Subject and Topic. New York: Academic Press, pp. 303-33.
Kühner, R. and Gerth, B. (1955), Ausführliche Grammatik der griechischen Sprache. Leverkusen: Gottschalksche Verlagsbuchhandlung.
Lecarme, J. (1978), 'Aspects Syntaxiques des Complétives du Grec' (Unpublished doctoral dissertation, University of Montreal).
Liddell, H. G. and Scott, R. (1971), A Greek-English Lexicon (9th edn, rev. H. Jones and R. McKenzie, suppl. by E. Barber). Oxford: Clarendon Press.
Pollard, C. J. and Sag, I. A. (1994), Head-Driven Phrase Structure Grammar. Chicago: University of Chicago Press.
Quicoli, A. C. (1982), The Structure of Complementation. Ghent: Story-Scientia.
Sigurdsson, H. (1991), 'Icelandic case-marked PRO and the licensing of lexical arguments'. Natural Language & Linguistic Theory, 9, 327-63.
Smyth, H. W.
(1956), Greek Grammar (rev. G. Messing). Cambridge, MA: Harvard University Press.
Wierzbicka, A. (1988), The Semantics of Grammar. Amsterdam: Benjamins.

Notes

1 By Ancient Greek we mean the Greek of early epic poetry ('Homeric Greek') down to the Attic prose of the 5th and 4th centuries B.C.E.
2 A list of abbreviated references to classical authors can be found at the end of this paper.
3 Very many of the verbs which take the infinitive also take a personal object which stands in the case that the verb requires... If the infinitive also has an adjectival or nominal predicate, this stands in the same case as the personal object by (an) attraction, or, in the absence of attraction, in the accusative.
4 However, if the subject of the governing verb is at the same time the subject of the infinitive, the subject of the infinitive is omitted, and if adjectival or nominal predicates accompany the infinitive, these are put in the nominative by attraction.
5 This felicitous term is taken from the work of Pollard and Sag (1994), but the analysis was worked out in full detail for English infinitives (and other constructions) in Hudson (1990: 235-9).
6 We owe this point to Andrew Rosta.
3 Understood Objects in English and Japanese with Reference to Eat and Taberu: A Word Grammar account

KENSEI SUGAYAMA
Abstract

The author argues that there is a semantic difference in the suppressed object between eat and its Japanese equivalent taberu. Then what kind of semantic structure would the Japanese verb taberu have? This chapter is an attempt to answer this question in the framework of Word Grammar (Hudson 1984, 1990, 1998, 2005).

1. Introduction
Unlike English and perhaps most other European languages, Japanese allows its transitive verbs to miss out their complements (e.g. subject, object) on the condition that the speaker assumes that they are known to the addressee.1 This is instantiated by the contrast in (1) and (2):

(1) A: mo keeki-wa yaki-mashita-ka
       already cake-TP baked-Q
       'Did you bake the cake?'
    B: hai, yaki-mashita
       yes baked
       'Yes, (I) baked it'

(2) A: *mo, yaki-mashita-ka
       already baked-Q
       'Did you bake it?' (intended meaning)
       (* unless the object is situationally recovered)
The following sentences are also possible in Japanese.2

(3) hyah! ugoita
    Interj moved
    'Ouch! It moved.'

(4) kondo-wa yameyoo
    next-time-TP stop
    'I won't do it again.'

(5) kanojo-wa yubiwa-o oita
    she-TP-Sb ring-Ob put
    'She put the ring there'
UNDERSTOOD OBJECTS IN ENGLISH AND JAPANESE
55
Sentences (3), (4) and (5) are colloquial and quite often used in standard spoken Japanese. In this sense they are not marked sentences. In (3) only the subject is left out, while in (4) both the subject and object are left out, as shown in the word-for-word translation. Sentence (5) involves the transitive verb oita, a past tense form of oku 'put', which corresponds to put in English. Oku is a three-place predicate which semantically requires three arguments [agent, theme, place]. These three arguments are mapped syntactically to subject, object and place adverbial, respectively. Quite interestingly, (5) shows that the place element, which is also considered to be a complement (or adjunct-complement) of the verb oku, is optional when it is known to the addressee, which is virtually impossible with its counterpart put in English.

Although these complements are in fact missed out (i.e. unexpressed or ungrammaticalized), the addressee eventually will come to an appropriate interpretation of each sentence, where unexpressed complements are supplied semantically or pragmatically and they are no doubt given full interpretation. Why is this possible? A possible answer comes from the assumption that in the semantic structure of the sentences above, there has to be a semantic argument which should be, but is not actually, mapped onto the syntactic structure (i.e. grammaticalized as a syntactic complement in my terms).

Turning to English, on the other hand, it is possible to leave indefinite objects suppressed for semantically limited verbs such as eat, drink, read, etc.3 Thus, following Hudson (2005), the syntactic and semantic structure of John ate will be something like the one shown in Figure 1.
Figure 1
The links between syntactic and semantic structures in Figure 1 are shown by the vertical solid and curved lines. The categories enclosed by single quotation marks (e. g. 'John', 'John ate') are concepts which are part of the sentence's semantic structure; the numbers are arbitrary. Detailed explanation about syntactic and semantic dependencies will be given in the next section. But this kind of semantic structure does not seem to be a viable one for the Japanese verb taberu, because, as I will argue later, there is a semantic difference in the semantic feature of the suppressed object between eat and taberu, which does not seem to be properly reflected in the semantic structure of those two
verbs in Word Grammar (WG). Then what kind of semantic structure will the Japanese verb taberu, the Japanese equivalent of eat, have? This chapter is an attempt to answer this question in the framework of Word Grammar. The rest of the chapter is organized in the following way. Section 2 introduces Word Grammar and deals with the relevant notions used in WG to deal with the problem of a covert object. Section 3 discusses the analysis of an intransitive use of the eat type verbs. Section 4 discusses the Japanese verb taberu, an equivalent of eat in English. It also discusses the interpretation of taberu when it lacks an overt object, using the syntactic and semantic analysis in WG. Section 5 offers my own account of how taberu is more adequately described in the semantic structure in WG.
2. Word Grammar

Before continuing any further, let us first have a brief look at the basic framework of WG and its main characteristics. WG, which is fully developed and formalized in Hudson (1984, 1990), and subsequently revised by Rosta (1997), is to be taken as a lexicalist grammatical theory because the word is central (hence the name of the theory), basically making no reference to any grammatical unit larger than a word.4 In his recent comparison of WG with Head-driven Phrase Structure Grammar (HPSG), Hudson (1995b: 4) gives a list of the common characteristics of the two theoretically different grammars, some relevant ones of which are repeated here for convenience:

(6) a. both (i.e. WG and HPSG) include a rich semantic structure parallel with the syntactic structure;
    b. both are monostratal;
    c. both make use of inheritance in generating structures;
    d. neither relies on tree geometry to distinguish grammatical functions;
    e. both include contextual information about the utterance event (e.g. the identities of speaker and hearer) in the linguistic structure.
In WG, syntactic structure is based on grammatical relations within a general framework of dependency theory rather than on constituent structure. So a grammatical relation is defined as a dependency relation between the head and its dependents, which include complements and adjuncts. In this framework, the syntactic head of a sentence, as well as its semantic head, is therefore a finite verb on which its dependents such as subject, object and so forth depend. To take a very simple example, the grammatical analysis of Vera lives in Altrincham can be partially shown by the diagram in Figure 2.5 Each arrow in this diagram shows a dependency between words and it points from a head to one of its dependents, but it is most important here that there are no phrases or clauses.6 Thus in terms of dependency, lives is the root of the sentence, on which Vera and in depend as a subject and a complement respectively. In turn, in is the head of Altrincham. Semantically, 'Vera', which is a referent of Vera, is linked as a live-er to a semantic concept 'Vera lives in Altrincham', an instance of 'live' and a referent of lives at once, and 'Altrincham', which is a referent of
Figure 2

Altrincham, is also linked to 'Vera lives in Altrincham' as a place. The curved lines connecting in and Altrincham mean that the referent of in and that of Altrincham are the same (i.e. 'Altrincham'). 'Live-er' and 'place' are names of semantic relations. A convenient way to diagram the model-instance relation is by using a triangle with its base along the general category (= model) and its apex pointing at the member (= instance), with an extension line to link the two. In Figure 2, then, the diagram shows the relation between the sense of the word lives, 'live', and its instance 'Vera lives in Altrincham'.

A WG grammar generates a semantic structure which parallels the syntactic structure described above. The parallels are in fact very close, as in Figure 2. Virtually every word is linked to a single element of the semantic structure, and the dependency relations between the words are typically matched by one of the relations between their semantic concepts: dependency (shown as a line with a point). The familiar distinction between 'referent' and 'sense' is used in much the same way as in other linguistic theories. Therefore, in WG a word's sense is understood to be some general category (e.g. the typical chair), while its referent is some particular instance of this category.

Apart from the diagram we have seen, how are the syntactic and semantic structures of a sentence represented in WG? WG consists of an unordered set of propositions called facts. All WG propositions have just two arguments and a relator between them, and all relations can theoretically be reduced to a single one represented as '=', although for convenience 'isa' and 'has' are also used. Hudson (1990: 256), for instance, gives a fairly complete lexical entry for eat in terms of propositions (or facts). Some of them considered to be most important for the present purpose are given in (7):

(7)
a. EAT isa verb.
b. sense of EAT = eat.
c. EAT has [0-1] object.
d. referent of object of EAT = eat-ee of sense of it.
These facts are self-explanatory except for a few technical expressions: 'A isa B' means 'A is an instance of B', and [0-1] in front of object means 'at least 0 and at most 1' (i.e. 0 or 1 in this particular case). An element in italics is meant to be the antecedent of the pronoun. The propositions in (7) partially represent the syntactic and semantic structures of John ate potatoes, diagrammed in Figure 3 with eat replaced by its past tense form ate.
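Purely as an illustration (and not in WG's own notation), the facts in (7) and the 'isa' relation behind them can be sketched as a tiny network with default inheritance, in which a property asked of a word token is looked up on the token itself and then, failing that, on its models. All names below (Node, the toy facts) are invented for this sketch.

```python
# Illustrative sketch only: WG-style 'isa' facts with default inheritance.
# The class name, attributes and toy entries are invented for this example.

class Node:
    def __init__(self, name, isa=None):
        self.name = name
        self.isa = isa      # the model (more general category), if any
        self.facts = {}     # attribute -> value, e.g. 'sense' -> 'eat'

    def get(self, attribute):
        """Default inheritance: look on the instance first, then climb
        the 'isa' chain until some model supplies the attribute."""
        node = self
        while node is not None:
            if attribute in node.facts:
                return node.facts[attribute]
            node = node.isa
        return None

# (7a) EAT isa verb; (7b) sense of EAT = eat; (7c) EAT has [0-1] object.
verb = Node('verb')
verb.facts['object'] = (0, 1)   # by default a verb has at most one object
eat = Node('EAT', isa=verb)
eat.facts['sense'] = 'eat'

# The token ate in 'John ate potatoes' is an instance of EAT:
ate = Node('ate', isa=eat)
print(ate.get('sense'))    # inherited from EAT: eat
print(ate.get('object'))   # inherited from verb: (0, 1)
```

The point of the sketch is only that 'isa' lookup is ordinary default inheritance: a fact stated on an instance would override the inherited one, which is how WG handles exceptions.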
Figure 3

In Figure 3, as mentioned earlier, an item enclosed by single quotation marks represents a concept in the semantic structure. In this case, 'Fred', 'potatoes', etc. are referents, whereas 'ate' and 'potato' are senses. X-er, X-ee, etc., labelled on a semantic dependency, are semantic relations which complements are considered to bear to their head. This outline of WG brings us now to the syntactic and semantic analysis of the English verb eat.

3. Eat in English
Let us consider those English transitive verbs that optionally appear without their object. Examples of such verbs, among others, include dress, shave, lose, win, eat and read, as in (8a) to (8f):

(8) a. William dressed/shaved Andrew.
    b. Manchester United won/lost the game.
    c. William read the book/ate the shepherd's pie.
    d. William dressed/shaved.
    e. Manchester United won/lost.
    f. William ate/read.
While these six verbs will take part in the same syntactic alternation, the intransitive verbs are obviously interpreted in different ways. In the examples (8d)-(8f), we can identify three verb classes according to difference in
interpretation of the unexpressed object. The paired sentences in (9) illustrate these differing interpretations.

(9) a. William shaved = William shaved himself
    b. Manchester United won = Manchester United won the game
    c. John ate = John ate something (edible) or other
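The three-way contrast in (9) amounts to a small lexical classification, which can be tabulated as below. This sketch is purely illustrative: the class labels and the function name are invented here and are not part of WG machinery.

```python
# Illustrative tabulation of the three null-object readings in (9).
# The labels 'reflexive', 'definite' and 'indefinite' are my own.
NULL_OBJECT_READING = {
    'dress': 'reflexive',    # William dressed = William dressed himself
    'shave': 'reflexive',    # William shaved = William shaved himself
    'win':   'definite',     # Manchester United won = ... won the game
    'lose':  'definite',
    'eat':   'indefinite',   # John ate = John ate something or other
    'read':  'indefinite',
}

def understood_object(verb):
    """How the omitted object of `verb` is interpreted,
    or None if object omission is not listed for this verb."""
    return NULL_OBJECT_READING.get(verb)

print(understood_object('shave'))   # reflexive
print(understood_object('eat'))     # indefinite
```

The table simply restates the generalization of (9); the substantive question pursued below is why the eat class, unlike the other two, forbids a definite (discourse-anaphoric) reading.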
For the shave type verbs, the object, if omitted, is interpreted as being coreferent with the subject or the subject's specific body part. For the win type verbs, the surface intransitive verb form signals a severe narrowing of the range of possible referents of the implicit object, roughly speaking, to 'a specific game' recoverable from the context. For the eat type verbs, the intransitive form means a lack of commitment by the speaker to the referent of the object. With the eat type verbs, the identity of the referent of the unexpressed object may be non-specific, i.e. literally unknown to the speaker, because the sentences in (10) do make sense.

(10) a. I saw Oliver eating, but I don't know what he was eating.
     b. When I peeked into Oliver's room, he was reading; now I wonder what he was reading.
In both sentences in (10), the identity of what was eaten/read is asked in the second part. This implies that the patient (or eat-ee/read-ee) argument of eat/read, which may be grammaticalized as the object at surface structure, does not have to be definite. There is other evidence that supports the indefiniteness of the suppressed object of eat. Consider the following dialogues:

(11) A: What happened to my scones?
     B: *The dog ate.

(12) A: Did you eat your kippers?
     B: *Yes, I ate.
In both (11) and (12), speaker B cannot reply to speaker A by using ate without its object. What (11) and (12) suggest is that eat, when its object is suppressed, cannot have its null object referring to an element in the previous discourse, which, as I will explain very shortly, is in fact possible in Japanese. What the ungrammaticality of the utterances by speaker B indicates is that the understood object has to be indefinite if eat is used as an intransitive verb.7 Here is another interesting piece of evidence supporting this claim. Observe the following dialogue:

(13) A: I'm starving, let's eat.
     B: What would you like to eat?
     A: Doesn't matter, anything, I'm just so hungry.
When speaker A first uses the intransitive eat, it is clear that he/she does not have a definite object (or the referent of a definite object) in mind, and is just
expressing his/her desire to consume something or other. As our previous arguments predict, this is exactly a case where the intransitive eat should appear, because there is no antecedent available in this context. However, the semantic structure of eat in this example is considered to have the patient argument, as the lexical semantics of eat requires two arguments whether its object is definite or not. Then the question arises why this argument does not appear at surface structure. Now let us reconsider a WG representation of Fred ate, the diagram of which is repeated here for convenience.
Figure 4

By now it is clear that this semantic representation is inadequate for eat (ate). It is not so difficult to see that the important semantic information of the suppressed object is missing in Figure 4. In Fred ate, there is no object, which implies that it should be indefinite. Regrettably, this important semantic information does not seem to be incorporated in Figure 4. Therefore, my proposal is that we have to revise the diagram so that it can be enriched with the semantic information of the unexpressed object, and accordingly a more accurate one will be something like the one in Figure 5.
Figure 5

4. Taberu in Japanese
Let us now turn to a Japanese counterpart of eat, taberu. The picture of taberu, an equivalent of eat in Japanese, is quite different from that of English eat,
which we have just discussed. As stated in section 1, complements are usually missed out in Japanese as far as they are accessible to the speaker and the addressee (or recoverable) in the context. This generalization applies to the verb taberu in Japanese. Before analyzing the structure of taberu, which can be used with the suppressed complement as in (14) and (15), let us consider what kind of grammatical structure WG would give to taberu. Like English eat, taberu in Japanese takes two arguments in its semantic structure, the agent (eat-er), which is realized as subject, and the patient (eat-ee) which is realized as object. Thus WG gives the syntactic and semantic structures of Shota ga ringo o tabeta 'Shota ate apples' as diagrammed in Figure 6.
Figure 6

In passing, one of the advantages of using WG to analyze the syntactic structure of Japanese is that by doing so, it is quite easy to explain the phenomenon of 'free word order as long as the parent is at the end' in Japanese. As is well known, Japanese is a verb-final language, which implies that the order of the subject, object and other dependents is not fixed as far as they are before the verb, which is at the end of a sentence. Thus Shota ga ringo o tabeta has an alternative version ringo o Shota ga tabeta. The rather free order of the two complements in the sentence is explained in WG by saying that these two elements are co-dependents of the head taberu; therefore the order of the complements is free as far as they are before the head.

As stated above, complements of taberu can be missed out providing that they are recovered from the context. The following examples illustrate this point:

(14)
hayaku tabero quick eat 'Eat it quick' (15) moo tabe-mashita-ka? already ate-Q 'Did you eat it already?'
WORD GRAMMAR: PERSPECTIVES ON LANGUAGE STRUCTURE
In both sentences above the definite object is apparently suppressed. Presumably the suppressed objects can be expressed as (definite) pronouns without any change in meaning as in the following sentences: (16)
(17)
hayaku sore-o tabero quick it-Ob eat 'Eat it quick' moo sore-o tabemashita-ka? already it-Ob ate-Q 'Did you eat it already?'
In the last section I argued that the suppressed object of eat is indefinite because it cannot refer to an antecedent even when one is available in the preceding context. Interestingly enough, the opposite is true of taberu. Consider the following dialogues, corresponding to (11) and (12), which we discussed in the last section: (18)
A:
B:
(19)
A:
B:
watashino sukohn-wa doo-shimashi-ta? my scones-TP how-did 'What happened to my scones?' inu-ga tabeta dog-Sb ate 'The dog ate them' kippahzu-wa tabeta? kippers-TP-Sb ate 'Did you eat your kippers?' ee tabeta yes ate 'Yes, I ate them'
In (18) and (19), the definite object referring to an element in the previous context is left out in B's utterance. In (19), the subject referring to the speaker is also missing in B's utterance. These cases show that the suppressed object of taberu is definite. In contrast, when the object of taberu is indefinite, there are in fact cases where it has to be expressed as an indefinite noun, as in (20): (20)
A:
B:
himana toki-wa nani-o shite-imasu-ka? spare time-TP what-Ob do-ing-Q 'What do you do in your spare time?' taitei nanika tabete-imasu usually something eat-ing 'Usually I eat'
These arguments make it very clear that WG should represent the syntactic and semantic structures of inu-ga tabeta in (18) as diagrammed in Figure 7. As I stated in section 2, Hudson (1995b) claims that one of the key characteristics of WG is that it includes 'contextual information about the utterance event' (e.g. the identities of the speaker and the hearer) in the linguistic structure. However, as they stand, the syntactic and semantic structures in Figures
Figure 7

1 and 4 do not seem to include as much contextual information as he suggests a WG analysis does. The revisions I have made in this section contribute to increasing the contextual information in the grammatical representation in WG.

5. Conclusion
Considering the fact, mentioned above, that definite objects are deletable in Japanese given a proper context, it seems reasonable to add the following rule to the grammar of Japanese to capture the proper semantic structure of taberu: (21)
Knower of eat-ee of sense of taberu = addressee of it.
Taking into account the arguments above, I conclude that the syntax of missing complements in Japanese can be given a more satisfactory description by introducing a parameter of 'default definiteness'. I do not want to enter into details now, but simply suggest that to distinguish between complements and adjuncts in Japanese one needs such a parameter. To put it differently, by default the definiteness of a covert complement is [+definite] and that of a covert adjunct is [+/-definite], as in (22): (22)
• Covert complement of verb = definite • Covert adjunct of verb = indefinite • Knower of referent of complement of verb = addressee of it
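As a rough illustration, the defaults in (22) can be encoded as a small lookup function. All names here are hypothetical; this is a sketch of the proposed parameter, not a WG implementation:

```python
# Toy encoding of the 'default definiteness' parameter in (22) for Japanese.
# Function and argument names are illustrative assumptions.

def definiteness(role, overt, stated=None):
    """Definiteness of a verb's argument.

    role   -- 'complement' or 'adjunct'
    overt  -- True if the argument is expressed in the sentence
    stated -- definiteness read off an overt argument, if any
    """
    if overt:
        return stated if stated is not None else "unspecified"
    # Covert arguments fall back on the defaults in (22):
    return "definite" if role == "complement" else "indefinite"

# The missing object of 'inu-ga tabeta' in (18B) is a covert complement:
print(definiteness("complement", overt=False))   # definite
# A covert adjunct defaults to indefinite:
print(definiteness("adjunct", overt=False))      # indefinite
```

The point of the sketch is simply that the defaults apply only when the argument is covert; an overt argument carries its own definiteness.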
References

Allerton, D. J. (1982), Valency and the English Verb. London: Academic Press.
Cote, S. A. (1996), 'Grammatical and Discourse Properties of Null Elements in English'. (Unpublished doctoral dissertation, University of Pennsylvania).
Fillmore, Ch. J. (1986), 'Pragmatically controlled zero anaphora'. BLS, 12, 95-107.
Groefsema, M. (1995), 'Understood arguments: A semantic/pragmatic approach'. Lingua, 96, 139-61.
Haegeman, L. (1987a), 'The interpretation of inherent objects in English'. Australian Journal of Linguistics, 7, 223-48.
— (1987b), 'Register variation in English'. Journal of English Linguistics, 20, (2), 230-48.
Halliday, M. A. K. and Hasan, R. (1976), Cohesion in English. London: Longman.
Hudson, R. A. (1984), Word Grammar. Oxford: Blackwell.
— (1990), English Word Grammar. Oxford: Blackwell.
— (1992), 'Raising in syntax, semantics and cognition', in Roca, I. (ed.), Thematic Structure: Its Role in Grammar. Berlin: Mouton de Gruyter, pp. 175-98.
— (1994), 'Word Grammar', in Asher, R. E. (ed.), The Encyclopedia of Language and Linguistics. Vol. 9. Oxford: Pergamon Press Ltd, pp. 4990-3.
— (1995a), 'Really bare phrase-structure = dependency structure'. Eigo Goho Bunpoh Kenkyu (Studies in English Language Usage and English Language Teaching), 17, 3-17.
— (1995b), HPSG without PS? Ms.
— (1995c), Word Meaning. London: Routledge.
— (1996, October 28), 'Summary: Watch', (LINGUIST List 7.1525), Available: http://linguistlist.org/issues/7/7-1525.html (Accessed: 21 April 2005).
— (1998), English Grammar. London: Routledge.
— (2000), '*I amn't'. Language, 76, (2), 297-323.
— (2005), 'An Encyclopedia of English Grammar and Word Grammar', (Word Grammar), Available: www.phon.ucl.ac.uk/home/dick/wg.htm (Accessed: 21 April 2005).
Kilby, D. (1984), Descriptive Syntax and the English Verb. London: Croom Helm.
Lambrecht, K. (1996, October 30), 'Re: 7.1525, Sum: Watch', (LINGUIST List 7.1534), Available: http://linguistlist.org/issues/7/7-1534.html (Accessed: 21 April 2005).
Langacker, R. W. (1990), Concept, Image and Symbol. Berlin: Mouton de Gruyter.
Larjavaara, M. (1996, November 3), 'Disc: Watch', (LINGUIST List 7.1552), Available: http://linguistlist.org/issues/7/7-1552.html (Accessed: 21 April 2005).
Lehrer, A. (1970), 'Verbs and deletable objects'. Lingua, 25, 227-53.
Levin, B. (1993), English Verb Classes and Alternations. Chicago: University of Chicago Press.
Massam, D. (1987), 'Middle, tough, and recipe context constructions in English'. NELS, 18, 315-32.
— (1992), 'Null objects and non-thematic subjects'. Journal of Linguistics, 28, (1), 115-37.
Massam, D. and Roberge, Y. (1989), 'Recipe context null objects in English'. Linguistic Inquiry, 20, 134-9.
Quirk, Randolph, Greenbaum, Sidney, Leech, Geoffrey N. and Svartvik, Jan Lars (1985), A Comprehensive Grammar of the English Language. Harlow: Longman.
Rispoli, M. (1992), 'Discourse and the acquisition of eat'. Journal of Child Language, 19, 581-95.
Rizzi, L. (1986), 'Null objects in Italian and the theory of pro'. Linguistic Inquiry, 17, 501-57.
Roberge, Y. (1991), 'On the recoverability of null objects', in D. Wanner and D. A.
Kibbee (eds), New Analyses in Romance Linguistics. Amsterdam: John Benjamins, pp. 299-312.
Rosta, A. (1994), 'Dependency and grammatical relations'. UCL Working Papers in Linguistics, (6), 219-58.
— (1997), 'English Syntax and Word Grammar Theory'. (Unpublished doctoral dissertation, University of London).
Sperber, D. and Wilson, D. (1995), Relevance (2nd edn). Oxford: Blackwell.
Sugayama, K. (1993), 'A Word-Grammatic account of complements and adjuncts in Japanese', in A. Crochetiere, J.-C. Boulanger and C. Ouellon (eds), Actes du XVe Congres International des Linguistes. Vol. 2. Sainte-Foy, Quebec: Les Presses de l'Universite Laval, pp. 373-76.
— (1994), 'Eigo no "missing objects" ni tsuite (Notes on missing objects in English)'. Eigo Goho Bunpoh Kenkyu (Studies in English Language Usage and English Language Teaching), 1, 91-104.
— (1999), 'Speculations on unsolved problems in Word Grammar'. Kobe City University Journal, 50, (7), 5-24.
Thomas, A. L. (1979), 'Ellipsis: the interplay of sentence structure and context'. Lingua, 47, 43-68.

Notes

1 Notice here that Word Grammar, in the framework of which the arguments are developed, assumes that complements include subjects as well as objects, contrary to most phrase-structure-based theories. For details of the distinction between complements and adjuncts in Japanese, see Sugayama (1993, 1999).
2 The following symbols for grammatical markers are used in the glosses: TP (Topic), Sb (Subject), Ob (Object), Q (Question marker). The symbol 0 is used for a zero pronominal in examples where necessary.
3 For a complete list of some 50 verbs having this feature, together with references, see Levin (1993: 33). Lehrer (1970) also gives a similar list.
Unspecified Object Alternation in Levin's terms (Levin, 1993: 33), which applies to eat, embroider, hum, hunt, fish, iron, knead, knit, mend, milk, mow, nurse, pack, paint, play, plough, polish, read, recite, sew, sculpt, sing, sketch, sow, study, sweep, teach, type, vacuum, wash, weave, whittle, write, etc., has the following pattern:
a. Mike ate the cake.
b. Mike ate. (= Mike ate a meal or something one typically eats.)
4 For some recent advances in WG, see Hudson (1992, 1995a, 1995b, 1998, 2000, 2005) and Rosta (1994, 1997).
5 Notice here that C stands for a complement, rather than a complementizer, in the diagram.
6 No phrases or clauses are assumed in WG except for some constructions (e.g. coordinate structures).
7 Things are not so straightforward. Surprisingly, younger children seem to have a different grammar in which (9) and (10) are grammatical and are actually used. Rispoli (1992) looked at the acquisition of Direct Object omissibility with eat in young children's natural production. In terms of GB, eat is one of the many English verbs with which the internal argument can be saturated. He found that at an earlier stage of development children frequently omitted a Direct Object with eat when the understood object referred to something in the discourse context, as in this exchange between a parent (P) and a child (C):
(i) Child (2;7) (Talking about a pencil)
P: Well I see you already ate the eraser off of it. That's one of the first things you hadta do.
C: I eat. (four times)
P: I know you ate the eraser, so you don't need a candy bar now. (Rispoli 1992: 590)
4 The Grammar of Be To: From a Word Grammar point of view1

KENSEI SUGAYAMA
Abstract

This chapter is an attempt to characterize the be to construction within a Word Grammar framework. First (section 2), a concise account of previous studies of the category of be in the construction is followed by a description of be in Word Grammar (section 3). Sections 4, 5 and 6 then present a morphological, syntactic and semantic discussion of the be to construction, respectively. Section 7 gives a detailed discussion of the question whether be to is a lexical unit or not. The analysis is theoretically framed in section 8, where it is shown how Word Grammar offers a syntactico-semantic approach to the construction.
1. Introduction and the Problem

In contemporary English there is a construction instantiated by the sentences in (1):2 (1)
a. You are to report here at 6 a.m.
b. What am I to do?
c. I am to leave tomorrow.
d. That young boy was to become President of the United States.
I shall call this construction the be to construction.3 Previous studies have analyzed be in this construction in three different ways: (i) as a modal (e.g. Huddleston 1980); (ii) with 'be + to' analyzed as a quasi-auxiliary; (iii) as intermediate between the two, a semi-auxiliary. These approaches, however, did not give enough evidence to justify their analyses. In this chapter I will argue that be in this construction is an instance of a modal verb and that 'be + to' is not a lexical (or possibly even syntactic) unit, as it is often treated in reference grammars.4 My argument is set within the framework of a theory called Word Grammar, and is hence a Word Grammar account of the problem; it is based on the fact that there is ample evidence supporting the claim that there is a syntactic and semantic gap between the two elements, i.e. be and to. In the following sections, I will provide a characterization of the be to construction within a Word Grammar framework, as outlined above in the abstract.
2. Category of Be

Before we go into our analysis, let us have a brief look at what characteristics the modal be shares with other auxiliary verbs, e.g. can and have. The table below presents the modal be, the prototypical modal can and the prototypical perfective auxiliary have with respect to the 30 criteria used in Huddleston (1980) to characterize auxiliaries and modals.

Table 1 Characteristics of be, can and have
Rows (criteria): 1 Non-catenative use; 2 Inversion. POLARITY: 3 Negative forms; 4 Precedes not; 5 Emphatic positive. STRANDING: 6 So/neither tags; 7 SV order, verb given; 8 SV order, verb new; 9 Complement fronting; 10 Relativized complement; 11 Occurrence with DO; 12 Contraction. POSITION OF PREVERBS: 13 Precedes never; 14 Precedes epistemic adverb; 15 Precedes subject quantifier. INDEPENDENCE OF CATENATIVE: 16 Temporal discreteness; 17 Negative complement; 18 Relation with subject. TYPE OF COMPLEMENTATION: 19 Base-complement; 20 to-complement; 21 -en complement; 22 -ing complement. INFLECTIONAL FORMS: 23 3rd Singular; 24 -en form; 25 -ing form; 26 Base-form; 27 Past Tense; 28 Unreal mood: protasis; 29 Unreal mood: apodosis; 30 Unreal mood: tentative.
Columns: BE; CAN (modal); HAVE (perfect).
NB: R means that the verb has the given property but under restricted conditions.
THE GRAMMAR OF BE TO
Clearly, at the outset we can say that be in the current construction shares quite a lot of features with a typical modal can. In what follows I concentrate on the extent to which this claim holds.
3. Modal Be in Word Grammar

In this section, we look at what Word Grammar says about aspects of be. Word Grammar (WG), which is fully developed and formalized in Hudson (1984, 1990), is to be taken as a lexicalist grammatical theory because the word is central (hence the name of the theory), basically making no reference to any grammatical unit larger than the word. WG uses a small triangle to show the model-instance relation: the general category (i.e. the model) is at the base, with the apex pointing at the member (i.e. the instance) and an extension line linking the two. Now, let us examine in more detail the properties of be as in (1), to test the claim that it should be categorized as a modal in WG terms. Consider the diagram in Figure 1.
Figure 1 BEto in Word Grammar

What Hudson (1996) basically claims in WG, using the model-instance relation, is the following:
• word is an independent entity in grammar;
• verb is an instance of word;
• auxiliary verb is an instance of verb;
• modal verb is an instance of auxiliary verb, along with other instances (e.g. HAVEs, DOs and BE);
• be in this construction (represented as BEto in Figure 1) is an instance of both modal verb and BE.
This analysis implies that BEto may have inherited characteristics of modal
verbs and at the same time those of BE, by Hudson's Inheritance Principle, although these characteristics are not always necessarily inherited:

Inheritance Principle (final version, Hudson 2005): If fact F contains C, and C' is an instance of C, then it is possible to infer a second fact F' in which C' replaces C, provided that: a. F does not contain "is an instance of ...", and b. there is no other fact which contradicts F' and which can also be inherited by C'.

The idea of 'contradicting' can be spelt out more precisely, but the idea here should be clear. In a nutshell, the Inheritance Principle says that a fact about one concept C can be inherited by any instance C' of C unless it is contradicted by another, more specific fact about C'. In sum, WG analyzes be in the be to construction as an instance of modal verb and of be, allowing it to inherit characteristics from both heads in the model-instance hierarchy in Figure 1.
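To make the mechanics concrete, here is a minimal sketch of default inheritance down a model-instance ('isa') chain, in the spirit of the Inheritance Principle. It is a single-model simplification (BEto in fact has two models, modal verb and BE), and the class name, attribute names and sample facts are purely illustrative:

```python
# Sketch of WG-style default inheritance: a fact stated on a model is
# inherited by its instances unless a more specific fact overrides it.
# Names and facts are illustrative assumptions, not WG notation.

class Concept:
    def __init__(self, name, model=None, **facts):
        self.name = name
        self.model = model      # the concept this one 'isa' (its model)
        self.facts = facts      # facts stated directly about this concept

    def get(self, attribute):
        """Walk up the isa chain; the nearest stated fact wins (default inheritance)."""
        concept = self
        while concept is not None:
            if attribute in concept.facts:
                return concept.facts[attribute]
            concept = concept.model
        return None

word      = Concept("word")
verb      = Concept("verb", word)
auxiliary = Concept("auxiliary verb", verb, allows_nt=True)   # auxiliaries accept -n't
modal     = Concept("modal verb", auxiliary, complement="bare infinitive")
# BEto is an instance of modal verb but overrides the complement fact:
be_to     = Concept("BEto", modal, complement="to-infinitive")

print(be_to.get("allows_nt"))    # True: inherited from 'auxiliary verb'
print(be_to.get("complement"))   # 'to-infinitive': the modal default is overridden
```

The override on `be_to` mirrors the chapter's point in section 5.2: BEto inherits most modal properties but differs in taking a to-infinitive complement.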
4. Morphological Aspects
I claim that be in (1) should be considered a modal because it shares most of its properties with a prototypical modal in morphology, syntax and semantics. This claim is supported in Figure 1 by the fact that BEto has a multiple head, thus inheriting features from both heads, i.e. modal verb and be. However, a further semantic analysis of the construction shows that the sense of the construction derives from the sense of the infinitive clause rather than from that of be. Let us start by looking at the morphological characteristics of be in the construction and its similarities with other modals.

4.1 Like modals

Consider the following examples:

• In Standard English a modal is always tensed, i.e. either present or past.

This is compatible with the behaviour of be in the be to construction, as in (2): (2)
*It is a shame for John to be to leave here tomorrow. [Warner]
*To be to leave is sad. [Pullum & Wilson]
*I expect him to be to leave. [Pullum & Wilson]
*He could be to leave. [Pullum & Wilson]
*She might be to win the prize.
*I don't like our being to leave tomorrow. [Warner]
*I am being to put in a lot of overwork these days. [Seppanen]
*I have always been to work together. [Seppanen]
*Don't be to leave by midnight. [Pullum & Wilson]
*Be to leave it till later. [Seppanen]
Clearly, each of the examples in (2) shows that be in this construction cannot be non-finite. The presence of tense is what be shares with modal verbs.

• Only (tensed) auxiliary verbs accept -n't
Be in this construction allows negative contraction, which is found only with tensed auxiliaries; this again is consistent with be being a modal, since modals are a subclass of auxiliaries. (3)
Her novel wasn't to win that year.
4.2 Unlike modals

• Be has a distinct s-form controlled by subject-verb agreement:

(4) {Her novel is/They were/I am/She was/We are} to win the prize.

5. Syntactic Aspects
The second aspect is related to syntax.

5.1 Like modals

• Only (tensed) auxiliary verbs allow a dependent not to follow them.

Be in this construction also has this feature, as in (5). (5)
He is not to leave this room. [They are/*get not tired. ]
• It must share its subject with the following verb (i.e. it is a raising verb)5

The behaviour of be is the same as that of a typical raising verb such as seem, as in (6).

(6) He is to go/*He is (for) John to go
He can speak/*He can (for) John speak
*Mary seemed (for) John to be enjoying himself

• Voice-neutral in many circumstances

This again strongly suggests that be in the be to construction is a raising verb.

(7) You are to take this back to the library at once ~ This is to be taken back to the library at once.6

• It cannot co-occur with other modals
This feature is critical in that two members of the same category of modal verbs cannot appear consecutively. (8) shows that might and be to belong to the same category of modal verbs.
(8)
*She might be to win the prize.
• It can precede a perfective/progressive/passive auxiliary

When be appears with a perfective, progressive or passive auxiliary verb, it always occupies the left-most slot, the one reserved for modal verbs, i.e. immediately before these auxiliaries. (9)
Her novel {was to have won/was to be going on display/was to be considered} that year.
• It is an operator, i.e. it has the NICE properties [but Code only if the same verb is given in the previous clause, as in (11)]

It has the NICE properties and is therefore an operator in the sense of Quirk et al. (1985). Auxiliaries have the NICE properties, which are also shared by modals.

(10) Was her novel to win the prize? Mine was.
(11) *Joe's novel would win a prize that year. Mine wasn't.

5.2 Unlike modals

• It takes a to-infinitive rather than a bare infinitive as its complement:

(12) That young boy was *(to) become President of the United States.

6. Semantics of the Be To Construction
This be to has an array of meanings: arrangement, obligation, predestined future, 'future in the past', possibility, purpose ('to be intended to') and hypothetical condition. It must be noted here that be in this construction has both epistemic and non-epistemic meanings, which is again a diagnostic typical of modals. Sentences (13)-(19) involve non-epistemic instances of the construction:

(13) She is to see the dean tomorrow at 4 p.m.
(14) You are to sit down and keep quiet.
(15) You're to marry him within the next six months.
(16) Their daughter is to be married soon. [Quirk et al.]
(17) They are to be married in June. [OALD6]
(18) The Prime Minister is to get a full briefing on the release of the hostages next week.
(19) Ministers are to reduce significantly the number of examinations taken by pupils in their first year in the sixth form as the result of an official review to be published later this week. The review will recommend dismantling the modular system of assessment that is at the heart of the new sixth-form curriculum. [The Times]
Although be to has several different meanings, its basic (or core) meaning can be stated as follows:

• The agent has been set or scheduled to do something by some external (outside) forces, and is thus obliged. However, the agent's commitment to the obligation is left open.
Here the key points are the arrangeability of the event described and the openness of the agent's commitment to the obligation. The first point is most easily detected when the construction occurs with an event that cannot be arranged. Consider the following examples:

(20) ?The sun is to rise at 5.15 a.m. tomorrow morning.
(21) The sun will rise at 5.15 a.m. tomorrow morning.
(22) You are to take these four times a day.
A straightforward example of be to used in a context where the event cannot be arranged is (20), which is odd in comparison with the other two sentences. The fact that the sun's rising is not (indeed cannot be) normally arranged is the reason why (20) is low in acceptability.7 In contrast, (21) is fine, with will implying the speaker's subjective prediction. One can also utter a sentence like (22), which depicts an arrangeable event. One might take it for granted that be to necessarily implies the arrangement of an event by an agent, but that would miss the more general point that there is no need to express the agent at all, as in (23) or (24):

(23) There's to be an official inquiry. [Quirk et al.]
(24) Regional accents are still acceptable but there is to be a blitz on incorrect grammar. [COBUILD2]
What is needed for the non-epistemic meaning of the be to construction is that the sentence express an arrangeable event or activity. There is another use of be to representing 'predestined future', as in (25)-(30):

(25) They are to stay with us when they arrive. [CLD]
(26) You are to be back by 10 o'clock. [Quirk et al.]
(27) A clean coal-fired power plant is to be built at Bilsthorpe Colliery. [COBUILD]
(28) You are to take this back to the library at once ~ This is to be taken back to the library at once.
(29) 'They are to be seen and displayed on walls and floors both in museums and domestically.' [UK written, COBUILD WB]
(30) I've also learned that in these difficult times it truly is important that we're all thinking together about what is to be done and how best to move. [US spoken]

All these examples assert the speaker's high certainty, at the speech time, of the event happening in the (near) future.
Related to this usage type is 'future in the past', where the speech time is transferred to some point in the past:

(31) a. After dinner they were to go to a movie. [COBUILD3]
b. Then he received a phone call that was to change his life ... [COBUILD4]
(32) He was eventually to end up in the bankruptcy court. [Quirk et al.]
(33) The meeting was to be held the following week. [Quirk et al.]
(34) Her novel was to win the prize.
(35) Worse was to follow.
(36) This episode was to be a taste of what was to come in the following couple of weeks.

Different meanings such as 'compulsion', 'plan', 'destiny', etc. can derive from the core meaning according to the context the construction appears in, as in (37a) and (37b). In (37), the part before and/as is the same in both sentences; nevertheless the interpretation of this part at the level of sentence meaning is quite different. Where does this difference come from? The context, more precisely the following context in this particular case, is responsible. What is interesting is that the meaning (sense) of be to is determined by the following context.

(37) a. You aren't to marry him, and that's an order.
b. You aren't to marry him, as I read it in the cards.

(37a) is obviously interpreted as an order, while (37b) has an epistemic, predictive sense. It is manifested quite clearly above that the array of connotations is pragmatically determined. On the other hand, be to has epistemic meanings, illustrated in (38)-(42):

(38) Such an outcome is to be expected.
(39) These insects are to be found in NSW. [Huddleston 1980: 66; Seppanen]

Furthermore, it can be used in conditionals in English, as in (40)-(42):

(40) And the free world has reacted quickly to this momentous process and must continue to do so if it is to help and influence events. [ICE-GB: S1B-054 #17:1:B]
(41) the system is totally dependent on employee goodwill if it is to produce good information. [ICE-GB: W2A-016 #118:1]
(42) However, in nerves, regeneration is essential if there is to be a satisfactory functional outcome. [ICE-GB: W2A-026 #15:1]
There is arguably a clear-cut distinction between be to and the epistemic modals in their use in conditionals. It is practically ruled out, or catalogued as a performance error, for speakers of English to select an epistemic modal for the protasis of a conditional, even though the meaning of such a modal is conceptually quite compatible with the functioning of either part (protasis or apodosis) of a conditional. The contrast in (43) is the most relevant observation for explaining this phenomenon.
(43) a. ??If it may rain, you should take your umbrella.
b. If it is possible that it will rain, you should take your umbrella.
According to Lyons (1977: 805-86), 'conditional clauses are incompatible with subjective epistemic modal expressions'. In (43a), may in the protasis if it may rain presents a figment of the speaker's imagination and merely expresses possibility as non-factual, which conflicts with the possible world created by if. The possibility expressed by the non-modal expression in an acceptable utterance like (43b), by contrast, refers to possibility as an actuality independent of the speaker; the possibility is categorically asserted and is therefore factual. In passing, non-modal expressions can express modal-like meanings, as in (44) and (45):

(44) a. It's your duty to visit your ailing parents.
b. You ought to visit your ailing parents.
(45) a. Jessica is possibly at home now.
b. Jessica may be at home.
In the end, there are of course differences between be to and modal verbs. Still, be in (1) shares enough properties with modal verbs to be categorized as a modal; yet even so, the sense of the be to construction is best considered to be the existence of a situation in which the event is represented by the VP in the infinitive. The modal-like meanings of the construction derive from the sense of to rather than from that of be, as is attested in the following section.

7. Should To Be Counted as Part of the Lexical Item?

Let me take the pieces of evidence one by one to argue for my proposal.

• Inversion
In a yes-no question, what moves to the front is not be to but be, which suggests that be behaves like a modal (operator), with to being an infinitive marker.

(46) He should go. Should he go?
(47) He ought to go. *Ought to he go? Ought he to go?
(48) He is to go. *Is to he go? Is he to go?
He should go Should he go? He ought to go. * Ought to he go? Ought he to go? He is to go. *Is to he go? Is he to go?
VP fronting (Gazdar et al, 1982)
Impossibility of VP fronting as in (49f) shows that be itself isn't a modal. If it is, it should behave as will in (49b):
(49) a. *... and went he
b. ... and go he will
c. ... and going he is
d. ... and gone he has
e. ... and taken by Sandy he was
f. *... and to go he is
g. *... and to go he wants
h. *... and be going he will
i. *... and have gone he will
j. ... and being evasive he was

• Be may be separated from to.
Therefore, be to isn't a syntactic unit.

(50) We are, I believe, to start tomorrow.
(51) The most severe weather is yet/still to come. [Quirk et al.: 143]
(52) He was eventually to end up in the bankruptcy court. [Quirk et al.: 218]

• To may be missed out in the tag. If be to were a unit, be to would have to be retained. (53)
He was to have gone, wasn't he?
• Unlike ought to and have to, the to doesn't have to be retained in be to when the VP that follows to is deleted.

Since deletion of the VP after to appears always to be possible, whether the relevant verb is a modal or not, as in (55), this contrast tells us nothing about the category of the item before to; but what (54) does imply is that the VP deletion is dependent on be in be to, rather than on to, which in turn suggests that be is a modal in its own right.

(54) Bill is to leave at once, and Alice is (to) also. [McCawley]
Bill has to leave at once, and Alice has *(to) also. [McCawley]
(55) We don't save as much money these days as we {ought (to)/used to}. [Quirk et al.: 909]
I've never met a Klingon, and I wouldn't want to. [Pullum, 1982: 185]

• Unlike ought to and have to, there is no phonetic fusion with be to.
Though it is not fully clear whether the examples in (56) are zeugmatic or not, the to-infinitive may be coordinated with a wide range of conjuncts of different categories (Warner 1993). My informants, however, say that they are all right without a zeugmatic reading. If this is the case, the to-infinitive is an independent unit in be to and there has to be a syntactic gap between be and to.
(56)
He was new to the school and to be debagged the following day.
The old man is an idiot and to be pitied.
You are under military discipline and to take orders only from me.
You are a mere private and not to enter this mess without permission.
He was an intelligent boy and soon to eclipse his fellows.
If this is the case, and there is a one-to-one relation between syntax and semantics as is maintained in WG, the to-infinitive is semantically as well as functionally an independent element in the be to construction, and therefore be to is obviously not a syntactic unit, although there has to be some grammatical relation between the two elements (i.e. be and the to-infinitive). It seems that what the to-infinitives in (56) have in common is the function of predication; otherwise they could not be coordinated with the conjuncts before and in (56). This implies that be is an ordinary predicative be, followed by an infinitival clause with to of predicative function. Sentences in (57) and (58) give evidence supporting this predicative function of the infinitive clause, because there is no be found in the examples in (57). Nevertheless, the NPs in (57) and (58) all express predication, although they are NPs as a whole in syntactic terms: (57)
Prudential to float its Egg online bank
Woman to head British Library
Teeth to be farmed
Naked swim teacher to sue
Vestey grandson to stand trial
Hayward Gallery to be refurbished and extended
(58)
Tendulkar to stand down as captain
... the quaint aspects of working class life to be found in many major novelists [Palmer]
This predicative analysis of the to-infinitive is also supported by (59a), where the to-infinitive is part of a small clause with the preposition with. It is a well-attested fact that with heads a small clause expressing predication, as in (59b). Therefore, the to-infinitive functions as a predicative in (59a).
a. ... in a consultation paper agreed with Dublin to be released at the ... [COBUILD]
b. With Peter the referee we might as well not play the match. [Aarts 1992: 42]
All these arguments show that be to is not a lexical unit.

8. A Word Grammar Analysis of the Be To Construction
In this section, I will show how a WG analysis makes it possible to give the same syntactic and semantic structure to the epistemic and non-epistemic meanings of the construction, based on the evidence that be is an instance of a raising verb in both cases. As far as I know, there has been very little, if any, research done on the mapping between the semantic and syntactic structures of core and
marginal modals, including the present construction. In this sense, my approach is quite a valuable one. In these linguistic circumstances, I suggest that the question to be asked is: what must the mapping be between the semantic and syntactic structure of what is represented by the be to construction? I now present an answer framed in WG terms. Before giving a detailed analysis, let us have a quick look at WG in a nutshell. In WG, syntactic structure is based on grammatical relations within a general framework of dependency theory, rather than on constituent structure. Accordingly, a grammatical relation is defined as a dependency relation between a head and its dependents, which include complements and adjuncts (alias modifiers). In this framework, the syntactic head of a sentence, as well as its semantic head, is therefore the finite verb, on which dependents such as the subject, object and so forth depend. To take a very simple example, the grammatical analysis of you are reading a Word Grammar paper is partially shown by the diagram in Figure 2.
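The dependency analysis of this example sentence can be sketched as a small data structure. This is only an illustration in Python, not part of WG itself: the relation names follow the text, while the representation (and the assumption that a depends on reading as its object) is my own.

```python
sentence = "you are reading a Word Grammar paper".split()

# word -> (head, relation); the root of the sentence has no head
dependencies = {
    "are":     (None,      "root"),
    "you":     ("are",     "subject"),
    "reading": ("are",     "sharer"),    # sharer: a kind of complement
    "a":       ("reading", "object"),    # assumption: a, as head of paper, depends on reading
    "paper":   ("a",       "complement"),
    "Grammar": ("paper",   "dependent"),
    "Word":    ("Grammar", "dependent"),
}

def root(deps):
    """Return the word on which all the others ultimately depend."""
    return next(w for w, (head, _) in deps.items() if head is None)

print(root(dependencies))  # -> are
```

Note that there is no phrase or clause node anywhere in this structure: every relation holds directly between two words, with the finite verb as root.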
Figure 2 Syntax and semantics in WG

Each arrow in this diagram shows a dependency between words and points from a head to one of its dependents, but what is most important here is that there are no phrases or clauses in the sense of constituency grammars. Thus in terms of dependency, are is the root of the sentence (represented by a bold arrow), on which you and reading depend as subject and sharer (a kind of complement),8 respectively. In turn, a is the head of paper. Grammar depends on paper, Word depends on Grammar, and so on. Turning to the semantics of this sentence, 'you', the referent of you, is linked as a read-er (i.e. agent) to the semantic concept 'you read a Word Grammar paper', an instance of 'read'.9 The curved (vertical) lines point to the referent of a word. 'Er' and 'ee' are names of semantic relations, or thematic roles. A small triangle representing the
model-instance relation is the same as in earlier figures. A convenient way to diagram the model-instance relation is by using a triangle with its base along the general category (= model) and its apex pointing at the member (= instance), with an extension line linking the two. In Figure 2, then, the diagram shows the relation between the sense of the word read, 'read', and its instance 'you read a WG paper'. Based on the arguments in the preceding sections, the diagrams in (60a) and (60b) offer a view of the syntax and semantics of the be to construction, which has both epistemic and non-epistemic senses. Translated into WG schematic features, this means that in the syntax of both senses it has a raising structure, represented as the main subject functioning both as the subject of be and as that of to or the infinitival verb. Thus the WG configuration posits by and large the same semantic structure for both senses of the construction, which is headed by the semantic concept 'Be'. The detailed analysis of the epistemic structure is given in (60a):
(60) a.
What this diagram shows about the semantic structure is:
• the epistemic sense is an instance of modality;
• the sense of aren't is 'be' (neglecting the negation);
• 'be' is an instance of epistemic modality;
• 'be' has a proposition as a dependent.
Similarly in (60b), containing the non-epistemic sense of be to:
(60) b.
Here 'be' is an instance of non-epistemic modality, because (60b) means that some event is arranged or planned, without any sense of the speaker's judgement on the proposition embedded in the utterance (sentence). In both cases, the meaning (i.e. sense) of 'be' needs as a dependent a proposition which expresses an activity or event, which is the sense of the verb. Abstracting away from the technical markers, the diagram in (60c) represents a WG analysis of the coordinate structure in (56). This diagram schematizes the idea that the same predicative function (a dependency relation) holds both between was and the first conjunct new to the school, enclosed in square brackets, and between was and the second conjunct to be debagged ... at the same time.
(60) c.
9. Conclusion
In this chapter, I have shown that a morphological, syntactic and semantic analysis of be in the be to construction provides evidence for the category of be in this construction proposed here. Namely, be is an instance of a modal verb in terms of morphology and syntax, while the sense of the whole construction is determined by the sense of 'to'. The analysis also explains why be to does not constitute a lexical unit. Finally, the WG account presented here gives the same syntactic and semantic structures to both senses of the construction, reducing the complexity of the mapping between the two levels of structure of the modal-like expression be to.

References

Aarts, Bas (1992), Small Clauses in English. Berlin: Mouton de Gruyter.
Bybee, Joan (1994), The Evolution of Grammar. Chicago: The University of Chicago Press.
Celce-Murcia, Marianne and Larsen-Freeman, Diane (1999), The Grammar Book. Boston, MA: Heinle & Heinle.
Collins Cobuild English Language Dictionary for Advanced Learners (1995), Glasgow: HarperCollins Publishers.
Collins Cobuild English Language Dictionary for Advanced Learners (2001), Glasgow: HarperCollins Publishers.
Collins Cobuild Advanced Learner's English Dictionary (2003), Glasgow: HarperCollins Publishers.
Gazdar, Gerald, Pullum, Geoffrey K. and Sag, Ivan A. (1982), 'Auxiliaries and related phenomena in a restrictive theory of grammar'. Language, 58, 591-638.
Greenbaum, Sidney, Leech, Geoffrey N. and Svartvik, Jan Lars (eds) (1980), Studies in English Linguistics for Randolph Quirk. London: Longman.
Huddleston, Rodney (1976), 'Some theoretical issues in the description of the English verb'. Lingua, 40, 331-383.
— (1980), 'Criteria for auxiliaries and modals', in Greenbaum, Sidney, et al. (eds), Studies in English Linguistics for Randolph Quirk. London: Longman, pp. 65-78.
Hudson, Richard A. (1984), Word Grammar. Oxford: Blackwell.
— (1990), English Word Grammar. Oxford: Blackwell.
— (1996), A Word Grammar Encyclopedia (Version of 7 October 1996). University College London.
— (2005, February 17 - last update), 'An Encyclopedia of English Grammar and Word Grammar' (Word Grammar). Available: www.phon.ucl.ac.uk/home/dick/wg.htm (Accessed: 21 April 2005).
Kreidler, Charles W. (1998), Introducing English Semantics. London: Routledge.
Lampert, Gunther and Lampert, Martina (2000), The Conceptual Structure(s) of Modality. Frankfurt am Main: Peter Lang.
Lyons, John (1977), Semantics. Cambridge: Cambridge University Press.
McCawley, James D. (1988), The Syntactic Phenomena of English. Chicago: University of Chicago Press.
Napoli, Donna Jo (1989), Predication Theory. Cambridge: Cambridge University Press.
Palmer, Frank Robert (1990), Modality and the English Modals (2nd edn). Harlow: Longman.
— (2001), Mood and Modality. Cambridge: Cambridge University Press.
Perkins, Michael R. (1983), Modal Expressions in English. London: Frances Pinter.
Pullum, Geoffrey K. (1982), 'Syncategorematicity and English infinitival to'. Glossa, 16 (2), 181-215.
Pullum, Geoffrey K. and Wilson, Deirdre (1977), 'Autonomous syntax and the analysis of auxiliaries'. Language, 53, 741-88.
Quirk, Randolph, Greenbaum, Sidney, Leech, Geoffrey N. and Svartvik, Jan Lars (1985), A Comprehensive Grammar of the English Language. Harlow: Longman.
Seppanen, A. (1979), 'On the syntactic status of the verb be to in Present-day English'. Anglia, 97, 6-26.
Sugayama, Kensei (1996), 'Semantic structure of eat and its Japanese equivalent taberu: a Word-Grammatic account', in Barbara Lewandowska-Tomaszczyk and Marcel Thelen (eds), Translation and Meaning, Part 4. Maastricht: Universitaire Pers Maastricht, pp. 193-202.
— (1998), 'On be in the be to Construction', in Yuzaburo Murata (ed.), Grammar and Usage in Contemporary English. Tokyo: Taishukan, pp. 169-77.
Warner, Anthony R. (1993), English Auxiliaries. Cambridge: Cambridge University Press.

Notes

1 This is a revised and expanded version of my paper of the same title read at the International Conference 'Modality in Contemporary English' held in Verona, Italy on 6-8 September 2001. I am most grateful for the comments from the audience at the conference. Remaining errors are however entirely my own. The analysis reported here was partially supported by grants from the Daiwa Anglo-Japanese Foundation (Ref: 02/2030). Their support is gratefully acknowledged.
2 Here we shall not take into account similar sentences such as (i), which are considered to have a different grammatical structure from the one we are concerned with in this chapter.
(i) My dream is to visit Florence before I die.
3 The idea of construction here is quite a naive one, different from the technical definition used in Goldberg's Construction Grammar.
4 Palmer (1990: 164), among others, claims that 'is to' is formally a modal verb.
5 Be to can be used in the there construction as in (i).
(i) Regional accents are still acceptable but there is to be a blitz on incorrect grammar. [COBUILD2]
This suggests that be is a raising verb because there is no semantic relation between there and 'be to'. Example (i) is construed as a counter-example to the view that this construction involves subject control as in (ii).
(ii) Mary is [PRO to leave by 5]. [Napoli]
6 The possibility sense is found only in the passive, so there is no active counterpart for These insects are to be found in NSW. [Huddleston 1980: 66; Seppanen]
7 Quite exceptionally, it could be arranged by God or other supernatural beings. Otherwise it cannot be.
8 Sharer is a grammatical relation in Word Grammar.
9 Here we disregard the tense and aspect of the sentence.
5 Linking in Word Grammar

JASPER HOLMES
Abstract

In this chapter I shall develop an account of the linking of syntactic and semantic arguments in the Word Grammar (WG) framework. The WG account is shown to have some of the properties of role-based approaches and some of the properties of class-based approaches.
1. Linking in Word Grammar: The syntax-semantics principle

1.1 Introduction

Any description of linguistic semantics must be able to account for the way in which words and their meanings combine in sentences. Clearly, this presupposes an account of the regular relationships between syntactic and semantic structures: a description of the mechanisms involved in linking. The search for an adequate account of linking has two further motivations: it makes it possible to explain the syntactic argument-taking properties of words (and therefore obviates the need for valency lists or other stipulative representations of subcategorization facts); and it provides a framework for dealing with words whose argument-taking properties vary regularly with the word's meaning (many such cases are treated below and in the work of other writers in the field of lexical semantics, including Copestake and Briscoe 1996; Croft 1990; Goldberg 1995; Lemmens 1998; Levin 1993; Levin and Rappaport Hovav 1995; Pustejovsky 1995; and Pustejovsky and Boguraev 1996). Levin and Rappaport Hovav provide yet another reason to seek an account of argument linking: that it is an intrinsic part of the structure of language. In their introduction, they make the following claim:

To the extent that the semantic role of an argument is determined by the meaning of the verb selecting it, the existence of linking regularities supports the idea that verb meaning is a factor in determining the syntactic structure of sentences. The striking similarities in the linking regularities across languages suggest that they are part of the architecture of language. (1995: 1, my emphasis)
Of course, it is not the meanings of verbs alone that are relevant in determining semantic structure. It should also be clear that I do not share Levin and
Rappaport Hovav's conviction of the similarities across languages in the details of argument linking. However, I accept readily that the fact of argument linking, and the mechanism that controls it, must be shared across languages. The linking regularities that we seek are generalizations over correspondences between syntactic and semantic relationships. In the WG framework (Hudson 1984, 1990, 1994, 2004; Holmes 2005), they take the form of specializations or refinements of the Syntax Semantics Principle (SSP) (Hudson 1990: 132). This is represented schematically in Figure 1 and given in prose in (1). The SSP, as shown here, corresponds to the bijection principle of Lexical Functional Grammar (Bresnan 1982) and to the projection principles and θ-criterion of Government and Binding Theory (GB) (Chomsky 1981: 36, 38).
Figure 1 Syntax Semantics Principle

(1) Syntax Semantics Principle (SSP): A word's dependent refers to an associate of its sense.1

Specific linking rules for specific relationships link classes of syntactic dependency with classes of semantic associate. These classes gather together the relevant syntactic and semantic properties. By way of exemplification, I begin with the structure associated with the indirect object relationship. I go on to discuss the properties of objects and subjects.

1.2 Indirect objects

Figure 2 shows some of the syntactic and semantic structure that needs to be associated lexically with GIVE: the verb has a subject (s in the diagram), an object (o in the diagram) and an indirect object (io in the diagram), all of which are nouns. The sense of the verb, Giving, is an event with an 'er' (the referent of the subject; the properties of 'ers' and 'ees' are discussed shortly), an 'ee' (the referent of the object), a recipient (the referent of the indirect object) and a result,
an example of Having which shares its arguments with its parent. The giver has agentive control over the event, being in possession of the givee beforehand and willing the transfer of possession. The givee and the recipient are more passive participants: the former undergoes a change of possession, but nothing else; the latter simply takes possession of the givee. Being a haver presupposes other properties (centrally humanity), but those are not shown here.
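The lexical facts just listed for GIVE can be summed up as a small linking table. This is a sketch only: the dictionary layout and role spellings are my own shorthand for the structure shown in Figure 2, not a WG formalism.

```python
# Lexical entry for GIVE, pairing each syntactic dependency
# with the semantic role its referent fills in the sense 'Giving'.
GIVE = {
    "sense": "Giving",
    "result": "Having",  # the result shares its arguments with 'Giving'
    "linking": {
        "subject": "er",                 # the giver: agentive control over the event
        "object": "ee",                  # the givee: undergoes a change of possession
        "indirect object": "recipient",  # simply takes possession of the givee
    },
}

def role_of(dependency, lexeme=GIVE):
    """The SSP in miniature: a word's dependent refers to an associate of its sense."""
    return lexeme["linking"][dependency]

print(role_of("indirect object"))  # -> recipient
```

The point of the table is that each arrow of the syntactic structure is matched by a named semantic relation, which is what the specializations of the SSP generalize over.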
Figure 2 GIVE

Clearly, not all this information is specific to GIVE. Volitional involvement, control and instigation are semantic properties associated with many other subject relationships, even those of verbs that have no (indirect) objects; the passive role and affectedness of the object also apply in many other cases; and many other verbs can appear with indirect objects, with similar semantic properties. Levin provides the following two groups of verbs permitting indirect objects (1993: 45-49), distinguished from each other by semantic properties (those in (2) alternate, according to Levin's analysis, with constructions with the preposition TO, those in (3) with constructions with FOR). The question of alternation, as well as the difference between the two groups, is dealt with shortly.
(2) ADVANCE, ALLOCATE, ALLOT, ASK, ASSIGN, AWARD, BARGE, BASH, BAT, BEQUEATH, BOUNCE, BRING, BUNT, BUS, CABLE, CARRY, CART, CATAPULT, CEDE, CHUCK, CITE, CONCEDE,
DRAG, DRIVE, E-MAIL, EXTEND, FAX, FEED, FERRY, FLICK, FLING, FLIP, FLOAT, FLY, FORWARD, GIVE, GRANT, GUARANTEE, HAND, HAUL, HEAVE, HEFT, HIT, HOIST, HURL, ISSUE, KICK, LEASE, LEAVE, LEND, LOAN, LOB, LUG, MAIL, MODEM, NETMAIL, OFFER, OWE, PASS, PAY, PEDDLE, PHONE, PITCH, POSE, POST, PREACH, PROMISE, PULL, PUNT, PUSH, QUOTE, RADIO, READ, REFUND, RELAY, RENDER, RENT, REPAY, ROLL, ROW, SATELLITE, SCHLEP, SELL, SEMAPHORE, SEND, SERVE, SHIP, SHOOT, SHOVE, SHOW, SHUTTLE, SIGN, SIGNAL, SLAM, SLAP, SLIDE, SLING, SLIP, SMUGGLE, SNEAK, TAKE, TEACH, TELECAST, TELEGRAPH, TELEPHONE, TELEX, TELL, THROW, TIP, TOSS, TOTE, TOW, TRADE, TRUCK, TUG, VOTE, WHEEL, WILL, WIRE, WIRELESS, WRITE.

(3) ARRANGE, ASSEMBLE, BAKE, BLEND, BLOW, BOIL, BOOK, BREW, BUILD, BUY, CALL, CARVE, CASH, CAST, CATCH, CHARTER, CHISEL, CHOOSE, CHURN, CLEAN, CLEAR, COMPILE, COOK, CROCHET, CUT, DANCE, DESIGN, DEVELOP, DIG, DRAW, EARN, EMBROIDER, FASHION, FETCH, FIND, FIX, FOLD, FORGE, FRY, GAIN, GATHER, GET, GRILL, GRIND, GROW, HACK, HAMMER, HARDBOIL, HATCH, HIRE, HUM, IRON, KEEP, KNIT, LEASE, LEAVE, LIGHT, MAKE, MINT, MIX, MOLD, ORDER, PAINT, PHONE, PICK, PLAY, PLUCK, POACH, POUND, POUR, PREPARE, PROCURE, PULL, REACH, RECITE, RENT, RESERVE, ROAST, ROLL, RUN, SAVE, SCRAMBLE, SCULPT, SECURE, SET, SEW, SHAPE, SHOOT, SING, SLAUGHTER, SOFTBOIL, SPIN, STEAL, STITCH, TOAST, TOSS, VOTE, WASH, WEAVE, WHISTLE, WHITTLE, WIN, WRITE.

The set of verbs that can take an indirect object, of either kind, is in principle unlimited in size, since it is possible to extend it in one of two ways. First, membership is open to new verbs which refer to appropriate activities:

(4) We radioed/phoned/faxed/emailed/texted/SMSed them the news.
(5) We posted/mailed/couriered/FedExed™ them the manuscript.
(6) Boil/coddle/microwave/Breville™ me an egg.
Second, and even more tellingly, existing verbs can be used with indirect objects, with novel meanings contributed by the semantics of the indirect object:

(7) The colonel waggled her his bid with his ears.
(8) Dust me the chops with flour.
Examples like (7) and (8) are acceptable to the extent that the actions they profile can be construed as having the appropriate semantic properties. For example a bottle of beer can be construed as having been prepared for someone if it has been opened for them to drink from, but a door is not construed as prepared when it has been opened for someone to pass through:
(9) Open me a bottle of pils/*the door.
It is clear from this, and from the fact, noted by Levin (1993: 4-5) with respect to the middle construction, that speakers make robust judgements about the meanings of unfamiliar verbs in constructions on the basis of the construction's meaning (see also (10)), that the meaning of the construction must be represented in a schematic form in the mind of the language user.

(10) Flense me a whale.
This schematic representation must pair the semantic properties of the construction with its syntactic and formal (phonological/graphological) properties. Goldberg (2002) provides a powerful further argument for treating constructions as symbolic units in this way. This argument, which she traces to Chomsky (1970) and to Williams (1991) (where it is called the 'target syntax argument'), holds that where the properties of supposedly derived structures (here the creative indirect objects) match those of non-derived ones (here the lexically selected indirect objects), the generalization over the two sorts of structure is most effectively treated as an argument structure construction in its own right. In English, which does not have a rich inflectional morphology, the formal properties of the indirect object relationship are limited to the fact that personal pronouns in the indirect object position appear in their default form ({me} not {I}), which is also true of direct objects. Other languages show more variation, marking the presence of an indirect object in the form of the verb, as in (11) from Indonesian (Shibatani 1996: 171), or assigning different case to nouns in indirect object position than to those functioning as direct objects, as in German (12).

(11) Saya membunuh-kan Ana lipas.
     I kill-BEN [name] centipede
     'I killed a centipede for Ana.'

(12) a. Gib ihr/*sie Blumen.
        give her flowers
        'Give her flowers.'
     b. Küß *ihr/sie.
        kiss her
        'Kiss her.'
Some syntactic properties of the indirect object (in English) are given by Hudson (1992). These include the possibility of merger with the subject in passive constructions (13), the obligatoriness of direct objects in indirect object constructions (14) and its position immediately following the verb (15).

(13) She was given some flowers.
(14) We gave (her) *(some flowers).
(15) a. We gave her some flowers/*some flowers her.
     b. We sent her some flowers over/her over some flowers/*some flowers her over/*over her some flowers.
The semantic property common to all indirect objects is that they refer to havers: in the case of the verbs taking 'dative' indirect objects in (2), the result of the verb's sense is that the referent of the indirect object comes into possession of something; in the case of those taking 'benefactive' indirect objects in (3), the verb profiles an act of creating or preparing something intended to be given to the referent of the indirect object.
Figure 3 Some verbs have indirect objects

Figure 3 shows the various properties associated with indirect objects. First, the diagram shows that indirect objects are nouns, and that it is verbs, and more particularly verbs with objects, that have indirect objects: Ditransitive, the category of verbs with indirect objects, isa Transitive, the category of verbs with direct objects. This is enough by itself to represent the fact that the direct object is obligatory with indirect objects (14), but the object relationship is nevertheless also shown in the ditransitive structure, since it appears in the word order rule (indirect objects precede objects). The referent of the object also appears in the semantic structure, along with that of the indirect object, since without it the semantic structure cannot be interpreted. I show the two referents as coarguments of the result of the verb's sense, though the semantics is worked out more clearly in the discussion of Figure 4. The fact that indirect objects can merge with subjects in passive constructions is dealt with in the following section. Indirect objects may have one of two slightly different semantic structures, each associated with a separate category of ditransitive verbs. In both, the referents of the two dependents are 'er' and 'ee' of a Having, but the role of that Having differs somewhat between the two. The two structures are given in Figure 4.
Figure 4 Two kinds of indirect object

Ditransitive/1 is exemplified in (16):

(16)
We baked her a cake.
The sense of the verb isa Making, and its result (therefore) isa Being (is a state), and the argument of that Being is the referent of the direct object: baking a cake results in that cake's existence, baking a potato results in that potato's being ready. The Having that connects the referents of the two arguments is the purpose of the verb's sense: the purpose of the baking of the cake is that it should belong to her (the referent of the indirect object). This concept is connected to the sense of the verb by the beneficiary relationship (labelled ben/fy). Ditransitive/2 (17) has as its sense a Giving event, which straightforwardly has as its result the Having that connects the referents of the two arguments. The referent of the indirect object is connected to the sense of the verb by the recipient relationship.

(17)
We passed her a parcel.
Once these two semantic structures are established, they can be used in the treatment of the relationship between the indirect object construction and constructions with the prepositions TO and FOR. Simply, TO has the same sense as Ditransitive/2 and FOR the same as Ditransitive/1 (with some differences: see (18)). This synonymy can, though it need not, be treated as a chance occurrence: no explanation is necessary for the relationship between constructions with indirect objects and those with TO. The case of FOR and the difference seen in (18) certainly supports the idea that the two constructions converge on a single meaning by chance, since the two meanings are in fact different. The use of the indirect object to refer to the beneficiary of an act of preparation is only possible where the prepared item is prepared so it can be owned (or consumed) by the beneficiary; this constraint does not apply to beneficiary FOR.

(18)
a. Open a bottle of pils/the door for me.
b. Open me a bottle of pils/*the door.
The pattern in Figure 3 (and Figure 4) represents a symbolic relationship. Lexical structures include specifications of the meanings of individual lexemes and of classes of lexemes defined by common properties of all sorts. A lexeme has a form and a range of syntactic properties which identify the syntactic pole of the symbolic relationship; it also has a sense, which provides the connection to a range of semantic properties. Similarly, inflectional and other classes of lexemes share formal, syntactic and semantic properties. And similarly, syntactic dependencies are associated with a range of formal and syntactic properties (chiefly constraints on the elements at either end of the dependency) and semantic properties (represented in the semantic relationship between the meanings of the two elements). Figure 5 shows, by way of an example, partial lexical structures for the lexeme OPEN, the inflectional category Past and the indirect object relationship.
Figure 5 Schematic representation of OPEN, Past, indirect object

The pattern in Figure 3 (and Figure 4) is a generalization over verbs taking indirect objects. A verb appearing in a construction with an indirect object instantiates the more general model. The model represents the properties of the construction in the same way as a lexeme represents the properties of a particular word. In the case of a novel use of the construction (19), the fact that the sentence conforms to the formal properties entails that it also conforms to the semantic properties of the construction. In fact the construction can also be used to constrain the set of verbs that may take an indirect object, since only
those verbs that can conform to the properties of the construction can appear in it: *Skate me a half-pipe/*Run me a mile, etc.

(19) Waggle me your bid.
Examples like (19) represent cases of multiple inheritance: the verb instantiates both WAGGLE (from which it gets its form and much of its meaning) and Ditransitive (from which it gets the indirect object and concomitant semantic properties). This is the same mechanism that mediates verbal inflection: the past tense of a verb inherits from the verb's lexeme and from the category Past at the same time. Because of this possibility, it is not necessary to include all of the structure in the diagrams in the lexical specification even of a verb like GIVE, since (when it is used ditransitively) the relevant properties follow from the general properties of ditransitive verbs. These verbs, whose use with an indirect object seems unexceptional compared to those like (19), probably are lexically associated with the indirect object construction. GIVE, for example, might be separated into two sub-types, one of which isa Ditransitive, and the other of which takes TO as a complement. By contrast, a verb like ACCORD, that never appears without an indirect object, inherits all the properties of Ditransitive. Figure 6 shows a part of the lexical structure of ACCORD and GIVE. All cases of ACCORD have indirect objects, so the whole category is subsumed under Ditransitive. GIVE, by contrast, is divided into two subcategories: one which isa Ditransitive, and one which isn't (this category has TO as a complement). The diagram also shows that creative use of the indirect object as in (19) can be mediated by a contextual (= non-lexical) specialization of the relevant lexeme, that inherits also from the 'inflectional' category Ditransitive.
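The parallel drawn here between creative indirect objects and verbal inflection, both mediated by multiple inheritance, can be mimicked with Python's own class inheritance. This is only an analogy, assuming that WG's default inheritance behaves roughly like attribute lookup, with more specific classes overriding inherited defaults; the class names echo the categories in the text.

```python
class Transitive:
    has_object = True

class Ditransitive(Transitive):  # Ditransitive isa Transitive
    has_indirect_object = True
    result_isa = "Having"        # the semantics that comes with the construction

class WAGGLE:
    form = "waggle"
    sense = "Waggling"

class WaggleDitransitive(WAGGLE, Ditransitive):
    """Contextual specialization for 'Waggle me your bid': it inherits
    form and sense from WAGGLE and argument structure from Ditransitive."""

w = WaggleDitransitive()
print(w.form, w.has_indirect_object, w.has_object)  # waggle True True
```

A verb like ACCORD would simply be a subclass of Ditransitive alone, since all its uses carry the indirect object, while GIVE would split into a ditransitive subclass and one taking TO.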
Figure 6 ACCORD, GIVE, Waggle me your bid
Some of the features of the ditransitive model are nevertheless often repeated or overridden in the structures associated with verbs that are specializations of it. For example, Lending and Loaning are special in that their result is temporary, Donating because the recipient is a charitable organization, and Denying in that the intention is that the recipient should not receive from the givee. These specializations of/divergences from the model must be represented in the individual lexical structures of the verbs concerned. A classification hierarchy consisting of classes defined by properties that distinguish them from other categories is a commonplace in many approaches to knowledge representation (and elsewhere). In linguistics the idea is found in the work of structuralist semanticists (Weisgerber 1927; Trier 1931; Cruse 1986), among others.

1.3 Objects

Biber et al. (1999: 126-8) give a number of syntactic properties for English objects, as follows (the properties assigned to objects and to subjects are all taken from Biber et al.; some details may be disputed, but the general point remains the same):

a. found with transitive verbs only
b. is characteristically an NP, but may be a nominal clause
c. is in accusative case (when a pronoun)
d. typically follows immediately after the VP (though there may be an intervening indirect object)2
e. may correspond to the subject in passive paraphrases

• The first two syntactic properties refer to the classes of the words at either end of the object relationship: some verbs (the transitive verbs) lexically select an object; the objects themselves are generally nouns.
• The third property concerns the form of the object: when it is a pronoun, it takes the 'accusative' form (what I have above called the default form).
• The fourth property concerns its relative position in the sentence: objects generally follow their parents, and only a limited set of other dependents of the parent may intervene (any number of predependents of the object may intervene) (20). Biber et al. note that indirect objects may come between the object and the parent (21); this possibility is also open to particles (22).

(20) Philly filleted (skillfully) the fish.
(21) We gave her a new knife.
(22) She threw away the old one.

• The final syntactic property refers to passive constructions. Under the WG analysis (see Hudson 1990: 336-53), the subject of a passive verb is at the same time its object (23) (or indirect object (24)), the merger of dependents being licensed by the passive construction itself.
(23) The camel hair coat was given to Cathy.
(24) Cathy was given the camel hair coat.
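The merger of dependents in passives described here can be sketched very simply: one word bears two dependency labels at once. The helper function and its return format are hypothetical shorthand of my own, not WG notation.

```python
def analyse_passive(subject, ditransitive=False):
    """Return the dependency labels borne by the subject of a passive verb:
    the subject is merged with the object, or, for ditransitives,
    with the indirect object."""
    merged_with = "indirect object" if ditransitive else "object"
    return {"subject": subject, merged_with: subject}

print(analyse_passive("the camel hair coat"))       # (23): subject doubles as object
print(analyse_passive("Cathy", ditransitive=True))  # (24): subject doubles as indirect object
```

The point is that the passive construction itself licenses the merger, so no separate lexical statement is needed for each verb.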
Figure 7 shows how these syntactic properties can be represented in a lexical structure.
Figure 7 Syntactic properties of objects

The parent in an object relationship isa Verb and the dependent isa Noun. In this way, the object relationship defines a class of transitive verbs (verbs that have objects). Verbs that only appear in transitive constructions inherit all properties from this class (DEVOUR isa Transitive, just as ACCORD isa Ditransitive). The word order properties are represented by the next relationship: the form of the dependent is the next of that of the parent. The diagram also shows the category Ditransitive (see Figure 4), where the word order properties are somewhat different (the form of the object is the next of that of the indirect object). Also represented in the diagram is the class of passive verbs (the category Passive). These verbs are defined by their formal properties: the form of a passive verb consists of its base plus a suitable ending (not shown). There are two classes of passive verb: one, which also isa Transitive, in which the subject is merged with the object, and one, which also isa Ditransitive, in which the subject is merged with the indirect object. The full lexical structure of the object relationship must also include its semantic properties. In line with the approach outlined above for indirect
94
WORD GRAMMAR: PERSPECTIVES ON LANGUAGE STRUCTURE
objects, the semantic properties of the object are related to its syntax through a specialization of the SSP. Biber et al. (1999: 126-8) also identify a range of possible semantic relationships that correspond with the object relationship (see a-g), and the lexical semantic representation of the object relationship should be general over all of these. a. ected (bake a potato) b. resultant (bake a cake) c. locative (swim the Ohio) d. instrumental (kick your feet) e. measure (weigh 100 tons) f. cognate object (laugh a sincere laugh) g. eventive (have a snooze) Properties a, b and d can be quite straightforwardly collected under a general treatment, in terms of their force-dynamic properties: in each case, the sense of the verb has a result which is a further event having the referent of the object as an argument (when you bake a potato, the potato becomes soft and edible; when you bake a cake, the cake comes into existence; when you kick your feet, the feet move). This is represented in Figure 8: a verb's object refers to the 'er' of the result of the verb's sense. This two-stage relationship is further represented in a direct relationship between the verb's sense and the referent of its object, labelled 'ee'.
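The two-stage relationship just described can be sketched as a tiny semantic network. The relations result, 'er' and 'ee' come from the text; the Python representation itself (the Concept class and the ee helper) is my own illustration, not part of WG notation:

```python
# A minimal sketch of the WG analysis of affected/effected objects:
# the sense of the verb has a result event, and the referent of the
# object is the 'er' of that result. The direct 'ee' relationship is
# then derivable as a shortcut: ee(sense) = er(result(sense)).
# Node and helper names are illustrative assumptions.

class Concept:
    def __init__(self, name, **relations):
        self.name = name
        self.relations = dict(relations)

    def get(self, relation):
        return self.relations.get(relation)

def ee(sense):
    """Derive the 'ee' of a sense as the 'er' of its result."""
    result = sense.get('result')
    return result.get('er') if result else None

# 'bake a potato': the result is a becoming-soft event whose 'er'
# is the potato (the referent of the object).
potato = Concept('potato')
becoming_soft = Concept('becoming-soft', er=potato)
baking = Concept('baking', result=becoming_soft)

print(ee(baking).name)  # the potato is the 'ee' of Baking
```

A sense with no result (e.g. an intransitive process) simply yields no 'ee', mirroring the fact that such verbs take no object under this analysis.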
Figure 8 Affected/effected objects

Notice that a similar conflation of a two-stage relationship into a direct one was used above in the semantic structure of indirect objects. In fact, when a verb has an indirect object, the recipient relationship overrides the 'ee' relationship in being assigned to the 'er' of the result, in much the same way as the word order properties of the indirect object override those of the object. This is
determined in the semantics by the nature of the resulting state: where this state isa Being, its 'er' is the 'ee' of the verb's sense; where it isa Having, its 'er' is the recipient of the verb's sense, rather than its 'ee', and the 'ee' of the verb's sense is the same as the 'ee' of the result (see Figure 4 above). 'Locational' objects, as in c, do not refer to affected arguments, but to parts of a path. The example in c defines the beginning and end of the path (on opposite sides of the river), but other examples may profile the beginning (25a), middle (25b) or end (25c) of the path.

(25) a. The express jumped the rails. (from Biber et al. (1999: 127))
     b. nny vaulted the horse.
     c. Elly entered the room.
The set of verbs that can appear with an object of this kind is (naturally) limited to those that can refer to a motion event, and in this sense the 'locative' object is lexically selected by its parent. Notice also that the verb (often) determines which part or parts of the path may be profiled by such an object. Because of this, these arguments must appear in the lexical structures of quite specific categories (at the level of the lexeme or just above). The relevant categories are subsumed under Transitive, since the syntactic properties are the same as those of the affected/effected objects, but it is arguable whether they need to be collected under a category 'locative object verb'. This category is justified to the extent that generalizations can be made over the relevant constructions. There seems to be little semantically in common between locative objects and affected/effected objects, though there is some relationship. For example, Dowty's (1991) incremental theme is a property of both kinds of object: the event in both cases is bounded by the theme:

(26) a. Barry baked a potato/*potatoes in five minutes.
     b. Sammy swam the Ohio/*rivers in five minutes.
When the sense of the verb is an unbounded event, a measure expression can be used to define a bounded path: Sammy swam five miles. It is not entirely clear that arguments of this sort are indeed objects. Some certainly are not; in (27) the object of pushed is the pea and not the measure expression.

(27) Evans pushed the pea five miles with his nose.
The 'measure' objects are also confined to a limited class of verbs, by which they are semantically selected (weigh five tons, measure five furlongs). They also have little in common semantically with the other types of object, since their semantics is so heavily constrained by the verb. 'Cognate' objects (28, 29) are also associated with a very small class of verbs (Levin gives 47 (1993: 95-6), out of a total of 3107 verbs). They have something in common semantically with effected objects, but the semantics is constrained by the verb, which may also go so far as to select a particular lexeme. Levin notes:

Most verbs that take cognate objects do not take a wide range of objects. Often they only permit a cognate object, although some verbs will take as object anything that is a hyponym of a cognate object (1993: 96).
The verb and its object refer jointly to a performance of some kind.

(28) She sang a sweet song.
(29) Deirdre died a slow and painful death.
'Eventive' objects are confined to an even smaller class, the 'light' verbs. In these cases the event structure is determined by the verb, but the details of the semantics are supplied by the noun. In light verb constructions with HAVE, the object refers to an event (have a bath/meal/billiards match); light DO, in contrast, refers to an affective/effective event, the precise nature of which is determined by the semantics of the (affected) object:

(30) a. I'll do the beds. ['dig them/make them up']
     b. I'll do the potatoes. ['peel them']
     c. I'll do the cake. ['bake it']
Figure 9 collects together the various semantic properties of objects. The category Transitive is the same as appeared in Figure 7: it is the locus of the syntactic properties of objects (these are represented schematically here):

• The majority of objects are subsumed under the affective/effective category. In the diagram this is represented by the semantic concept Making.
• I show two subcategories. Making' (as in The cold made our lips blue) and Creating are schematic for the senses of the affective object verbs and the effective object verbs respectively.
• Making is schematic for all affective/effective events, and as such provides a sense for 'light' DO (shown as DO/light in the diagram). 'Light' HAVE is shown as a simple transitive verb that corefers with its object (the shared referent being an event).
• The set of verbs taking 'locative' objects is represented by a class having Moving' as its sense. This concept, which is a subcategory of ordinary Moving, subsumes cases of moving with respect to some landmark. The landmark appears in the semantic structure (labelled lm).
• The types of Moving' are classified here according to whether the landmark is construed as the middle of a path (Passing), an obstacle (Traversing), an end point (Entering) or a source (Leaving).
• Finally, the diagram shows that some nouns which are objects refer to Measurements, and they define a property of the er of their parent's sense.

The lexical structure given in Figure 9 integrates the syntactic properties identified above (Figure 7) with the semantic properties of the various types of object. Figure 9 is schematic for all 'transitive constructions' in that verbs with objects inherit (some of) their properties from the category Transitive (usually
by way of one of the subclasses) and nouns that are objects inherit some of their properties from the category that fills the relevant slot in the structure (perhaps also by way of one of its subclasses: the diagram does not show inheritance relationships between the object noun in the most general case and those in the subcases, but these relationships are nevertheless implicit in the inheritance structure).

Figure 9 Semantic properties of objects

1.4 Subjects

Biber et al. (1999: 123-5) give the following syntactic properties for English subjects:

a. found with all types of verbs
b. is characteristically an NP, but may be a nominal clause
c. is in nominative case (when a pronoun and in a finite clause)
d. characteristically precedes the VP, except in questions where it follows, except where the subject is a Wh word itself
e. determines the form of present tense verbs (and of past tense BE)
f. may correspond to a by phrase in passive paraphrases
• Again, the first two syntactic properties concern the classes of words that participate in the relationship: verbs have subjects, which are generally nouns. Any verb may have a subject, so the class of 'subject verbs' is less constrained than the class of transitive verbs. It is perhaps for this reason that the semantic roles played by subjects are so much more diverse (see below). All tensed verbs have subjects, so the class Tensed is shown as a subset of the subject verbs. (See Figure 10.)
• The 'nominative' form of personal pronouns consists of the five words I, SHE, HE, WE and THEY, which are subcases of the relevant pronouns that are used only in subject position. (See Figure 10.)
Figure 10 Some syntactic properties of subjects

• The word order properties of subjects are slightly more complicated. Generally the subject precedes its parent, but some subjects follow their parents, and in many of these cases the referent of the verb is questioned (the construction forms a yes/no question); these cases are represented in the subclass of subject verbs Inverted. The word order properties of Wh questions are determined in part by the lexical properties of the category Wh (schematic over Wh words). This category is always the extractee (x< in the diagram) of its parent and so precedes it. Where the Wh word is not the subject of the verb, the verb and subject are also inverted (the complement of Wh isa Inverted).
Figure 11 Word order properties of subjects
• Subject-verb agreement is a property of the categories participating in the subject relationship. Present verbs (Present is a subcase of Tensed) must have the same agreement value as their subjects. Those with the agreement singular have a form consisting of their base plus an {s}. Notice that this requires that the pronouns I and YOU have agreement plural (or have no agreement value) (I like / she likes). Subject-verb agreement is dealt with at length by Hudson (1999).
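The agreement pattern just described can be sketched in a few lines. The analysis (singular agreement selects base plus {s}; I and YOU pattern with the plurals) is from the text; the table and function names are illustrative assumptions:

```python
# A toy sketch of WG subject-verb agreement: a present-tense verb
# shares its subject's agreement value, and the 'singular' value
# selects the base-plus-{s} form. Note that I (and YOU) are treated
# as plural for agreement purposes, as the text requires.
# The pronoun table and function name are illustrative assumptions.

PRONOUN_AGREEMENT = {
    'I': 'plural',      # I like, not *I likes
    'you': 'plural',
    'she': 'singular',
    'he': 'singular',
    'they': 'plural',
}

def present_form(base, subject_pronoun):
    """Return the form of a present-tense verb given its subject."""
    agreement = PRONOUN_AGREEMENT[subject_pronoun]
    return base + 's' if agreement == 'singular' else base

print(present_form('like', 'I'))    # like
print(present_form('like', 'she'))  # likes
```

Treating I and YOU as (default) plural, rather than as exceptions to a singular rule, is exactly the kind of default-inheritance economy the text attributes to WG.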
Figure 12 Subject-verb agreement

• The final syntactic property is more properly semantic in WG: just as there is overlap between the semantics of the indirect object relationship and that of the preposition TO, so there is considerable overlap between the semantics of the subject relationship and that of the preposition BY.

The semantic properties of subjects are explored more fully in the following section, but some general remarks can be made here. Biber et al. (1999: 123-5) give the following possible semantic roles for subjects:

a. agent/willful initiator (She kicked a bottle cap at him)
b. external causer (The wind blew the plane off course)
c. instrument (Tactics can win you these games)
d. with stative verbs:
   • recipient (I know it, She could smell petrol)
   • source (You smell funny)
   • positioner (She sat against a wall)
e. affected (It broke, An escapee drowned)
f. local (The first floor contains sculptures)
g. eventive (A post mortem examination will take place)
h. empty (It rained)

• The first three roles (a-c) can be collected together by virtue of the force-dynamic properties they share: agents, causes and instruments all precede the event in the force-dynamic chain.
• I argue below that affected subjects (e) are similarly controlled by the force-dynamic structures of the verbs that take them.
• The semantic roles played by the subjects of stative verbs are chiefly determined by the lexical (semantic) structure of the individual lexeme, though some semantic classification is possible (see Figure 13).
• 'Local' and 'eventive' subjects are controlled by the lexical structures of the verbs that take them.
• Since every verb can have a subject, the number of different semantic roles open to the referents of subjects is limited only by the number of different event types denoted by verbs. This can be seen particularly clearly in the case of 'dummy' subjects (h).
Figure 13 Semantic properties of subjects
Figure 13 collects together the possible semantic roles associated with the subject relationship, and relates them symbolically to the syntactic properties identified above (given schematically in the diagram). The various semantic types of subject are glossed by the 'er' relationship introduced above. A full account of this relationship and of the 'ee' relationship linked with objects is provided in the following section. Four kinds of stative predicate are shown, covering the three possibilities under (d) and the 'local' subjects in (f). Some of these semantic classes are dealt with in more detail in following chapters; each makes different requirements of its 'er'. A class of 'eventive verbs' is also included; these corefer with their subjects.

1.5 Three linking rules

On the basis of the above discussion we can construct general linking rules for the three relationships subject, object and indirect object. These linking rules link sets of syntactic properties, associated with the relevant dependency class, with sets of semantic properties, associated with classes of semantic association. The linking rule for subjects is given in Figure 14 and in prose in (31). The rule pairs the syntactic relationship subject with the semantic relationship 'er'. The former gathers together the syntactic properties of subjects (Figure 10-Figure 12) and the latter the semantic properties associated with them (Figure 13).
Figure 14 Subject linking rule

(31) A word's subject refers to the 'er' of its sense.
A linking rule for objects is given in Figure 15 and in (32). The rule pairs the syntactic relationship object with the semantic relationship 'ee' (this is the pattern given above for DO/light, which is followed by most transitive verbs). The former gathers together the syntactic properties of objects (Figure 7) and the latter the semantic properties associated with them (Figure 8).
Figure 15 Object linking rule

(32) A word's object refers to the 'ee' of its sense.
Finally, abstracting away from Figure 4 (and using 'beneficiary' as schematic over recipients and beneficiaries) gives us the following linking rule for indirect objects. This rule gathers together, in the two associations indirect object and beneficiary respectively, the syntactic and semantic properties of indirect object constructions, as identified above.
Figure 16 Indirect object linking rule

(33) A word's indirect object refers to the beneficiary of its sense.
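The three linking rules (31)-(33) each pair one syntactic dependency with one semantic relationship, so they can be summarized as a simple table. The pairings are from the text; the dictionary encoding and lookup function are my own illustration:

```python
# The three WG linking rules as a mapping from syntactic dependency
# to semantic relationship. The dict and function are illustrative;
# in WG proper these are symbolic links in a network, each gathering
# a bundle of syntactic and semantic properties.

LINKING_RULES = {
    'subject': 'er',                   # (31) subject refers to the 'er' of the sense
    'object': 'ee',                    # (32) object refers to the 'ee' of the sense
    'indirect object': 'beneficiary',  # (33) indirect object -> beneficiary
}

def semantic_role(dependency):
    """Return the semantic relationship linked to a syntactic dependency."""
    return LINKING_RULES[dependency]

print(semantic_role('object'))  # ee
```

The flat table deliberately mirrors the text's point that each rule sits at the highest relevant level of the hierarchy; more specific verbs inherit and specialize these pairings rather than restating them.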
Now, semantic relationships like recipient or beneficiary are quite straightforwardly understood in terms of more complex semantic structures: if a concept C has a result which is an example of Having, then that result's first
argument is the recipient of C. The relationships 'er' and 'ee', however, which are linked to subject and object respectively, are less straightforward, so the status of the subject and object linking rules is at least open to question. In the second part of this chapter, I address the outstanding issues, providing more detailed linking rules for subjects and objects.

2. The Event Type Hierarchy: The framework; event types; roles and relations
2.1 The framework

In the first part of this chapter I sketched a linking mechanism within the WG framework, based on generalizations over grammatical relations (specializations of the Syntax Semantics Principle). The details are fleshed out in this part. The linking regularities presented above consist of symbolic structures which link specific syntactic relationships (subject, object, indirect object, etc.) with specific semantic relationships ('er', 'ee', recipient, etc.). The syntactic relationships are identified by a set of word-level (syntactic, morphological, phonological, etc.) properties which, by default, are inherited by all cases of the dependency: unless otherwise specified, subjects precede their parents and determine their form, objects follow and permit no intervening codependents, and so on. The semantic relationships are identified by a set of concept-level (thematic, force-dynamic, etc.) properties, which likewise constitute the default model for the relationship. The syntactic and semantic properties taken together constitute the lexical structure of the relevant relationship, and can be seen as a gestalt.

As I argue above, semantic relationships like recipient and result are quite straightforwardly understood in terms of more complex semantic structures. The relationships 'er' and 'ee', however, which are linked to subject and object respectively, are less straightforward. An account is provided here in which the properties of 'ers' and 'ees' are defined by a hierarchy of event types (notice that an event type (Having) played a role in the definition of result and recipient). Since most of the event types are defined by a single exceptional argument relationship, and since the linking regularities are still stated in terms of single roles, the WG approach outlined here combines the properties of role-based and class-based approaches.
The linking regularities presented above are generalizations over the linking properties of all subjects, objects, etc. While each syntactic dependency always maps onto the same semantic argument, the exact nature of the role played by that argument is determined by the wider conceptual structure associated with the parent's sense (as represented partly by its event type). The distinction between words and constructions is an emergent property of the network structure. The categories in the event type hierarchy are defined by their semantic (conceptual) properties, including force-dynamic properties (but not including aspectual properties: see Holmes (2005: 176-211)). Many of the event types function as the senses of words, though some do not. The categories support a
number of associations (more at the more specific levels), including those mentioned in the linking regularities. The roles of those arguments are defined by the rest of the conceptual structure associated with the lexical category.
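Default inheritance, which underlies everything in this framework (properties hold "unless otherwise specified"), can be sketched as a lookup that walks up the isa chain and stops at the nearest value. The mechanism is as the text describes; the class and example categories are my own illustration:

```python
# A minimal sketch of WG default inheritance: a category inherits
# each property from its nearest ancestor on the isa chain unless it
# supplies an overriding value itself. The Category class is an
# illustrative assumption, not WG notation.

class Category:
    def __init__(self, name, isa=None, **properties):
        self.name = name
        self.isa = isa                 # single parent on the isa chain
        self.properties = properties   # local (possibly overriding) values

    def inherit(self, prop):
        """Walk up the isa chain and return the nearest value of prop."""
        node = self
        while node is not None:
            if prop in node.properties:
                return node.properties[prop]
            node = node.isa
        return None

predicate = Category('Predicate', arguments=('er',))
event = Category('Event', isa=predicate)
affecting = Category('Affecting', isa=event, arguments=('er', 'ee'))

print(event.inherit('arguments'))      # inherited from Predicate
print(affecting.inherit('arguments'))  # overridden locally
```

The "nearest ancestor wins" search is what lets a specific category (Affecting) override the default argument set it would otherwise inherit from Predicate.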
2.2 Event types

Figure 17 shows the event type hierarchy. The various types are shown, but most of their properties are not (they are given in the following diagrams). The category at the top of the hierarchy is labelled Predicate; this is not an entirely satisfactory name for this concept, but it has the benefit of subsuming both states and events. The names of the concepts in the hierarchy are intended to be the senses of lexical words, and for this reason it is perhaps surprising that no readily useable term exists for the highest category, though it might be argued that this concept does not have much use as an element in the normal use of language. The event type hierarchy should more properly be called the predicate type hierarchy.
Figure 17 Predicate type hierarchy

Predicates are divided into states (State) and events (Event), the latter consisting of a series of (more or less transient) states. The most general category, Predicate, is shown with a single argument, labelled 'er', and this association is inherited (implicitly) by the two subclasses. The states are divided into Being and Having; the latter and some of the former have a second argument, labelled 'ee'. Further properties of these categories are explored shortly. The events include processes like Laughing and Yawning as well as the further categories Becoming and Affecting. The first of these is telic (it has a result which is a state); the second has an 'ee' as well as an 'er'. Affecting
includes transitive processes like Pushing and Beating ('hitting' not 'defeating') as well as the category Making which subsumes two further categories, Creating, which is telic since its result is an example of Being (or Existing), and Making', which is telic in that its result isa Becoming and the result of this second event is a state. Figure 18 shows in more detail the properties of the states. Being defines a property of its 'er'. For example, Big functions as the size of its 'er' (Drunk is also shown as an example of the way in which the semantic network represents all aspects of meaning). Other subcases of Being include Feeling, which subsumes psychological states (see Figure 19), and At, which subsumes locations (see Figure 20). The inclusion of the traditional semantic roles theme and actor pre-empts the discussion of the difference between Being and Having in Figure 21 and of the relationship between the argument positions and traditional semantic roles in the following section.
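The hierarchy of Figure 17 can be encoded compactly as isa links plus locally declared arguments, with a category's full argument set collected along its isa chain. The category labels and argument assignments follow the text; the dictionary encoding is my own illustration:

```python
# The predicate type hierarchy of Figure 17, encoded as isa links.
# Each category declares only its exceptional arguments; the rest are
# inherited up the chain (Predicate contributes the universal 'er';
# Having and Affecting add an 'ee'). The encoding is illustrative.

ISA = {
    'State': 'Predicate', 'Event': 'Predicate',
    'Being': 'State', 'Having': 'State',
    'Becoming': 'Event', 'Affecting': 'Event',
    'Making': 'Affecting', 'Creating': 'Making', "Making'": 'Making',
}

LOCAL_ARGS = {
    'Predicate': {'er'},   # the single most general argument
    'Having': {'ee'},      # Having adds an 'ee'
    'Affecting': {'ee'},   # so do transitive (affective) events
}

def arguments(category):
    """Collect the arguments inherited along the isa chain."""
    args = set()
    while category is not None:
        args |= LOCAL_ARGS.get(category, set())
        category = ISA.get(category)
    return args

print(sorted(arguments('Becoming')))  # ['er']
print(sorted(arguments('Creating')))  # ['ee', 'er']
```

This reflects the text's point that most event types are defined by a single exceptional argument relationship: only three entries in LOCAL_ARGS are needed to give every category its argument set.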
Figure 18 Hierarchy of states
Figure 19 shows in more detail the properties of Feeling. This category subsumes one- and two-argument psychological states. In both cases the 'er' must be sentient. One of each kind of state is shown as an example. A single semantic relationship is shown for each; this stands for a fuller characterization of the words' meanings which would include for example the relationship between Happy and Smiling (the 'er' of Happy is often the 'er' of Smiling too). Figure 20 shows the properties of At, the category subsuming locations, and the sense of AT. The 'ee' of At defines the place of its 'er', which is therefore understood as the theme of a state defined by the 'ee'. For this reason, the 'ee' is also shown as the Landmark (see Figure 20). Two subcases of At are shown, In and On, the senses of the prepositions IN and ON respectively. These two differ from At in that the place of the 'er' is not the same as the 'ee', but is rather the
Figure 19 Feeling

same as the place of a part of the 'ee'. In the case of In, this part is the interior; in the case of On, it is the surface. The diagram also shows that Containing and Supporting are the converses of In and On respectively (if a is in b then b contains a; if a is on b then b supports a). These facts are integral parts of the meanings of the prepositions.
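The converse pairs just noted (if a is in b then b contains a; if a is on b then b supports a) amount to a symmetric table plus an argument swap. The relation names follow the text; the table and function are my own illustration:

```python
# Converse relations in the semantic network: In/Containing and
# On/Supporting describe the same situation with the arguments
# reversed. The table and derivation function are illustrative.

CONVERSE = {'In': 'Containing', 'On': 'Supporting'}
CONVERSE.update({v: k for k, v in CONVERSE.items()})  # both directions

def converse_fact(relation, a, b):
    """From relation(a, b), derive the converse fact about (b, a)."""
    return (CONVERSE[relation], b, a)

print(converse_fact('In', 'key', 'box'))    # ('Containing', 'box', 'key')
print(converse_fact('On', 'cup', 'table'))  # ('Supporting', 'table', 'cup')
```

Making the table bidirectional captures the text's claim that these facts are integral to the meanings of both members of each pair, not a one-way rule.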
Figure 20 At
Figure 21 shows the properties of Having, the sense of HAVE. As I have shown in Figure 18, the arguments of Having and those of Being have different properties. In the case of Having the 'er' is also its actor and the 'ee' its theme (see section 2.3); in the case of Being the 'er' is the theme, and the 'ee', if there is one, is a landmark, or plays some other role (in the case of the psychological states it is often called a stimulus). Figure 21 shows that Supporting and Containing are subcases of Having (subsumed under a general category labelled 'Locating'). This explains why these categories assign their arguments in the opposite way to the corresponding concepts On and In, which inherit their argument structure from Being (by way of At). It may also help to explain the way in which some languages use verbs corresponding to English BE and HAVE with different sets of verbs in perfect constructions and perhaps also explain the relationship between passive and perfect constructions even in English. This possibility needs to be explored in future work.
Figure 21 Having

The correspondence between Being and Having also suggests an alternative to the most usual analyses for verbs like GIVE and the indirect object (see Holmes 2005: 46-54). It is often claimed that the more specific semantics of indirect objects overrides the usual principle that the 'ee' of a causative event is assigned as the 'er' of its result (the gift, which is the 'ee' of Giving, is the 'ee' rather than the 'er' of the result, if this is to be a case of Having). However, it is also possible that the result of Giving is instead a case of Being (more specifically, it isa At), which would preserve the default arrangement. This would also provide a means of describing the contrast between verbs like GIVE and those like EQUIP (rare in English) that show the opposite linking arrangement. This view is supported by the prepositions that are used with these verbs. GIVE selects TO (in the absence of an indirect object), which in other constructions refers to a path terminating in a location; EQUIP selects WITH, which has Having as its sense. This suggestion is sketched in Figure 22. The result of Giving isa At; its 'er' (the thing located) is the 'ee' of Giving and its 'ee' (the location) is the recipient. The result of Equipping isa Having; its 'er' (the possessor) is the 'ee' of Equipping and its 'ee' is the 'equipment'.
Figure 22 Giving, Equipping

Figure 23 shows the properties of the non-states. Event inherits the 'er' relationship from the Predicate category, and passes it down to the subclasses. Becoming has additionally a result which is a state which shares its 'er'; the class is telic and provides the semantic schema for unaccusative constructions. Dying is shown as an example (see Figure 27). Affecting has additionally an 'ee', which is a patient. Pushing is shown as an example (see Figure 25). Making represents telic affective events (it has a result). Two subclasses of Making are shown. Creating provides the model for effective constructions and Making' for causative (affective) ones. In both cases it is the 'ee' that functions as the 'er' of the result. Killing is shown as an example of Making' (see Figure 26).
Figure 23 Events
Figure 24 Yawning isa Event
Figure 25 Pushing isa Affecting
Figure 26 Killing isa Making'
Figure 27 Dying isa Becoming

2.3 Semantic roles and semantic relationships

I have given above a hierarchical classification of predicate types defined by their properties (see Figure 17, Figure 18, Figure 23). Note that the senses of particular words (not just verbs: prepositions and adjectives refer to events, as do some nouns like DESTRUCTION, WEDDING, etc.) are arranged in the same hierarchy since they simply instantiate the more general predicate types. The properties of the predicate types determine the number and nature of the semantic relationships associated with these senses and the linking of those associations to syntactic dependencies; alternatively, the number and nature of semantic associations and the linking of those associations determines the position of the sense in the predicate type hierarchy.

In the first part I provided linking regularities that link more or less schematic semantic associations with more or less schematic syntactic ones. There I gave linking rules for subject, object and indirect object as well as the more general Syntax Semantics Principle (SSP). The semantic associations referred to in these rules are the same as those supported by the various predicate types. In fact the linking rules themselves form part of this hierarchy, appearing at the highest relevant level. As noted above, semantic associations like recipient are fairly straightforwardly characterized in terms of other semantic relationships (in terms of their meanings) but 'er' and 'ee', the two relationships involved in subject and object linking, are not. The 'ers' and 'ees' of particular events (or event classes) are instantiations of the more general 'er' and 'ee' that appear in the linking regularities (note that the 'er' of Predicate (Figure 23) is the most general one there is, so this is the locus of the subject linking rule and all other 'ers' are instantiations of this one).
The properties of 'ers' and 'ees' of more specific categories are determined at the appropriate level in the predicate type hierarchy and it is here that the most semantic information is found. In the preceding section I define the semantics of these relationships by relating them to named thematic roles (actor, patient, theme, landmark), but this begs the question in the absence of a fuller semantic definition of these roles. Indeed, as discussed below, once the thematic roles have definitions, it may no longer be necessary, or desirable, to keep the relationships agent, theme etc. in lexical structure.3
A number of problems with thematic roles have been identified in the literature. The most immediate practical difficulty is that different writers (and even different works by the same writer) use the same terms with different meanings; this is a particular problem for the terms Goal, Patient and Theme (see below). But there is also the non-monotonicity of argument linking ((34)-(36) are from Davis and Koenig (2000: 58)), which leads to proposals like Jackendoff's (1990) hierarchical argument linking.

(34) a. Mary owns many books.
     b. This book belongs to Mary.
(35) a. We missed the meaning of what he said.
     b. The meaning of what he said escaped/eluded us.
(36) a. Oak trees plague/grace/dot the hillsides.
     b. The hillsides boast/sport/feature oak trees.
A further, theoretical, difficulty (raised by Dowty 1991) is the open-ended nature of the set of roles to be used. Goldberg considers this only an empirical problem, since in principle the set of thematic roles need not be finite, the nature of the roles being determined by the set of predicate types recognized in the language:

[P]hrasal constructions that capture argument structure generalizations have argument roles associated with them; these often correspond roughly to traditional thematic roles... At the same time, because they are defined in terms of the semantic requirements of particular constructions, argument roles in this framework are more specific and numerous than traditional thematic roles. (2002: 342)

Since the semantic relationships supported by the senses of words instantiate (isa) those of more general categories, the senses of different words (or constructions) may elaborate the more general models in different ways, so that the set of thematic roles at the more specific levels can be very large indeed. In Figure 23, I used the thematic role actor as schematic over the first arguments of all non-states (including processes (37) and causative (38) and unaccusative (39) events).

(37) a. The flag fluttered in the breeze.
     b. The tourist yawned.
     c. The flag distracted the tourist.
     d. Perry pushed a pea with his nose.
(38) a. Perry pushed a pea to Peterborough.
     b. The flag angered the tourist.
     c. The judges made a cake.
     d. Perry opened a bottle.
(39) a. The pea vanished.
     b. The ice melted.
     c. The band disbanded.
Trask defines actor as 'that argument NP exercising the highest degree of independent action in the clause' (1993: 6), noting that this is a simple extension of the category agent to fit other kinds of subject-linked arguments. This extension covers verbs referring to changes undergone by their single argument (unaccusative verbs), whose arguments therefore may have few or no agentive properties (note, however, that some are agents (39c)). In fact, the actors of other one- or two-argument events are also not agents ((37a), (37c), (38b)). Agency is a property of some actors, determined by the thematic properties of the event, so the thematic role agent ('the semantic role borne by an NP which is perceived as the conscious instigator of an action', ibid.: 11) is not called for. Actor, then, corresponds roughly to Dowty's (1991) proto-agent: it is defined by properties like volitional involvement, causal instigation etc., but not all cases share all these properties. Dowty's proto-agent wills the event, is sentient, causes an event or change of state, moves and has independent existence; the WG treatment presented here accepts all of these but the fourth, movement.

Patient ('the semantic role borne by an NP which expresses the entity undergoing an action', Trask (1993: 202)) is schematic over the second argument of transitive events. Affecting, which is the most general such event, subsumes processes (like pushing a pea or patting a dog) and causative events (like pushing a pea to Peterborough or angering a tourist). The patient is the affected (or effected) argument, even in some of the transitive processes. Processes have a temporal profile that consists of a set of repeated events. These events may themselves be causative (Pushing consists of a set of repeated causative actions on an object), though they may also be states (Patting consists of a set of repeated locative states) in which case the patient is the theme of the state (see below).
Dowty's (1991) proto-patient undergoes a change of state, is an incremental theme, is causally affected by another participant, does not move and does not have independent existence. Again, the WG analysis accepts all of these but the fourth, concerning movement. The incremental theme is a product of the aspectual structure of affective events (see Holmes 2005).

States have themes, and some have actors. Actors of states share the properties of those of non-states. The theme is the argument that the state is predicated of (theme is also used with a similar meaning as the name of a discourse function, where it contrasts with rheme, as topic does with comment). Trask gives 'an entity which is in a state or a location or which is undergoing motion' (1993: 278), a definition which subsumes some patients, as defined above; Trask also notes that the terms theme and patient are used more or less interchangeably. However, in the current framework the two are separate: patients undergo some affective/effective process or change; themes have some stable property. Locative states also have a landmark: the argument whose position defines that of the theme.

The above definitions of the thematic roles are given in terms of semantic properties. For example, an actor wills the event, is sentient, causes an event or change of state and has independent existence. These semantic properties of actor are shown in Figure 28.
LINKING IN WORD GRAMMAR
Figure 28 Actor

In the linking framework outlined here, syntactic associations are linked to semantic ones in a regular way (subjects refer to 'ers', objects to 'ees', indirect objects to beneficiaries, etc.), and those semantic associations are defined by (structural) semantic properties. The relationships 'er' and 'ee' are defined by the categories of the predicate type hierarchy, and linked there to the various properties of actors, patients, themes and landmarks, like those in Figure 28. It is an empirical question whether it is necessary to keep hold of the relationships actor, patient, etc.: theoretically the 'ers' of non-states could simply be linked directly to the structure shown in Figure 28 without the mediation of the actor relationship. The contrast between Having and Being (the former has an actor-er and a theme-ee, the latter a theme-er and in some cases a landmark-ee; see Figure 18) demonstrates that 'er' and 'ee' are distinct from the thematic roles.

This separation of properties is found in other frameworks also. For example, in Goldberg's (2002) Construction Grammar the lexical structures of grammatical constructions are separated from those of specific words. Semantic relationships like Actor, Theme, etc. (participant roles), which are supported by the senses of words, instantiate the argument roles of phrasal constructions (these correspond to my 'er' and 'ee'), which are therefore schematic over them. The separation, in lexical structure and in the structures of sentences (constructs), of the two argument structures allows different verbs to elaborate different constructions differently: the argument structure of the construction may add or take away participant roles from the verb, or vice versa.
The WG framework, however, represents the distinction differently: rather than being properties of two different kinds of elements, the participant roles and the argument roles are simply different kinds of association supported by the same elements (events). In Holmes (2005) I show this property of the WG framework to be crucial in the treatment of specific examples, since it becomes
clear there that both words and constructions may select both argument and participant roles. Since the participant roles are defined in terms of sets of default properties, it is possible for more than one argument of a verb's sense to fit the bill for one or other participant role. This is the case for the verbs SPRAY and LOAD. As is well known, these two verbs can be used with objects referring to a thing or substance moved or to the place it is moved to. These two possibilities reflect two ways of interpreting the roles of the participants (of choosing which participant best fits the patient model, and is therefore linked to 'ee' and thence to object). In these cases the lexical properties of the syntactic relationship (here object) can be added to those of the verb. Where the two are not in conflict, they are simply merged. For example, since LOAD does not select either of its non-subject arguments as an incremental theme, this property is assigned to the object-linked argument by the semantics of the 'ee' relationship (40) (the mechanics of this example are discussed in Holmes 2005: 206ff).

(40) a. Larry loaded *(the) lorries with (the) lollies in 2 hours.
     b. Larry loaded *(the) lollies on (the) lorries in 2 hours.
When there is a conflict between the lexical properties of the construction and those of the verb, the construct is (usually) rendered incoherent. The two examples in (41) are unacceptable because the lexical structure of POUR specifies that the 'ee' of its sense is a liquid (that is how the manner of pouring is defined) and that of COVER specifies that the 'ee' of the sense ends up underneath something. These two requirements clash with the semantics of the construction.

(41) a. *Polly poured the pot with water.
     b. *Corrie covered the quilt over the baby.

3. Conclusion
In the first part of this chapter I sketched the linking mechanisms of WG. Syntactic and semantic associative relationships participate in symbolic relationships: syntactic dependencies have meanings, which serve to determine the interpretations of compositional structures, as well as to constrain the possibilities for composition. Just as the (default) properties of syntactic associations are given in terms of a network of related concepts and properties surrounding the dependency class they instantiate, so are the (default) properties of semantic associations. In the second part I distinguished two kinds of semantic association: participant roles, which carry thematic content; and argument roles, which are determined by the force-dynamic properties of the event class.
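The two-level linking mechanism summarized in this conclusion can be given a rough computational paraphrase. The sketch below is purely illustrative and is not part of the WG formalism: the property names, the set representation and the simple overlap scoring are my own assumptions. It only shows the shape of the idea that participant roles are bundles of default properties, and that the argument best matching a bundle is linked to a schematic argument role ('er' or 'ee') and thence to a syntactic dependency.

```python
# Illustrative sketch (assumed encoding, not WG notation): participant
# roles are bundles of default semantic properties; the argument sharing
# most properties with a bundle links to 'er' or 'ee', and those roles
# link in turn to subject and object.

ACTOR = {"wills_event", "sentient", "causes", "independent_existence"}
PATIENT = {"changes_state", "incremental_theme", "causally_affected"}

def best_match(arguments, prototype):
    """Return the argument sharing most properties with the prototype."""
    return max(arguments, key=lambda a: len(arguments[a] & prototype))

def link(arguments):
    """Map each argument to a syntactic relation via 'er'/'ee'."""
    er = best_match(arguments, ACTOR)
    rest = {a: p for a, p in arguments.items() if a != er}
    ee = best_match(rest, PATIENT) if rest else None
    return {"subject": er, "object": ee}

# 'Larry loaded the lorries with the lollies': both non-subject
# arguments partially fit the patient bundle, but the lorries fit best.
args = {
    "Larry":   {"wills_event", "sentient", "causes", "independent_existence"},
    "lorries": {"changes_state", "causally_affected"},
    "lollies": {"causally_affected"},
}
print(link(args))  # {'subject': 'Larry', 'object': 'lorries'}
```

On this toy encoding, swapping the property sets of the two non-subject arguments would model the alternative SPRAY/LOAD construal, in which the moved substance is linked to object instead.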
References

Biber, Douglas, Johansson, Stig, Leech, Geoffrey, Conrad, Susan and Finegan, Edward (1999), Longman Grammar of Spoken and Written English. Harlow, Essex: Longman.
Bresnan, Joan W. (1982), The Mental Representation of Grammatical Relations. Cambridge, MA: MIT Press.
Chomsky, Noam (1970), 'Remarks on nominalization', in Roderick A. Jacobs and Peter S. Rosenbaum (eds), Readings in English Transformational Grammar. Waltham, MA: Ginn and Company, pp. 184-221.
— (1981), Lectures on Government and Binding. Dordrecht: Foris.
Copestake, Ann and Briscoe, Ted (1996), 'Semi-productive polysemy and sense extension', in James Pustejovsky and Branimir Boguraev (eds), Lexical Semantics: the Problem of Polysemy. Oxford: Clarendon Press, pp. 15-68.
Croft, William (1990), 'Possible verbs and the structure of events', in Savas L. Tsohatzidis (ed.), Meanings and Prototypes: Studies in Linguistic Categorization. London: Routledge, pp. 48-73.
Cruse, David A. (1986), Lexical Semantics. Cambridge: Cambridge University Press.
Davis, Anthony R. and Koenig, Jean-Pierre (2000), 'Linking as constraints on word classes in a hierarchical lexicon'. Language, 76, 56-91.
Dowty, David R. (1991), 'Thematic proto-roles and argument selection'. Language, 67, 547-619.
Goldberg, Adele E. (1995), Constructions: a Construction Grammar Approach to Argument Structure. Chicago: University of Chicago Press.
— (2002), 'Surface generalizations'. Cognitive Linguistics, 13, 327-56.
Holmes, Jasper W. (2005), 'Lexical Properties of English Verbs' (unpublished doctoral dissertation, University of London).
Hudson, Richard A. (1984), Word Grammar. Oxford: Blackwell.
— (1990), English Word Grammar. Oxford: Blackwell.
— (1992), 'So-called "double objects" and grammatical relations'. Language, 68, 251-76.
— (1994), 'Word Grammar', in Ronald Asher (ed.), The Encyclopedia of Language and Linguistics. Oxford: Pergamon Press, pp. 4990-93.
— (1999), 'Subject-verb agreement in English'. English Language and Linguistics, 3, 173-207.
— (2004, July 1, last update), 'Word Grammar'. Available: www.phon.ucl.ac.uk/home/dick/wg.htm (accessed 18 April 2005).
Jackendoff, Ray S. (1990), Semantic Structures. Cambridge, MA: MIT Press.
Lemmens, Maarten (1998), Lexical Perspectives on Transitivity and Ergativity: Causative Constructions in English. Amsterdam: J. Benjamins.
Levin, Beth (1993), English Verb Classes and Alternations: a Preliminary Investigation. Chicago: University of Chicago Press.
Levin, Beth and Rappaport Hovav, Malka (1995), Unaccusativity: at the Syntax-Lexical Semantics Interface. Cambridge, MA: MIT Press.
Pustejovsky, James (1995), The Generative Lexicon. Cambridge, MA: MIT Press.
— (2001), 'Type construction and the logic of concepts', in Pierette Bouillon and Federica Busa (eds), The Language of Word Meaning. Cambridge: Cambridge University Press, pp. 91-123.
Pustejovsky, James and Boguraev, Branimir (1996), 'Introduction: lexical semantics in context', in James Pustejovsky and Branimir Boguraev (eds), Lexical Semantics: the Problem of Polysemy. Oxford: Clarendon Press, pp. 1-14.
Shibatani, Masayoshi (1996), 'Applicatives and benefactives: a cognitive account', in Masayoshi Shibatani and Sandra A. Thompson (eds), Grammatical Functions: their Form and Meaning. Oxford: Clarendon Press, pp. 157-94.
Trask, Robert L. (1993), A Dictionary of Grammatical Terms in Linguistics. London: Routledge.
Trier, Jost (1931), Der Deutsche Wortschatz im Sinnbezirk des Verstandes. Von den Anfängen bis zum 13. Jahrhundert. Heidelberg: Winter.
Weisgerber, Leo (1927), 'Die Bedeutungslehre - ein Irrweg der Sprachwissenschaft'. Germanisch-Romanische Monatsschrift, 15, 161-83.
Williams, Edwin (1991), 'Meaning categories of NPs and Ss'. Linguistic Inquiry, 22, 584-7.

Notes

1 The SSP given here turns out not to be able to account for all cases of linking. It is revised in Holmes (2005: 44).
2 Note that in Biber et al. the VP category subsumes the 'verbal complex' (main verb and any auxiliaries), but not any complements or other postdependents of the verb.
3 Of course, if a particular speaker knows the words ACTOR and THEME (as metalinguistic terms), then they must have these relationships in their lexicon, since they are (or should be!) the meanings of the relevant terms.
6 Word Grammar and Syntactic Code-Mixing Research

EVA EPPLER
Abstract

This chapter aims to show that WG is preferable to other linguistic theories for the study of bilingual speech. Constituent-based models have difficulties accounting for intra-sentential code-mixing because the notions of government and functional categories are too powerful and rule out naturally occurring examples. Properties of WG which make this syntactic theory particularly well suited for code-mixing research are the central role of the word, the dependency analysis, and several consequences of the view of language as a network which is integrated with the rest of cognition. A qualitative and quantitative analysis of because and weil clauses shows that code-mixing patterns can be studied productively in WG.

1. Introduction

Intra-sententially CODE-MIXED data, i.e. utterances constructed from words from more than one language, pose an interesting problem for syntactic research, as two grammars interact in one utterance. Based on a German/English bilingual corpus,1 I will show in section 2 of this chapter that constraints on code-switching formulated within Phrase Structure Grammar frameworks (Government and Binding, Principles and Parameters, Minimalism) are too restrictive in that they rule out naturally occurring examples of mixing. In section 3 I will discuss aspects of WG that make it particularly well suited for the syntactic analysis of intra-sententially mixed data. WG facilitates the full syntactic analysis of sizeable corpora and allows us to formulate hypotheses on code-switching which can subsequently be tested on data. All findings are supported by quantitative data. As the word order contrast between German and English is most marked in subordinate clauses, I focus on examples of this construction type in section 4.
I will show that code-mixing patterns can be studied productively in terms of WG: WG rules determining the word order in German/English mixed clauses hold in relation to my corpus and are supported by evidence from other corpora. The main section of this chapter focuses on because and weil clauses. A comparison of the mixed and monolingual clauses reveals that German/English bilinguals who engage in code-mixing recognize and utilize structural
congruence at the syntax-pragmatics interface. They predominantly mix in a construction type in which the word order contrast between German (SOV) and English (SVO) is neutralized.

2. Constituent Structure Grammar Approaches to Intra-Sentential Code-Mixing

The question underlying grammatical code-switching research is whether there are syntactic constraints on code-mixing. Some of the hypotheses on intra-sentential code-switching have been formulated in informal frameworks of traditional grammatical notions; others are derived from assumptions underlying specific modern syntactic theories. In this section I will review the main phrase structure grammar approaches to code-mixing and show that the constraints formulated within them do not account for the data.

DiSciullo, Muysken and Singh (1986) propose to constrain code-switching by government, the traditional assumption behind X-bar theory. They initially used the Chomsky (1981: 164) formulation of government: 'α governs γ in [β ... γ ... α ... γ ...], where α = X, and α and γ are part of the same maximal projection'. The X-bar assumption that syntactic constituents are endocentric is important for the formulation and working of the government constraint. Heads not only project their syntactic features onto the constituent they govern, but also their language index. The language index is assumed to be something specified in the lexicon (DiSciullo et al. 1986: 6), since the lexicon is a language-specific collection of elements. For code-switching purposes the Government Constraint was formalized in DiSciullo et al. (1986: 6) as *[Xp Yq], where X governs Y, and p and q are language indices. The nodes in a tree must dominate elements drawn from the same language when there is a government relation holding between them.
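The Government Constraint can be paraphrased computationally as a check over language-indexed head-complement pairs. The encoding below is my own assumption for the sake of illustration, not DiSciullo et al.'s notation; it simply makes the prediction explicit that a head and everything it governs must carry the same language index.

```python
# Minimal sketch of the Government Constraint *[Xp Yq] (assumed
# encoding): a governing head and each governed element carry a
# language index, and any mismatch is predicted to be ill-formed.

def violates_government_constraint(head_lang, governed):
    """True if any governed element differs in language index from the head."""
    return any(lang != head_lang for _, lang in governed)

# Example (3) below: a German verb governing an English clausal
# complement, attested in the corpus yet flagged by the constraint.
mixed = violates_government_constraint(
    "de", [("there is going to be a fight", "en")])
print(mixed)  # True
```

The point of the sketch is exactly the problem discussed in this section: the check flags as ungrammatical many switches that are attested in natural bilingual speech.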
The Government Constraint predicts that ungoverned elements, such as discourse markers, tags, exclamations, interjections and many adverbs, can easily be switched. This prediction is also supported by my data (see also Eppler 1999) and most other bilingual corpora. However, the Government Constraint also predicts that switches between verbs and their objects and/or clausal complements, and switches between prepositions and their NP complements, are ungrammatical. Examples violating these predictions from my corpus are:

(1) *TRU: so [/] so you have eine Übersicht. (Jen2.cha, line 133)
                             an   overview

(2) *DOR: I wonder, wem  sie nachgradt. (Jen2.cha, line 1531)
                    whom she takes after

or, in the other direction, i.e. from a German verb to an English clausal complement:

(3) *MEL: ich hab(e) gedacht there is going to be a fight. (Jen1.cha, line 987)
          I   have   thought
(4) *TRU: der hat über  faith + healing gesprochen. (Jen2.cha, line 2383)
          he  has about                 spoken
The original inclusion of functional categories in the class of governors ruled out code-switches which are also documented in my data, e.g. between complementizers and clauses that depend on them, as in (5):

(5) TRU: to buy yourself in means that +... (Jen1.cha, lines 977ff)
    DOR: du  kannst dich     nochmal   einkaufen.
         you can    yourself once more buy in
and the domain of government was too large. The above formulation of the government constraint includes the whole maximal projection and thus, for example, bans switching between verbs and location adverbs, again contrary to the evidence. Therefore a limited definition of government, involving only the immediate domain of the lexical head, including its complements but not its modifiers/adjuncts, was adopted and the Government Constraint was rephrased (Muysken 1989) in terms of L-marking:

*[Xp Yq], where X L-marks Y, and p and q are language indices

Muysken and collaborators thus shifted from an early and quite general definition of government to the more limited definition of L-marking in their formulation of the Government Constraint. L-marking restricts government to the relation between a lexical head and its immediate complements. Even the modified version of the government constraint in terms of L-marking is empirically not borne out, as we see from the following example:

(6) TRU: das  ist painful. (Jen3.cha, line 1879)
         this is
Muysken (2000: 25) identifies two main reasons why the Government Constraint, even in its revised form, is inadequate. The main reason is that CATEGORIAL EQUIVALENCE2 undoes the effect of the government restriction. The Government Constraint is furthermore assumed to insufficiently acknowledge the crucial role functional categories are supposed to play in code-mixing.

Functional categories feature prominently in several approaches to code-mixing. Joshi, for example, proposes that 'Closed class items (e.g. determiners, quantifiers, prepositions, possessive, Aux, Tense, helping verbs, etc.) cannot be switched' (1985: 194). Myers-Scotton and Jake (1995: 983) assume that in mixed constituents, all SYSTEM MORPHEMES3 that have grammatical relations external to their head constituent (i.e. participate in the sentence's thematic grid) will come from the language that sets the grammatical frame in the unit of analysis (CP). And Belazi, Rubin and Toribio (1994) propose the Functional Head Constraint. Their model is embedded in the principles and parameters approach. Belazi, Rubin and Toribio (1994) propose to restrict code-mixing by the
feature-checking process of f-selection. In Belazi, Rubin and Toribio's model, language is a feature4 of FUNCTIONAL heads that needs checking like all other features. The Functional Head Constraint (Belazi, Rubin and Toribio 1994: 228) is formulated as follows:

The language feature of the complement F-selected by a functional head, like all other relevant features, must match the corresponding feature of that functional head.
Code-switching between a lexical head and its complement proceeds unimpeded in this model. Because many inflectional morphemes were treated as independent functional heads in the principles and parameters approach, Belazi, Rubin and Toribio (1994) subsume the FREE MORPHEME CONSTRAINT5 (Sankoff and Poplack 1981) under their Functional Head Constraint: switching is disallowed between an inflectional morpheme and a word-stem. A counterexample to this restriction from my corpus would be:

(7) *DOR: wir suffer-n    da alle. (Jen2.cha, line 904)
          we  suffer-INFL MP all
Like all researchers working on Spanish/English and Arabic/French code-mixing, Belazi, Rubin and Toribio (1994) have to deal with the different placement of adjectives pre- or post-modifying nouns in the language pairs they are working on. Their data indicate that switching is possible when the adjectives and nouns obey the grammars of the languages from which they are drawn. This leads them to supplement the Functional Head Constraint with the WORD-GRAMMAR INTEGRITY COROLLARY,6 which states that 'a word of language X, with grammar G, must obey grammar G' (Belazi, Rubin and Toribio 1994: 232).

Like the Government Constraint, the Functional Head Constraint rules out switches between complementizers and their clausal complements. Therefore example (5) provides counter-evidence to this constraint. It also rules out switches between infinitival to and its verbal complement, examples of which are also attested in my corpus.

(8) *LIL: you don't need to wegwerfen. (Jen2.cha, line 2555)
                            throw away
The Functional Head Constraint furthermore rules out switches between determiners (including quantifiers and numerals) and nouns. As nouns are the most frequently borrowed or switched word class, counterexamples abound in my and many other corpora.

MacSwan (1999: 188), working within the minimalist framework, also assumes that code-switching within a PF component is not possible. This PF Disjunction Theorem amounts to the same effect as the Free Morpheme Constraint (Sankoff and Poplack 1981) and the various restrictions on switching
between stems and morphologically bound inflectional material. Examples (7) and (9) are therefore clear violations of the PF Disjunction Theorem.

(9) *DOR: sie  haben einfach nicht ge#bother-ed. (Ibron.cha, lines 1012, 14)
          they have  simply  not
The minimalist framework he is working in forces MacSwan (1999) to preserve constituent structure, but he acknowledges the advantages of a system of lexicalized parameters for the analysis of code-switching.

In this section I reviewed approaches to code-mixed data that crucially depend on constituency structure/maximal projections (DiSciullo, Muysken and Singh 1986) and functional categories (Belazi, Rubin and Toribio 1994). I showed that these constraints and models are too restrictive in that they rule out naturally occurring examples of intra-sentential code-mixing. The 'government constraints' (DiSciullo, Muysken and Singh 1986; Muysken 1989) were found to be too restrictive when tested against natural language data because the government domain was too large. Models, approaches and constraints based on functional categories (Joshi 1985; Myers-Scotton 1993; Belazi, Rubin and Toribio 1994) fall short of accounting for the data available and are unsatisfactory because none of the definitions of functional categories that have been offered (in terms of function words, closed class items, system morphemes or non-thematicity) work. They either define fuzzy categories where a sharp distinction would be needed, or they conflict with the data. Complementizers and determiners, the two most commonly quoted examples of functional categories, provide most of the counterexamples.

For these reasons a syntactic theory that rejects constituency structure and does not recognize functional categories (Hudson 2000) seems an interesting and promising option to explore. In the next section I will review other aspects/characteristics of WG which are perceived to make this theory of sentence structure more suitable for the analysis of (monolingual and) code-mixed data than other theories.

3. A Word Grammar Approach to Code-Mixing

The main reason why I chose WG for the syntactic analysis of my data is that this theory of sentence structure takes the word as a central unit of analysis. In WG, syntactic structures are analysed in terms of dependency relations between single words,7 a parent and a dependent. Phrases are defined by dependency structures which consist of a word plus the phrases rooted in any of its dependents. In other words, WG syntax does not use phrase structure in describing sentence structure, because everything that needs to be said about sentence structure can be formulated in terms of dependencies between single words. For intra-sententially switched data this is seen as an advantage over other syntactic theories because each parent only determines the properties of its immediate dependent. Language-specific requirements are thus satisfied if the particular pair of words, i.e. the parent and the dependent, satisfy them. A word's requirements do not project to larger units like maximal projections/
phrasal constituents. If we want to formulate constraints on code-switching within WG, they have to be formulated for individual types of dependency relations. Because they do not affect larger units, they are less likely to be too restrictive than constraints affecting whole phrasal constituents. One of the main problems of constituency-based models, i.e. over-generalization through phenomena like government chains, therefore cannot occur in a WG approach to code-mixing.

The central role of the word in WG moreover means that words are not only the largest units of WG syntax, but also the smallest. In contrast with Chomskyan linguistics, syntactic structures do not, and cannot, separate stems and inflections. Furthermore, at least as far as overt words are concerned, WG rejects the notion of functional category. Hudson (2000) shows that this notion is problematic, because it has never been defined coherently and because all the individual categories that have been given as examples (e.g. complementizers) raise serious problems. For the same reasons, constraints on intra-sentential code-switching based on functional categories (Joshi 1985; Belazi, Rubin and Toribio 1994) and models of code-switching that crucially depend on the distinction between system and content morphemes (Myers-Scotton 1993) run into serious empirical difficulties (see section 2). Because WG is an example of 'morphology-free syntax' (Zwicky 1992: 354) which rejects the notion of functional categories, a WG approach to intra-sentential code-switching cannot over-emphasize the role of inflectional morphemes.

Words being the only and central unit of analysis in Word Grammar furthermore benefits code-mixing research in a purely pragmatic way. The majority of research in this area is based on sizable natural language corpora.
Because the only units that need to be processed in WG are individual words, and larger units are built by dependency relations between two words which can be looked at individually, a WG approach to intra-sentential code-mixing requires less analysis than constituency-based models. This facilitates the analysis of largish corpora. Eppler (2004), for example, is based on a WG analysis of a 22,000-word corpus.

For intra-sententially mixed sentences a dependency analysis is furthermore seen as an advantage over phrase structure grammar frameworks because it highlights the functional relations between words (from the same or different languages) rather than code-switch points. Constituency-based models describe and/or constrain intra-sentential code-switching by disallowing switches between, for example, PP and NP (see section 2). A WG analysis would note a switched complement relation, which is grammatical if the preposition and the determiner/(pro-)noun involved in it satisfy the constraints imposed on them by their own language. To start to understand what is going on in intra-sentential code-switching, it seems more beneficial to gain an insight into which syntactic relations are frequently or rarely switched, rather than to increase our knowledge about points in sentences where switching does not occur.

Another characteristic of WG is that dependency analyses have a totally flat structure. A single, completely surface structure analysis (with extra dependencies being drawn below the sentence-words) is seen as benefiting
WG over other theories of language structure for code-mixing research: linguists working on code-mixing during times when Chomskyan frameworks still stressed the difference between surface and deep structure did not know what to do with D-structure, because code-switching clearly seems to be a surface structure phenomenon. Romaine (1989: 145) concludes her discussion of the government constraint with the statement 'data such as these [code-mixing data] have no bearing on abstract principles such as government [...] because code-switching sites are properties of S-structure, which are not base generated and therefore not determined by X-bar theory'. This problem does not emerge when one works with WG because of its totally flat, i.e. surface, analysis. A syntactic theory that shares properties of the linguistic phenomenon under investigation appears to be preferable to other syntactic theories; i.e. for a surface-structure phenomenon like code-mixing, a syntactic model that allows a single, completely surface analysis seems to be well suited.

Other aspects of WG which make this theory of sentence structure more suitable for the analysis of code-mixed data than other theories are derived from the WG view of language as a network which contains both the grammar and the lexicon and which integrates language with the rest of cognition. This cognitive view of language as a labelled network has consequences for a controversial issue in psycholinguistic bilingualism research: the lexicons debate, i.e. whether bilinguals' lexical items/lemmas are stored in one or two lexicons. The network idea offers the advantage of viewing a bilingual's two languages as sub-networks, with denser links between lexical items from the same language and looser connections between lexical items from different languages.
This view of the bilingual lexicon(s), in combination with the multiple default inheritance system which WG operates on, could have enormous benefits for writing a psycholinguistically realistic grammar of a bilingual. The following exploration is just a sketchy idea as to how this could work and requires fleshing out, but the basic idea seems to work. Default inheritance allows us to build a maximally efficient system for bilinguals by locating the shared properties of words which 'belong' to different languages higher up the is-a hierarchy and the language-specific properties lower down in this hierarchy. English come and German kommen, for example, are both verbs (is-a verb). They therefore share certain characteristics: they have a similar meaning ('move towards'), they both have tense (present or past), they have a subject and the subject tends to be a pre-dependent noun, etc. All these generalizable facts about German and English verbs can be located fairly high up in the is-a hierarchy. The features in which our two example words differ, for example that they have a different form (/kʌm/ and /kɔmən/ respectively), and that German kommen, when it is the complement of a subordinating conjunction or an auxiliary/modal, would be placed in clause-final position, would be lower in the is-a hierarchy. Because of the way default inheritance works, characteristics of a general category are 'inherited' by instances of that category only if they are not overridden by a more specific (e.g. language-specific) characteristic. A fact located lower down in the inheritance hierarchy of entities or relations takes priority over one located above it. Thus we could maximize the bilingual system
by allowing generalization by default inheritance and ensure that the language-specific properties would automatically override the general pattern. For bilinguals this system would have the advantage that the grammatical system of a Castilian/Catalan bilingual, for example, would have fewer overriding/blocking language-specific properties listed than that of a German/English bilingual.8

WG furthermore aims at integrating all aspects of language into a single theory which is also compatible with what is known about general cognition; that is, language can be analysed and explained in the same way as other kinds of knowledge or behaviour. For example, it is widely acknowledged that code-mixing is influenced by social and psychological factors (Muysken 2000), and a syntactic model that allows us to incorporate this kind of information is better suited to describe language contact phenomena than theories that deal exclusively with language. Knowledge of more than one language, and the use of more than one language in one sentence, can be analysed and explained in the same way as knowledge of one language and monolingual language use. In other words, code-mixing is not seen as 'deviant'. Because WG aims to explain and analyse language in the same way as other kinds of social and psychological knowledge or behaviour, it is perceived to be more suitable for research into bilingualism than other models of syntax.

The WG view of language as a network of associations which is closely integrated with the rest of our knowledge lends itself particularly well to code-mixing research for another reason. It is a well-accepted fact in this research paradigm that adult bilinguals know, first of all, which language the words they use belong to.
Second, they know when to code-switch and when not to (code-switching as a MARKED or UNMARKED choice,9 for example, Myers-Scotton and Jake 1995), or when they should be in MONOLINGUAL SPEECH MODE and when they can go into BILINGUAL MODE (Grosjean 1995). Third, bilinguals also know which mixing patterns are acceptable in their speech community and which are not (SMOOTH versus FLAGGED code-switching,10 for example, Poplack and Meechan 1995). This knowledge about language use is obviously closely integrated with other types of (social) knowledge, and a syntactic theory that views language as part of the total associative network is clearly more suitable for explaining these phenomena than other theories. Viewing language as a sub-network (responsible for words) which is just a part of the total associative network creates another advantage of WG for the research paradigm under discussion in this chapter. This benefit is related to the fact that most code-mixing research is based on natural language corpora.11 In contrast with most other kinds of grammar, which generate only idealized utterances or sentences, WG can generate representations of actual utterances. A WG analysis of an utterance is also a network; it is simply an extension of the permanent cognitive network in which the relevant word tokens comprise a fringe of temporary concepts attached by 'is-a' links, so the utterance network has just the same formal characteristics as the permanent network. This blurring of the boundary between grammar and utterance is quite controversial, but it follows from the cognitive orientation of WG. For work based on natural speech data it is seen as another crucial advantage of
WORD GRAMMAR AND SYNTACTIC CODE-MIXING
125
WG over other theories, which can only generate syntactic structures for sentences. From the examples quoted so far, it is obvious that the audio data this study is based on are transcribed as utterances, i.e. units of conversational structure. For the grammatical analysis, however, I assume that conversational speech consists of the instantiation of linguistic units, i.e. sentences. In other words, every conversational utterance is taken to be a token of a particular type of linguistic unit, the structural features of that unit being defined by the grammatical rules of either German or English. When using a WG approach to code-mixed data, one does not have to 'edit' the corpus prior to linguistic analysis. Any material that cannot be taken as a token of either a German or an English word-form can be left in the texts, but if it cannot be linked to other elements in the utterance via a relationship of dependency, it is not included in the syntactic analysis. That is, all the words in a transcribed utterance that are related to other words by syntactic relationships constitute the sentences the grammatical analysis is based on. As far as I am aware, WG is the only syntactic theory that can (and wants to) generate representations of actual utterances, and it facilitates the grammatical analysis of natural speech data without prior editing. Another consequence of integrating utterances into the grammar is that a word token must be able to inherit from its type. Obviously the token must have the typical features of its type - it must belong to a lexeme and a word class, it must have a sense and a stem, and so on. But the implication goes in the other direction as well: the type may mention some of the token's characteristics that are normally excluded from grammar, such as characteristics of the speaker, the addressee and the situation. For example, we can say that the speaker is a German/English bilingual and so is the addressee; the situation thus allows code-mixing.
This aspect of WG theory thus allows us to incorporate sociolinguistic information into the grammar, by indicating the kind of person who is a typical speaker or addressee, or the typical situation of use. Treating utterances as part of the grammar has further effects which are important for the psycholinguistics of processing. The main point here is that WG accommodates deviant input because the link between tokens and types is guided by the 'Best Fit Principle' (Hudson 1990: 45ff.): assume that the current token is-a the type that provides the best fit with everything that is known. The default inheritance process which this triggers allows known characteristics of the token to override those of the type. Let's take the deviant word /bAS9/ in the following example:

(10a) *TRU: xxx and warum waren keine bus(s)e [%pho: bAS9]?
            why were there no buses
      JenS.cha, line 331

/bAS9/ is phonologically deviant for German (Busse is pronounced /buS9/), and morphologically deviant for English, because the English plural suffix is -(e)s, not -e. Although this word is deviant,12 it can is-a its type, just like any other exception. But it will be shown as a deviant example. There is no need for the analysis to crash because of an 'error'. The replies to *TRU's question clearly show that the conversation does not crash:
126
WORD GRAMMAR: PERSPECTIVES ON LANGUAGE STRUCTURE
(10b) *LIL: xxx [>] wegen einer bombe.
      *MEL: xxx [>] a bomb scare.
      JenS.cha, lines 332-333
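The Best Fit Principle and the token-overrides-type mechanism can be illustrated with a toy classifier. This is my own sketch, not Hudson's formalism: the property names, types and scoring are invented assumptions. The token is linked to the type it shares most properties with, and mismatching properties survive as the token's own overriding (deviant) features instead of crashing the analysis.

```python
# Toy illustration of the 'Best Fit Principle': link a token to the type that
# matches most of what is known about it. Types and properties are invented.

TYPES = {
    "German BUSSE (plural)":  {"stem": "bus", "plural_suffix": "e",
                               "stem_vowel": "u", "context": "German"},
    "English BUSES (plural)": {"stem": "bus", "plural_suffix": "es",
                               "stem_vowel": "A", "context": "English"},
}

def best_fit(token):
    """Return the type sharing the most properties with the token."""
    def score(name):
        return sum(token.get(k) == v for k, v in TYPES[name].items())
    return max(TYPES, key=score)

# The deviant token /bAS9/: German-like plural -e, English-like stem vowel,
# uttered in an otherwise German clause.
token = {"stem": "bus", "plural_suffix": "e", "stem_vowel": "A",
         "context": "German"}

winner = best_fit(token)
overrides = {k: v for k, v in token.items() if TYPES[winner].get(k) != v}
print(winner)     # German BUSSE (plural)
print(overrides)  # {'stem_vowel': 'A'} - kept as the token's deviant feature
```

The point of the sketch is only that classification is a matter of degree: the analysis records the deviance as a token-level override rather than rejecting the input.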
This is obviously a big advantage of WG for natural speech data. Another characteristic of natural speech data - and code-mixed data in particular - is that they are inherently variable. Most syntactic theories aim at describing and explaining regularized and standardized linguistic data and therefore disregard inherent variability. Hudson (1997) outlines how a prototype-based network theory that is based on default inheritance and uses entrenchment, like WG, can incorporate variation. One of the strengths of the network approach is that it allows links to have different 'strengths'; these are an essential ingredient of the model of spreading activation, and are highly relevant to quantitative work. Hudson (1997) stipulates that a language user who observes variation will arrive at generalizations about this variation. Each part of a variable network structure has some degree of 'entrenchment' which reflects the experiences of the person concerned. The degree of entrenchment of a concept can be represented as the probability of that concept being preferred to any relevant alternatives. This is shown for word-final variable t/d loss in Figure 1, where the figures13 in angled brackets represent the probabilities.
Figure 1 (Hudson 1997: Figure 5)

This analysis of variation is declarative and non-procedural and requires just two elementary operations: pattern-matching and default inheritance. Speakers and hearers need to know that alternative forms can be used instead of the basic form, and in a real-life context the choice between them is influenced by the linguistic and social context. Figure 2 just hints at how these extra variables could be introduced.
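As a sketch of how entrenchment-weighted links could drive variant choice, the snippet below samples between a full and a t/d-deleted word-final form. The probabilities and the word are invented stand-ins for the angled-bracket figures; this is a minimal probabilistic reading of the idea, not Hudson's model itself.

```python
import random

# Invented entrenchment-derived probabilities for the two variants of a
# word with final /t/: the full form vs the t/d-deleted form.
variants = {"last": 0.6, "las'": 0.4}

def choose(variants, rng):
    """Pick one variant, weighted by its entrenchment value."""
    forms, weights = zip(*variants.items())
    return rng.choices(forms, weights=weights, k=1)[0]

rng = random.Random(42)                    # fixed seed for reproducibility
sample = [choose(variants, rng) for _ in range(10_000)]
print(round(sample.count("last") / len(sample), 2))  # close to 0.6
```

A fuller model, as Figure 2 hints, would condition the weights on linguistic context (e.g. a following vowel) and social context rather than using a single fixed probability per link.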
Figure 2 (Hudson 1997: Figure 6)

This model of inherent variability is possible because WG assumes that linguistic concepts are closely linked to non-linguistic concepts and carry quantitatively different entrenchment values. The reason why I find the proposed model so appealing is that it is a model of competence - not performance. Inherent variability is generally (rightly or wrongly) associated with performance, and to my knowledge there is no other model that presents variability and sociolinguistic information as part of a speaker's competence. I believe that linguistic variation that is influenced by social factors is part of every speaker's competence, and a (more fleshed out) model of how speakers exploit their sociolinguistic competence is therefore required within linguistic theory. In the following main section of this chapter I will present a quantitative/variationist and qualitative analysis of monolingual and code-mixed subordinate clauses. As none of the syntactic restrictions on code-switching proposed in the literature hold absolutely and universally, several recent studies in the field (Mahootian and Santorini 1996; MacSwan 1999; Eppler 1999) have reverted to the null hypothesis. I take the same approach. Formulated in WG terms, the null hypothesis assumes that each word in a switched dependency satisfies the constraints imposed on it by its own language. Subordination was chosen as an area of investigation because the two languages in contact in this particular situation, German and English, display surface word order differences: English subordinate clauses are SVO whereas German subordinate clauses are SOV. The contrasting word order rules for English and German, stated in Word Grammar rules, are:

E1) In English any verb follows its subject but precedes all its other
dependents. This holds true for main as well as subordinate clauses and gives rise to SVO order in both clause types.

E2) Subordinators, e.g. because, require a following finite verb as their complement. A word's complement generally follows it.14

For German the most relevant rules15 concerning word order in main and subordinate clauses are:

G1) A default finite verb follows one of its dependents but precedes all other dependents. This gives rise to verb second (V2) word order in German main clauses.

G2) A finite verb selected by a lexical subordinator/complementizer takes all its non-verb dependents to the left, i.e. it is a 'late'16 verb.

G3) Subordinators/complementizers, e.g. daß, select a 'late' finite verb as their complement.17

According to G2 finite 'late' verbs follow all their non-verb dependents. An example illustrating rules G1-G3 would be:

(11) Ich glaube nicht, daß wir die Dorit schon gekannt haben
     I think not that we Dorit already known have
     JenS.cha, line 83

The utterance-initial main clause displays V2 word order. The finite auxiliary haben, which depends on the subordinator daß, on the other hand, is in clause-final position, following all other constituents including non-finite verbs like gekannt. In English, finite verbs in subordinate clauses do not behave differently from finite verbs in main clauses. Therefore we do not have to override the default rule E1 in the 'is-a hierarchy' of grammar rules. Because German finite verbs depending on a subordinator take a different word order position than 'independent' finite verbs, we need a more specific rule (G2) that overrides the default rule (G1) in the cases stated, i.e. finite verbs selected by German subordinators. The pre-minimalism constituent-based models discussed in section 2 all have difficulties accounting for mixing between SVO and SOV languages because of the opposite setting of the branching parameter.
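Rules E1/E2 and G1-G3 lend themselves to a direct computational statement. The sketch below is my own encoding, not part of WG: it checks a finite verb's linear position against the rule set of the verb's own language, which is exactly what the null hypothesis requires for switched dependencies.

```python
def verb_position_ok(verb_index, dependents, language, under_subordinator):
    """Check a finite verb's position against E1 or G1-G3.

    verb_index: linear position of the verb in its clause
    dependents: linear positions of the verb's non-verb dependents
    """
    before = [d for d in dependents if d < verb_index]
    if language == "English":              # E1: only the subject precedes
        return len(before) <= 1
    if under_subordinator:                 # G2/G3: 'late' (clause-final) verb
        return len(before) == len(dependents)
    return len(before) == 1                # G1: V2 in German main clauses

# (11) '... dass wir die Dorit schon gekannt haben': 'haben' follows all
# its non-verb dependents, as G2/G3 require.
print(verb_position_ok(4, [0, 1, 2], "German", under_subordinator=True))   # True
# German main clause: exactly one dependent precedes the finite verb (V2).
print(verb_position_ok(1, [0, 2], "German", under_subordinator=False))     # True
# English keeps SVO even under a subordinator (rule E1 is never overridden).
print(verb_position_ok(1, [0, 2, 3], "English", under_subordinator=True))  # True
# A German verb in V2 position under a subordinator violates G2/G3.
print(verb_position_ok(1, [0, 2], "German", under_subordinator=True))      # False
```

The last call previews the empirical puzzle of section 4: German verbs in V2 position after a causal conjunction fail G2/G3 only if that conjunction is analyzed as a subordinator.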
I will show in the next section that this code-mixing pattern can be studied productively in terms of WG.

4. Word Order in Mixed and Monolingual 'Subordinate' Clauses

Code-switching between main and subordinate clauses was chosen as a research area for several reasons. First, it is interesting from a syntactic point of view. If German-English bilinguals want to code-switch subordinate clauses, they need to resolve the problem of English being SVO whereas German finite verbs depending on subordinating conjunctions are generally placed in clause-final position (SOV).18 How this word order contrast is resolved is relevant to the
underlying question in all grammatical code-switching research, i.e. whether there are syntactic constraints on code-mixing. Second, the code-switched corpus contains a considerable number of switches between main and subordinate clauses (37), not including the 27 switches involving because discussed in more detail below. Third, code-switching at clause boundaries has attracted much attention in the research area. As complementizers are one of the most commonly quoted examples of word classes that are functional categories in constituent-based models of syntax, the government and functional head constraints discussed in section 2 all rule out switching between C and the remainder of the CP. Gumperz (1982) also proposes that subordinate conjunctions must always be in the same code as the conjoined sentence. Sankoff and Poplack (1981: 34), on the other hand, observe that in their Spanish/English corpus subordinate conjunctions tend to remain in the language of the head element on which they depend. Bentahila and Davies' (1983) corpus of Arabic/French yields numerous examples of switches at various types of clause boundary: switches between main clauses and subordinate clauses, switches between complementizers and the clauses they introduce, and examples where the conjunction is in a different language from both clauses. Although my corpus also contains switches at all the points discussed by Bentahila and Davies (1983), my data largely support Gumperz' (1982) 'constraint', that is, subordinate conjunctions (apart from because) tend to be in the language of the subordinate clause that depends on them, and not of the head element on which they depend. Examples illustrating switches between main and various types of subordinate clauses in both directions are:

(12)
*MEL: ich hab(e) gedacht, there is going to be a fight.
      I have thought
      Jen1.cha, line 987

(13) *MEL: I forgot, dass wir alle wieder eine neue partie angefangen haben.
                     that we all again a new game started have
     Jen1.cha, line 2541

(14) *TRU: die mutter wird ihr gelernt haben, how to keep young.
           her mother would her taught have
     Jen1.cha, line 2016

(15) *DOR: wenn du short bist, you wouldn't talk.
           when you are
     *DOR: aber wenn man geld hat, you talk.
           but when one money has
     Jen3, lines 581-2

(16) *TRU: er schreibt fuenfzehn, if you leave it in your hand.
           he counts fifteen
     Jen2.cha, line 932

(17) *LIL: das haengt davon ab, what 'nasty' is(t).
           that depends on
     Jen2.cha, line 1062

Note that the null hypothesis is borne out in examples (12)-(17) and in the vast majority of monolingual and mixed dependencies19 in the German-English corpus. The WG rules determining the word order in main and subordinate clauses also hold. These findings are furthermore supported by the quantitative analysis of 1,350 monolingual and 690 mixed dependency relations in a 2,000-word monolingual sample corpus and a 7,000-word code-mixed corpus (see Eppler 2004).
This study particularly focuses on because and weil clauses. Several researchers (Gardner-Chloros 1991; Salmons 1990; Treffers-Daller 1994; Bolle 1995; Boumans 1998) studying code-mixing between SVO and SOV languages noticed that the clauses depending on switched conjunctions are frequently not SOV but V2. The conjunction in these examples, furthermore, is frequently the causal conjunction because, parce que or omdat. This led Boumans (1998: 121) to hypothesize that '... it is possible that foreign conjunctions do not trigger verb-final in Dutch and German clauses simply because they are used in functions that require main clause order'. He, however, found it 'hardly feasible to examine this hypothesis in relation to the published examples because these are for the most part presented out of context' (Boumans 1998: 121). I will show that a fully (LIDES20) transcribed corpus of German and English data allows us to test this hypothesis. Both types of analysis, qualitative structural and quantitative distributional, are considered necessary for a comprehensive description of the data, because different structural patterns are used to different degrees and for different purposes. The variation in the data can best be described quantitatively; the qualitative analysis provides an explanation for the structural patterns found. This combination of methodologies furthermore enables us to address Muysken's (2000: 29) statement that '... we do not yet know enough about the relation between frequency distributions of specific grammatical patterns in monolingual speech data and properties of the grammar to handle frequency in bilingual data'.
I will compare the because- and weil-clauses in mixed utterances with monolingual German and English examples and show that we do know enough about the syntax and pragmatics of this construction to explain both the frequency distribution of causal conjunctions and the use of verb second (rather than verb final) word order.

4.1 The empirical issues

4.1.1 ASYMMETRY BETWEEN CONJUNCTIONS OF REASON

The distribution of German and English subordinators/complementizers in the corpus is approximately 60:40, which is in accordance with the general distribution of word tokens from the two languages in the data. If, however, we focus on because and its translation equivalent from the same word class, the subordinating causal conjunction weil, we get a very different picture. The corpus yields twice as many tokens of the English subordinator as it does of weil (see Table 1). A typical use of because, especially for speaker DOR, is:

(18) DOR: es war unsere [...] Schuld because man fühlt sich
          it was our fault                  one feels
     mit den eigenen Leuten wohler.
     with the own people happier.
     Ibron.cha, line 221

Because in the above example can be argued to be a single lexical item inserted
in otherwise German discourse. This particular usage of the English causal subordinator is not restricted to speaker DOR:

(19) LIL: because er ist ein aufbrausender Irishman.
                  he is a hot-blooded
     Jen1.cha, line 389
Because also enters syntactic relations where the word on which it depends is English (eat) and its dependent is German (schmeckt), as in:

(20) DOR: eat it with der Hand-! because das schmeckt ganz anders.
                      the hand           it tastes very differently
     Ibron.cha, line 2214
or vice versa, e.g. because has a German head verb (habe) but an English complement (know):

(21) MEL: ich hab's nicht einmal gezählt because I know I'm going to lose.
          I have it not even counted
     Jen1.cha, line 881
The German subordinator of reason, weil, on the other hand, only enters into monolingual dependency relations:

(22) DOR: dann ist sie, weil sie so unglücklich war, dort gestorben.
          then has she, because she so unhappy was, there died
     Ibron.cha, line 1002
So there is not only an asymmetry in the number of tokens each subordinator yields, but also in the language distribution of the immediate syntactic relations which because and weil enter into, i.e. their main clause head verb and the subordinate dependent verb. The results are summarized in Table 1.

Table 1: Language of head and dependent of because and weil

          headE-depE   headE-depG   headG-depG   headG-depE   total
because   86           5            16           6            113
weil      0            0            59           0            59
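The counts in Table 1 can be reproduced from simple dependency records. The sketch below is illustrative only: the records are entered from the table, not derived from the raw corpus. It tallies, for each conjunction, its total and how many of its immediate relations involve a head or dependent from the other language.

```python
from collections import Counter

# (conjunction, head language, dependent language) -> tokens, as in Table 1
counts = Counter({("because", "E", "E"): 86, ("because", "E", "G"): 5,
                  ("because", "G", "G"): 16, ("because", "G", "E"): 6,
                  ("weil", "G", "G"): 59})

CONJ_LANG = {"because": "E", "weil": "G"}  # each conjunction's own language

def summarize(conj):
    """Total tokens, and tokens whose head or dependent is other-language."""
    own = CONJ_LANG[conj]
    total = sum(n for (c, _h, _d), n in counts.items() if c == conj)
    other = sum(n for (c, h, d), n in counts.items()
                if c == conj and (h != own or d != own))
    return total, other

print("because:", summarize("because"))  # (113, 27)
print("weil:", summarize("weil"))        # (59, 0)
```

The output restates the asymmetry in one line per conjunction: because enters 27 relations with a German head or dependent, while weil enters none with an English one.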
The phenomenon of single lexical item subordinate conjunctions in other-language contexts is not uncommon in the code-mixing literature.21 As far as the directionality of the switch is concerned, the situation in my corpus is in sharp contrast with the findings of Clyne (1973), who studied German/English code-mixing among the Jewish refugee community in Melbourne, Australia. He reports that 'the words transferred from German to English are mainly conjunctions (denn, ob, und, weil, wie, wo)' (Clyne 1973: 104). The corpus from the refugee community in London also shows a high propensity for switching
conjunctions; however, the vast majority of them are English conjunctions in otherwise German discourse. Lexical transfer of the same word class thus seems to work in the opposite direction in two bilingual communities with a very similar sociolinguistic profile mixing the same language pair. To rule out the possibility that English because is used in place of another German causal conjunction, I will now look at the other possibilities. Da is another causal subordinator, producing the same word order effects as weil, but normally used in more formal contexts. The whole corpus yields only one example of German da used as a subordinating conjunction. This token is embedded in formal discourse and was produced by a speaker who does not use the mixed code as a discourse mode. Denn is a causal coordinating conjunction. It was used once by a speaker from the group recordings (not DOR) and three times by a speaker in a more formal setting. Denn has increasingly gone out of use in colloquial German (Pasch 1997; Uhmann 1998); however, since it is used by my informants, we need to consider it as a possible translation equivalent of because. This possibility is interesting because it involves word order issues: as a coordinating conjunction, denn always takes V2 order in the clause following it. The relation between weil and denn will be discussed further in section 4.2.2 on word order.

4.1.2 VERB SECOND WORD ORDER AFTER BECAUSE AND WEIL

Examples (18)-(20) also demonstrate the structural feature under investigation: German finite verbs occur in main clause word order position in subordinate clauses introduced by because. In actual fact not one German finite verb depending on because is in clause-final position (as in monolingual German subordinate clauses with an overt German subordinator; see example 22). Furthermore, not all finite dependent verbs follow their subject.
Some of them follow fronted indirect objects as in (23), others follow adverbials as in (24):

(23) DOR: because dem Computer brauchst' es nicht zeigen.
                  the computer need you it not show.
     Jen2.cha, line 729

(24) LIL: is' wahr -? because bei mir hat schon +...22
          it's true           at my place has already
     Jenf.cha, line 298
The word order in subordinate clauses after because is summarized in Table 2.

Table 2: Word order in subordinate clauses after because

          dependent English   dependent German
because   SVX: 92             SVX: 15   XVS: 6   SOV: 0
What are supposed to be German dependent verbs occur in second position after because, which shows that because, at least for my informants, has not taken over the syntactic characteristics of the German subordinating conjunction weil, which requires its dependent verbs to be clause-final. Let us now take a closer look at this subordinator. Table 1 illustrates that weil only has German complements. According to the rules of standard German (rules G2 & G3), finite verbs depending on an overt subordinator should follow all their dependents, i.e. be clause-final. This is not borne out in the corpus. Note, however, that 58 per cent of dependent verbs are in final position after weil, whereas none is in this position after because. Table 3 summarizes the position of the dependent finite verb in weil clauses from my corpus. In order to see whether verb second after weil is a parochial convention of my data or not, I also give the distribution of V2 and Vf from several other corpora of monolingual spoken German23 for comparison.

Table 3: Verb position after weil, partly based on Uhmann (1998: 98)

Weil                  Vf     V2    Vf      V2
Eppler (2004)         34     25    58%     42%
BYU (Vienna)          62     11    85%     15%
Farrar (1998) BYU     1147   517   69%     31%
Schlobinski (1992)    74     22    77%     23%
Uhmann (1998)         24     19    56%     44%
Dittmar (1997)        99     29    77.3%   22.7%
Table 3 shows that between 15 per cent and 44 per cent of dependent verbs in these corpora are not in final position. So weil+V2 word order is not just a peculiarity of the German spoken by my bilingual informants. We thus have two problems to solve: 1) the asymmetrical distribution of because and weil in the corpus; and 2) the word order variation in both mixed and monolingual causal clauses introduced by because and weil. In the next section I will suggest possible solutions to these two problems.

4.2 Possible explanations

4.2.1 FOR THE ASYMMETRY OF BECAUSE AND WEIL
The frequencies with which because and weil occur in dependency relations (summarized in Table 1) suggest that for the asymmetry between because and weil a probabilistic perspective is required. Fourteen out of the sixteen tokens of because in an otherwise German context were produced by one speaker (DOR). This is even more significant if we remember that this speaker is German dominant. The data from this speaker only contain seven tokens of the German subordinator weil (and no denn). Because thus seems to replace weil for specific uses in the speech of this speaker. This use of the causal conjunctions is also to be found among the close-knit network of bilinguals who use the mixed code as a discourse mode
(speakers TRU, MEL and LIL); but there is no significant asymmetrical relation between because and weil in the rest of the corpus. Reasons for the discrepancy between the British and Australian corpora will have to remain speculative for the moment. I will, however, come back to this point at the end of section 4.2.2. Why German-speaking Jewish refugees in Australia incorporate German conjunctions into their English - and why the directionality of lexical transfer is reversed among the same speakers in Britain - could be due to the Australian corpus having been collected approximately 20 years before the London corpus. Michael Clyne collected data from this speech community in the 1970s. My corpus was collected in 1993. An additional two decades of exposure to English for the London-based refugees may be a possible explanation for this discrepancy. Data from American German dialects that have been in contact with English for up to two centuries support this assumption. See example (25) from Salmons (1990: 472):

(25)
Almost jedes moi is is Zeit Supper recht khat fer Suppe gewen, because mir ban kei essen.
       every time is it time soup properly had for soup be           we have no to eat
Treffers-Daller (1994: 192-5) discusses (25) and (26) and suggests analyzing the conjunctions in these two examples as coordinators. For monolingual English Schleppegrell (1991: 323) argues that 'a characterisation of all because clauses as subordinate clauses [...] is inadequate'. The possibility of a paratactic24 function for because will be discussed in the next section. Gardner-Chloros's (1991) French/Alsatian data also offer an interesting example of two Alsatian clauses linked by a French causal conjunction.

(26) Un noh isch de Kleinmann nunter, parce que ich hab mi dort mue melde.
     and now is the Kleinmann down there        I have myself there must check in.
The German verbs selected by the English and French conjunctions in examples (25) and (26) follow just one dependent, in these cases their subjects. I will discuss the not strictly causal/subordinating use of English because, German weil and French parce que in the next section.

4.2.2 V2 AFTER BECAUSE AND WEIL
The clearest result of the quantitative analysis presented in Table 2 is that all German finite verbs in clauses after because are in second position and none is in clause-final position. The Word Grammar rules stated in section 3 account for the empirical data because English subordinators only require finite verbs as their complements (rule E2). German subordinators (rule G3), on the other hand, provide a
specific context that requires dependent verbs to take all their dependents to the left. As because is an English subordinator which does not specify that its complement has to be a clause-final verb, we get main clause word order (SVO in monolingual English or V2 in mixed utterances). Supporting evidence for this interpretation comes from the six instances where the finite verb follows a dependent other than its subject (cf. examples 23-24 and 27 below).

(27)
DOR: I lost because # dreimal gab sie mir drei Könige.
                      three times gave she me three kings
     Jen1.cha, line 817
In the above example the verb is in second position, but the clause is clearly not SVO. The finite verb is preceded by an adverbial but followed by the subject. In other words, the clause displays the V2 order expected in German main clauses. But how do we know that because and the because-clause are used in a restrictive subordinating way in examples (23), (24) and (27)? This question needs to be addressed because research conducted by, amongst others, Rutherford (1970), Schleppegrell (1991) and Sweetser (1990) casts doubt on the characterization of all because-clauses as subordinate clauses. Especially in spoken discourse, because can be used in a variety of non-subordinating and not strictly causal functions. Several criteria have been proposed to distinguish between restrictive (i.e. subordinating25) and non-restrictive because-clauses (Rutherford 1970). In sentences containing restrictive because clauses, yes/no questioning of the whole sentence is possible; pro-ing with so or neither covers the entire sentence; they can occur inside a factive nominal; and if another because clause were added, the two causal clauses would occur in simple conjunction. In semantic terms the main and the subordinate clause form one complex proposition and the because-clause provides the cause or reason for the proposition itself. This causal relationship is one of 'real-world' causality (Sweetser 1990: 81). Chafe (1984) asserts that restrictive because clauses have a reading that presupposes the truth of the main clause and asserts only the causal relation between the clauses. These clauses tend to have a commaless intonational pattern. I will now apply these characteristics to some of the causal clauses introduced by because in the corpus cited so far. Utterance (27) passes all of Rutherford's (1970) syntactic criteria for restrictive because-clauses.
The main and because-clauses form one complex proposition with a reading in which 'her giving the speaker three kings' is the real-world reason for the speaker losing the game of cards. The truth of the sentence-initial clause is presupposed and the causal relation between the two clauses is asserted. These properties of (27) argue for a restrictive analysis. The intonational contour of the utterance, however, displays a short pause after the conjunction.26 Note furthermore that the causal clause in (27) contains a pre-posed constituent that triggers inversion, i.e. a main clause phenomenon (Green 1976). So there are indicators for a restrictive/subordinate reading but also syntactic and intonational clues that
point to a non-restrictive/epistemic reading in which the speaker's knowledge causes the conclusion. The latter interpretation suggests non-subordination, which would justify the V2 word-order pattern. Example (18), repeated here with more context (to facilitate the interpretation) and prosodic information as (28), contains the English conjunction because but is otherwise lexified with German words:

(28)
DOR: wir waren nie mit richtige Englaender zusammen.
     'we never mixed with "real" English people'
DOR: man hätte können # man hat nicht wollen.
     'we could have # but we didn't want to'
DOR: es war unsere [...] Schuld-. because man fühlt sich mit den eigenen Leuten wohler.
     it was our fault                     one feels oneself with the own people better
     Ibron.cha, lines 217-22
This example passes none of Rutherford's (1970) 'tests'. The intonational drop before the conjunction, which intonationally separates the two clauses, also suggests a non-subordinate analysis for (28). A restrictive reading of the whole construction is awkward if not unacceptable: feeling relaxed in the company of fellow compatriots is not the cause or reason for feeling guilty. The non-restrictive reading in which the because clause provides the reason why the speaker said 'it was our own fault' is far more plausible. The because clause, furthermore, indicates an interpretative link between clauses that are several utterances apart: the last utterance in (28) provides a 'long-distance' reason for the first utterance in this sequence. Schleppegrell (1991: 333) calls these uses of because 'broad-scope thematic links'. They can only be identified when a corpus provides the relevant context for the example. The wider context also identifies the clause preceding the causal clause as presupposed and thematic. The information provided in the causal clause is new and asserted. The analysis so far suggests that because is used in non-restrictive and non-subordinating functions in code-mixed utterances in my corpus. Without repeating them, I will now briefly discuss the other examples in which because introduces a clause with predominantly German lexical items (examples 19-20 and 23-24). Example (19) is a response to a preceding wh-question and thus an independent utterance; the information presented in the reply is not informationally subordinated, it forms the focus of the discourse and provides new information (Schleppegrell 1991: 31). Example (20) has two intonational contours. The intonational rise and the verb-first order mark the initial clause as a command or suggestion, i.e. an independent proposition; the following because clause then represents an elaboration of that proposition. The content of the causal clause is therefore not presupposed.
Example (20) displays all the characteristics of an 'epistemic' (Sweetser 1990) because, which indicates 'elaboration and continuation in non-subordinating and non-causal contexts' (Schleppegrell 1991: 323). The because clause in example (23) is preceded by a short pause, contains a main clause phenomenon (extraction), and is reflexive
WORD GRAMMAR AND SYNTACTIC CODE-MIXING
on the previous discourse; finally, the because clause in (24) follows a rising intonation of the initial tag, and again explicitly mentions the speaker's knowledge state ('it's true'). We can conclude that those clauses in which because has a German (V2) verb as its complement display more characteristics of 'non-restrictive' (Rutherford 1970) clauses and should therefore be analyzed as paratactic rather than subordinating. The Word Grammar rules formulated in section 3 still account for the data: if because is not analyzed as a subordinator, the default rule G1 is not overridden and G2 and G3 do not get activated. The analysis of the code-mixed data discussed so far indicates that the predominantly German clauses introduced by because fulfil functions that are not strictly causal but rather epistemic, broad-scope thematic links, etc. This distinct usage is also reflected in their structural and intonational patterns. We can therefore assume that we are dealing with non-restrictive because, which is non-subordinate and thus triggers main clause (V2) word order. However, we also need to consider the monolingual data. The monolingual German data from my corpus are more worrying at first sight. Like because, weil was traditionally analyzed as a subordinating conjunction with causal meaning which takes a finite verb as its complement. These grammar rules are not absolutely adhered to by my informants and monolingual speakers of German. Only 58 per cent of verbs depending on weil in the speech of my informants are in clause-final/'late' position. Table 3 shows, furthermore, that in corpora of similar, i.e. southern, varieties of German only 31-85 per cent (with an average of approximately 67 per cent) of the subordinate clauses introduced by weil are grammatical according to the rules for monolingual German as stated in section 3.
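The way the grammar lets a default rule survive unless a subordinator overrides it can be pictured as a default-inheritance lookup. This is a hedged sketch, not the chapter's own formalism: the contents of G1-G3 are reconstructed from the surrounding discussion (G1 as the default verb-second rule; G2/G3 letting a complementizer with the 'subordinate' feature select 'final/late' placement), and the word lists are illustrative.

```python
# Sketch of default inheritance for clause-level verb placement.
# The rule contents (G1 default = V2; G2/G3 = a subordinating
# complementizer selects verb-final order) are assumptions
# reconstructed from the discussion, not the author's exact rules.

SUBORDINATORS = {"weil", "dass", "ob"}   # carry the 'subordinate' feature
PARATACTIC = {"denn", "because"}         # leave the default untouched

def verb_position(conjunction=None):
    """Expected position of the finite verb in a German clause."""
    if conjunction in SUBORDINATORS:
        # G2/G3: the 'subordinate' feature overrides the default.
        return "final"
    # G1 (default): main-clause verb-second order; not overridden
    # by paratactic conjunctions or by code-mixed 'because'.
    return "V2"

print(verb_position("weil"))     # restrictive weil clause: 'final'
print(verb_position("because"))  # code-mixed because clause: 'V2'
print(verb_position())           # plain main clause: 'V2'
```

On this sketch, the finding that mixed because clauses invariably show V2 follows from because not carrying the 'subordinate' feature, exactly as the text argues for G1 not being overridden.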
The recent German literature on weil constructions (Günthner 1993, 1996; Pasch 1997; Uhmann 1998), however, suggests an explanation for the monolingual German data and opens up the possibility of an interesting interpretation of the mixed data. There is agreement among the above-named researchers that a) there is considerable variation in the use of weil + V2 and weil + Vf; b) weil + V2 is most frequent in southern German dialects; and c) weil clauses with verb-final placement and weil clauses with main clause (V2) word order show systematic but not absolute differences. In a nutshell, the analysis for German weil is similar to the analysis proposed for English because: there are two types of weil clauses, one strictly subordinating one, and several non-restrictive paratactic uses. The factor that best seems to account for the data is the information structure of the construction. If pragmatics and syntax, which in German is a much clearer indicator than in English, fail to provide clear criteria as to which type of weil-construction we are dealing with, intonation can once again help to disambiguate. Example (29) from my corpus illustrates epistemic weil + V2: (29)
LIL:
sie hat sich gedacht, die [/] die muss doch Wien kennenlernen,
'she thought she needs to get to know Vienna'
weil die eltern sind beide aus Wien.
because parents are both from Vienna
JenS.cha, line 107-8
Note that in (29) weil could be replaced by the German coordinating conjunction denn. Pasch (1997) and Uhmann (1998) agree that the non-restrictive weil seems to take the position of Standard German denn in the system of conjunctions of reason in colloquial German. In the analysis so far it has been established that there are 'restrictive' and 'non-restrictive' because clauses in English and 'restrictive' and 'non-restrictive' weil clauses in German. A cross-linguistic comparison of these clause types revealed that they share many of their discourse-pragmatic, syntactic and intonational characteristics. My informants use both clause types from both languages in monolingual contexts. In addition to this, they employ because in code-mixed contexts. They treat English because as the translation equivalent of the non-restrictive weil + V2 or denn. Their linguistic competence tells them that these constructions are equivalent in syntax and pragmatic content. This was demonstrated for the quoted examples and also holds true for the because + V2 examples not reproduced in this chapter. Furthermore, if we apply this analysis to the quantitative asymmetry found in the corpus between the two conjunctions because and weil and add the 21 tokens of because + V2 to the weil tokens, this asymmetry shrinks to a figure (80 weil : 120 because) which is in line with the general language distribution in the corpus. In addition to the syntactic and pragmatic reasons for using this 'congruence approach' (Sebba 1998: 1) to switching at clause boundaries, my informants may also be dialectally pre-disposed to the weil + V2 construction because all of them are L1 speakers of a southern German variety. I will now briefly return to the discrepancy between the Australian (Clyne 1973) and London corpora mentioned in sections 4.1.1 and 4.2.1.
The question was why German-speaking Jewish refugees in Australia incorporated German conjunctions into their English, while the directionality of lexical transfer is reversed among the same group of speakers in Britain. I hypothesized that duration of language contact may have something to do with it. At the time of data collection, German-speaking refugees in Australia had been mixing German and English for approximately 30 years. In London, on the other hand, these two languages had been in contact for more than half a century when I collected my data. Another situation where we can witness long-term contact between the two languages under investigation is provided by German-American dialects. Note, furthermore, that example (25) from these data (Salmons 1990) also has main clause word order after because. The development in Pennsylvania German (Louden 2003) is particularly interesting in this respect. Louden (2003) illustrates the causal conjunction paradigm in Pennsylvania German (PG) data from the 19th century onwards. In the second half of that century he found the standard German distribution of weil + verb-final and dann (< Germ. denn) + V2. In data from the beginning of the 20th century PG still has verbs depending on weil in final position; dann, however, has been replaced by fer (< Engl. for) + V2. In modern sectarian PG weil is backed up with (d)ass, a historical merger of dass with comparative als, and fer (originally dann < Germ. denn) has been replaced with because + V2. This development is interesting for several reasons: PG, in the late 19th,
early 20th century went through a phase that mirrors present-day English in the distribution between because and for. In modern Pennsylvania German, weil does not seem to be able to function as a subordinator in its own right any longer, and it has to be backed up by another complementizer to trigger verb-final placement. This supports rule G2 (section 3), which implicitly proposes a subordinate feature on lexical complementizers. Modern PG weil seems to have lost this feature and therefore needs to be 'backed up' by another subordinator to trigger verb-final word order. Dann in modern PG, on the other hand, after having gone through the stage of fer (
• Word Grammar requires less analysis than constituency-based models because the only units that need to be processed are individual words. Larger units are built by dependency relations between two words, which can be looked at individually.
• As syntactic structure consists of dependencies between pairs of single words, constraints on code-mixing are less prone to over-generalization than constraints involving notions of government and constituency.
• Word Grammar allows a single, completely surface analysis (with extra dependencies drawn below the sentence-words). Code-mixing seems to be a surface-structure phenomenon, so this property of WG fits the data.
• Knowledge of language is assumed to be a particular case of more general types of knowledge. Word Grammar accommodates sophisticated sociolinguistic information about speakers and speech communities. This is important for language contact phenomena that are influenced by social and psychological factors.
• In contrast with most other syntactic theories, Word Grammar recognizes utterances.
• WG is a competence model which can handle inherent variability.
I do not claim that the present work illuminates theories of language structure, but it confronts a linguistic theory, Word Grammar, with statistical data, and shows that this theory of language structure can be successfully and illuminatingly used for the analysis of monolingual and code-mixed constructions. The WG formulation of the null hypothesis is borne out with just a handful of exceptions, and the WG rules determining word order in monolingual German or English and code-mixed clauses also hold. The investigation of word order in subordinate clauses, furthermore, shows that the null hypothesis seems to be correct even in cases where we would expect restrictions on code-switching due to surface word order differences between the two grammars involved in mixing. The quantitative analysis of monolingual and code-mixed because and weil clauses revealed that a) the core group of informants favour the English causal conjunction because over German weil or denn; the use of weil and denn is restricted to monolingual German contexts, and because is also used to introduce mixed utterances; b) the word order in weil clauses varies between verb-final, as required in subordinate clauses, and verb-second, the main clause order; the coordinating conjunction denn only occurs once, and with main clause order, as expected; mixed clauses introduced by because invariably have verb-second structure.
Independent research on the syntactic, intonational, semantic and pragmatic properties of monolingual because and weil clauses has shown that these properties cluster to form two main types of causal clauses: restrictive and non-restrictive (Rutherford 1970). The qualitative analysis of the monolingual causal clauses in the corpus revealed that they also fall into these two types and that the mixed utterances introduced by because predominantly have the grammatical properties of non-restrictive clauses. Thus Boumans' (1998: 121) hypothesis that 'foreign conjunctions do not trigger verb-final in German clauses simply because they are used in functions that require main clause order' could be verified. The quantitative analysis of because and weil clauses has furthermore demonstrated how frequency distributions of a specific grammatical pattern in monolingual speech data can be combined with our knowledge about syntactic and pragmatic properties of grammars to handle frequency in bilingual data (Muysken 2000). The WG analysis of German (and Dutch) lexical subordinators as having a 'subordinate' feature which triggers verb-final placement was furthermore supported by data from two other language contact situations (Pennsylvania German and Brussels Dutch) in which certain subordinators seem to have lost this feature and therefore require 'backing up' from overt complementizers.
References
Belazi, H. M., Rubin, E. J. and Toribio, A. J. (1994), 'Code switching and X-bar theory: The functional head constraint'. Linguistic Inquiry, 25, 221-37.
Bentahila, A. and Davies, E. E. (1983), 'The syntax of Arabic-French code switching'. Lingua, 59, 301-30.
Bolle, J. (1995), 'Mengelmoes: Sranan and Dutch language contact', in Papers from the Summer School Code-switching and Language Contact. Ljouwert/Leeuwarden: Fryske Akademie, pp. 290-4.
Boumans, L. (1998), The Syntax of Codeswitching: Analysing Moroccan Arabic/Dutch Conversations. Tilburg: Tilburg University Press.
Chafe, W. L. (1984), 'How people use adverbial clauses'. Berkeley Linguistics Society, 10, 437-49.
Chomsky, N. (1981), Lectures on Government and Binding. Dordrecht: Foris.
Clyne, M. G. (1973), 'Thirty years later: Some observations on "Refugee German" in Melbourne', in H. Scholler and J. Reidy (eds), Lexicography and Dialect Geography, Festgabe for Hans Kurath. Wiesbaden: Steiner, pp. 96-106.
— (1987), 'Constraints on code-switching: How universal are they?'. Linguistics, 25, 739-64.
DiSciullo, A.-M., Muysken, P. and Singh, R. (1986), 'Government and code-mixing'. Journal of Linguistics, 22, 1-24.
Eppler, E. (1999), 'Word order in German-English mixed discourse'. UCL Working Papers in Linguistics, 11, 285-308.
— (2004), '... Because dem Computer brauchst' es ja nicht zeigen': Because + German main clause word order'. International Journal of Bilingualism, 8, 127-43.
— German/English LIDES database.
Gardner-Chloros, P. (1991), Language Selection and Switching in Strasbourg. Oxford: Clarendon Press.
Green, G. M. (1976), 'Main clause phenomena in subordinate clauses'. Language, 52, 382-97.
Grosjean, F. (1995), 'A psycholinguistic approach to codeswitching', in P. Muysken and L. Milroy (eds), One Speaker, Two Languages. Cambridge: Cambridge University Press, pp. 259-75.
Gumperz, J. J. (1982), Discourse Strategies. Cambridge: Cambridge University Press.
Günthner, S. (1993), '... Weil - man kann es ja wissenschaftlich untersuchen - Diskurspragmatische Aspekte der Wortstellung in weil-Sätzen'. Linguistische Berichte, 143, 37-59.
— (1996), 'From subordination to coordination?'. Pragmatics, 6, 323-56.
Hudson, R. A. (1980), Sociolinguistics. Cambridge: Cambridge University Press.
— (1990), English Word Grammar. Oxford: Blackwell.
— (1997), 'Inherent variability and linguistic theory'. Cognitive Linguistics, 8, 73-108.
— (2000), 'Grammar without functional categories', in R. Borsley (ed.), The Nature and Function of Syntactic Categories. New York: Academic Press, pp. 7-35.
Joshi, A. K. (1985), 'Processing of sentences with intrasentential code-switching', in D. Dowty, L. Karttunen and A. M. Zwicky (eds), Natural Language Parsing. Cambridge: Cambridge University Press, pp. 190-205.
Lehmann, Ch. (1988), 'Towards a typology of clause linkage', in J. Haiman and S. Thompson (eds), Clause Combining in Grammar and Discourse. Amsterdam/Philadelphia: John Benjamins, pp. 181-226.
Louden, M. L. (2003), 'Subordinate clause structure in Pennsylvania German'. FGLS/SGL Joint Meeting. London, 2003.
MacSwan, J. (1999), A Minimalist Approach to Intrasentential Codeswitching. New York and London: Garland.
Mahootian, S. and Santorini, B. (1996), 'Code-switching and the complement/adjunct distinction'. Linguistic Inquiry, 27, 3, 464-79.
Muysken, P. (1989), 'A unified theory of local coherence in language contact', in P. Nelde (ed.), Language Contact and Conflict. Brussels: Centre for the Study of Multilingualism, pp. 123-9.
— (2000), Bilingual Speech: A Typology of Code-Mixing. Cambridge: Cambridge University Press.
Myers-Scotton, C. (1993), Duelling Languages: Grammatical Structure in Code-Switching. Oxford: Oxford University Press.
Myers-Scotton, C. and Jake, J. L. (1995), 'Matching lemmas in a bilingual language competence and production model: Evidence from intrasentential code switching'. Linguistics, 33, 981-1024.
Pasch, R. (1997), 'Weil mit Hauptsatz - Kuckucksei im denn-Nest?'. Deutsche Sprache, 25, 252-71.
Poplack, S. (1980), 'Sometimes I'll start a sentence in Spanish y termino en español'. Linguistics, 18, 581-618.
Poplack, S. and Meechan, M. (1995), 'Orphan categories in bilingual discourse: A comparative study of adjectivization strategies in Wolof/French and Fongbe/French'. Language Variation and Change, 7, 2, 169-94.
Romaine, S. (1989), Bilingualism. Malden, Mass.: Blackwell.
Rutherford, W. E. (1970), 'Some observations concerning subordinate clauses in English'. Language, 46, 97-115.
Salmons, J. (1990), 'Bilingual discourse marking: Code switching, borrowing, and convergence in some German-American dialects'. Linguistics, 28, 453-80.
Sankoff, D. and Poplack, S. (1981), 'A formal grammar for code-switching'. Papers in Linguistics, 14, 3-46.
Schleppegrell, M. J. (1991), 'Paratactic because'. Journal of Pragmatics, 16, 323-37.
Schlobinski, P. (1992), 'Nexus durch weil', in P. Schlobinski (ed.), Funktionale Grammatik und Sprachbeschreibung. Opladen: Westdeutscher Verlag, pp. 315-44.
Scotton, C. M. (1990), 'Code-switching and borrowing: Interpersonal and macro-level meaning', in R. Jacobson (ed.), Codeswitching as a Worldwide Phenomenon. New York: Peter Lang, pp. 85-110.
Sebba, M. (1998), 'A congruence approach to the syntax of codeswitching'. International Journal of Bilingualism, 2, 1-19.
Sweetser, E. (1990), From Etymology to Pragmatics: Metaphorical and Cultural Aspects of Semantic Structure. Cambridge: Cambridge University Press.
Thorne, J. P. (1986), 'Because', in D. Kastovsky and A. Szwedek (eds), Linguistics across Historical and Geographical Boundaries (Vol. 12). Berlin: Mouton, pp. 1063-6.
Treffers-Daller, J. (1994), Mixing Two Languages: French-Dutch Contact in a Comparative Perspective. Berlin: de Gruyter.
Uhmann, S. (1998), 'Verbstellungsvariationen in weil-Sätzen'. Zeitschrift für Sprachwissenschaft, 17, 92-139.
Zwicky, A. (1992), 'Some choices in the theory of morphology', in R. Levine (ed.), Formal Grammar: Theory and Implementation. Oxford: Oxford University Press, pp. 327-71.
WORD GRAMMAR AND SYNTACTIC CODE-MIXING
143
Notes 1 The corpus was collected in 1993 from German-speaking Jewish refugees residing in London. All transcripts are available on
16 The term 'late' was chosen instead of 'final' because finite dependent auxiliaries in double infinitive constructions can be followed by their non-finite dependents; cf. endnote 15.
17 Support for this analysis comes from the fact that German subordinate clauses lacking a subordinator/complementizer are V2 (or verb initial). Cf.: Sie sagte, sie kennen Doris ('She said they know Doris') vs. Sie sagte, daß sie Doris kennen ('She said that they Doris know'). According to G3, it is only subordinators/complementizers that select 'late' finite verbs. So if a verb depends directly on another verb (kennen directly depending on sagte and not daß) the default rule need not be overridden.
18 Exceptions to this rule are extraposition and double-infinitive constructions.
19 The null hypothesis is violated in five tokens of two construction types: word-order violations of objects and negatives (see Eppler 2004).
20 The data this study is based on are transcribed in the LIDES (Language Interaction Data Exchange) system. More information on the transcription system can be found on .
21 See for example Clyne (1987), Gardner-Chloros (1984), Salmons (1990), Treffers-Daller (1994).
22 Example (24) is an incomplete subordinate clause. This does not affect the analysis because the word order position of the relevant finite dependent verb is clear.
23 Since all my informants are from Vienna, I used only examples from the ten Viennese informants for the Brigham Young University (BYU) corpus. Farrar (1998) counted all occurrences of weil in the speakers of southern German dialects from the BYU corpus. Schlobinski's (1992) data are standard Bavarian; and the Uhmann (1998) corpus is 'alemannisch-bairisch'.
24 Lehmann (1988) suggests that for clauses that are linked in a relationship of sociation rather than dependency, 'parataxis' is a more appropriate term than 'coordination'.
25 Two clauses (X and Y) have been defined as being in a subordination relationship 'if X and Y form an endocentric construction with Y as the head' (Lehmann 1988: 182).
26 Note that in the English literature, Rutherford (1970) and Thorne (1986), the comma intonation is assumed to precede the conjunction. Schleppegrell (1991: 333) mentions the possibility of because followed by a pause.
7 Word Grammar Surface Structures and HPSG Order Domains*
TAKAFUMI MAEKAWA
Abstract
In this chapter, we look at three different approaches to the asymmetries between main and embedded clauses with respect to the elements in the left periphery of a clause: the dependency-based approach within Word Grammar (Hudson 2003), the Constructional Head-driven Phrase Structure Grammar (HPSG) approach along the lines of Ginzburg and Sag (2000), and the Linearization HPSG analysis by Chung and Kim (2003). We argue that the approaches within WG and Constructional HPSG have some problems in dealing with the relevant facts, but that Linearization HPSG provides a straightforward account of them. This conclusion suggests that linear order should be independent to a considerable extent from combinatorial structure, such as dependency or phrase structure.
1. Introduction
There are two ways to represent the relationship between individual words: DEPENDENCY STRUCTURE and PHRASE STRUCTURE. The former is a pure representation of word-word relationships while the latter includes the additional information that words are combined to form constituents. If all work can be done just by means of the relationship between individual words, phrase structure is redundant and hence dependency structure is preferable to it. It would therefore be worth considering whether all work can really be done with just dependencies. We will look from this perspective at certain linear order asymmetries between main clauses and subordinate clauses. One example of such asymmetries can be seen in the contrast of (1) and (2). The former shows that a topic can precede a fronted wh-element in a main clause: (1)
a. Who had ice-cream for supper?
b. For supper who had ice-cream?
(2) illustrates, however, that this is not possible in an embedded clause: (2)
a. Who had ice-cream for supper is unclear.
b. *For supper who had ice-cream is unclear.
It is clear that main clauses are different from subordinate clauses with respect to the possibility of topicalization. It has been noted by a number of researchers that elements occurring in the left periphery of the clause, such as interrogative and relative pronouns, topic and focused elements, show such linear order asymmetries (see Haegeman 2000; Rizzi 1997; and works cited therein). The purpose of this overview chapter is to take a critical look at the current treatment of such asymmetries within the frameworks of WORD GRAMMAR (WG) and HEAD-DRIVEN PHRASE STRUCTURE GRAMMAR (HPSG), and ask how they should be represented in the grammar. We compare the WG approach developed in Hudson (2003; see also Hudson 1995, 1999) with two relatively recent versions of HPSG: what can be called CONSTRUCTIONAL HPSG, in which grammars include hierarchies of phrase types (Sag 1997; Ginzburg and Sag 2000), and so-called LINEARIZATION-BASED HPSG (or LINEARIZATION HPSG), in which linear order is independent to a considerable extent from phrase structure and is analysed in terms of a separate level of 'ORDER DOMAINS' (Pollard et al. 1994; Reape 1994; Kathol 2000, etc.).1 It will be argued that the WG and Constructional HPSG approaches have some problems, but that Linearization HPSG can provide a straightforward account of the facts. The organization of this chapter is as follows. In the next section we consider how a WG approach might accommodate the asymmetry between main and subordinate wh-interrogatives. Section 3 then looks at a Constructional HPSG analysis along the lines of Ginzburg and Sag (2000). In section 4 we shall outline a Linearization HPSG analysis developed by Chung and Kim (2003). In the final section, we offer some concluding remarks.
2. A Word Grammar Approach
Before looking at the WG analysis of the phenomenon under discussion, we should briefly outline how word order, wh-constructions and extractions are treated in WG. In WG word order is controlled by two kinds of rule: general rules that control the geometry of dependencies, and word-order rules that control the order of a word in relation to other word(s): its LANDMARK(S) (Hudson 2005). In simple cases a word's landmark is its PARENT: the word it depends on. In cases where the word has more than one parent, only the 'higher' parent becomes its landmark (the PROMOTION PRINCIPLE; see Hudson 2003a). For example, let us consider the sentence It was raining. The raised subject it depends on two verbs, was and raining, so it has two parents. In this case, only was is eligible as its landmark. This is because raining depends on was, so the latter is the 'higher' of the two. In WG notation, It was raining is represented as shown below. (3)
WG SURFACE STRUCTURES AND HPSG ORDER DOMAINS
The fact that it is the subject of the two verbs is indicated by the two arrows labelled 's' (subject). A further arrow indicates that raining is a 'SHARER' of was, so named since it shares the subject with the parent verb. In the notation adopted here, the dependencies that do not provide landmarks are drawn below the words. Therefore, one of the 's' arrows, the one from raining to it, is drawn below the words. We thus pick out a sub-set of the total dependencies of a sentence and draw them above the words. This sub-set is called SURFACE STRUCTURE. Word-order rules are applied to it, and determine the positioning of a word in relation to its landmark or landmarks. Thus, the surface structure is the set of dependencies which are relevant for determining word order. A word-order rule specifies that a subject normally precedes its landmark, and another rule specifies that a sharer normally follows its landmark, as illustrated by the representation in (3). Among the rules that control the surface structure, THE NO-TANGLING PRINCIPLE is the most important for our purpose: dependency arrows in the surface structure must not tangle.2 This principle excludes the ungrammatical sentence (4b): (4)
a. He lives on green peas.
b. *He lives green on peas.
The dependency structures of this pair are shown in (5):
(5)
(5b) includes tangling of the arrows. Its ungrammaticality is predicted by the No-Tangling Principle. Let us turn to the WG treatment of wh-interrogatives. Consider the dependency structure of What happened?, for example. As in the case of an ordinary subject such as it in (3), the grammatical function of the wh-pronoun what with respect to the verb happened is subject. Therefore, what depends on happened, and this situation can be represented as follows. (6)
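Since the No-Tangling Principle is stated as a ban on crossing arrows in surface structure, it can be checked mechanically. The sketch below is my own illustration rather than WG machinery: words are numbered by linear position, each surface dependency is a (parent, dependent) pair, and the two arc lists reconstruct the word orders in (5).

```python
def tangles(arcs):
    """Return True if any two surface-dependency arcs cross.
    Each arc is a (parent_position, dependent_position) pair."""
    spans = [tuple(sorted(arc)) for arc in arcs]
    for i in range(len(spans)):
        for j in range(i + 1, len(spans)):
            (a, b), (c, d) = spans[i], spans[j]
            # Two arcs tangle iff exactly one endpoint of one arc
            # lies strictly inside the span of the other.
            if a < c < b < d or c < a < d < b:
                return True
    return False

# (5a) 'He lives on green peas': 0-He 1-lives 2-on 3-green 4-peas
good = [(1, 0), (1, 2), (2, 4), (4, 3)]
# (5b) '*He lives green on peas': 0-He 1-lives 2-green 3-on 4-peas,
# with green still depending on peas and on still depending on lives
bad = [(1, 0), (1, 3), (3, 4), (4, 2)]

print(tangles(good))  # False: (5a) is licit
print(tangles(bad))   # True: (5b) is excluded
```

The check encodes the familiar projectivity condition on dependency arcs; in WG terms, only the arcs selected for surface structure are subject to it.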
On the other hand, Hudson (1990: 361-82; 2003) argues that the verb is a complement of the wh-pronoun and thus depends on it: (7)
The evidence for the head status of the wh-pronoun includes the following phenomena (Hudson 2003). First, the pronoun can occur without the verb in sluicing constructions: (8)
a. Pat: I know he's invited a friend. Jo: Oh, who [has he invited]?
b. I know he's invited a friend, but I'm not sure who [he's invited].
Second, the pronoun is what is selected by the higher verb. In (9) wonder and sure require a subordinate interrogative clause as their complement. For a clause to be a subordinate interrogative, the presence of either a wh-pronoun or whether or if is required. (9)
a. I wonder *(who) came.
b. I'm not sure *(what) happened.
Third, the pronoun selects the verb's characteristics, such as finiteness and whether or not it is inverted. (10) illustrates that why selects a finite or non-finite verb as its complement, but when only selects a finite verb: (10)
a. Why/when are you glum?
b. Why/*when be glum?
(11) indicates that why selects an inverted verb as its complement whereas how come selects a non-inverted verb: (11)
a. Why are you so glum?
b. *Why you are so glum?
c. *How come are you so glum?
d. How come you are so glum?
(12) illustrates that what, who and when select a to-infinitive, but why does not: (12)
I'm not so sure what/who/when/*why to visit.
Hudson (2003) argues that all of these phenomena are easily accounted for if the wh-pronoun is a parent of the next verb. In the framework of WG, therefore, there is no reason to rule out either of (6) and (7); the sentence is syntactically ambiguous. Thus, in What happened? what and happened depend on each other, and the dependency structure may be either (13a) or (13b):
(13)
a.
b.
Thus, wh-interrogatives may involve a mutual dependency. In (13b), happened is the parent and the dependency labelled 's' is put in surface structure. In (13a), however, what is the parent, and the dependency labelled 'c' (complement) is put in surface structure. Finally, we outline how extraction is dealt with in WG. Let us consider (14a), with a preposed adjunct in sentence-initial position: (14)
a. Now we need help.
b. We need help now.
The preposed adjunct now would otherwise follow its parent need as in (14b), but it precedes it. This situation is represented in WG by adding an extra dependency 'EXTRACTEE' to now.
(15)
The arrow from need to now is labelled 'x<, >a', which means 'an adjunct which would normally be to the right of its parent (">a") but which in this case is also an extractee ("x<")'. Thus the adjunct now is to the left of the parent verb need. With this background in mind, let us now turn to the asymmetry between main and subordinate clauses in question: adverb-preposing is not possible in subordinate interrogatives although it is possible in main interrogatives. (16)
a. Now what do we need?
b. *He told us now what we need.
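The word-order rules met so far can be pictured as a lookup from dependency label to the dependent's side of its landmark. The inventory below is assembled from the labels this chapter uses ('s', 'c', '>a', 'x<'); treating 'x<' as taking priority in a composite label like 'x<, >a', and placing complements after their landmark, are my readings of the discussion, not stated WG rules.

```python
# Hedged sketch: which side of its landmark a dependent sits on,
# keyed by dependency label. The inventory is illustrative.
ORDER_RULES = {
    "s":  "before",   # subject precedes its landmark
    "c":  "after",    # complement follows its landmark (assumed)
    ">a": "after",    # adjunct normally follows its parent
    "x<": "before",   # extractee precedes its parent
}

def expected_side(labels):
    """For a composite label such as 'x<, >a', let the extractee
    component win over the default placement (an assumption)."""
    if "x<" in labels:
        return "before"
    return ORDER_RULES[labels[0]]

print(expected_side([">a"]))        # 'now' in 'We need help now'
print(expected_side(["x<", ">a"]))  # preposed 'now' in 'Now we need help'
```

On this sketch, the same adjunct surfaces on different sides of its parent depending on whether the extractee relation is added, which is how WG encodes preposing without movement.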
As stated above, a wh-pronoun and its parent are mutually dependent. In (16a)
do is the complement of what whereas what is the extractee of do. Thus, the dependency structure for (16a) would be either (17a) or (17b). In the former, what is the parent, and the dependency labelled 'c' from what to do is put in surface structure. In the latter, however, do is the parent and the dependency labelled 'x<' from do to what is put in surface structure. The preposed adjunct now is labelled 'x<, >a', and precedes its parent do. As the diagram shows, the 'x<, >a' arrow from do to now tangles with the vertical arrow in (17a). Thus, it violates the No-Tangling Principle. On the other hand, there is no tangling in (17b), so it is the only correct WG analysis of (16a).
(17)
a.
b.
Let us turn to the subordinate wh-interrogative in (16b). In (16b) what is the object and the extractee of need while need is the complement of what. It has the structure represented in (18). What is the clause's subordinator, and it has to be the parent of the subordinate clause. The dependency labelled 'c' should be put in surface structure since, if the arrow labelled 'x<, o' were in the surface structure, what would have two parents and violate THE NO-DANGLING PRINCIPLE: words should not have more than one parent in surface structure (Hudson 2005). As the diagram shows, the arrow from need to now is tangled with the one from told to what. Unlike the main clause case in (17), it has no alternative structure, so (16b) is ungrammatical.
(18)
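The reasoning around (18) can be replayed as a search over candidate surface structures: each word takes at most one surface parent (the No-Dangling Principle) and the chosen arcs must not cross (the No-Tangling Principle). The positions and arc lists below are my reconstruction of (16b), not the author's diagram.

```python
def crossing(arcs):
    """No-Tangling: True if any two arcs (as position pairs) cross."""
    spans = [tuple(sorted(arc)) for arc in arcs]
    return any(a < c < b < d or c < a < d < b
               for i, (a, b) in enumerate(spans)
               for (c, d) in spans[i + 1:])

def dangling(arcs):
    """No-Dangling: True if some word has two surface parents."""
    dependents = [dep for _, dep in arcs]
    return len(dependents) != len(set(dependents))

# (16b) '*He told us now what we need':
# 0-He 1-told 2-us 3-now 4-what 5-we 6-need
fixed = [(1, 0), (1, 2), (1, 4),   # told's subject, object, clause
         (6, 3), (6, 5)]           # need's preposed adjunct and subject
# The what/need mutual dependency offers two candidate surface arcs:
candidates = [(4, 6),  # what as parent: the 'c' arc
              (6, 4)]  # need as parent: the 'x<, o' arc
licit = [fixed + [arc] for arc in candidates
         if not dangling(fixed + [arc]) and not crossing(fixed + [arc])]
print(licit)  # []: no licit surface structure, so (16b) is out
```

Either candidate fails: taking need as parent gives what two surface parents, while the 'c' arc leaves the told-to-what arrow tangling with the need-to-now arrow, which is exactly the situation the text describes.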
Thus, WG can capture the linear order asymmetries of main and subordinate clauses in terms of dependencies in surface structure and general principles on dependencies. Although the WG analysis looks successful in accommodating the asymmetry between main and subordinate clauses, there are some weaknesses. As surveyed above, the WG approach states that adjunct preposing is possible out of main wh-interrogatives because the preposed adjunct avoids violation of the No-Tangling Principle due to the fact that it is a co-dependent of the wh-element (Hudson 2003: 636). The argument along these lines would suggest that extraction is allowed as long as it does not violate the No-Tangling Principle. However, there are cases in which extraction out of embedded wh-interrogatives is excluded although it does not violate the No-Tangling Principle. The data come from the SUBJECT-AUXILIARY INVERSION (SAI) structures illustrated by (19).3 (19)
Under no circumstances would I go into the office during the vacation.
In WG, a preposed operator of the SAI clauses, such as under no circumstances in (19), is a kind of extractee (Hudson 2005), so we should expect that it behaves like a preposed adjunct. As expected, SAI operators cannot be extracted out of subordinate wh-interrogatives, as illustrated by the following examples. A WG approach would suggest that this is due to the No-Tangling Principle. (20)
a. b.
* Lees wonders under no circumstances at all why would Robin volunteer. * I wonder only with great difficulty on which table would she put the big rock. (Chung and Kim 2003)
With this in mind, let us consider the main wh-interrogative clause. A WG approach would predict that preposing of an SAI operator is possible out of main wh-interrogatives because it should not involve violation of the No-Tangling Principle, as in the case of adjunct preposing. However, it is actually ungrammatical. (21)
a. b.
* In no way, why would Robin volunteer? * Only with great difficulty on which table would she put the big rock? (Chung and Kim 2003)
Here the preposed SAI operator precedes the wh-element. Note that the situation is completely parallel to the case of adjunct preposing in (16a), which is repeated here. (22)
Now what do we need?
As we have seen, (22) is grammatical because it does not violate the No-Tangling Principle. However, the sentences in (21) are ungrammatical though they do not violate the same principle. This makes the analysis in terms of the No-Tangling Principle less plausible. As we saw at the outset of this section, word order is controlled by two kinds of rule in WG: general rules, such as the No-Tangling Principle, that control the geometry of dependencies; and word order rules that control the order of a word in relation to its landmark or landmarks (Hudson 2005). Someone might suggest that the No-Tangling Principle is simply irrelevant in (21) and that a word-order rule could exclude the ill-formed order. However, there are some problems in this approach. Let us suppose that WG has a rule which excludes the OP(ERATOR) < WH order. It is natural to suggest that the same rule could apply not only to main clauses but also to subordinate clauses. As predicted, subordinate clauses with the same elements in the same order as (21) are ungrammatical. This is actually illustrated by (20) above. Now we should recall that they can also be excluded by the No-Tangling Principle; a preposed operator is extracted from the subordinate wh-interrogative clause. The situation is entirely parallel to (16b). A question arises: for which reason are the sentences in (20) excluded, by the No-Tangling Principle or by a word-order rule? If we took the first option, then the ungrammaticality of (20) would be accounted for by the No-Tangling Principle, whereas that of the corresponding main clauses would be explained by a word-order rule. If we took the second option, then both main and subordinate clauses would be excluded by a word-order rule. It is clear that we cannot take the first option: it forces the word-order rule to refer to main clauses. Note that WG does not have a unit larger than a word, so it does not recognize clauses (Hudson 2005).
It does not, therefore, have a way to distinguish main and subordinate clauses, apart from the assumption that the latter has a subordinator and a parent outside of the clause (Hudson 1990: 375-6). It is, then, impossible for WG rules to refer to any clause. What about the second option, a word-order rule approach, where each ill-formed case is excluded by a rule which bans a particular word order? It could indeed account for the ungrammaticality of the OP < WH order in both main and subordinate clauses. However, this option also has a problem, to which we now turn. Consider the following pair, which shows another asymmetry between main and subordinate clauses: (23)
a. * Why, in no way would Robin volunteer? b. Lees wonders why under no circumstances would Robin volunteer.
A wh-extractee precedes a preposed negative operator in the main clause in (23a), and it is ungrammatical. However, the same order is allowed in the subordinate clause, as in (23b). There is clearly an asymmetry between a main and a subordinate clause. The same sort of asymmetry can be observed in the case of a wh-extractee and a topic extractee as well. In (24), a wh-extractee precedes a topic extractee in main clauses, and the results are ungrammatical:
(24)
a. b.
* To whom, a book like this, would you give? (Koizumi 1995) * For what kind of jobs during the vacation would you go into the office? (Baltin 1982)
On the other hand, the same permutation of a wh-element and a topic is allowed in subordinate clauses, as in (25): (25)
a. b. c.
the man to whom, liberty, we could never grant. ? I wonder to whom this book, Bill should give. (Chung and Kim 2003) I was wondering for which job, during the vacation, I should go into the office.
Here we have yet another asymmetry between a main and a subordinate clause. Our observation in (23)-(25) indicates that word order which is grammatical in subordinate clauses is ungrammatical in main clauses. Note that the No-Tangling Principle cannot exclude the ungrammatical cases since they are all in main clauses. Therefore, the only option we have is to specify word-order rules to exclude the ill-formed cases. Now the same problem as with (21) and (20) arises again. Such word-order rules would have to state that they apply to a main clause but not to a subordinate clause. However, it is impossible for WG rules to refer to a clause since WG does not recognize any unit larger than a word. Thus, we cannot adopt a word-order rule approach, either. We have pointed out that the No-Tangling Principle is not effective enough to accommodate the cases of preposing of an SAI operator, another asymmetry between main and subordinate wh-interrogatives. Recall that the most important assumption for a WG approach is that the wh-pronoun is the parent of the subordinate wh-interrogatives. We should note that this assumption itself is not without problems. Consider the examples in (26a) and (26b); the former is cited by Hudson himself as problematic data for his analysis (Hudson 1990: 365):4 (26) a. Which students have failed is unclear, b. Who shot themselves is unclear.
In the WG treatment of wh-pronouns, which and who are not only the subject of have and shot, respectively, but also the subject of is. A verb should agree in number with its subject, so have/shot and is should both agree with which/who. Which in (26a) should share its plurality with students since the former is a determiner of the latter; who in (26b) should share its plurality with themselves since the former is the antecedent of the latter. This does not explain the morphology of the copular verb in both sentences, which requires a singular subject. The analysis would predict sentences like the following to be grammatical: (27)
a. b.
* Which students have failed are unclear. * Who shot themselves are unclear.
The copular verb is are, not is, agreeing with its plural subject which in (a) and who in (b). These sentences are, however, ungrammatical. Thus, the assumption that the wh-pronoun is the parent of the subordinate interrogatives has a weakness. We should also note that there are some cases where an extractee is allowed to precede the complementizer. The following examples are from Ross (1986): (28)
a. b.
Handsome though Dick is, I'm still going to marry Herman. The more that you eat, the less that you want.
In (28a), the first clause is the subordinate clause, and the adjective handsome, a complement of is, is in front of the complementizer though. In (28b) the more, which is an object of eat and want, is followed by the complementizer that.5 It would be natural to assume the fronted elements in these examples to be extractees in WG's terms; but if so, the dependency arrow from the verb to the extractee would tangle with the vertical arrow to the complementizer, and hence the resulting structure in (29) violates the No-Tangling Principle.6
(29)
It seems, then, that a WG approach to the asymmetry between main and subordinate wh-interrogatives has some problems.
3. An Approach in Constructional HPSG: Ginzburg and Sag 2000 We will now consider how adjunct preposing in main and subordinate wh-interrogatives might be accommodated within the framework of HPSG. In HPSG, lexical and phrasal descriptions are formulated in terms of FEATURE STRUCTURES like (30): (30)
The value of the feature PHONOLOGY (PHON) represents the phonological information of a sign. The value of SYNTAX-SEMANTICS (SYNSEM) is of type synsem, a feature structure containing syntactic and semantic information. The SLASH feature represents information about long-distance dependencies, which we will consider further below. The value of LOCAL (LOC) contains the subset of syntactic and semantic information shared in long-distance dependencies. The syntactic properties of a sign are represented under the path SYNSEM|LOC|CAT(EGORY). The HEAD value contains information standardly shared between a phrase and its head, such as part of speech. The semantic properties of a sign are represented under SYNSEM|LOC|CONT(ENT). The value of ARG-ST (ARGUMENT-STRUCTURE) is a list of synsem objects corresponding to the dependents which a lexical item selects for, including certain types of adverbial phrases (Abeillé and Godard 1997; Bouma et al. 2001; Kim and Sag 2002; van Noord and Bouma 1994; Przepiorkowski 1999a, 1999b). Sag (1997) and Ginzburg and Sag (2000) hypothesize that a rich network of phrase-structure constructions with associated constraints is part of the grammars of natural languages. The hierarchies allow properties that are shared between different phrasal types to be spelled out just once. The portion of the hierarchy which will be relevant to adjunct preposing is represented in (31).
(31)
Phrases are classified along two dimensions: clausality and headedness. The clausality dimension distinguishes various kinds of clauses. Clauses are subject to the constraint that they convey a message. Core clauses are one subtype, which is defined not to be modifiers and to be headed by finite verbal forms or the auxiliary to. The headedness dimension classifies phrases on the basis of their head-dependent properties, i.e. whether they are headed or not, what kind of daughters they have, etc. A general property of headed phrases (hd-ph) is the presence of a head daughter, and this phrasal type is constrained as follows: (32)
Generalized Head Feature Principle (GHFP) hd-ph: [SYNSEM / [1]] → ... H[SYNSEM / [1]] ...
The GENERALIZED HEAD FEATURE PRINCIPLE (GHFP) states that the SYNSEM value of the mother of a headed phrase and that of its head daughter should be identical by default. A subtype of hd-ph, head-filler-phrase (hd-fill-ph), is associated with the following constraint: (33)
hd-fill-ph:
This constraint requires the following properties. First, the head daughter must be a verbal projection. Second, one member of the head daughter's SLASH set is identified with the LOCAL value of the filler daughter. Third, other elements that might be in the head daughter's SLASH must constitute the SLASH value of the mother. Ginzburg and Sag (2000) treat topicalization constructions as a subtype of hd-fill-ph, and posit a type topicalization-clause (top-cl). It is also assumed to be a subtype of core-cl. The type top-cl is subject to the construction-particular constraint which takes the following form: (34)
top-cl:
Topicalized clauses have an independent ([IC +]) finite clause as a head daughter. Consider (35) for example: (35)
a. b.
Problems of this sort, our analysis would never account for. * She subtly suggested [problems of this sort, our analysis would never account for]. (Ginzburg and Sag 2000: 50)
The topicalized sentence in (35a) is an independent clause (i.e. [INDEPENDENT-CLAUSE (IC) +]), hence its head daughter our analysis would never account for has [IC +]. A clause has the [IC -] specification in an embedded environment, and hence the embedded clause in (35b) is [IC -].
Topicalization of such a clause is ruled out by (34). The filler daughter of the topicalized clause is constrained to be [WH {}], the effect of which is to prevent any wh-words from appearing as the filler or as an element contained within the filler. The constraints introduced above are unified to characterize the topicalized clause constructions. Given the above constraints, a sentence with a preposed adjunct will have something like the following structure (Bouma et al. 2001; Kim and Sag 2002): (36)
As noted above, certain types of adverbial phrases are selected by the verbal head and listed in the ARG-ST list, along with true arguments. Thus, adjunct preposing and standard cases of topicalization can be given a unified treatment. The ARG-ST of the verb visit thus contains an adverbial element, whose synsem is specified as a gap-ss. Gap-ss, a subtype of synsem, is specified to have a nonempty value for the feature SLASH. Its LOC value corresponds to its SLASH value, as indicated by the shared value [1]. The ARGUMENT REALIZATION PRINCIPLE ensures that all arguments, except for a gap-ss, are realized on the appropriate valence list (i.e. SUBJ(ECT), COMP(LEMENT)S or SP(ECIFIER)), and hence are selected by a head. Note that in (36) the gap-ss in the ARG-ST list of visit does not appear in the COMPS list. The nonempty SLASH value is incorporated into the verb's SLASH value.7 The verb's SLASH value is projected upwards in a syntactic tree from the head daughter to the mother, due to the GHFP. The termination of this transmission, which is effected by subtypes of the hd-fill-ph constructions, occurs at an appropriate point higher in the tree: a dislocated constituent specified as [LOC [1]] combines with the head that has the property specified in the constraint for hd-fill-ph in (33). Now we can consider how this approach might accommodate the asymmetry between main and subordinate wh-interrogatives. The data observed in the last section can be summarized as (37): (37)
Distribution of SAI operator, wh-element and topic (based on Chung and Kim 2003)

              Main clause         Embedded clause
TOP < WH      ok    (16a)         *     (16b)
WH < TOP      *     (24)          ok    (25)
OP < WH       *     (21)          *     (20)
WH < OP       *     (23a)         ok    (23b)
We will begin with the asymmetry in terms of the interaction of a topic and a wh-element. The relevant data is repeated here for convenience with labels and brackets added: (38)
a. b.
[S1 Now [S2 what do we need]]? * He told us [S1 now [S2 what we need]].
S1 is composed of the topic filler and the clausal head, S2. S2 of the two sentences in (38) is of the type ns-wh-int-cl. What we need to do is to check the compatibility of a clause of the type top-cl with that of the type ns-wh-int-cl, with the latter being the head of the former. We saw above that clauses of the type top-cl are constrained by various constraints; the unification of the constraints is represented as follows:
(39)
Of note here is that the LOC value of the mother [2] is shared with that of the head, due to the GHFP (32). This means that the head daughter, in this case a clause of the type ns-wh-int-cl, should have a finite verb as its head and its IC value is +. According to the hierarchy in (31), a clause of this type is characterized as the unification of core-cl, int-cl, hd-ph, hd-fill-ph, wh-int-cl and ns-wh-int-cl. The following structure is the result of the unification, but is simplified, with the details irrelevant to the discussion omitted: (40)
The shared value between the features IC and INV(ERTED) guarantees that if the clause of this type is inverted ([INV +]) then its IC value is +, that is, it appears in a main clause; if it is uninverted ([INV -]) then it should be in an embedded clause ([IC -]). The S2 of (38a), the head daughter of the whole clause, is inverted, so its INV value is +, and hence its IC value is +. This satisfies the requirement stated in (39) that the head daughter of a topicalization construction is an independent clause. The S2 in (38b) is an instance of ns-wh-int-cl as in the previous case, but it is not inverted (i.e. [INV -]) in this case. The S2 should then be specified as [IC -] due to constraint (40). As we saw above, the head daughter of a topicalization construction should be [IC +]. This is the reason why the embedded interrogative does not allow topicalization. Under Ginzburg and Sag's (2000) analysis, the asymmetry between main and subordinate wh-interrogatives in terms of adjunct preposing is thus due to the conflict between the requirements of the topicalization constructions and the embedded interrogative constructions: the former requires [IC +] while the latter is specified as [IC -]. We will move on to the data problematic for a WG approach. Let us first consider how Ginzburg and Sag's (2000) approach might deal with the asymmetry in terms of the WH < TOP order. The relevant data from (24) is repeated in (41): (41)
a. b.
* [S1 To whom, [S2 a book like this, would you give?]] * [S1 For what kind of jobs [S2 during the vacation would you go into the office?]]
As we observed in (25), however, the same order is acceptable in subordinate clauses. The data is repeated in (42): (42)
a. b. c.
the man [S1 to whom, [S2 liberty, we could never grant]] ? I wonder [S1 to whom [S2 this book, Bill should give.]] I was wondering [S1 for which job, [S2 during the vacation, I should go into the office.]]
160 WORD GRAMMAR: PERSPECTIVES ON LANGUAGE STRUCTURE In (41) and (42), SI is an instance of ns-wh-int-cl, and its head daughter S2 is of the type top-cl, so what we need to do is to check compatibility of top-cl as a head of ns-wh-int-cl. (40) states that the CAT value of ns-wh-int-cl should be shared by that of its head, top-cl in this case. The v and clausal specifications are compatible with those of top-cl. SI in the sentences in (41) has [1C +] since it is main clause. Its clausal head S2, therefore, should also have [1C +], according to (40). This indicates that the feature structure description given for the head daughter does not violate the constraint for top-cl in (39). Thus, their analysis makes a wrong prediction that the sentences in (41) are grammatical. Since subordinate interrogatives cannot appear independently, SI in (42) has the [1C — ] specification, and so does its head daughter top-cl. As we saw above, however, top-cl has [1C +]. Therefore, ungrammaticality of (42) is predicted; so we have the wrong prediction again. We will next turn to the interaction of preposed operator of SAI clauses and a ^^-element. As we observed in (20) and (21), the OP < WH is excluded in both main and subordinate clauses. We also observed in (23) that the WH < OP order is excluded in main clauses, but grammatical in subordinate clauses. The relevant data is repeated here in (43) and (44), with square brackets and labels added for expository purposes: (43)
a. * [S1 In no way, [S2 why would Robin volunteer]]? b. * I wonder [S1 only with great difficulty [S2 on which table would she put the big rock]]. (44) a. * [S1 Why, [S2 in no way would Robin volunteer]]? b. Lees wonders [S1 why [S2 under no circumstances would Robin volunteer]].
It is not clear exactly what sort of constraints preposed operators must satisfy in Ginzburg and Sag's (2000) system, but it is clear that the S2 in (43a, b) and the S1 in (44a, b) are clauses of the type ns-wh-int-cl. Therefore, they should at least satisfy constraint (40). As we saw above, this constraint guarantees that a clause of this type is inverted ([INV +]) if it is in a main clause ([IC +]) and that it is uninverted ([INV -]) if it is in an embedded clause ([IC -]). All the occurrences of ns-wh-int-cl in (43) and (44) are inverted, so they all should be independent ([IC +]), and that means they cannot appear in subordinate clauses. This correctly predicts that (43b) is ungrammatical, but it is problematic for (44b); we have here an example of a clause of the type ns-wh-int-cl which appears in a subordinate clause ([IC -]) but is inverted ([INV +]). Nothing in Ginzburg and Sag's (2000) constraints rules out the (a) examples in (43) and (44). It seems, then, that an approach to the asymmetry between main and subordinate wh-interrogatives within the framework of Ginzburg and Sag (2000) has some problems.
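The IC/INV mechanics reviewed above can be condensed into a toy sketch (my own encoding, not Ginzburg and Sag's AVM calculus): constraint (40) makes a non-subject wh-interrogative clause share its IC and INV values, while (39) demands an [IC +] head daughter for a topicalized clause.

```python
def ns_wh_int_cl_ic(inverted):
    """(40): in ns-wh-int-cl, the IC value is shared with INV."""
    return '+' if inverted else '-'

def top_cl_licensed(head_ic):
    """(39): a top-cl needs an independent ([IC +]) head daughter."""
    return head_ic == '+'

# (38a) "Now what do we need?": the inverted S2 is [IC +], so the
# topicalization construction is licensed.
assert top_cl_licensed(ns_wh_int_cl_ic(inverted=True))

# (38b) "*He told us now what we need": the uninverted S2 is [IC -],
# which clashes with (39); adjunct preposing is correctly excluded.
assert not top_cl_licensed(ns_wh_int_cl_ic(inverted=False))
```

As the discussion shows, these same two constraints make the wrong predictions for (41), (42) and (44b), where the IC value demanded by the context and the one derived from inversion come apart.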
4. A Linearization HPSG Approach An analysis of English left peripheral elements given by Chung and Kim (2003) is based on a version of HPSG known as linearization-based HPSG.
In this framework, word order is determined not at the level of the local tree, but in a separate level of 'order domains', an ordered list of elements that contain at least phonological and categorial information (see, e.g., Pollard et al. 1994; Reape 1994; and Kathol 2000). The list can include elements from several local trees. Order domains are given as the value of the attribute DOM(AIN). At each level of syntactic combination, the order domain of the mother category is computed from the order domains of the daughter constituents. The domain elements of a daughter may be COMPACTED to form a single element in the order domain of the mother or they may just become elements in the mother's order domain. In the latter case the mother has more domain elements than the daughters. For example, let us consider the following representation for the sentence Is the girl coming? (Borsley and Kathol 2000): (45)
The VP is coming has two daughters and its domain contains two elements, one for is and one for coming. The top S node also has two daughters, but its order domain contains three elements. This is because the VP's domain elements have just become elements in the S's order domain, whereas those of the NP are compacted into one single domain element, which ensures the continuity of the NP. Discontinuity is allowed if the domain elements are not compacted: is and coming are discontinuous in the order domain of the S. The notable feature of Chung and Kim's (2003) analysis is that each element of a clausal order domain is uniquely marked for the region that it belongs to (Borsley and Kathol 2000; Kathol 2000, 2002; Penn 1999). The positional assignment is determined by the following constructional constraints:
(46)
a. b. c.
Wh-elements are assigned to position 3 in main clauses, and those in embedded (interrogative and relative) clauses are put in position 1. Topic elements are always assigned to position 2, and operators are always assigned to position 3.8 Thus, left peripheral elements in English have the following distributions: (47)
Distribution of English left peripheral elements (Chung and Kim 2003)

                   Marker field 1   Topic field 2   Focus field 3
Main clause                         TOP             WH/OP
Embedded clause    WH/COMP          TOP             OP
An embedded wh-phrase competes for position 1 with a complementizer. This competition accounts for the fact that these two elements never co-occur in English (cf. Chomsky and Lasnik 1977). They further assume the TOPOLOGICAL LINEAR PRECEDENCE CONSTRAINT, a linear precedence constraint which is imposed on the elements in order domains: (48)
Topological Linear Precedence Constraint 1 < 2 < 3
(48) states that the elements in position 1 should precede those in position 2, which should in turn precede those in position 3. Now let us consider how this approach might accommodate the asymmetry between main and subordinate wh-interrogatives. The summary of the relevant data given in (37) is repeated here in (49): (49)
Distribution of SAI operator, wh-element and topic

              Main clause         Embedded clause
TOP < WH      ok    (16a)         *     (16b)
WH < TOP      *     (24)          ok    (25)
OP < WH       *     (21)          *     (20)
WH < OP       *     (23a)         ok    (23b)
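The positional assignments in (46)-(47) and the Topological Linear Precedence Constraint (48) can be turned into a small checker. The encoding below is my own sketch of Chung and Kim's system, not their formalization; the positions are read off table (47).

```python
# Topological position of each left-peripheral element type, per (47).
POSITION = {
    ('WH',  'main'): 3,       # focus field
    ('WH',  'embedded'): 1,   # marker field
    ('OP',  'main'): 3,
    ('OP',  'embedded'): 3,
    ('TOP', 'main'): 2,
    ('TOP', 'embedded'): 2,
}

def licensed(elements, clause):
    """True if the elements, in surface order, obey 1 < 2 < 3 (48)
    and do not compete for the same topological position."""
    pos = [POSITION[(e, clause)] for e in elements]
    ordered = all(p <= q for p, q in zip(pos, pos[1:]))
    no_clash = len(pos) == len(set(pos))   # one element per field
    return ordered and no_clash

# The checker reproduces table (49):
assert licensed(['TOP', 'WH'], 'main')          # (16a)
assert not licensed(['TOP', 'WH'], 'embedded')  # (16b)
assert not licensed(['WH', 'TOP'], 'main')      # (24)
assert licensed(['WH', 'TOP'], 'embedded')      # (25)
assert not licensed(['OP', 'WH'], 'main')       # (21)
assert not licensed(['OP', 'WH'], 'embedded')   # (20)
assert not licensed(['WH', 'OP'], 'main')       # (23a)
assert licensed(['WH', 'OP'], 'embedded')       # (23b)
```

Note that the main-clause WH/OP cases fail not through ordering but through the position clash, mirroring the competition-for-a-single-position account given below.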
As introduced above, Chung and Kim's approach assumes that a topic is in position 2 and a wh-element is in position 3 in main clauses. This accounts for the grammaticality of the TOP < WH order in main clauses, which leads to the following order domain: (50)
This order domain does not violate the Topological Linear Precedence Constraint in (48), and hence accounts for the grammaticality of (16a), repeated here for convenience: (51)
Now what do we need?
Let us turn to embedded clauses, where the TOP < WH order is ungrammatical. (46c) states that wh-elements are assigned to position 1 in embedded clauses, whereas topic elements are always in position 2, whether embedded or not. Thus, the TOP < WH order in embedded clauses leads to the following order domain. (52)
(52) violates the Topological Linear Precedence Constraint since its DOM element marked 2 precedes that marked 1. This explains the ungrammaticality of (16b). (53)
*He told us now what we need.
Thus, Chung and Kim's (2003) approach can accommodate the asymmetry between main and embedded clauses with respect to a topic and a wh-element. The fact that WH < TOP is excluded from main clauses is accounted for along the same lines. This linear order leads to the following order domain: (54)
Here, the element marked 3 precedes that marked 2, violating (48); this accounts for the ungrammaticality of the sentences in (24). (55)
a. b.
* To whom, a book like this, would you give? * For what kind of jobs during the vacation would you go into the office?
For embedded clauses, on the other hand, (46a) and (46c) require that a topic should be in position 2 and a wh-phrase in position 1, respectively. The resulting order is (56): (56)
This conforms to constraint (48), which correctly predicts the grammaticality of the WH < TOP order in embedded clauses, illustrated by (25), which is repeated below: (57)
a. b. c.
the man to whom, liberty, we could never grant ? I wonder to whom this book, Bill should give. I was wondering for which job, during the vacation, I should go into the office.
Constraint (46b) states that wh-elements and operators are both assigned to position 3 in main clauses. This accounts for the ungrammaticality of the WH < OP and the OP < WH orders in main clauses: the competition for a single position between these two elements entails that they cannot co-occur. (58) a. * In no way, why would Robin volunteer? b. * Only with great difficulty on which table would she put the big rock? (59) * Why, in no way would Robin volunteer? A wh-phrase is assigned to position 1 in embedded clauses while operators are assigned to position 3, embedded or not. This accounts for the grammaticality of the WH < OP order since its order domain has the 1 < 3 linear order. (60)
Lees wonders why under no circumstances would Robin volunteer.
The OP < WH order is correctly excluded since it entails 3 < 1, which violates (48). (61)
a. b.
* Lees wonders under no circumstances at all why would Robin volunteer. * I wonder only with great difficulty on which table would she put the big rock.
Thus, a linearization-based HPSG approach by Chung and Kim (2003) can provide an account for all the relevant data, including the data problematic for an approach in Word Grammar and for the framework of Ginzburg and Sag (2000). Another advantage of Chung and Kim's (2003) approach is that it can also predict the grammaticality facts with respect to TOP < OP and OP < TOP. The positional assignment represented in (47) predicts that a topic precedes an operator in both main and embedded clauses, and it also predicts, with the Topological Linear Precedence Constraint (48), the ungrammaticality of the OP < TOP order in both types of clauses. This is borne out by the following examples, which illustrate that TOP < OP is no problem but OP < TOP is ungrammatical in main clauses (62) and in embedded clauses (63): (62) a. To John, nothing would we give. b. * Nothing, to John, would we give. (63) a. He said that beans, never in his life, had he been able to stand. b. * He said that never in his life, beans, had he been able to stand.
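The order-domain composition underlying this account, with daughters either contributing their domain elements directly or being compacted into a single element, can be illustrated with a toy sketch (my own encoding; cf. the discussion of (45) above):

```python
def compact(domain):
    """Collapse a daughter's order domain into a single domain element,
    which forces that daughter to be phonologically continuous."""
    return [' '.join(domain)]

def combine(*contributions):
    """A mother's order domain concatenates its daughters' contributions
    (linearized by hand here; real LP constraints do the ordering)."""
    return [element for dom in contributions for element in dom]

# "Is the girl coming?" (45): the NP is compacted into one element,
# while the VP's two elements stay separate and end up discontinuous.
np = compact(['the', 'girl'])            # ['the girl']
is_, coming = 'is', 'coming'             # the VP's two domain elements
s_domain = combine([is_], np, [coming])
assert s_domain == ['is', 'the girl', 'coming']
assert len(s_domain) == 3                # three elements, as in (45)
```

The topological position marking of (46)-(47) would then be stated over such domain elements rather than over the local trees that produced them.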
5. Concluding Remarks
In this chapter, we have looked at three different approaches to the asymmetries between main and embedded clauses with respect to the elements in the left periphery of a clause. We compared the dependency-based approach developed within WG (Hudson 2003) with the Constructional HPSG approach along the lines of Ginzburg and Sag (2000) and the Linearization HPSG analysis by Chung and Kim (2003), and argued that the approaches within WG and Constructional HPSG have some problems in dealing with the relevant facts, but that Linearization HPSG provides a straightforward account of them. As we discussed at the outset of this chapter, dependency structure is simpler than phrase structure in that the former only includes information on the relationship between individual words, whereas the latter involves additional information about constituency. Other things being equal, simpler representations are preferable to more complex ones. This might lead to the conclusion that WG is potentially superior to HPSG. We have shown, however, that neither the dependency-based analysis in WG nor the constituency-based analysis in Constructional HPSG is satisfactory in accounting for the linear order facts. These two frameworks follow the traditional distinction between the rules for word order and the rules defining the combinations of elements.9 We should note, however, that the rules for word order are applied to local trees in Constructional HPSG and to dependency arrows in WG. Sisters must be adjacent in Constructional HPSG, whereas in WG a parent and its dependent can only be separated by elements that directly or indirectly depend on one of them. This means that the linear order is still closely tied to the combinatorial structure. That these frameworks cannot accommodate certain linear order facts suggests that neither dependency structure nor phrase structure is appropriate as the locus of linear representation.
We saw above that the linearization HPSG analysis gives a satisfactory account of the linear order of elements in the left periphery. This conclusion suggests that we need to separate linear order from combinatorial mechanisms more radically than the traditional separation of the rules does.

References

Abeillé, Anne and Godard, Danièle (1997), 'The syntax of French negative adverbs', in Danielle Forget, Paul Hirschbühler, France Martineau and Maria L. Rivero (eds), Negation and Polarity: Syntax and Semantics. Amsterdam: John Benjamins, pp. 1-17.
Baltin, Mark (1982), 'A landing site for movement rules'. Linguistic Inquiry, 13, 1-38.
Borsley, Robert D. (2004), 'An approach to English comparative correlatives', in Stefan Müller (ed.), Proceedings of the HPSG04 Conference. Stanford: CSLI Publications, pp. 70-92.
Borsley, Robert D. and Kathol, Andreas (2000), 'Breton as a V2 language'. Linguistics, 38, 665-710.
Borsley, Robert D. and Przepiorkowski, Adam (eds) (1999), Slavic in Head-Driven Phrase Structure Grammar. Stanford: CSLI Publications.
Bouma, Gosse, Malouf, Rob and Sag, Ivan A. (2001), 'Satisfying constraints on extraction and adjunction'. Natural Language and Linguistic Theory, 19, 1-65. Chomsky, Noam and Lasnik, Howard (1977), 'Filters and control'. Linguistic Inquiry, 8, 425-504. Chung, Chan and Kim, Jong-Bok (2003), 'Capturing word order asymmetries in English left-peripheral constructions: A domain-based approach', in Stefan Miiller (ed. ), Proceedings of the 10th International Conference on Head-Driven Phrase Structure Grammar. Stanford: CSLI Publications, pp. 68-87. Ginzburg, Jonathan and Sag, Ivan A. (2000), Interrogative Investigations. Stanford: CSLI Publications. Haegeman, liliane (2000), 'Inversion, non-adjacent inversion and adjuncts in CP'. Transaction of the Philological Society, 98, 121-60. Hudson, Richard A. (1990), English Word Grammar. Oxford: Blackwell. — (1995), 'HPSG without PS?'. Available: www. phon. ucl. ac. uk/home/dick/unpub. htm. (Accessed: 21 April 2005). — (1999), 'Discontinuity'. Available: www. phon. ucl. ac. uk/home/dick/disconthtm. (Accessed: 21 April 2005). - (2003), 'Trouble on the left periphery'. Lingua, 113, 607-42. — (2005, Feburuary 17-last update), 'An Encyclopedia of English Grammar and Word Grammar', (Word Grammar). Available: www. phon. ucl. ac. uk/home/dick/wg. htm. (Accessed: 21 April 2005). Kathol, Andreas (2000), Linear Syntax. Oxford: Oxford University Press. — (2002), 'Linearization-based approach to inversion and verb-second phenomena in English', in Proceedings of the 2002 LSK International Summer Conference Volume II: Workshops on Complex Predicates, Inversion, and 0 T Phonology, pp. 223-34. Kim, Jong-Bok and Sag, Ivan A. (2002), 'Negation without head-movement'. Natural Language and Linguistic Theory, 20, 339-412. Koizumi, Masatoshi (1995), 'Phrase Structure in Minimalist Syntax'. 
(Unpublished doctoral dissertation, MIT).
van Noord, Gertjan and Bouma, Gosse (1994), 'Adjuncts and the processing of lexical rules', in Fifteenth International Conference on Computational Linguistics (COLING '94), pp. 250-6.
Penn, Gerald (1999), 'Linearization and WH-extraction in HPSG', in R. D. Borsley and A. Przepiorkowski (eds), Slavic in Head-Driven Phrase Structure Grammar. Stanford: CSLI Publications, pp. 149-82.
Pollard, Carl, Kasper, Robert and Levine, Robert (1994), Studies in Constituent Ordering: Towards a Theory of Linearization in Head-Driven Phrase Structure Grammar. Research Proposal to the National Science Foundation, Ohio State University.
Pollard, Carl and Sag, Ivan A. (1994), Head-Driven Phrase Structure Grammar. Chicago: University of Chicago Press.
Przepiorkowski, Adam (1999a), 'On complements and adjuncts in Polish', in R. D. Borsley and A. Przepiorkowski (eds), Slavic in Head-Driven Phrase Structure Grammar. Stanford: CSLI Publications, pp. 183-210.
Przepiorkowski, Adam (1999b), 'On case assignment and "adjuncts as complements"', in Gert Webelhuth, Jean-Pierre Koenig and Andreas Kathol (eds), Lexical and Constructional Aspects of Linguistic Explanation. Stanford: CSLI Publications, pp. 231-45.
Reape, Michael (1994), 'Domain union and word order variation in German', in John Nerbonne, Klaus Netter and Carl J. Pollard (eds), German in Head-Driven Phrase Structure Grammar. Stanford: CSLI Publications, pp. 151-98.
Rizzi, Luigi (1997), 'On the fine structure of the left periphery', in Liliane Haegeman
WG SURFACE STRUCTURES AND HPSG ORDER DOMAINS
(ed.), Elements of Grammar. Dordrecht: Kluwer Academic Publishers, pp. 281-337.
Ross, John R. (1986), Infinite Syntax! New Jersey: Ablex Publishing Corporation.
Sag, Ivan A. (1997), 'English relative clause constructions'. Journal of Linguistics, 33, 431-84.
Webelhuth, Gert, Koenig, Jean-Pierre and Kathol, Andreas (eds) (1999), Lexical and Constructional Aspects of Linguistic Explanation. Stanford: CSLI Publications.
Notes
* I would like to thank Bob Borsley and Kensei Sugayama for their helpful comments. Any errors are those of the author.
1 For comparison of WG with an earlier version of HPSG (Pollard and Sag 1994), see Hudson (1995).
2 In the current version of WG (Hudson 2005), the No-Tangling Principle has been replaced with ORDER CONCORD, whose effects are essentially the same as its predecessor's. In this chapter we will refer to the No-Tangling Principle.
3 The examples in the rest of this section are cited from Haegeman (2000) unless otherwise indicated.
4 (26b) was provided for me by Bob Borsley (p.c.).
5 (28b) is not acceptable to some speakers (Borsley 2004).
6 The data in (28) could be accommodated in WG if we assumed a dependency relation between the complementizer and the extractee (Borsley (p.c.); Sugayama (p.c.)). Needless to say, however, an argument along these lines would need to clarify the nature of this apparently ad hoc grammatical relation.
7 This amalgamation of the SLASH values is due to the SLASH-Amalgamation Constraint (Ginzburg and Sag 2000: 169):
(i) [the constraint is given as an AVM diagram in the original, not reproduced here]
8 See Kathol (2002) for an alternative analysis of English clausal domains.
9 In constituency-based grammars such as HPSG, these two rule-types are LINEAR PRECEDENCE RULES and IMMEDIATE DOMINANCE RULES.
Part II Towards a Better Word Grammar
8. Structural and Distributional Heads
ANDREW ROSTA
Abstract Heads of phrases are standardly diagnosed by both structural and distributional criteria. This chapter argues that these criteria often conflict and that the notion 'head of a phrase' is in fact a conflation of two wholly distinct notions, 'structural head' (SH) and 'distributional head' (DH). The SH is the root of the phrase and is diagnosed by structural criteria (mainly word order and ellipsis). Additionally, the distribution of the phrase may be conditioned by one or more words in the phrase: these are DHs. The SH is often a DH, but there are many English constructions in which a DH is not the SH and is instead a word subordinated within the phrase. The chapter discusses a variety of these constructions, including: that-clauses; pied-piping; degree words; attributive adjectives; determiners; just, only, even; not, almost, never, all but; the type-of construction; coordination; correlatives; adjuncts; subjects; empty categories.
1. Introduction
The central contention of this chapter is that a number of constructions in English oblige us to recognize that the distribution of a phrase may be determined by a word subordinated within the phrase, rather than, as commonly taken for granted, by the structural head of the phrase - i.e. the highest lexical node in the phrase. By a phrase's 'distribution' is meant the range of environments - positions - in which it can occur. An example of such a construction is pied-piping (discussed in section 8), as in (1a). The root of the phrase in the midst of which throng of admirers is in, but it is by virtue of containing which that it occupies its position before the inverted auxiliary, for (1a) alternates with (1b), but not with (2a-2b).
(1) a. In the midst of which throng of admirers was she finally located?
    b. Which throng of admirers was she finally located in the midst of?
(2) a. *In the midst of this throng of admirers was she finally located.
    b. *This throng of admirers was she finally located in the midst of.
The head of a phrase is normally understood to be defined, and hence diagnosed, by both structural and distributional criteria. But the notion 'head of a phrase' is in fact a conflation of two wholly distinct notions: the distributional, or 'external', head, and the structural, or 'internal', head. These two types of head are
explained in sections 2-3. Although I use the term 'phrase' in a mostly theory-neutral way, it is important to realize that it doesn't entail the Phrase Structure Grammar notion that phrases are nonlexical nodes. In Word Grammar (WG), which is the grammatical model that serves as a framework for the discussion of grammatical analysis in this chapter, all nodes are lexical, and WG defines a phrase as a word plus all the words that are subordinate to it.1 (A word's 'subordinates' are its 'descendants' in the syntactic tree, the nodes below it; its 'superordinates' are its 'ancestors', the nodes above it.) The words in a sentence comprise all the nodes of a tree, and every subtree of the sentence tree is a phrase.

2. Structural Heads
A phrase's structural head (henceforth 'SH') is, as stated above, to be defined as the highest lexical node in the phrase. In a model such as Word Grammar, in which all nodes are lexical, the SH is, therefore, the root of the phrase's tree structure. For determining which word is the root of a phrase, the principal diagnostic is word order. Take the phrase eat chocolate: if chocolate is the root then there cannot be a dependency between eat and a word that follows chocolate; and if eat is the root then there cannot be a dependency between chocolate and a word that precedes eat. The test shows that eat is the root of eat chocolate:
(3) a. *Do Belgian eat chocolate. ['Do eat Belgian chocolate.']
    b. *Do your eat chocolate. ['Do eat your chocolate.']
(4) a. Eat chocolate today.
    b. Don't eat chocolate.
These restrictions follow from the general (and probably inviolable) principle of grammar that requires phrases to be continuous: no parts of a phrase can be separated from one another by an element that is not itself contained within the phrase. The principle is discussed further in section 15. Diagrammatically, the principle can conveniently be captured as a prohibition against a branch in the syntactic tree structure crossing another, as in (5).
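The continuity requirement just described lends itself to a mechanical check. The following sketch is purely illustrative and not part of the WG formalism; the function name and the arc encoding are invented here. Two dependency arcs tangle when exactly one endpoint of one arc falls strictly between the endpoints of the other.

```python
def crossing(arcs):
    """Return the pairs of dependency arcs that tangle (cross).

    Each arc is a (head_index, dependent_index) pair of word positions.
    Two arcs cross when exactly one endpoint of one arc lies strictly
    between the endpoints of the other -- the configuration that the
    continuity principle (no crossing branches) forbids.
    """
    crossed = []
    for i, (a, b) in enumerate(arcs):
        lo1, hi1 = min(a, b), max(a, b)
        for c, d in arcs[i + 1:]:
            lo2, hi2 = min(c, d), max(c, d)
            # exactly one endpoint of the second arc falls inside the first
            if (lo1 < lo2 < hi1) != (lo1 < hi2 < hi1):
                crossed.append(((a, b), (c, d)))
    return crossed

# '*Do Belgian eat chocolate' (words indexed 0-3): the arcs
# Do -> eat (0, 2) and chocolate -> Belgian (3, 1) tangle,
# whereas a simple left-to-right chain of dependencies does not.
tangled = crossing([(0, 2), (3, 1)])
clean = crossing([(0, 1), (1, 2), (2, 3)])
```

On this encoding, `tangled` contains one offending pair while `clean` is empty, mirroring the contrast the principle draws; note that nested arcs such as (0, 3) and (1, 2) correctly do not count as crossing.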
(5) *Do Belgian eat chocolate. ['Do eat Belgian chocolate.']

3. Distributional Heads
The distribution of a word (or phrase) is the range of grammatical environments it can occur in. In the broadest sense, this includes a word's co-occurrence with both its dependents, e.g. the fact that eat can occur with an object (eat chocolate), and its regent, e.g. the fact that eat can be complement of an auxiliary (will eat). (The term 'regent' is used in this chapter as the converse of 'dependent'.) But in the narrower and more usual sense employed here, a word's distribution essentially concerns what it can be a dependent of. 'Distribution' in the latter sense contrasts with 'Valency' (or 'Selection'), which concerns what a word can
be regent of. As a first approximation, we can therefore say that the distribution of X is the product of rules stating that such-and-such a regent permits or requires a dependent belonging to a category to which X belongs. But the topic of this chapter is such that instead of that first approximation we need, at least pretheoretically, to formulate this in terms of the notion 'distributional head': when a word permits or requires a dependent of category X, it permits or requires a dependent that is a phrase whose distributional head is of category X. Models of syntax have generally held that something is a distributional head (henceforth 'DH') if and only if it is an SH - in other words, that a phrase has just one sort of head, and that this single head determines both the structural and the distributional properties of the phrase. But my first aim is to show that the two sorts of head must be distinguished. Normally the two sorts of head coincide, so that one word is both SH and DH of a phrase - i.e. the root of a phrase determines its distribution. This is generally known as 'endocentricity'. But a fair number of constructions in English suggest that that norm cannot be exceptionless. (And, as we will see later, once we acknowledge that the norm has exceptions, there is reason to question whether it is in fact even much of a norm at all.) In these constructions, the SH is not the DH. This is exocentricity. But the constructions involve a very particular kind of exocentricity: in them, the DH is subordinate to the SH. That is, the distribution of the phrase is determined not by the root of the phrase but by a word subordinated more or less deeply within the phrase. I will call this phenomenon 'hypocentricity', since the DH is below the SH in the tree. Although the notion 'structural head', defined as the root of a phrase, has a role in the formal analysis of hypocentricity, the notion 'distributional head' does not, and is purely descriptive.
This is because it turns out that a phrase may have many distributional heads. This can be illustrated as follows. Section 11 argues that in 'determiner phrases' (i.e. 'noun phrases' in the traditional sense), the determiner is SH and the noun is DH. This is illustrated in (6a), where, as in subsequent examples, small capitals indicate the SH and italics the DH. And section 8 argues that in pied-piping in wh-relative clauses, the wh-word is DH, so SH and DH are as indicated in (6b-6c). But in (6c) the locus of the DH also follows the pattern of (6a), giving (6d), where there is one SH, the, and two DHs, news and which.
(6) a. [THE news] had just reached us
    b. [NEWS of which] had just reached us
    c. [THE news of which] had just reached us
    d. [THE news of which] had just reached us
In the formal analysis of hypocentricity introduced in section 4 and presented in full in section 6, a phrase's DHs are defined relationally, relative to the SH. So, although I said in section 1, in framing the discussion of hypocentricity, that the notion 'head of a phrase' is a conflation of two sorts of phrasal head, the structural and the distributional head, it would be more accurate to say that the traditional notion 'head of a phrase' remains valid, but that it is defined by structural criteria, as the phrase root, and, contrary to what is usually thought,
not by distributional criteria. The distribution of a phrase may be conditioned by categorial properties of its (structural) head, but it may equally well be conditioned instead by categorial properties of words more or less deeply subordinated within the phrase. As we will see in section 15 and section 17, once the criteria for identifying the phrasal head are solely structural and not distributional, we are led to transmogrify the familiar WG analyses of the structure of many constructions into radically new but more satisfactory forms. In the following sections I discuss a number of constructions where there is prima facie reason to think that they might be hypocentric. It is beyond the scope of this chapter to agonize over the details of the structure of each construction, so by and large my identification of the SH in each construction will rest more on prima facie plausibility than on detailed argumentation.

4. That-Clauses
In a that-clause, the SH is that, which explains why it must be at the extreme left edge of the clause. But the DH of the that-clause is the finite complement of that. The evidence for this is that the clausal complement of certain verbs, such as require and demand, must be subjunctive.2 So the DH is the subjunctive word; it is the presence of the subjunctive word that satisfies the selectional requirements of require/demand:
(7) a. I require [THAT she be/*is here on time].
    b. I demand [(THAT) she give/*gives an immediate apology].
A satisfactory analysis of this phenomenon is provided in Rosta (1994, 1997) (from whose terminology I deviate in this chapter without further comment). That is defined (in its lexical entry) as 'surrogate' of its complement, the finite word. As a general rule, every word is also surrogate of itself; so the finite word is surrogate of itself. Require/demand select for a complement that is surrogate of a finite word. Since the surrogates of a finite word are itself and that (if it is complement of that), the selectional requirements of require/demand can be satisfied by that or by a finite word. Surrogacy accounts for some hypocentric constructions, but not all. We return to this point in section 6.

5. Extent Operators
I adopt 'extent operator' as an ad hoc term to cover such items as all but, more than, other than, almost, not, never, which do not necessarily form a natural grammatical class but do at least have certain shared properties that warrant their being discussed together here. Reasons will be given why, when an extent operator modifies a predicative word, as in (8a-8f), or a number word, as in (9a-9f), the extent operator appears to be SH and the number or predicative word to be DH.
(8) a. She had [ALL but expired].
    b. My rent has [MORE than doubled].
    c. She was [OTHER than proud of herself].
    d. My rent has [ALMOST doubled].
    e. Her having [NOT had a happy childhood], he was inclined to be patient with her.
    f. Her having [NEVER seen the sea before], this was a real treat.
(9) a. [MORE than thirty] went.
    b. [ALMOST thirty] went.
    c. [BARELY thirty] went.
    d. [OVER/UNDER thirty people] went.
    e. [NOT many] know that.
    f. [NOT two minutes] had elapsed before the bell rang.
The identification of the DH is probably not very controversial, but the justification for it is most apparent in (8a, b, d, e, f), where auxiliary have requires a past participle as its complement, and it is the DH that satisfies this requirement. Note also that as demonstrated by (10a-10b), verbal number inflection is triggered by the number of the DH rather than the SH or the meaning of the whole phrase:
(10) a. [MORE than one] is/*are.
     b. [LESS/FEWER than two] are/*is.
More controversial is the identification of the SH, and the evidence for this will now be presented. First of all, there is the evidence of meaning: the bracketed phrases could all be described as 'semantically exocentric'. For instance, (8b, 8d) don't refer to an event of doubling, and (9a-9d) don't refer to a quantity of 30. Rather, the meanings are roughly thus:
(8b, 8d): 'My rent has increased by a factor that is more/slightly less than 2'
(8c): 'She was in a state that is other than a state of being proud of herself'
(9a-d): 'a set whose cardinality is a number more than/almost/barely/over/under thirty'
(9e): 'a set whose cardinality is a number that is not many'
(9f): 'a set (of minutes) whose cardinality is a number that is not two', or 'a period that is not two minutes'
There is no prior theoretical reason to suppose that the 'semantic head' should in general be the SH rather than the DH; if anything, one would expect the DH to align with the semantic head, given that it is the DH that seems to be the more visible from outside the phrase.
But it remains the case that in these constructions the extent operator is closest to being the semantic head. For example, all but in (8a) might be taken to mean 'an act that stops slightly short of outright expiring', and in (9d), over/under might be taken to mean 'a number - a place in number space - that is over/under thirty', just as under the table means 'the place under the table' in (11a-c).3
(11) a. Under the table is an obvious place to hide.
    b. Let's paint under the table black.
    c. The pupils always cover under the table with chewing gum.
The second kind of evidence for the identification of the SH comes from certain of the extent operators' more core variants that have a nominal complement, notably all but NP, more than NP and over/under NP. It is easy to demonstrate that in these variants, all, more and over/under are each both SH and DH. For instance, in (12a) we have me rather than I and are rather than am because the subject is all rather than me/I.4
(12) a. All but me/*I are/*am to come.
    b. More than me/*I are/*am here.
    c. Over/Under us/*we seems/*seem unsuitable for the storage of radioactive waste.
The third and most telling sort of evidence comes from ellipsis. Ellipsis involves the deletion of the phonological content of some syntactic structure, and it seems to operate rather as if (the phonology of) a branch of the syntactic tree were snipped off. Thus if the phonological content of one node is deleted, then so must be the phonological content of all nodes subordinate to it.5 So, if we have established that a branch links two nodes, X and Y, and X's phonology remains when Y's is deleted, it must follow that Y is subordinate to X. And we find that with certain extent operators, including not, nonstandard never (meaning 'not' rather than 'nowhen') and almost, their phonology can remain when the phonology of the DH is deleted. (Words whose phonology is deleted are shown in subscript in the original; the subscripting is not reproduced here.)
(13) a. %I would prefer that you be so not rude.
    b. %I know you want to do it, but try to not do it.
    c. Would I do it? I wouldn't not do it.
    d. We'll make him not.
    e. I know it's unmanly to flinch, but how can you stand there and not flinch?
    f. You can't go out without knickers - not go out without knickers and still stay decent.
    g. %She never stole your cigarette lighter. ['She didn't']
    h. %Did she do it?
No, but she almost did it.
(13g) is dialectal, and I have also marked (13a, b, h) as subject to variant judgements, because some speakers reject them, but all of (13a-13h) are acceptable for some speakers, and that is what matters here. The conclusion is that the deleted DH is subordinate to the extent operator, which is therefore the SH.6 We have established, then, that the internal structure of these phrases is as shown in (14a-14b). This raises two questions. The first, which is addressed in section 6, concerns the structure of (15a-15b): how can structures (14a-14b) be reconciled with the fact that it is perished that satisfies the selectional requirements of had?
(14) a. all but perished
     b. almost perished
[dependency diagrams in the original show the extent operator as the root]
(15) a. She had all but perished.
    b. She had almost perished.
The second question concerns the structure of (16a-16b). Other things being equal, we would expect (16a-16b) to be ungrammatical due to illicit word order, as diagrammed in (17a-17b), while we would expect (18a-18b), whose word order remedies the apparent illicitness of (17a-17b), to be grammatical.
(16) a. I know she all but perished.
    b. I know she almost perished.
(17) a. I know she all but perished.
b. I know she almost perished.
(18) a. *I know all but she perished. ['She all but perished.']
    b. *I know almost she perished. ['She almost perished.']
It seems, then, that (16a-16b) must involve something along the lines of obligatory 'leftwards-extraposition' of the subject; the subject moves from its ordinary position and ends up as a subordinate of the extent operator, as diagrammed in (19a-19b).7 We return to this matter in section 17.2, where a far more satisfactory solution is provided.
(19) a.
I know she all but perished.
    b. I know she almost perished.
[the diagrams in the original mark the SUBJECT link from the extent operator to she]
6. Surrogates versus Proxies
In section 4 we observed that require/demand requires as its complement the surrogate of a finite word. This requirement is satisfied by the finite word itself, (20a), by clausal that, (20b), and by extent operators, (20c). (The list is not exhaustive.)
(20) a. She demanded he go.
    b. She demanded that he go.
    c. She demanded he almost go.
But as (21a-21c) show, the complement of clausal that can be a finite word, or an extent operator, but not another that. The same pattern holds for complements of extent operators, (22a-22f). (Structurally, almost in (21c) and (22d) occurs in the position where the complement of but and that is expected. For this reason, I conclude that almost (rather than perished) is indeed the complement of but and that. And, as pointed out in section 5, almost is the semantic head in almost perished: it means 'an event near to being an event of perishing'.) Hence it cannot be the case that the selectional requirements of that or of extent operators are such that any surrogate of a finite word will satisfy them.
(21) a. I know that she went.
    b. *I know that that she went.
    c. I know that she almost went.
(22) a. I know that she almost perished.
    b. *I know she almost that perished.
    c. I know that she almost almost perished.
    d. I know that she all but almost perished.
    e. I know that she almost all but perished.
    f. Anybody not not happy should raise their hands now.
(20-22) show that there are two types of hypocentric phrase. In one type, the SH is surrogate of the DH, and the SH can be that, an extent operator, or the DH, the finite word. In the other type, the SH can be an extent operator or the DH, but not that. To capture this pattern, which, as we will see in later sections, generalizes across many diverse constructions, we need to posit a subtype of the Surrogate relation, which I will call 'Proxy'. So, whereas require/demand select for a complement that is surrogate of a subjunctive word, that selects for a complement that is proxy of a finite word. Likewise for extent operators: almost and but (in all but) select for a complement that is proxy of a finite word (or of whatever other sorts of word extent operators can modify). In general, the format for selectional rules will be not (23a), but rather (23b-23c).
Rules of type (23a) seem to be surprisingly scarce: I am currently aware of only one instance, which is discussed in section 16 (in examples (73a-73c)).
(23) a. the complement of X is a word of category Y
    b. the complement of X is proxy of a word of category Y
    c. the complement of X is surrogate of a word of category Y
Given that rules of form (23b-23c) are more cumbersome than the rules of form (23a) that we are accustomed to, we will introduce an abbreviating equivalent for (23b-23c), and say that X 'targets' Y for its complement (but, implicitly, will accept Y's surrogate or proxy in lieu of Y); Y is X's 'complement target'. The key rules defining Surrogate and Proxy are (24a-24c):
(24) a. If X is proxy of Y, then X is surrogate of Y.
    b. X is proxy of X.
    c. If X is surrogate of Y, and Y is surrogate of Z, then X is surrogate of Z.
More specific rules of the grammar define what is surrogate or proxy of what: the grammar defines that as surrogate (but not proxy) of its finite target, and it defines extent operators as proxy of the modified word. Informally, we can distinguish between different degrees of hypocentricity. In its strongest form, phrases [X[Y]] and [Y] are in free variation: i.e. [Z[X[Y]]] is possible if and only if [Z[Y]] is possible. The analysis for this sort of case is that X is proxy of Y, and the complement of Z must be proxy of Y. Extent operators are examples of strong hypocentricity. In a weaker form of hypocentricity, phrases [X[Y]] and [Y] are in free variation only in certain environments, e.g. [Z[Y]] alternates with [Z[X[Y]]], but [W[Y]] does not alternate with *[W[X[Y]]]. The analysis for this sort of case is that X is surrogate of Y, and the complement of Z must be surrogate of Y. That-clauses are an example of this weaker form.
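Read this way, rules (24a-24c) amount to computing a reflexive-transitive closure over declared proxy and surrogate links. The sketch below is only an illustration of that reading; the function, the dictionaries and the mini-lexicon are invented for the example and are not part of the chapter's formalism.

```python
def surrogates(word, proxy_of, surrogate_of):
    """All words of which `word` is a surrogate under rules (24a-24c).

    `proxy_of` and `surrogate_of` map each word to the words it is
    directly declared proxy or surrogate of by lexical rules.
    Rule (24b) makes every word proxy (hence, by 24a, surrogate) of
    itself; rule (24a) folds each proxy link into a surrogate link;
    rule (24c) closes the surrogate relation under transitivity.
    """
    result = {word}        # (24b): reflexivity
    frontier = [word]
    while frontier:
        x = frontier.pop()
        # (24a) and (24c): follow every declared link transitively
        for y in proxy_of.get(x, set()) | surrogate_of.get(x, set()):
            if y not in result:
                result.add(y)
                frontier.append(y)
    return result

# Hypothetical mini-lexicon for 'that she almost go':
# 'that' is surrogate of its complement 'almost', and the extent
# operator 'almost' is proxy of the finite 'go'.
surrogate_of = {'that': {'almost'}}
proxy_of = {'almost': {'go'}}
chain = surrogates('that', proxy_of, surrogate_of)
```

On this toy lexicon, `chain` works out to {'that', 'almost', 'go'}: that counts as surrogate of the finite go, which is what lets the whole phrase satisfy a regent that selects for the surrogate of a finite word.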
But there is potentially a still weaker form of hypocentricity, 'quasihypocentricity', in which [Z[X[Y]]] does not alternate with [Z[Y]] at all, but nonetheless Z and Y are sensitive to each other's presence; for instance, it might be that X is eligible to be complement of Z only if Y is complement of X, or it might be that the presence of Z is a condition for Y's inflectional form, or vice versa. Some (somewhat rarefied) examples of quasihypocentricity are discussed in sections 17.2-3, where the analysis of quasihypocentricity is elaborated on a little.

7. Focusing Subjuncts: just, only, even
Just, only and even, called 'focusing subjuncts' in Quirk et al. (1985), behave rather like extent operators with regard to hypocentricity. The DH of even Sophy in (25a) is Sophy, and it is to the DH that the inflectional morphology is sensitive, as (25b) shows. The DH is complement target of the subjunct, and the subjunct is proxy of its complement target:
(25) a. [EVEN Sophy] would.
    b. Even I/*me am/*is.
The identification of the focusing subjunct as the SH needs some justification. If the subjunct is not SH, then the structure is one of (26a-26b). (Classing focusing subjuncts as adverbials, as Quirk et al. (1985) do, implies (26b).)
(26) a. Even Sophy would.
    b. Even Sophy would.
[the a/b variants differ only in the dependency structure diagrammed in the original]
(26a) fails to account for why the subjunct must be at the extreme edge of the focused phrase. With the subjunct as SH, as in (25a), the ungrammaticality of (27a) is predicted by the crossing branches. But no such prediction is made with structure (26a), as in (27b):
(27) a. *Sophy even's parents would.
    b. Sophy even's parents would.
(26b), with the subjunct as dependent of the verb, incorrectly predicts that (28-29) should be ungrammatical. In (28a-28b) there are crossing branches. In (29a-29b), Edgar does not occupy the position immediately following its regent (gave and will), even though that normally results in ungrammaticality, as with (30a-30b). But the incorrect predictions vanish if the structures are as in (31-32).
(28) a. She stepped in only two puddles.
    b. Pictures only of eyelashes were recommended.
(29) a. She gave even Edgar flowers.
    b. Will even Edgar relent?
(30) a. *She gave today Edgar flowers.
    b. *Will today Edgar relent?
(31) a. She stepped in only two puddles.
    b. Pictures only of eyelashes were recommended.
(32) a. She gave even Edgar flowers.
    b. Will even Edgar relent?
[(31)-(32) repeat the strings of (28)-(29); the original distinguishes them by dependency diagrams]

8. Pied-piping
The roots of the bracketed phrases in (33a-33b) are under and on. But the phrases occupy their position before the inverted auxiliary by virtue of containing a negative element, no, or an interrogative wh-word, which. The root of the bracketed phrase in (33c) is on or should, depending on which is subordinate to which in one's analysis, but the DH is which: the rule for finite wh-relative clauses is that they contain a subject or topic phrase that contains a relative wh-word. In contrast to certain other hypocentric constructions, the semantic head of the phrase appears to be the DH, the wh-word, since semantically it is the relationship of binding or equation that connects the relative clause to the modificand.
(33) a. [UNDER no circumstances] would she consent.
    b. [ON the corner of which streets] should we meet?
    c. streets [on the corner of which we should meet]
As with the inside-out interrogative construction discussed in section 13, the SH in pied-piping is surrogate of the DH, and - in relative clause pied-piping at least - the surrogate relation is long-distance, i.e. there is no upper limit to the number of nodes that can be on the path up the tree from DH to SH.

9. Degree Words
In the phrases bracketed in (34a-34d), the degree words (too, more, as) modify the adjectives that are the distributional head. If the DH were also the SH, we would expect complements of the degree word to appear on the same side of the adjective as the degree word does, as in the ungrammatical (35a-35d), in order to avoid the sort of crossing branches diagrammed in (36).
(34) a. This is [TOO heavy for him to lift].
    b. He is [TOO tough to shed the odd tear].
    c. She is [MORE sophisticated than him].
    d. She is [AS sophisticated as him].
(35) a. *This is too for him to lift heavy.
    b. *He is too to shed the odd tear tough.
    c. *She is more than him sophisticated.
    d. *She is as as him sophisticated.
(36) He is too tough to shed the odd tear. [crossing branches diagrammed in the original]
A standard solution to this problem would have the adjective as SH and the complement of the degree word obligatorily extraposed. But as far as I am aware, this solution is motivated solely by the lack of any mechanism for
handling hypocentricity, and not by independent evidence. By purely structural and not distributional criteria, it is the degree word that is the prime candidate for being SH. Accordingly, we take the modificand to be complement target of the degree word, and the degree word to be surrogate of its complement target.

10. Attributive Adjectives

The same kind of argument that suggests that degree words are structurally superordinate to the adjectives they modify also suggests that attributive adjectives are structurally superordinate to the common nouns they modify. In (37a), to read is complement of easy, but if book is SH then we would expect the word order to be impossible, (37b). The word order favours easy as SH, (37c). Again as with degree words, a construction-specific rule of obligatory extraposition is an alternative solution.
(37) a. an [EASY book to read]
    b. an [EASY book to read]
    c. an [EASY book to read]
[the three variants differ only in the dependency structure diagrammed in the original]
The nonextrapositional analysis copes well when a degree word modifies an attributive adjective, for it correctly predicts that a complement of the degree word must follow the noun modified by the adjective, as in (38):
(38) a more sophisticated person than him
To rule out (39), the extrapositional analysis would have to posit that than him first extraposes to become some sort of postdependent of sophisticated, in line with the extraposition rule that applies to degree words, and then extraposes again to become a postdependent of the noun, in line with the extraposition rule that applies to attributive adjectives.
(39) *a more sophisticated than him person
The suggested analysis is that attributive adjectives are not adjuncts. Rather, they take the proxy of a noun as their complement, and the adjective is proxy of its complement. Further evidence for this analysis comes from ellipsis:
(40) She chose an easy puzzle and he chose a difficult puzzle.

11. Determiner Phrases

It was suggested in section 10 that attributive adjectives target a noun for their complement, the resulting phrase having the adjective as SH and the noun as DH. In this section I present other examples of hypocentric NPs (or DPs).
STRUCTURAL AND DISTRIBUTIONAL HEADS
183
Hudson (2004) notes the contrast (41a-41b). It is the presence of an instance of the lexeme WAY that allows the noun phrase to be an adverbial. Hence way is the DH:
(41) a. She did it this way.
b. *She did it this manner.
Generalizing beyond this example, the DH of the determiner phrase is the noun that is its complement target. Evidence for this comes from extraposition out of subjects, for only dependents of the DH can extrapose out of a subject. For example, (42a) can yield (42b) by extraposition, since the extraposee is a dependent of statement, the DH of the subject phrase. But in (42c), the DH of the subject phrase is author, so extraposition of a dependent of statement is ungrammatical. On the assumption, justified below, that the determiner is SH, (42d) shows that it is not the case that dependents of the actual subject (the SH, those) can extrapose. And the apparent exception presented by the grammaticality of (42e), where a dependent of statement is extraposed but at first glance the DH would seem to be sort, in fact serves to confirm that, as argued below in section 12, sort of phrases are hypocentric, so statement is DH of sort of statement and of the whole subject phrase.
(42) a. A statement that denies the allegation has been released.
b. A statement _ has been released [that denies the allegation].
c. *The author of a statement _ has been arrested [that denies the allegation].
d. *Those _ have been released [statements that deny the allegation].
e. A curious sort of statement _ has been released [that denies the allegation].
But it is the determiner that is SH. The evidence for this comes from word order and ellipsis. The word order evidence is that the determiner always occurs at the extreme left of the phrase, a fact that follows automatically if the noun is subordinate to the determiner but that would be unexplained if the determiner is subordinate to the noun. The ellipsis evidence is that the noun but not the determiner can be deleted.
Thus (43a-43b) are synonymous, whereas (44a-44b) are not, and there is no reason to suppose that any determiner is present in (44b).
(43) a. These scales are not working properly.
b. These scales are not working properly.
(44) a. This milk is off.
b. Milk is off.
Indeed, the ellipsis can perfectly well apply to way, as in (45a), the phonological visibilia of a syntactic structure whose lexical content is given in (45b).
(45) a. You do it your way and I'll do it mine.
b. You do it you's way and I will do it me's way.
There is not much in the way of argument against treating the determiner as SH. Van Langendonck (1994) argues that treating the determiner as head in
this book/that book fails to capture the analogy with the adjective phrases this big/that big. But section 9 has argued that in this big/that big, the SH is this/that, and big is DH, so the analogy is captured.
There are other examples that can be used to make the point made by (41a-41b), but none that are quite so convincing. The verb crane appears to require X's neck as its object, but it's hard to prove that this is not merely the consequence of the verb's meaning, which specifies that the cranee is the neck of the craner. A somewhat more convincing example is (46): wreak for many speakers requires an object whose DH is havoc, and no synonym will suffice in its stead.
(46) The storm will wreak the usual havoc/%devastation.
The same appears to hold for cognate objects:
(47) a. She smiled her usual smile/*grin.
b. She slept a deep and placid sleep/*slumber/*somnolence/*kip.
But the contrast in (48a-48c) shows havoc and cognate objects to be unlike way adverbials, and makes it hard to maintain that the presence of havoc and smile in (48b-48c) is a syntactic requirement.8
(48) a. *She did it something that fell short of a wholly sensible way.
b. The storm will wreak something that falls short of outright havoc.
c. She smiled something that fell short of the sweet smile we had come to expect from her.
The relationship between determiner and noun is analogous to that between clausal that and finite word. Just as multiple that is ungrammatical (See that (*that) she does), so are multiple determiners: the (*the) book, or, more plausibly, *a my book ('a book of mine'). Clausal that takes the proxy of a finite word as its complement, and is surrogate of its complement. Likewise, the determiner takes the proxy of a common noun as its complement, and is surrogate of its complement. This analysis predicts that (49) should be ungrammatical. I am not sure that that prediction is correct, though.
(49) ?She did it the opposite of a sensible way.
12. Sort of Phrases
The bracketed phrase in (50a) is hypocentric. The SH and DH are as shown. The SH is proxy of the DH. The of in this construction takes as its complement the proxy of a common noun. Since determiners are surrogate but not proxy of their complement target, this rules out (50b) (at least as an instance of this hypocentric construction).
(50) a. these [TYPES/KINDS/SORTS/VARIETIES/MANNERS/CLASSES of dog]
b. *these types of a dog
Some evidence for the hypocentricity of this construction has already been given in section 11. Further evidence is as follows.
First, there is the grammaticality of (51a-51b). The adverbial function of the noun phrase is licensed by virtue of having way as its DH:
(51) a. Do it the usual sort of way.
b. Do it the same kind of way you always do.
Second, dog in (50) needn't have the coerced mass interpretation that it gets in There was dog all over the road. Normally, a noun can receive a count interpretation only if it is the complement target of a determiner; bare, determinerless nouns must receive a mass interpretation.9 If types in (50) is proxy of dog, then dog can be complement target of these and hence receive a count interpretation.
It seems that in type of X, type is optionally rather than obligatorily proxy of X. (52a) is ambiguous between a reading equivalent to (52b), with cake receiving a count interpretation, and a reading equivalent to (52c), with cake receiving a mass interpretation. When it receives the count interpretation, cake (and type) is complement target of a determiner (presumably a), and type is proxy of cake/brick. When it receives the mass interpretation, type is not proxy of cake/brick, and the only complement target of a is type.
(52) a. A strange type of cake was on display.
b. A cake of a strange type was on display.
c. Cake of a strange type was on display.
The third and last piece of evidence for the identification of the DH is as follows. (53a) is paraphrasable as (53b), (54a) as (54b), and (56a) as (56b-56c).10 But (55a) is trickier to paraphrase. (55b) is ungrammatical for some reason.11 (55c)/(56b) is a possible paraphrase of (55a), but it is ambiguous, because it also paraphrases (55b). The only unambiguous paraphrase of (55a) is (55d). And in (55d) we find that these agrees in number with cakes but not type. Hence cakes is DH of type of cakes.
(53) a. cake of this type
b. this type of cake
(54) a. cake of these types
b. these types of cake
(55) a. cakes of this type
b. *this type of cakes
c. these types of cakes
d. %these type of cakes
(56) a. cakes of these types
b. these types of cakes
c. these types of cake
13. Inside-out Interrogatives
The italicized phrases in (57a-57f) are instances of what I will call the 'inside-out interrogative' construction.
(57) a. She always chooses nobody can ever guess which item from the menu.
b. It was hidden in the middle of nobody could tell where.
c. She's been going out with I've no idea who.
d. She managed to escape nobody was able to fathom how.
e. She smokes goodness only knows how many cigarettes a day.
f. The drug makes you you can never be sure how virile.
The construction is functionally motivated by the impossibility of relativizing out of an interrogative clause, as in (58), so from a functional if not a structural perspective it ought to be seen as a kind of relative clause.
(58) *in the middle of somewhere_i nobody could tell which place _i was.
By all appearances these phrases have the internal structure of a clause that itself contains an interrogative clause in which sluicing has occurred, as in (59a-59f). This is why the structure is 'inside-out': (57b) means something like if not (59b) then at least 'It was hidden in the middle of a place such that nobody could tell which place it was'.
(59) a. Nobody can ever guess which item from the menu she always chooses.
b. Nobody could tell where it was hidden in the middle of.
c. I've no idea who she's been going out with.
d. Nobody was able to fathom how she managed to escape.
e. Goodness only knows how many cigarettes a day she smokes.
f. You can never be sure how virile the drug makes you.
But inside-out interrogative phrases have a distribution equivalent to that of the interrogative wh-word they contain, as in (60a-60f) - setting aside for a moment the shift to question-meaning.
(60) a. She always chooses which item from the menu?
b. It was hidden in the middle of where?
c. She's been going out with who?
d. She managed to escape how?
e. She smokes how many cigarettes a day?
f. The drug makes you how virile?
To put it another way, this in (61a-61f) can be replaced by an inside-out interrogative to yield (57a-57f).
(61) a. She always chooses this item from the menu.
b. It was hidden in the middle of this place.
c. She's been going out with this person.
d. She managed to escape this way.
e. She smokes this many cigarettes a day.
f. The drug makes you this virile.
The SH of the inside-out interrogative is the root of the clause, i. e. a proxy of a finite word. The essence of the construction is that this proxy of a finite word is licensed to be surrogate of an interrogative wh-word that is complement of (a subordinate of) the finite word. Since the SH is surrogate of the wh-word, the SH is also surrogate of whatever the wh-word is surrogate of. The key surrogacy relations in (57a-57f) are indicated by the dotted arrows in (62a-62f), which assume that where, who and manner (but not degree) how are the phonological expression of what is, syntactically, which place, which body (meaning 'person', as in somebody) and which way. The SH of the bracketed phrases is in small capitals.
(62) a. She always chooses [nobody CAN ever guess which item from the menu].
b. It was hidden in the middle of [nobody COULD tell which place].
c. She's been going out with [I'VE no idea which body].
d. She managed to escape [nobody WAS able to fathom which way].
e. She smokes [goodness only KNOWS how many cigarettes a day].
f. The drug makes you [you CAN never be sure how virile].
Section 11 explains why which is surrogate of item/place/body/way. Section 9 explains why how is surrogate of many and virile. Section 10 explains why many is surrogate of cigarettes (on the assumption that many is some kind of attributive adjective). Because how is surrogate of many, and many is surrogate of cigarettes, how is surrogate of cigarettes. Hence, in (62a, b, c, e), the SH in small capitals is surrogate of a noun, thus satisfying the requirement of chooses, of and with for a complement that is surrogate of a noun. In (62d), the SH is surrogate of way, which makes it eligible to function as a manner adverbial. In (62f), the SH in small capitals is surrogate of an adjective, thus satisfying the requirement of makes for a complement that is surrogate of an adjective.12
14. 'Empty Categories'
WG has so far not embraced the empty categories so beloved of other models, chiefly Transformational Grammar. But there is no fundamental incompatibility between WG and empty categories, if empty categories are taken to be phonologyless words. As I will briefly detail below, empty categories would be a beneficial enhancement to WG, so it is worth considering how they would work in WG. As also explained below, though, they do raise a certain problem, but this problem is solvable by means of the Proxy relation, though not by means of hypocentricity. This is why empty categories warrant a short section in this chapter.
Since all nodes in WG are words, the WG counterpart of empty categories would in WG be a word, an instance of the lexical item '<e>', which is phonologyless and has the semantic property of expressing a variable.13 (Phonologyless words are notated within angle brackets.)
Positing <e> affords both better analyses of the data, and significant simplifications to the overall model. The principal simplification comes if
syntactically bound <e> occurs in positions where, in a transformational model, traces (or other bound empty categories) occur. This would give us the sort of structure shown in (63a), in contrast to the traditional WG analysis shown in (63b). ('Binder' is, needless to say, a syntactic relation between a word and another word that binds it.)
(63) a. What did she say he had been hoping to eat <e>? (what is BINDER of <e>; <e> is OBJECT of eat)
b. What did she say he had been hoping to eat? (what is OBJECT of eat)
Traditional WG makes a distinction between dependencies - dependency tokens, that is, not dependency types - that form branches of the sentence tree, and dependencies that don't. For example the object dependency from eat to what in (63b) doesn't form a branch in the tree. Word order rules apply only to dependencies that form branches. (64b) is ungrammatical because the indirect object (every child in the class) must follow its regent, give. (64c) is ungrammatical because the indirect object must precede the direct object (a gold star). But (64d-64e) are grammatical, even though the indirect object does not follow given in (64d) and does not precede the direct object in (64e), because the indirect object is not a branch dependent of given.
(64) a. The teacher will give every child in the class a gold star.
b. *The teacher will every child in the class give a gold star.
c. *The teacher will give a gold star every child in the class.
d. Every child in the class was given a gold star.
e. Also given a gold star were all the children in the class.
But the <e> analysis allows us to do away with the distinction between branch and nonbranch dependencies: with the sole exception of Binder, all dependencies are branches. The syntactic structure of a sentence is just a tree with labelled branches, supplemented by nonbranch relations of types Binder, Surrogate and Proxy. (Even the branch labels are potentially redundant, given that a branch is distinguished from its siblings by its position.) Thus, the whole apparatus of syntactic structure can be significantly simplified, for the price of merely one extra lexical item among thousands.
On the assumption that unbound <e> is interpreted as 'something/someone', we are then in a position to posit structures for (65a-65d)14 that yield the meaning that the sentences actually have. Furthermore, the presence of <e> in (65c-65d) provides a way to capture the fact that even though there is no overt or deleted object of keep or subject of alive, semantically the object of keep is still understood to be the subject of alive. ((65c) is the structure one would have if unbound <e> is added to otherwise orthodox WG. (65d) is the structure I am proposing.)
(65) a. She was reading <e>.
b. Thou shalt not kill <e>...
c. ... but need'st not strive officiously to keep <e> alive. (SUBJECT)
d. ... but need'st not strive officiously to keep <e> alive. (BINDER)
The main snag with <e> has to do with the phenomenon of connectivity, whereby traces have to have the categorial properties of their binder, i.e. of what they're traces of. An adjective leaves an adjectival trace, a noun leaves a nominal trace, and so forth. This is so that the trace can satisfy the categorial selectional requirements imposed on the position the trace occupies. For example, the subject-raising verb wax requires an adjectival complement. So if <e> is complement of wax, as in (66), it must somehow count as adjectival.
(66) How wroth did she wax <e>? (how is BINDER of <e>)
If we had to introduce invisible words of every conceivable word class, and add rules requiring them to agree in word class with their binder, then this would be very much the opposite of a simplification to the grammar. But the Proxy relation provides a simple solution, if <e> is proxy of its binder. The selectional requirements of wax are that it takes a complement that is surrogate of an adjective, and this requirement is satisfied in (66), since (i) how is binder of <e> and hence <e> is proxy of how; (ii) being a degree word, how is surrogate of wroth; and (iii) <e> is therefore surrogate of wroth.
In all the other cases discussed in this chapter where the Proxy relation is to be found, it occurs in a hypocentric phrase, where the SH is proxy of the DH, which is subordinate to the SH. But this clearly does not apply to the proxy relation holding between <e> and its binder. The conclusion to be drawn from this is that rather than the Surrogate and Proxy relations being merely convenient ways to formalize hypocentricity, they are in fact fundamental, and hypocentricity is merely a convenient label for phrases whose root is surrogate of one of its subordinates.
15. Coordination
Since its beginnings, WG has analyzed coordination as an exception to major and otherwise exceptionless principles. The first exception is that whereas the rest of syntax consists solely of dependencies between lexical nodes, i.e.
words, coordinate structures employ nonterminal, nonlexical nodes, which are linked to other nodes not by dependencies but by part-whole relations. The nonterminal nodes are of types Conjunct and Coordination. For example, in (67), the coordination node, marked by curly brackets, is mother of two conjunct nodes, marked by angle brackets, and of and. The first conjunct node is mother of Sophy and of roses, and the second is mother of Edgar and of tulips. (67)
Give {< [Sophy] [roses] > and < [Edgar] [tulips] >}.
Coordination is thus an exception to the principle of Node Lexicality, which requires all nodes to be lexical. The second exception is that branches in the tree can cross only where there is coordination, as shown in (68):
(68)
He thinks she made her excuses and left.
Latterly, WG has handled this exception by doing away with a No Crossing Branches principle, and replacing it with a principle of 'Precedence Concord', which states that if X is a subordinate of Y, and Y is a subordinate of Z, then the precedence of X relative to Z must be the same as the precedence of Y relative to Z; so if X precedes Z then so must Y, and if X follows Z then so must Y. (67-68) are not exceptions to this principle. But neither is (69), and (69) is ungrammatical, due to its crossing branches. Hence a principle of No Crossing Branches is still required, and (68) is still an exception to it.
(69) *Give students tulips of linguistics. ['Give students of linguistics tulips.']
Third, coordination is an exception to the principle of 'Branch Uniqueness', which requires each of the dependencies between a word and its dependents to be of a different type. Hence a word cannot have more than one subject or more than one object, and so on.15 But in WG's analysis of (68), give has two indirect objects and two direct objects. It is easy enough to reformulate Branch Uniqueness so that it applies only to dependents that aren't conjoined with each other, but that just raises the question of why there should be such an exception.
Other things being equal, WG's model of grammar would be both simpler and more plausible if the principles of Node Lexicality, No Crossing and Branch Uniqueness were exceptionless. This calls for a wholly different analysis of coordination.16 I propose that coordinations are hypocentric phrases whose SH is the conjunction. Each conjunct is DH. The conjuncts are dependents of the conjunction, and the conjunction is proxy of its dependents.
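The interplay between the two principles can be made concrete with a toy check over word positions and head-dependent arcs. This is a purely illustrative encoding of my own, not part of WG's formalism: sentence (69) satisfies Precedence Concord as defined above, yet contains crossing branches, which is why a separate No Crossing principle is still needed.

```python
# Toy encoding (illustrative only): words are integer positions, and each
# dependency is a (head, dependent) pair of positions.
# Sentence (69): "Give students tulips of linguistics", with 'of linguistics'
# depending on 'students'.
arcs_69 = [(0, 1),   # Give -> students (indirect object)
           (0, 2),   # Give -> tulips   (direct object)
           (1, 3),   # students -> of
           (3, 4)]   # of -> linguistics

def crossing_pairs(arcs):
    """Return pairs of arcs whose branches cross (a < c < b < d)."""
    spans = [tuple(sorted(arc)) for arc in arcs]
    crossings = []
    for i in range(len(spans)):
        for j in range(i + 1, len(spans)):
            (a, b), (c, d) = spans[i], spans[j]
            if a < c < b < d or c < a < d < b:
                crossings.append((spans[i], spans[j]))
    return crossings

def precedence_concord_ok(arcs):
    """Precedence Concord: if X is a subordinate of Y and Y a subordinate
    of Z, then X and Y must fall on the same side of Z."""
    # Subordination is the transitive closure of the head -> dependent relation.
    sub = set(arcs)
    changed = True
    while changed:
        changed = False
        for (z, y) in list(sub):
            for (y2, x) in list(sub):
                if y2 == y and (z, x) not in sub:
                    sub.add((z, x))
                    changed = True
    return all((x < z) == (y < z)
               for (z, y) in sub for (y2, x) in sub if y2 == y)

print(precedence_concord_ok(arcs_69))  # True: no Precedence Concord violation
print(crossing_pairs(arcs_69))         # [((0, 2), (1, 3))]: branches cross
```

Precedence Concord passes here because every subordinate follows its superordinate, yet the Give-tulips branch crosses the students-of branch, mirroring the point that (69) is excluded only by No Crossing.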
(70) She ate [apples AND oranges].
At a stroke, the exceptions to Node Lexicality and Branch Uniqueness are eradicated. There are no nonlexical nodes. Branch Uniqueness is preserved, because ate in (70) has only one object, namely and. As for No Crossing, and the analysis of (68), we return to this in section 17.2, which provides an analysis that does not violate No Crossing.
The glaring objection to this analysis of coordination comes from complex coordination, as in (71a), where the conjuncts appear not to be single phrases. (As pointed out in Hudson (1976), the position of the correlative shows that (71a) cannot be derived by deletion from (71b).) But the objection can be turned on its head, and (71a) can be taken as evidence that Sophy tulips is in fact a single phrase. Section 17.4 provides a sketch of how this could be.
(71) a. Give both Sophy tulips and Edgar roses.
b. *Give both Sophy tulips and give Edgar roses.
16. Correlatives
Another instance of hypocentricity in coordination arises with correlatives (both, either, neither). The correlative's position at the extreme edge of the phrase follows if it is SH. The conjunction is complement of the correlative, and the correlative is proxy of the conjunction.
(72) a. She eats [BOTH apples and oranges].
b. She eats [EITHER apples or oranges].
c. She eats [NEITHER apples nor oranges].
Correlatives are one of the very rare instances, mentioned in section 6, of words whose complement is their complement target rather than a proxy of their target. This can be seen from the ungrammaticality of (73a) in contrast to (73b-73c). In (73a), the complement of both is or, which is proxy of each and: this is ungrammatical, because the complement of both must be and.
(73) a. *Find both [[Alice and Bill] or [Carol and Dave]].
b. Find (either) both Alice and Bill or both Carol and Dave.
c. Find Alice and Bill or Carol and Dave.
17. Dependency Types
In WG, dependencies are of different types, such as Subject and Object. In section 14 I suggested that these types could be reduced to labels on branches
in the sentence tree. But in this section I will argue that branches are unlabelled, and that there is no distinction between branches and dependencies; so-called 'dependency types' are in fact lexical items in their own right. Thus, instead of X being subject of Y, there is a word, an instance of the lexical item <PREDICATION>, that has two dependents, X (the subject) and Y (the predicate). These words that take over the job of grammatical relations ('GRs'), I will call 'GR-words'. GR-words belong to a class of function words characterized, in part, by phonologylessness.
This proposal is relevant to this chapter for two reasons. First, the phrases whose root is a GR-word are strongly or weakly hypocentric. And second, many of the other analyses made elsewhere in the chapter converge on the GR-word analysis as a more or less inescapable conclusion.
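As a rough illustration of the data structure this proposal implies (the names and types below are my own, purely hypothetical; WG itself is not defined in these terms), a GR-word is an ordinary lexical node that happens to lack phonology and carries its two dependents in a fixed order:

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Word:
    """A lexical node; GR-words differ only in lacking phonology."""
    lexeme: str
    phonology: Optional[str] = None        # None = phonologyless
    dependents: List["Word"] = field(default_factory=list)

she = Word("SHE", "she")
snoozed = Word("SNOOZE", "snoozed")

# Instead of a 'subject' label on a branch from snoozed to she, a
# <PREDICATION> word takes both as its (ordered) dependents.
pred = Word("<PREDICATION>", None, [she, snoozed])

assert pred.phonology is None               # GR-words are phonologyless
assert pred.dependents[0] is she            # first dependent: the 'subject'
assert pred.dependents[1] is snoozed        # second dependent: the 'predicate'
```

On this encoding the branch label Subject disappears: 'subject of' is just 'first dependent of a <PREDICATION> word', which is what allows grammatical relations to be treated as lexical items.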
17. 1
Adjuncts
Semantically, adjuncts are the converse of complements, in that whereas X is a semantic argument of Y when X is a complement of Y, Y is a semantic argument of X when X is an adjunct of Y. For instance, in She snoozed during the interval, the snoozing is the theme argument of 'during'. A natural logical corollary of the fact that the modificand is an argument of the modifier is that modification is recursive: after one modifier has been added to the modificand, another modifier (of the same type or another type) can always be added. This is, of course, because a predicate's argument 'place' ('attribute') can have only one 'filler' ('value'), but the filler of one argument place can also be the filler of many others. We would therefore predict that recursibility is a default property of adjunction. One cannot rule out, a priori, the possibility of special rules prohibiting adjunct recursion in certain constructions, but it is hard to imagine what systemic or functional motivation there could be for such a prohibition. So the null hypothesis is that all adjuncts are recursible.17
If adjuncts were simply dependents of the word they modify, then the principle of Branch Uniqueness ought to make them irrecursible. I propose instead that Adjunct is not a dependency but rather a GR-word. Rather than X being adjunct of Y, X and Y are dependents of an Adjunction GR-word; X is the modifier dependent and Y is the modificand dependent. Adjunction is a word class; the words it contains are the different kinds of adjuncts, such as <manner-adverbial>, <depictive>, and so forth.
Adjunction phrases are hypocentric: the adjunction (the SH) is proxy of its first dependent, the modificand (the DH). This can be seen from (74), where it is dozed that satisfies the requirement of had for a past participle as its complement target:
(74) She had [ [dozed off] [during the interval]].
The adjunction serves as the locus of constructional meaning. For example, (75a) has the meaning (75b) and the structure (75c), <depictive> being an adjunction. It is the word <depictive> that adds the meaning 'while', i.e. that
the relationship between her going to bed and her being agitated is that the former occurs during the latter.
(75) a. She went to bed agitated.
b. She went to bed while (she was) agitated.
c. She [<depictive> [went to bed] [agitated]].
A further merit of adjunctions is that they explain what Hudson (1990) calls 'semantic phrasing'. For example, (76a-76b) are not synonymous. (76a) says that what happens provocatively is her undressing slowly, while (76b) says that what happens slowly is her undressing provocatively. This nuance of meaning is reflected directly in the structure, (77a-77b).
(76) a. She undressed slowly provocatively.
b. She undressed provocatively slowly.
(77) a. She undressed slowly provocatively.
b. She undressed provocatively slowly.
Noun+noun premodification structures present a conundrum soluble only by means of adjunctions. The conundrum rests on the difficulty of reconciling evidence from word order with evidence from ellipsis. Ellipsis, as in (78), demonstrates that the modifying noun cannot be a dependent of the modified noun, since the modified noun can delete while the modifying noun remains:
(78) On one reading, it receives a count interpretation and on the other reading it receives a mass interpretation.
We might for a moment suppose that the modifying noun is like an attributive adjective, and its complement is a proxy of the modified noun (cf. section 10). In this case, (79a-79b) would have the indicated dependency structure. Their ambiguity would then hinge not on the dependency structure but on the complement targets, shown in (80-81) by dotted arrows pointing to the complement target.
(79) a. old clothes bag
b. work clothes bag
(80) a. old clothes bag ['bag for old clothes']
b. work clothes bag ['bag for work clothes']
(81) a. old clothes bag ['old bag for clothes']
b. work clothes bag ['work bag for clothes', 'clothes bag for work']
But then we find that this analysis falls foul of word order evidence. The structure given to (82a) fails to rule out (82b):
(82) a. revenge kitchen implement attack ['revenge attack with kitchen implement']
b. *kitchen revenge implement attack ['revenge attack with kitchen implement']
But the traditional WG analysis, where the modifying noun is a dependent of the modified noun, makes the right predictions here, as shown in (83a-83b), even though it is incompatible with the ellipsis facts:
(83) a. revenge kitchen implement attack ['revenge attack with kitchen implement']
b. kitchen revenge implement attack ['revenge attack with kitchen implement']
The solution is to be found if this construction involves an adjunction, '<n+n>'. This adjunction allows its second dependent to delete, as in (84). It gives the structures in (85-86). And these structures succeed in excluding (87b) as a No Crossing violation.
(84) On one reading, it receives a count interpretation and on the other reading it receives a [<n+n> [mass] [interpretation]].
(85) a. [<n+n> [old [clothes]] [bag]] ['bag for old clothes']
b. [<n+n> [<n+n> [work] [clothes]] [bag]] ['bag for work clothes']
(86) a. [old [<n+n> [clothes] [bag]]] ['old bag for clothes']
b. [<n+n> [work] [<n+n> [clothes] [bag]]] ['work bag for clothes', 'clothes bag for work']
(87) a. revenge kitchen implement attack ['revenge attack with kitchen implement']
b. *kitchen revenge implement attack ['revenge attack with kitchen implement']
17. 2
Subjects
Conjoined predicates, as in (88a), present a problem. If the structure is as in (88b), then No Crossing is violated. If the structure is as in (88c) or (88d), then some kind of rule of leftwards extraposition of subjects is required:
(88) a. He thinks she made her excuses and left.
b. He thinks she made her excuses and left.
c. He thinks she made her excuses and left.
d. He thinks she made her excuses and left.
As we saw in section 5, exactly the same problem arises with extent operators:
(89) a. He knows she all but perished.
b. He knows she all but perished.
c. He knows she all but perished.
d. He knows she all but perished.
But under the GR-word analysis, the problem evaporates. The GR-word has two dependents,18 the first corresponding to the subject and the second corresponding to the predicate19:
(90) a. He thinks she made her excuses and left.
b. He knows she all but perished.
<Predication> phrases are quasihypocentric in the sense defined in section 6. A phrase [X [<predication> [Y] [Z]]] does not freely alternate with [X [Z]]. In nontechnical and atheoretical terms, predicative phrases of category C do not freely alternate with nonpredicative phrases of category C. But X and Z are nevertheless sensitive to one another's presence, as can be seen from (91a-91b). The complement of auxiliary have must be a <predication> whose second dependent is proxy of a past participle. The complement of wax must be a <predication> whose second dependent is proxy of an adjective.
(91) a. She had [<predication> perished].
b. She waxed [<predication> wroth].
That <predication> is not proxy of its second dependent can be seen from the fact that one <predication> cannot be second dependent of another, i.e. that multiple subjects cannot occur.
(92) a. *She he went.
b. *She he went.
As it stands, the analysis makes it look coincidental that it is only the second ('predicate') dependent of <predication> and not the first ('subject') that has DH-like properties. Therefore the grammar should perhaps formally accord the second dependent in this construction a special status. Let us therefore call <predication> the 'guardian' of its second dependent, the metaphor being that the second dependent is a legal minor and its intercourse with its superordinates must always be mediated by its guardian. And let us add rules ((93a-93b); (93b) replaces (24c)):
(93) a. If X is surrogate of Y, then X is guardian of Y.
b. If X is guardian of Y, and Y is guardian of Z, then X is guardian of Z.
In this case, the complement of auxiliary have must be a <predication> that is guardian of a past participle, and the complement of wax must be a <predication> that is surrogate of an adjective.
17. 3
Topics and finiteness
Topics are phrases, like white chocolate in (94), that have been moved to the position immediately preceding the preverbal subject.
(94) White chocolate, I can't help gorging myself on _.
Like Subject and Predicate, both the topic and the 'comment' phrases can be a coordination.
(95) a. White chocolate, she keeps on giving me and I can't help gorging myself on.
b. Both white chocolate and Cheshire cheese, I can't help gorging myself on.
As with <predication>, these facts motivate a GR-word for the topic-comment structure, its first dependent being the topic and its second the comment. The second dependent of this GR-word is a <predication> that is guardian of a verb or auxiliary.
(96)
[<'topic-comment'> [White chocolate], [<predication> [I] [can't help gorging myself on]]].
Topics occur only in finite clauses. On the unproblematic assumption that the structure of 'it is' is (97), it can also be maintained that all finite clauses contain topics.
(97)
[<'topic-comment'> [it]z [<predication> [z] [is]]]
<Topic-comment> can therefore be equated with finiteness: a finite clause is one that contains <topic-comment>, which we could equally well call <finite>. I leave for future investigation issues about the relationship between <finite> and mood and tense, about verbal inflection, and about whether mood exists as a grammatical category in English.20 At any rate, it is clear that <finite> phrases are at most weakly hypocentric. If Indicativity and Subjunctivity are subtypes of <finite>, then <finite> phrases are not hypocentric at all, since know can select for a surrogate of <indicative>, require for a surrogate of <subjunctive> and insist for a surrogate of <finite>. If, on the other hand, the mood distinctions are located lower down within the phrase, then know/require/insist will select for a surrogate of <finite>, but know and require will further stipulate that their complement must be guardian of wherever the appropriate mood distinction is located.
17.4 Complements
An inescapable corollary of the proposed analysis of coordination is that conjuncts are complete phrases.21 In this section I sketch how this must work, though the sketch is of a solution strategy rather than an analysis worked through in detail.
198
WORD GRAMMAR: PERSPECTIVES ON LANGUAGE STRUCTURE
(98a) shows that - uncontroversially - eat cheese is a complete phrase. But (98b) shows that core is a complete phrase too. So in a verb+object construction, V+O is a complete phrase, but so too is V on its own. How can this be?
(98) a. She will [[eat cheese] and [drink wine]].
b. She will [[core] and [peel]] the apples.
The answer has to be that there is an extra GR-word present, whose function is like that of an X' node, uniting into a single phrase its two separate dependents, V and O. This gives the structures in (99a-99c).
(99) a. She will [[<X'> eat cheese] and [<X'> drink wine]].
b. She will [[core] and [peel]] the apples.
c. She will eat [cheese and bread].
Similarly, (100a) shows that Sophy is a complete phrase in (100d), and (100b) shows that roses is a complete phrase in (100d). But (100c) shows also that Sophy roses is a complete phrase in (100d).
(100) a. She will give Sophy and Edgar roses.
b. She will give Sophy roses and tulips.
c. She will give Sophy roses and Edgar tulips.
d. She will give Sophy roses.
There must therefore be an additional GR-word present - let's call it ''. (100a-100d) must have structures (101a-101d).
(101)a. She will give Sophy and Edgar roses.
b. She will give Sophy roses and tulips. c.
She will give Sophy roses and Edgar tulips.
d. She will give Sophy roses.
STRUCTURAL AND DISTRIBUTIONAL HEADS
199
(102a) shows that roses today is a phrase. Today is an adjunct, but not of roses: it is not the roses that occur today but rather their being given. Today must be an adjunct of a GR-word that marks the second object, as in (102b). (103a) therefore has structure (103b).
(102) a. She will give Sophy roses today and tulips tomorrow.
b. She will give Sophy roses today and tulips tomorrow.
(103) a. She will give Sophy roses.
b. She will give Sophy roses.
With the exception of <transitive>, the GR-words involved in complementation would be guardians rather than surrogates or proxies, since the GR-words are not freely omissible in the way that surrogates and proxies are. As for the kind of hypocentricity, if any, involved with <X'>, I leave this for future investigation.
18. Conclusion
This chapter has demonstrated the existence - and indeed the prevalence - of hypocentricity, the syntactic phenomenon whereby the distribution of a phrase is determined not by the root of the phrase but by a word subordinate to the phrase root. Hypocentricity comes in different 'strengths'. In the strongest form of hypocentricity, a phrase with a given SH is in free variation with a version of the phrase with the SH absent. These are the hypocentric constructions that involve the Proxy relation. The instances discussed in this chapter involve (i) 'extent operators' like almost, not and all but; (ii) 'focusing subjuncts' like even; (iii) attributive adjectives; (iv) the type-of construction; (v) coordinating conjunctions; (vi) correlatives like both; and (vii) adjunctions, which are the invisible words that link adjuncts to their modificands. Apart from coordination, these could all be called 'modifier constructions'. In addition it has been suggested that invisible bound variables are proxy of their binder, even though, exceptionally, the binder would not be a subordinate of its proxy.
In hypocentricity of 'intermediate strength', a phrase with a given SH is in distributional alternation with a version of the phrase with the SH absent, but the variation is limited to certain environments. These are the hypocentric constructions that involve the Surrogate relation. The instances discussed in this chapter involve (i) clausal that; (ii) inside-out interrogative clauses, which behave like clausal determiners; (iii) pied-piping; (iv) degree words; and (v) determiners. In the weakest form of hypocentricity, there is no distributional alternation, but the DH is nevertheless sensitive to material external to the hypocentric
phrase. These are the hypocentric constructions that involve the Guardian relation. The instances discussed in this chapter involve the invisible GR-words <predication>, which is the root of the subject+predicate construction, and <finite>, which is the root of the topic+comment construction, and various other GR-words that form the structural basis of complementation. The relations Proxy and Surrogate are initially motivated as mechanisms that provide an analysis for constructions that cannot otherwise be satisfactorily handled by WG. Once this mechanism is admitted, it opens the way - or the floodgates - for a series of increasingly radical (and increasingly sketchy and programmatic) analyses of coordination and of grammatical relations, which aim to simplify WG by drastically reducing the range of devices from which syntactic structure is constituted, while still remaining consistent with WG's basic tenets. The devices that are done away with are (i) exceptions to the principle of Node Lexicality, i.e. nonlexical phrasal nodes, which orthodox WG uses for coordination; (ii) exceptions to the No Crossing principle barring crossing branches in the sentence tree; (iii) dependencies that are not associated with branches in the sentence tree; and, perhaps, (iv) dependency types tout court. In their most extreme form, these changes result in a syntactic structure consisting of nothing but words linked by unlabelled branches forming a tangle-free tree, supplemented by Binder, Proxy, Surrogate and Guardian relations. While I believe the necessity for Proxy and Surrogate relations is demonstrated fairly securely by the earlier sections of the chapter, their extended application in the analysis of coordination, empty variables and grammatical relations is of a far more speculative nature. But my aim in discussing these analyses in this chapter has been to point out how they are possible within a WG model and why they are potentially desirable.
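Read as derived relations, rules (93a-93b) say that Guardian is the transitive closure of the union of the Guardian and Surrogate links. As a purely illustrative sketch (the pair-set encoding and the function name are mine, not part of Word Grammar), the closure can be computed as follows:

```python
# Informal sketch only: relation names follow rules (93a-93b) of the chapter;
# the set-of-pairs encoding and function name are invented for illustration.

def guardian_closure(surrogate_of, guardian_of):
    """Rules (93a-93b): (93a) every Surrogate link is a Guardian link;
    (93b) Guardian is transitive."""
    guardians = set(guardian_of) | set(surrogate_of)   # (93a)
    changed = True
    while changed:                                     # (93b), to a fixed point
        changed = False
        for (x, y) in list(guardians):
            for (y2, z) in list(guardians):
                if y == y2 and (x, z) not in guardians:
                    guardians.add((x, z))
                    changed = True
    return guardians

# <finite> is guardian of its <predication> dependent, which is in turn
# guardian of the verb; by (93b), <finite> is guardian of the verb too.
links = guardian_closure(set(), {("<finite>", "<predication>"),
                                 ("<predication>", "is")})
assert ("<finite>", "is") in links
```

The fixed-point loop makes the inherited guardianship explicit: a selecting word such as auxiliary have need only check for a Guardian path to the required word class, however many invisible GR-words intervene.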
References
Cormack, Annabel and Breheny, Richard (1994), 'Projections for functional categories'. UCL Working Papers in Linguistics, 6, 35-62.
Hudson, Richard (1976), 'Conjunction reduction, gapping and right node raising'. Language, 52, 535-62.
— (1990), English Word Grammar. Oxford: Blackwell.
— (2004), 'Are determiners heads?'. Functions of Language, 11, 7-42.
Jaworska, Ewa (1986), 'Prepositional phrases as subjects and objects'. Journal of Linguistics, 22, 355-74.
Payne, John (1993), 'The headedness of noun phrases: Slaying the nominal hydra', in Greville G. Corbett, Norman M. Fraser and Scott McGlashan (eds), Heads in Grammatical Theory. Cambridge: Cambridge University Press, pp. 114-39.
Quirk, Randolph, Greenbaum, Sidney, Leech, Geoffrey N. and Svartvik, Jan (1985), A Comprehensive Grammar of the English Language. Harlow: Longman.
Rosta, Andrew (1994), 'Dependency and grammatical relations'. UCL Working Papers in Linguistics, 6, 219-58.
— (1997), 'English Syntax and Word Grammar Theory'. (Unpublished doctoral dissertation, University of London.)
Van Langendonck, Willy (1994), 'Determiners as Heads?'. Cognitive Linguistics, 5, 243-59.
Notes
1 In standard Phrase Structure Grammar, lexical nodes are terminal and nonlexical nodes are nonterminal. If a nonterminal node is defined as one that contains others, then in WG all nodes are terminal. But this terminology is a bit misleading, since in a WG tree structure, terminal nodes are ones that have no subordinates. Hence it is more perspicuous to define WG as maintaining that all nodes are lexical.
2 This problem posed by require was pointed out by Payne (1993); cf. also Cormack and Breheny (1994).
3 See Jaworska (1986) for examples of such nonpredicative prepositions.
4 These judgements are for conservative Standard English. Admittedly there is the famous line 'The boy stood on the burning deck, whence all but he had fled' (Felicia Hemans, 'Casabianca'), but that could be a solecism induced by the register, hypercorrectively, and by the disconcerting unfamiliarity of the word-sequence him had in contrast to he had. It is also true that for many speakers of contemporary English, the rules for the incidence of personal pronouns' subjective forms, especially in less colloquial registers, seem to be pretty much of the make-them-up-as-you-go-along or when-in-any-doubt-use-the-subjective-form sort.
5 It has to be admitted that this claim flies in the face of a certain amount of evidence to the contrary, notably determiner-complement ellipsis, as in (i), and pseudogapping, as in (ii).
(i) Do students of Lit tend to be brighter than those students of Lang?
(ii) She will do her best to bring the food, as will he do his best to bring me wine.
6 (13f) raises its own analytical curiosities, which I won't investigate further here. Logically, the structure is 'not [[go out without knickers] and [still stay decent]]'; that is, the ellipsis is of one conjunct. Egregiously unexpected though such a phenomenon is, I find that the analogous structure in (i) is, also surprisingly, acceptable.
(i) Nobody likes to complain, but she should [[complain] and [be the happier for it]].
7 The arrows below the words represent dependencies that don't form branches in the tree structure. (See section 14 on the eradication of such dependencies.)
8 More generally, I would maintain that open class lexemes are invisible to syntax and hence that selectional rules cannot refer to them. Only word classes are visible to syntax and can be involved in selectional rules. WG has always held that lexemes are word classes, but my contention is that this is true only of closed class lexemes, in that closed class lexemes are word classes that are associated with particular phonological forms, whereas open class lexemes are morphological stems (which is why processes of derivational morphology, which output stems, can output only open class lexemes). Way (and a few other similar words, such as place) would be a subclass of Common Noun.
9 More precisely, the rule is that by default, nouns receive a mass interpretation, but the complement of certain determiners, such as an, must be proxy of a noun that receives a count interpretation.
10 I don't know why (56c) is grammatical. Cake has a count interpretation, so ought to be complement target of a determiner, but if it is complement target of these then it ought to agree in number with these.
11 I suggest that the reason is that common nouns take plural inflection only when complement of a plural determiner. (This supposes that bare plurals are complement of a phonologically invisible plural an.) In (55b) there is no plural determiner present to trigger the plural inflection on cakes.
12 This is an approximation. Make actually requires a complement that is surrogate of a predicative word. In the analysis of predicativity given in section 17.2, <predication> would be complement of make, and the surrogate of virile would be dependent of <predication>.
13 By 'empty category', I mean traces and suchlike, that have a privileged syntactic status, and are empty not only of phonological content but also of ordinary lexical content. As noted in section 5, it is also possible for ordinary words in particular environments to lack phonology; cf. also Creider and Hudson (this volume).
14 (65b-65c) are from Arthur Hugh Clough's 'The Latest Decalogue'.
15 Besides coordination, adjuncts appear to constitute an exception to Branch Uniqueness. A word can have more than one adjunct - indeed, that is part of the definition of the Adjunct relation. But section 17.1 provides an analysis of adjuncts that removes the exception to Branch Uniqueness.
16 A further problem with the WG analysis of coordination is that it cannot easily accommodate the fact that the definitional boundary between coordination and subordination is gradient rather than clearcut, as one would expect were coordination and subordination handled by fundamentally different mechanisms. (See Rosta 1997 for a full demonstration of this point.)
17 This null hypothesis stands up extremely well to the data, but there are some constructions where a dependent is irrecursible but is not subject to selectional restrictions and is not an argument of the modificand, and hence has selectional and semantic properties more typical of adjuncts than of complements. But such dependents are best seen as atypical complements. One example is the result dependent in the resultative construction, e.g. soggy in (i). Another is the indirect object me in (ii).
(i) She sneezed the hankie soggy.
(ii) Fry me some bacon.
Another example is bare relative clauses (BRCs), in those dialects in which BRCs are not recursible.
(iii) %The book [I'd been asking for _] [she finally bought me _] turned out to be crap.
Another respect in which BRCs are unlike adjuncts is that, as (iv) shows, they don't extrapose, though this point should be taken as suggestive rather than conclusive, since it is not clear how consistently extraposability distinguishes adjuncts from complements or other nonadjuncts.
(iv) [That book _] has arrived [*(that) you ordered _].
18 In the absence of evidence to the contrary, GR-words are assumed to precede their dependents, since this is the default order for English. Indeed, for English it may eventually turn out to be an exceptionless principle that dependents follow their regent. But there are plenty of exceptions that are hard to explain away, such as too 'also' ([[She] [too]] went), degree enough, possessive 's ([[Sophy]'s [father]]), and nonfinal dependents of conjunctions.
19 A corollary of this analysis is that the 'inverted subject' in subject-auxiliary inversion is not in fact a subject. Rather, it is an object (of the auxiliary) that has not been raised to subject position.
20 The data to be accounted for is summarized in (i-ix).
(i) though she is/ was/ %be/ *were/ goes/ %go/ went mad
(ii) if she is/ was/ %be/ were/ goes/ %go/ went mad
(iii) insist that she is/ was/ be/ *were/ goes/ go/ went mad
(iv) require that she *is/ *was/ be/ were/ goes/ go/ went mad
(v) know that she is/ was/ *be/ *were/ goes/ *go/ went mad
(vi) She would, *is/ %was/ *?be/ %were/ *goes/ *go/ *went she mad.
(vii) She would, is/ %was/ *?be/ %were/ *goes/ *go/ *went you to.
(viii) I would prefer that you not be/ %be/ *is/ *is.
(ix) She almost *be/ *be/ is/ %is.
21 In determining what sorts of phrase can be coordinated, it is important to factor out the extraneous but distorting effects of Right Node Raising-type operations, which delete the phonology of part of one conjunct. See Rosta (1997).
9 Factoring Out the Subject Dependency
NIKOLAS GISBORNE
Abstract
This chapter offers a revision to the English Word Grammar (EWG) model by factoring out different kinds of dependency. This is because the information encoded in the EWG model of dependencies is not organized at the appropriate level of granularity. It is not enough to say, for example, that by default the referent of a subject is the agent of the event denoted by the verb.
1. Introduction
The English Word Grammar model treats dependencies as asymmetrical syntactic relations (Hudson 1990: 105-8), where the critical information is the asymmetry and the relative ordering of head and dependent. Hudson (1990: 120-1) goes on to treat grammatical relations as a particular subclass of dependency relation, and to identify certain semantic roles as being prototypically linked to certain grammatical relations. For the English Word Grammar model, therefore, dependencies are labelled asymmetrical syntactic relations which are also triples of semantic information, syntactic relation information and word-order information bound together by default inheritance. The theory is syntactically minimalist: all syntactic phenomena are analyzed in terms of dependency relations and the categorization of the words that the dependencies relate to. The result is a highly restrictive model of grammar, where all relationships are strictly local. It differs from other lexicalist frameworks, such as Lexical Functional Grammar (LFG), in that there are not different domains of structure which represent different kinds of information. All grammatically relevant information for WG is read off the lexicon and the dependency information. Within this theory of grammar, Hudson (1990) uses an inventory of dependencies which is pretty much what you find making up the set of grammatical relations in both traditional grammar and classical transformational grammar. In this chapter, I offer a revision to the English Word Grammar model by factoring out different kinds of subject dependency.
This is because the information encoded in the EWG model is not organized at the appropriate level of granularity. It is not enough to say, for example, that by default the referent of a subject is the agent of the event denoted by the verb. This is because there are at
least three kinds of subject in English: subjects triggered by finiteness (on the grounds that English is not a 'pro-drop' language); subjects triggered by predicative complementation; and 'thematic' or 'lexical' subjects such as the subjects of gerunds and other inherently predicating expressions. Subjects triggered by finiteness are not required to be in any kind of semantic relationship with the event denoted by the verb. Similar observations about the nonuniformity of the subject relationship are found in McCloskey (1997, 2001). In this chapter, therefore, I review the inventory of dependencies in Word Grammar, and establish a more fine-grained account of subjecthood than the model of Hudson (1990) envisages. I focus on data introduced in Bresnan (1994). Bresnan (1994) explores locative inversion, shown in (1), and shows that in inverted sentences like (1b), the subject properties are split between the italicized PP and the emboldened NP:
(1)
a. A lamp was in the corner.
b. In the corner was a lamp.1
Bresnan's (1994) account explains the split subject properties in terms of the parallel architecture of LFG, where grammatical information is handled in terms of a-structure, f-structure and c-structure. I show that the revised Word Grammar account can capture the same kind of data as LFG within a more parsimonious ontology. The chapter is organized into five sections. In section 2, I discuss the different dimensions of subjecthood, and explore the different properties that subjects have been claimed to display since Keenan (1976). In section 3, I lay out the data that needs to be discussed (drawn from Bresnan 1994), and explain the problems that this data presents. In section 4, I present the refined view of subjecthood that this chapter argues for, and show how it accounts for the data. The final section, section 5, presents the conclusions and some prospects for future research.
2. Dimensions of Subjecthood
Subject properties have been gathered up in several different places - for example, Keenan (1976), Keenan and Comrie (1977), and Andrews (1985). Subjects have been shown to have diverse properties across languages, and it has been shown that not every subject property is always displayed by all subjects in a given language. It is this observation that drives Falk's (2004: 1) claim that 'a truly explanatory theory of subjecthood has yet to be constructed'. In this section, I itemize and exemplify some of the major features of subjecthood, which are generally held to apply crosslinguistically, and present three diagnostics which apply parochially to English. I have relied on the presentation of these properties in Falk (2004: 2-5), where they are usefully gathered together. Not all of the subject properties laid out here are directly relevant to the analysis of the split-subject phenomena found in locative inversion, but they are relevant to the broader conclusions about subjecthood that this case study takes us to, and which are laid out in section 5.
2.1 Typical subject properties
Subjects are typically the dependent which expresses the agent argument in the active voice. This is shown in (2):
(2)
a. The dog chased the cat.
b. The cat was chased by the dog.
In (2a), the subject is also the agent of the action. In order for the subject not to have to be the agent, passive voice is available, as in (2b). Voice phenomena are devices for re-arranging the arguments of the verb so that the agent no longer has to be presented as the subject. Of course, it is not always the case that subjects are agents, because there are verbs that do not have agentive subjects, as in (3), but many linguists follow Jackendoff (1990) in assuming a hierarchy of semantic roles, where the most agent-like is always the one which links to the subject.
(3)
a. Jimmy weighs 90kg. b. The glass shattered.
So we can use semantic role as a subject-diagnostic. The second diagnostic is that sole arguments of intransitives typically show (other) subject properties. For example, tag questions and subject-auxiliary inversion are diagnostics of subjecthood in English, and the subject of (3b) above and (4a) can have the relevant diagnostic applied to it.
(4)
a. The glass shattered, did it/*he/*she/*they? b. Did the glass shatter?
From this, we can say that the glass in (3b) is shown to be the subject by the diagnostics in (4). The addressee of an imperative is a subject. In the following examples, the addressee has the status of the subject, irrespective of its semantic role.
(5)
a. Go away! b. Be miserable, see if I care!
From the imperative examples, we can see that it is also possible for subjects to be covert.2 One widely noted diagnostic is to do with anaphora. There is a subject-object asymmetry, which becomes evident when subject and object are co-referential. In the case of co-reference, it is the object which is expressed as a (reflexive)3 pronoun. This is shown in (6):
(6)
a. Jane hurt herself. b. * Herself hurt Jane.
There is cross-linguistic variation in this construction. In English, there is a
hierarchy of grammatical functions so that the reflexive pronoun has to be lower in the hierarchy than its antecedent. In some other languages, only subjects may be antecedents of reflexive pronouns. The subject is the only argument which may be shared in a predicative complementation structure (in fact, in both varieties of predicative complementation - raising and control). This is shown by the examples in (7) for 'control' verbs, and (8) for 'raising' verbs. The subject of the xcomp is shared with either the object or the subject of the matrix verb. (7)
a. Jane persuaded the doctor to see Peter.
b. Jane persuaded Peter to be seen by the doctor.
c. The doctor was persuaded to see Peter.4
d. *Jane persuaded Peter the doctor to see.
In (7a), the doctor is shared between persuaded and to see Peter, because to is the xcomp of persuade. The passivization facts in (7b) show us that the doctor in (7a) is the subject of to (and see). The passivization facts in (7c) show us that the doctor is an argument that is shared with persuaded, because it is also the object of persuaded. The ungrammatical (7d) shows that Peter cannot be the object of persuaded and of see at the same time. Therefore, the property of being sharable with a higher predicate is a property of subjects, not other arguments. (8)
a. It seems that Jane likes Peter.
b. Jane seems to like Peter.
c. Peter seems to be liked by Jane.
d. *Peter seems Jane to like _.
The relationship between (8a) and (8b) shows that in (8b), Jane is the subject of both seems and to like. The example in (8c) shows that the passive subject of (to be) liked can also be shared with seems. The ungrammatical (8d) shows that it is not possible to exploit the object of like as the shared subject of seems. Falk (2004) claims that the subject is the only argument which can be shared in coordination. The examples in (9) show that a subject can be shared by two conjoined verbs, but not an object: (9)
a. Jane kissed Peter and hugged Jim.
b. *Jane kissed Peter and Cassandra hugged _.
However, this observation is not quite right. In Right Node Raising, the object of the second conjunct can be shared, as in Jane kissed, and Cassandra hugged, Peter. Right Node Raising needs to be treated as a special construction type because, among other things, it comes with particular intonation - indicated here by the commas - which is not a necessary part of the argument sharing in (9a). But, it is also the case that it is possible to say Cassandra peeled and ate a grape. Here, both the object and the subject are shared by the conjoined verbs. There are two remaining properties of subjects which can be stated very generally. The first is that in many languages the subject is obligatory, as it is in
English (except in the case of imperatives). This observation gives rise to the Projection Principle of Chomsky (1981), and its later incarnation as the Extended Projection Principle (EPP). The second fact is that subjects are usually discourse topics. In the next section, I identify some subject properties that are found parochially in English.
2.2 Parochial subject diagnostics for English
The first is that subject-inversion is found in main-clause interrogatives.
(10)
a. Jane was running. b. Was Jane running? c. Jane ran. d. Did Jane run?
As (10a-10b) show, where there is an auxiliary in the corresponding declarative clause, it inverts with the subject in interrogatives. The examples in (10c-10d) show that where there is no auxiliary in the corresponding declarative clause, one has to be supplied in the interrogative. The next diagnostic for English is that tag-questions have properties that are unique to subjecthood: the pronoun in a tag question has to agree with the person, number and gender 'features' of the noun or noun phrase in the matrix clause it replaces. If we look back at the example in (4a), we can see that the only legal pronoun in the tag question is it, which has features appropriate to the glass. The last diagnostic that I want to look at concerns extraction. There are two main properties in English: the Condition on Extraction Domains shows us that it is easier to extract out of complements than out of adjuncts, which in turn are easier to extract out of than subjects. And the THAT-trace effect shows that - in general terms - English subjects resist extraction. (In other languages subjects are often more extractable than other arguments. Keenan and Comrie (1977) showed that in terms of the THAT-trace effect, English is atypical: they found that cross-linguistically, subjects were more, not less, likely to be extracted.) There are relevant data in (11):
(11)
a. Jane thinks that Peter is a drunk.
b. *Who does Jane think that is a drunk?
c. Who does Jane think is a drunk?
d. What does Jane think that Peter is _?
The example in (11a) gives the basic declarative sentence, (11b) shows that it is impossible to extract a subject after that, even though (11c) shows that it is possible to extract a subject out of a finite complement clause when there is no that, and (11d) shows that it is possible to extract other arguments out of a finite complement clause, like the object.
2.3 Subject-verb agreement
Subject-verb agreement is not universally found as a subject property; however, it is not a parochial property of English either. As a phenomenon, agreement is complex - some languages have agreement that works across a range of dimensions, whereas English only shows agreement in terms of number (Hudson 1999). Although English has subject-verb agreement, which is very common in Semitic, Bantu and Indo-European languages, some other languages have no agreement morphology at all - for example the modern Scandinavian languages and the Sinitic languages. English subject-verb agreement is shown in (12).5
(12)
a. The girl likes the dog.
b. The girls like the dog.
c. *The girl like the dog.
d. *The girl like the dogs.
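Since English agreement is number-only (Hudson 1999), the pattern in (12) reduces to a single feature check. The following is a minimal sketch of that check, with a two-word toy lexicon invented purely for illustration; it is not part of Word Grammar, which states agreement as a relation in the grammar network:

```python
# Toy illustration of the pattern in (12); the mini-lexicon and function
# name are invented for this sketch.

NUMBER = {
    "girl": "sg", "girls": "pl",   # subject nouns
    "likes": "sg", "like": "pl",   # finite verb forms
}

def agrees(subject, verb):
    """English agreement is number-only: the finite verb's number
    must co-vary with the subject's; objects play no role."""
    return NUMBER[subject] == NUMBER[verb]

assert agrees("girl", "likes")      # (12a) The girl likes the dog.
assert agrees("girls", "like")      # (12b) The girls like the dog.
assert not agrees("girl", "like")   # (12c) *The girl like the dog.
# (12d): the object's number (dog vs. dogs) never enters the check.
```

The point of the sketch is that the object simply does not appear as a parameter, which is why a plural object cannot rescue the mismatch in (12d).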
The examples in (12a-12b) show that the number feature of the finite verb covaries with the number of the subject. If the subject is plural, so is the verb: girls triggers like. The example in (12c) shows that a singular subject cannot occur with a plural verb, and the example in (12d) shows that English does not have agreement with objects, so a plural object cannot rescue a plural verb that has a singular subject. The agreement phenomena of English are significant in the discussion of locative inversion that follows. This, then, completes the review of subject properties. A number of these properties were exploited by Bresnan in her (1994) article, which is discussed in the next section, but before we turn to section 3, I shall just summarize the subject properties in three bullet-point lists here:
General subject properties
• subjects are typically the dependent which expresses the Agent argument in the active voice;
• the sole arguments of intransitives typically show (other) subject properties;
• the addressee of an imperative is a subject;
• there is a subject-object asymmetry, such that where subject and object are co-referential, it is the object which is expressed as a (reflexive) pronoun;
• the subject is the only argument of an xcomp which may be shared in a predicative complementation structure (in both raising and control);
• the subject can be the shared argument in coordination;
• the subject is often obligatory;
• subjects are usually the discourse topic.
Parochial subject diagnostics for English
• English main-clause interrogatives show subject-inversion;
• tag questions show agreement between the pronoun tag and the subject;
• English subjects resist extraction. (In other languages, subjects are often more extractable than other arguments.)
Agreement
• subject-verb agreement: subjects agree with their verb.
The issue, at least in as much as the locative inversion data constitute a problem for a story of subjecthood, is to do with which of these subject properties belong together. In the next section, I look at the locative inversion data presented in Bresnan (1994), and then in section 4, I look at these subject properties in the light of Bresnan's findings about the arguments in the locative inversion construction.
3. The Locative Inversion Data
Bresnan (1994) presents an account of locative inversion which carefully details the circumstances within which locative inversion can take place, and which also describes the discourse factors, as well as the grammatical factors, which affect locative inversion. Locative inversion in English is shown in (1) above, and (13) and (14) below.
(13) a. My friend Rose was sitting among the guests.
b. Among the guests was sitting my friend Rose.
(14) a. The tax collector came back to the village.
b. Back to the village came the tax inspector.
As Bresnan (1994: 75) puts it, 'locative inversion involves the preposing of a locative PP and the postposing of the subject NP after the verb. The positions of the locative and subject arguments are inverted without changing the semantic role structure of the verb.' Bresnan (1994: 75-6) sets out the limits of locative inversion, excluding other kinds of inversion around be from the discussion, and limiting the phenomenon to examples like those in (15).

(15) a. Crashing through the woods came a wild boar.
     b. Coiled on the floor lay a one-hundred-and-fifty-foot length of braided nylon climbing rope three-eighths of an inch thick.
The examples in (15) can be included in the set of locative inversion data because the inverted VPs involve a locative PP, and the verbs COME and LIE number among the verbs which support locative inversion. Bresnan goes on to demonstrate that the verbs which allow locative inversion are unaccusative - thus the grammaticality difference between (16a) and (16b) - or passive (but with the BY phrase suppressed), as in (17):

(16) a. Among the guests was sitting my friend Rose.
     b. *Among the guests was knitting my friend Rose.
(17) a. My mother was seated among the guests of honour.
     b. Among the guests of honour was seated my mother.
FACTORING OUT THE SUBJECT DEPENDENCY
From these data, Bresnan concludes that locative inversion 'can occur just in case the subject can be interpreted as the argument of which the location, change of location or direction expressed by the locative argument is predicated' (1994: 80) - to put this another way, the subject must be a 'theme' in the terms of Jackendoff (1990). This is consistent with Bresnan's account of unaccusativity, where it is claimed that unaccusativity is not a syntactic phenomenon, but one where the unaccusative subject's referent is always the theme of the sense of the verb. The final aspect of the grammar of locative inversion is that the locative PP is always an argument of the verb, not an adjunct. The argument/adjunct distinction is hard to draw in the case of locative expressions, and I do not want to get bogged down in the debate, but the evidence that Bresnan (1994: 82-3) brings to bear on the issue is compelling enough. She shows that adjuncts can be preposed before the subjects of questions, although arguments cannot, and she uses the so-anaphora test to show that adjuncts can be excluded from the interpretation of so-anaphora, whereas locative arguments cannot.

To summarize, locative inversion:
• occurs with unaccusative verbs or passivized verbs;
• requires the subject NP's referent to be the theme of the sense of the verb;
• requires the locative PP to be an argument of the verb.

There are other facts that apply in the treatment of locative inversion. These are:
• presentational focus;
• sentential negation;
• other subject properties.
Presentational focus is not strictly syntactic. I shall not return to this. Sentential negation is more important. Bresnan gives the examples in (18). The significance is that in (18a), sentential negation is not possible, whereas in (18b), constituent negation of the postverbal NP is possible:

(18) a. *On the wall never hung a picture of U.S. Grant.
     b. On the wall hangs not a picture of U.S. Grant but one of Jefferson Davis.
Bresnan (1994: 88) quotes Aissen (1975: 9) as saying that this restriction is due to the way in which the locative expression sets a backdrop for a scene. Negating the main clause undermines this discourse function, whereas contrastive negation on the postverbal NP does not have such an effect. Bresnan (1994: 88), on the other hand, contrasting English with Chichewa, argues that sentential negation in Chichewa excludes the subject, so the restriction comes down to a statement about the scope of negation.
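Bresnan's three grammatical conditions on locative inversion - an unaccusative or BY-less passive verb, a theme subject, and a locative argument - amount to a checklist, which can be sketched as follows. This is purely illustrative: the dictionary representation and the function name are my own invention, not part of Word Grammar or LFG.

```python
# A hedged sketch of the three licensing conditions on locative inversion,
# recast as a checklist. The data model is invented for illustration.

def licenses_locative_inversion(verb):
    """verb: dict with keys
         'class'           - 'unaccusative', 'passive', or another verb class
         'by_phrase'       - True if a passive retains its BY phrase
         'subject_role'    - semantic role of the subject's referent
         'locative_status' - 'argument' or 'adjunct'
    """
    verb_ok = (verb["class"] == "unaccusative"
               or (verb["class"] == "passive" and not verb["by_phrase"]))
    return (verb_ok
            and verb["subject_role"] == "theme"         # Jackendoff-style theme
            and verb["locative_status"] == "argument")  # not a mere adjunct

# (16a) "Among the guests was sitting my friend Rose" vs
# (16b) "*Among the guests was knitting my friend Rose"
sit = {"class": "unaccusative", "by_phrase": False,
       "subject_role": "theme", "locative_status": "argument"}
knit = {"class": "unergative", "by_phrase": False,
        "subject_role": "agent", "locative_status": "adjunct"}
assert licenses_locative_inversion(sit)
assert not licenses_locative_inversion(knit)
```

The checklist form makes the conjunctive character of Bresnan's conditions explicit: failing any one of the three blocks inversion.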
3.1 Evidence that the subject properties are split between the locative PP and the postposed NP

We shall see in this section that by a number of different diagnostics for subjecthood, the subject properties are split between the locative PP and the postposed NP that expresses the theme argument.

Agreement
In the case of agreement, we see that the locative PP does not agree with the finite verb.

(19) a. In the swamp was/*were found a child.
     b. In the swamp were/*was found two children.
In English agreement is with the NP Theme.

Control of attributive VPs (participial relatives)
In this case too, we see that the locative PP cannot be the controller (or subject) of an attributive participle. As Bresnan (1994: 95) points out, this constitutes a difference between English and other languages: Chichewa does allow examples like (21b). In borrowing Bresnan's examples, I have also taken her representational system.6

(20) a. On the corner stood a woman [CP who was standing near another woman].
     b. On the corner stood a woman [0 standing near another woman].

Note that the locative PP cannot control the participle in the participial relative.

(21) a. She stood on the corner [CP on which was standing another woman].
     b. *She stood on the corner [0 standing another woman].
Subject-raising
However, English does allow apparent subject-raising of locative PPs as in (22).

(22) a. Over my windowsill seems to have crawled an entire army of ants.
     b. On that hill appears to be located a cathedral.
     c. In these villages are likely to be found the best examples of this cuisine.

Bresnan (1994: 96) observes that only subjects can be raised in English. She compares the two examples in (23) as evidence of this:

(23) a. It seems that John, you dislike.
     b. *John seems you to dislike.
In (23a), John is the focused, and leftward-moved, object of dislike. This movement is entirely acceptable in the context of the finite verb. In (23b), however, we can see that it is not possible for the object to be focused and then raised over seems as its subject. From this, she concludes that any word or
phrase which is the subject of a predicate like seem is also the subject of the xcomp of seem.

Tag questions
The argument from tag questions is a negative one: the claim is that the NP theme cannot be the subject by this diagnostic. In English tag-questions, a declarative clause expressing a statement is followed by an auxiliary verb and a pronoun which expresses a questioning of the propositional content of the main clause. Examples are given in (24). The pronoun must agree with the subject of the main clause.

(24) a. Mary fooled John, didn't she/*he?
     b. John was fooled by Mary, wasn't he/*she?
As Bresnan points out, tags are in general unacceptable with locative inversion. The examples in (25) show this:

(25) a. ?Into the garden ran John, didn't he?
     b. *Into the garden ran a man, didn't one/he?7
The example in (25a) is less unacceptable than that in (25b). Bresnan (1994: 97) quotes Bowers (1976: 237) who gives the example in (26) and argues that this shows that the postposed NP in locative inversions cannot be the subject.

(26) In the garden is a beautiful statue, isn't there?
The claim is that in (26) there is coreferential with [i]n the garden, which indicates, if anything, that in the garden is a more likely candidate for subject status than the postposed NP.8 Bresnan also quotes *A man arrived didn't one/he? - an example from Gueron (1980: 661) - to show that tags are in general difficult to establish with locatives even when they do not involve inversion. However, this example does seem to be set up to make the situation more, rather than less, problematic: replace a man by the man, and the problem of pronoun choice vanishes. With appropriate context, as in the train arrived at 3, didn't it?, a tag question is fine with locative inversion. The tag-question data are difficult to interpret, therefore; it seems that the best solution is to put them on one side as inconclusive.

Subject extraction/THAT-trace effect
In this section, I simply quote part of Bresnan's (1994: 97) section 8.2, although with renumbered examples. Bresnan is discussing the THAT-trace effect.

[The] preposed locatives in locative inversion show the constraints on subject extraction adjacent to complementizers:

(27) a. It's in these villages that we all believe can be found the best examples of this cuisine.
     b. *It's in these villages that we all believe that can be found the best examples of this cuisine.
Nonsubject constituents are unaffected by this restriction, as we can see by comparing extraction of the uninverted locatives:

(28) a. It's in these villages that we all believe the finest examples of this cuisine can be found __.
     b. It's in these villages that we all believe that the finest examples of this cuisine can be found __.

Only subjects show the effect.

(29) a. It's this cuisine that we all believe can be found in these villages.
     b. *It's this cuisine that we all believe that can be found in these villages.

Extraction from coordinate constituents
Bresnan (1994: 98) gives the examples in (31)-(32), which show the constraint in (30):

(30) 'subject gaps at the top level of one coordinate constituent cannot occur with any other kind of gap in the other coordinate constituent.'

(31) a. She's someone that loves cooking and hates jogging.
     b. She's someone that cooking amuses and jogging bores __.

(31a) has two subject gaps; (31b) has two non-subject gaps.

(32) a. *She's someone that cooking amuses and hates jogging.
     b. She's someone that cooking amuses and I expect will hate jogging.

In (32a), a non-subject gap is coordinated with a subject gap, leading to ungrammaticality. In (32b), we see that a non-subject gap can be coordinated with an embedded subject gap, hence the careful formulation of the constraint in (30). Bresnan (1994: 98) suggests that judgments are delicate with examples like those in (33)-(34), which involve locative inversion examples, but gives these examples:

(33) a. That's the old graveyard, in which is buried a pirate and is likely to be buried a treasure. [subject-subject]
     b. That's the old graveyard, in which workers are digging and a treasure is likely to be buried __. [nonsubject-nonsubject]

(34) a. ??That's the old graveyard, in which workers are digging and is likely to be buried a treasure. [nonsubject-subject]
     b. That's the old graveyard, in which workers are digging and they say is buried a treasure. [nonsubject-embedded subject]

The crucial point here is that the examples in (33) show that 'the inverted locative PPs show the extraction patterning of subjects' (Bresnan 1994: 98). As Bresnan points out, (34a) is fine with there in the subject gap 'which channels the extraction to a nonsubject argument (the oblique)', which argues in favour of a subject treatment of the locative PP.
Another diagnostic for subjecthood is inversion in interrogatives (Bresnan, 1994: 102). As the examples in (35) show, it is clearly the case that the locative PP is not a subject by this criterion. We shall use this fact in the next section, where I argue that the locative inversion data can be best handled by treating syntactic subjects as distinct from morphosyntactic subjects.

(35) a. *Did over my windowsill crawl an entire army of ants?
     b. *Did on that hill appear to be located a cathedral?

As these examples show, the locative PPs are not able to appear as the subjects of auxiliary do in closed interrogatives. Moreover, as we can see in (36), they cannot occur as subjects in open interrogatives, either:

(36) a. *When did over my windowsill crawl an entire army of ants?
     b. *Why did on that hill appear to be located a cathedral?
     c. Why did an entire army of ants crawl over my windowsill?
     d. Why did a cathedral appear on that windowsill?

The examples in (36) show that the locative PP cannot appear as the subject in (36a-36b), although the theme NP can in (36c-36d) in non-inverted examples. However, Bresnan (1994: 102), in a section arguing against a null expletive subject analysis, provides the following data, which make the situation reported here more complex:

(37) a. Which portrait of the artist hung on the wall?
     b. *Which portrait of the artist did hang on the wall?

The examples in (37) show that when a subject itself is questioned, as in (37a), which is the interrogative correlate of a portrait of the artist hung on the wall, subject-inversion is not triggered. In fact, as (37b) shows, auxiliaries cannot occur. We see the same facts with locatives:

(38) a. On which wall hung a portrait of the artist?
     b. *On which wall did hang a portrait of the artist?

The examples in (38) correlate to on the wall hung a portrait of the artist. In these examples, on which wall behaves just like a subject in a subject-interrogative.

3.2 Results and conclusions

It is possible to organize these results into a table - we can then explore the hypothesis that the split in subject properties shown in locative inversion corresponds to a split in subject properties which can be explored elsewhere in grammar. Table 1 shows whether a given subject property applies to the locative PP or the theme NP in a locative inversion structure; for this reason the table says that it is not possible for the NP to undergo subject-to-subject raising in the case of into the room ran a child.
Table 1 Subject property applied to the locative PP or the theme NP in a locative inversion structure

Subject property                              Found on PP?   Found on NP?
Agreement                                          X              /
Subject of participial relative                    X              /
Subject raising                                    /              X
Tag questions9                                     /              X10
THAT-trace effect                                  /              X11
Extraction from coordinated constituents           /              X
Inversion in interrogatives                        X              X
Subject interrogatives                             /              X
The evidence from Bresnan's paper, which I have reviewed in this section, shows that several subject properties may occur on either the locative PP or the postposed subject NP. Only three properties are not able to occur on the PP: these are agreement, being the subject of a participial relative, and inversion in non-subject interrogatives. The tag question data appear to favour a subject analysis of the locative PP rather than the postposed NP. In the next section, I set out to accommodate those facts within Word Grammar. As I stated in the introduction, these facts are problematic for the model put forward in Hudson (1990), because there is only a single account of subjects in that theory. In the next section, I review some of the dimensions of subjecthood, in the light of the discussion in section 2. I go on to argue that we need to split subjects into three kinds - lexical subjects, syntactic subjects and morphosyntactic subjects - and that this division can take account of the pattern of data reported in section 3.

4. Factored Out Subjects

In this section, I relate the data in section 3 to the more general discussion of subjects presented in section 2. The problem for the English Word Grammar typology of dependencies is that in this model there is only one dependency which has the 'subject-of' label - and yet in the locative inversion data, as we have seen, there are two candidates for the subject dependency. If we start with the general observations made about subjects in section 2, we can see that the subject properties detailed there gather around three poles: there are subject properties which are - broadly speaking - lexical; syntactic; and morphosyntactic.12 We can put the subject properties into three lists. Bracketed items feature in more than one list.13

Lexical properties
• Subjects are typically the dependent which expresses the Agent argument in the active voice.
• (The sole arguments of intransitives typically show (other) subject properties.)
• (The addressee of an imperative is a subject.)

Syntactic properties
• The subject can be the shared argument in coordination.
• The subject is the only argument of an xcomp which may be shared in a predicative complementation structure (in both raising and control).
• English subjects resist extraction. (In other languages subjects are often more extractable than other arguments.)
• Auxiliary inversion fails with a subject interrogative.
• Subjects are usually the discourse topic.

Morphosyntactic properties
• (The addressee of an imperative is a subject.)
• The subject is often obligatory.
• Subject-verb agreement.

As we can see, the lexical properties are those properties which are primarily to do with the mapping of semantic roles to grammatical functions. The first two items listed as lexical properties concern the mapping of semantic roles, in particular in nominative-accusative languages (like English) rather than absolutive-ergative languages (like West Greenlandic). I have placed the second two items listed as lexical properties in brackets, because these are shared with other parts of the grammar: the identification of the subject of imperatives is not only a lexical property. What makes it a 'lexical' subject is that the subject of an imperative picks out the same semantic role as the first two criteria - the linking facts apply here as well. But this subject-criterion could also be morphosyntactic, because the imperative is a mood, and mood is a morphosyntactic feature. Excepting certain well-known construction types, it is only the imperative mood that permits subjects not to be represented by an overt noun or noun phrase in English. The syntactic properties are those that have to do with the grammatical phenomena that are commonly called 'movement' or 'deletion'. The first three can be subsumed under the descriptive generalization that the subject is the element which can be an argument of more than one predicate.
The English extraction data are at odds with the more typical extraction data, but they still show that subjects can be identified by their extraction properties. I have included the non-inversion of subject interrogatives as a syntactic fact (rather than a morphosyntactic one) on the grounds that subject-inversion is a word order rule, and is therefore syntactic. For this reason, the fact that PP locatives behave like other subjects shows that they have the same syntactic properties as other subjects: they resist subject-inversion in interrogatives. The final observation listed under syntactic subjects is arguably not even grammatical - but there are constructional interactions between topic and syntactic structure and, indeed, focus and syntactic structure. Again, there is a descriptive generalization to be captured, that subjects generally are topics.14 The morphosyntactic properties of subjects tend to be language-specific.
Tenseless languages will not show any morphosyntactic subject properties, and as Hudson (1999) shows, such properties are in decay in English. I have already discussed the issue to do with the imperative. I have put the obligatory criterion here, because it is related to agreement: in languages which have a highly developed agreement morphology in both the verbal conjugation system and the nominal declension system it is possible for subjects to be omitted. English has obligatory subjects - which can be expletive - which appear to be obligatory because of the impoverished inflectional morphology. 15 It is possible, on the basis of this discussion, to make some general predictions about what might be found cross-linguistically. The lexical properties of subjects will vary according to whether the language is nominative or ergative. The morphosyntactic properties of subjects will vary according to whether the language has a rich inflectional system, an impoverished inflectional system, or no inflectional system. And the syntactic properties of subjects should be relatively consistent across languages: to the extent that there is variation in the syntactic properties of subjects, it should be attributable to the interaction between this dimension of subjecthood and one of the other dimensions. From the point of view of Word Grammar, these observations about subjects are only salient if factored-out subjects can do two things: exist in mismatch structures, so that no single grammatical relation can be held to obtain between a verb and another element in a clause; and help capture descriptive facts better than simply treating subject-of as a single unitary relation. 
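The idea that factored-out subjects can 'exist in mismatch structures' can be made concrete with a small sketch. The representation below - a dictionary of dependency slots on the verb - is my own invented illustration, not Word Grammar notation:

```python
# A hedged sketch of the three-way factorization of 'subject'.
# In a canonical clause the three dependencies converge on one phrase;
# in a mismatch structure (such as locative inversion) they do not.

def subject_slots(lexical, syntactic, morphosyntactic):
    return {"lexical-subject": lexical,
            "syntactic-subject": syntactic,
            "morphosyntactic-subject": morphosyntactic}

# Canonical clause: "Two children were found in the swamp" -
# all three subject dependencies pick out the same NP.
canonical = subject_slots("two children", "two children", "two children")

# Mismatch structure: "In the swamp were found two children" -
# the syntactic-subject dependency lands on the preposed PP, while the
# theme NP carries the lexical and morphosyntactic dependencies.
inverted = subject_slots("two children",   # linking of the theme argument
                         "in the swamp",   # raising/extraction behaviour
                         "two children")   # controls agreement

def is_mismatch(slots):
    # A mismatch structure is one where the slots name different phrases.
    return len(set(slots.values())) > 1

assert not is_mismatch(canonical)
assert is_mismatch(inverted)
```

On this view, no single 'subject-of' relation holds between the verb and one phrase in the inverted clause; each factored dependency holds separately.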
If we return to the data presented in Table 1, we can see that all of the subject properties that are found on the inverted PP are syntactic subject properties in that they are all subject properties that are relevant to the ability of a single noun or noun phrase to be construed as an argument of more than one predicate. The properties that were found on the locative PP were: • subject raising; • tag questions; • subject extraction; • extraction from coordinated constituents; • subject interrogatives. Excepting the tag-question data (which are arguably morphosyntactic, and which were, in any case, moot) all of these subject properties are properties that are related to the ability of a single subject entity to be an argument of more than one predicate. What we find is that locative PPs can behave like syntactic subjects, and that when they do behave like syntactic subjects, the NP theme argument of the verb cannot behave like a syntactic argument. This is half of an argument that subject properties are split. The other half of the argument is that the morphosyntactic properties are found on the NP. The properties that were found on the NP in English, but not on the preposed PP, were:
• agreement;
• subject of participial relative.

Agreement is clearly morphosyntactic, and in any case, it is not possible for a category which does not have number to show agreement. The more difficult case is that of being subject of a participial relative. Bresnan (1994: 94) introduces this diagnostic because in Chichewa it is possible for a PP to be the subject of a participial relative. Given the cross-linguistic variation, this has to be assigned to an arbitrary difference between languages. It is probably due to the fact that participial relatives are adjuncts of the noun they modify, and in the semantics of English adjuncts take their head as their argument. The adjuncts which do not take their heads as their arguments are limited in number to a small set of exceptions: adjective adjuncts of verbs in, for example, resultative constructions like Jane ran her trainers threadbare, where threadbare is an adjunct of ran, and its 'er' is (her) trainers. The Chichewa data suggest that, in general, the participial relative facts need to be treated as syntactic rather than lexical or morphosyntactic. The datum which I have not discussed is the inability of PP subjects to undergo subject-inversion in interrogatives. I repeat examples (35) and (36) here:

(35) a. *Did over my windowsill crawl an entire army of ants?
     b. *Did on that hill appear to be located a cathedral?
(36) a. *When did over my windowsill crawl an entire army of ants?
     b. *Why did on that hill appear to be located a cathedral?
     c. Why did an entire army of ants crawl over my windowsill?
     d. Why did a cathedral appear on that windowsill?

The examples in (35) and (36a-36b) show that the locative PP in locative inversion cannot undergo subject-inversion. I think that the crucial thing here is that this is not a syntactic property of subjecthood, but a morphosyntactic one.16 It is not a syntactic one, because the syntactic constraints were generally concerned with the ability of a single phrase to occur as an argument of more than one predicate. The restriction in (35) and (36) is different; in fact, it is attributable to the PP's lack of morphosyntactic properties. I take it that an auxiliary cannot invert with a phrase that it cannot agree in number with.17 However, the examples in (37) and (38), which I repeat here, show that the locative PP behaves like a subject with respect to subject interrogatives:

(37) a. Which portrait of the artist hung on the wall?
     b. *Which portrait of the artist did hang on the wall?
(38) a. On which wall hung a portrait of the artist?
     b. *On which wall did hang a portrait of the artist?

The split between morphosyntactic subjects and syntactic subjects permits an elegant account of the interrogative data. In locative inversion, the locative PP patterns with subjects in subject interrogatives, simply because this property is a
word-order property, which belongs in the domain of syntactic subjecthood. But the locative PP does not undergo subject-inversion because this is a morphosyntactic property. I propose that, in the case of locative inversion, we reject the data from tag questions on the grounds that even uninverted theme NPs cannot be antecedents for the pronouns in tag-questions, as we saw in the discussion of (24)-(26) above. Tag questions should be diagnostic of morphosyntactic subjecthood, given that the auxiliary verb behaves like a resumptive pronoun with respect to the main verb, and given that the agreement pattern should reflect that of the main clause.

4.1 Summary and discussion

The locative inversion data argue for a differentiation between morphosyntactic subjects and syntactic subjects - a split which is supported by an examination of subject properties more generally. The situation is slightly more complicated in that the locative inversion data argue for a two-way split, but the more general discussion of subject properties suggests that there ought to be a three-way split. However, in other general discussions of subjecthood, such as Dixon (1994), Falk (2004) and Manning (1996), only a two-way split between 'subjects' and 'pivots' is maintained. In that work, 'subjects' correspond to my lexical subjects, and the subject, as opposed to the pivot, is the argument where linking is controlled. On the other hand, in Anderson (1997) a distinction between morphosyntactic subjects and syntactic subjects, such as I have been arguing for here, is maintained. I think that the way forward is to treat the investigation of subjecthood in a similar way to the commutation-series approach to the phoneme inventory of a language. We have seen from the locative inversion data that morphosyntactic and syntactic subjects can be factored out from each other.
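The division of labour argued for in this section - agreement consulting the morphosyntactic subject, raising and extraction consulting the syntactic subject - can be sketched as a toy model. Everything here (the class name, the attributes) is a hypothetical illustration, not Word Grammar machinery:

```python
# Illustrative sketch (invented representation): a clause carries two
# distinct subject dependencies, and in locative inversion they land on
# different words. Agreement consults only the morphosyntactic subject;
# raising consults the syntactic one.

class Clause:
    def __init__(self, verb_num, syntactic_subj, morphosyntactic_subj):
        self.verb_num = verb_num
        self.syntactic_subj = syntactic_subj              # e.g. the preposed PP
        self.morphosyntactic_subj = morphosyntactic_subj  # e.g. the theme NP

    def agreement_ok(self):
        # PPs have no number, so only a nominal morphosyntactic subject
        # can satisfy agreement with the finite verb.
        num = self.morphosyntactic_subj.get("num")
        return num is not None and num == self.verb_num

    def raisable(self):
        # Subject-raising targets the syntactic subject, whatever its category.
        return self.syntactic_subj is not None

# (19b) "In the swamp were found two children": the PP carries the
# syntactic-subject dependency, the postposed NP controls agreement.
inv = Clause("pl",
             syntactic_subj={"cat": "PP", "form": "in the swamp"},
             morphosyntactic_subj={"cat": "NP", "num": "pl"})
assert inv.agreement_ok() and inv.raisable()

# A PP alone cannot control agreement: it has no number feature.
pp_only = Clause("pl",
                 syntactic_subj={"cat": "PP", "form": "in the swamp"},
                 morphosyntactic_subj={"cat": "PP", "form": "in the swamp"})
assert not pp_only.agreement_ok()
```

The point of the sketch is simply that the two diagnostics consult different dependency slots, so no contradiction arises when they pick out different words.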
I propose, briefly, to show that lexical and morphosyntactic subjects can be factored out, and then that lexical and syntactic subjects too can be factored out. The account presented here contrasts with Bresnan's in two dimensions. Bresnan (1994: 103-5) argues that the locative PP is a subject in LFG's domain of f-structure, but that it does not occupy a subject position in LFG's domain of c-structure.18 The clause-initial property of the locative PP in locative inversion is attributed to its also being a topic in f-structure. The other part of Bresnan's (1994: 105) analysis is that the postposed NP is identified as the f-structure object, for both English and Chichewa. The evidence and arguments that Bresnan puts forward for her analysis are largely that because it is a PP, the locative PP cannot fulfil certain structural roles associated with subjecthood, which are contingent on the property of being a nominal category. PPs do not normally have the distribution of NPs - it is only in the particular construction of locative inversion, with the additional overlay of topichood, that locative PPs may behave as subjects.19 Bresnan claims that the f-structure element in her analysis accounts for what I have described here as syntactic subject properties; and the c-structure part accounts for the morphosyntactic subject properties.
Within Word Grammar, we cannot exploit a c-structure/f-structure mismatch. Nor is it possible to assert that certain subject properties reside in one domain of structure. However, by splitting the subject properties in the way I have here, we can account for the same phenomena within a single domain of structure: the dependency relation. This buys an advantage over Bresnan's account: as we have seen, the postposed NP has the properties of a morphosyntactic subject. Bresnan treats the postposed NP as an object, but this analysis cannot account for its agreement properties. By treating it as a morphosyntactic subject, and not an object, we can account for the agreement facts without assuming, for example, that English has object agreement. In the next section, I discuss the distinction between lexical and other subjects a little further. This discussion does not add to the analysis of locative PPs in locative inversion, but it does complete the discussion of a three-way split in subject properties.

4.1.1 LEXICAL AND OTHER SUBJECTS

We can see that lexical and morphosyntactic subjects must be factored out from each other by looking at raising and gerunds. The examples in (39) show raising, the examples in (40) show gerunds.

(39) a. Jane seems to be running.
     b. I expect Jane to be running.

In both examples in (39), Jane is the 'er' of 'run'. However, in neither example can Jane be thought of as the morphosyntactic subject, because there is no agreement: the infinitive does not have a feature-value 'number'. The examples in (40) are even more acute: the gerund running also has no value for number, but it does have a subject - the pronoun me in (40a) and the pronoun my in (40b). Again, in neither case can this be thought of as morphosyntactic subjecthood.

(40) a. Everyone laughed at me running.
     b. My running was funny.
Additionally, the evidence from gerunds shows that lexical subjects have to be distinguished from syntactic subjects: in (40a), me is the head of running, and in (40b), my is the head of running. From these examples, we have to conclude that there are cases where lexical subjecthood has to be distinguished from syntactic subjecthood and from morphosyntactic subjecthood. Another example, although a negative one, comes from weather IT. The example in (41) shows that weather IT can be simultaneously a morphosyntactic and a syntactic subject, even though it cannot be a lexical subject, given that the verb rain does not have any semantic arguments.

(41) It seems to be raining and to be sleeting.
The example in (41) is also important because it shows that the property of
being a syntactic subject is not co-extensive with being a topic. Both Bresnan (1994) and Falk (2004) argue that syntactic subjects are topics of some kind, but an expletive pronoun is not a candidate for topichood.

5. Conclusions

In this section, I argue that the treatment of the data presented in this chapter handles the facts and the data more satisfactorily than the mismatch account of Bresnan (1994), and that it is more compatible with other general assumptions about the architecture of grammar that Word Grammar adopts. On the basis of the locative inversion evidence, I have made a distinction between morphosyntactic and syntactic subjects, and on the basis of further evidence from other constructions have made a further distinction which separates lexical subjects out from the other kinds of subject. Bresnan also argues for a three-way distinction, but in her case the factorization of subjecthood is over three of the domains of structure that LFG recognizes: a-structure, f-structure and c-structure. She effectively argues that the locative PPs can only be construed as subjects because they are also topics. The problem with this account is that it treats 'topic' as fundamentally syntactic, located in f-structure, when it is clear (a) that subjects need not be topics; and (b) that some subjects cannot be topics. Furthermore, we have seen that the properties of subjects itemized in section 4 do not require there to be a separate dimension of topichood - it is simply the case that some subjects are syntactic rather than morphosyntactic. In some senses, the different approaches between this chapter and Bresnan (1994) are due to underlying assumptions that the two models have, which make them different from each other. LFG does not permit there to be a mapping of more than one f-structure relation between two elements; Word Grammar does not distinguish between argument structure and the instantiated dependencies in a given construction.
But it is also the case that the WG account espoused here allows the theory of subjects to be elaborated so that it can account for a wide range of differences in the spectrum of subject properties. There are some obvious avenues for future research: for example, both West Greenlandic and Mandarin are tenseless. For this reason, Mandarin has been argued not to have the subject and object dependencies that are witnessed in other languages. However, while Mandarin has long-distance reflexives, West Greenlandic does not. One salient difference is that Mandarin is a nominative language while West Greenlandic is an ergative language, and so the question arises whether these facts are attributable to differences in lexical subjects in these languages. Certainly more research is required on the cross-linguistic typology of dependencies. Meanwhile, it is clear that the English Word Grammar model needs to be revised to admit at least three different kinds of subject.
FACTORING OUT THE SUBJECT DEPENDENCY
References

Aissen, J. (1975), 'Presentational there-insertion: a cyclic root transformation'.
Jackendoff, R. S. (1990), Semantic Structures. Cambridge, MA: MIT Press.
Keenan, E. L. (1976), 'Towards a universal definition of "subject"', in Charles Li (ed.), Subject and Topic. New York: Academic Press, pp. 303-33.
Keenan, E. L. and Comrie, B. (1977), 'Noun phrase accessibility and universal grammar'. Linguistic Inquiry, 8, 63-99.
McCloskey, J. (1997), 'Subjecthood and subject position', in L. Haegeman (ed.), Elements of Grammar: Handbook of Generative Syntax. Dordrecht: Kluwer, pp. 197-235.
— (2001), 'The distribution of subject properties in Irish', in W. D. Davies and S. Dubinsky (eds), Objects and Other Subjects. Dordrecht: Kluwer, pp. 157-92.
Manning, C. (1996), Ergativity. Stanford, CA: CSLI Publications.

Notes

1 Unless I explicitly state otherwise, the examples in section 1 and section 3 (where I present Bresnan's locative inversion data) are taken from Bresnan's (1994) paper, and the grammaticality judgements are hers. I have, however, silently amended Bresnan's spelling to British English norms.
2 This is a pre-formal statement, and I do not intend it to commit me to any particular theoretical position.
3 English makes a distinction between disjoint pronouns - the forms him, her, me and so forth - and anaphoric pronouns like himself, herself, myself and so forth. Not all languages make this distinction, and English has not always made the same distinction in the course of its history.
4 The underscore represents the subject position for to. I do not mean by this representation to suggest that there is actual movement - like Hudson (1990) I reject a movement account. The representation is intended to be pre-formal, and is borrowed from Bresnan (1994), whose examples I borrow in section 3 - and in borrowing some of these examples, I import the representation.
5 I adopt the analysis of subject-verb agreement presented in Hudson (1999), which argues, in summary, that present-day English represents a transitional stage where, except in the case of be, number is the only remaining agreement feature.
6 The italicized part of the sentence shows that in (22a) [o]ver my windowsill appears to have been raised from the subject position of to have crawled an entire army of ants into the subject position of seems. However, in borrowing this representation, I do not commit myself to a movement analysis of these data.
7 These examples are not drawn from Bresnan's paper.
8 This claim is debatable: there in this example is not the deictic there of there it is, but the empty one of there's a problem. It might make more sense to say that it was coreferential if it were the deictic there.
9 The table shows Bresnan's (1994) evaluation of this evidence, although it seems clear that the tag-question data is rather more moot than her presentation would suggest. I return to this in section 4 below.
10 This is not, purely, a subject diagnostic: the point is that the parallelism shows that if the extracted PP is to be treated as a subject in one conjunct, it cannot be an object or other argument in the second conjunct, which suggests strongly that it is actually a subject.
11 Of course, when the theme NP is in the normal subject position it can undergo inversion: the point of these cells in the table is that neither the NP nor the PP can undergo inversion when it is the PP that is in subject position.
12 I am using these terms pre-formally as a descriptive heuristic. I refine the terms, and the analysis, below.
13 I am leaving the reflexive binding facts out of the lists. These facts could theoretically belong in all three lists - and for that reason, more research needs to be done about the relationship between reflexive binding and the dimensions of subjecthood. I shall come back to this briefly in section 5; here it suffices to point out that binding has been treated in terms of clause domains, which is either syntactic or morphosyntactic, depending on how clauses are defined, and in terms of hierarchies of arguments, which is clearly lexical.
14 Of course, a subject need not be a topic: weather it, the it of extraposition, and expletive THERE cannot be treated as topics given that they are not referential.
15 It might be objected that Mandarin has no inflectional morphology whatever, yet subjects can be omitted in Mandarin when they can be pragmatically recovered. This point is, however, consistent with my observation: Mandarin has no tense; indeed, it could be argued that it has no finiteness at all. Given that lack of morphosyntax in Mandarin, it is unsurprising that it does not have the category of morphosyntactic subjects. And given that subjects can be omitted in languages with a rich morphosyntactic system, as well as in languages lacking a morphosyntactic system, we can deduce that obligatory subjecthood is neither a lexical property nor a syntactic property.
16 Hudson (1990: 240) also treats subject-inversion as a morphosyntactic property.
17 This claim argues for a treatment of THERE-insertion where THERE acquires its number from its 'associate', given that THERE can invert with an auxiliary.
18 She goes on to reduce the typological differences between English and Chichewa to a difference in the c-structure representations: in Chichewa, the f-structure PP subject is also a c-structure subject.
19 For this reason, Bresnan (1994: 110) distinguishes between 'place' and 'time' denoting PPs, which can have the same distribution as nominal elements, and the PPs found in locative inversion.
Conclusion
KENSEI SUGAYAMA
The movement of Word Grammar began largely as an approach to the analysis of grammatical structure and linguistic meaning in response to constituency-based grammar and generative grammar. In this book, we have focused on the analyses of morphology, syntax, semantics, and discourse based on the fundamental hypotheses presented in the Preface and Chapter 1: WG is monostratal; it uses word-word dependencies; it does not use phrase structure; and language is viewed as a network of knowledge, linking concepts about words, their meanings, etc. We conclude our survey by pointing out some of the ways Word Grammar has gone, and should go, beyond its boundaries. The monostratal character of WG is an advantage, especially the absence of transformations, even of movement rules. Their role has been taken over by the acceptance of double dependency within certain limits. Although word-word dependencies are difficult for a number of grammarians to accept, they make the grammar simpler and are also important in determining the default word order of a language. The notion of phrase is not completely lost in WG, since phrases can be seen as dependency chains. It is also a good idea to see grammatical relations (subject, object, etc.) as a subclass of dependents. WG presents language as a network of knowledge, linking concepts about words, their meanings, etc. In this network, there are no clear boundaries between different areas of knowledge - e.g. between 'lexicon' and 'grammar', or between 'linguistic meaning' and 'encyclopedic knowledge'. It is not widely known that this hypothesis was advanced much earlier than the contemporary movement of cognitive linguistics. Thus WG has implied since its very start in the early 1980s that the conceptual structures and processes proposed for language should be essentially the same as those found in nonlinguistic human cognition.
It uses 'default inheritance' as a very general way of capturing the relation between 'model' or 'prototype' concepts and 'instances' or 'peripheral' concepts. 'Default inheritance' and especially 'prototypes' are now widely accepted among linguists. As Richard Hudson puts it in his conclusion to Chapter 1, WG addresses questions from a number of different research traditions. As in formal linguistics, it is concerned with the formal properties of language structure; but it also shares with cognitive linguistics a focus on how these structures are
embedded in general cognition. Within syntax, it uses dependencies rather than phrase structure, but also recognizes the rich structures that have been highlighted in the phrase-structure tradition. In morphology it follows the European tradition which separates morphology strictly from syntax, but also allows exceptional words which contain the forms of smaller words. And so on through other areas of language. Every theoretical decision is driven by two concerns: staying true to the facts of language, and providing the simplest possible explanation for these facts. The search for new illuminating insights is still under way, and more widespread beliefs may well have to be abandoned; but the most general conclusion so far, as Richard Hudson says, seems to be that language is mostly very much like other areas of cognition. Thus, Word Grammar in its architecture has the potential to make a contribution to a theory of cognition that goes beyond language.
Author Index
Abeillé, A. 155
Anderson, J. M. 6, 220
Andrews, A. 35, 72, 205
Baltin, M. 153
Biber, D. et al. 92, 94, 95, 97, 99
Boguraev, B. 83
Borsley, R. D. 161, 167n.4, n.5, n.6
Bouma, G. 155, 157
Breheny, R. 201
Bresnan, J. 6, 84, 205, 210, 220, 223n.13, 224n.7, n.9, n.19
Briscoe, T. 83
Chomsky, N. (including Chomskyan) 9, 35, 41, 84, 87, 162
Chung, C. 145, 146, 151, 153, 158, 160-5
Comrie, B. 205
Conrad, S. 92, 94, 95, 97, 99
Copestake, A. 83
Cormack, A. 201n.2
Creider, C. 202n.13
Croft, W. 83
Cruse, D. 92
Davis, A. 111
Dixon, R. M. W. 220
Dowty, D. 95, 111, 112
Eppler, E. 118, 122, 127, 129
Falk, Y. 205
Fillmore, Ch. 6
Finegan, E. 92, 94, 95, 97, 99
Gazdar, G. 75
Ginzburg, J. 145, 146, 154-60, 164, 165, 167n.7
Godard, D. 155
Goldberg, A. 8, 83, 87, 111, 113
Haegeman, L. 146, 167n.3
Halliday, M. A. K. 5, 6
Henniss, K. 51
Holmes, J. 84, 112, 113, 114
Huddleston, R. 68, 69
Hudson, R. A. 35, 42, 43, 50, 69, 70, 84, 87, 92, 99, 121, 122, 125, 126, 145-53, 165, 167n.1, n.2, 183, 191, 202n.13, 204, 205, 218, 223n.4, n.5, 224
Jackendoff, R. 24, 111, 206, 211
Jaworska, E. 201n.3
Johansson, S. 92, 94, 95, 97, 99
Kasper, R. 146, 161
Kathol, A. 146, 161, 167n.8
Keenan, E. L. 205
Kim, J.-B. 145, 146, 151, 153, 155, 157, 158, 160-5
Koenig, J.-P. 111
Koizumi, M. 153
Lakoff, G. 3
Lamb, S. 9
Langacker, R. 3, 8
Lasnik, H. 50, 72, 162
Lecarme, J. 35
Leech, G. 92, 94, 95, 97, 99
Lemmens, M. 83
Levin, B. 83, 95-6
Levine, R. 146, 161
Lyons, J. 6
McCawley, J. 6
McCloskey, J. 205
Malouf, R. 155, 157
Manning, C. 220
Muysken, P. 118, 119, 121, 124, 130, 140
Payne, J. 201n.2
Penn, G. 161
Pinker, S. 11
Pollard, C. J. 35, 51, 146, 161, 167n.1
Przepiórkowski, A. 155
Pustejovsky, J. 83
Quicoli, A. C. 35
Quirk, R. et al. 180
Rappaport Hovav, M. 83
Reape, M. 146, 161
Rizzi, L. 146
Ross, J. R. 154
Rosta, A. 202n.16, 203n.21
Rutherford, W. E. 135, 136, 137
Sag, I. A. 35, 51, 145, 146, 154-60, 164, 165, 167n.1, n.7
Sankoff, D. 120, 129
Shaumyan, O. 6
Shibatani, M. 87
Sigurðsson, H. 50
Sugayama, K. 7, 167n.6
Taylor, J. 3
Tesnière, L. 4
Trask, R. 111-2
Trier, J. 92
van Langendonck, W. 183
van Noord, G. 155
Warner, A. R. 76
Weisgerber, L. 92
Williams, E. 87
Subject Index
a(n) 202
accusative case 36, 38, 39, 41, 45, 50
accusative subject (see subject(s))
actor (see participant role(s))
adjunct 149-51, 154, 157, 159, 192
adjunction 192
adjective(s)
  attributive adjectives 182
  predicative adjective 35
adverbial 155, 157
agent (see participant role(s))
agreement 95
all but 174ff, 178
almost 174ff, 178
argument role 113
attraction 36, 40, 41, 46, 50, 53n.3, 53n.4
atypical complement 202
auxiliary
  quasi-auxiliary 67
  semi-auxiliary 67
be 68, 69
be to construction 67, 70, 72, 73, 75, 78, 79
because 117, 128, 129-40
beneficiary (see participant role(s))
Best Fit Principle 15
binder 188, 200
both 191
branch 188
Branch Uniqueness principle 190, 192, 202
case agreement 35, 41-2, 45, 51
clausal that 174
clitic 21-2
Code-mixing 117, 118, 119, 120, 122-4, 128, 129, 131, 139
Code-switching 118, 124, 127, 128
cognition 3, 15, 22, 24, 28
cognitive linguistics 3, 28
comment 197
compaction 161
complementizer 154, 162, 167n.6
complex coordination (see coordination)
concept 5, 8, 9, 10, 11, 12, 13, 16, 24, 25
conjunct 189
connectivity 189
constituency 165, 167n.9
constraint 117-20, 122, 127
  constructional constraint 161
coordination 21, 22, 23, 190-1, 202
  complex coordination 191, 198
  correlative 191
count interpretation (see also mass interpretation) 201n.9
default definiteness 63
default inheritance 5, 6, 8, 12-13, 16, 20, 123, 124, 126
degree words 181
deletion 176
  VP deletion 48
demand 174, 177
dependency 6, 21-4, 27-8, 122, 125, 127, 139, 145-54, 165, 167n.6, 204
  long-distance dependency 155
dependency types 191
depictive 192
determiner 182
difficulty 27, 28
distribution 172
ditransitive 198
DOMAIN (DOM) (see HEAD-DRIVEN PHRASE STRUCTURE GRAMMAR)
eat type verbs 56, 59
ee (see also er) 101, 104, 110
either 191
ellipsis 176, 182, 183, 193, 201n.5, 201n.6
ellipsis, anaphoric 48
ellipsis, determiner-complement 201n.5
embedded clause 145-6, 156, 159, 160, 162-5
empty category 187ff, 202
empty element 35
endocentricity (see also hypocentricity) 173
English 117, 118, 123, 125, 127-40
er (see also ee) 101, 104, 110
even 178ff
event 104-5, 108-10
existence propositions 35
exocentricity 173
extractee 149-54, 167n.6
extraction 21, 23
extraposition 183, 202
feature structure (see HEAD-DRIVEN PHRASE STRUCTURE GRAMMAR)
filler (see HEAD-DRIVEN PHRASE STRUCTURE GRAMMAR)
finite 197
form 13, 15, 18, 20-1, 25
'future in the past' 74
gap-ss 157
Generalized Head Feature Principle (GHFP) 156, 158
generative grammar 5-6
German 117, 118, 123, 125, 127, 128, 130-40
goal (see participant role(s))
grammatical relation 192
GR-word 192, 195
guardian 196, 200
head 154-60, 173
Head-Driven Phrase Structure Grammar (HPSG) 35, 42, 51, 52, 56, 145-6, 154, 160, 164, 165, 167n.1, n.9
  DOMAIN (DOM) 161-3
  feature structure 154
  filler 156, 158
  HEAD (in HPSG) 154
  SLASH 154-59, 167n.7
  synsem (value) 155, 157
  SYNTAX-SEMANTICS (SYNSEM) 154-6
hypocentricity (see also endocentricity) 173, 178, 199-200
immediate dominance rules 167n.9
incremental theme 95, 112
INDEPENDENT-CLAUSE (IC) 156, 158-60
infinitival subject (see subject(s))
inflection 4, 6, 14, 20, 25, 26
Inheritance Principle 70
inside-out interrogative construction 185ff
instance (see also model) 57, 58, 69, 67, 70, 72, 77, 78, 79, 81
interrogative 146-54, 158-60, 162
interrogative clause 185ff
inversion 75
inversion, locative 210ff
inversion, subject-auxiliary (see SAI)
INVERTED (INV) 159, 160
isa 8, 9, 10, 11, 12, 13, 14, 15, 16, 57, 58
just 178ff
landmark 105, 110, 112, 146, 152
lexeme 5, 14, 18, 21, 24, 27, 34, 36
  ACCORD 91
  AT 105
  BE 107
  COVER 114
  DEVOUR 93
  DO 96
  FOR 89-90
  GIVE 84, 85, 91, 107
  HAVE 95, 96, 107
  IN 105
  LOAD 114
  ON 105
  OPEN 90
lexeme (cont.)
  POUR 114
  SPRAY 114
  TO 89, 107
  WITH 108
Lexical Functional Grammar (LFG) 204, 220, 222
lexical relationship 27
lexical subjects (see subject(s))
lexical unit 67, 77, 81
linear order 145, 146, 151, 163-5
linear precedence rule 165
local tree 161, 165
locative inversion 210ff
long-distance dependency (see dependency)
mass interpretation 185, 201n.9
mental nodes 43
model (see also instance) 57, 69, 70, 79
modularity 11-12
more than 174ff
morpheme 5, 15, 17, 18, 19, 20
morphological case 41
morphology 4, 7, 18-21, 22, 28
morphology, derivational 18, 20
morphology, inflectional 18, 19
morphology-free syntax 4
morphosyntactic subjects (see subject(s))
multiple inheritance 12, 13, 14, 91
neither 191
network 5, 6, 7-10, 11, 13, 14, 15, 17, 20, 22, 24, 27, 123, 124, 126
never 174ff
NICE properties 72
No-Dangling Principle 150
No-Tangling Principle 147, 151-4, 167n.2
not 174ff, 178
noun + noun premodification 193-5
null subject (see subject(s))
object 92-7, 101
object, indirect 84-92, 102, 107
objects, suppressed 54, 56, 59, 60, 62
objects, understood 59, 65n.7
object pro-drop 47
only 178ff
Order Concord 167n.2
order domain 146, 161-4
other than 174ff
parent 146, 148-50, 152-4, 165
parsing 27, 28
participant role(s) 113
  actor 107, 110, 111, 113
  agent 112
  beneficiary 89-90, 102
  goal 111
  patient 108, 110, 111, 112
  recipient 85, 89, 102-3
  theme 107, 110, 111, 112
passive 87, 93
past 90
patient (see participant role(s))
phrase 145-6, 154-8, 162-5
pied-piping 171, 181
plural inflection 202n.11
pluralia tantum 47, 48
precedence concord 190
predicate 195
predicate nominal 35, 41
predication 19-56
present 99
principle of No Crossing 190
principle of Node Lexicality 190
PRO 35, 42, 44, 45, 49, 50, 51, 52
processing 7, 16, 27-28
pro-drop 45, 46-7
Promotion Principle 146
proxy 177ff, 188, 199
pseudogapping 201n.5
purpose 89
quantity variable 43, 44
question 98
recipient (see participant role(s))
regent 172
require 174
restrictive vs. non-restrictive 135-40
result 95, 108
Right Node Raising 203n.21
semantic phrasing 26
semantics 4, 6, 7, 13, 24-7
sharer 147
shave type verbs 59
SLASH (see HEAD-DRIVEN PHRASE STRUCTURE GRAMMAR)
sociolinguistics 4, 6, 7
SOV 127, 128, 130
split subjects (see subject(s))
SSP (see Syntax Semantics Principle)
state 104-8
Stratificational Grammar 6
structure-sharing 22, 23
SUBCAT list 36, 51
subject(s) 97-101, 195, 205
  accusative subject 36-9, 40
  infinitival subject 38, 50
  lexical subjects 216
  morphosyntactic subjects 217
  null subject 36, 44-5, 46, 49, 51, 52
  split subjects 205
  subject properties 206ff
  syntactic subjects 218
subject-auxiliary inversion (SAI) 151, 153, 158, 160, 162, 202n.19
subordinate 172
subordinate clause 145-46, 149-54, 159-60
superordinate 172
surface structure 147, 149-51
surrogate 174, 177ff, 187, 199
SVO 127, 128, 130
synsem (value) (see HEAD-DRIVEN PHRASE STRUCTURE GRAMMAR)
syntactic subjects (see subject(s))
syntax 4, 21-4, 25, 26, 27-8
Syntax Semantics Principle (SSP) 84, 94, 103, 110
SYNTAX-SEMANTICS (SYNSEM) (see HEAD-DRIVEN PHRASE STRUCTURE GRAMMAR; see also synsem)
terminal node 201n.1
that-clause 174
theme (see participant role(s))
three-place predicate 55
token (see also type) 43, 46
topic 145, 146, 152, 153, 158, 161-4, 197
topicalization 146, 156, 157, 159
Topological Linear Precedence Constraint 162-4
type (see also token) 43, 46
type-of construction 184-5
unreal words (see words)
Unspecified Object Alternation 65n.3
utterance 15-17
V2 128, 132, 136, 137
VP fronting 75
way 183
weil (German) 117, 130-40
wh-interrogative 146, 147, 149-54, 158-60, 162
wh-pronoun 147, 148, 149, 153, 154
win type verbs 59
word(s) 4, 5, 13, 14, 15, 18-19, 20, 21-2, 23, 24, 26, 28
  unreal words 6, 22, 23-4
Word Grammar (WG) 56-8, 117, 121-7, 137, 139, 140, 145-154, 159, 164-5, 167n.2, 204, 205, 218, 221, 222
word order 146, 147, 152, 153, 161, 165, 172, 177, 188
word-order rule 146, 147, 152, 153