eScholarship
Open Access Publications from the University of California

Glossa Psycholinguistics

Children’s acquisition of new/given markers in English, Hindi, Mandinka and Spanish: Exploring the effect of optionality during grammaticalization

Published Web Location

https://doi.org/10.5070/G6011120
Creative Commons 'BY' version 4.0 license
Abstract

We investigated the effect of optionality on the acquisition of new/given markers, with a special focus on grammaticalization as a stage of optional use of the emerging form. To this end, we conducted a narrative-elicitation task with 5-year-old children and adults across four typologically-distinct languages with different new/given markers: English, Hindi, Mandinka and Spanish. Our starting assumption was that the Hindi numeral ‘ek’ (one) is developing into an indefinite article, which should delay children’s acquisition because of its optional use to introduce discourse referents. Supporting the Optionality Hypothesis, Experiment 1 revealed that obligatory markers are acquired earlier than optional markers. Experiment 2 focused on Hindi and showed that 10-year-old children’s use of ‘ek’ to introduce discourse characters was higher than 5-year-olds’ and comparable to adults’, replicating this pattern of results in two different cities in Northern India. Lastly, a follow-up study showed that Mandinka-speaking children and adults made use of all available discourse markers when tested on a familiar story, rather than with pictorial prompts, highlighting the importance of using culturally-appropriate methods of narrative elicitation in cross-linguistic research. We conclude by discussing the implications of article grammaticalization for common ground management in a speech community.

Main Content

1. Introduction

Keeping track of referents in discourse requires monitoring which referents are new and which referents are given. In English, for example, a discourse referent can be marked as new through the use of an indefinite article (e.g., This morning we saw a fox in the garden), and later be marked as given by using a pronoun or a definite article (e.g., When it saw us, the fox got scared and ran away). Languages vary in the way in which they mark new and given referents, not only in form (e.g., whether they use articles or other discourse markers with the same function) but also in optionality (i.e., the degree to which the use of markers to signal a certain discourse function is optional or obligatory). By their very nature, optional discourse markers are used less consistently than obligatory ones, resulting in greater variability in the input that children receive. As a result, optionality has been found to affect children’s acquisition of discourse markers, with obligatory markers emerging earlier than optional ones (Berman & Slobin, 1994; Hickmann et al., 1996). Here we studied children’s production of discourse markers across four typologically-distinct languages, which differ in the degree to which the marking of new and given referents is optional: English, Hindi, Mandinka and Spanish (with samples collected in the United Kingdom, India, The Gambia and Spain, respectively).1

The main aim of this study was to investigate the effect of optionality on children’s acquisition of discourse markers. More specifically, we were interested in a particular type of optionality: that observed during a process of grammaticalization. For instance, the diachronic process whereby numerals are grammaticalized into indefinite articles (e.g., when the Old English form for ‘one’ gave rise to the indefinite article a in modern English) has been documented in many different languages (see Givón, 1981). However, from the point in time when the numeral ‘one’ starts being used to introduce new discourse referents until it becomes an indefinite article that must be used obligatorily with the same discourse function (Heine, 1997), the use of this form remains optional. Generalizing what we know from the language acquisition literature (e.g., Slobin, 1985; Berman & Slobin, 1994; Hickmann et al., 1996; Hickmann & Hendriks, 1999; MacWhinney, 2001; Narasimhan & Dimroth, 2008), we hypothesize that the acquisition of optional markers undergoing a process of grammaticalization should be protracted relative to fully grammaticalized markers.

A potential process of grammaticalization of the numeral ‘one’ into an indefinite article has recently been identified in Chinese (Chen, 2004; Wong, 2016) and Polish (Hwaszcz & Kędzierska, 2018). Here we investigated the case of Hindi, which, like Chinese and Polish, does not have an article system, but allows for the use of the numeral ‘one’ to introduce new discourse referents (Kachru, 1980, 2006; Dayal, 2004, 2018; Sharma, 2005). Based on a linguistic analysis of diachronic change and Hindi semantics, our starting assumption was that Hindi is undergoing a process of grammaticalization of its numeral ek into an indefinite article. However, the goal of this study was not to test this assumption, but rather to investigate the effect of optionality on children’s acquisition of discourse markers, understanding grammaticalization processes as a special kind of optionality.

To this end, children’s and adults’ use of the Hindi numeral ek to introduce discourse characters was compared with the use of indefinite articles in English and Spanish (which is obligatory for the same discourse function), and the use of bare nouns in Mandinka. Like Hindi, Mandinka also lacks articles, but to the best of our knowledge, it does not use the numeral ‘one’ (kiiling) for reference introduction, offering another interesting comparison with the use of ek in Hindi. Regarding the marking of familiar referents, this is obligatory in all four languages, which vary in the grammatical forms available for this discourse function. The second experiment in the study exclusively focused on Hindi, extending the initial investigation to older children and two additional locations in Northern India.

2. The acquisition of discourse functions in narratives

Previous studies on the early stages of acquisition of the new/given distinction focused on spontaneous production in young children’s speech (see Hickmann et al., 2015). However, studies focusing on later stages of development recognized the importance of eliciting child data in extended discourse, as it allows for a greater range of variation in referential strategies (for a review, see Berman, 2015). In particular, third-person narratives have been frequently used because they allow researchers to test the development of appropriate forms of reference for different functions. Bamberg (1986, 1987) identified three discourse functions in narratives: (i) introductions for new characters, (ii) maintenance for referents that have been mentioned in the immediately preceding context, and (iii) reintroduction for referents that have been mentioned before but not in the immediately preceding context.
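Bamberg's three-way classification can be sketched as a simple coding procedure. The following is a minimal illustration, not the coding scheme used in this study: the function name, the data layout, and the reading of "immediately preceding context" as "the previous utterance" are all assumptions made for the example.

```python
from enum import Enum


class DiscourseFunction(Enum):
    INTRODUCTION = "introduction"
    MAINTENANCE = "maintenance"
    REINTRODUCTION = "reintroduction"


def code_discourse_functions(utterances):
    """Label each character mention with one of the three discourse functions.

    `utterances` is a list of sets; each set holds the characters mentioned
    in one utterance, in narrative order (hypothetical input format).
    """
    mentioned_before = set()  # every character mentioned so far
    previous = set()          # characters in the immediately preceding utterance
    coded = []
    for current in utterances:
        labels = {}
        for character in current:
            if character not in mentioned_before:
                # First mention in the narrative.
                labels[character] = DiscourseFunction.INTRODUCTION
            elif character in previous:
                # Assumption: "immediately preceding context" = previous utterance.
                labels[character] = DiscourseFunction.MAINTENANCE
            else:
                # Mentioned earlier, but not in the previous utterance.
                labels[character] = DiscourseFunction.REINTRODUCTION
        coded.append(labels)
        mentioned_before |= current
        previous = set(current)
    return coded


# A toy narrative: the dog is introduced, maintained, dropped, then reintroduced.
narrative = [{"dog"}, {"dog"}, {"bird"}, {"dog", "bird"}]
coded = code_discourse_functions(narrative)
```

In the toy narrative, the final utterance reintroduces the dog while maintaining the bird, which is the configuration in which children's pronoun use is reported to become ambiguous.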

Cross-linguistic studies have found that children as young as 3 years use pronouns appropriately for character maintenance (Hickmann & Hendriks, 1999; Vion & Colas, 1999; Wong & Johnston, 2004). These studies also showed that young children rarely use pronouns to introduce new characters, opting for indefinite and definite nominals to mark that function. However, they often use pronouns inappropriately for character reintroductions, resulting in ambiguous reference. Findings from studies of naturalistic dialogue in various languages confirm the above observations (e.g., Orvig et al., 2010), with some studies reporting appropriate use of light forms (including overt and covert pronouns) for reference maintenance as early as in the two-word stage (Serratrice, 2005; Hughes & Allen, 2013). By the age of 2;5, English-speaking children also demonstrate sensitivity to discourse structure in disambiguating pronominal reference by interpreting pronouns as referring to the most prominent character, even when only the first mention and subject status of the noun are used to establish its prominence in the preceding context (Song & Fisher, 2007).

Findings from cross-linguistic studies of reference in extended discourse also report an early sensitivity to discourse continuity and the use of appropriate forms for familiar characters by the age of 4, whereas new characters are not introduced appropriately before the age of 7, and character reintroduction is not mastered until age 10 (Hickmann et al., 1996; Hickmann & Hendriks, 1999). In conclusion, the results of previous studies suggest the following order of acquisition of discourse functions in narratives across languages: Maintenance > Introduction > Reintroduction. Interestingly, the same order of acquisition of discourse functions has been reported in studies of second language learning (e.g., Jarvis, 2002; Sharma, 2005).

Wong and Johnston (2004) argue that the order of acquisition of the three discourse functions results from the different cognitive demands that they pose on the speaker, as speakers must maintain a mental model of the ongoing discourse in order to keep track of the listener’s knowledge (Levelt, 1989). In their view, introduction and maintenance require information about the listener’s knowledge to be updated, but reintroduction makes a further demand: that the listener’s focus of attention be monitored. Thus, according to Wong and Johnston (2004), young children may be familiar with the appropriate linguistic forms for reintroduction (e.g., definite articles), but their failure to track the listener’s attentional focus leads to inappropriate referencing.

A close comparison of the acquisition of appropriate markings for the three discourse functions across typologically-distinct languages reveals that the linguistic properties of the systems also play a role in acquisition. Some markings operate directly on the noun phrase and are therefore known as local markers (e.g., articles in English), while markings operating at the sentence level are known as global markings (e.g., in Chinese, all new referents occur in post-verbal position). Hickmann et al. (1996) investigated the acquisition of discourse markers in English, French, German and Chinese and observed that optional local markers of newness are infrequent in all the languages included in their study until 5 years of age. Interestingly, the obligatory global markings emerged later than the optional local markers in Chinese, becoming frequent by age 7. Hickmann et al. (1996) argue that the delayed acquisition of global markings (even when obligatory) is due to their functional complexity since there is no one-to-one mapping between sentence position and discourse function (see also Slobin, 1985; MacWhinney, 2001; Narasimhan & Dimroth, 2008).

In a more recent study, Aksu-Koç and Nicolopoulou (2014) compared children’s use of local and global markings to differentiate new and given referents in three languages: Greek, English and Turkish. Their findings show that children as young as 3 predominantly used indefinite forms for character introductions in Greek, but they did not do so until the age of 5 in English (which has a morphologically poorer article system, while Greek has a bundle of co-occurring features supporting the formation of form-function relations) and also in Turkish (which uses global markings).2 Thus, while obligatoriness may facilitate marker acquisition, global markings appear to be particularly difficult to acquire, even when they are obligatory.

Children’s use of referential forms in narrative-elicitation tasks can also be influenced by the previous discourse and by the listener’s perceptual access to the pictures. English-speaking children as young as 2;5–3;5 years old are sensitive to the type of question used in narrative elicitation (e.g., What is X doing? vs. What is happening?; Campbell et al., 2000; Matthews et al., 2006). Cross-linguistic studies have also reported that children show a developing sensitivity to the interlocutor’s perceptual access to the pictures, using light referential forms more often when they are narrating the story to an interlocutor who has perceptual access to the pictures, than when the interlocutor is ignorant (Hickmann et al., 1996; Serratrice, 2008). However, this is a late emerging sensitivity and is only systematically observed at around age 9.

The present study focused on local markings in the four languages we investigated and examined the role that optionality plays in the acquisition of those markings. Since previous work suggests that obligatory local markings emerge by the age of 3 (Berman & Slobin, 1994) and optional local markings are frequently used by the age of 5 (Hickmann et al., 1996), we compared 5-year-olds’ use of discourse markers of newness and givenness to their adult counterparts across the four languages in our study. In addition, because previous studies report earlier sensitivity to discourse structure than to the interlocutor’s perceptual access, in this study we used a general event-focus question – ‘What is happening here?’ – to elicit narratives in contexts with shared perceptual access between the participant and the experimenter.

Relative to previous cross-linguistic studies, our work aims to deepen our understanding of the effect of optionality on the acquisition of discourse markers by focusing on the grammaticalization of the Hindi numeral ek into an indefinite article, and comparing reference introduction in Hindi to other languages with and without indefinite articles. To the best of our knowledge, this is the first study to treat grammaticalization as a stage of optional use of the emerging form, therefore exploring the implications of diachronic change for language acquisition.

3. The grammaticalization of ‘one’ into an indefinite article

Language acquisition studies normally adopt a synchronic viewpoint, treating languages as fixed systems that young children must acquire through exposure. However, we know from diachronic linguistics that languages are constantly in flux (Heine & Kuteva, 2006). Thus, studying the acquisition of discourse markers in languages where articles are in the process of being grammaticalized sheds light on how optionality affects the acquisition of discourse markers, while providing us with an opportunity to study language change synchronically.

In his investigation of the development of the numeral ‘one’ into an indefinite article in Israeli Hebrew, Givón (1981) characterizes this process of language change as universal, having been independently attested in Germanic, Romance, Mandarin, Sherpa, Hungarian, Neo-Aramaic, Persian, Turkish and various Amerindian and Austronesian languages, in addition to being a hallmark of all Creole languages (p. 35). Heine (1997) proposed five stages in the process of grammaticalization of the numeral ‘one’ into an indefinite article: (i) the word ‘one’ is only used as a numeral; (ii) it is used to introduce characters in the discourse; (iii) it is used to mark specific indefiniteness; (iv) it is used to mark non-specific indefiniteness; and (v) it is completely grammaticalized as an indefinite article occurring with all types of nouns.

Since character introduction occurs early in the process of grammaticalization, languages at this stage do not yet mark new discourse referents obligatorily, resulting in children receiving an inconsistent input. Of the four languages in our sample, English and Spanish require the use of indefinite articles to introduce new discourse characters, whereas Hindi lacks an article system. However, Dayal (2018) has recently argued that the numeral ek ‘one’ can be used to introduce new discourse referents in Hindi and is frequently used with that discourse function, although it does not satisfy diagnostic semantic tests for indefinite articles (Chierchia, 1998; for discussion, see Dayal, 2018). While Dayal’s semantic analysis of ek confirms that it is not an indefinite article, it is important to note that the relative frequency with which it is used to introduce discourse referents in Hindi is open to empirical investigation. Compatible with Dayal’s synchronic analysis, here we assume that Hindi is in the second stage of Heine’s (1997) grammaticalization scale (i.e., the numeral ‘one’ can be used to introduce new discourse characters), and investigate (a) the frequency with which ek is used with that narrative function by both adults and children, and (b) how optionality affects the acquisition of this discourse marker relative to languages with indefinite articles, such as English and Spanish.
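Heine's (1997) scale and the notion of optionality adopted here can be made concrete with a small sketch. The stage labels below are descriptive names chosen for this example rather than Heine's own terminology, and placing English and Spanish at the final stage is our reading of the statement that both languages have obligatory indefinite articles. Mandinka is left out because its status is undocumented.

```python
from enum import IntEnum


class HeineStage(IntEnum):
    """Five stages in the grammaticalization of 'one' (labels are illustrative)."""
    NUMERAL_ONLY = 1            # 'one' is only used as a numeral
    CHARACTER_INTRODUCTION = 2  # 'one' introduces characters in the discourse
    SPECIFIC_INDEFINITE = 3     # 'one' marks specific indefiniteness
    NONSPECIFIC_INDEFINITE = 4  # 'one' marks non-specific indefiniteness
    INDEFINITE_ARTICLE = 5      # fully grammaticalized indefinite article


def can_introduce_characters(stage):
    """From stage 2 onwards, 'one' is available for character introduction."""
    return stage >= HeineStage.CHARACTER_INTRODUCTION


def is_optional_marker(stage):
    """Optional: available for introduction but not yet an obligatory article."""
    return HeineStage.CHARACTER_INTRODUCTION <= stage < HeineStage.INDEFINITE_ARTICLE


# Assumed stages: Hindi follows the paper's starting assumption; English and
# Spanish are placed at stage 5 because their indefinite articles are obligatory.
ASSUMED_STAGE = {
    "Hindi": HeineStage.CHARACTER_INTRODUCTION,
    "English": HeineStage.INDEFINITE_ARTICLE,
    "Spanish": HeineStage.INDEFINITE_ARTICLE,
}

optional_languages = [
    language for language, stage in ASSUMED_STAGE.items() if is_optional_marker(stage)
]
```

Under these assumptions, only Hindi falls in the window where the form is available for character introduction but its use remains optional.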

The fourth language in our sample, Mandinka, is similar to Hindi in that it does not have an article system, although it has a numeral ‘one’ (kiiling) which could in principle be used to introduce new discourse referents (as has recently been reported in other languages without articles, such as Chinese (Chen, 2004; Wong, 2016) and Polish (Hwaszcz & Kędzierska, 2018)). The comparison between Mandinka and Hindi is also interesting because both The Gambia and India were British colonies and many schools are English-medium. Since language contact is an important driver of language change (Heine & Kuteva, 2006), if the numeral ‘one’ were being grammaticalized as an indefinite article in Hindi or Mandinka, this process could be related to the contact of those languages with English (although this hypothesis was not tested in our study).

It must be noted, however, that Mandinka is an under-studied language and it has therefore not been documented whether its numeral ‘one’ is used to introduce new discourse referents. This would not be a surprising development, given the universality of this particular instance of language change (Givón, 1981). Therefore, the present study investigated the possible emergence of an indefinite article in Mandinka, in comparison with Hindi.

Finally, while indefinite articles tend to emerge from the numeral ‘one’, the universal path for the emergence of definite articles starts with exophoric demonstratives, which evolve into anaphoric demonstratives and later into definite articles (see Greenberg, 1978; Lyons, 1999). Amongst the languages in our sample, English and Spanish have fully grammaticalized definite articles, whereas, to the best of our knowledge, anaphoric demonstratives in Hindi and Mandinka are not undergoing a process of grammaticalization into definite articles. Therefore, our investigation of the use of markers of familiarity in these four languages will not look at the effect of grammaticalization. However, we will return to the grammaticalization of definite and indefinite articles and their connection to common ground management and Theory of Mind development at the end of the paper (see also Rubio-Fernandez, 2020).

4. Marking new/given referents in English, Spanish, Hindi and Mandinka

Below is a brief description of the referential systems of the languages included in our study:

English

English is a West Germanic language with default SVO order, which is fairly rigid and can be used to express grammatical relations between arguments. Regarding the new/given distinction, English has a grammaticalized article system, with indefinite articles being mandatory to introduce new referents, and definite articles marking given referents. There is no gender or number agreement in articles. Demonstratives can also be optionally used to mark givenness, and they are only marked for number. Given referents are also signaled by the use of pronouns, which have number agreement. In addition, third-person singular pronouns establish a three-way gender distinction (she/he/it). Bare nouns are used for mass nouns (e.g., rice or water) and for generic readings of plural nouns (e.g., Tigers live in jungles). However, singular bare nouns are grammatical in very few contexts by comparison (e.g., location (John is on television) or coordination (Kate ate with knife and fork)). Global markings like subject-verb inversions are also used in certain contexts to mark newness, but they are less frequent (e.g., Dancing in the rain was a girl).

Hindi

Hindi is an Indo-Aryan language spoken in the Indian subcontinent. It has a relatively free word order with SOV as the default. Word order is associated with information structure and all major constituents can be scrambled. Sentence-initial noun phrases generally refer to given entities in the discourse and new entities occupy the immediately preverbal position. Nouns scrambled to the sentence-initial position get a definite interpretation (Kidwai, 2000).

Hindi does not have an article system, but optional local markers (such as numerals, demonstratives and case marking), as well as optional global markings (such as word order variations) can be used to mark the givenness of a referent. The latter possibility, however, is limited to specific syntactic constructions (with inanimate referents in direct object position) and is not generally available for every utterance. Third person pronouns are marked for number in some dialects of Hindi, but are not marked for gender, and can be used as demonstratives. The verb is inflected for gender agreement with the subject or the object depending on the construction type. Pronouns can be dropped in subject position.

Bare nouns usually have a definite interpretation in Hindi. However, in specific contexts, such as the direct object position (1) and locative constructions (2), bare nouns get an indefinite reading (for discussion, see Dayal, 2018):

    (1) mɛ̃   kitāb   paṛh-rahī-thī.
        I    book    read-PROG-PAST
        ‘I was reading a book.’

    (2) kamre-mẽ   čūhā    hɛ.
        room-in    mouse   is
        ‘There is a mouse in the room.’

Of the available optional local markers, the numeral ek ‘one’ is the most frequently used for marking new referents and it has even been claimed that it is required to introduce discourse characters (Dayal, 2018), although this claim has not yet been empirically tested.

Mandinka

Mandinka is a Mandé language spoken in The Gambia and Senegal, amongst other West African countries. It has a rigid SOV order and lacks an article system. A nominal marker -o, which has sometimes been called a default marker, occurs obligatorily with most nouns. However, in certain contexts, such as negative or interrogative clauses and NPs including a numeral, the default marker -o can be dropped, and its absence impacts the meaning of the NP. The following is an example from Creissels (2012a) illustrating the interaction between -o marking and negation in Mandinka (note that musu + lexical marker -o > mus-ôo; PF = Perfective):

    (3) a. Ŋ́     ŋá       mus-ôo    jé.
           1SG   PF.POS   woman-D   see
           ‘I saw the/a woman.’

        b. Ŋ́     máŋ      mus-ôo    jé.
           1SG   PF.NEG   woman-D   see
           ‘I did not see the woman.’

        c. Ŋ́     máŋ      musu      jé.
           1SG   PF.NEG   woman     see
           ‘I did not see any woman.’

Historically, the lexical marker -o evolved from the demonstrative woo. Creissels (2012a, 2012b, 2020, forthcoming) argues that at some point in the evolution of the language, -o probably marked definiteness but lost that function over time (for a classic study of this pattern of language change in other African languages, see Greenberg, 1978). Nowadays, the default marker -o does not mark (in)definiteness.

Demonstratives ñǐŋ ‘this’ and wo ‘that’ can be optionally used in Mandinka to mark given information. An indefinite determiner dóo ‘some’ is also available, but it is not frequently used for reference introduction. The third-person pronoun (à) does not encode gender or animacy and cannot be omitted in subject position. Unlike the other three languages in our sample, Mandinka does not have an extensive written tradition.

Spanish

Spanish is a Romance language with SVO order, although it is not as rigid as English. Spanish has grammaticalized articles that are inflected for gender and number. It is obligatory to use indefinite articles to introduce new discourse referents (e.g., Compramos una casa ‘(We) bought a house’), and definite articles (e.g., Compramos la casa ‘(We) bought the house’) or pronouns (e.g., La compramos ‘(We) bought it’) are used for familiar discourse referents. A 3-way demonstrative system (este/ese/aquel) can be optionally used to mark givenness, and like articles and pronouns, demonstrative forms are inflected for gender and number. Generics are expressed with the use of the definite article (e.g., Los tigres viven en la jungla ‘[The] tigers live in the jungle’), unlike in English (which uses bare plural nouns). Bare nouns are rarer in Spanish than in English, with only mass nouns (e.g., Necesito café ‘(I) need coffee’) and plural count nouns (Compré flores ‘(I) bought flowers’) being allowed in direct object position. Spanish is a pro-drop language and pronouns in subject position are therefore often omitted.

5. Discourse markers investigated in the present study

As the above descriptions suggest, the four languages in our study vary in the forms available to mark discourse functions, as well as in their optionality. Given this variability, we focused on the most frequent forms used by adult native speakers of each language to introduce discourse characters and maintain reference in folk narratives elicited during pilot work. In the actual study, we compared children’s and adults’ use of indefinite articles as obligatory markers of newness in English and Spanish, Hindi’s optional use of the numeral ek with the same discourse function, and Mandinka’s use of bare nouns bearing the default lexical marker -o. While this lexical marker does not signal (in)definiteness (unlike indefinite articles and the numeral ‘one’ in the other languages), bare nouns were the most frequent form used by adult native speakers of Mandinka to introduce discourse characters during pilot work and were therefore coded as the appropriate form for reference introduction in the study.

The relative optionality of the newness markers coded in these languages is therefore the following: Hindi > English > Spanish > Mandinka. Hindi does not have any obligatory markers to introduce new discourse referents but does optionally use certain markers with this function. This makes newness marking more inconsistent in Hindi than in the other languages. English and Spanish both have obligatory article systems, but English has more permissible uses of bare nouns (in specific constructions) compared to Spanish. Finally, according to the available documentation (see Creissels, 2012a, 2012b, 2020, forthcoming), Mandinka lacks obligatory newness markers and employs a default lexical morpheme for all nouns. It must be noted, however, that Mandinka is not as well documented as the other languages in our sample, and it is therefore possible that Mandinka speakers use their numeral ‘one’ to introduce new discourse referents due to the influence of English (especially in the case of children, who are schooled in English).

In contrast with the marking of new discourse referents, marking familiar referents is obligatory in the four languages in our sample. Indeed, English, Hindi, Mandinka and Spanish all have pronouns, which are used for reference maintenance. In addition, English and Spanish have definite articles that are used to mark given discourse referents, while Mandinka uses demonstrative determiners and Hindi uses bare nouns with the same discourse function – as documented in pilot work. Therefore, for the marking of familiar referents, we compared children’s and adults’ use of definite articles in English and Spanish, Hindi’s use of bare nouns, and Mandinka’s use of demonstrative determiners. In addition, we compared the use of pronouns for reference maintenance in the four languages.

Regarding the marking of familiar referents, it is interesting to note that speakers of all four languages in the study had at least two options (e.g., using a pronoun or a definite article in Spanish). However, this is different from Hindi speakers’ arbitrary choice between marking and not marking new discourse characters (i.e., using ek plus a noun vs. a bare noun). Whereas pronouns and definite articles, or pronouns and demonstratives all signal familiarity, their use is not always interchangeable (with different researchers distinguishing the use of these forms in terms of salience, prominence, or accessibility; e.g., Ariel, 1990; Gundel et al., 1993; Arnold, 1998). However, to the best of our knowledge, the variability observed in Hindi speakers’ use of ek for reference introduction does not depend on information structure, and reflects, instead, a genuinely optional choice on the part of the speaker, which we predict should delay its acquisition relative to obligatory forms in other languages.

6. The Optionality Hypothesis and experimental predictions

The optionality of the available discourse markers varies across the four languages in our study, which allowed us to test the Optionality Hypothesis: the acquisition of optional markers is protracted relative to obligatory markers, as suggested by numerous cross-linguistic studies (e.g., Berman & Slobin, 1994; Hickmann et al., 1996; Hickmann & Hendriks, 1999; Serratrice, 2005; Schaeffer & Matthewson, 2005; Guasti et al., 2008; Rozendaal & Baker, 2008; Hughes & Allen, 2013; Bassano, 2015). While the use of ek to introduce new characters is optional in Hindi, the use of indefinite articles with the same discourse function is obligatory in English and Spanish, and Mandinka lacks a specific marker for reference introduction (with speakers using the lexical marker -o as a default). In contrast with the differences in optionality observed for reference introduction, marking familiar characters is obligatory in all four languages. It must be noted, however, that each language has different grammatical means to mark familiar referents, and the above studies have also documented differences in the acquisition of different givenness markers (e.g., pronouns are acquired earlier than definite articles, even though both signal familiarity).

On the basis of the Optionality Hypothesis, and given the grammatical characteristics of the referential systems of the languages investigated, we made the following predictions:

  1. Regarding the marking of new discourse referents, we predicted that 5-year-old children would have already acquired adult-like use of the default lexical marker in Mandinka (with bare nouns being the most frequent form used for character introduction). In Spanish, we predicted that children of the same age would have close to adult-like performance because indefinite articles are obligatory for introducing new characters. In English, we predicted that 5-year-old children would be less adult-like in their use of indefinite articles than Spanish-speaking children because, although indefinite articles are obligatory for reference introduction, bare nouns are overall more permissible in English than in Spanish. Finally, we predicted that 5-year-old Hindi-speaking children would differ the most from adults, since marking new referents with the numeral ‘one’ is optional and therefore less frequent than the use of indefinite articles in English and Spanish.

  2. Regarding the marking of given referents, we expect to see cross-linguistic differences related to the specific forms available in each language to mark familiarity. For example, the use of bare nouns to maintain reference may emerge earlier in Hindi than the use of definite articles with the same discourse function in English or Spanish. On the other hand, numerous studies have shown that the use of pronouns to refer to familiar characters is mastered very early on in the acquisition of discourse-pragmatic skills (see Hickmann et al., 2015). We therefore predict that 5-year-olds will reveal adult-like pronoun use in the four languages, while they may lag behind in their use of other markers of familiarity (e.g., definite articles).

  3. Regarding cross-linguistic differences in children’s use of definite articles to signal familiarity, we predicted that English-speaking children would produce fewer definite articles than their adult counterparts, in contrast with Spanish-speaking children. This pattern of results would replicate previous findings comparing the acquisition of definite articles in Germanic and Romance languages, with determiner omission occurring more frequently and lasting longer in Germanic languages (e.g., Guasti et al., 2008; Rozendaal & Baker, 2008; Bassano et al., 2011).

7. Experiment 1

7.1 Methods

Participants

The number of participants recruited for this study was limited by the time and resources available during fieldwork. Four groups of 20 children (English: M = 5;5, range: 5;0–5;11; Hindi: M = 5;7, range: 4;8–6;2; Mandinka: M = 5;5, range: 5;3–6;0; Spanish: M = 5;5, range: 4;10–5;9) and four groups of 15 adults who served as controls were recruited from Edinburgh (Scotland), Delhi (India), Brikama (The Gambia) and Asturias (Spain) and tested in their native languages (i.e., English, Hindi, Mandinka and Spanish, respectively). The Hindi-speaking children and the Mandinka-speaking children were schooled in English. However, their use of English was limited to the classroom.

Materials and procedure

A series of 14 pictures featuring one or two animal characters carrying out actions was adapted from Long et al. (under review) (for two sample displays of the visual materials, see Figure 1; for the remaining displays plus a comparison with standard narrative-elicitation resources MAIN and ENNI, see Supplementary Materials). A total of six possible characters – a bunny (female), a duck (female), a goose (female), a dog (male), a pig (male), and a bird (male) – were combined to create the pictures. The clothing of the animals signaled their gender. All participants were presented with the 14 pictures in the same order. Pictures were shown one at a time on a computer screen and depicted sequential series of events (e.g., Panel 1 showed a dog reading in bed and Panel 2 showed the same dog falling asleep in the same setting), as well as a few isolated events involving the same characters (e.g., a bird collecting firewood > the same bird playing football). Characters appeared consecutively in 2–5 pictures.

Figure 1

Two sample displays from the visual materials used in Experiment 1 and Experiment 2 to elicit narratives from children and adults.

Of interest was the way in which children referred to new characters (who had not yet been introduced) and familiar characters (who had already been introduced) relative to adults across the four languages. We therefore focused on two of the three main discourse functions: introduction and maintenance. Overall, our interest was in children’s marking of new vs. given referents, and we did not investigate the topic/comment distinction or other related dichotomies (see Krifka, 2008; Dimroth & Narasimhan, 2012).

Before the task, participants were told they would see a series of pictures with animals on the computer screen and they had to tell the experimenter what was happening in each picture. If the participant did not mention any of the characters in a given picture, one follow-up question was asked: ‘What else is happening?’ Participants’ responses were audio recorded for transcription and coding purposes.

Data treatment and coding

Given that the selected languages vary in how they mark new and given information, we created the following coding system for analysis purposes (see Table 1). Within the context of a narrative, A responses are appropriate for introducing new characters, whereas B and C responses are appropriate for referring to familiar characters. Response types A, B and C were selected during pilot work as the most frequent forms used by adult native speakers of each language to introduce and maintain reference. Note that we only coded reference to story characters, so mentions of objects or other elements in the visual scenes were not coded or analyzed.

Table 1

Coding of new and given markers in each of the languages.

| Discourse function | Coding | English | Hindi | Mandinka | Spanish |
| --- | --- | --- | --- | --- | --- |
| Character introduction | A | Indefinite article | Numeral ‘one’ | Bare noun | Indefinite article |
| Reference maintenance | B | Definite article | Bare noun | Demonstrative + noun | Definite article |
| Reference maintenance | C | Pronoun | Pronoun (overt or not) | Pronoun | Pronoun (overt or not) |

Narratives were elicited by a native speaker of the language with training in linguistics. Narratives were audio recorded so that they could be first transcribed into the original language, and then coded according to the A-B-C scheme established through pilot work. The Mandinka narratives were also translated (verbatim) to English for cross-checking. All instances of A, B and C responses were coded accordingly, regardless of the discourse function of the specific expression (e.g., an indefinite description was coded as A in both English and Spanish, whether or not the description was used to introduce a new discourse character). Following Bamberg (1986, 1987), each reference to a story character was also coded for discourse function: first mention of a character was coded as Character Introduction, and subsequent references were coded as Reference Maintenance. Finally, separate statistical analyses were carried out for A, B and C responses for the corresponding discourse functions (i.e., A for Character Introduction and B and C for Reference Maintenance).
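The discourse-function step of this coding can be illustrated with a minimal Python sketch. The function name, the tuple layout and the example narrative are hypothetical and chosen for illustration only; the sketch assumes that each coded entry records which character was mentioned and which form (A, B, C or other) was used, in narrative order.

```python
from typing import List, Tuple

def code_discourse_function(mentions: List[Tuple[str, str]]) -> List[Tuple[str, str, str]]:
    """Label each (character, form) mention by discourse function.

    The first mention of a character is coded as Character Introduction;
    every later mention of the same character is Reference Maintenance.
    """
    seen = set()
    coded = []
    for character, form in mentions:
        function = "Reference Maintenance" if character in seen else "Character Introduction"
        seen.add(character)
        coded.append((character, form, function))
    return coded

# Hypothetical narrative: form codes follow the A-B-C scheme in Table 1.
narrative = [("dog", "A"), ("dog", "C"), ("bird", "A"), ("dog", "B")]
for row in code_discourse_function(narrative):
    print(row)
```

Keeping the form code and the discourse function as separate fields mirrors the procedure described above, in which forms were coded regardless of function and the two were only combined at the analysis stage.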

All data in Experiment 1 were re-coded by three coders with training in linguistics, who were blind to the aims of the study. Two coders were native Hindi speakers and the third one was a native Spanish speaker. All were proficient in English. In relation to the original coding, there were disagreements in 2.6% of entries (77 out of 2940), all of which were resolved through discussion.

7.2 Results

In Experiment 1, all responses were grammatical, although in 6 instances participants used a null argument for character introduction (e.g., ‘[ ] is chopping some wood’ for the first mention of the character), which is not pragmatically felicitous. For summary visualizations of all referential expressions used to mark new and given referents across languages and age groups in Experiment 1 (including those not coded for analysis), see Figure S1 in Supplementary Materials.

Analysis 1: Introducing new characters (A responses)

Using logistic mixed effects regression, we modelled the binary outcome variable of Response (A = 1, all else = 0) for New Characters, with Age Group (Adult, Child), and Language (English, Hindi, Mandinka, Spanish) as predictor variables. Sum contrast coding was used for Age (Adult = –.05, Child = .05) and Language was treatment-coded with Hindi as the reference level because we predicted that children would perform more similarly to adults in the other three languages. The alpha level for all reported tests in the study was set to p ≤ .05 and all analyses were run using R statistical software (R Core Team, 2019). All models were fit with the maximal random effects structure for Participants and Items (Barr, 2013).
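To make the predictor coding concrete, the following Python sketch builds the fixed-effects predictors for a single observation. It is an illustration only: the analyses reported here were run in R, the function and variable names below are hypothetical, and random effects are left out. The Age values are the ones stated above, and Hindi, as the reference level, is represented by zeros on all three Language indicators.

```python
AGE_CODES = {"Adult": -0.05, "Child": 0.05}          # sum contrast values as stated in the text
NON_REFERENCE = ["English", "Mandinka", "Spanish"]   # Hindi is the reference level

def design_row(age_group: str, language: str) -> dict:
    """Return the fixed-effects predictors for one observation."""
    age = AGE_CODES[age_group]
    row = {"Intercept": 1.0, "Age": age}
    for lang in NON_REFERENCE:
        dummy = 1.0 if language == lang else 0.0
        row[lang] = dummy                    # Language effect relative to Hindi
        row[f"Age x {lang}"] = age * dummy   # Age x Language interaction relative to Hindi
    return row

print(design_row("Child", "Hindi"))
print(design_row("Adult", "Spanish"))
```

Under this coding, each Language coefficient compares that language with Hindi at the midpoint between the two age groups (because the Age codes are centered on zero), and each Age × Language coefficient tests whether the difference between children and adults in that language departs from the corresponding difference in Hindi.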

As predicted, results revealed clear cross-linguistic developmental differences for character introduction (see Table 2 and Figure 2 for descriptive statistics; for the full model output, see Supplementary Materials). Spanish-speaking children were the only children in Experiment 1 to produce more A responses than the corresponding adults. This is because the Spanish-speaking adults also used other expression types commonly found in children’s narratives (e.g., using the animal name as a proper noun – La Señora Coneja ‘Mrs. Rabbit’, or a modified definite description suggesting the animal is a well-known story character – El cerdito valiente ‘The brave piglet’; see Figure S1 in Supplementary Materials).

Table 2

Rates of A responses for character introduction across languages and age groups.

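| Language | Age Group | Mean | SD | No. observations |
| --- | --- | --- | --- | --- |
| Hindi | Children | .10 | .30 | 12 |
| Hindi | Adults | .52 | .50 | 47 |
| English | Children | .57 | .50 | 68 |
| English | Adults | .73 | .44 | 66 |
| Spanish | Children | .89 | .31 | 107 |
| Spanish | Adults | .69 | .47 | 62 |
| Mandinka | Children | 1 | 0 | 120 |
| Mandinka | Adults | 1 | 0 | 90 |

Figure 2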

Mean proportions of A responses (English: indefinite article; Hindi: numeral ek; Mandinka: bare noun; Spanish: indefinite article) for character introduction in the two age groups across the four languages. Error bars represent 95% confidence intervals and points reflect participant means.
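Because the outcome is binary, the means and standard deviations in Table 2 follow from the response counts. The sketch below is a worked check under one assumption that the table does not state explicitly: that each participant introduced six characters, giving 120 possible introductions for the 20 children and 90 for the 15 adults in each language. The function name is hypothetical.

```python
import math

def binary_mean_sd(n_target: int, n_total: int):
    """Mean and sample SD of a 0/1 variable with n_target ones out of n_total."""
    p = n_target / n_total
    # Sample variance of a binary variable: n / (n - 1) * p * (1 - p)
    sd = math.sqrt(n_total / (n_total - 1) * p * (1 - p))
    return round(p, 2), round(sd, 2)

# Hindi rows of Table 2 (the denominators are the assumption stated above)
print(binary_mean_sd(12, 120))  # children
print(binary_mean_sd(47, 90))   # adults
```

With these denominators the sketch returns .10 and .30 for the Hindi-speaking children and .52 and .50 for the Hindi-speaking adults, in line with the tabled values.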

There were no main effects or interactions for Mandinka relative to Hindi. This is probably due to the complete uniformity in responses brought about by the ceiling effect in Mandinka-speaking children and adults, as shown in Table 2 (M = 1, SD = 0). For both Spanish and English, on the other hand, there was a main effect of Language relative to Hindi (both p’s ≤ .0001), with more A responses for new characters in Spanish and English than in Hindi. As predicted, there was an Age × Spanish interaction relative to Hindi (p = .0002). To follow up on this interaction, we fitted the same statistical model as above, first to the subset of Hindi speakers and then to the subset of Spanish speakers. Our results indicate that the interaction was driven by a significant difference in A responses between Hindi-speaking children and adults (p = .0050) but no difference between Spanish-speaking children and adults (p = .1480) (see Figure 2). The corresponding Age × English interaction relative to Hindi was marginally significant (p = .0533).

Analysis 2: Referring to familiar characters (B responses)

Using logistic mixed effects regression, we modelled the binary outcome variable of Response (B = 1, all else = 0) for Familiar characters, with Age Group (Adult, Child), and Language (English, Hindi, Mandinka, Spanish) as predictor variables. Sum contrast coding was used for Age (Adult = –.05, Child = .05) and in keeping with the previous model, Language was again treatment coded with Hindi as the reference level.

As predicted, there was an Age × English (p = .0061) and Age × Spanish (p = .0485) interaction relative to Hindi (see Table 3 and Figure 3 for descriptive statistics; for full model output see Supplementary Materials). Following up on these interactions (by running the same model on the relevant subsets of participants), we found no differences between Hindi- and English-speaking children (p = .4022) or Hindi- and Spanish-speaking children (p = .8828). Instead, these interactions were driven by differences between Hindi- and English-speaking adults (p = .0062) and Hindi- and Spanish-speaking adults (p = .0317). As shown in Figure 3, adult Hindi speakers’ rate of B responses is much lower than that of adult English and Spanish speakers. This is because there is greater variability in the expressions Hindi-speaking adults use (see Figure S1 in Supplementary Materials). This variability lends support to our characterization of Hindi as having more optionality than the other languages in the study.

Table 3

Rate of B responses for reference maintenance across languages and age groups.

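| Language | Age Group | Mean | SD | No. observations |
| --- | --- | --- | --- | --- |
| Hindi | Children | .40 | .49 | 83 |
| Hindi | Adults | .20 | .40 | 37 |
| English | Children | .29 | .46 | 64 |
| English | Adults | .51 | .50 | 93 |
| Spanish | Children | .38 | .49 | 86 |
| Spanish | Adults | .48 | .50 | 88 |
| Mandinka | Children | .04 | .19 | 8 |
| Mandinka | Adults | .05 | .22 | 9 |

Figure 3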

Mean proportions of B responses (English: definite article; Hindi: bare noun; Mandinka: demonstrative determiner; Spanish: definite article) for reference maintenance in the two age groups across the four languages. Error bars represent 95% confidence intervals and points reflect participant means.

Our analysis also revealed a main effect of Age (p = .0423), with overall higher rates of B responses for adults (M = .31) than children (M = .27). This age difference suggests that B markers of reference maintenance might be acquired later than other markers such as pronouns, a finding which has previously been documented in the language acquisition literature (Hickmann & Hendriks, 1999; Vion & Colas, 1999; Wong & Johnston, 2004; Orvig et al., 2010). Analysis 3 (C responses) tested this hypothesis by assessing pronoun use for familiar characters.

Lastly, we found a main effect of Mandinka (p = .0010) relative to Hindi, with more B responses in Hindi. The low proportion of B responses from Mandinka-speaking children and adults was the only surprising result from this analysis. Indeed, unlike the adult participants in our pilot study, the Mandinka speakers in Experiment 1 continued to use bare nouns (A responses) when referring to the same characters in subsequent trials. This intriguing finding was further investigated in a follow-up study reported later in the paper.

Analysis 3: Referring to familiar characters (C responses)

Using logistic mixed effects regression, we modelled the binary outcome variable of Response (C = 1, all else = 0) for Familiar characters, with Age Group (Adult, Child), and Language (English, Hindi, Mandinka, Spanish) as predictor variables. Sum contrast coding was used for Age (Adult = –.05, Child = .05) and in keeping with the previous models, Language was again treatment coded with Hindi as the reference level.

In line with previous work which suggests that pronouns are acquired earlier than other markers of reference maintenance (Hickmann & Hendriks, 1999; Vion & Colas, 1999; Wong & Johnston, 2004; Orvig et al., 2010), we found that adults and children did not differ in their pronominal use, as evidenced by the absence of an Age × Spanish and Age × Mandinka interaction relative to Hindi (both p’s > .05) (see Table 4 and Figure 4 for descriptive statistics; for full model output see Supplementary Materials). However, we did find an Age × English interaction relative to Hindi (p = .0037). Follow-up analyses (using the same model as above, focusing on the subset of English speakers first, then Hindi speakers) showed that this effect was driven by a difference in pronominal use between English-speaking children and adults (p = .0010) but not between Hindi-speaking children and adults (p = .9700). As Figure 4 shows, English-speaking children use pronouns much more often than their adult counterparts. This low rate of pronominal use in English-speaking adults appears to drive the main effect of English relative to Hindi (p = .0354), whereby overall more pronouns are used in Hindi.

Table 4

Rate of C responses for reference maintenance across languages and age groups.

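| Language | Age Group | Mean | SD | No. observations |
| --- | --- | --- | --- | --- |
| Hindi | Children | .39 | .49 | 81 |
| Hindi | Adults | .39 | .49 | 73 |
| English | Children | .53 | .50 | 115 |
| English | Adults | .11 | .32 | 21 |
| Spanish | Children | .33 | .47 | 76 |
| Spanish | Adults | .37 | .49 | 68 |
| Mandinka | Children | .05 | .22 | 12 |
| Mandinka | Adults | .00 | .00 | 0 |

Figure 4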

Mean proportions of C responses (i.e., pronouns in all languages) for reference maintenance in the two age groups across the four languages. Error bars represent 95% confidence intervals and points reflect participant means.

One explanation for the difference between English-speaking adults and children is that English-speaking children experience a protracted acquisition of the definite article, as can be observed in Figure 3. This effect is much less pronounced in Spanish, where significant age-related differences did not emerge. To further explore this possibility, we directly compared definite article and pronoun use in English- and Spanish-speaking adults and children in Analysis 4.

Analysis 4: Use of definite articles vs. pronouns when referring to familiar characters in English and Spanish

It is well documented that young children use pronouns to mark givenness before they start using definite articles, although cross-linguistic differences have also been reported (e.g., Hickmann & Hendriks, 1999; Vion & Colas, 1999; Wong & Johnston, 2004; Orvig et al., 2010). Here we were interested in testing this effect in our English and Spanish data, since both languages have pronouns and definite articles. For this analysis, we focused on the subset of responses in which children and adults used either definite articles or pronouns when referring to familiar characters.

Using logistic mixed effects regression, we modelled the binary outcome variable of Definite article (Definite article = 1, Pronoun = 0) for Familiar characters, with Age Group (Adult, Child), and Language (English, Spanish) as predictor variables. Sum contrast coding was used for Age (Adult = –.05, Child = .05).

As predicted, our results revealed a main effect of Age (p = .0325), with children using definite articles less than adults (see Table 5 and Figure 5 for descriptive statistics; for full model output see Supplementary Materials).

Table 5

Rates of definite article use for familiar characters across age groups in English and Spanish.

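| Language | Age Group | Mean | SD | No. observations |
| --- | --- | --- | --- | --- |
| English | Children | .36 | .48 | 64 |
| English | Adults | .82 | .39 | 93 |
| Spanish | Children | .53 | .50 | 86 |
| Spanish | Adults | .56 | .50 | 88 |

Figure 5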

Rate of definite article vs. pronoun responses for reference maintenance in the two age groups in English and Spanish.
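The rates in Table 5 can be recovered from the B and C counts reported in Tables 3 and 4, since Analysis 4 is restricted to responses that used either a definite article or a pronoun. The following Python sketch works through that arithmetic; the dictionary layout and function name are hypothetical.

```python
# Counts of B (definite article) and C (pronoun) responses for familiar
# characters, taken from Tables 3 and 4.
counts = {
    ("English", "Children"): (64, 115),
    ("English", "Adults"): (93, 21),
    ("Spanish", "Children"): (86, 76),
    ("Spanish", "Adults"): (88, 68),
}

def definite_article_rate(n_definite: int, n_pronoun: int) -> float:
    """Proportion of definite articles within the definite-or-pronoun subset."""
    return n_definite / (n_definite + n_pronoun)

rates = {group: round(definite_article_rate(*c), 2) for group, c in counts.items()}
for group, rate in rates.items():
    print(group, rate)
```

Rounded to two decimals, the four rates are .36, .82, .53 and .56, matching the means in Table 5.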

The analysis also revealed an Age × Language interaction (p = .0037). To follow up on this, we conducted separate analyses on the subsets of Spanish and English speakers using the same model described above. Results revealed a significant age-related difference in the use of definite articles for English speakers (p = .0012), but not for Spanish speakers (p = .8750). These results suggest that 5-year-old English-speaking children have a preference for pronouns over definite articles compared to their adult counterparts, whereas 5-year-old Spanish-speaking children use definite articles to a similar extent as Spanish-speaking adults (see Figure 5). We interpret this difference in light of numerous cross-linguistic studies showing that determiners emerge earlier in Romance than Germanic languages, with determiner omission occurring more frequently and lasting longer in Germanic languages (e.g., Guasti et al., 2008; Rozendaal & Baker, 2008; Bassano et al., 2011). The typological contrast between Romance and Germanic languages is interesting because both types of languages use obligatory determiners with nouns (allowing bare nouns only in certain contexts) but differ along a number of features that are consistent with an earlier development of determiners in Romance languages (for discussion, see Bassano, 2015).

To the extent that the protracted acquisition of the definite article in Germanic languages (including English in our sample) is related to the more frequent use of bare nouns in specific syntactic constructions (relative to Romance languages, such as Spanish), this delay could be interpreted as indirectly supporting the hypothesis that more frequent markers are acquired earlier than less frequent ones (MacWhinney, 2001). In the case of the definite article, not only obligatory use to mark familiar nouns, but also frequent use in opposition to bare nouns seem to matter for early acquisition.

7.3 Discussion

The results of Experiment 1 supported the Optionality Hypothesis: a developmental difference was observed in the acquisition of newness markers (A responses) between a language with an optional marker (Hindi) and a language with an obligatory marker (Spanish). In addition, a similar developmental trend was observed between Hindi and English (which also has an obligatory marker). In line with the language acquisition literature, we also observed a main effect of age for givenness markers (B responses), with English- and Spanish-speaking children using fewer definite articles than their adult counterparts. By contrast, no main effect of age was observed in the use of pronouns to mark familiar characters (C responses). Finally, we also replicated the well-documented finding that determiners emerge earlier in Romance languages than in Germanic languages, with 5-year-old Spanish-speaking children producing definite articles at comparable rates to adults, whereas English-speaking children of the same age lagged behind relative to the adult group.

The results with Mandinka-speaking children and adults also supported the Optionality Hypothesis, showing that 5-year-old children used -o-marked bare nouns to introduce new characters at ceiling rates, just like their adult counterparts. These results also confirm that Mandinka speakers do not use their numeral ‘one’ (kiiling) to introduce new discourse characters, in contrast with speakers of other languages without articles, such as Chinese (Chen, 2004; Wong, 2016), Polish (Hwaszcz & Kędzierska, 2018) or Hindi (Dayal, 2004, 2018; Sharma, 2005). The comparison with Hindi is particularly interesting since both The Gambia and India were British Colonies and many of their children (including the ones in our study) are schooled in English. Thus, whereas contact with English could in principle result in the emergence of an indefinite article from the numeral ‘one’ in the two languages (Heine & Kuteva, 2006), this process only seems to be taking place in Hindi. The possible emergence of an indefinite article in Hindi was further investigated in Experiment 2.

Finally, an unexpected pattern of results was observed with both Mandinka-speaking children and adults, who barely used demonstratives or pronouns for reference maintenance (see Figures 3 and 4). Since either of those markers should be used obligatorily to signal familiar referents, the results of Experiment 1 suggest that Mandinka speakers may not have adopted a narrative stance when describing the vignettes, treating each picture as a separate scene and using bare nouns throughout the task. This possibility was further explored in a follow-up study, as it has important methodological implications for field linguists and language acquisition researchers working in non-Western societies.

8. Experiment 2

The first experiment revealed that 5-year-old Hindi-speaking children marked new characters less frequently than English- and Spanish-speaking children of the same age. Hindi has the option of using the numeral ek ‘one’ to introduce discourse characters, but it does not have a fully-grammaticalized indefinite article (Kachru, 1980, 2006; Dayal, 2004, 2018). The diachronic process whereby numerals are grammaticalized into indefinite articles has been documented in many languages (see Givón, 1981; Lyons, 1999), and our starting assumption was that Hindi is currently undergoing the same process. Such a grammaticalization process is compatible with Hindi-speaking adults using the numeral ek less frequently than Spanish- and English-speaking adults use the indefinite singular article to introduce new characters, which in turn explains the protracted acquisition of this discourse marker in Hindi-speaking children.

Experiment 2 had two main aims. The first was to further explore the developmental trajectory of ek acquisition. In this regard, we were interested in the age at which Hindi-speaking children would demonstrate adult-like use of the optional marker ek. According to MacWhinney (2001), children’s acquisition of a form-function mapping should be influenced by the frequency of its input, so we expected Hindi-speaking children to acquire the character-introduction function of the numeral ek later than children speaking languages with obligatory newness markers. Given that children have acquired the most sophisticated referential devices in a language by age 10 (e.g., the ability to appropriately reintroduce characters using definite articles; Hickmann et al., 2015), we tested a group of 10-year-old Hindi-speaking children from the same school in Delhi to see whether their use of ek in the narrative task was comparable to that of adults.

Second, the protracted acquisition of the numeral ek as a newness marker is compatible with the assumption that the Hindi numeral is undergoing a process of grammaticalization into an indefinite article. However, the results of Experiment 1 are also compatible with an alternative, more parsimonious hypothesis: the use of the numeral ek to introduce new discourse referents may be characteristic of a dialectal variety of Hindi spoken in the Delhi area.3 This is an important research question because previous analyses of the use of ek for reference introduction may have been based on speaker intuitions or observations for the dialectal variety of Hindi spoken in Delhi, or other similar dialectal varieties (e.g., Dayal, 2004, 2018; Sharma, 2005). The second aim of Experiment 2 was therefore to try to replicate the results of Experiment 1 outside of the Delhi area. To this end, the same narrative-elicitation task was used again in Experiment 2 with 5-year-olds, 10-year-olds and adults from Gorakhpur, a city in the North Eastern state of Uttar Pradesh, and with adults from Allahabad, a smaller city in the same Indian state.

In addition to the two main aims described above, Experiment 2 also had a secondary goal related to the first aim of the experiment (i.e., exploring the developmental trajectory of ek acquisition). Here we were interested in investigating the individual patterns of ek use produced by 5- and 10-year-old children and adults in Delhi and Gorakhpur. Using miniature artificial languages in the lab, Hudson Kam and Newport (2005, 2009) showed that 5- to 7-year-old children regularized the use of determiners even when receiving an inconsistent input, while adults learned determiners veridically, producing them according to their frequency in the input. The results of Experiment 1 confirmed that the use of ek to introduce new characters is not systematic in Hindi. However, unlike Hudson Kam and Newport, we do not have a way to assess the relative exposure of each age group to this novel use of the numeral ‘one’. Nevertheless, while our study lacks the experimental control of artificial language experiments, it offers naturalistic production data, which we believe is equally important in understanding language change and its implications for language acquisition. Experiment 2 therefore investigated whether Hindi-speaking children reveal more regular patterns of ek use than adults – either using it systematically to introduce new characters or never using it, in line with the lab results of Hudson Kam and Newport (2005, 2009).

8.1 Methods

Participants

Twenty 10-year-old children (M = 10;6; range: 9;8–11;0) were recruited from the same school in Delhi as in Experiment 1. These children were learning English in school and did not use English outside the classroom.

Twenty 5-year-old children (M = 5;5, range: 4;8–6;1), twenty 10-year-old children (M = 10;4, range: 9;9–11) and 15 adults were recruited from Gorakhpur, India. All participants were native speakers of Hindi. The children studied in one of two local schools, which are English-medium, and their use of English was limited to the classroom. The adults were all stay-at-home mothers, who had functional proficiency in English and used it in restricted situations (e.g., when helping their children with schoolwork).

Fifteen native Hindi-speaking adults were recruited from Allahabad, India. All participants had at most functional proficiency in English and used it in very restricted domains. Comparing the three groups of Hindi-speaking adults in the study, those from Delhi (Experiment 1) were on average the youngest (M = 25;7, range: 21–31), followed by those from Gorakhpur (M = 31;8, range: 27–38) and those from Allahabad (M = 39;7, range: 24–55). The level of English of these participants was not assessed. However, all the adults from Delhi had received higher education in English, whereas half of the adults from Gorakhpur and only two from Allahabad did.

Materials, procedure and coding

The materials, procedure and coding from Experiment 1 were used again in Experiment 2.

All data in Experiment 2 were re-coded by the two native Hindi speakers who had re-coded Experiment 1. In relation to the original coding, there were disagreements in 4.3% of entries (68 out of 1575), all of which were resolved through discussion.

8.2 Results

Analysis 5: The use of ek to introduce new characters across regions and ages

Using logistic mixed effects regression, we modelled the binary outcome variable of Use of ek (A = 1, all else = 0) for New Characters in Hindi, with Age (5-year-olds, 10-year-olds, and Adults), and Region (Delhi, Gorakhpur) as predictor variables. Age was treatment-coded with 10-year-olds as the reference level (since we were interested in whether their use of ek would more closely resemble adults’ referential behavior than that of young children), and Region was treatment coded with Delhi as the reference level, in line with Experiment 1.

Similar developmental patterns were found in the use of ek to introduce new characters across both regional varieties of Hindi (see Table 6 and Figure 6 for descriptive statistics; for the full model output, see Supplementary Materials). Summary visualizations and a technical description of the different referential expressions that were used to mark new and given referents in Hindi are reported in the Supplementary Materials (see Figs. S2 and S3; relatedly, see Sinha, 2009).

Table 6

Rates of ek use to introduce new characters across regions and age groups.

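| Region | Age group | Mean | SD | No. observations |
| --- | --- | --- | --- | --- |
| Delhi | 5-year-olds | .10 | .30 | 12 |
| Delhi | 10-year-olds | .62 | .49 | 74 |
| Delhi | Adults | .52 | .50 | 47 |
| Gorakhpur | 5-year-olds | .11 | .31 | 13 |
| Gorakhpur | 10-year-olds | .40 | .49 | 48 |
| Gorakhpur | Adults | .25 | .44 | 23 |

Figure 6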

Proportion of ek uses to introduce new discourse characters across three age groups and two regional varieties of Hindi. Error bars represent 95% confidence intervals and points reflect participant means.

The developmental patterns revealed a significant difference between 5-year-olds from Delhi and 10-year-olds from Delhi (p < .0001), with 10-year-olds using ek more frequently than 5-year-olds. However, 10-year-olds from Delhi did not differ from adults from Delhi (p = .6135), which suggests that by the age of 10, Delhi children’s use of ek is similar to adults’. The same developmental pattern was found when Gorakhpur was set as the reference level: 5-year-olds and 10-year-olds differed in their use of ek (p = .0050), but not 10-year-olds and adults (p = .1191). With regard to the two regional varieties of Hindi, there was no difference in the use of ek for 10-year-olds across regions (p = .0822). Taken together, these results confirm that similar developmental patterns in ek use emerged in the Delhi and Gorakhpur regions (see Figure 6).

Analysis 6: Individual patterns in the use of ‘ek’ across regions and ages

Following Hudson Kam and Newport (2005, 2009), we calculated the percentage of children and adults who (i) always used ek, (ii) never used ek, and (iii) sometimes used ek to introduce new story characters (see Table 7). As in the original studies, we applied a margin of one to this classification (i.e., participants who used ek never or only once were classified as systematic non-ek users, and those who used ek in all or all but one of their character introductions were classified as systematic ek users). Grouping those participants who were systematic in either using or not using ek for character introduction, we observed a general preference for systematicity, with most 5-year-olds, 10-year-olds and adults in both Delhi and Gorakhpur revealing a systematic pattern.
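The classification rule with a margin of one can be sketched in Python as follows. This is a minimal illustration rather than the procedure actually used: the function name is hypothetical, and the example assumes six character introductions per participant, which is an assumption based on the six story characters in the materials.

```python
def classify_ek_pattern(n_ek: int, n_introductions: int) -> str:
    """Classify a participant's use of ek for character introduction.

    A margin of one is applied: zero or one use counts as systematic
    non-use, and all or all-but-one uses count as systematic use.
    """
    if n_ek <= 1:
        return "systematic non-ek user"
    if n_ek >= n_introductions - 1:
        return "systematic ek user"
    return "variable ek user"

# Assumed: six character introductions per participant.
for n_ek in range(7):
    print(n_ek, classify_ek_pattern(n_ek, 6))
```

With six introductions, participants with two to four uses of ek fall into the variable category. The non-use check comes first, so with very few introductions a single use would still count as systematic non-use.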

Table 7

Percentage of participants (and actual number in parentheses) in each age group and region showing a systematic or variable pattern of ek use.

| Region | Age group | Systematic ek users | Systematic non-ek users | Total systematic users | Variable ek users |
| --- | --- | --- | --- | --- | --- |
| Delhi | 5-year-olds | 0% (0) | 80% (16) | 80% (16) | 20% (4) |
| Delhi | 10-year-olds | 40% (8) | 25% (5) | 65% (13) | 35% (7) |
| Delhi | Adults | 33% (5) | 27% (4) | 60% (9) | 40% (6) |
| Gorakhpur | 5-year-olds | 5% (1) | 90% (18) | 95% (19) | 5% (1) |
| Gorakhpur | 10-year-olds | 15% (3) | 40% (8) | 55% (11) | 45% (9) |
| Gorakhpur | Adults | 7% (1) | 60% (9) | 67% (10) | 33% (5) |

Chi-square analyses revealed that Age was a significant determinant of systematicity in the Gorakhpur group (χ²(2, N = 55) = 8.449, p = .0147), but not in the Delhi group (χ²(2, N = 55) = 1.852, p = .396). However, the significant effect of Age in the Gorakhpur group did not result from more systematicity in the children vs. the adults – as one might have expected from the results of Hudson Kam and Newport (2005, 2009) using miniature artificial languages. Instead, the younger children and the adults revealed higher systematicity than the older children in the Gorakhpur group. While our results are not directly comparable to the laboratory findings by Hudson Kam and Newport (2005, 2009), at the very least they suggest that the conclusion that children are more systematic than adults in their use of emerging forms might be overly simplistic, with children of different ages playing different roles in language change (see Kerswill, 1996).
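The chi-square statistics can be recomputed from the counts in Table 7. The Python sketch below assumes a Pearson test of independence on a 3 × 2 table (three age groups by systematic vs. variable pattern) for each region; this set-up is an assumption about how the test was run, and the function name is hypothetical. A 3 × 2 table has (3 − 1)(2 − 1) = 2 degrees of freedom, for which the p-value has a simple closed form.

```python
import math

def chi_square_3x2(systematic, variable):
    """Pearson chi-square test of independence for a 3 x 2 table.

    With three age groups and two pattern categories the test has
    (3 - 1) * (2 - 1) = 2 degrees of freedom, for which the upper-tail
    probability has the closed form exp(-chi2 / 2).
    """
    row_totals = [s + v for s, v in zip(systematic, variable)]
    col_totals = [sum(systematic), sum(variable)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, observed_row in enumerate(zip(systematic, variable)):
        for j, observed in enumerate(observed_row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    p_value = math.exp(-chi2 / 2)
    return chi2, p_value

# Systematic vs. variable participants (5-year-olds, 10-year-olds, adults), from Table 7
gorakhpur = chi_square_3x2([19, 11, 10], [1, 9, 5])
delhi = chi_square_3x2([16, 13, 9], [4, 7, 6])
print("Gorakhpur: chi2 = %.3f, p = %.4f" % gorakhpur)
print("Delhi: chi2 = %.3f, p = %.4f" % delhi)
```

Under these assumptions the sketch yields statistics of about 8.45 for Gorakhpur and 1.85 for Delhi, with p-values close to the ones reported.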

At a descriptive level of analysis, the individual patterns observed in Experiment 2 revealed more complexity than a simple preference for systematicity in children. The large majority of 5-year-olds in both Delhi and Gorakhpur were indeed systematic non-ek users (at 80% and 90%, respectively). However, the 10-year-olds’ performance was more mixed: in the Delhi group, most systematic 10-year-olds were ek users (at 40% vs. 25%), while in the Gorakhpur group, most systematic 10-year-olds were non-ek users (at 40% vs. 15%). This pattern might be related to the adult usage: systematic adults in the Delhi group were more evenly split between users and non-users of ek (at around 30% each), while most systematic adults in the Gorakhpur group were non-users (at 60% vs. 7%).

While the nature of our study does not allow establishing a causal link between the adult data and the productions of the 10-year-old children, it is interesting that they seem to be relatively consistent across the two regions. Our results are generally compatible with Labov’s (2007) model of intergenerational language change by transmission, according to which language change is advanced by the younger generation, who tend to use a new form more frequently than their parents. However, future narrative-elicitation studies should investigate whether parents’ use of ek to mark newness correlates with their children’s usage at any point in development.

Analysis 7: The use of ek to introduce new characters across three adult groups

Using logistic mixed effects regression, we modelled the binary outcome variable of Use of ek (A = 1, all else = 0) for New characters in Hindi, with Region (Delhi, Gorakhpur, and Allahabad) as the predictor variable. In line with the previous analyses, Region was treatment-coded with Delhi as the reference level because we wanted to compare the two groups of adults outside of Delhi with those within Delhi.
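As an illustration of the coding scheme only, treatment coding with Delhi as the reference level can be sketched in Python as follows. The column labels and function name are hypothetical, and the sketch does not fit the mixed-effects model itself. Under this scheme the model intercept corresponds to the log-odds of using ek for Delhi adults, and each Region coefficient is the difference in log-odds between that region and Delhi.

```python
REFERENCE = "Delhi"
LEVELS = ["Delhi", "Gorakhpur", "Allahabad"]


def treatment_code(region: str) -> dict:
    """Return the dummy variables for one observation's Region.

    The reference level (Delhi) gets no column of its own: it is the
    case in which every dummy variable equals 0.
    """
    if region not in LEVELS:
        raise ValueError(f"unknown region: {region}")
    return {
        f"Region[{level}]": int(region == level)
        for level in LEVELS
        if level != REFERENCE
    }


print(treatment_code("Delhi"))      # {'Region[Gorakhpur]': 0, 'Region[Allahabad]': 0}
print(treatment_code("Gorakhpur"))  # {'Region[Gorakhpur]': 1, 'Region[Allahabad]': 0}
print(treatment_code("Allahabad"))  # {'Region[Gorakhpur]': 0, 'Region[Allahabad]': 1}
```

With this coding, the two coefficients directly test the Delhi vs. Gorakhpur and Delhi vs. Allahabad contrasts.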

Results revealed clear differences on the use of ek by the three adult groups (see Table 8 and Figure 7 for descriptive statistics; for the full model output, see Supplementary Materials).

Table 8

Adult rates of ek uses to introduce new characters in three regional varieties of Hindi.

Groups Mean SD No. observations
Delhi adults .52 .50 47
Gorakhpur adults .25 .44 23
Allahabad adults .03 .18 3

Figure 7

Mean proportions of ek uses to introduce new discourse characters across Hindi-speaking adults in Delhi, Gorakhpur, and Allahabad. Error bars represent 95% confidence intervals and points reflect participant means.

There was a significant difference in the use of ek between adults in Delhi and Allahabad (p = .0071), with those in Delhi using ek at higher rates. Adults in Delhi also used ek more frequently than adults in Gorakhpur (p = .0458), though numerically the difference was smaller (see Table 8). Together these results demonstrate that Delhi adults used ek more frequently than adults from Gorakhpur and Allahabad.

8.3 Discussion

The results from Experiment 2 confirmed that, by the age of 10 years, Hindi-speaking children have reached adult levels in their use of ek to introduce new discourse characters. This pattern of results was observed not only in Delhi, but also in Gorakhpur, a city in a neighboring state. These results offer support to the Optionality Hypothesis: compared to the English-speaking children and the Spanish-speaking children in Experiment 1, the acquisition of ek as a marker of newness is protracted in Hindi because of its optionality.

An exploratory analysis of the individual patterns of ek use observed in the three age groups tested in Delhi and Gorakhpur revealed that children were not more systematic than adults (cf. Hudson Kam & Newport, 2005, 2009). Most 5-year-olds in both regions were systematic in not using ek for character introduction, but the 10-year-olds showed different trends: in the Delhi group, most systematic 10-year-olds were ek users, while in the Gorakhpur group, most systematic 10-year-olds were non-users of ek. Interestingly, compatible patterns were observed in the two adult groups. Future studies should therefore investigate whether and when parents’ use of ek to introduce discourse characters correlates with their children’s usage; that is, whether the transmission of this form is vertical or horizontal (Gong, 2010).

The fact that similar developmental patterns were observed in Delhi and Gorakhpur runs counter to the alternative hypothesis that the use of ek to introduce new referents is merely a dialectal feature of the Hindi spoken in Delhi. While more research is needed in order to confirm whether ek is in the process of being grammaticalized as an indefinite article, the results of Experiment 2 are in principle compatible with such a hypothesis. Interestingly, however, the adults from Delhi produced higher rates of ek relative to the adults from Gorakhpur and Allahabad, which suggests that the grammaticalization process may have started in the Delhi area and is extending to other Hindi-speaking regions in Northern India. Future studies should also investigate this question.

Possible factors underlying the regional differences observed in adult ek use are participants’ age and level of English. Among other sources, grammaticalization processes can result from language contact (Heine & Kuteva, 2006). For example, a recent study by Otwinowska et al. (2020) reported that Polish-English bilingual children produce more definite markings than Polish monolingual children when telling stories in Polish. Since Polish does not have articles, the authors interpret this finding as a language transfer effect, whereby the bilingual children are trying to compensate for the lack of a definite article in Polish by using demonstratives and possessives.

In our study, the Hindi-speaking adults from Delhi (who used ek most frequently) were all proficient English speakers who had received higher education in English. The adults from Gorakhpur (who followed the Delhi adults in ek use) were on average older and had less exposure to English, and this difference in age and level of English was even more marked in the adults from Allahabad (who used ek the least). It is therefore possible that the use of ek to introduce discourse referents is more frequent in younger adults with more exposure to English, which could reflect a transfer effect from the indefinite article in English (analogous to the findings by Otwinowska et al., 2020).

Future research should more precisely distinguish the extent to which age and/or exposure to English determine the use of the numeral to introduce new characters in Hindi. It must be noted, however, that exposure to English cannot be the only explanation for this linguistic phenomenon since the Mandinka-speaking participants in Experiment 1 did not use their numeral ‘one’ to introduce discourse characters, despite their exposure to English. Therefore, more work is needed in this area to determine the relative roles played by age and exposure to English in the development of ek as a newness marker in Hindi.

9. Follow-up study

In order to investigate the effect of optionality on children’s acquisition of discourse markers, we elicited picture-based narratives from both children and adults in four different languages. However, while visual narratives are generally presumed to be highly accessible (and thus used in children’s books, comics, assembly instructions, etc.), research suggests that the interpretation of images as sequential events is an acquired skill that requires exposure and practice (Cohn, 2019, 2020). Thus, differences in cultural background (e.g., between cultures with a strong oral story-telling tradition vs. a picture book tradition) could result in the interpretation of each image as an isolated event rather than a continuous story. This lack of continuity would in turn affect the way in which characters are referred to, as the same character might be treated as a new referent in each picture and be marked accordingly (Fussell & Haaland, 1978; Cook, 1980; Núñez et al., 2012). Since Mandinka has a strong oral story-telling tradition, but not such a strong picture book tradition, our participants may have interpreted the images on the computer screen as disconnected pictures, which would explain why both children and adults used -o-marked bare nouns for both new and familiar referents, despite having the option of using demonstratives and pronouns for reference maintenance.

A quick look through The Gamble Archive (1946–2003), which is a rich collection of work on the history and culture of The Gambia by Professor David Gamble (1997), shows that the use of pronouns is actually very common in Mandinka. This led us to ask whether the lack of pronouns in our data was due to the specific task employed for eliciting narratives. We hypothesized that Mandinka speakers used full nominals for all characters because they were not adopting a narrative stance. Interestingly, we observed a similar pattern of behavior in 3 English-speaking adults, who also treated the pictures as independent events and repeatedly referred to familiar characters using an indefinite article (see Figures 3 and 4). To test whether the lack of pronouns observed with Mandinka speakers in Experiment 1 was due to the use of an unfamiliar task, we collected oral stories from a small group of native speakers of Mandinka without any pictorial prompt. Given the cultural significance of story-telling in West Africa, re-telling of oral narratives should reveal a more naturalistic use of pronouns than picture story-telling.

9.1 Methods

Participants

A total of 10 Mandinka speakers (5 children, mean age: 5;6, range: 5;3–6;1; and 5 adults) were recruited from Brikama, The Gambia. Written informed consent was obtained from the adult participants and the children’s minders prior to testing.

Materials and procedure

Two different fables were selected for the task because one was more popular amongst the children (The hen and the cat) while the other was more popular amongst the adults (The greedy lion). Participants were asked to tell one of these two stories, in their own words. Only the title of the story was given as a prompt. Stories were recorded for transcription and coding.

The hen and the cat

Once there was a hen and a cat. The hen came to the cat’s home and said: ‘Let’s go and play football’, and so the two went out to play. The cat kicked the ball out into a field of nettles and the ball burst. The hen cried to the cat: ‘Cock-a-doodle-doo, go and pay for my ball!’ And the cat responded: ‘Meow, meow, meow, wasn’t it the two of us who were playing?’

The greedy lion

A lion once went hunting and found a rabbit. He caught the rabbit and started heading home but, on the way, he saw an antelope. The lion got greedy and decided to leave the rabbit and go for the antelope instead, but the antelope outran him. The lion came back to get the rabbit, but the rabbit was gone too. He then went home on an empty stomach.

9.2 Results

All participants were familiar with the fable they were asked to tell. Importantly, however, the narratives varied enough in both content and wording to suggest that the stories had not been memorized verbatim (e.g., one of the children said that the football had been eaten by a crocodile, instead of being kicked into a field of nettles).

Unlike the vignettes used in Experiments 1 and 2, the folk stories that were elicited in this follow-up study allowed us to investigate the three main discourse functions – introduction, maintenance and reintroduction – and distinguish between the main and secondary characters in the adult fable. Since different fables were used for each age group, pronoun production was not directly compared across ages. However, all narratives were evaluated for their pragmatically-felicitous use of pronouns.

As shown in Figure 8, Mandinka-speaking children used different referential forms to introduce and maintain reference to the two characters in their story, clearly distinguishing the two discourse functions (Bamberg, 1986, 1987). In addition, and in contrast with the results from Experiment 1, 5-year-old children used pronouns 51% of the time, on average. In instances of reference maintenance, 93% of referential expressions were pronouns (either singular or plural).

Figure 8

Percentages of -o-marked bare nouns and pronouns (singular and plural) used to introduce (left panel), maintain reference to (middle panel) and reintroduce (right panel) the two characters in a traditional oral narrative by Mandinka-speaking children.

As shown in Figure 9, adult Mandinka speakers demonstrated sensitivity to the distinction between main vs. secondary characters when choosing appropriate referential forms in a traditional narrative. Participants also seemed to mark referents differently according to the three discourse functions of narratives (i.e., reference introduction, maintenance and reintroduction). Speakers unanimously preferred -o-marked bare nouns over pronouns for introducing new characters. However, pronouns were always used to refer to the main character (the lion) once it had been introduced, whereas -o-marked bare nouns were preferred for referring to secondary characters, even when they had already been introduced. Those instances where participants used full nouns for reference maintenance were contexts where primary and secondary characters were competing for pronominal reference (e.g., ‘It catches a rabbit. It says the rabbit will make a good dinner’, where ‘it’ refers to the primary character, the lion). This could be a form of audience design whereby speakers were avoiding ambiguity, since Mandinka pronouns are not marked for gender.

Figure 9

Percentages of -o-marked nouns and pronouns used to introduce, maintain reference to and reintroduce the main character (left panel) and the secondary character (right panel) in a traditional oral narrative by adult speakers of Mandinka.

9.3 Discussion

The results of this follow-up study suggest that the surprisingly infrequent use of pronouns that we observed with child and adult Mandinka speakers in Experiment 1 was likely a result of treating each panel as an independent picture, probably because of their limited experience with picture story-telling (especially when the pictures are presented on a computer screen). When children and adults were asked to retell a familiar fable in Mandinka, both groups routinely used pronouns for reference maintenance (e.g., 93% of the children’s referential expressions in maintenance contexts were pronouns). While the low number of participants and the different fables used for the two age groups do not allow a reliable comparison between children and adults, the stark difference in the referential expressions used by Mandinka speakers in the two experiments highlights the importance of using culturally-appropriate methods of narrative elicitation. We believe that this is an important methodological observation that should help future field linguists and language acquisition researchers working on storytelling in non-Western societies.

10. General discussion

According to Hickmann et al. (2015), cross-linguistic comparisons are necessary before generalizations can be made about early vs. late mastery of reference on a universal basis (p. 204). In this study, we investigated a typologically-diverse set of languages, allowing us to explore the effect of optionality on the acquisition of different discourse markers for two main narrative functions. Our initial prediction regarding the acquisition of newness markers was that they would be influenced by the consistency with which they are used in adult language (MacWhinney, 2001). Our results confirmed that the use of discourse markers emerged earlier in languages that consistently used them for the same functions. As predicted, children produced the most adult-like descriptions in Mandinka, given the uniform use of -o-marked bare nouns in adult responses. Hindi-speaking 5-year-olds, on the other hand, demonstrated the least adult-like performance, which was also expected given that Hindi-speaking adults are the least consistent in their use of newness markers amongst the four language groups. Results from Spanish and English also fell in line with our predictions, showing that children in both of these languages use appropriate newness markers (i.e., indefinite articles) in a more adult-like way than Hindi-speaking children.

Regarding the use of givenness markers across languages, our results revealed a significant effect of age in the use of various givenness markers, including, most importantly, a lesser use of definite articles by English- and Spanish-speaking children. By contrast, our results did not reveal a developmental difference in pronominal use between children and adults, a pattern that was expected since previous studies have shown that children as young as 3 years old are able to use pronouns for character maintenance, while 5-year-old children already demonstrate adult-like performance (Wong & Johnston, 2004; Song & Fisher, 2007).

Regarding cross-linguistic differences in the use of givenness markers, Spanish-speaking children revealed more adult-like performance than English-speaking children when marking familiar characters. This pattern of results can be explained by the different consistency with which articles are used in the two languages: Spanish has more restrictions on the use of bare nouns than English does, which arguably results in Spanish-speaking children being exposed to a more consistent use of articles. The relatively infrequent use of bare nouns in Spanish would therefore facilitate children’s form-to-function mapping between articles and discourse functions (see also Bassano, 2015). Overall, the results of our study support the Optionality Hypothesis, according to which the acquisition of optional discourse markers is protracted relative to obligatory markers.

The potential grammaticalization of the Hindi numeral ek into an indefinite article

The findings from Experiments 1 and 2 confirm that the numeral ek is used in Hindi to introduce new discourse characters (Dayal, 2004; Sharma, 2005), although its use is not obligatory (cf. Dayal, 2018). These results are compatible with our starting assumption that the numeral ek is in the process of being grammaticalized into an indefinite article in Hindi, in line with analogous processes observed in Chinese (Chen, 2004; Wong, 2016), Polish (Hwaszcz & Kędzierska, 2018) and numerous other languages (Givón, 1981). In Experiment 1, Hindi-speaking adults did not use ek to mark new characters as frequently as Spanish and English adults used indefinite articles with that function, which in turn explains why 5-year-old speakers of Hindi lagged behind Spanish and English children of the same age in marking new referents. Experiment 2 further showed that by age 10, Hindi-speaking children use ek to introduce new characters at adult levels.

The findings from Experiment 2 run counter to the alternative hypothesis that the use of ek to introduce new discourse referents is only characteristic of the dialectal variety of Hindi spoken in Delhi. Similar patterns of results were observed with 5-year-olds, 10-year-olds and adults from Gorakhpur, a city in the North Eastern state of Uttar Pradesh, suggesting that this linguistic phenomenon is more widespread than a mere dialectal variation. Interestingly, however, adult speakers of Hindi differed in their use of ek to introduce new discourse characters, with the highest rates being observed in Delhi, followed by Gorakhpur and finally by Allahabad (a smaller city in Uttar Pradesh). These three groups varied in age and exposure to English, with younger adults with greater exposure to English using ek more frequently for reference introduction.

We hypothesized that the potential grammaticalization of ek into an indefinite article could be related to the close contact between Hindi and English (see Heine & Kuteva, 2006). However, future studies should confirm this hypothesis by collecting measures of English proficiency alongside narratives in Hindi. Moreover, Mandinka children, like the Hindi-speaking children in our sample, are schooled in English, but our Mandinka-speaking participants did not use their numeral ‘one’ to introduce discourse characters. This suggests that if contact with English plays a role in the emergence of an indefinite article in Hindi, this diachronic process is more complex than a mere effect of exposure to English. Future studies should therefore investigate the linguistic and environmental factors underlying this potential process of language change in Hindi.

The effect of narrative elicitation techniques

Many researchers have pointed out the implications of methodological heterogeneity for the comparability of studies on the acquisition of pragmatic markers in extended discourse (see Hickmann et al., 2015). Usually, heterogeneity emerges from both linguistic and cognitive factors (e.g., the question type used for prompting, the number of characters present in the scene, whether the experimenter and the participant share knowledge, etc.). However, there are also cultural factors that could influence participants’ performance: narratives involve the use of decontextualized language (i.e. language that abstracts away from the here-and-now) and the conventions used to interpret the narrative situation are highly culture-specific. Unsurprisingly, research has shown cross-cultural differences in narrative performance by both children and adults (Gorman et al., 2011; Carmiol & Sparks, 2014), even though the use of picture stimuli for narrative elicitation has been a common practice in both linguistic and psychological research.

Interestingly, Cohn (2019, 2020) has recently argued that the interpretation of sequential images as a cohesive whole might not be universal, but rather dependent on exposure to the system of graphics in a culture. The stark contrast between the use of pronouns by Mandinka speakers when describing a pictorial narrative (Experiment 1) versus retelling a folk story from memory (Follow-up study) offers support to Cohn’s argument. Therefore, our findings highlight the importance of using culturally-appropriate tasks and diverse samples in cross-linguistic studies of the emergence of discourse-pragmatic functions.

Future directions: Language change and Theory of Mind

Like other aspects of information structure, marking what is new and given in discourse requires monitoring the temporary state of the addressee’s mind (Wong & Johnston, 2004; Zimmermann, 2016). Therefore, from a language acquisition perspective, there is an interesting connection between marking new and old information and the development of Theory of Mind: our capacity to understand mental states such as intentions or beliefs (Gundel et al., 2007; Gundel & Johnson, 2013; for a review, see Dimroth & Narasimhan, 2012). Seen this way, monitoring what information is shared with an interlocutor, or which referents are accessible in a conversation, is an implicit form of mindreading that is trained in communication (Rubio-Fernandez, 2019, 2020, under review). Thus, children’s incipient discourse markings reveal not only their language acquisition, but also their pragmatic development, which is a window into their Theory of Mind.

Rubio-Fernandez (2020, under review) has recently argued that, from a diachronic perspective, the pathway of language change whereby exophoric demonstratives evolve into anaphoric demonstratives, which in turn give rise to definite articles, marks a three-step expansion of the speakers’ notion of common ground. In this account, common ground starts with the shared physical space, which allows for the joint use of demonstratives and pointing gestures (e.g., ‘Look at that!’). Then, with the use of anaphoric demonstratives, the notion of common ground as co-presence expands to include the interlocutors’ ongoing discourse representation (e.g., ‘That is what I heard’). Lastly, in the third stage of language change, common ground expands once more to include earlier experiences and world knowledge shared by the interlocutors, which can be marked with the use of the definite article (e.g., ‘We bought the house’).

While the evolution of the indefinite article from the numeral ‘one’ is not part of the same pathway of language change, indefinite articles are used to mark new discourse referents (e.g., ‘We bought a house’), which is another form of common ground management (Krifka, 2008). Therefore, the potential emergence of indefinite articles observed in languages like Hindi, Chinese or Polish (Chen, 2004; Dayal, 2004, 2018; Sharma, 2005; Wong, 2016; Hwaszcz & Kędzierska, 2018) has implications for the marking of information structure in those speech communities, and as a result, for their regular Theory of Mind use in communication (for further discussion, see Rubio-Fernandez, 2020, under review). Future developmental studies will hopefully investigate these important questions at the intersection between language change and social cognition.

Notes

  1. We refer to discourse markers as those grammatical forms that are used to distinguish new and given information in different languages, marking referents according to their discourse function (i.e., introduction, maintenance and re-introduction; Bamberg, 1986, 1987). For a different conception of discourse markers (usually employed in discourse analysis), see Renkema and Schubert (2018).
  2. Unlike our analysis of the Hindi numeral ek, which focuses on the distinction between optional and obligatory markers, Aksu-Koç and Nicolopoulou’s (2014) analysis of bir ‘one’ in Turkish focused on the distinction between local and global markings.
  3. The dialectal-variation hypothesis is not incompatible with the grammaticalization hypothesis, but it would suggest a much more local phenomenon, which could be a reason to question the degree to which ek is undergoing grammaticalization.

Data accessibility statement

All data files, analysis script and supplementary materials for this study are publicly available in an OSF repository (https://osf.io/gq342).

Ethics and consent

The experiments reported here were conducted in accordance with the Declaration of Helsinki. The study was approved by the Ethics Committee at the University of Edinburgh. Written informed consent was obtained from each child’s parent/minder and each adult participant prior to testing.

Acknowledgements

This research was supported by a Researcher Project from the Research Council of Norway (Ref. 275505) awarded to PRF. All authors gratefully acknowledge this funding. Special thanks to Fatoumata Jallow for all her help with data collection, transcription and translation in The Gambia. Thanks also to Vrinda Bathia, Anwesha Mahapatra and Ana Rubio-Fernandez for help with blind coding and cross-checking. We are also very grateful to the schools and children who participated in the study. Finally, thanks to María Mercedes Piñango, Veneeta Dayal and all the other participants in Meaning in Flux for very inspiring discussions on language change and its implications for human cognition.

Competing interests

The authors have no competing interests to declare.

Author contributions

VS: Data collection, Data curation, Writing – Original draft, and Project administration.

ML: Data collection, Data curation, Statistical analyses, and Visualizations.

PRF: Conceptualization, Methodology, Data collection, Data curation, Writing – Original draft, Writing – Review and Editing, Supervision, and Funding acquisition. Corresponding author.

References

Aksu-Koç, A., & Nicolopoulou, A. (2014). Character reference in young children’s narratives: A crosslinguistic comparison of English, Greek, and Turkish. Lingua, 155, 62–84. DOI:  http://doi.org/10.1016/j.lingua.2014.04.006

Ariel, M. (1990). Accessing noun-phrase antecedents. London: Routledge.

Arnold, J. E. (1998). Reference form and discourse patterns. Unpublished doctoral dissertation, Stanford University.

Bamberg, M. G. (1986). A functional approach to the acquisition of anaphoric relationships. Linguistics, 24, 227–284. DOI:  http://doi.org/10.1515/ling.1986.24.1.227

Bamberg, M. G. (1987). The acquisition of narratives: Learning to use language. Berlin: Walter de Gruyter. DOI:  http://doi.org/10.1515/9783110854190

Barr, D. J. (2013). Random effects structure for testing interactions in linear mixed-effects models. Frontiers in Psychology, 4, 328. DOI:  http://doi.org/10.3389/fpsyg.2013.00328

Bassano, D. (2015). The acquisition of nominal determiners: Evidence from crosslinguistic approaches. In L. Serratrice & S. E. M. Allen (Eds.), The acquisition of reference, TiLAR 15 (pp. 25–49). Amsterdam: John Benjamins Publishing Company. DOI:  http://doi.org/10.1075/tilar.15.02bas

Bassano, D., Maillochon, I., Korecky-Kröll, K., van Dijk, M., Laaha, S., Dressler, W. U., & van Geert, P. (2011). A comparative and dynamic approach to the development of determiner use in three children acquiring different languages. First Language, 31, 253–279. DOI:  http://doi.org/10.1177/0142723710393102

Berman, R. (2015). Language development and use beyond the sentence. In E. Bavin & L. Naigles (Eds.), The Cambridge handbook of child language (pp. 458–480). Cambridge: Cambridge University Press. DOI:  http://doi.org/10.1017/CBO9781316095829.021

Berman, R. A., & Slobin, D. I. (1994). Different ways of relating events in narrative: A crosslinguistic developmental study. Hillsdale, NJ: Lawrence Erlbaum.

Campbell, A. L., Brooks, P., & Tomasello, M. (2000). Factors affecting young children’s use of pronouns as referring expressions. Journal of Speech, Language, and Hearing Research, 43, 1337–1349. DOI:  http://doi.org/10.1044/jslhr.4306.1337

Carmiol, A. M., & Sparks, A. (2014). Narrative development across cultural contexts. In D. Matthews (Ed.), Pragmatic development in first language acquisition (pp. 279–296). Philadelphia, PA: John Benjamins Publishing Company. DOI:  http://doi.org/10.1075/tilar.10.16car

Chen, P. (2004). Identifiability and definiteness in Chinese. Linguistics, 42, 1129–1184. DOI:  http://doi.org/10.1515/ling.2004.42.6.1129

Chierchia, G. (1998). Reference to kinds across languages. Natural Language Semantics, 6, 339–405. DOI:  http://doi.org/10.1023/A:1008324218506

Cohn, N. (2019). Structural complexity in visual narratives: Theory, brains, and cross-cultural diversity. In M. Grishakova & M. Poulaki (Eds.), Narrative complexity and media: Experiential and cognitive interfaces (pp. 174–199). Lincoln: University of Nebraska Press. DOI:  http://doi.org/10.2307/j.ctvhktjh6.13

Cohn, N. (2020). Visual narrative comprehension: Universal or not? Psychonomic Bulletin and Review, 27, 266–285. DOI:  http://doi.org/10.3758/s13423-019-01670-1

Cook, B. L. (1980). Picture communication in the Papua New Guinea. Educational Broadcasting International, 13, 78–83.

Creissels, D. (2012a). The flexibility of the noun vs. verb distinction in the lexicon of Mandinka. International Conference on Polycategoriality, Paris, October 2010 (revised manuscript).

Creissels, D. (2012b). Mandinka. Leipzig Valency Classes Project.

Creissels, D. (2020). Grammaticalization in Manding languages. In W. Bisang & A. Malchukov (Eds.), Grammaticalization scenarios: Areal patterns and cross-linguistic variation (pp. 695–727). Berlin: De Gruyter. DOI:  http://doi.org/10.1515/9783110712735-002

Creissels, D. (forthcoming). A sketch of Mandinka. To appear in F. Lüpke (Ed.), The Oxford guide to the Atlantic languages of West Africa. Oxford: Oxford University Press.

Dayal, V. (2004). Number marking and (in)definiteness in kind terms. Linguistics and Philosophy, 27, 393–450. DOI:  http://doi.org/10.1023/B:LING.0000024420.80324.67

Dayal, V. (2018). (In)definiteness without articles: Diagnosis, analysis, implications. In G. Sharma & R. Bhatt (Eds.), Trends in Hindi linguistics (pp. 1–26). Berlin: Mouton de Gruyter. DOI:  http://doi.org/10.1515/9783110610796-001

Dimroth, C., & Narasimhan, B. (2012). The acquisition of information structure. In M. Krifka & R. Musan (Eds.), The expression of information structure (pp. 319–362). Berlin: Mouton de Gruyter. DOI:  http://doi.org/10.1515/9783110261608.319

Fussell, D., & Haaland, A. (1978). Communicating with pictures in Nepal: Results of practical study used in visual education. Educational Broadcasting International, 11, 25–31.

Gamble, D. P. (1997). Mandinka stories from books published prior to 1960. In Gambia studies series by Professor David Gamble. Library Special Collections, Charles E. Young Research Library, UCLA.

Givón, T. (1981). On the development of the numeral ‘one’ as an indefinite marker. Folia Linguistica Historica, 15, 35–54. DOI:  http://doi.org/10.1515/flih.1981.2.1.35

Gong, T. (2010). Exploring the roles of horizontal, vertical, and oblique transmissions in language evolution. Adaptive Behavior, 18, 356–376. DOI:  http://doi.org/10.1177/1059712310377241

Gorman, B. K., Fiestas, C. E., Peña, E. D., & Clark, M. R. (2011). Creative and stylistic devices employed by children during a storybook narrative task: A cross-cultural study. Language, Speech, and Hearing Services in Schools, 42, 167–181. DOI:  http://doi.org/10.1044/0161-1461(2010/10-0052)

Greenberg, J. H. (1978). How does a language acquire gender markers? Universals of Human Language, 3, 47–82.

Guasti, M. T., Gavarró, A., De Lange, J., & Caprin, C. (2008). Article omission across child languages. Language Acquisition, 15, 89–119. DOI:  http://doi.org/10.1080/10489220801937346

Gundel, J. K., Hedberg, N., & Zacharski, R. (1993). Cognitive status and the form of referring expressions. Language, 69, 274–307. DOI:  http://doi.org/10.2307/416535

Gundel, J. K., & Johnson, K. (2013). Children’s use of referring expressions in spontaneous discourse: Implications for Theory of Mind development. Journal of Pragmatics, 56, 43–57. DOI:  http://doi.org/10.1016/j.pragma.2013.04.003

Gundel, J. K., Ntelitheos, D., & Kowalsky, M. (2007). Children’s use of referring expressions: Some implications for Theory of Mind. ZAS Papers in Linguistics, 48, 1–21. DOI:  http://doi.org/10.21248/zaspil.48.2007.351

Heine, B. (1997). Cognitive foundations of grammar. Oxford: Oxford University Press.

Heine, B., & Kuteva, T. (2006). The changing languages of Europe. Oxford: Oxford University Press. DOI:  http://doi.org/10.1093/acprof:oso/9780199297337.001.0001

Hickmann, M., & Hendriks, H. (1999). Cohesion and anaphora in children’s narratives: A comparison of English, French, German, and Mandarin Chinese. Journal of Child Language, 26, 419–452. DOI:  http://doi.org/10.1017/S0305000999003785

Hickmann, M., Hendriks, H., Roland, F., & Liang, J. (1996). The marking of new information in children’s narratives: A comparison of English, French, German and Mandarin Chinese. Journal of Child Language, 23, 591–619. DOI:  http://doi.org/10.1017/S0305000900008965

Hickmann, M., Schimke, S., & Colonna, S. (2015). From early to late mastery of reference: Multifunctionality and linguistic diversity. In L. Serratrice & S. E. M. Allen (Eds.), The acquisition of reference, TiLAR 15 (pp. 181–211). Amsterdam: John Benjamins Publishing Company. DOI:  http://doi.org/10.1075/tilar.15.08hic

Hudson Kam, C. L., & Newport, E. L. (2005). Regularizing unpredictable variation: The roles of adult and child learners in language formation and change. Language Learning and Development, 1, 151–195. DOI:  http://doi.org/10.1207/s15473341lld0102_3

Hudson Kam, C. L., & Newport, E. L. (2009). Getting it right by getting it wrong: When learners change languages. Cognitive Psychology, 59, 30–66. DOI:  http://doi.org/10.1016/j.cogpsych.2009.01.001

Hughes, M. E., & Allen, S. E. M. (2013). The effect of individual discourse-pragmatic features on referential choice in child English. Journal of Pragmatics, 56, 15–30. DOI:  http://doi.org/10.1016/j.pragma.2013.05.005

Hwaszcz, K., & Kędzierska, H. (2018). The rise of an indefinite article in Polish: An appraisal of its grammaticalisation stage. Studies in Polish Linguistics, 13, 145–166. DOI:  http://doi.org/10.4467/23005920SPL.18.005.8744

Jarvis, S. (2002). Topic continuity in L2 English article use. Studies in Second Language Acquisition, 24, 387–418. DOI:  http://doi.org/10.1017/S0272263102003029

Kachru, Y. (1980). Aspects of Hindi grammar. New Delhi: Manohar Publications.

Kachru, Y. (2006). Hindi. Amsterdam: John Benjamins Publishing. DOI:  http://doi.org/10.1075/loall.12

Kerswill, P. (1996). Children, adolescents, and language change. Language Variation and Change, 8, 177–202. DOI:  http://doi.org/10.1017/S0954394500001137

Kidwai, A. (2000). XP-adjunction in Universal Grammar: Scrambling and binding in Hindi-Urdu. Oxford: Oxford University Press.

Krifka, M. (2008). Basic notions of information structure. In C. Féry & M. Krifka (Eds.), Interdisciplinary studies of information structure (pp. 13–55). Potsdam: Potsdam University. DOI:  http://doi.org/10.1556/ALing.55.2008.3-4.2

Labov, W. (2007). Transmission and diffusion. Language, 83, 344–387. DOI:  http://doi.org/10.1353/lan.2007.0082

Levelt, W. J. M. (1989). Speaking: From intention to articulation. Cambridge, MA: MIT Press.

Long, M., Rohde, H., Oraa Ali, M., & Rubio-Fernandez, P. (under review). Finding the bigger picture: Macro- (but not micro-) level features of the discourse reveal age-related differences in referential choice.

Lyons, C. (1999). Definiteness. Cambridge: Cambridge University Press. DOI:  http://doi.org/10.1017/CBO9780511605789

MacWhinney, B. (2001). The competition model: The input, the context, and the brain. In P. Robinson (Ed.), Cognition and second language instruction (pp. 69–90). Cambridge: Cambridge University Press. DOI:  http://doi.org/10.1017/CBO9781139524780.005

Matthews, D., Lieven, E., Theakston, A., & Tomasello, M. (2006). The effect of perceptual availability and prior discourse on young children’s use of referring expressions. Applied Psycholinguistics, 27, 403–422. DOI:  http://doi.org/10.1017/S0142716406060334

Narasimhan, B., & Dimroth, C. (2008). Word order and information status in child language. Cognition, 107, 317–329. DOI:  http://doi.org/10.1016/j.cognition.2007.07.010

Núñez, R., Cooperrider, K., Doan, D., & Wassmann, J. (2012). Contours of time: Topographic construals of past, present, and future in the Yupno valley of Papua New Guinea. Cognition, 124, 25–35. DOI:  http://doi.org/10.1016/j.cognition.2012.03.007

Orvig, A. S., Marcos, H., Morgenstern, A., Hassan, R., Leber-Marin, J., & Parès, J. (2010). Dialogical beginnings of anaphora: The use of third person pronouns before the age of 3. Journal of Pragmatics, 42, 1842–1865. DOI:  http://doi.org/10.1016/j.pragma.2009.09.020

Otwinowska, A., Opacki, M., Mieszkowska, K., Białecka-Pikul, M., Wodniecka, Z., & Haman, E. (2020). Polish-English bilingual children overuse referential markers: MLU inflation in Polish language narratives. First Language, 42, 191–215. DOI:  http://doi.org/10.1177/0142723720933769

R Core Team. (2019). R: A language and environment for statistical computing, Version 3.0.2. Vienna, Austria: R Foundation for Statistical Computing; 2013.

Renkema, J., & Schubert, C. (2018). Introduction to discourse studies. Amsterdam: John Benjamins Publishing Company. DOI:  http://doi.org/10.1075/z.219

Rozendaal, M. I., & Baker, A. E. (2008). A cross-linguistic investigation of the acquisition of the pragmatics of indefinite and definite reference in two-year-olds. Journal of Child Language, 35, 773–807. DOI:  http://doi.org/10.1017/S0305000908008702

Rubio-Fernandez, P. (2019). Theory of Mind. In C. Cummins & N. Katsos (Eds.), The Oxford handbook of experimental semantics and pragmatics (pp. 524–536). Oxford: Oxford University Press. DOI:  http://doi.org/10.1093/oxfordhb/9780198791768.013.23

Rubio-Fernandez, P. (2020). Pragmatic markers: The missing link between language and Theory of Mind. Synthese, 199, 1125–1158. DOI:  http://doi.org/10.1007/s11229-020-02768-z

Rubio-Fernandez, P. (under review). Cultural evolutionary pragmatics: Investigating the co-evolution of language and social cognition.

Schaeffer, J., & Matthewson, L. (2005). Grammar and pragmatics in the acquisition of article systems. Natural Language & Linguistic Theory, 23, 53–101. DOI:  http://doi.org/10.1007/s11049-004-5540-1

Serratrice, L. (2005). The role of discourse pragmatics in the acquisition of subjects in Italian. Applied Psycholinguistics, 26, 437–462. DOI:  http://doi.org/10.1017/S0142716405050241

Serratrice, L. (2008). The role of discourse and perceptual cues in the choice of referential expressions in English preschoolers, school-age children, and adults. Language Learning and Development, 4, 309–332. DOI:  http://doi.org/10.1080/15475440802333619

Sharma, D. (2005). Language transfer and discourse universals in Indian English article use. Studies in Second Language Acquisition, 27, 535–566. DOI:  http://doi.org/10.1017/S0272263105050242

Sinha, R. M. K. (2009). Learning disambiguation of Hindi morpheme ‘vaalaa’ with a sparse corpus. In 2009 International Conference on Machine Learning and Applications (pp. 653–657). IEEE. DOI:  http://doi.org/10.1109/ICMLA.2009.130

Slobin, D. I. (1985). Crosslinguistic evidence for the language-making capacity. In D. I. Slobin (Ed.), The crosslinguistic study of language acquisition (pp. 1157–1256). Lawrence Erlbaum Associates, Inc.

Song, H. J., & Fisher, C. (2007). Discourse prominence effects on 2.5-year-old children’s interpretation of pronouns. Lingua, 117, 1959–1987. DOI:  http://doi.org/10.1016/j.lingua.2006.11.011

Vion, M., & Colas, A. (1999). Expressing coreference in French: Cognitive constraints and development of narrative skills. Journal of Psycholinguistic Research, 28, 261–291. DOI:  http://doi.org/10.1023/A:1023206231534

Wong, A. L. (2016). Indefinite markers, grammaticalization, and language contact phenomena in Chinese. Proceedings of the Linguistic Society of America, 1, 9–1. DOI:  http://doi.org/10.3765/plsa.v1i0.3702

Wong, A. M. Y., & Johnston, J. R. (2004). The development of discourse referencing in Cantonese-speaking children. Journal of Child Language, 31, 633–660. DOI:  http://doi.org/10.1017/S030500090400604X

Zimmermann, M. (2016). Information structure. In M. Aronoff (Ed.), Oxford bibliographies in linguistics. DOI:  http://doi.org/10.1093/obo/9780199772810-0130