The Linguist

The Linguist-63/2-Summer24

The Linguist is a languages magazine for professional linguists, translators, interpreters, language professionals, language teachers, trainers, students and academics, with articles on translation, interpreting, business, government and technology.

Issue link: https://thelinguist.uberflip.com/i/1521779


@CIOL_Linguists SUMMER 2024 The Linguist 25 FEATURES

… an effective, sustainable and rewarding experience for language learners, whatever the medium of instruction.

The building blocks for learning

One well-known principle of learning is error-correction learning. This assumes that an organism gradually builds relevant relationships between elements in its environment to gain a better understanding of the world. Learning a language also involves building relations, correcting errors and gradually improving our performance.

But what do we need to pick up when we are exposed to a new language? Natural languages exhibit a unique property: a small number of words are very frequent, but the vast majority are rarely used. In English, for example, 10 words make up 25% of usage¹ and you need only around 1,000 words to be able to understand 85% of what an average speaker says. This observation has been used to make informed decisions about learning priorities, in particular which words should be taught first.

To illustrate our approach, let's look at an area that is notoriously difficult for language learners: tense/aspect. Generally, English grammars assume the existence of 12 tense/aspect combinations. These arise from three tenses and four aspects (see Table 1). In a traditional grammatical approach, you would define the abstract meaning of each tense and each aspect separately. Tense is relatively easy: if something happened yesterday, you use a past tense; if something will happen tomorrow, you use a future tense. Aspect can also be explained concisely: the simple aspect is there to express a fact; the perfect is for actions which are completed but retain some relevance to the present situation; the progressive describes an event that happens over a period of time. It's an economical method and works to teach forms and their labels, but it is terribly inadequate when it comes to enabling learners to use those forms.

An approach that factors in actual usage yields much more effective results. If we retain the relation between verbs and tense/aspect markers we see that the cells (Table 1) are occupied by a different subset of verbs (see Table 2): some verbs occur preferably with specific tense/aspect markers ('replied' in the simple past), while other tense/aspect combinations are typically accompanied by contextual elements ('since then' and the past perfect). While this provides learners with knowledge they can apply directly, it does look like madness: do we really memorise these tense and aspect preferences for each individual verb?

When we use computational techniques to check millions of examples it quickly becomes clear that those 12 cells (Table 1) aren't all equally important: the present and past simple make up more than 80% of all examples. This suggests that users do not get exposed to all forms equally, and this has implications for learning.

Building an algorithm

To explore how this system would be learnt we turned a simple but fundamental rule of error-correction learning, the Rescorla-Wagner rule², into a computational algorithm. In this way we have a model that mimics how people learn from raw language data in a naturalistic way. The algorithm uses the target form's immediate context as cue(s) to inform the choice of verb form, as people do. For example, in the sentence 'Almost a year later nothing has happened', the cues are individual words (what we refer to as '1-grams'), including 'almost', 'year', 'later', 'nothing', 'a', but also '2-grams', such as 'almost#a', 'a#year', and '3-grams', e.g. 'a#year#later', alongside the verb 'happen'.

After the model has worked its way through a large number of examples, calculating the strength of co-occurrence between the 1-, 2- and 3-grams and forms of the verb 'happen', the model is tested. It is given a …
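The frequency observations in this article (a handful of words covering a large share of usage, and two tense/aspect cells covering most examples) rest on the same simple calculation: what share of all tokens is accounted for by the k most frequent items? The Python sketch below illustrates that calculation. The function name `top_k_coverage` and the sample sentence are invented for illustration; they are not taken from the article or from any corpus.

```python
from collections import Counter


def top_k_coverage(tokens, k):
    """Return the share of all tokens covered by the k most frequent types."""
    counts = Counter(tokens)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    covered = sum(count for _, count in counts.most_common(k))
    return covered / total


# Toy text, invented for illustration only (not corpus data).
text = "the cat sat on the mat and the dog sat on the rug"
tokens = text.split()

print(top_k_coverage(tokens, 1))  # share covered by the single most frequent word
print(top_k_coverage(tokens, 3))  # share covered by the three most frequent words
```

The same function could be applied to a list of tense/aspect labels rather than words, to see how much of the data the most common forms account for.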
Figure 1: The prevalence of each tense/aspect in the British National Corpus.

Figure 2: Prevalence of cues that enable language users to determine which aspect/tense to use. 1-grams are one-word contextual cues (e.g. 'almost'), 2-grams are two-word contextual cues (e.g. 'a#year'), lexical items refer to the verbs that are used (e.g. 'reply').
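To make the idea of learning from contextual cues more concrete, here is a minimal Python sketch of an error-correction learner in the spirit of the Rescorla-Wagner rule. It is not the authors' implementation. The names `extract_cues` and `RescorlaWagner`, the single learning rate (which folds the rule's usual salience parameters into one number), the number of passes and the three training examples are all assumptions made for illustration. Each learning event nudges the strength between every cue present and every possible verb form in proportion to the prediction error.

```python
from collections import defaultdict


def extract_cues(context_words, lemma=None, max_n=3):
    """Build 1- to max_n-gram cues (e.g. 'a#year#later') plus an optional verb lemma."""
    cues = []
    for n in range(1, max_n + 1):
        for start in range(len(context_words) - n + 1):
            cues.append("#".join(context_words[start:start + n]))
    if lemma is not None:
        cues.append(lemma)
    return cues


class RescorlaWagner:
    """Simplified learner: one learning rate, maximum strength of 1.0."""

    def __init__(self, outcomes, learning_rate=0.05, max_strength=1.0):
        self.outcomes = list(outcomes)
        self.learning_rate = learning_rate
        self.max_strength = max_strength
        self.weights = defaultdict(float)  # (cue, outcome) -> association strength

    def activation(self, cues, outcome):
        """Summed strength of the present cues for one outcome."""
        return sum(self.weights[(cue, outcome)] for cue in cues)

    def learn(self, cues, observed):
        """Update all present cues towards the observed outcome and away from the rest."""
        cues = set(cues)
        for outcome in self.outcomes:
            target = self.max_strength if outcome == observed else 0.0
            error = target - self.activation(cues, outcome)
            for cue in cues:
                self.weights[(cue, outcome)] += self.learning_rate * error

    def predict(self, cues):
        """Return the outcome with the highest summed cue strength."""
        cues = set(cues)
        return max(self.outcomes, key=lambda o: self.activation(cues, o))


# Invented toy examples: (context words, verb lemma, tense/aspect label).
examples = [
    (["almost", "a", "year", "later", "nothing"], "happen", "present perfect"),
    (["she", "never"], "reply", "past simple"),
    (["since", "then", "nothing"], "happen", "past perfect"),
]
labels = sorted({label for _, _, label in examples})
model = RescorlaWagner(labels)
for _ in range(50):  # several passes, because the toy data set is tiny
    for words, lemma, label in examples:
        model.learn(extract_cues(words, lemma), label)

print(model.predict(extract_cues(["since", "then", "nothing"], "happen")))
```

With only three invented sentences the model simply recovers the forms it was trained on; the point is the mechanism, in which cues shared between examples ('nothing', 'happen') end up weaker predictors than cues tied to one form ('since#then').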
