PhD defence: How people can 'control' their speech

Date of news: 7 June 2022

There are about seven and a half billion people in the world, but no two of them speak in exactly the same way. Through various experiments, linguist Chen Shen investigated how speakers control and improve their speech production so that communication runs smoothly. Shen will defend her thesis on 8 June.

Photo by Andrea Piacquadio, Pexels

No two people speak in exactly the same way, not even speakers of the same language, age, gender and dialect. Speakers may differ, for example, in how quickly they speak, in how they articulate words or in how often they stumble over words when speaking. Speakers may also adapt their speaking style to the situation or environment in which they are communicating. For example, people often speak louder in noisy restaurants or articulate more clearly if their interlocutors are hearing impaired, to ensure that their message is understood.

But not all speakers are equally capable of ‘enriching’ their speech for listeners. Linguist Chen Shen's research addresses the question of how speakers can adapt their speech to meet different (communicative) requirements.

Link between articulation and cognition

In her research, Shen examined individual differences in speakers' speech production through several empirical studies. The focus was on the so-called 'late stages' of speech production, such as articulation.

In the first part of her study, Shen investigated whether there is a relationship between articulation and the executive control skills of her subjects. 'In other words, is your articulation better if you can better stay goal-oriented despite distractions?' explains Shen. 'Previous research suggests that there is no connection, as the late stages of speech production are thought to be largely automatic. But there are also studies indicating that there is indeed a connection.'

Speaking tasks

The test subjects first had to perform two speaking tasks: a Tongue Twister task and a DDK task. In the Tongue Twister task, which is often used in psycholinguistic studies, the test subjects each had to pronounce four Dutch tongue-twister sentences as quickly and as accurately as possible, with five to eight repetitions per sentence. For example: 'Ik bak een plak bakbloedworst' ('I fry a slice of blood sausage'; an English example of a tongue-twister is 'She sells seashells by the seashore').

During the diadochokinetic (DDK) task, which is often used in clinical research, subjects were asked to repeat one syllable or a sequence of several syllables as quickly and accurately as possible for 10 seconds. The subjects had to pronounce a total of three repetitive stimuli ('papapa...', 'tatata...' and 'kakaka...') and four alternating stimuli ('pataka...', 'katapa...', 'kapotte...' and 'pakketten...'), with the last two alternating stimuli being real Dutch words.

Shen compared the data from the two speech tasks with the results of three cognitive tasks that her test subjects also had to perform, which measured their executive control skills (for the curious reader: the Flanker interference task, a Letter-number switching task and an Operation Span task). The results showed that participants who scored higher on the Letter-number switching task were also better able to repeat DDK stimuli and tongue-twister sentences at a fast rate. 'Moreover, real-word DDK stimuli were produced more accurately than nonsense DDK stimuli, and the size of that difference was related to the speakers' cognitive switching ability,' says Shen. 'This shows a relationship between maximum speech accuracy and executive control, suggesting that the late stages of speech production are also related to executive control.'

Background noise

In another part of her research, Shen investigated the extent to which speakers adapt their speech in a noisy environment. This kind of speech requires extra vocal effort from speakers, possibly resulting in vocal fatigue and reduced vocal function. In a speech-in-noise task, the 78 subjects had to read out 48 sentences, first in their usual speaking style and then in a condition in which they were instructed to speak clearly while hearing loud background noise through headphones. The subjects' read-aloud sentences were rated using a computer metric (the HEGP metric), which predicts how intelligible the speaker's speech would have been for human listeners.

The results showed that despite the greater effort required to produce this kind of 'enriched' speech, speakers were generally able not only to maintain but even to improve their speech adaptations over the course of the experiment. 'Although the speakers tested immediately spoke louder, slower and more clearly in the noisy environment, they were able to continue to improve. This may be good news for hearing-impaired people, as it seems to suggest that a clear speaking style is, at least to some extent, trainable.'

Open-access corpora

The results of Shen's study show that the speech performance of individual speakers is related to their cognitive abilities and speaking demands. Moreover, speakers were found to exhibit a dynamic pattern of speech adaptation in noisy speech conditions. The data collected by Shen for her studies was published in two open-access corpora. The two corpora - the Radboud Maximum Speech Performance Corpus (RaMax) and the Radboud Lombard Corpus (RaLoCo) - open up opportunities to answer more questions about differences between speakers in different speaking conditions, and their effects on listeners.

Chen Shen will defend her thesis 'Individual Differences in Speech Production and Maximum Speech Performance' on 8 June at 12:30.