A unified neurocomputational bilateral model of spoken language production in healthy participants and recovery in poststroke aphasia

Significance Studies of healthy and impaired language have generated many verbally described hypotheses. While these verbal descriptions have advanced our understanding of language processing, some explanations are mutually incompatible, and it is unclear how they work mechanistically. We constructed a neurocomputational bilateral model of spoken language production to simulate a range of phenomena in healthy participants and patients with aphasia simultaneously, including language lateralization, impaired performance after left but not right damage, and hemispheric involvement in plasticity-dependent recovery. The model demonstrates how seemingly contradictory findings can be simulated within a single framework. This provides a coherent mechanistic account of language lateralization and recovery from poststroke aphasia.


S1. Explorations of the selection of the number of hidden units in the model
To determine the minimum number of units required for the model to perform the repetition task, we developed a unilateral model with different numbers of hidden units. The selection principle followed our assumption that the key difference between the left and right pathways in the model should be quantitative, in terms of differential capacity, rather than qualitative, in terms of function. We therefore ensured that the unilateral model was capable of performing the word and nonword repetition tasks to a satisfactory level (i.e., at least 80% accuracy for both words and nonwords). The architecture of the model and its accuracy on the word and nonword repetition tasks are illustrated in Fig. S1. The resulting number of hidden units was used for the right processing pathway in the left-lateralised model reported in the main text.

S2. Explorations of the model's recovery without the implementation of inefficient learning of the surviving units after damage
To simulate behavioural patterns in post-stroke aphasia and recovery, we trained the damaged model with an initial period of inefficient learning. This mimicked the loss of function and activation in damaged brain regions immediately after stroke observed in most patients (1). However, to demonstrate that this implementation is not a critical factor in explaining the different behavioural recovery patterns, we re-trained the damaged model without such an inefficiency period; that is, immediately after damage, the surviving units in hidden layer 1 could learn as efficiently as the units in the unaffected layers. The levels of damage and the training time for recovery were the same as those described in the main text. Fig. S2 shows the recovery patterns of the damaged model without initial inefficient learning in the different lesion conditions.
The resulting performance and output activation patterns were broadly similar to those produced by the model with initial inefficient learning (Fig. 3). When the left lesion was mild, the activation patterns tended to return to being left lateralised during recovery. By contrast, when the left lesion was more severe, the activation patterns became right lateralised, and this shift in activation led to relatively poor performance, particularly for nonwords. One difference was that the transient shift from left to right and then back to left previously observed in the left mild lesion condition (Fig. 3) was less pronounced. Nevertheless, activity in the right pathway rapidly increased immediately after damage, with decreased activity in the left pathway, though there was no crossover.
Regarding all of the other measures, the patterns were very similar to those reported in Fig. 3. These results demonstrate that the simulation without initial inefficient learning could capture the general patterns across the different recovery phases. However, to better characterise the shift in activation patterns in the acute phase, the implementation of initial inefficient learning is critical for simulating the loss of function and activation in damaged brain regions immediately after stroke observed in most patients (1).

S3. A figure for the full simulation patterns of post-stroke aphasia and recovery, including six measures: model performance, output unit activation, hidden unit activation, weight strength, rate of weight change, and RSA correlation
Six different measures illustrated in Figure S3 were used to reveal the underlying recovery mechanism of the damaged model. In particular, average weight strength and weight change across the hidden layers in the model were useful for understanding how the model re-learned the task during recovery and what the link was between recovery performance and re-learning processes. For instance, in the left severe lesion condition, the right output unit activation increased rather quickly after damage, and this was also reflected in an initial rise in the rate of weight change. However, performance accuracy had not started to improve at that time. When output unit activation reached a steady state, the weights continued to be updated and performance gradually improved. This may indicate two critical steps for re-learning: activation and tuning of weight connections. Immediately after damage, the activation level of units in the model is generally low. Thus the first step toward relearning is to increase the activation level and weight connections, and this is followed by reoptimising weight connections in order to re-learn the task by minimising the errors between the target and actual patterns at the output layer.
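The two weight-based measures described above can be sketched as follows. This is a minimal illustration, not the model's actual code: the matrix sizes, the random weights, and the two-epoch snapshot are all assumptions made for demonstration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical weight matrices for one hidden layer at two consecutive
# points in recovery training (sizes and values are illustrative only).
w_prev = rng.normal(0.0, 0.5, size=(25, 100))
w_curr = w_prev + rng.normal(0.0, 0.01, size=w_prev.shape)

# Average weight strength: mean absolute weight across all connections.
weight_strength = np.mean(np.abs(w_curr))

# Rate of weight change: mean absolute change between the two snapshots.
rate_of_change = np.mean(np.abs(w_curr - w_prev))

print(f"average weight strength: {weight_strength:.4f}")
print(f"rate of weight change:   {rate_of_change:.4f}")
```

Tracking these two quantities separately is what allows the dissociation noted above: the rate of change can rise (re-learning has begun) while overall weight strength, and hence performance, has barely moved.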

S4. Explorations of the inhibitory interconnectivity between the left and right hemispheres
To our knowledge, there is no direct evidence of transcallosal inhibitory connectivity outside

M1. Phonological representations
The training set included one hundred three-phoneme high-frequency and one hundred three-phoneme low-frequency monosyllabic words with consonant-vowel-consonant (CVC) structures.
Each word was represented in three phoneme slots, with each slot consisting of 25 phonetic features (including voiced, nasal, labial, palatal, round, etc.). For instance, the phonological representation of the word "let" was
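The slot-based feature encoding described above can be sketched as follows. Note that the full 25-feature inventory is not listed in the text, so the feature positions and the features assigned to each phoneme here are illustrative assumptions, not the model's actual feature set.

```python
import numpy as np

N_FEATURES = 25   # phonetic features per phoneme slot (from the text)
N_SLOTS = 3       # CVC structure: onset consonant, vowel, coda consonant

# Hypothetical positions for a few of the named features; the remaining
# feature dimensions are left unspecified.
FEATURES = {"voiced": 0, "nasal": 1, "labial": 2, "palatal": 3, "round": 4}

def encode_phoneme(active_features):
    """Return a 25-dimensional binary feature vector for one phoneme."""
    vec = np.zeros(N_FEATURES)
    for f in active_features:
        vec[FEATURES[f]] = 1.0
    return vec

# An illustrative CVC word: one vector per slot, concatenated into a
# single 75-dimensional binary pattern.
word = np.concatenate([
    encode_phoneme(["voiced"]),           # onset consonant (voiced)
    encode_phoneme(["voiced", "round"]),  # vowel
    encode_phoneme([]),                   # coda consonant (voiceless)
])
print(word.shape)  # (75,)
```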

M2. Training environment
The model was trained with a learning rate of 0.01, a batch size of 1 and a momentum of 0.9, using a standard back-propagation algorithm with a negative bias of -2. The sigmoid function was used as the output activation function. The weight decay was set to 0.000001. Weight connections in the model were updated after each word presentation on the basis of the cross-entropy error computed between the target and the actual activation of the output units (for details, see the following section). There was no dropout and no regularisation term. Note that a simple recurrent network generally has a sequential update procedure, which means that layers in the network are updated in order. To prevent the order of updates from biasing the model's reliance on one pathway, a counterbalanced update sequence at the batch level was used during training.
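The hyperparameters above can be combined into a single weight-update step as sketched below. This is a simplified single-layer feedforward sketch, not the recurrent architecture itself: the layer sizes, the input/target patterns, and the number of updates are assumptions made for illustration only.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hyperparameters from the text.
LR = 0.01
MOMENTUM = 0.9
WEIGHT_DECAY = 1e-6
BIAS = -2.0  # fixed negative bias added to each unit's net input

def sigmoid(net):
    return 1.0 / (1.0 + np.exp(-net))

# Illustrative sizes: 75 input units, 75 output units.
w = rng.normal(0.0, 0.1, size=(75, 75))
velocity = np.zeros_like(w)

def update(x, t, w, velocity):
    """One per-word update: cross-entropy error, sigmoid outputs,
    momentum, and weight decay."""
    o = sigmoid(x @ w + BIAS)
    # For sigmoid outputs with cross-entropy error, the error signal
    # at the output simplifies to (o - t).
    grad = np.outer(x, o - t) + WEIGHT_DECAY * w
    velocity[:] = MOMENTUM * velocity - LR * grad
    w += velocity
    return o

x = rng.integers(0, 2, size=75).astype(float)  # one input pattern
t = rng.integers(0, 2, size=75).astype(float)  # its target pattern

err_before = np.mean(np.abs(sigmoid(x @ w + BIAS) - t))
for _ in range(200):
    o = update(x, t, w, velocity)
err_after = np.mean(np.abs(o - t))
print(f"error before: {err_before:.3f}, after: {err_after:.3f}")
```

The negative bias keeps units off by default, which suits sparse phonological targets: only units with positive learned input exceed the 0.5 threshold.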

M3. Cross-entropy error measure for back-propagation in the model
In neural network modelling, the back-propagation algorithm is often used to compute how to change connection weights in order to minimise the errors between output patterns generated by the model and their target patterns. Different error measures can be applied, including the summed squared error (5) and the cross-entropy error (6). The summed squared error measures the sum of squared errors across all output units: $E_{SSE} = \sum_i (o(i) - t(i))^2$, where o(i) is the output of unit i and t(i) is its target value.
Alternatively, the cross-entropy error measures the Kullback-Leibler divergence (7) between the output pattern and the target pattern: $E_{CE} = -\sum_i [\, t(i) \log o(i) + (1 - t(i)) \log (1 - o(i)) \,]$, where o(i) is the output of unit i and t(i) is its target value.
In our simulations, we used the cross-entropy error measure because it generates larger weight changes than the summed squared error measure. This can be particularly important when training the model with sparse representations (6), such as our phonological representations: most of the output units should be turned off, and sufficiently large weight changes are required to shift the model out of this stable state for the few units that need to be on.
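The advantage of cross-entropy for sparse targets can be seen by comparing the two error derivatives for a single sigmoid output unit. For summed squared error, the derivative with respect to the unit's net input includes the sigmoid slope o(1-o), which is nearly zero when the unit is strongly off; for cross-entropy, that slope cancels. The net input value below is an illustrative assumption.

```python
import numpy as np

def sigmoid(net):
    return 1.0 / (1.0 + np.exp(-net))

# An output unit that should be ON (t = 1) but is strongly off,
# as happens with sparse phonological targets.
net, t = -4.0, 1.0
o = sigmoid(net)

# Error derivatives with respect to the unit's net input:
# summed squared error retains the sigmoid slope o*(1-o);
# cross-entropy cancels it, leaving just (o - t).
grad_sse = (o - t) * o * (1.0 - o)
grad_ce = o - t

print(f"output = {o:.3f}")
print(f"SSE gradient = {grad_sse:.4f}")  # small: slope nearly flat
print(f"CE  gradient = {grad_ce:.4f}")   # much larger in magnitude
```

With summed squared error, a unit stuck near zero receives almost no pressure to turn on; cross-entropy delivers a gradient proportional to the full error, which is exactly what the few "on" units in a sparse pattern need.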

M4. Testing procedures
The phonological representation of each phoneme was presented sequentially for the first three time ticks. From the fourth time tick, the activation of units at the output phonological layer was recorded. If unit activation was greater than 0.5, the unit output was set to 1; otherwise, it was 0.
The model's output pattern was then compared with the target representation of each phoneme from the fourth time tick to the sixth time tick sequentially. If all of the phonemes produced by the model matched the target phonemes, the model was judged to have spoken the word (or nonword) correctly.
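The scoring procedure above can be sketched as follows. The function name and the short 4-feature vectors are illustrative assumptions (the model uses 25 features per slot); the thresholding and all-or-nothing matching follow the text.

```python
import numpy as np

THRESHOLD = 0.5

def judge_repetition(output_activations, target_phonemes):
    """Binarise the output at each tick and compare slot by slot.

    output_activations: three activation vectors (ticks 4-6).
    target_phonemes: three binary target vectors.
    Returns True only if every phoneme matches exactly.
    """
    for out, target in zip(output_activations, target_phonemes):
        produced = (np.asarray(out) > THRESHOLD).astype(int)
        if not np.array_equal(produced, np.asarray(target)):
            return False
    return True

# Illustrative 4-feature vectors for a three-phoneme item.
target = [[1, 0, 0, 1], [0, 1, 1, 0], [1, 0, 1, 0]]
good = [[0.9, 0.1, 0.2, 0.8], [0.1, 0.7, 0.6, 0.3], [0.8, 0.2, 0.9, 0.1]]
bad = [[0.9, 0.1, 0.2, 0.8], [0.1, 0.4, 0.6, 0.3], [0.8, 0.2, 0.9, 0.1]]

print(judge_repetition(good, target))  # True
print(judge_repetition(bad, target))   # False: one feature below threshold
```

Note that this is all-or-nothing scoring: a single feature on a single phoneme crossing the wrong side of the 0.5 threshold makes the whole item incorrect.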