Inferring transmission trees to guide targeting of interventions against visceral leishmaniasis and post–kala-azar dermal leishmaniasis

Significance Methods for analyzing individual-level geo-located disease data have existed for some time, but have rarely been used to analyze endemic human diseases. Here we apply such methods to nearly a decade’s worth of uniquely detailed epidemiological data on incidence of the deadly vector-borne disease visceral leishmaniasis (VL) and its secondary condition, post–kala-azar dermal leishmaniasis (PKDL), to quantify the spread of infection around cases in space and time by inferring who infected whom, and estimate the relative contribution of different infection states to transmission. Our findings highlight the key role long diagnosis delays and PKDL play in maintaining VL transmission. This detailed characterization of the spatiotemporal transmission of VL will help inform targeting of interventions around VL and PKDL cases.


Model Description
The model used in this paper (Fig. S1) is an extension of our previously published individual-level spatiotemporal SEIR model of visceral leishmaniasis (VL) transmission (1) to explicitly include asymptomatic infection and post-kala-azar dermal leishmaniasis (PKDL), and account for unobserved initial infection statuses and migration of individuals. We measure time in units of months, with t = 1 corresponding to the start of the study (January 2002) and t = T = 108 the end of the study (December 2010), and label individuals who developed VL symptoms between January 2002 and December 2010 by i = 1, 2, . . . , nI (nI = 1018) and the remainder of the population by i = nI + 1, nI + 2, . . . , n (n = 25506). Between January 2002 and December 2010, 725 individuals relocated between different households within the study area, which we account for by including a second observation for these individuals in their second household. These internal migrators are essentially treated as extra individuals for the purpose of calculating pairwise distances and transmission rates. However, their infection status is updated in such a way that it is consistent across the two observations (e.g. if they are asymptomatically infected in the period of the first observation and do not recover before relocating, their asymptomatic infection status is carried over to the second observation). We denote the vectors of event times for individuals i = 1, . . . , n as follows: Upon infection, individuals either develop pre-symptomatic infection with probability pI or asymptomatic infection with probability 1 − pI (see Table S2 for values of fixed parameters used in the model). Pre-symptomatic individuals progress to symptomatic VL following a variable incubation period, and then upon treatment to recovery, or dormant infection if they later relapse or develop PKDL. VL cases that relapse either return to dormant infection or progress to recovery once re-treated. Asymptomatically infected individuals either recover naturally or very occasionally progress to dormant infection and subsequent PKDL. PKDL cases can either resolve following treatment or naturally, whereupon they enter the recovered class. We assume that recovered individuals do not return to being susceptible, i.e. cannot be reinfected, irrespective of whether they have recovered from VL, PKDL or asymptomatic infection. It is still uncertain whether individuals can become reinfected with L. donovani parasites, particularly asymptomatically infected individuals. Most previous modelling studies have assumed or concluded that individuals can be reinfected (2)(3)(4), but have not accounted for spatial variation in transmission and available evidence suggests that repeat episodes of VL are relatively rare and are due to relapse not reinfection (5)(6)(7), and that in highly endemic settings a high proportion of asymptomatically infected individuals develop long-term protective cell-mediated immunity following infection (8,9). For each individual i in the study population, we define Vi = max(0, Bi, IM i) and Wi = min(T + 1, EM i, Mi) as their times of entry into and exit from the study area respectively. All non-symptomatic individuals (individuals without any VL or PKDL symptoms before or during the study) who were born or entered the study area after January 2002 are assumed to have been susceptible upon entry.
A. Pairwise infection pressures. Susceptible individuals become infected either from pre-symptomatically infected individuals, from symptomatic VL cases, PKDL cases, asymptomatic individuals, or 'background' (unexplained) transmission. The 'infection pressure' exerted by an infected individual j on a susceptible individual i at time t is given by: λij(t) = (βK(dij) + δ1ij)hj(t), i ∈ S(t), j ∈ E(t) ∪ A(t) ∪ I(t) ∪ P(t), [S1] where β is the rate constant for spatial transmission between infected and susceptible individuals; K(dij) is the spatial kernel function that scales the infection pressure by the distance dij between individuals i and j; δ (≥ 0) is a rate constant for additional within-household transmission; 1ij is an indicator function for individuals living in the same household, i.e. 1ij = 1, if i and j are in the same household, 0, otherwise; and hj(t) is the infectiousness of j at time t. Individuals are assumed to be stationary in their households when transmission to and from sandflies occurs (since the vast majority of sandfly biting occurs at night when individuals are asleep in, or directly outside, their homes (10)(11)(12)(13)(14)), so that the distances between them dij are fixed when transmission takes place. However, the changes in the distances between individuals that occur when individuals migrate within the study area is incorporated into the model (by calculating the distances between migrators and all other individuals both when they are in their first household and when they are in their second household).
Since we found little difference in the goodness of fit of the model (based on the Deviance Information Criterion (DIC)) between an exponentially decaying spatial kernel and a Cauchy-type kernel in our previous study (1) (the exponential kernel gave a marginally better fit), we use the exponential kernel here: [S2] where 1/α is the distance decay rate (per m) in transmission risk (so smaller values of α correspond to a faster decrease in risk with distance), and K0 is a normalisation constant, defined such that which reduces correlation between α and β and thus improves the mixing of the MCMC algorithm ( §2B.4). However, unlike in our previous study, we allow the possibility of transmission between individuals in different paras (hamlets). The reason for this is that surveying of neighbouring paras was more complete in the two clusters of paras in the present study, and thus some minimum inter-para household distances were less than intra-para household distances and within the estimated short-term flight range of P. argentipes sandflies (a few hundred metres (15)(16)(17)), so transmission between neighbouring surveyed paras may have occurred.

B. Infectiousness over time.
There is relatively little data available on the infectiousness of individuals in different L. donovani infection states over time. We therefore assume that the infectiousness of individuals remains constant within each state and individuals in the same state have the same probability of passing on infection, such that an individual's infectiousness over time takes the form of a step function (Fig. S1B). The relative infectiousness of asymptomatic individuals compared to VL cases, h4, is assumed to take a fixed value between 0 and 0.02 based on results from xenodiagnosis studies in which 0 of 183 asymptomatically infected individuals infected sandflies (95% confidence interval for transmission probability: 0-0.023) (18), and estimates from previous modelling studies (3,4,19) (Table S1). We test different values of h4 to assess the sensitivity of the other parameter estimates to the assumed asymptomatic infectiousness. In the absence of data on the relative infectiousness of pre-symptomatic individuals and asymptomatic individuals, we take the relative infectiousness of pre-symptomatic individuals, h0, to be the same as that of asymptomatic individuals (i.e. h0 = h4).
One-hundred and thirty-eight of the 190 PKDL cases underwent one or more examinations by a trained physician to determine the type and extent of their lesions (Table S1). Data from a recent xenodiagnosis study in Bangladesh (20) on 47 PKDL patients (21 nodular and 26 macular/papular) and 15 VL patients is used to assign infectiousness to these cases according to their lesion type (macular/papular, plaque, or nodular) ( Table S1). The results of the xenodiagnosis study suggest that infectiousness of PKDL increases with lesion severity and macular/papular PKDL cases are less infectious towards sandflies than VL cases but nodular PKDL cases more so. As plaques were not treated as a separate lesion type in the study, but are intermediate in severity between macular/papular lesions and nodular lesions (21), a value halfway between the infectiousness of macular/papular PKDL cases and nodular PKDL cases is assigned to these individuals. The 52 PKDL cases that were not physically examined during the study are assigned an average of the infectiousnesses of the different lesion types (weighted by frequency amongst the examined cases). Although it is not known for certain whether treated VL cases who subsequently relapse or develop PKDL are uninfectious following treatment, we assume dormantly infected individuals are uninfectious, based on rapid decreases in parasitaemia observed in VL cases following the start of treatment (22,23)). We assume that relapse cases are as infectious upon relapse as in their first clinical episode, as relapse appears to be associated with resurgence in parasitaemia to high parasite loads (23,24). Over half (100/190=53%) of the PKDL cases did not receive treatment, and of these 49 self-resolved in a median time of 20 months (interquartile range (IQR) 14-32 months). These cases are assumed to have remained infectious until their lesions resolved, while treated PKDL cases are assumed to have stopped being infectious once their treatment commenced (25). Thus, individual j's infectiousness at time t is given by [S3] C. Incubation period. Following previous work (1), we model the incubation period as negative binomially distributed NB(r, p) with fixed shape parameter r = 3 and 'success' probability parameter p, and support starting at 1 (such that the minimum incubation period is 1 month), i.e. via the following probability mass function (PMF): We estimate p in the MCMC algorithm for inferring the model parameters and missing data (see §2).
D. VL onset-to-treatment time distribution. Several VL cases with onset before 2002 have missing symptom onset and/or treatment times (only their onset year is recorded), and may therefore have been infectious at the start of the study period. In order to be able to infer the onset-to-treatment times of these cases, OT j = min(R j , D j ) − I j (where primed variables are the missing counterparts of the observed variables), in the MCMC algorithm (see §2) we model the onset-to-treatment time distribution as a negative binomial distribution NB(r1, p1) (Table S2) and fit to the onset-to-treatment times of all VL cases for whom both onset and treatment times were recorded (Fig. S2A): to obtain r1 = 1.34 and p1 = 0.38 (corresponding to a mean onset-to-treatment time of 3.2 months).

E. Asymptomatic infection duration.
Neither the infection or recovery time of asymptomatically infected individuals is observed, so we need to specify a distribution for the asymptomatic infection duration in order to infer their recovery times. Based on a previous multi-state Markov model of the natural history of VL (26), in which it was assumed that the duration of asymptomatic infection is exponentially distributed and its mean was estimated as approximately 5 months, we assume the asymptomatic infection duration, AIPj = Rj − Aj, follows a geometric distribution (the discrete analog of the exponential distribution) Geom(p2), with a minimum duration of 1 month and a mean of 5 months (p2 = 1 5 ): The choice of the geometric distribution is partly motivated by the fact that it is a memoryless distribution, i.e. P(AIPj > s + t|AIPj > s) = P(AIPj > t). This simplifies the estimation of recovery times for initially asymptomatically infected individuals (see §1I) in the MCMC algorithm as it means that the probability that an initially asymptomatically infected individual remains infected for a further t months is the same as if they were infected in month 0 regardless of when they were infected.

F. Dormant infection duration.
Sixteen (8.4%) of the 190 PKDL cases reported no prior history of VL, and are assumed to have been previously asymptomatically infected. We assume that their duration of dormant infection prior to PKDL and that of VL cases who develop PKDL follow the same negative binomial distribution, NB(r3, p3), and estimate r3 and p3 by fitting to the observed VL-treatment-to-PKDL-onset times by maximum likelihood estimation (Fig. S2B). Unlike for the incubation period, we take 0 months to be the minimum duration of dormant infection, since two VL cases developed PKDL in the same month as they were treated for VL, and one had simultaneous VL and PKDL. Thus the PMF is: where r3 = 1.73 and p3 = 0.064 (corresponding to a mean duration of dormant infection of 25 months).
G. Time to relapse and relapse duration. Forty-five VL cases suffered treatment failure or relapse during the study. Of these, 16 were treated with miltefosine and 29 with sodium stibogluconate (SSG). Of those treated with miltefosine, 15 were treated with counterfeit drug in 2008 (27) and all but one of these cases reported no gap between the start of treatment and recurrence of symptoms (the other case reported a gap of 30 days). We assume that individuals without any gap between treatment and symptom recurrence suffered treatment failures, and that their infectiousness continued for 1 month following the start of treatment until they were treated with SSG. The gap between the start of treatment and new symptoms occurring was recorded for 7 of the cases originally treated with SSG or miltefosine from a clinical trial, and was non-zero in all cases. We assume that all cases not recorded as having immediate recurrence of symptoms suffered VL relapse and that the time to relapse follows a geometric distribution Geom(p4) with PMF: where fitting to the recorded gaps gives p4 = 0.13 (corresponding to a mean time to relapse of 7.9 months). Relapse cases are assumed to be uninfectious from their treatment month to their relapse time and their duration of symptoms upon relapse is assumed to follow the same distribution as the onset-to-treatment time for a first VL episode (Eq. (S5)). We assume all relapse cases were treated for relapse before the end of the study, since the latest treatment time for primary VL in a case that subsequently relapsed was April 2009. H. Infection pressure. The total infection pressure on susceptible individual i at time t is given by the sum of the infection pressures on them from all infectious individuals (VL cases, PKDL cases, pre-symptomatic individuals and asymptomatic individuals) at time t in Eq. (S1) (see Fig. S1C) plus a constant background transmission rate to account for unexplained infections due to non-explicitly modelled factors (e.g. due to short-term movement of individuals): where Inf (t) = E(t) ∪ A(t) ∪ I(t) ∪ P(t) is the set of all individuals infectious at time t. The transmission process is thus described by a discrete-time approximation to an inhomogeneous Poisson process with rate λ(t) = i∈S(t) λi(t) (28). The probability of susceptible individual i remaining susceptible in any particular month t is: while the probabilities of pre-symptomatic or asymptomatic infection in month t given susceptibility up to month t − 1 are, respectively: [S12] I. Model for initial status of non-symptomatic individuals. As there was transmission and VL in the population before the start of the study, individuals with no record of VL prior to 2002 may have been asymptomatically infected before the start of the study. Thus, the initial statuses of non-symptomatic individuals are censored and need to be estimated. Although the data on VL pre-2002 is likely incomplete, we adopt the simplifying assumption that any individual that had VL prior to 2002 is at least recorded as having had previous VL (even if their onset and treatment times are missing, imprecise or inaccurate), such that anyone in the rest of the population who was infected prior to 2002 could only have had asymptomatic infection. We also average over historical and spatial variation in the transmission rate, by assuming that the asymptomatic infection rate prior to 2002, λ0, was constant. We assume that all 16 individuals who developed PKDL during the study period without prior VL were asymptomatically infected during the study rather than before (i.e. we ignore the possibility that such individuals were initially dormantly infected). The latter assumption is not unreasonable despite the estimated long duration of dormant infection, as the earliest PKDL onset amongst these individuals was in November 2005 (47 months into the study), and it is unlikely to significantly affect the results given the small number of such cases. With these assumptions we arrive at the simplified model for the initial status of non-symptomatic individuals shown in Fig. S3. The probabilities of each non-symptomatic individual initially present (i.e. with Vj = 0) being susceptible, asymptomatically infected, or recovered from asymptomatic infection at time t = 0 (pS 0 , pA 0 , pR 0 ) can then be found by calculating the probability of avoiding asymptomatic infection in every month from their birth to the start of the study (Eq. (S13)), summing over the probabilities of being asymptomatically infected in one of the months between their birth and the start of the study and recovering after the start of the study (Eq. (S14)), and summing over the probabilities of being asymptomatically infected in a month before the start of the study and recovering before the start of the study (Eq. (S15)), respectively: where aj is the age of individual j in months at t = 0. Since we assume that non-symptomatic individuals who are born, or who immigrate into the study area, after the start of the study (with Vj > 0) are susceptible, for notational convenience we define the probabilities for these individuals as pS 0 (aj) = 1, pA 0 (aj) = pR 0 (aj) = 0. We estimate the historical asymptomatic infection rate, λ0, by fitting the model to age-prevalence data on leishmanin skin test (LST) positivity amongst non-symptomatic individuals from a cross-sectional survey of three of the study paras conducted in 2002 (8) (see Fig. S4). We assume that entering state R corresponds to becoming LST-positive, as LST positivity is a marker for durable, protective cell-mediated immunity against VL (8,9), and estimate λ0 by maximising the binomial likelihood: where nL = 479 is the number of non-symptomatic individuals that were LST-positive out of nNS = 1399 individuals tested. This gives λ0 = 0.0019 month −1 (95% confidence interval 0.0017-0.0021 month −1 ). J. Complete data likelihood. We assume that the end of the epidemic was observed from the point of view of VL and PKDL cases, i.e. every individual in the study population who eventually developed VL or PKDL did so within the observation period. Whilst it is likely that there was ongoing transmission beyond the end of the study in December 2010, this simplifying assumption should not introduce significant error based on the epidemic curves in Fig. 1 in the main text and the very low numbers of VL and PKDL cases with onset in the final months of the study. As noted in the main text, there is a considerable amount of missing data, including the infection times of VL cases (E) and the infection and recovery times of asymptomatic individuals (A and RA). So that we can specify the complete data likelihood -the likelihood of all events if all variables had been observed -and perform likelihood-based inference, we define the sets of all observed data and missing data as Y = (B, IM, EM,Ĩ,RI ,ĨR, DI , P, RP , M) and X = (A, E, RA, I , R I , DA, I R , RR) respectively, whereĨ,RI andĨR are the observed onset and recovery times of VL cases and relapse times of VL relapse cases; I , R I and I R are their missing counterparts; and DI and DA are the start times of dormant infection for VL cases and asymptomatic individuals. With these definitions, the complete data likelihood for the augmented data Z = (Y, X) given the model parameters θ = (β, α, , δ, p) is composed of the products of the probabilities of all the different individual-level events over all months: Here, L1 is the overall probability of individuals avoiding infection (from Eq. (S10)) while subject to infection pressure from infectious individuals around them (Eq. (S9)), L2 is the probability of all pre-symptomatic infections which occur (from Eq. (S11)), L3 is the probability of the unobserved incubation periods of VL cases (from Eq. (S4)), L4 is the probability of all unobserved asymptomatic infections which occur (from Eq. (S12)), L5 is the probability of individuals' unobserved initial infection statuses according to the historical transmission rate (from Eqs (S13)-(S15)), and is the cumulative distribution function of the asymptomatic infection period distribution (Eq. (S6)).
Given the length and variability of the incubation period, some VL cases with onset early in the study (i.e. in 2002) may have been infected before 2002. However, unlike in our previous work (1), we cannot assume that these cases arose due to background transmission, as there was not an extended period without VL cases directly before 2002. In fact, 413 individuals in the study area were recorded as having VL with onset prior to 2002, 141 of these in 2000 or 2001. Since the data on VL occurrence, and onset and treatment times before 2002 is less complete and probably less reliable than from 2002 onwards, there may be missing information on potential sources of infection of cases with onset during 2002. We therefore exclude these cases from the infection probability part of the likelihood (L2 in Eq. (S16)), but still impute their infection times in the MCMC algorithm ( §2B.4) by drawing new infection times for them using the incubation period distribution, and accepting/rejecting these based on the effect they have on the likelihood of the infection times of other individuals.

Parameter Estimation
A. Bayesian inference. Since the infection times of VL cases (E) and infection and recovery times of asymptomatic individuals (A and RA) were unobserved, and some recovery, relapse and relapse treatment times of VL cases (R I , I R , RR) were not recorded, and it is not computationally feasible to sum over all possible configurations of this missing data to calculate the complete data likelihood (Eq. (S16)), we treat the missing times as extra parameters and use a data augmentation approach to sample from the joint posterior distribution of the model parameters θ = (β, α, , δ, p) and the missing data X given the observed data Y [S17] We do this using a Metropolis-within-Gibbs MCMC data augmentation algorithm in which we iterate between sampling from the conditional posterior distribution of the parameters given the observed data and the current value of the missing data, P(θ|Y, X), and the conditional posterior distribution of the missing data given the current parameter values and the observed data, P(X|θ, Y) (1,29).
B. MCMC data augmentation scheme.

B.1. Asymptomatic infection times.
To account for the uncertainty in the parameter estimates due to the missing asymptomatic infection and recovery times we need to estimate which individuals were asymptomatically infected during the study and when they were infected, as part of the MCMC algorithm. To do this, we assign every non-symptomatic Since every non-symptomatic individual has an infection and recovery time pair, with a finite set of possible values, the dimension of the model remains fixed and reversible jump MCMC (RJMCMC) is not required. We note, however, that the algorithm described below for updating the asymptomatic infection and recovery time pairs is equivalent to a classical RJMCMC algorithm in which only individuals that are asymptomatically infected before or during the study have asymptomatic infection and recovery times and the number of asymptomatic infections varies as asymptomatic infection times are added or removed in the steps of the algorithm (30)(31)(32)(33). As in the RJMCMC algorithm, the likelihood can be higher in the initial iterations of the chain than the value it ultimately converges to in our approach since the number of asymptomatic infections in the likelihood changes as the number of pairs corresponding to asymptomatic infection before or during the study changes, despite the fixed dimension of the parameter space.
In order to sample effectively from the posterior distribution of the asymptomatic infection and recovery times, we need to use a proposal distribution that allows us to efficiently navigate the very large space of possible asymptomatic infection and recovery time pairs. We therefore use the running estimate of the total infection pressure on each individual, averaged over the history of the MCMC chain, to propose new asymptomatic infection times for non-symptomatic individuals. The proposal distribution for individual j at the kth iteration of the MCMC chain thus consists of the probabilities of them being asymptomatically infected at each time point according to the mean infection pressure on them at time t from the previous samples in the chain, λ (k) is a normalising constant to account for the fact that we know that j was not presymptomatically infected during the study. Although the infection pressure with a newly proposed asymptomatic infection and recovery time pair (A j , R j ) will be different from that with the current pair (Aj, Rj), and the probability of the reverse move (A j , R j ) → (Aj, Rj) will therefore not be equal to the probability of the forward move (Aj, Rj) → (A j , R j ), the difference in the infection pressure between consecutive iterations will be O( 1 k ). Hence, as the number of iterations becomes large, the adaptation will diminish (34) and λ k j (t) will tend towards a constant, such that the proposal probabilities for the forward and backward moves may be treated as equal.

B.2. Prior distributions.
We use relatively weak gamma distributions (Gamma(k, θ) with shape parameter k and scale parameter θ, i.e. probability density function f (x) = 1 Γ(k)θ k x k−1 e −x/θ ), for the prior distributions for the transmission parameters β, α, and δ (which are non-negative), since there is little information available with which to construct informative priors (Table S3).
The mean of the prior distribution for α is chosen as 100m based on previous findings (1). A beta distribution, Beta(a, b), is chosen as a conjugate prior for the incubation period parameter p, since it is a probability (p ∈ (0, 1]). The parameters of the beta distribution are chosen to match the mean of the prior distribution for the incubation period (NB(r, p), p ∼ Beta(a, b)) with the estimated mean incubation period (∼5 months) taken from previous analysis of diagnostic data from a subset of the study population (26). With this choice of prior, the conditional posterior distribution of p is a beta distribution: [S19] where β = (β, α, , δ), so p can be updated efficiently in the MCMC by drawing from this full conditional distribution rather than using a random walk Metropolis-Hastings update.  (1) and preliminary runs of the MCMC are used for the starting values for the model parameters to reduce the burn-in time, θ0 = (β0, α0, 0, δ0, p0) = (3, 100, 0.001, 0.001, 3/7). The initialisation of the missing data, X, is more involved as it requires picking starting values for the asymptomatic infection and recovery times of all non-symptomatic individuals in the population (which determine their initial statuses and whether they were infected or not by the end of the study), along with the infection times of all VL cases, and proceeds as follows: • For each VL case j missing a treatment time, draw a treatment time using Eq. (S5), i.e. [S20] • For each VL case j missing a treatment time who may have had active VL at the start of the study (assumed to be only those who had onset after January 2001 since those with onset before January 2001 were likely to have been treated by January 2002 (Fig. S2A)): • For each VL case j, draw an infection time: • For each non-symptomatic individual j: draw their initial status (susceptible, currently asymptomatically infected, or previously asymptomatically infected and recovered) according to Eqs (S13)-(S15) for Rj,0 = T + 1 -for the subset that are initially susceptible: * calculate the infection pressure on them (Eq. (S9)) from VL cases and PKDL cases with prior VL * use this to calculate the probability of asymptomatic infection at each time point as in Eq. (S18), i.e. with λj(t) replaced by the infection pressure from VL and PKDL cases * draw 1−p I p I nI − nAP individuals, where nAP is the number of asymptomatic individuals who develop PKDL, to be asymptomatically infected according to their cumulative probability of asymptomatic infection before the end of the study, The MCMC algorithm was run from a range of initial parameter values and asymptomatic infection and recovery time configurations (with different numbers of individuals asymptomatically infected before the study and during the study etc.) to test convergence, and in all cases converged to the same posterior distributions.

B.4. MCMC algorithm.
With the initial parameter values, missing data and priors chosen as described, the MCMC algorithm proceeds by repeating the steps below. Note that throughout the following we suppress notation of conditional dependencies in the likelihood terms where they are obvious to maintain legibility. The algorithm also accounts for the fact that some individuals were born or migrated or died during the study when updating the unknown pre-symptomatic infection times and asymptomatic infection and recovery times (using the birth/migration/death times as bounds on the proposed unobserved times), but we omit these details from the following description for simplicity. We update 20% of infection times of VL cases and 20% of infection and recovery times of asymptomatic individuals in each iteration of the algorithm to strike a balance between efficient mixing of the MCMC chain and computational speed. The algorithm is run for N = 10 5 iterations, including a burn-in of 20, 000 iterations which is discarded.
Terms in the full likelihood (Eq. (S16)) for the asymptomatic infection duration, dormant infection duration, VL symptom duration of cases with missing onset and/or treatment time, and relapse duration do not appear in the acceptance probabilities for the updates for asymptomatic infection and recovery times, dormant infection times, missing onset and treatment times, and missing relapse and relapse treatment times (steps 4-10 below), as they cancel with the proposal probabilities for the newly proposed times, and are unaffected by the other steps in the algorithm.

Update transmission parameters:
Update the transmission parameters β = (β, α, , δ) conditional on the current incubation period distribution parameter p and missing data values X and observed data Y, using an adaptive random walk Metropolis-Hastings step: (a) Propose new values β as described in §2B.5.

Move pre-symptomatic infection times:
Update one-at-a-time the pre-symptomatic infection times of a random 20% of VL cases for whom both the onset and treatment times were observed, and the infection times of all cases missing their onset and/or treatment time: (a) For each case j, propose a new infection time conditional on the current values of the other infection times and the model parameters, E j |((X \ {Ej}), Y, θ), using an independence sampler, i.e. propose a new incubation period IP j ∼ NB(r, p) and subtract this from the onset time: (b) Accept the infection time move with probability where (a) Choose a non-symptomatic individual j with probability proportional to their cumulative probability of being asymptomatically infected before the end of the study if Rj > 0, R j = 0, and accept (A j , R j ) with probability in Eq. (S21) but with ii. If Aj = T + 1: A. If A j = T + 1, then R j = T + 1, so accept immediately as the likelihood does not change. B. If A j = 0, follow Step 4(c)iA, except with .
iii. If Aj ∈ [1, T ]: A. If A j = 0, follow Step 4(c)iA, but with Q replaced by

Update asymptomatic infection times and dormant infection times of PKDL cases without prior VL:
Update the asymptomatic infection and dormant infection times for each PKDL case j without prior VL, conditional on the asymptomatic infection time being after the start of the study or j's birth (whichever is later): (a) Propose a new time, D j , for recovery to dormant infection from asymptomatic infection using an independence sampler: D j = Pi − DIPj, DIPj ∼ NB(r3, p3).

Update whole onset-to-treatment period of potentially initially active VL cases:
Update the onset and treatment times of all cases who potentially had active VL at the start of the study (t = 0) who are missing both onset and treatment times, one case at a time. For each case j: (a) Propose new onset and treatment times, I j and R j : p1).
(b) If I j is not in j's onset year or R j > min(Pj − 1, Mj, T ), reject immediately. Otherwise, accept the new times with probability where I−j = I \ {Ij}.

Update missing treatment times of potentially initially active VL cases:
Update the treatment times of cases who potentially had active VL at the start of the study whose treatment times were not recorded but whose onset times are known, one by one. For each case j: (a) Propose a new treatment time, R j :

Update whole relapse period of cases missing both relapse and relapse treatment times:
Update the relapse and relapse treatment times of all VL cases who suffered relapse during the study who are missing both times, one case at a time. For each case j:  B.5. Accelerated adaptive random walk Metropolis algorithm. We block update the transmission parameters in step 1 of the algorithm above using a combination of an 'Accelerated Shaping' algorithm and an 'Accelerated Scaling' algorithm. The Accelerated Shaping algorithm is a generalised version of the adaptive random walk Metropolis algorithm of Haario et al. (35), which uses the running estimate of the covariance matrix of the posterior distribution of the parameters from the MCMC chain in the multivariate normal proposal distribution to automatically tune the proposal covariance matrix to make efficient proposals. The Haario algorithm uses a fixed initial guess for the covariance matrix for a certain number of iterations at the start of the chain and then the estimate of the covariance matrix from the whole history of the chain after that, including the initial values and the burn-in, which can lead to slow convergence of the chain due to the sensitivity of the covariance estimate to outliers. The Accelerated Shaping algorithm instead begins adapting straight away and removes early iterations of the chain from the running estimate of the covariance matrix at a rate slower than it includes new iterations. This helps to ensure that the chain converges to the posterior mode quickly even if the initial guesses for the parameter values and covariance matrix are relatively poor.
If β k is the vector of the values of the transmission parameters at the kth iteration, then the proposal for β at the (k + 1)th iteration is given by where Σ k+1 and n β are the running estimate of the covariance matrix and dimension of the posterior density for β, and c k is a scaling constant that is tuned to achieve efficient mixing via the Accelerated Scaling algorithm (see below). To discard early iterations at a slower rate than new ones are included, a non-decreasing sequence of integers is used, f (k) = k 2 , such that f (0) = 0 and f (k + 1) = f (k) or f (k + 1) = f (k) + 1 for all k, and Σ k+1 for k > 0 is defined as is the empirical covariance of the last k − f (k) + 1 samples of β from the chain, is the mean of the last k − f (k) + 1 samples, Σ0 is the initial guess for the covariance matrix, and k0 determines the rate at which the influence of Σ0 on Σ k+1 decreases (the weight of Σ0 halves after the first 2k0 iterations). We use k0 = 1000 here.
If f (k) = f (k − 1) (i.e. if k is odd with f (k) chosen as above), an additional observation is added to the estimate of the covariance matrix: if k is even), the new observation replaces the oldest: It has been shown that N (β k , 2.38 2 Σ/n β ), where Σ is the covariance matrix of the posterior distribution, is the optimal proposal distribution for rapid convergence and efficient mixing of the MCMC chain for symmetric product-form posterior distributions as n β → ∞, and leads to an acceptance rate of 23.4% (36,37). This corresponds to a scaling of c k = 1 in Eq. (S23). However, we are in a context with a large amount of missing data, which is strongly correlated with some of the transmission parameters (see §8B), so the posterior distribution is not symmetric, and this scaling is not optimal. We therefore scale c k adaptively as the algorithm progresses to target an acceptance rate of approximately 23.4% for updates to β (Accelerated Scaling). We do this by rescaling c k by a factor of x k > 1 every time an acceptance occurs and by a factor of x ν/(ν−1) k < 1 every time a rejection occurs such that the acceptance rate ν approaches 23.4% in the long run, In order to satisfy the 'Diminishing Adaptation' condition (38), which is necessary to ensure the Markov chain is ergodic and converges to the correct posterior distribution, it is required that c k tends to a constant as k → ∞. So that the adaptation diminishes as k increases, we use the sequence where m0 is the number of iterations over which the scaling factor x k decreases from 2 to 1.5. Here we use m0 = 100.

Model Comparison
We compare the goodness of fit of models with and without additional within-household transmission using DIC (39). DIC measures the trade-off between model fit and complexity, and lower values indicate better fit. Since some variables were not observed, we use a version of DIC appropriate for missing data from (40), which is based on the complete data likelihood L(θ; Z) = P (Y, X|θ). This is equivalent to the standard version of DIC for fully observed data except that it is averaged over the missing data: where D(θ) is the deviance (the measure of model fit), given (up to an additive constant dependent only on the data) by pD is the effective number of parameters in the model (the measure of model complexity), defined by andθ is an appropriate summary statistic for θ from its posterior distribution (here we use the posterior mode). The first term in Eq. (S24) can be straightforwardly estimated as the mean of the values of the log-likelihood over the MCMC samples: The second term in Eq. (S24) is calculated by storing the values of the missing data while running the MCMC, re-running the full log-likelihood computation for each iteration using the stored values and the posterior mode, and averaging these values Since there can be issues with using DIC for model comparison (41), we also compared the posterior distributions of the deviances of the models to assess differences in quality of fit and validated the DIC metric on simulated data (see §10).

Calculating the Contribution of Different Infection States to Transmission
We calculate the contribution of each infection state to the total infection pressure on all susceptible individuals λ(t) ( Fig. 2A in the main text), which determines the rate of new infections at time t, by summing the infection pressures on susceptible individuals from each state: where X ∈ {A, E, I, P} denotes the infection state. The relative contribution to the infection pressure on susceptible individuals (Fig. S21) is then given by λX (t) The relative contribution of state X to the infection pressure on the ith VL case at their infection time, i.e. the probability that i's infection source is X (Fig. 2B in the main text), is: Y∈{A,E,I,P} , X ∈ {A, E, I, P}. [S27] The probability that the ith VL case is infected from the background transmission is λi(Ei − 1) .

Reconstructing the Epidemic
A. Reconstructing the transmission tree. We reconstruct the transmission tree following the 'sequential approach' described in (42). We draw N samples (θ k , X k ) (k = 1, . . . , N ) from the joint posterior distribution from the MCMC, calculate the probability that infectee i was infected by individual j conditional on their infection time Ei for all infectees and possible infectors for each sample, and draw an infector for each infectee from the posterior distribution of possible infectors {p ij|E i (θ k , X k )} from each sample. This yields a sample of N transmission trees drawn from their predictive distribution, and thus accounts for uncertainty in the infection source (given fixed values of the parameters and missing data), and uncertainty in the parameter values and missing data (over the posterior distribution). We use N = 1000 here. We reconstruct the transmission chains for each infected individual by searching backwards through each sampled transmission tree for their infector, and the infector of their infector, and so on, until the "index" case (in practice the earliest observed case during the study period) in the chain is reached. This gives the transmission chain for each infectee in each transmission tree. We then find the set of all "index" cases (the roots of all the transmission chains) and calculate the mean time from their onset to the infections of their infectees, and mean distance to their infectees, in each infection generation, and average these quantities over all sampled transmission trees. Figures 4A and 4B in the main text) are calculated from the sample of N transmission trees by averaging the distances and times from each infector to their VL infectees within each tree, and then averaging these quantities over all the trees in which that VL/PKDL case is an infector:

B. Calculating transmission distances and times. The mean infector-to-VL-infectee distance and mean infector-onset-to-VLinfectee-infection time for each VL and PKDL infector (
where τ (k) ij is the time between j's onset and i's infection in the kth tree, Fj is the set of all trees in which j is an infector, G j,k is the set of all infectees of j in the kth tree, and | · | denotes set size.

C. Calculating reproduction numbers.
The individual-level reproduction number for infectious individual j is obtained by summing the probabilities in Eq. (S28) over all individuals i that j could have infected (43,44): To account for uncertainty in infected individuals' unobserved infection times these reproduction numbers are averaged over the joint posterior distribution of the transmission parameters and missing data: where Ω is the space of all possible values of the missing data X, R (k) j is individual j's reproduction number from the kth iteration of the MCMC chain (with the burn-in removed), and N is a suitably large number of samples from the posterior distribution (we use N = 1000). Since we are interested in comparing the numbers of new infections generated by VL and PKDL cases, in the main text we separate the expected numbers of secondary infections coming from the VL episodes and PKDL episodes of individuals who had both VL and PKDL (Fig. 5A in the main text) (i.e. we effectively treat them as coming from separate individuals), and hence no longer refer to the numbers of secondary infections per VL case and per PKDL case as reproduction numbers.
The time-dependent overall effective reproduction number (Fig. 5C in the main text) can also be calculated from the transmission probabilities as the mean of the individual reproduction numbers for all VL cases and asymptomatic individuals with onset and infection, respectively, at time t (43):
The absolute contribution of each infectious state to the effective reproduction number at time t is: where X ∈ {A, I} denotes the infectious state, and, as described above, in the main text we split the numbers of secondary infections (Rj) arising from VL and PKDL for cases that had both.

Model Simulations
To assess the fit of the model and simulate hypothetical interventions against PKDL, we create a stochastic simulation version of the individual-level spatiotemporal transmission model described above. We follow standard stochastic simulation methodology for discrete-time individual-level transmission models (45), converting infection event rates into probabilities in order to simulate who gets infected in each month. We assume that an individual's progression through different infection states following infection occurs independently of the rest of the epidemic (i.e. is either governed by internal biological processes or random external processes of detection), which enables the simulation of an individual's full infection history from the point of infection. So that we can simulate durations of PKDL infectiousness, we fit a negative binomial distribution NB(r5, p5) to the observed PKDL onset-to-treatment times and onset-to-resolution times for self-resolving PKDL cases in the data: by maximum likelihood estimation, which yields r5 = 1.18, p5 = 0.066 (corresponding to a mean duration of PKDL infectiousness of 18 months). We simulate epidemics for the study population starting from January 2003 (at which point all but one of the paras had had at least 1 VL case since January 2002) using the individual-level location and demographic information (months of birth, death and migration) from the observed data, and posterior samples of the parameter values and individuals' infection statuses in December 2002 obtained from the MCMC algorithm.
Given these pieces of information, the simulation algorithm proceeds as follows: 1. Starting at t = 12, draw the times of the next events for all individuals that have already been infected, e.g. VL onset times for individuals already pre-symptomatically infected, using the size-biased negative binomial distributions corresponding to Eq. (S4), Eq. (S5), Eq. (S7), and Eq. (S35), and Eq. (S6) for the asymptomatic infection durations. The size-biased negative binomial distribution accounts for the fact that individuals with left-censored infection times who are observed to be actively infected at a particular point in time are likely to have had long durations of infection if the infection duration is negative binomially distributed (46). The PMF of a size-biased negative binomial random variable X * corresponding to X ∼ NB is: 2. Simulate the times of the subsequent events for these individuals by drawing from Eq. (S4)-Eq. (S7) and Eq. (S35). Assign each PKDL case an infectiousness by drawing from Cat({h1, h2, h3, hu}, p), where Cat(·) is the categorical distribution and p = (101/190, 31/190, 6/190, 52/190) are the probabilities of the different lesion types according to their observed frequencies in the data (Table S1).
3. For each susceptible individual i from the set of individuals who have been born/immigrated and have not yet died/emigrated, determine whether they become infected: (a) Infection occurs with probability 1 − e −λ i (t) , where λi(t) is given by Eq. (S9).
(b) Else they remain susceptible.

For each infection, decide if it is a pre-symptomatic infection or an asymptomatic infection:
(a) Pre-symptomatic infection occurs with probability pI . In which case, draw an onset time: and decide whether the individual subsequently develops PKDL or not: i. PKDL occurs with probability pP . In which case draw times for progression to dormant infection, PKDL onset and recovery: p1), p3), and assign a PKDL infectiousness to the individual by drawing from Cat({h1, h2, h3, hu}, p).
ii. Else the individual recovers upon treatment, so draw a treatment time: (b) Else it is an asymptomatic infection. In which case, decide whether the individual progresses to PKDL or not: i. Progression to PKDL occurs with probability pD. In which case, draw times for progression to dormant infection, PKDL onset and recovery: and assign a PKDL infectiousness by drawing from Cat({h1, h2, h3, hu}, p). ii. Else the individual recovers without developing PKDL, so draw a recovery time: 6. Repeat steps 3-5 until t = T (as the last iteration).
We run 10,000 simulations of the model for each PKDL intervention scenario (normal interventions, complete PKDL prevention (h1 = h2 = h3 = hu = 0) and halving the mean duration of infectiousness (r5 = 1.18, p5 = 0.13)), consisting of 100 simulations (to account for stochastic uncertainty) of each of 100 posterior samples for the model parameters and individuals' infection statuses in December 2002 (to account for uncertainty in their values given the observed data).

Code
Code for running the MCMC algorithm and processing the MCMC output was written in MATLAB R2017b (47) and the simulation code was written in Julia 1.0.5 (48,49). All code is freely available from https://github.com/LloydChapman/ VLSpatiotemporalModelling.

MCMC Output
A. Model selection. The posterior deviance distributions and DIC values for the different models tested are presented in Fig.  S5 and Table S4 respectively. Based on the DIC values, models with additional within-household transmission fit better than those without. We focus on the output of the model with the highest level of relative asymptomatic and pre-symptomatic infectiousness (both 2% as infectious as VL) below and in the main text, as there is some evidence to suggest that asymptomatic individuals occasionally generate VL cases (50,51) and this is the most conservative assumption in terms of estimating the contribution of PKDL to transmission. B. Parameter estimates. The modes and 95% highest posterior density intervals (HPDIs) of the transmission parameters (β, α, and δ) and incubation period distribution parameter p for the different models are shown in Table S4. The parameter estimates are very similar across the different models and vary in the way expected -the spatial transmission rate constant β and background transmission rate are lower for models with additional within-household transmission (δ > 0) and decrease with increasing relative asymptomatic infectiousness h4, and the mode for α is slightly larger for models with δ > 0 (since a flatter kernel shape compensates for the extra within-household transmission). The posterior distributions for the incubation period distribution parameter p correspond to a mean incubation period of 5.4-6.7 months (95% HPDIs (4.6,6.2)-(5.7,7.8) months).
The log-likelihood trace and posterior distributions for the parameters for the model with additional within-household transmission and 2% relative asymptomatic and pre-symptomatic infectiousness are shown in Fig. S6. The parameters are clearly well defined by the data, as the posterior distributions differ significantly from the weak prior distributions. The corresponding autocorrelation plots are shown in Fig. S7. The high degree of autocorrelation evident for all the parameters is due to strong correlation between the transmission parameters and the missing data, in particular between the spatial transmission rate constant β and the asymptomatic infection times. Fig. S8 shows that β is strongly negatively correlated with the mean asymptomatic infection timeĀ. This is expected since a higher overall transmission rate leads to individuals being infected earlier. With a continuous-time model we could tackle this correlation using a 'non-centred' MCMC algorithm, i.e. by re-parameterising the model to reduce the a priori dependence between β and the asymptomatic infection times (29,32), but with a discrete-time model the re-parameterisation is much more difficult, since the asymptomatic infection times do not vary continuously with β. We therefore ran the MCMC algorithm for a large number of iterations (N = 10 5 ) to obtain a sufficient number of independent samples.  show that there is some negative correlation between β and , α and , and δ and p. These correlations are not surprising: the more transmission that is explained by proximity to infectious individuals (the higher β), the less needs to be explained by the background transmission (the lower ); the flatter the spatial kernel (the larger α), the fewer infections need to be explained by the background transmission; and the more infections are accounted for by transmission within the same household (the higher δ), the longer the incubation period (the lower p) needs to be (due to long times between onsets of cases in the same household). The acceptance rate for the transmission parameter updates (Step 1 in the MCMC algorithm) was 23.2-23.4% for all models, as expected from tuning the proposal variance (see §2B.5). The acceptance rates for the updates for the pre-symptomatic infection times; missing VL onset and treatment times; and unobserved asymptomatic infection and recovery times were between 56% and 97% for every model, indicating that the algorithm efficiently explores the space of unobserved data.

C. Unobserved pre-symptomatic infection times, and asymptomatic infection and recovery times.
Here we present plots of various quantities derived from the posterior distributions of the pre-symptomatic infection times and asymptomatic infection and recovery times to demonstrate what can be inferred about the missing data using the data augmentation algorithm. Fig.  S10 shows the incidence curve of VL and PKDL cases for the whole study area and the inferred incidence curve of asymptomatic  infections (averaged over the MCMC chain). The number of asymptomatic infections increases and decreases with the number of VL cases as expected given the assumption of a fixed incidence ratio of asymptomatic to symptomatic infection.
The posterior probabilities that individuals were asymptomatically infected during the study (shown in Fig. S11, with individuals arranged in order of para and household number) show clustering of asymptomatic infection in space and that this is associated with proximity to VL cases (the probabilities of asymptomatic infection are higher where the VL case density is higher). This is as expected given the estimates of the transmission parameters -the rapid decrease in the risk of infection with distance from an infectious individual (Fig. S19) Fig. S13 shows the posterior distributions of the unobserved infection times of a selection of VL cases, and indicates that the data does contain sufficient information to constrain the probable infection times of some cases, as the posterior distributions differ significantly from the prior distributions (i.e. from the prior for the incubation period distribution) for some cases.

D. Reconstructed transmission trees.
Snapshots of one inferred transmission tree in the part of the south-east cluster of paras shown in Fig. 3 in the main text at different time points are shown in Fig. S14. These show that while asymptomatic infection (and therefore subsequent immunity) was initially mostly clustered around VL cases, by December 2005 it was widespread and by December 2009 had saturated in this part of the study population. The relatively high uncertainty in infection sources can be seen in the different pattern of infectors and longer inferred transmission distances in this single tree (from one sample from the joint posterior distribution of model parameters and missing data) than the consensus tree in Fig. 3 in the main text. Fig. S15 shows, for all inferred transmission chains, the relationship between the mean time after the onset of the index case in the chain at which infectees in each infection generation were infected and the mean distance from the index case to infectees in each infection generation (averaged over 1000 sampled transmission trees). It suggests that infection spread over

MCMC Validation
The MCMC algorithm was run on simulated data to verify that it can accurately infer the parameter values and missing data. Since a high proportion of infections are asymptomatic, there is a large amount of missing data in the form of the missing asymptomatic infection and recovery times. To test the ability of the algorithm to infer this missing data and simultaneously recover the true parameter values, we simulated transmission in paras 1-4 (comprising 6231 individuals) from January 2002 to December 2010 with three different proportions of symptomatic infection, pI ∈ {0.1, 0.15, 0.2}, and ran the inference procedure on each of the simulated datasets with the asymptomatic infection data removed, assuming the symptomatic proportion was known. The values of pI were chosen to cover a plausible range based on the highest cumulative VL incidence in any of the paras over the study period (∼10%, which equates to a maximum possible proportion of asymptomatic infection of ∼90% if the whole population was infected) and estimates from previous studies (in which the highest estimated symptomatic proportion was 20%) (52). To ensure relatively consistent VL incidence across the simulated datasets and to account for potential correlation between pI and other parameter values, we ran the MCMC algorithm on the observed data for paras 1-4 for each value of pI and re-simulated the data with the simulation model using parameter values close to the posterior modes obtained. There were 242, 303 and 331 VL cases in the simulations with pI = 0.1, 0.15 and 0.2 respectively. Fig. S16 shows the posterior distributions for the parameters obtained from running the MCMC algorithm on the simulated datasets. All of the posterior distributions overlap the true parameter values, indicating that the algorithm is able to recover the true parameter values, even for proportions of asymptomatic infection as high as 90%. The algorithm can also accurately estimate the number of new asymptomatic infections each month across the range of symptomatic proportions tested, as shown by the inferred epidemic curves of asymptomatic infections in Fig. S17.

Validation of DIC for Model Comparison
As it is known that there can be issues with using DIC to compare models, particularly in the context of missing data (40,41,53), we verified that the version of DIC for latent variable models that we use (Eq. (S24)) can discriminate between models with/without additional within-household transmission using simulated data. We simulated transmission in paras 1 to 4 from January 2002 to December 2010 without and with additional within-household transmission δ = 0 month −1 and δ = 0.15 month −1 (and h0, h4 = 0.02), and then ran the MCMC algorithm and DIC calculation on the simulated datasets with  (Table  S5). These results agree with the findings of Gamado et al. (53) that the DIC defined in Eq. (S24) can be used to distinguish different forms of spatial kernel.