# Point process temporal structure characterizes electrodermal activity

^{a}Harvard–Massachusetts Institute of Technology Health Sciences and Technology, Massachusetts Institute of Technology, Cambridge, MA 02139;^{b}Department of Anesthesia, Critical Care, and Pain Medicine, Massachusetts General Hospital, Boston, MA 02114;^{c}Institute of Medical Engineering and Sciences, Massachusetts Institute of Technology, Cambridge, MA 02139;^{d}Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA 02139;^{e}Department of Electronics, Information, and Bioengineering, Politecnico di Milano, Milan, Italy 20133;^{f}Picower Institute of Learning and Memory, Massachusetts Institute of Technology, Cambridge, MA 02139

See allHide authors and affiliations

Edited by Peter L. Strick, University of Pittsburgh, Pittsburgh, PA, and approved August 14, 2020 (received for review March 12, 2020)

## Significance

Electrodermal activity (EDA) is a readout of the body’s sympathetic nervous system measured as sweat-induced changes in the electrical conductance properties of the skin. Interest is growing in using EDA to track physiological conditions such as stress levels, sleep quality, and emotional states. The integrate-and-fire physiology underlying EDA production suggests that its interpulse intervals should obey an inverse Gaussian probability model. In an analysis of EDA data recorded in 11 healthy volunteers during quiet wakefulness, we established that the inverse Gaussian model accurately characterized the interpulse intervals. Our findings show a physiologically based statistical model provides a parsimonious and accurate description of EDA.

## Abstract

Electrodermal activity (EDA) is a direct readout of the body’s sympathetic nervous system measured as sweat-induced changes in the skin’s electrical conductance. There is growing interest in using EDA to track physiological conditions such as stress levels, sleep quality, and emotional states. Standardized EDA data analysis methods are readily available. However, none considers an established physiological feature of EDA. The sympathetically mediated pulsatile changes in skin sweat measured as EDA resemble an integrate-and-fire process. An integrate-and-fire process modeled as a Gaussian random walk with drift diffusion yields an inverse Gaussian model as the interpulse interval distribution. Therefore, we chose an inverse Gaussian model as our principal probability model to characterize EDA interpulse interval distributions. To analyze deviations from the inverse Gaussian model, we considered a broader model set: the generalized inverse Gaussian distribution, which includes the inverse Gaussian and other diffusion and nondiffusion models; the lognormal distribution which has heavier tails (lower settling rates) than the inverse Gaussian; and the gamma and exponential probability distributions which have lighter tails (higher settling rates) than the inverse Gaussian. To assess the validity of these probability models we recorded and analyzed EDA measurements in 11 healthy volunteers during 1 h of quiet wakefulness. Each of the 11 time series was accurately described by an inverse Gaussian model measured by Kolmogorov–Smirnov measures. Our broader model set offered a useful framework to enhance further statistical descriptions of EDA. Our findings establish that a physiologically based inverse Gaussian probability model provides a parsimonious and accurate description of EDA.

Electrodermal activity (EDA), which measures the electrical properties of the skin as changes in conductance, is mediated almost exclusively by the sympathetic branch of the autonomic nervous system (1). The skin continuously receives sympathetic innervation. Consequently, EDA is continuously present due to sweat glands filling and releasing sweat onto the skin. Changes in the level of filling occur in response to internal stimuli (physiological and psychological) and external stimuli such as threats or dramatic changes in ambient temperature. EDA is a component of the primal flight-or-fight response that is routinely used as a measure of the sympathetic nervous system activity in psychological studies, polygraph tests and studies of stress (1). Measures of EDA are now being developed as a neuromarketing tool to evaluate consumer responses to different products or promotions (2). For this reason, there is growing interest in the development of analysis methods to characterize EDA accurately.

EDA has a characteristic pattern consisting of two distinct components. There is a baseline or tonic component that drifts gradually with time. On top of the tonic component is a phasic component composed of pulse events that vary in amplitude, shape, and spacing. Sweat release is a pulsatile process because a sufficient volume of sweat has to accumulate, fill the glands, and be released onto the skin to observe an EDA change. This integrate-and-fire nature of the phasic component is believed to represent fast changes in sympathetic nervous system activity. EDA activity can show large variation in both baseline activity and pulse activity within and between individuals.

Most current EDA analysis methods focus mainly on the phasic component to characterize sympathetic activity. These methods fall in two categories: rate-based methods and deconvolution methods. The rate-based methods specify a time window and estimate a moving average of the number of pulse events per time window (3⇓–5). For the deconvolution methods, a single pulse shape is assumed, and the EDA signal is represented as the convolution of this pulse shape with neural inputs. For this technique, the neural input is deconvolved from the EDA while simultaneously fitting pulse shape parameters (6⇓⇓⇓–10). Hence, deconvolution methods report the occurrence times and amplitudes of the pulse events. Although widely used, neither the rate-based nor the deconvolution methods are based on EDA’s well-established physiology. Likewise formal statistical modeling is not used to characterize the interpulse interval dynamics (pulse rate and pulse times) or the pulse amplitudes.

The important advance that we report here is the application of elementary point process models based on physiology to characterize the dynamics of EDA interpulse intervals. These parsimonious probability models provide a physiologically based framework for statistical analysis of EDA. The balance of this paper is organized as follows. In *Physiology and Theory*, we formulate generalized inverse Gaussian integrate-and-fire probability models derived from EDA’s integrate-and-fire properties and, as alternatives, lognormal, exponential, gamma, and generalized inverse Gaussian non-integrate-and-fire probability models. In *Application* and *Results*, we use these models in the analysis of EDA recorded from 11 healthy subjects during quiet wakefulness. The *Discussion* describes the implications of our findings for future basic science and translational studies.

## Physiology and Theory

### The Anatomy and Physiology of Electrodermal Activity.

To develop our statistical models of EDA activity we review the anatomy and physiology of sweat production in the skin (1). Each sweat gland consists of three parts: the dermal gland, the duct that connects the gland to the skin surface, and the pore where the duct opens to the skin (Fig. 1*A*; 1). The dermal portion of the gland is innervated by sudomotor nerves, a part of the peripheral nervous system that is predominantly under sympathetic control (1). Sweat is produced in the gland by sympathetically induced abrupt increases in spiking activity (action potentials) in the sudomotor nerves. These electrical events are called sudomotor bursts. Sweat produced in response to these bursts accumulates in the duct. Once the duct is full, it pushes open the pore and spills onto the skin. The sweat either evaporates or is reabsorbed through the walls of the duct. With the duct now empty, the accumulation process begins again (1, 11, 12).

Sweat on the skin’s surface increases its electrical conductance (inverse of resistance) because its salt content facilitates the propagation of electrical currents. Electrical conductance across the skin can be measured in a standard fashion by placing two electrodes on either the palm or fingers, applying a constant voltage, and measuring the current (1). The pulsatile effects of the sudomotor bursts measured at the skin are termed galvanic skin responses (GSRs). The second-to-second changes in skin sweating measured as second-to-second changes in skin conductance are termed EDA (1).

EDA measurements have two distinct components: tonic activity and phasic activity. Tonic activity represents generally ongoing EDA that reflects the background state of the EDA. The phasic activity reflects primarily the GSR or pulsatile events.

### Statistical Models of EDA Pulses.

We investigate four statistical models to characterize the times between GSR events recorded during EDA measurements: generalized inverse Gaussian, lognormal, gamma, and exponential. The choice of the generalized inverse Gaussian model follows directly from the physiology described previously. The phasic component of the EDA measurements is comprised of pulsatile events. The times between these pulse events are governed by sympathetic stimulation of the glands, the sweat production and its accumulation in the glands, sweat release on the skin, and sweat reabsorption and evaporation. We represent this four-step sequence of sweat accumulation in the gland and its release onto the skin as an integrate-and-fire process defined by a Gaussian random walk with drift diffusion (Fig. 1*B*). It is well known that the times between firing events for this elementary diffusion process obey an inverse Gaussian probability model (13⇓–15). Therefore, we chose as the first model the generalized form of the inverse Gaussian probability density defined for an interpulse interval time

To broaden the class of alternatives to integrate-and-fire models, we also consider the lognormal distribution. Its probability density function is defined as

We postulate that under stable, or approximately stable, background conditions the inverse Gaussian model will provide an accurate description of the inter-GSR (interpulse) event times as the integrate-and-fire model is a plausible description of skin sweat production. This assumes that the group activity of the sweat glands shows a coherent behavior. If there is variation in the behavior of the individual glands, even when the background state is stable, then it is possible that the elementary inverse Gaussian model may not apply.

There may be a mixture of inverse Gaussian models which could produce either longer or shorter interevent intervals relative to a single inverse Gaussian model. These differences could be manifested by either heavier or lighter tails relative to an inverse Gaussian model. To model heavy tail distributions, we consider the lognormal and the gamma distributions, whereas to model the light tail distributions, we consider the gamma and the exponential distributions. The gamma is flexible in that it can have either a heavier or a lighter tail than the inverse Gaussian depending on the values of the parameters (17). Although the exponential distribution is a special case of the gamma, it represents the null model of an underlying Poisson process governing GSR-event production.

We quantify the tail behavior of a probability density

## Application

### Experimental Data.

Our experimental protocol was approved by the Massachusetts Institute of Technology Institutional Review Board, and all subjects provided written informed consent. We collected EDA data from 12 healthy volunteers (six men) between the ages of 22 and 34 while awake and at rest (19). Electrodes were connected to the second most distal phalange of the second and fourth digits of each subject’s nondominant hand. Approximately 1 h of EDA data were collected at 256 Hz. Subjects were seated upright and instructed to remain awake. They were allowed to read, meditate, or watch something on a laptop or tablet but not to use the instrumented hand. We assumed skin and ambient temperature were constant for the duration of the experiment. One subject’s data were not included in the analysis because we learned, after the data collection, that this individual occasionally experienced a Raynaud’s type phenomenon. This would affect the quality of the EDA data. Data from the remaining 11 subjects were analyzed using MATLAB 2017a.

### Data Preprocessing and EDA Pulse Selection.

Preprocessing consisted of two steps: 1) detecting and removing artifacts and 2) isolating the phasic component. Artifact detection was done based on the derivative of the time series since large rapid changes are physiologically impossible for skin conductance. Artifact removal was done in two parts, first correcting for artifact-related large magnitude changes in the remainder of the signal and then interpolating the few seconds around the artifact itself. Then a low-pass FIR filter was used to estimate and remove the slow-moving tonic component of the signals, thereby, isolating the phasic component. The data preprocessing is described in further detail in Subramanian et al. (20).

Since there are underlying tonic fluctuations, absolute pulse amplitude alone is insufficient to extract pulses reliably. Therefore, we computed locally adjusted amplitudes for all detected peaks using the MATLAB function *findpeaks*. The *findpeaks* algorithm computes a prominence or relative amplitude for each peak, by adjusting the amplitude of each peak as the height above the highest of neighboring valleys on either side. The valleys are chosen based on the lowest point in the signal between the peak and the next intersection with the signal of equal height on either side. With this method, a peak with small absolute amplitude can be rewarded in its prominence value if it is in a region of data with low activity. We then used a threshold on this prominence value to account for the varying baseline, instead of on the absolute amplitude.

We used a prominence threshold of 0.005 to extract peaks across all subjects, unless this resulted in too few or too many pulses for each hour-long recording. This was primarily verified by visual inspection of extracted pulses, as well as rough estimates of 60 and 360 pulses as generous bounds for what should be expected of 1 h of data at rest with little external stimulation. This corresponds to one pulse every 10 to 60 s on average (4, 21⇓⇓⇓⇓–26). In the case that too few pulses were extracted, we gradually reduced the prominence threshold by 0.001 until the number of pulses exceeded 100 and no obvious pulses were missed by visual inspection (verified by three different viewers independently). In the case too many pulses were extracted, we gradually increased the prominence threshold by 0.001 until the number of pulses was fewer than 360 and the extracted pulses were distinguishable from sensor noise by visual inspection (verified by three different viewers independently). If changing the threshold in increments of 0.001 resulted in drastic changes in the number of pulses each time, we reduced the increment to 0.0005. Although not fully automated at this stage, the design of this method to extract pulses attempted to take into account the wide variation in baseline levels of EDA activity seen across subjects. The extracted peaks included smaller peaks that other methods would generally ignore as noise. However, we chose to include them in the analysis.

### Statistical Model Fitting and Comparison.

First, we restricted the parameter space to

A KS plot compares the rescaled quantiles from the fit of the estimated probability model with the quantiles of an exponential distribution with rate 1 using the time-rescaling theorem (29). This theorem states that any point process can be rescaled to an exponential distribution with rate 1 using its conditional intensity (hazard) function. The KS distance computes the maximum distance between the quantiles of the rescaled data and the uniform distribution, which is a simple transform of an exponential distribution with rate 1. A smaller KS distance indicates that the model is more similar to the structure observed in the data. We computed 95% confidence intervals (5% significance cutoffs) for the KS plot and compared the KS distances across models (30). A KS distance that is within (outside) the 95% confidence intervals suggests that the model offers (fails to offer) a reasonably accurate description of the data.

We also compared the models using a tail behavior analysis, in which the settling rates of the models were compared to determine the heaviness of the tails of the distributions. We hypothesized that each EDA interpulse interval distribution results from the activity of multiple sweat glands. This could lead to deviations from the inverse Gaussian with slightly heavier or lighter tails. These deviations can be captured statistically by right-skewed models such as the lognormal, generalized inverse Gaussian and gamma that together allow for more flexibility in tail behavior.

## Results

### Extraction of EDA Pulses.

Fig. 1*C* shows an example of an excerpt of extracted pulses for Subject S07. This includes pulses large enough to be used in most analyses as well as those that are much smaller and usually either smoothed out or ignored as noise. We included both types of pulses for all subjects and did not distinguish between them. The majority of subjects showed appreciable fluctuations in the tonic component across time (*SI Appendix*, Fig. S1). Across the 11 subjects, the total number of pulses in the 1-h time window ranged between 97 and 348, including the distantly spaced smaller pulses (*SI Appendix*, Fig. S2). The final prominence thresholds used ranged between 0.0025 and 0.023.

### Findings from Statistical Model Comparison.

For all 11 subjects, the optimal generalized inverse Gaussian diffusion model was an inverse Gaussian, indicated by λ=−0.5 in Table 1. This inverse Gaussian was always within the significance cutoff according to KS distance, indicating that it provides an accurate description of the data and supporting our hypothesis that the pulsatile sweat release events of EDA can be modeled as an integrate-and-fire process. When allowing for deviations from the inverse Gaussian, for 9 of the 11 subjects, one or more of the lognormal, gamma, or generalized inverse Gaussian nondiffusion models was able to improve statistically on the fit of the inverse Gaussian (Tables 2 and 3; Fig. 2; and *SI Appendix*, Figs. S3–S7). The exponential was never a better fit than the inverse Gaussian, and was only within the significance cutoff for 4 of the 11 subjects. This suggests that for the majority of subjects, the exponential model did not offer an accurate description of the data.

Both AIC and KS distance were in agreement for the best fit models for the data of 8 of the 11 subjects. This corresponded to 1 generalized inverse Gaussian diffusion model (S08), 1 generalized inverse Gaussian nondiffusion model (S10), 4 lognormal (S02, S03, S09, and S11), and 2 gamma (S04 and S05) models. For S01 and S06, the AICs and KS distances suggested different best fit distributions between inverse Gaussian and lognormal. For S07, they identified different best fit distributions between lognormal and gamma. It is reasonable to expect that the results from AIC and from KS distance will not match exactly since different metrics are intended to capture different aspects of model fits. However, the fact that they agree across the majority of subjects reinforces that there is specific statistical structure in the data that can be captured with a parsimonious model.

The second phase of comparing the models was analyzing the tail behavior (Table 4) using the settling rates estimated for each of the five models for all subjects. Across all 11 subjects, the lognormal always had the heaviest tail since it is commonly classified as a heavy-tailed distribution. The generalized inverse Gaussian integrate-and-fire model had the next heaviest tail, indicated by the next smallest settling rate. The lognormal was the only other class of models besides the generalized inverse Gaussian diffusion models that was within the significance cutoff for all subjects. This suggests that deviations from the inverse Gaussian tended toward heavier tails that were best captured by the lognormal.

Among the remaining models, the generalized inverse Gaussian non–integrate-and-fire models always had lighter tails than the integrate-and-fire models. The tails of the gamma were even lighter. Omitting the exponential, since it fit poorly in the majority of cases, the remaining models—lognormal, generalized inverse Gaussian (diffusion and nondiffusion), and gamma—together provide a systematic framework to: 1) evaluate the presence of inverse Gaussian structure in EDA using diffusion models; and 2) enhance statistical descriptions using models capable of capturing a range of tail behavior properties.

Among the three subjects for whom the AIC and KS distance disagreed on the best fit model (subjects S01, S06, and S07), there was only one case (S07) in which this disagreement predicted very different tail behavior. In this case, the AIC and KS distance were divided between the lognormal and the gamma as the best model, one of which predicted a very heavy tail while the other predicted a very light tail. However, the tail of the interpulse interval distribution may have been relatively underrepresented for this subject, so the fit was primarily based on the shape near the mode, which both the lognormal and gamma distributions fit well. With a data collection period longer than 1 h, we would most certainly arrive at a more accurate description of the tail behavior.

## Discussion

In this study, we used EDA data collected from 11 healthy volunteers at rest to test our hypothesis that EDA contains highly regular statistical structure that is consistent with integrate-and-fire physiology that describes sweat production and release. To do that, we fit probability models to the EDA time series and quantified the goodness of fit using AIC and KS distance. We also assessed tail behavior by computing the settling rates. Our findings show that for each of the 11 data series, the model fits and tail behavior were consistent with integrate-and-fire sweat gland physiology. There was also room for statistical improvements to capture deviations due to EDA data reflecting the simultaneous activity of many sweat glands.

The physiology of sweat gland activity predicts that the interpulse intervals for sweat gland pulses should obey an inverse Gaussian distribution. This is an elementary model for processes where there is a build-up of a continuous variable to a threshold. Crossing the threshold leads to an observed event. Here, the build-up is the accumulation of sweat in the gland in response to sympathetic stimulation. The observed event is the GSR pulse measured as EDA. Our results show that the EDA data are consistent with an inverse Gaussian model for all subjects (Table 3), and the inverse Gaussian is also the optimal integrate-and-fire probability model

We refined our hypothesis further by considering that measured EDA is the aggregation of data from hundreds of sweat glands. This predicts that the interpulse intervals could likely follow a mixture of inverse Gaussian models. This mixture could deviate from a single inverse Gaussian in tail behavior, which can be captured by other non–integrate-and-fire models. Lighter tails were modeled as gamma or generalized inverse Gaussian non–integrate-and-fire models, whereas heavier tails were modeled as lognormal (Table 3). More of the data are consistent with the heavier tail lognormal distribution, suggesting more frequent longer interpulse intervals across sweat glands than would be predicted by a homogeneous inverse Gaussian model. Furthermore, we did not observe any multimodal structure in the EDA suggesting a degree of coherent behavior among the sweat glands in the recording area. The non–integrate-and-fire models provide a systematic framework to make individualized improvements in the statistical fits. However, the fact remains that there is an inverse Gaussian model that is an accurate statistical description of the data for each subject. This reinforces the idea that the statistical structure in the data are fundamentally guided by the physiology of sweat gland activity which can be approximated well as an elementary integrate-and-fire process which we took to be a Gaussian random walk with drift diffusion.

Our results link directly the physiology of sweat glands and the statistical structure of the EDA data collected at the skin surface. Current detailed signal processing methods for EDA analysis require significant computational complexity (6⇓⇓⇓–10). However, looking to the physiology provided a principled framework by which to drastically reduce model dimensionality—all of our models had only one, two, or three parameters—and increase the accuracy of the data description. This result has implications for understanding and tracking the sympathetic activity in the autonomic nervous system in a more informed way.

Several important extensions are possible in future work. We will study EDA pulse amplitudes along with interpulse intervals, by taking into account that both arise from the same integrate-and-fire process. Having established a natural point process structure in the EDA time series under approximately stable conditions, we can now study their dynamics over longer time periods, by applying history-dependent inverse Gaussian models like those developed for heart rate variability (31⇓⇓⇓–35). These more detailed models could include other relevant covariates such as skin and environmental temperature. We will also study EDA in other contexts, such as under different emotional conditions, during sleep, in response to painful stimuli, and under general anesthesia. Our findings provide a principled, physiologically based approach for extending EDA analyses to these more complex and important applications.

## Data Availability.

Anonymized electrodermal activity time series and code for analysis have been deposited in PhysioNet (https://physionet.org/) (19).

## Acknowledgments

We thank the Massachusetts Institute of Technology Clinical Research Center staff. This work was partially funded by funds from the Picower Institute for Learning and Memory, the NSF Graduate Research Fellowship Program, and NIH Award P01-GM118629 (to E.N.B.).

## Footnotes

- ↵
^{1}To whom correspondence may be addressed. Email: enb{at}neurostat.mit.edu.

Author contributions: S.S., R.B., and E.N.B. designed research; S.S. performed research; S.S., R.B., and E.N.B. contributed new reagents/analytic tools; S.S. analyzed data; and S.S., R.B., and E.N.B. wrote the paper.

Competing interest statement: A patent application was filed with S.S., R.B., and E.N.B. on July 15, 2020 (PCT/US2020/042031).

This article is a PNAS Direct Submission.

This article contains supporting information online at https://www.pnas.org/lookup/suppl/doi:10.1073/pnas.2004403117/-/DCSupplemental.

- Copyright © 2020 the Author(s). Published by PNAS.

This open access article is distributed under Creative Commons Attribution-NonCommercial-NoDerivatives License 4.0 (CC BY-NC-ND).

## References

- 1.↵
- W. Boucsein

- 2.↵
- 3.↵
- 4.↵
- 5.↵
- A. Sano,
- R. W. Picard,
- R. Stickgold

- 6.↵
- 7.↵
- 8.↵
- 9.↵
- A. Greco,
- G. Valenza,
- A. Lanata,
- E. P. Scilingo,
- L. Citi

- 10.↵
- R. Faghih et al

- 11.↵
- 12.↵
- 13.↵
- E. Schrodinger

- 14.↵
- 15.↵
- R. Chhikara,
- J. Folks

- 16.↵
- 17.↵
- L. Halliwell

- 18.↵
- J. Folks,
- R. Chhikara

- 19.↵
- S. Subramanian,
- R. Barbieri,
- E. Brown

- 20.↵
- R. Barbieri, Ed.

- S. Subramanian,
- R. Barbieri,
- E. N. Brown

- 21.↵
- 22.↵
- 23.↵
- 24.↵
- 25.↵
- 26.↵
- 27.↵
- Y. Pawitan

- 28.↵
- B. Jorgensen

- 29.↵
- 30.↵
- D. Daley,
- D. Vere-Jones

- 31.↵
- 32.↵
- 33.↵
- 34.↵
- 35.↵
- J. Patton, Ed.

- S. Subramanian,
- R. Barbieri,
- E. N. Brown

## Citation Manager Formats

## Article Classifications

- Biological Sciences
- Neuroscience