## New Research In

### Physical Sciences

### Social Sciences

#### Featured Portals

#### Articles by Topic

### Biological Sciences

#### Featured Portals

#### Articles by Topic

- Agricultural Sciences
- Anthropology
- Applied Biological Sciences
- Biochemistry
- Biophysics and Computational Biology
- Cell Biology
- Developmental Biology
- Ecology
- Environmental Sciences
- Evolution
- Genetics
- Immunology and Inflammation
- Medical Sciences
- Microbiology
- Neuroscience
- Pharmacology
- Physiology
- Plant Biology
- Population Biology
- Psychological and Cognitive Sciences
- Sustainability Science
- Systems Biology

# Physiological and pathological population dynamics of circulating human red blood cells

Edited by Leo P. Kadanoff, The James Franck Institute, Chicago, IL, and approved October 6, 2010 (received for review August 29, 2010)

## Abstract

The systems controlling the number, size, and hemoglobin concentrations of populations of human red blood cells (RBCs), and their dysregulation in anemia, are poorly understood. After release from the bone marrow, RBCs undergo reduction in both volume and total hemoglobin content by an unknown mechanism [Lew VL, et al. (1995) Blood 86:334–341; Waugh RE, et al. (1992) Blood 79:1351–1358]; after ∼120 d, responding to an unknown trigger, they are removed. We used theory from statistical physics and data from the hospital clinical laboratory [d'Onofrio G, et al. (1995) Blood 85:818–823] to develop a master equation model for RBC maturation and clearance. The model accurately identifies patients with anemia and distinguishes thalassemia-trait anemia from iron-deficiency anemia. Strikingly, it also identifies many pre-anemic patients several weeks before anemia becomes clinically detectable. More generally we illustrate how clinical laboratory data can be used to develop and to test a dynamic model of human pathophysiology with potential clinical utility.

In healthy human adults, ∼2.5 × 10^{11} new red blood cells (RBCs) are released from the bone marrow into the peripheral circulation per day, and about the same number are cleared. The cells composing the circulating population are thus continuously changing, but in healthy individuals (and patients with mild disease) the characteristics of the population are very stable. In the clinic, population characteristics such as the volume fraction of cells in the blood (hematocrit), the average RBC volume (MCV), the coefficient of variation in RBC volume (RDW), and the mean intracellular hemoglobin mass (MCH) are routinely measured in complete blood counts (CBCs) (1). Recently, it has become possible to identify and characterize very young (hours or days old) circulating RBCs (reticulocytes) (2). RBCs undergo a rapid reduction in volume and hemoglobin in the few days after release from the bone marrow (3). This rapid phase is followed by a much longer period of slower reduction (4–7) during which volume and hemoglobin are coregulated (8) (Fig. 1*A*). A comparison of the probability distributions of reticulocytes and of all circulating RBCs (Fig. 1*A*) shows that the correlation between volume and hemoglobin content increases as the cells mature, from an initial correlation coefficient of ∼0.40 in the reticulocyte population to ∼0.85 in the full population. Thus, whereas many of the molecular mechanisms involved are unknown, it is clear that the average RBC matures in such a way that its hemoglobin concentration tends toward the population mean corpuscular hemoglobin concentration (MCHC), shown as an iso-concentration line in Fig. 1 *A* and *B*: the variation in hemoglobin concentration is lower than that for volume and hemoglobin content (8).

## Results

The volume and hemoglobin regulation of an individual RBC in vivo during the course of its lifetime is extremely complex and difficult to understand. Understanding the average behavior of a large population of RBCs may be more tractable. To gain insight into this population-level behavior, we developed a model of RBC maturation and clearance that describes the dynamics of an RBC population. Our model applies only to the typical behavior of RBCs in healthy individuals or those with mild anemia and may not apply to RBCs in more severe disease or even to extreme behavior of RBCs in healthy individuals. Our model decomposes the volume (*v*) and hemoglobin (*h*) dynamics of an average RBC over time (*t*) into deterministic reductions (**f**) and random fluctuations (**ζ**) as shown in Eq. **1**, where *v* and *h* are scaled by their population means , and *t* is scaled by the average cell age . On the basis of data from prior reports (3–5, 7), we introduced two parameters into the deterministic component: a fast change (**β**), whose effect dominates until the RBC is close to the MCHC line, and a slow change (α). The random fluctuation is modeled as a Gaussian random variable with mean zero and variance given by a diffusion tensor 2**D**, as shown in Fig. 1*B* and Eq. **1**:

As with inverse problems in general and human pathophysiology in particular (9), this problem is ill posed in the sense that similar in vivo dynamics can result from different functional forms of **f**. We find that the precise functional form of **f** is less important to the behavior of this model than the qualitative combination of fast and slow deterministic dynamics and random fluctuations. See *SI Text* (*Functional Forms*, Figs. S1–S4, and Table S1) for details.

In our model, the random fluctuation and deterministic dissipation or reduction of volume and hemoglobin content for a typical individual cell are described by a Langevin equation, commonly used to model Brownian motion in a potential (10). The dynamics of the entire circulating population of RBCs may then be described by a master equation for the time-dependent joint volume–hemoglobin probability distribution () which can be approximated by a Fokker–Planck equation (10, 11). Eq. **2** describes the drift (**f**), diffusion (**D**), birth (*b*), and death (*d*) of probability density for this joint volume–hemoglobin distribution:

The birth and death processes account for the RBCs that are constantly added to and removed from the population. In states of health and mild illness, the total number of cells added equals the total number removed: . The precise trigger and mechanism for RBC removal are not fully understood (12), but empirical measurements such as those shown in Fig. 1*A* suggest that there is a threshold () along the MCHC line beyond which most RBCs have been cleared. See *Clearance Function* in *Materials and Methods* for details. CBC measurements vary from person to person but do not change significantly for a healthy individual (13), indicating that these dynamic processes reach a steady state in vivo, , where . See Movie S1 for an example of a numerical solution to Eq. **2**. With appropriate choice of parameters (α, **β**, **D**, and *v*_{c}), our model faithfully reproduces the observed distribution of RBC populations in healthy individuals. See Movie S2 for a comparison of a fitted and an observed distribution.

To test whether the model can distinguish the dynamics of RBC populations in healthy individuals from those in anemic individuals, we obtained CBC and reticulocyte measurements for individuals with three common forms of anemia with different underlying etiologies: anemia of chronic disease (ACD), an inflammatory condition; thalassemia trait (TT), a genetic disorder; and iron deficiency anemia (IDA), a nutritional condition (14). We selected mild cases of each anemia where RBC population characteristics appeared stable and a quasi-steady-state assumption was reasonable; we also selected several apparently healthy controls. See *Blood Samples* in *Materials and Methods* for details on sample collection, data collection, and diagnosis. For each patient sample, we identified an optimal parameter set (α, **β**, **D**, and ) that reproduces the steady state observed for that patient. We used a least-squares fit between the simulated steady-state distribution and the measured CBC distribution to identify the best fit. Where repeat tests were available for the same individual, we found that any variation in fitted parameters was explained by analytic variation in the CBC measurement. See *Parameter Fits* in *Materials and Methods* and Movie S2 for details. Fitted parameters for healthy and anemic individuals are shown in Fig. 2 and median fitted parameters are listed in Table 1.

We find clear differences between the best-fit parameters derived for healthy individuals and those with anemia, and we also find that the different anemic conditions have different characteristic parameter sets. For example, healthy individuals and ACD patients have high β_{v} and β_{h} and low α; i.e., they lose relatively more of their volume and hemoglobin during the fast phase than they do during the slow phase. In contrast, patients with TT and IDA lose relatively more volume and hemoglobin during the slow phase than in the fast phase. Patients with ACD show slightly elevated *D*_{v} and slightly reduced *D*_{h} relative to healthy individuals, whereas TT is associated with a larger increase in *D*_{v} along with a substantially reduced *D*_{h}. IDA patients have a *D*_{v} similar to that of healthy individuals and show dramatic elevation in *D*_{h} with most individuals >10 times higher than normal. The normalized critical volume, , in healthy individuals and those with ACD is ≈80% of , or ∼72 fL. Most patients with TT or IDA typically have a reduced and reduced . Fig. 2 shows that in addition to absolute reductions in and , the for these patients is further reduced and shows much greater variability across different individuals.

## Discussion

These parameters show that RBCs in patients with TT and IDA remain in the periphery with much smaller volumes and lower hemoglobin contents, in both absolute and relative terms, than they would under normal conditions. Their persistence may reflect a compensatory delay in clearance in response to the less efficient erythropoiesis of these anemias. Thus, mechanisms must exist that can alter the behavior of the trigger for RBC clearance. Comparing RBC clearance in TT and IDA with that of healthy individuals or ACD patients may provide a route to identifying the trigger. The variation in is much smaller than the variation in for healthy individuals (14), suggesting that the trigger is highly correlated with position on the MCHC line.

Our model offers a potential way to identify patients with latent or compensated IDA before it leads to clinical anemia, by looking for signs of clearance delay. We tested this possibility in an independent set of patients who had normal CBCs followed at least 30 and no more than 90 d later by either another normal CBC or clinical IDA. For each patient sample, we projected the (*v*, *h*) coordinates of all cells onto the MCHC line and integrated the probability density along this line below 85% of the mean (*P*_{0.85}). Fig. 3 *A–C* shows the evolution of one patient's CBC from normal to latent IDA and ultimately to IDA. The CBC shown in Fig. 3*B* was clinically unremarkable, but the *P*_{0.85} was abnormal, predicting anemia that did not come to medical attention for 51 d. Fig. 3*D* shows values of *P*_{0.85} for 20 normal CBCs from patients who remained healthy and 20 normal CBCs from those who developed IDA between 30 and 90 d later. The value of *P*_{0.85} predicted IDA with a sensitivity of 75% and a specificity of 100%. See *Predicting Iron Deficiency Anemia* in *Materials and Methods* for details. Common existing approaches have a sensitivity of 0% because all CBCs were “normal.” This model-based prediction relies on only a single CBC measurement at one point in time, in contrast to statistical regression approaches that often rely on the integration of multiple measurements and types of information from different sources and different time points (15). IDA is often the initial presentation in serious conditions such as colon cancer (16) and childhood malnutrition (17). Earlier detection of IDA via this method would enable a faster response to such conditions. If a reduced represents an adaptive physiologic response to offset the anemia, then perhaps one would expect the normal seen here for ACD, where the anemia itself may represent an adaptive physiologic response (18).

Fig. 2 shows that *D*_{h} differentiates IDA and TT, the two most common causes of microcytic anemia. TT is one of the most commonly screened conditions in the world, but existing diagnostic methods either are too expensive or have unacceptably low diagnostic accuracy, with false positive rates of up to 30% (19, 20). Our model provides a possibly more accurate way to distinguish these conditions. We first established a threshold value for *D*_{h} of 0.0045 by analyzing 10 training samples. We then analyzed 50 independent patient samples where diagnosis of either mild IDA or TT could be confidently established and calculated *D*_{h}. Fig. 4 shows that this *D*_{h} threshold had a diagnostic accuracy of 98%, correctly identifying 22/22 cases of IDA and 27/28 cases of TT and outperforming other published approaches by between 6% and 41% (20). See *Diagnosis of Microcytic Anemia* in *Materials and Methods* for details.

Our mathematical model, although simple in concept, captures important features of the complex pathophysiology of circulating RBCs and allows us to identify quantitative differences in the dynamics of health and disease, using readily available clinical measurements. This model also provides a useful framework for thinking about the dynamical processes governing physiological homeostasis and shows how an understanding of these dynamics motivates diagnostic tests and generates pathophysiological hypotheses.

## Materials and Methods

### Blood Samples.

Blood samples and CBC results were obtained from the clinical laboratory of a tertiary care adult hospital under a research protocol approved by the Partners Healthcare Institutional Review Board. Reticulocyte and CBC measurements were made within 6 h of collection (21) on a Siemens Advia 2120 automated hemanalyzer.

IDA was defined as mild reduction in hematocrit to no more than 20% below the lower limit of normal, low MCV, low ferritin, and historical evidence of normal MCV with normal hematocrit. Patients with acute illness, acute bleeding, transfusion in the prior 6 mo, concurrent hospitalization, chronic inflammatory illness, or hemoglobinopathy were excluded.

TT was defined as either a high hemoglobin A2 or heterozygosity for the presence of an α-globin gene mutation, as well as a reduction in hematocrit to no more than 20% below the lower limit of normal, low MCV, and normal ferritin. Patients with acute illness, acute bleeding, transfusion in the prior 6 mo, concurrent hospitalization, chronic inflammatory illness, or additional hemoglobinopathy were excluded.

ACD was defined as a reduction in hematocrit to no more than 20% below the lower limit of normal, normal or high ferritin, and a low or normal total iron binding capacity. Patients with acute illness, acute bleeding, transfusion in the prior 6 mo, concurrent hospitalization, or hemoglobinopathy were excluded.

### Clearance Function.

On the basis of observations of empirical RBC distributions, we model probability of RBC clearance as a function of the RBC volume and hemoglobin content. Each RBC's position in the volume–hemoglobin content plane is projected onto the MCHC line, and the probability of clearance (*d*) is defined as either a sigmoid or a step function of the distance from this projected point to a threshold, , on this line (Figs. 1 and 5; Fig. S2). Eq. **3** quantifies this relationship:

or

### Description of Simulation and Explicit Numerical Solution.

For a given set of parameters, Eq. **2** was solved numerically using a finite difference approximation for first-order and second-order spatial derivatives with boundary conditions of vanishing probability at volumes and hemoglobin contents outside the pathophysiological range and initial conditions equal to the empirically measured reticulocyte distribution. The volume–hemoglobin content plane was discretized with a constant mesh width and represented as a vector (**P**) of variables equal to the probability density in each mesh cell. Simulation results reported here were executed with a mesh width of 1.8 fL along the volume axis and 1.8 pg along the hemoglobin content axis. This mesh width was comparable to the analytic resolution of the empirical volume and hemoglobin content measurements. All results were confirmed using a smaller mesh width (1.2 fL and 1.2 pg). The convection contribution (**f**) was modeled numerically using an upwind finite difference approximation of the spatial derivative. The resulting linear system of ordinary differential equations was integrated using the MATLAB ode15s integrator and iterated until a steady state (**P**_{∞}) was reached.

The steady-state distribution for the numerical problem (**P**_{∞}) was also determined analytically by noting that the numerical approximations of the evolution (**J** + **L**) and clearance (**d**) terms are linear operators and that the integral scaling the birth process is a constant equal to the reciprocal of twice the mean age . The linear operators can be inverted, yielding a family of steady-state distributions indexed by the mean cell age, as show in Eq. **4**:

We found negligible differences between the steady-state distributions obtained by each approach.

### Parameter Fits.

We identified optimal neighborhoods in parameter space for each patient, using gradient and nongradient optimization methods. We started with a patient's empirically measured reticulocyte distribution and an initial randomly chosen parameter set. We calculated the resulting steady-state RBC distribution using Eq. **4**. This calculated distribution (**P**_{∞}) was then compared with the empirical distribution (**P _{CBC}**). The quality of the parameter estimates was quantified by computing an objective function equal to the sum of the normalized squared residuals for the discretized distributions, as shown in Eq.

**5**, where

*i*and

*j*represent indices of cells in the discretized volume–hemoglobin plane:

New parameter values were then chosen to reduce this objective function. We used gradient-based (lsqnonlin function) and non-gradient-based (fminsearch and patternsearch functions) optimization algorithms in MATLAB to search for optimal parameters. We constrained all parameters to be nonnegative and defined a uniform initial parameter space to exclude mean cell ages >1,000 or <5 d. We then picked initial parameters from this space, using latin hypersquare sampling. We imposed one optimization constraint, limiting α to be small enough that the mean cell age would be >5 d and large enough that the mean cell age would <1,000 d. Fig. 6 shows that the model provides a faithful reproduction of *P*_{CBC} for this healthy patient by comparing the simulated and measured steady-state probability distributions. When projected along the MCHC line, the empirical distribution for this patient has slightly higher density near the mode and slightly lower density in the regions to either side of the mode. See Movie S2 for more detail.

Fig. 7 shows the smallest local minima for parameters obtained from >200 optimizations for a single patient. Some results had local minima above the range of the axes. The best fits among all simulations form a small neighborhood of values for all parameters, demonstrating that the parameter estimation process reached a well-defined optimal neighborhood for this patient's blood sample.

### Predicting Iron Deficiency Anemia.

We tested the hypothesis that compensated or latent IDA can be predicted up to 90 d earlier than is currently possible on the basis of an expanding population of cells that project along the MCHC line closer to the origin than the mean. The projection operation is pictured in Fig. 5. We first determined the projected position (*u*) of each cell with volume (*v*) and hemoglobin (*h*) along the MCHC line:

We then calculated the fraction of projected cells located between the origin and 85% of , where *f*_{MCHC} is the probability density of the projected cells as a function of location on the MCHC line:

The 85% threshold was chosen by comparing the discrimination efficiency of different thresholds for the steady-state CBCs used in Fig. 2. The 85% threshold shown in Fig. 8 provided the greatest separation when compared with other thresholds (70, 75, 80 90, and 100%). We selected a threshold value for *P*_{0.85} of 0.121 on the basis of this training set.

We then identified 40 new and independent patient CBCs, all of which were normal. Twenty of these normal CBCs came from individuals who had additional normal CBCs 30–90 d later, and 20 of these normal CBCs came from individuals who presented with IDA no more than 90 d later. Patients with acute bleeding or any iron supplementation between the two CBCs were excluded. See *Blood Samples* in this section for definition of IDA. Fig. 3 shows that the threshold of 0.121 for *P*_{0.85} is able to predict IDA with a sensitivity of 75% and a specificity of 100% in this independent test group.

### Diagnosis of Microcytic Anemia.

To assess the diagnostic accuracy of *D*_{h} in differentiating the two most common causes of microcytic anemia, we fit parameters for 10 training cases: 5 with IDA and 5 with TT, as shown in Fig. 4. See *Blood Samples* for case definitions. The TT training set had *D*_{h} ranging from 1.7 × 10^{−15} to 2.3 × 10^{−15}. The IDA training set had *D*_{h} ranging from 0.009 to 0.043. We picked a threshold value equal to 0.0045, the average of the lowest *D*_{h} among the IDA training set and the highest of the TT training set. We then identified 50 new and independent cases, 22 with IDA and 28 with TT. All 22 IDA cases were correctly classified, and 27/28 TT cases were correctly classified for an overall diagnostic accuracy of 98%. We find this accuracy to be superior to that of four commonly cited discriminant functions (20): The Green and King formula yielded an accuracy of 92%; Micro/Hypo, 84%; Mentzer, 68%; and England and Fraser, 57%. See ref. 20 for further details on these other discriminant functions.

## Acknowledgments

We thank Alicia Soriano, Svetal Patel, Rosy Jaromin, Hasmukh Patel, David Dorfman, Paul Charest, the Harvard Crimson Tissue Bank, and the Partners Research Patient Data Repository for help acquiring samples and data. We thank Ricardo Paxson, Madhusudhan Venkadesan, and Shreyas Mandre for useful suggestions on the analysis; Frank Bunn, Carlo Brugnara, and Steve Sloan for helpful discussions regarding the findings; and Rebecca Ward for critical reading of the manuscript. The simulations in this paper were primarily run on the Orchestra cluster supported by the Harvard Medical School Research Information Technology Group. J.M.H. acknowledges support from National Institute of Diabetes and Digestive and Kidney Diseases Grant DK083242. L.M. acknowledges support from National Heart, Lung, and Blood Institute Grant HL091331.

## Footnotes

^{1}To whom correspondence may be addressed. E-mail: john_higgins{at}hms.harvard.edu or lm{at}seas.harvard.edu.Author contributions: J.M.H. and L.M. designed research; J.M.H. performed research; J.M.H. and L.M. analyzed data; and J.M.H. and L.M. wrote the paper.

Conflict of interest statement: J.M.H. and L.M. are listed as inventors on a patent application related to this manuscript submitted by their institutions.

This article is a PNAS Direct Submission.

This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1012747107/-/DCSupplemental.

## References

- ↵
- Fauci AS,
- et al.

- Fauci AS

- ↵
- d'Onofrio G,
- et al.

- ↵
- ↵
- Waugh RE,
- et al.

- ↵
- Willekens FL,
- et al.

- ↵
- ↵
- Willekens FL,
- et al.

- ↵
- Lew VL,
- Raftos JE,
- Sorette M,
- Bookchin RM,
- Mohandas N

- ↵
- ↵
- Zwanzig R

- ↵
- Kampen NGv

- ↵
- ↵
- Garner C,
- et al.

- ↵
- Robbins SL,
- Kumar V,
- Cotran RS

- ↵
- ↵
- ↵
- ↵
- Zarychanski R,
- Houston DS

- ↵
- ↵
- ↵

## Citation Manager Formats

## Sign up for Article Alerts

## Article Classifications

- Biological Sciences
- Systems Biology