Significance

This study provides a proof of concept that circulating RNA in blood plasma can be used to differentiate multiple inflammatory syndromes in children that are difficult to diagnose clinically, as well as provide evidence that circulating RNA in blood plasma can also be used to characterize organ damage using the same assay.

Abstract

Inflammatory syndromes, including those caused by infection, are a major cause of hospital admissions among children and are often misdiagnosed because of a lack of advanced molecular diagnostic tools. In this study, we explored the utility of circulating cell-free RNA (cfRNA) in plasma as an analyte for the differential diagnosis and characterization of pediatric inflammatory syndromes. We profiled cfRNA in 370 plasma samples from pediatric patients with a range of inflammatory conditions, including Kawasaki disease (KD), multisystem inflammatory syndrome in children (MIS-C), viral infections, and bacterial infections. We developed machine learning models based on these cfRNA profiles, which effectively differentiated KD from MIS-C—two conditions presenting with overlapping symptoms—with high performance [test area under the curve = 0.98]. We further extended this methodology into a multiclass machine learning framework that achieved 80% accuracy in distinguishing among KD, MIS-C, viral, and bacterial infections. We further demonstrated that cfRNA profiles can be used to quantify injury to specific tissues and organs, including the liver, heart, endothelium, nervous system, and the upper respiratory tract. Overall, this study identified cfRNA as a versatile analyte for the differential diagnosis and characterization of a wide range of pediatric inflammatory syndromes.
The differential diagnosis of inflammatory syndromes in children is complex owing to their overlapping clinical manifestations, nonspecific symptoms, and developmental age-related barriers to communication. These challenges often result in delayed or inaccurate diagnoses, thereby impeding effective clinical management and increasing the risk of long-term adverse health effects. A key example is Kawasaki disease (KD), an inflammatory syndrome of unknown etiology that primarily affects children under 5 y of age. KD is often misdiagnosed because significant clinical signs overlap between KD and other inflammatory and/or infectious conditions. Accurate diagnosis is critical, as KD patients who do not receive intravenous immunoglobulin (IVIG) early in the course of illness have a substantially increased risk of developing coronary artery aneurysms, making KD the leading cause of acquired heart disease in children. Thus, there is a clear need for accurate molecular tests for inflammatory conditions such as KD to inform timely and appropriate treatment.
Currently, the differential diagnosis of pediatric inflammatory syndromes relies on clinical assessment of signs and symptoms and results from a broad array of laboratory tests. Culture-based and molecular assays are routinely used to identify viral and bacterial pathogens, but these tests do not interrogate the host response and hence are unable to differentiate between infectious and noninfectious conditions. Serologic metabolic and antigen biomarkers used for diagnosis often lack specificity (1, 2). To address these limitations, recent studies have explored the use of whole blood RNA transcriptome profiling of the human host response to assess disease severity and differentiate among inflammatory conditions including KD, multisysteminflammatory syndrome in children (MIS-C), viral, and bacterial infection (36). However, while the whole blood profile is indicative of the host immune response, it provides little information regarding the extent of inflammation-related cell injury or death in solid organ tissues. In contrast, cell-free nucleic acids in plasma, including cell-free DNA (cfDNA) and cell-free RNA (cfRNA) are promising analytes for evaluating inflammation as they are released by dead or dying cells originating from both the bloodstream and solid tissues (7). Many recent studies have explored the use of cell-free nucleic acids for monitoring of pregnancy (8, 9), cancer (10, 11), transplantation (12, 13), and infection (5, 1423), yet the potential of cell-free nucleic acids for the differential diagnosis of inflammatory syndromes remains largely unexplored.
In this study, we applied plasma cfRNA profiling by RNA sequencing (RNA-seq) to compare host immune and cellular injury responses associated with four different inflammatory and/or infectious syndromes in children. We identified shared signatures across multiple conditions, highlighting the importance of incorporating multiple comparison groups for the development of disease-specific biomarkers. Using cfRNA profiles, we built a machine learning model with high accuracy in differentiating between MIS-C and KD, two pediatric inflammatory syndromes with overlapping clinical presentations. We then expanded this methodology to construct a multiclass diagnostic classifier capable of distinguishing among KD, MIS-C, viral, and bacterial infections. In addition, we demonstrate that the cfRNA profile can be correlated with markers of tissue damage and may be able to differentiate among different viral infections. We propose an application of cfRNA profiling as a decision support tool for differential diagnosis of inflammatory syndromes in the clinical setting.

Results

Clinical Cohort.

We collected and analyzed 370 plasma samples from pediatric patients with inflammatory and infectious conditions at four hospitals in the United States, including Rady Children’s Hospital San Diego (UCSD), Emory, Children’s National Hospital (CNH), and University of California San Francisco (UCSF) (Fig. 1A). This sample set included patients diagnosed with KD, MIS-C, viral infection, bacterial infection, and other hospitalized pediatric controls, some but not all with inflammatory disease (for example, arthralgia, Crohn’s disease flare, parenchymal lung disease, chronic lung disease, toxic shock syndrome, and postvaccine myocarditis), as well as healthy children (Fig. 1A and SI Appendix, Table S1). Included cases of bacterial and viral infections were heterogeneous with respect to infection site and pathogen (Dataset S1). Age, sex, race, severity, and inflammatory status were recorded for each patient (Fig. 1 BE). All patient samples were collected within 4 d of hospitalization during the acute phase of disease (Materials and Methods). RNA next-generation sequencing was used to quantify the genes and cell types of origin of cfRNA in each sample, with an average of 20.7 million reads sequenced per sample (Materials and Methods). No batch effect was observed between samples from different hospitals (Materials and Methods and SI Appendix, Fig. S1 A and B). Machine learning models were built to identify biosignatures associated with different diseases.
Fig. 1.
Sample overview. (A) Sample counts and distribution of hospital of origin for each disease group. “Other” indicates other hospitalized pediatric controls (B) Age distribution, (C) C-reactive protein (CRP) levels, (D) sex distribution, and (E) ICU status distribution for each sample group.

cfRNA Signatures of Disease.

Our initial focus was on characterizing changes in cfRNA profiles in plasma that are common across the various inflammatory conditions. We performed pairwise differential gene abundance analysis between the healthy control group and each disease group (Materials and Methods, Fig. 2A, and Dataset S2). This analysis identified differentially abundant genes (DAGs) for all comparisons [Benjamini–Hochberg (BH) adjusted P-value < 0.05]. The smallest number of DAGs (n = 2,686) was identified when comparing healthy controls to patients in the other hospitalized control group, likely due to the smaller sample size and higher heterogeneity of this sample group. The greatest number of DAGs (n = 6,606) was identified when comparing healthy controls to patients with KD (Fig. 2A). Further analysis of the DAGs for all conditions revealed a significant number of shared genes (1,871 DAGs, Fig. 2A). Chief among these were histone protein coding genes that were elevated for all disease groups, indicating that an increase in histone transcripts is a universal indicator of inflammation (SI Appendix, Fig. S2A). The analysis also identified immune related transcripts Myeloperoxidase (MPO), Elastase, Neutrophil Expressed (ELANE), CD53, and C-X-C Motif Chemokine Receptor 2 (CXCR2) elevated in each disease group, homeostasis related transcripts CDK19, ANAPC5, and 32 mitochondrial protein coding RNAs in the control group (SI Appendix, Fig. S2B). Pathway analysis of the 1,871 overlapping DAGs revealed an enrichment of neutrophil and cell replication transcripts related to inflammation (Fig. 2B and Dataset S3). These findings point to shared signatures of inflammation among disease groups and emphasize the need for intergroup comparisons to develop disease-specific biosignatures.
Fig. 2.
Shared cfRNA signatures among different inflammation syndromes. (A) Differential abundance analysis using DESeq2 was performed between healthy controls and each other group individually. Vertical columns indicate the number of overlapping genes that are significantly differentially abundant between groups and controls (BH adjusted P-values < 0.05). Dots below bars indicate the groups being intersected. Horizontal columns indicate the total number of DAGs between groups and controls. (B) Significantly enriched pathways in the set of genes found to be differentially abundant between healthy controls and each disease group. Average P-value and fold change used for pathway analysis (Qiagen, IPA).

cfRNA Can Differentiate MIS-C and KD.

KD and MIS-C share many clinical characteristics: They are highly inflammatory, present with endothelial dysfunction, and have multiple overlapping signs, including skin rashes, mucosal involvement, and fever. There are currently no molecular tests to distinguish between KD and MIS-C. We therefore assessed whether cfRNA could be used to differentiate these severe pediatric inflammatory syndromes. We divided the KD (n = 101) and MIS-C (n = 97) samples into training (60%), validation (20%), and test (20%) sets, ensuring a roughly equal representation of hospital of origin and disease subclassification across all three sets (Fig. 3A). We used the training data for feature selection and to train machine learning models. We then used the validation set to select the final model, based on the model with the highest receiving operator characteristic area under the curve (AUC). Last, we evaluated the performance of the final model using the test set. To prevent influence of the test set on the training data and to ensure unbiased results, we normalized each set separately using a variance stabilizing transformation (Materials and Methods).
Fig. 3.
cfRNA distinguishes KD and MIS-C. (A) Overview of sample set and modeling scheme. (B) Volcano plot of differentially abundant transcripts between MIS-C and KD. Analysis was performed using the training dataset (DESeq2). (C) Adjusted P-value, base mean, and gene ROC AUC distributions for all significant genes from the training KD vs. MIS-C comparison. The red dashed line represents the threshold cutoff used for filtering prior to model training. The green shaded area indicates genes used. (D) ROC AUC values for the 14 machine learning classification models applied to training and validation sets. (E) ROC curves of the training, validation, and test sets using the GLMNET with LASSO regression algorithm. (F) Violin plots of the classifier scores from the GLMNET with LASSO regression algorithm.
In our initial analysis using the training set, we identified 1,242 DAGs between KD and MIS-C (DESeq2, BH adjusted P-value < 0.01 and base mean > 10; Fig. 3B and Dataset S4). We refined this gene list based on adjusted P-value, base mean, individual gene AUC, and fold change, resulting in a final tally of 132 genes for model input (Fig. 3C). We then trained 14 machine learning classification models, evaluating their performance on the validation set (Materials and Methods). The generalized linear model with elastic net regularization (GLMNET) model with least absolute shrinkage and selection operator (LASSO) regression exhibited the highest validation set AUC and was selected as our final model (Fig. 3D). A unique feature of the GLMNET LASSO algorithm is the feature selection step, which selected 25 genes for the modeling (SI Appendix, Table S2). Using this trained model, we tested the classification performance using the test set [receiver operating characteristic (ROC)-AUC train = 1.00, validation = 0.98, and test = 0.98] (Fig. 3E). We also observed similar distributions in the classification scores across sample sets, further confirming that there was little to no overfitting of the training and test sets (Fig. 3F).
We compared the KD vs. MIS-C gene panel to previously reported differentially expressed genes (DEGs) from a study of cultured endothelial cells treated with KD and MIS-C sera (24). We found significant overlap in genes; nine of the 25 genes in our panel were also identified as DEGs in the previously reported study (nominal P-value < 0.05). Of the overlapping genes, those elevated in acute KD are associated with autophagy (ANXA6, CD44, and PDCD4) and endothelial–mesenchymal transition (EndoMT) (STK17B and VIM), processes linked to endothelial cell function in inflammatory conditions (2529). This provides evidence that the key differences in endothelial dysfunction and cardiac outcomes between MIS-C and KD may be linked to differences in endothelial autophagy and activation of EndoMT pathways.

Multidisease Classification Using cfRNA.

We next asked whether cfRNA could be used in the more challenging scenario of multidisease classification. For this, we developed a machine learning framework that combines the outputs of one-vs.-one models using a random forest (RF) multiclass algorithm (Materials and Methods and Fig. 4A). We first split our dataset into training (70%) and test (30%) sets, with roughly comparable proportions of samples from patients with KD, MIS-C, viral infection, or bacterial infection, while also stratifying the groups evenly with respect to hospital of origin and disease subclassification. We performed differential abundance analysis for each pairwise comparison using only the training data (Dataset S5). Next, we trained individual one-vs.-one GLMNET models with LASSO regression for each sample group (MIS-C, KD, viral infection, or bacterial infection), using the top 100 genes identified in the differential abundance analysis. These genes were selected based on gene abundance AUC and level of significance (Materials and Methods). Each of the one-vs.-one models demonstrated high performance with individual model scores classifying samples with high accuracy (test AUC: min = 0.86, max = 0.99; test accuracy: min = 0.83, max = 0.92; Fig. 4B and SI Appendix, Table S3). The union of genes used by each GLMNET model with LASSO regression generated a 109-gene panel (Dataset S6). To combine the outputs of the individual models, we trained a multiclass RF classifier using the classification scores from the one-vs.-one models. Using this framework, our multiclass machine learning model achieved high accuracy in both the training and test sets (accuracy, train = 100% and test = 80%). The bacterial infection group had the highest rate of misclassified samples, likely due to the smaller sample size and heterogeneity of both pathogen and site of infection (SI Appendix, Table S3 and Dataset S1). The high performance of the multiclass machine learning model on a relatively small number of genes points to the potential utility of cfRNA in differentiating complex inflammatory conditions in a clinical setting.
Fig. 4.
Multiclass classification of pediatric disease using cfRNA. (A) Overview of machine learning framework used for multiclass classification. (B) ROC-AUC plots for each one-vs.-one model trained, along with train (Top) and test (Bottom) classifier score distributions. (C) Confusion matrix of reference and predicted diagnoses for train and test samples. Color indicates fraction of samples in each category.

Characterization of Disease Using cfRNA.

We further investigated whether cfRNA could be employed not only for classification but also for disease characterization. Since inflammation and/or infection can impact multiple organ systems, understanding organ-specific damage and function is crucial for guiding clinical management. Previous work from our group and others has shown that quantifying the cfRNA cell-type-of-origin (CTO) is a viable noninvasive method to assess cell, tissue, and organ-specific injury (17, 30). Here, we employed deconvolution of the cell types of origin of cfRNA using BayesPrism and the Tabula Sapiens human cell atlas as a reference. The CTO estimates were compared with known biomarkers and other indicators of specific organ injury or dysfunction (Materials and Methods and Fig. 5A). We first compared hepatocyte and intrahepatic cholangiocyte contributions to the cfRNA in plasma to alanine transaminase measurements (ALT, n = 141). We observed significantly elevated levels of cfRNA from hepatocytes and intrahepatic cholangiocytes in patients with high ALT (ALT > 100 IU/L) compared to patients with normal ALT (ALT < 40 IU/L, Fig. 5A). To explore cardiac function and damage, samples were categorized as having either normal or abnormal cardiac function (Materials and Methods). In the abnormal group, we observed increased levels of cfRNA from cardiac muscle cells and pericytes, likely indicative of increased cardiac cell injury or death. Interestingly, this group also exhibited elevated cfRNA levels from kidney epithelial cells, intrahepatic cholangiocytes, club cells and type I pneumocytes, and bronchial epithelium, suggesting other end organ damage associated with impaired cardiac function (Fig. 5A). Of note, patients with abnormal cardiac function did not have elevated levels of smooth or skeletal muscle cell derived cfRNA (SI Appendix, Fig. S3A). Last, to evaluate lung function and damage we focused on samples from healthy individuals and viral infection patients with COVID-19, further stratifying the COVID-19 samples by disease severity (Materials and Methods). We observed elevated levels of cfRNA from club cells and type I pneumocytes in moderate COVID-19 cases compared to healthy individuals, with a greater increase for severe cases (Fig. 5A). We observed similar trends in the levels of bronchial epithelium derived cfRNA; however, the difference was not statistically significant.
Fig. 5.
cfRNA as a clinical decision support tool. (A) Median scaled cell type of origin fractions for samples separated by liver damage, cardiac function, COVID-19 severity, and endothelial damage, and healthy controls (Materials and Methods). Stars indicate statistically significant differences between groups in comparison (Wilcoxon rank-sum test, BH adjusted P-value < 0.05). (BE) Case studies of patients with example clinical decision support tool results from the multiclass algorithm, along with measurements of endothelium, heart, liver, lung, and neuronal damage from the deconvolution data. Sample shown in red and healthy donor samples in gray. Z-scores calculated relative to healthy donor sample distributions. Endothelium refers to endothelial cell, heart to cardiac muscle cells, liver to hepatocyte, lung to club cell and type I pneumocyte, and neuronal to Schwann cell cfRNA cell type of origin fractions.
We next assessed whether cfRNA could be used to further stratify sample groups by distinguishing between COVID-19 infection from Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) and infections from other viral pathogens [influenza, respiratory syncytial virus (RSV), Epstein-Barr virus (EBV), adenovirus, etc.]. We separated viral infection samples into COVID-19 and non-COVID-19 viral infections and randomly split the data into training (70%) and testing (30%) datasets. Next, we trained a GLMNET with LASSO regression model to differentiate between these groups (Materials and Methods). The trained model had high performance on the training and test sets despite including only seven genes (train AUC = 0.99 and test AUC = 0.93), demonstrating the potential of cfRNA for respiratory viral species differentiation (SI Appendix, Fig. S3 B and C).

cfRNA Profiling as a Clinical Decision Support tool.

We next simulated a cfRNA “report” for each patient in the multiclass test set (Dataset S7) and discuss four patient cases in detail to illustrate how cfRNA can be integrated to support clinical decision-making. Each report includes diagnostic predictions from the multiclass classification algorithm and predicted organ involvement levels as z-scores of CTO fractions in samples from patients with disease compared to healthy patients.
The first case is a 13-y-old male with very-high-risk B cell acute lymphoblastic leukemia in delayed intensification chemotherapy who presented with a 1-d history of fever (Fig. 5B). His laboratory assessments were notable for leukopenia, anemia, thrombocytopenia, elevated C-reactive protein (CRP), mildly elevated ALT, normal kidney function, and developed hypotension following admission. He had known household contacts with COVID-19, and his nasopharyngeal PCR was positive for SARS-CoV-2, but he did not develop respiratory symptoms. The final diagnosis was COVID-19 in an immunocompromised host. Despite the immunocompromised state of the patient and lack of respiratory symptoms common in COVID-19, the cfRNA algorithm correctly identified the patient as having a viral infection. Furthermore, the elevated levels of liver and heart derived cfRNA are interesting given the clinical manifestations of mildly elevated ALT and hypotension.
In the second case, the cfRNA model correctly predicted the patient as having MIS-C, but the classifier score for MIS-C and viral infection were very comparable, producing what could be considered a “borderline” result (Fig. 5C). The patient is a 4-y-old male who presented with a 1-wk history of abdominal pain, vomiting, and watery diarrhea, with subsequent development of fever, conjunctival injection, and swelling of the hands and feet. The patient had normal white blood cell counts, with lymphopenia, thrombocytopenia, hyponatremia, acidosis, and elevated brain natriuretic peptide (BNP), CRP, and erythrocyte sedimentation rate, with normal liver and kidney function. He was SARS-CoV-2 PCR negative but SARS-CoV-2 immunoglobulin G positive. No additional viral testing was performed. His course was complicated by hypotension, but an echocardiogram was grossly normal. The final diagnosis was MIS-C, which was the top prediction from the cfRNA model. Interestingly, viral infection was the second most probable diagnosis from the cfRNA model. MIS-C is derived from a viral infection (SARS-CoV-2), and MIS-C and viral infection can be difficult to differentiate. There are at least three possible interpretations for these results: 1) the model is correctly identifying MIS-C along with a viral signature from SARS-CoV-2, 2) the patient is misdiagnosed with MIS-C and has only a viral infection, or 3) the detection of a viral signature is due to errors in the model. We also observed elevated liver cfRNA despite the patient having normal liver enzymes along with elevated heart tissue derived cfRNA, which is consistent with the elevated BNP.
In the third case, the model incorrectly predicted the diagnosis in an immunocompromised patient, but the correct diagnosis of bacterial infection was the second most likely prediction (Fig. 5D). The patient is a 16-mo-old male with a history of Wiskott–Aldrich syndrome requiring a bone marrow transplant. He was admitted with fever, fatigue, pancytopenia with severe thrombocytopenia, elevated CRP, and blood cultures positive for Klebsiella pneumoniae. The source of the infection is unknown but was believed to be secondary to bacterial gut translocation or central line infection. His pancytopenia and thrombocytopenia were attributed to his history of Wiskott–Aldrich syndrome. He was treated with cefepime and responded clinically to a full course of antibiotics. The final diagnosis was bacterial infection, which was the second ranked prediction by the cfRNA model. Interestingly, the patient presented with some symptoms characteristic of MIS-C, the apparently erroneous top prediction made by the cfRNA model; specifically with severe thrombocytopenia and elevated CRP, which are part of the MIS-C case definition (31).
In the last case, the model predicted an incorrect diagnosis, but the results of the multiclass model would have been informative in providing the correct diagnosis (Fig. 5E). The patient presented with fever during the COVID-19 pandemic and was initially diagnosed with MIS-C. However, a family member of the patient had recently been diagnosed with KD. Given the known genetic predisposition associated with KD, a reevaluation of the original case was conducted. The patient was subsequently found to meet the American Heart Association’s case definition for KD and had initial laboratory values and a subsequent clinical course that were consistent with KD (including periungual desquamation). The patient was reclassified accordingly. Although the multiclass model did not predict the proper diagnosis of KD, the classifier scores for KD, bacterial infection, and viral infection were very similar, while the classifier score for MIS-C was much lower. Furthermore, the MIS-C vs. KD pairwise classifier accurately predicted KD. While the final diagnosis of the multiclass model was incorrect, the multiclass model and MIS-C vs. KD classifier results could have aided the clinicians in determining that the patient did not have MIS-C upon initial admission.

Discussion

In this work, we established a proof of concept that cfRNA can be used to differentiate among inflammatory syndromes in children across both one-on-one and multiclass comparisons. Interest in the use of blood-borne nucleic acids for diagnosis of inflammatory conditions is driven by continual advancements in sequencing technology, the ease of drawing blood, and the critical need for better diagnostic tools. Coote et al. recently introduced whole blood RNA multiclassification models for pediatric illnesses, distinguishing 6 broad and 18 granular categories. However, whole blood RNA biosignatures are primarily derived from immune cells, and do not provide information on cell, tissue, and organ damage. In contrast, cfRNA signals derive from dead and dying cells in blood and tissues, thus providing insight into immune dynamics and underlying tissue involvement (810, 1719, 30, 32, 33). Kalantar et al. used a combination of cfRNA, cfDNA, and whole blood RNA for differentiating sepsis and other causes of critical illness (5). However, the approach used by Kalantar et al. lacks consolidation into a unified multiclass model, requires the use of two different analytes (whole blood and plasma), and did not analyze cfRNA to characterize tissue and organ injury. Our work expands on these recent studies by presenting a multiclass framework using only cfRNA to differentiate, diagnose, and better characterize inflammatory syndromes in children.
The data from this study provide opportunities to better understand cfRNA profiles in health and disease. We observed cfRNA signatures that are common to all inflammatory conditions, including elevated histone related RNAs, neutrophil extracellular trap (NET) components MPO and ELANE (34), as well as immune markers interleukin enhancer binding factor 2 (ILF2), interferon gamma inducible protein 16 (IFI16), CD53, and CXCR2. NETs are composed of DNA, histones, and other proteins, and act to trap and neutralize pathogens while minimizing host cell damage (35). Our observation of cfRNA signatures of NET formation is consistent with the reported role of NETs in KD and MIS-C (36, 37).
To test the ability of cfRNA profiles to distinguish among pediatric inflammatory syndromes, we compared cfRNA profiles for patients diagnosed with KD and MIS-C, conditions that are very similar clinically. There is a need for improved molecular tools to discriminate between these two conditions as exemplified by a recent study by Day-Lewis et al. which reported significant overlap in signs and symptoms between KD and MIS-C (based on the 2023 case definition), with an estimated false positive rate of 8% (31). The cfRNA signature presented here has near-perfect accuracy and therefore has high potential for translation into a useful clinical molecular test. Furthermore, the identified gene signature provides mechanistic insight into KD and MIS-C. Beyond KD and MIS-C, we show that cfRNA can similarly differentiate between COVID-19 and other viral infections, opening the door for future studies to create more granular classification algorithms that can differentiate among pathogens at the species level based on cfRNA host profiling.
In this manuscript, we present multiple gene panels that accurately distinguish similar pediatric inflammatory conditions. Genes exhibiting high diagnostic relevance likely reflect the underlying molecular differences between conditions. For instance, the elevated expression of eukaryotic translation elongation factor 2 (EEF2) in KD suggest potential links to senescence and exposure to bacterial toxins, which is compelling given KD is thought to be triggered by an inhaled toxin or pathogen (38, 39). Conversely, the elevated expression of FKBP5 in MIS-C implicates a unique neuronal immune-stress response, which is relevant given the neuronal involvement reported for MIS-C (40, 41). Further analysis uncovered markers related to autophagy and EndoMT in the KD vs. MIS-C panel, aligning with previous findings comparing the transcriptome of endothelial cells treated with sera from KD and MIS-C patients. These observations may provide insights into the higher rates of acute cardiac involvement in MIS-C and the long-term arterial wall remodeling observed in KD. Dysregulation of autophagy in MIS-C could potentially lead to endothelial dysfunction, resulting in impaired blood flow to the heart and contributing to the higher rates of cardiac complications observed in this condition. Moreover, a lack of EndoMT pathway activation in MIS-C may account for lower rates of long-term remodeling compared to KD, despite the higher rates of acute cardiac involvement.
The multiclass gene panel can provide insights into key differences between conditions. For example, AKAP12, an A-kinase anchoring protein transcribed in endothelial cells (4244), was more abundant in KD and MIS-C as compared to viral and bacterial infections, and selected by each of the 1v1 models designed to differentiate KD and MIS-C from viral and bacterial infections. Additionally, BNIP3L was elevated in bacterial infection, potentially reflecting its role in mitophagy, while DNMT1, IFI27, SLFN5, and AHNAK were elevated in viral infection, and PABPC4 was elevated in KD.
The case studies focusing on both correct and incorrect predictions made by the classification model highlight the potential utility of cfRNA profiling in the clinical setting. They also underscore the importance of inclusion of as many “real-life” cases as possible during the development of classification models to maximize the specificity of their predictions. In particular, immunocompromised patients exhibit altered immune and tissue/organ damage responses, and more data will likely need to be collected and classifications models tailored to optimize performance of cfRNA profiling in this subgroup. However, we are encouraged that our results suggest that cell, tissue, and/or organ injury based on cfRNA levels can be detected even for those cases for which the model disease classification is incorrect.
Here, we present a multiclass model for differential diagnosis of inflammatory syndromes using plasma cfRNA. The final model includes just 109 genes, suggesting that translating the model to a multiplexed qRT-PCR based platform with a rapid turnaround time of a few hours is likely practical. Furthermore, one of the most compelling aspects of cfRNA profiling is the ability to quantify the extent of organ involvement. We demonstrate this concept with simultaneous quantification of injury to multiple organ systems (liver, cardiac, lung, endothelium). Available clinical tests for evaluating tissue injury (for example, alanine aminotransferase levels in the liver) enabled us to confirm the accuracy and cfRNA-based detection of organ injury. Even more compelling is the potential clinical utility of cfRNA in quantifying injury to tissues where current tests are not available or lack adequate sensitivity, such as endothelial and neuronal injury. In the future, simultaneous diagnosis and characterization of tissue injury may be critical in informing clinicians on the optimal management and treatment of their patients with inflammatory syndromes of unknown etiology.

Materials and Methods

Ethics Statement.

The UCSF Institutional Review Board (IRB) (#21-33403), San Francisco, CA; Emory University IRB (STUDY00000723), Atlanta, GA; Children’s National Medical Center IRB (Pro00010632), Washington, DC; and Cornell University IRB for Human Participants (2012010003), New York, NY, each approved the protocols for this study. All samples and patient information were deidentified for analysis and shared with collaborating institutions. At Emory University, the IRB-approved protocol was a prospective enrollment study under which parents provided consent and children assent as appropriate for age. At Children’s National Medical Center and UCSF, the IRB protocols were “no subject contact” sample biobanking protocols under which consent was not obtained and data were extracted from medical charts. At UCSD, the IRB reviewed and approved collection and sharing of samples and data (IRB #140220). Signed consent and assent were provided by the parent(s) and pediatric patient, respectively.

Sample Acquisition.

Samples were acquired from UCSF as previously described (17). Briefly, hospitalized pediatric patients were identified as having COVID-19 by testing positive with SARS-CoV-2 real-time PCR (RT-PCR). Residual whole blood samples were collected in EDTA lavender top tubes and diluted 1:1: in DNA/RNA shield (Zymo Research). The remaining blood was centrifuged at 2,500 rpm for 15 min and the available plasma was retained. All samples were stored at −80 °C freezer until used. Samples were acquired from Emory and Children’s Healthcare of Atlanta as previously described (17). Briefly, pediatric patients were classified as having COVID-19 by SARS-CoV-2 RT-PCR and as having MIS-C if they met the CDC case definition. Controls were healthy outpatients with no known history of COVID-19 who volunteered for specimen collection. Whole blood was collected in EDTA lavender top tubes and aliquoted for plasma extraction via centrifugation at 2,500 rpm for 15 min. Samples were stored at −80 °C and shipped on dry ice to either UCSF or Cornell for analysis. Samples were acquired from UCSD prior to any treatment in all subjects in EDTA lavender top tubes and centrifuged at 2,000 g for 10 min. Plasma was collected and stored at −80 °C. Samples were acquired from CNH as previously described (17). Briefly, pediatric patients were classified as having MIS-C if they met the CDC case definition. Whole blood samples were collected and centrifuged at 1,300× G for 5 min at room temperature. Plasma was aliquoted into a cryovial and frozen at −80 °C.

Clinical Data.

Patients were stratified as previously described (17). For the purposes of this study, patients were classified as having MIS-C by multidisciplinary teams that adjudicated whether a patient met the CDC case definition of MIS-C. COVID-19 was defined as any patient with PCR-confirmed SARS-CoV-2 infection within the preceding 14 d who did not also meet the MIS-C case definition. KD patients at UCSD met the AHA definition for complete or incomplete KD. Viral and bacterial infection patients enrolled at UCSD were adjudicated and a final diagnosis assigned by the research team (one ED clinician and one pediatric infectious disease expert) 2 to 3 mo after the acute illness to allow time for serologies, recurrence, and clinical recovery to be assessed. Patients with a self-limited illness that resolved without treatment and for whom viral studies were either negative or not done were classified as having a “viral syndrome.” Clinical data were abstracted from the medical record and submitted into a REDCap databases housed at UCSF or UCSD.

Sample Processing and Sequencing.

Samples were processed as described previously (17). Briefly, samples were received on dry ice, RNA was extracted, and libraries were prepared and sequenced on a NextSeq or NovaSeq Illumina sequencer. Reads were trimmed to 61 bp, and sequencing data were processed using a custom bioinformatics pipeline which included quality filtering and trimming, alignment to the human GRCh38 reference genome, and counting of gene features.

Sample Quality Filtering.

Using the sequencing data, quality control was performed by analyzing DNA contamination, rRNA contamination, total counts, and RNA degradation. DNA contamination was estimated by calculating the ratio of reads mapping to introns and exons. rRNA contamination was measured using SAMtools (v1.14). Total counts were calculated using featureCounts30 (v2.0.0). Degradation was estimated by calculating the 5′-3′ bias using Qualimap31 (v2.2.1). Samples were removed from analysis if either the intron to exon ratio was greater than 3, if a sample had less than 75,000 total feature counts, or if the 5′-3′ read alignment ratio bias was greater than 2.

Differential Abundance Analysis.

Gene transcript abundances were compared using a negative binomial model implemented using the DESeq2 R package (45). Gene transcript base mean abundance, adjusted P-value, and log2 fold change were taken from the DESeq2 Results output. Gene transcript AUC was calculated using VST transformed counts and the pROC R package (46).

Sample Partitioning.

Samples were partitioned for machine learning applications taking into consideration diagnosis, hospital of origin, and disease subclassification. MIS-C and COVID-19 samples were subclassified by severity, as previously defined (17). KD samples were subclassified by phenotypic subclusters, as previously defined (47). Non-COVID-19 Viral and bacterial infection samples were evenly partitioned by diagnosis and hospital of origin only.

Machine Learning: MIS-C vs. KD.

Samples were partitioned into training, validation, and test sets at a ratio of 60:20:20, partitioning evenly based on factors such as diagnosis, severity/subgroup, and hospital of origin. Subsequently, differential abundance analysis was conducted on the training data. Genes were filtered based on specific criteria (adjusted P-value < 0.01, base mean abundance > 100, gene transcript AUC > 0.65, and absolute log2 fold change > 0.25). Filtering criteria were chosen with the intention of selecting <150 genes. We tested multiple thresholds using our training and validation sample sets and found that the model results were not sensitive to the thresholds chosen. The test set was not used for determining thresholds for gene filtering.
Raw counts for the training, validation, and test sets were individually normalized using variance stabilizing transformation, and subsets were created based on the chosen transcript features. Variance stabilization transformation was performed using the DESeq2 package, and the dispersion function from the training set was used to transform the test and validation sets. Fourteen machine learning classification algorithms were employed using the R package Caret (https://doi.org/10.18637/jss.v028.i05), including generalized linear models (GLM) with Ridge and LASSO feature selection (GLMNETRIDGE and GLMNETLASSO), support vector machines with linear and radial basis function kernel (SVMLin and SVMRAD), RF, RF ExtraTrees (EXTRATREES), neural networks, linear discriminant analysis, nearest shrunken centroids (PAM), C5.0 (C5), k-nearest neighbors, naive bayes (NB), CART (RPART), and GLM. Training was performed using fivefold cross-validation and grid search hyperparameter tuning.
For each model, classification score thresholds were determined using Youden’s index on the training data. The trained models were then employed to predict labels for the validation set, and performance was assessed using the area under the ROC-AUC. The model achieving the highest AUC on the validation set was selected as the final model and subsequently applied to the test set, which had not been utilized in any phase of model training or selection. Prediction on both the validation and test sets utilized the Youden’s index threshold derived from the training set.

Machine Learning: Multiclassification.

Samples were partitioned into train and test sets at a ratio of 70:30, considering factors such as diagnosis, severity/subgroup, and hospital of origin. Raw counts for the training and test sets were individually normalized using variance stabilizing transformation, as implemented using the DESeq2 package, and the dispersion function from the training set was used to transform the test set. One-vs.-one GLMNET LASSO models were trained for each pairwise comparison of samples groups (e.g., KD vs. Viral Infection) using the top 150 significant features (adjusted P-value <0.05, base mean abundance > 50, and absolute log2 fold change > 1) ordered by gene transcript AUC calculated using the training data. Trained models were used to calculate classification scores for all samples in the train and test dataset, resulting in six classification scores for each sample. Classification scores were used to train a multiclass RF algorithm which assigned probability scores for each condition. The final condition with the highest probability score from the RF was assigned as the prediction for each sample.

CTO Analysis.

Cell type deconvolution was performed using BayesPrism (v1.1) (48) with the Tabula Sapiens single-cell RNA-seq atlas (Release 1) (33) as a reference. Cells from the Tabula Sapiens atlas were grouped as previously described by Vorperian et al. (30). Cell types with more than 100,000 unique molecular identifiers (UMIs) were included in the reference and subsampled to 300 cells using ScanPy (v1.8.1) (49). Deconvolution values were scaled from 0 to 1 for each cell type, and medians were calculated for plotting.
For liver damage assessment, samples were separated as having normal or high ALT measurements (normal: ALT < 40 IU/L, high: ALT > 100 IU/L). The ALT measurements were taken from the same blood draw as the plasma for cfRNA processing. For cardiac function, patients were categorized as abnormal if they had abnormal EKG/ECG and/or echocardiogram results. EKG/ECG and echocardiogram results were categorized as abnormal in the context of the patient narrative and final interpretation of the studies. For endothelial damage, samples were separated as either having KD/MIS-C or bacterial/viral infection. COVID-19 patients were determined to have moderate or severe disease using the following criteria:
Moderate: The patient must have been hospitalized due to COVID-19 respiratory disease and/or any systemic/nonrespiratory symptoms attributed to COVID-19 (e.g., neonatal fever, dehydration, new diagnosis diabetes, acute appendicitis, necrosis of extremities, diarrhea, encephalopathy, renal insufficiency, mild coagulation abnormalities, etc.).
Severe: The patient must have been hospitalized for COVID-19 with either high-flow oxygen requirement [high-flow nasal cannula, continuous positive airway pressure, bilevel positive airway pressure, intubation with mechanical ventilation, or extracorporeal membranous oxygenation] and/or evidence of end-organ failure [acute renal failure requiring dialysis, coagulation abnormalities resulting in bleeding or stroke, diabetic ketoacidosis, hemodynamic instability requiring vasopressors], and/or dying from COVID-19. These patients were almost always admitted to the intensive care unit (ICU).

Machine Learning: COVID-19 vs. Non-COVID-19 Viral Infection.

Training was done using the same method as the KD vs. MIS-C model. Briefly, viral infection samples were partitioned into train and test sets at a ratio of 70:30, considering factors such as COVID-19 status, severity/subgroup, and hospital of origin. Subsequently, differential abundance analysis was conducted on the training data. Genes were filtered based on specific criteria (adjusted P-value < 0.01, base mean abundance > 100, gene transcript AUC > 0.65, and absolute log2 fold change > 0.25), and the top 150 genes, as ordered by gene transcript AUC, were selected as inputs for machine learning analysis.
Raw counts for the train and test sets were individually normalized using variance stabilizing transformation, and subsets were created based on the chosen transcript features. A GLMNET with LASSO regression was trained on the training set using fivefold cross-validation and grid search hyperparameter tuning. Classification score thresholds were determined using Youden’s index on the training data. The trained models were then employed to predict labels for the test set using the Youden’s index threshold derived from the training set.

Quantification and Statistical Analyses.

The programming language R (v4.1.0) was utilized for all statistical analyses. Statistical significance was assessed through two-sided Wilcoxon signed-rank tests and Mann–Whitney U tests, unless specified otherwise. Machine learning algorithms were trained using the Caret R package, and pipelines were run using the Snakemake workflow management system (50, 51). In boxplots, boxes denote the 25th and 75th percentiles, the band within the box signifies the median, and whiskers extend to 1.5 times the interquartile range of the hinge. The alignment of all sequencing data was performed against the GRCh38 Gencode v38 Primary Assembly, with feature counting conducted using the GRCh38 Gencode v38 Primary Assembly Annotation (52).

Data, Materials, and Software Availability

Raw sequencing data in this study cannot be deposited in a public repository due to patient privacy concerns and lack of consent for a subset of the patient samples. Instead, deidentified RNA-seq count matrices have been uploaded to the National Center for Biotechnology Information Gene Expression Omnibus database and will be publicly available upon publication (GSE255555) (53). All code has been deposited on GitHub and will be available upon publication.

Acknowledgments

We thank the patients and their families for their help to further our understanding of pediatric inflammatory conditions. We would like to acknowledge staff members at the UCSF Clinical Laboratories and the UCSF Clinical Microbiology Laboratories for their help in identifying and retrieving patient whole blood samples. We thank the Cornell Genomics Center and the UCSF Center for Advanced Technology for help with sequencing libraries. At Emory, we thank Nadine Baida, Amrita Banerjee, Julia Bartol, Kushmita Bhakta, Caroline Ciric, Khalel De Castro, Khadijah Francois, Theda Gibson, Laila Hussaini, Grace Li, Wensheng Li, Austin Lu, Lisa Macoy, Clair Martin, Molly Morrison, Amy Muchinsky, Heather Nurse, Maria A. Perez, Etza Peters, Brianna Rice, Anna Siaw-Anim, Kathy Stephens, Elizabeth Grace Taylor, Ashley Tippett, Katie Zaks, and the Children’s Healthcare of Atlanta Research Laboratory for their contributions to specimen and data collection. At UCSD we thank the members of the PEMKDRG: Lukas Austin-Page, MD, Amy Bryl, MD, Joelle Donofrio-Ödmann, MD, Atim Ekpenyong, MD, David Gutglass, MD, Scott Herskovitz, MD, Paul Ishimine, MD, John Kanegaye, MD, Margaret Nguyen, MD, Mylinh Nguyen, MD, Kristy Schwartz, MD, Stacey Ulrich, MD, Tatyana Vayngortin, MD, and Elise Zimmerman, MD. We thank the members of the CHARMS study that contributed samples: Jocelyn Ang, MD, Margalit Rosenkranz, MD, Joseph Bochini, MD, Michelle Sykes, MD, Lerraughn Morgan, MD, Laura D’Addese, MD, and Maria Pilar Gutierrez, MD. This work was funded by NIH/National Institute of Child Health and Human Development grants R61HD105618, R33HD105618, and R33HD105593 (R.L.D., C.A.R., C.S., A.H.T., J.C.B., C.Y.C., and I.D.V.). The funder had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Author contributions

C.J.L., M.D., R.L.D., C.A.R., J.C.B., C.Y.C., and I.D.V. designed research; C.J.L., A.B., J.L., E.B. processed samples; C.J.L., V.S., A.S.-G., A.B., W.S., J.N., M.E.W., M.O., M.A.G., J.-H.C., H.-M.H., H.W., J.K., C.S., A.H.T., J.C.B., C.Y.C., and I.D.V. analyzed data; W.S., J.N., M.E.W., M.O., M.A.G., P.E.M.K.D.R.G., C.H.A.R.M.S., J.-H.C., H.-M.H., C.S., A.H.T., M.D., R.L.D., C.A.R., J.C.B., and C.Y.C. identified and collected patient samples and clinical metadata; and C.J.L., J.C.B., C.Y.C., and I.D.V. wrote the paper.

Competing interests

C.J.L and I.D.V are inventors on submitted patents pertaining to cell-free nucleic acids (US patent applications 63/237,367 and 63/429,733). I.D.V. is listed as an inventor on submitted patents pertaining to cell-free nucleic acids (US patent applications 63/237,367, 63/056,249, 63/015,095, 16/500,929, 41614P-10551-01-US) and receives consulting fees from Eurofins Viracor. C.A.R. has received institutional support from ModernaTX, Inc., Pfizer Inc., BioFire Inc., GSK plc, MedImmune, Micron Technology Inc., Janssen Pharmaceuticals, Merck & Co., Inc., Novavax, PaxVax, Regeneron, and Sanofi Pasteur. She is co-inventor of patented RSV vaccine technology which has been licensed to Meissa Vaccines, Inc. C.Y.C. receives grant funding for research unrelated to this work from the Bay Area Lyme Disease Foundation, the Chan-Zuckerberg Biohub, and Abbott Laboratories, Inc.

Supporting Information

Appendix 01 (PDF)
Dataset S01 (CSV)
Dataset S02 (CSV)
Dataset S03 (CSV)
Dataset S04 (CSV)
Dataset S05 (XLSX)
Dataset S06 (CSV)
Dataset S07 (PDF)

References

1
E. Hisamuddin, A. Hisam, S. Wahid, G. Raza, Validity of C-reactive protein (CRP) for diagnosis of neonatal sepsis. Pak. J. Med. Sci. 31, 527–531 (2015).
2
M. Tan, Y. Lu, H. Jiang, L. Zhang, The diagnostic accuracy of procalcitonin and C-reactive protein for sepsis: A systematic review and meta-analysis. J. Cell. Biochem. 120, 5852–5859 (2019).
3
V. Servellita et al., A diagnostic classifier for gene expression-based identification of early Lyme disease. Commun. Med. 2, 92 (2022).
4
D. Habgood-Coote et al., Diagnosis of childhood febrile illness using a multi-class blood RNA molecular signature. Med. 4, 635–654.e5 (2023).
5
K. L. Kalantar et al., Integrated host-microbe plasma metagenomics for sepsis diagnosis in a prospective cohort of critically ill adults. Nat. Microbiol. 7, 1805–1816 (2022).
6
H. R. Jackson et al., Diagnosis of multisystem inflammatory syndrome in children by a whole-blood transcriptional signature. J. Pediatr. Infect. Dis. Soc. 12, 322–331 (2023).
7
C. J. Loy, L. Ahmann, I. De Vlaminck, W. Gu, Liquid biopsy based on cell-free DNA and RNA. Annu. Rev. Biomed. Eng. 26, 169–195 (2024), https://doi.org/10.1146/annurev-bioeng-110222-111259.
8
T. T. M. Ngo et al., Noninvasive blood tests for fetal development predict gestational age and preterm delivery. Science 360, 1133–1136 (2018).
9
M. N. Moufarrej et al., Early prediction of preeclampsia in pregnancy with cell-free RNA. Nature 602, 689–694 (2022).
10
M. H. Larson et al., A comprehensive characterization of the cell-free transcriptome reveals tissue- and subtype-specific biomarkers for cancer detection. Nat. Commun. 12, 2357 (2021).
11
D. Mathios et al., Detection and characterization of lung cancer using cell-free DNA fragmentomes. Nat. Commun. 12, 5060 (2021).
12
A. P. Cheng et al., Cell-free DNA profiling informs all major complications of hematopoietic cell transplantation. Proc. Natl. Acad. Sci. U.S.A. 119, e2113476118 (2022).
13
I. De Vlaminck et al., Circulating cell-free DNA enables noninvasive diagnosis of heart transplant rejection. Sci. Transl. Med. 6, 241ra77 (2014).
14
A. P. Cheng et al., A cell-free DNA metagenomic sequencing assay that integrates the host injury response to infection. Proc. Natl. Acad. Sci. U.S.A. 116, 18738–18744 (2019).
15
P. Burnham et al., Urinary cell-free DNA is a versatile analyte for monitoring infections of the urinary tract. Nat. Commun. 9, 2412 (2018).
16
A. P. Cheng et al., Cell-free DNA tissues of origin by methylation profiling reveals significant cell, tissue, and organ-specific injury related to COVID-19 severity. Med. 2, 411–422.e5 (2021).
17
C. J. Loy et al., Nucleic acid biomarkers of immune response and cell and tissue damage in children with COVID-19 and MIS-C. Cell Rep. Med. 4, 101034 (2023).
18
Y. Wang et al., Plasma cell-free RNA characteristics in COVID-19 patients. Genome Res. 32, 228–241 (2022).
19
A. Chang et al., Circulating cell-free RNA in blood as a host response biomarker for the detection of tuberculosis. Nat. Commun. 15, 4949 (2024).
20
I. De Vlaminck et al., Temporal response of the human virome to immunosuppression and antiviral therapy. Cell 155, 1178–1187 (2013).
21
I. De Vlaminck et al., Noninvasive monitoring of infection and rejection after lung transplantation. Proc. Natl. Acad. Sci. U.S.A. 112, 13336–13341 (2015).
22
P. Burnham et al., Separating the signal from the noise in metagenomic cell-free DNA sequencing. Microbiome 8, 18 (2020).
23
O. Mzava et al., A metagenomic DNA sequencing assay that is robust against environmental DNA contamination. Nat. Commun. 13, 4197 (2022).
24
J. Kim et al., Endothelial cell response in Kawasaki Disease and multisystem inflammatory syndrome in children. Int. J. Mol. Sci. 24, 12318 (2023).
25
M. Wang et al., ANXA6/TRPV2 axis promotes lymphatic metastasis in head and neck squamous cell carcinoma by inducing autophagy. Exp. Hematol. Oncol. 12, 43 (2023).
26
L. Zhang et al., CD44 connects autophagy decline and ageing in the vascular endothelium. Nat. Commun. 14, 5524 (2023).
27
H. Chen et al., Role of programmed cell death 4 (PDCD4)-mediated Akt signaling pathway in vascular endothelial cell injury caused by lower-extremity ischemia-reperfusion in rats. Med. Sci. Monit. 25, 4811–4818 (2019).
28
Y. Lan et al., STK17B promotes carcinogenesis and metastasis via AKT/GSK-3β/Snail signaling in hepatocellular carcinoma. Cell Death Dis. 9, 236 (2018).
29
M. He et al., miR-483 targeting of CTGF suppresses endothelial-to-mesenchymal transition: Therapeutic implications in Kawasaki disease. Circ. Res. 120, 354–365 (2017).
30
S. K. Vorperian, M. N. Moufarrej, Tabula Sapiens Consortium, S. R. Quake, Cell types of origin of the cell-free transcriptome. Nat. Biotechnol. 40, 855–861 (2022).
31
M. Day-Lewis et al., Updated case definition of MIS-C and implications for clinical care. Pediatrics 153, e2023063259 (2024).
32
S. Toden et al., Noninvasive characterization of Alzheimer’s disease by circulating, cell-free messenger RNA next-generation sequencing. Sci. Adv. 6, eabb1654 (2020).
33
Tabula Sapiens Consortium et al., The Tabula Sapiens: A multiple-organ, single-cell transcriptomic atlas of humans. Science 376, eabl4896 (2022).
34
S. A. Rizo-Téllez, M. Sekheri, J. G. Filep, Myeloperoxidase: Regulation of neutrophil function and target for therapy. Antioxid. Basel Switz. 11, 2302 (2022).
35
M. J. Kaplan, M. Radic, Neutrophil extracellular traps: Double-edged swords of innate immunity. J. Immunol. 189, 2689–2695 (2012).
36
Y. Yoshida et al., Enhanced formation of neutrophil extracellular traps in Kawasaki disease. Pediatr. Res. 87, 998–1004 (2020).
37
B. P. Boribong et al., Neutrophil profiles of pediatric COVID-19 and multisystem inflammatory syndrome in children. Cell Rep. Med. 3, 100848 (2022).
38
S. Liu, T. S. Matsui, N. Kang, S. Deguchi, Analysis of senescence-responsive stress fiber proteome reveals reorganization of stress fibers mediated by elongation factor eEF2 in HFF-1 cells. Mol. Biol. Cell 33, ar10 (2022).
39
M. K. Mateyak, T. G. Kinzy, ADP-ribosylation of translation elongation factor 2 by diphtheria toxin in yeast inhibits translation and cell separation. J. Biol. Chem. 288, 24647–24655 (2013).
40
A. S. Zannas, T. Wiechmann, N. C. Gassen, E. B. Binder, Gene-stress-epigenetic regulation of FKBP5: Clinical and translational implications. Neuropsychopharmacology 41, 261–274 (2016).
41
S. M. Bova et al., Neurological involvement in multisystem inflammatory syndrome in children: Clinical, electroencephalographic and magnetic resonance imaging peculiarities and therapeutic implications. An Italian Single-center experience. Front. Pediatr. 10, 932208 (2022).
42
J. H. Seo et al., AKAP12 supports blood-brain barrier integrity against ischemic stroke. Int. J. Mol. Sci. 21, 9078 (2020).
43
S. Akakura, I. H. Gelman, Pivotal role of AKAP12 in the regulation of cellular adhesion dynamics: Control of cytoskeletal architecture, cell migration, and mitogenic signaling. J. Signal Transduct. 2012, 529179 (2012).
44
M. Y. Radeva, D. Kugelmann, V. Spindler, J. Waschke, PKA compartmentalization via AKAP220 and AKAP12 contributes to endothelial barrier regulation. PLoS One 9, e106733 (2014).
45
M. I. Love, W. Huber, S. Anders, Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol. 15, 550 (2014).
46
X. Robin et al., pROC: An open-source package for R and S+ to analyze and compare ROC curves. BMC Bioinform. 12, 77 (2011).
47
H. Wang et al., Subgroups of children with Kawasaki disease: A data-driven cluster analysis. Lancet Child Adolesc. Health 7, 697–707 (2023).
48
T. Chu, Z. Wang, D. Pe’er, C. G. Danko, Cell type and gene expression deconvolution with BayesPrism enables Bayesian integrative analysis across bulk and single-cell RNA sequencing in oncology. Nat. Cancer 3, 505–517 (2022).
49
F. A. Wolf, P. Angerer, F. J. Theis, SCANPY: Large-scale single-cell gene expression data analysis. Genome Biol. 19, 15 (2018).
50
J. Köster, S. Rahmann, Snakemake–A scalable bioinformatics workflow engine. Bioinform. Oxf. Engl. 28, 2520–2522 (2012).
51
M. Kuhn, Building predictive models in R using the caret package. J. Stat. Softw. 28, 1–26 (2008).
52
A. Frankish et al., GENCODE reference annotation for the human and mouse genomes. Nucleic Acids Res. 47, D766–D773 (2019).
53
C. Loy et al., Data from “Plasma Cell-free RNA Signatures of Inflammatory Syndromes in Children”. Gene Expression Omnibus. https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?&acc=GSE255555. Deposited 11 February 2024.

Information & Authors

Information

Published in

The cover image for PNAS Vol.121; No.37
Proceedings of the National Academy of Sciences
Vol. 121 | No. 37
September 10, 2024
PubMed: 39240972

Classifications

Data, Materials, and Software Availability

Raw sequencing data in this study cannot be deposited in a public repository due to patient privacy concerns and lack of consent for a subset of the patient samples. Instead, deidentified RNA-seq count matrices have been uploaded to the National Center for Biotechnology Information Gene Expression Omnibus database and will be publicly available upon publication (GSE255555) (53). All code has been deposited on GitHub and will be available upon publication.

Submission history

Received: February 23, 2024
Accepted: July 25, 2024
Published online: September 6, 2024
Published in issue: September 10, 2024

Keywords

  1. cell-free RNA
  2. pediatrics
  3. inflammation
  4. diagnostics
  5. machine learning

Acknowledgments

We thank the patients and their families for their help to further our understanding of pediatric inflammatory conditions. We would like to acknowledge staff members at the UCSF Clinical Laboratories and the UCSF Clinical Microbiology Laboratories for their help in identifying and retrieving patient whole blood samples. We thank the Cornell Genomics Center and the UCSF Center for Advanced Technology for help with sequencing libraries. At Emory, we thank Nadine Baida, Amrita Banerjee, Julia Bartol, Kushmita Bhakta, Caroline Ciric, Khalel De Castro, Khadijah Francois, Theda Gibson, Laila Hussaini, Grace Li, Wensheng Li, Austin Lu, Lisa Macoy, Clair Martin, Molly Morrison, Amy Muchinsky, Heather Nurse, Maria A. Perez, Etza Peters, Brianna Rice, Anna Siaw-Anim, Kathy Stephens, Elizabeth Grace Taylor, Ashley Tippett, Katie Zaks, and the Children’s Healthcare of Atlanta Research Laboratory for their contributions to specimen and data collection. At UCSD we thank the members of the PEMKDRG: Lukas Austin-Page, MD, Amy Bryl, MD, Joelle Donofrio-Ödmann, MD, Atim Ekpenyong, MD, David Gutglass, MD, Scott Herskovitz, MD, Paul Ishimine, MD, John Kanegaye, MD, Margaret Nguyen, MD, Mylinh Nguyen, MD, Kristy Schwartz, MD, Stacey Ulrich, MD, Tatyana Vayngortin, MD, and Elise Zimmerman, MD. We thank the members of the CHARMS study that contributed samples: Jocelyn Ang, MD, Margalit Rosenkranz, MD, Joseph Bochini, MD, Michelle Sykes, MD, Lerraughn Morgan, MD, Laura D’Addese, MD, and Maria Pilar Gutierrez, MD. This work was funded by NIH/National Institute of Child Health and Human Development grants R61HD105618, R33HD105618, and R33HD105593 (R.L.D., C.A.R., C.S., A.H.T., J.C.B., C.Y.C., and I.D.V.). The funder had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Author contributions
C.J.L., M.D., R.L.D., C.A.R., J.C.B., C.Y.C., and I.D.V. designed research; C.J.L., A.B., J.L., E.B. processed samples; C.J.L., V.S., A.S.-G., A.B., W.S., J.N., M.E.W., M.O., M.A.G., J.-H.C., H.-M.H., H.W., J.K., C.S., A.H.T., J.C.B., C.Y.C., and I.D.V. analyzed data; W.S., J.N., M.E.W., M.O., M.A.G., P.E.M.K.D.R.G., C.H.A.R.M.S., J.-H.C., H.-M.H., C.S., A.H.T., M.D., R.L.D., C.A.R., J.C.B., and C.Y.C. identified and collected patient samples and clinical metadata; and C.J.L., J.C.B., C.Y.C., and I.D.V. wrote the paper.
Competing interests
C.J.L and I.D.V are inventors on submitted patents pertaining to cell-free nucleic acids (US patent applications 63/237,367 and 63/429,733). I.D.V. is listed as an inventor on submitted patents pertaining to cell-free nucleic acids (US patent applications 63/237,367, 63/056,249, 63/015,095, 16/500,929, 41614P-10551-01-US) and receives consulting fees from Eurofins Viracor. C.A.R. has received institutional support from ModernaTX, Inc., Pfizer Inc., BioFire Inc., GSK plc, MedImmune, Micron Technology Inc., Janssen Pharmaceuticals, Merck & Co., Inc., Novavax, PaxVax, Regeneron, and Sanofi Pasteur. She is co-inventor of patented RSV vaccine technology which has been licensed to Meissa Vaccines, Inc. C.Y.C. receives grant funding for research unrelated to this work from the Bay Area Lyme Disease Foundation, the Chan-Zuckerberg Biohub, and Abbott Laboratories, Inc.

Notes

Preprint server: This manuscript has been uploaded to MedRxiv (https://doi.org/10.1101/2024.03.06.24303645).
This article is a PNAS Direct Submission.

Authors

Affiliations

Meinig School of Biomedical Engineering, Cornell University, Ithaca, NY 14850
Venice Servellita
Department of Laboratory Medicine, University of California, San Francisco, CA 94143
Alicia Sotomayor-Gonzalez https://orcid.org/0000-0002-8930-745X
Department of Laboratory Medicine, University of California, San Francisco, CA 94143
Meinig School of Biomedical Engineering, Cornell University, Ithaca, NY 14850
Joan S. Lenz
Meinig School of Biomedical Engineering, Cornell University, Ithaca, NY 14850
Emma Belcher
Meinig School of Biomedical Engineering, Cornell University, Ithaca, NY 14850
Will Suslovic
Division of Pediatric Infectious Disease, Children’s National Hospital, Washington, DC 20010
Jenny Nguyen
Department of Laboratory Medicine, University of California, San Francisco, CA 94143
Meagan E. Williams
Division of Pediatric Infectious Disease, Children’s National Hospital, Washington, DC 20010
Department of Laboratory Medicine, University of California, San Francisco, CA 94143
Michael A. Gardiner
Department of Pediatrics, Rady Children’s Hospital-San Diego, San Diego, CA 92123
Department of Pediatrics, Kawasaki Disease Research Center, University of California San Diego, La Jolla, CA 92093
the C.H.A.R.M.S. the Study Group1
Division of Infectious Diseases, Department of Pediatrics, Emory University School of Medicine, Atlanta, GA 30307
Center for Childhood Infections and Vaccines, Children’s Healthcare of Atlanta, Atlanta, GA 30322
Hui-Mien Hsiao
Division of Infectious Diseases, Department of Pediatrics, Emory University School of Medicine, Atlanta, GA 30307
Center for Childhood Infections and Vaccines, Children’s Healthcare of Atlanta, Atlanta, GA 30322
Department of Pediatrics, Kawasaki Disease Research Center, University of California San Diego, La Jolla, CA 92093
Department of Biomedical Informatics and Data Science, Yale School of Medicine, New Haven, CT 06510
Department of Pediatrics, Kawasaki Disease Research Center, University of California San Diego, La Jolla, CA 92093
Department of Pediatrics, Rady Children’s Hospital-San Diego, San Diego, CA 92123
Department of Pediatrics, Kawasaki Disease Research Center, University of California San Diego, La Jolla, CA 92093
Meghan Delaney
Division of Pediatric Infectious Disease, Children’s National Hospital, Washington, DC 20010
Department of Pediatrics, George Washington University, School of Medicine & Health Sciences, Washington, DC 20052
Roberta L. DeBiasi
Division of Pediatric Infectious Disease, Children’s National Hospital, Washington, DC 20010
Department of Pediatrics, George Washington University, School of Medicine & Health Sciences, Washington, DC 20052
Division of Infectious Diseases, Department of Pediatrics, Emory University School of Medicine, Atlanta, GA 30307
Center for Childhood Infections and Vaccines, Children’s Healthcare of Atlanta, Atlanta, GA 30322
Department of Pediatrics, Rady Children’s Hospital-San Diego, San Diego, CA 92123
Department of Pediatrics, Kawasaki Disease Research Center, University of California San Diego, La Jolla, CA 92093
Department of Laboratory Medicine, University of California, San Francisco, CA 94143
Department of Medicine, Division of Infectious Diseases, University of California, San Francisco, CA 94158
Chan-Zuckerberg Biohub, San Francisco, CA 94158
Meinig School of Biomedical Engineering, Cornell University, Ithaca, NY 14850

Notes

2
To whom correspondence may be addressed. Email: [email protected], [email protected], or [email protected].
1
A complete list of the members of the Pediatric Emergency Medicine Kawasaki Disease Research Group (PEMKDRG) and the Characterization of Multisystem Inflammatory Syndrome in Children (CHARMS) Study Group is available in the supporting information.

Metrics & Citations

Metrics

Note: The article usage is presented with a three- to four-day delay and will update daily once available. Due to ths delay, usage data will not appear immediately following publication. Citation information is sourced from Crossref Cited-by service.


Altmetrics




Citations

Export the article citation data by selecting a format from the list below and clicking Export.

View Options

View options

PDF format

Download this article as a PDF file

DOWNLOAD PDF

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Personal login Institutional Login

Recommend to a librarian

Recommend PNAS to a Librarian

Purchase options

Purchase this article to access the full text.

Single Article Purchase

Plasma cell-free RNA signatures of inflammatory syndromes in children
Proceedings of the National Academy of Sciences
  • Vol. 121
  • No. 37

Figures

Tables

Media

Share

Share

Share article link

Share on social media