Machine learning reveals systematic accumulation of electric current in lead-up to solar flares
Edited by Katepalli R. Sreenivasan, New York University, New York, NY, and approved April 17, 2019 (received for review November 29, 2018)

Significance
Reliable flare forecasting is essential for improving preparedness for severe space weather consequences. Flares also serve as probes of solar magnetic processes and the emergence of flux at the solar surface. Training machine-learning (ML) algorithms using magnetic-field observations for improving flare forecasting has been extensively studied in prior literature. Instead, here we use ML to understand the underlying mechanisms governing flares. We train ML algorithms to classify flaring and nonflaring active regions (ARs) with high fidelity and report statistical trends for AR evolution days before and after M- and X-class flares. These trends are interpreted in terms of existing models of subsurface magnetic field and flux emergence. Our results also provide hypotheses for achieving reliable flare forecasting.
Abstract
Solar flares, bursts of high-energy radiation responsible for severe space weather effects, are a consequence of the occasional destabilization of magnetic fields rooted in active regions (ARs). The complexity of AR evolution is a barrier to a comprehensive understanding of flaring processes and to accurate prediction. Although machine learning (ML) has been used to improve flare predictions, its potential for revealing precursors and the associated physics has been underexploited. Here, we train ML algorithms to distinguish vector–magnetic-field observations of flaring ARs, defined as those producing at least one M- or X-class flare, from observations of nonflaring ARs. Analysis of the magnetic-field observations that the machine classifies accurately presents statistical evidence for (i) ARs persisting in flare-productive states, characterized by AR area, for days before and after M- and X-class flare events; (ii) systematic preflare buildup of free energy in the form of electric currents, suggesting that the associated subsurface magnetic field is twisted; and (iii) intensification of Maxwell stresses in the corona above newly emerging ARs, days before first flares. These results provide insights into flare physics and into improving flare forecasting.
By virtue of buoyancy, magnetic fields generated in the interior of the Sun rise to the photosphere—the visible solar surface—and emerge as bipolar active regions (ARs) (1, 2). Emerging flux and electric currents energize the coronal magnetic field that is rooted in ARs (3). Magnetic reconnection occasionally releases free energy built up in the coronal loops in violent events such as solar flares (4, 5). M- and X-class flares, producing X-ray flux
The complex nature of AR dynamics hinders straightforward interpretation of flare observations, although AR magnetic-field features related to flare activity are known from case and statistical studies (11–13). Recurrent flares are found to be associated with continuously emerging magnetic flux (14). ARs producing M- and X-class flares contain a prominent high-gradient region separating opposite polarities (15). Magnetic helicity and electric current are found to be accumulated in ARs before major flares (16, 17). Minutes before the onset of flares, increased Lorentz forces in ARs are observed as a result of elevated pressure from the coronal magnetic field (18, 19). Such AR features can be quantified using photospheric vector–magnetic-field data (20) from the Helioseismic and Magnetic Imager (HMI) (21) on board the Solar Dynamics Observatory (SDO) (22).
Machine learning (ML), which is efficient at classifying, recognizing, and interpreting patterns in high-dimensional datasets, has been applied to predict flares using many AR features simultaneously. Such studies aim to develop a reliable forecasting method and identify the features most relevant to flare activity (23–26), to obtain new AR features that yield better forecasting accuracy (25, 27), and to compare the performance of different ML algorithms (28). Flare prediction accuracy is expected to depend on forward-looking time, i.e., how far in advance flares can be predicted. Existing studies, which use AR observations ranging from 1 h to 48 h before flares, suggest, however, that forecasting accuracy is largely insensitive to forward-looking time (24, 27, 29). Thus, flaring ARs may exist in a flare-productive state long before producing a flare. This motivates the present work, in which we explicitly train ML algorithms to distinguish the photospheric magnetic fields of flaring ARs from those of nonflaring ARs. The trained machine builds a correlation (probability distribution function) between AR photospheric magnetic fields and flaring activity in AR coronal loops. We analyze the time evolution of this machine correlation to investigate (i) whether magnetic fields from flaring and nonflaring ARs are intrinsically different, (ii) the statistical evolution of flaring ARs days before and after flares, and (iii) the development of emerging ARs days before their first flares.
Methods
We consider ARs between May 2010 and April 2016. Using the GOES X-ray flux catalog (30), we label ARs that produce at least one M- or X-class flare during their passage across the visible solar disk as flaring and all others as nonflaring. We consider only ARs with maximum observed area >25
Table 1. Data used for classification of flaring and nonflaring ARs: AR magnetic-field features (SHARPs) used for training ML algorithms
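The labeling rule described in Methods can be sketched in a few lines. This is a minimal illustration rather than the authors' pipeline: the record layout (an AR identifier paired with a GOES class string such as "M1.2") and the names `is_major_flare` and `label_active_regions` are assumptions made for the example, and the AR identifiers are invented.

```python
def is_major_flare(goes_class: str) -> bool:
    """Return True for M- or X-class flares, e.g. 'M1.2' or 'X9.3'."""
    return goes_class[:1].upper() in {"M", "X"}


def label_active_regions(ar_ids, flare_events):
    """Label each AR 'flaring' if it produced at least one M-/X-class flare.

    ar_ids: iterable of AR identifiers (hypothetical format).
    flare_events: iterable of (ar_id, goes_class) pairs (hypothetical format).
    """
    flaring = {ar for ar, cls in flare_events if is_major_flare(cls)}
    return {ar: ("flaring" if ar in flaring else "nonflaring") for ar in ar_ids}


# Made-up example records, for illustration only.
labels = label_active_regions(
    ar_ids=["AR_A", "AR_B", "AR_C"],
    flare_events=[("AR_A", "C3.4"), ("AR_A", "M1.0"), ("AR_B", "C9.9")],
)
```

In this toy input only `AR_A` is labeled flaring: a region with nothing stronger than C-class flares, or with no flares at all, is labeled nonflaring.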
Results
Classification of Flaring and Nonflaring Active Regions.
We chronologically split the available AR data into two parts. The training and validation data comprise ARs between May 2010 and December 2013 and the test data comprise ARs between January 2014 and April 2016. We explicitly study the development of newly emerged ARs, identified from the first recorded observation within
Table 2. Data used for classification of flaring and nonflaring ARs: Number of ARs and M- and X-class flares considered
We consider observations from flaring ARs which are within
A straightforward performance measure for classification problems is accuracy, defined as the fraction of correctly classified observations; i.e.,
Time Evolution of Machine Prediction.
We are particularly interested in time evolution of magnetic fields in flaring ARs, and hence we obtain recall of the machine prediction on time series of observations from flaring ARs. A time series
We can now obtain the time evolution of machine prediction for flaring ARs in the test data using the trained SVM. Note that no observations from the test data were used during training or cross-validation of the machine (i.e., the SVM). Thus, all observations in the test data are previously “unseen” by the machine. Similar to the training data, recall or identification rate is consistently high (
Fig. 1. Time evolution of recall
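One way to picture a recall-versus-time curve of this kind is to bin flaring-AR observations by their time offset from the flare and compute the fraction identified as flaring in each bin. The sketch below is illustrative only: the 24-h bin width, the input layout, and the function name `recall_by_time_bin` are assumptions, and the sample predictions are invented.

```python
from collections import defaultdict


def recall_by_time_bin(observations, bin_hours=24):
    """Recall per bin of time relative to the flare, for flaring-AR observations.

    observations: iterable of (hours_from_flare, predicted_flaring) pairs, where
    hours_from_flare is negative before the flare and positive after it.
    Every observation comes from a flaring AR, so recall in a bin is simply
    the fraction of its observations that the machine predicted as flaring.
    Returns {bin_start_hour: recall}.
    """
    hits = defaultdict(int)
    totals = defaultdict(int)
    for hours, predicted_flaring in observations:
        bin_start = int(hours // bin_hours) * bin_hours
        totals[bin_start] += 1
        hits[bin_start] += bool(predicted_flaring)
    return {b: hits[b] / totals[b] for b in sorted(totals)}


# Made-up predictions, for illustration only.
demo = [(-30, True), (-26, False), (-10, True), (-2, True), (5, True), (20, False)]
recall_curve = recall_by_time_bin(demo)
```

Keys are the left edges of the bins in hours, so `-24` covers the day immediately before the flare.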
The number of observations separated from flares by
Table 3. Average prediction and classification for all flaring and nonflaring AR observations in the test data using SVM
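The performance measures named in this section follow from the four confusion-matrix counts: accuracy is the fraction of correctly classified observations, and recall is the fraction of one class that the machine identifies correctly. The sketch below uses these standard definitions. The counts are invented to show why accuracy alone can mislead when nonflaring observations dominate, and they are not the paper's results.

```python
def classification_metrics(tp, fn, tn, fp):
    """Accuracy and per-class recall from confusion-matrix counts."""
    total = tp + fn + tn + fp
    return {
        "accuracy": (tp + tn) / total,
        "recall_flaring": tp / (tp + fn),
        "recall_nonflaring": tn / (tn + fp),
    }


# Made-up, imbalanced counts for illustration only (not the paper's results).
always_negative = classification_metrics(tp=0, fn=50, tn=950, fp=0)
useful_model = classification_metrics(tp=40, fn=10, tn=855, fp=95)
```

With these counts, a classifier that labels everything nonflaring scores higher accuracy than one that finds most flaring observations, which is why per-class recall is the more informative measure here.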
Evolution of Magnetic Fields in Flaring Active Regions.
We have trained an SVM to distinguish between SHARP features derived from magnetic fields in flaring and nonflaring ARs with high fidelity. To understand magnetic-field evolution in ARs, we analyze TP and FN populations from flaring ARs and TN and FP populations from nonflaring ARs, as categorized by the machine. We include SHARP features from all ARs in the training and validation data as well as the test data. In Table 4, time- and population-averaged values of SHARP features over flaring AR observations separated from flares
Table 4. Average values of SHARP features over flaring and nonflaring AR magnetic-field observations categorized by the SVM
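Population averages of this kind can be computed by assigning each observation to one of the four categories (TP, FN, TN, FP) and averaging a feature within each. The following is a minimal sketch under assumed inputs: the record layout and function names are hypothetical, and the feature values are invented and in arbitrary units.

```python
from collections import defaultdict


def category(is_flaring, predicted_flaring):
    """Map a (truth, prediction) pair to TP, FN, TN, or FP."""
    if is_flaring:
        return "TP" if predicted_flaring else "FN"
    return "FP" if predicted_flaring else "TN"


def mean_feature_by_category(records):
    """Average one feature over each category.

    records: iterable of (is_flaring, predicted_flaring, feature_value).
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for is_flaring, predicted_flaring, value in records:
        key = category(is_flaring, predicted_flaring)
        sums[key] += value
        counts[key] += 1
    return {key: sums[key] / counts[key] for key in counts}


# Made-up values in arbitrary units, for illustration only.
demo = [(True, True, 4.0), (True, True, 6.0), (True, False, 1.0),
        (False, False, 0.5), (False, True, 3.0)]
averages = mean_feature_by_category(demo)
```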
Categories of SHARP features are further highlighted by the Pearson correlation matrix in Fig. 2. Strongly correlated features are divided into the following groups: (i) extensive features (area, total unsigned flux, total free energy, total Lorentz force, total unsigned vertical current, and total unsigned current helicity), (ii) features that scale with electric current in AR (absolute net current helicity and sum of net current per polarity), (iii) measures of AR nonpotential energy (mean free energy and area with shear
Fig. 2. Pearson correlation matrix for SHARP features (see Table 1 for description). Based on the degree of correlation, SHARP features group together in categories representing (i) AR magnetic-field scale, (ii) AR energy buildup, (iii) AR nonpotentiality, (iv) Schrijver R value, and (v) Lorentz force on AR. P value of correlation between total vertical Lorentz force (TOTFZ) and R value is 0.09. All other P values are
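For readers unfamiliar with the statistic, a Pearson correlation matrix of the kind shown in this figure can be sketched as follows. The feature names are placeholders rather than SHARP keywords, the values are invented, and P values are not computed.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def correlation_matrix(features):
    """features: {name: list of values}. Returns {(name_i, name_j): r}."""
    names = list(features)
    return {(a, b): pearson(features[a], features[b]) for a in names for b in names}


# Made-up values for illustration; the keys are placeholder names.
demo = {
    "area": [1.0, 2.0, 3.0, 4.0],
    "flux": [2.0, 4.1, 5.9, 8.0],
    "force": [3.0, 1.0, 4.0, 2.0],
}
matrix = correlation_matrix(demo)
```

In this toy input the first two features rise together and are strongly correlated, while the third is uncorrelated with the first, which is the kind of pattern that separates feature groups in a correlation matrix.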
SHARP features from each of the groups above characteristically evolve before and after flares. For the mth entry of each SHARP feature vector, we calculate the time evolution of population-averaged value
Fig. 3. Time evolution of population-averaged values of SHARP features (see Table 1 for description) before and after flares. Average SHARP feature values over true positive (TP) and false negative (FN) flaring AR observations within
Development of Emerging Flaring ARs.
Our analysis shows that extensive SHARP features characteristically distinguish flaring and nonflaring AR populations and that the values of these features remain approximately constant for days before and after flares. Newly emerged ARs, in contrast, must start with small values of the extensive SHARP features. We are therefore interested in understanding how emerging ARs transition to flare-productive states before the first flare. For emerging flaring ARs, we compile observations in a time span
Fig. 4. Machine identification and time evolution of SHARP features for emerging flaring ARs. These emerging ARs are first observed within
Discussion
We have trained an SVM to classify SHARP features derived from magnetic fields of flaring and nonflaring ARs. The SHARP features used for training (Table 1) include extensive AR magnetic-field features, features that scale with electric current in ARs representing energy buildup, features that scale with AR nonpotential energy, flux near the polarity inversion line, and vertical Lorentz force on ARs. The trained machine classifies flaring AR observations, separated from flare events
A time series of AR magnetic-field observations in the form of SHARP features
Since the machine prediction
This work demonstrates the importance of testing the machine on samples from ARs that are not part of training. Such a restriction is not explicitly imposed in any prior work related to flare forecasting using ML (e.g., refs. 24, 26, and 28). Here, we show that SHARP features corresponding to extensive AR quantities (such as total unsigned flux and area) are leading contributors to the machine classification and that the average values of these SHARP features do not change appreciably over a timescale of a few days. A machine trained on observations from a set of ARs and then tested on observations from the same ARs (albeit for different flares) is therefore likely to show higher recall, because the test observations carry nearly the same SHARP feature values that it has already seen in training. Hence, for accurate testing of the machine, it is important that training and test data do not contain observations from the same ARs.
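A split that respects this restriction can be sketched as below. This is one possible rule, not necessarily the authors' procedure: each AR is assigned wholly to training or to test according to the date of its first observation, so an AR that straddles the cutoff is never divided between the two sets. The input layout, the function name `split_by_region`, and the sample records are assumptions.

```python
from datetime import date


def split_by_region(observations, first_test_day):
    """Chronological split that keeps every AR entirely in one partition.

    observations: iterable of (ar_id, observation_date) pairs (hypothetical format).
    An AR goes to the test set if its first observation is on or after
    first_test_day; otherwise all of its observations go to training.
    """
    observations = list(observations)
    first_seen = {}
    for ar_id, day in observations:
        first_seen[ar_id] = min(day, first_seen.get(ar_id, day))
    train = [(ar, d) for ar, d in observations if first_seen[ar] < first_test_day]
    test = [(ar, d) for ar, d in observations if first_seen[ar] >= first_test_day]
    return train, test


# Made-up observations; the cutoff mirrors the January 2014 boundary in the text.
demo = [
    ("AR_A", date(2013, 12, 28)), ("AR_A", date(2014, 1, 3)),
    ("AR_B", date(2014, 2, 10)), ("AR_C", date(2012, 6, 1)),
]
train, test = split_by_region(demo, first_test_day=date(2014, 1, 1))
shared = {ar for ar, _ in train} & {ar for ar, _ in test}
```

Checking that `shared` is empty is a cheap guard against the leakage described above.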
Class imbalance between flaring and nonflaring ARs implies that even a false-positive rate of
Acknowledgments
D.B.D. is thankful to Andrés Muñoz-Jaramillo and Monica Bobra for insightful discussions. S.M.H. acknowledges funding from the Ramanujan fellowship; the Max-Planck partner group program; and the Center for Space Science, New York University, Abu Dhabi. Computing was performed on the SEISMO cluster at the Tata Institute of Fundamental Research.
Footnotes
- 1. To whom correspondence should be addressed. Email: dattaraj.dhuri@tifr.res.in.
Author contributions: S.M.H. and M.C.M.C. designed research; D.B.D. performed research; D.B.D. analyzed data; and D.B.D., S.M.H., and M.C.M.C. wrote the paper.
The authors declare no conflict of interest.
This article is a PNAS Direct Submission.
This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1820244116/-/DCSupplemental.
Published under the PNAS license.
References
1. Cheung MCM, Isobe H
2. Stein RF
3. Leka KD, Canfield RC, McClymont AN, van Driel-Gesztelyi L
4. Shibata K, Magara T
5. Su Y, et al
6. Eastwood JP, et al
7. McIntosh PS
8. Rust DM, et al
10. Barnes G, et al
11. Schrijver CJ
12. Leka KD, Barnes G
14. Nitta NV, Hudson HS
15. Schrijver CJ
16. Park S-H, et al
17. Kontogiannis I, Georgoulis MK, Park S-H, Guerra JA
18. Sun X, Hoeksema JT, Liu Y, Kazachenko M, Chen R
19. Fisher GH, Bercik DJ, Welsch BT, Hudson HS
20. Hoeksema JT, et al
21. Scherrer PH, et al
23. Ahmed OW, et al
24. Bobra MG, Couvidat S
25. Florios K, et al
26. Jonas E, Bobra M, Shankar V, Hoeksema JT, Recht B
27. Raboonik A, Safari H, Alipour N, Wheatland M
28. Nishizuka N, et al
29. Huang X, et al
30. Hurlburt N, et al
31. Bobra MG, et al
32. Sun X, et al
33. Wheatland MS, Litvinenko YE
34. Longcope DW, Welsch BT
35. Nie J, et al
36. Hamdi SM, Kempton D, Ma R, Boubrahimi SF, Angryk RA
Article Classifications
- Physical Sciences
- Astronomy