# History of art paintings through the lens of entropy and complexity

^{a}Departamento de Física, Universidade Estadual de Maringá, Maringá, PR 87020-900, Brazil;^{b}Faculty of Natural Sciences and Mathematics, University of Maribor, 2000 Maribor, Slovenia;^{c}School of Electronic and Information Engineering, Beihang University, Beijing 100191, China;^{d}Complexity Science Hub, 1080 Vienna, Austria


Edited by Herbert Levine, Rice University, Houston, TX, and approved July 19, 2018 (received for review January 3, 2018)

## Significance

The critical inquiry of paintings is essentially comparative. This limits the number of artworks that can be investigated by an art expert in reasonable time. The recent availability of large digitized art collections enables a shift in the scale of such analysis through the use of computational methods. Our research shows that simple physics-inspired metrics that are estimated from local spatial ordering patterns in paintings encode crucial information about the artwork. We present numerical scales that map well to canonical concepts in art history and reveal a historical and measurable evolutionary trend in visual arts. They also allow us to distinguish different artistic styles and artworks based on the degree of local order in the paintings.

## Abstract

Art is the ultimate expression of human creativity that is deeply influenced by the philosophy and culture of the corresponding historical epoch. The quantitative analysis of art is therefore essential for better understanding human cultural evolution. Here, we present a large-scale quantitative analysis of almost 140,000 paintings, spanning nearly a millennium of art history. Based on the local spatial patterns in the images of these paintings, we estimate the permutation entropy and the statistical complexity of each painting. These measures map the degree of visual order of artworks into a scale of order–disorder and simplicity–complexity that locally reflects qualitative categories proposed by art historians. The dynamical behavior of these measures reveals a clear temporal evolution of art, marked by transitions that agree with the main historical periods of art. Our research shows that different artistic styles have a distinct average degree of entropy and complexity, thus allowing a hierarchical organization and clustering of styles according to these metrics. We have further verified that the identified groups correspond well with the textual content used to qualitatively describe the styles and the applied complexity–entropy measures can be used for an effective classification of artworks.

Physics-inspired approaches have been successfully applied to a wide range of disciplines, including economic and social systems (1–3). Such studies usually share the goal of finding fundamental principles and universalities that govern the dynamics of these systems (4). The impact and popularity of this research have been growing steadily in recent years, in large part due to the unprecedented amount of digital information that is available about the most diverse subjects at an impressive degree of detail. This digital data deluge enables researchers to bring quantitative methods to the study of human culture (5–7), mobility (8, 9), and communication (10–12), as well as literature (13), science production, and peer review (14–18), at a scale that would have been unimaginable even a decade ago. A large-scale quantitative characterization of visual arts would be among such unimaginable research goals, not only because of data shortage but also because the study of art is often considered to be intrinsically qualitative. Quantitative approaches aimed at the characterization of visual arts can contribute to a better understanding of human cultural evolution, as well as to more practical matters, such as image characterization and classification.

While the scale of some current studies has changed dramatically, the use of quantitative techniques in the study of art has some precedent. Efforts can be traced back to the 1933 book *Aesthetic Measure* by the American mathematician Birkhoff (19), where a quantitative aesthetic measure is defined as the ratio between order (number of regularities found in an image) and complexity (number of elements in an image). However, the application of such quantitative techniques to the characterization of artworks is much more recent. Among the seminal works, we have the article by Taylor et al. (20), where Pollock’s paintings are characterized by an increasing fractal dimension over the course of his artistic career. This research article can be considered a landmark for the quantitative study of visual arts, inspiring many further applications of fractal analysis and related methods to determine the authenticity of paintings (21–25) and to study the evolution of specific artists (26, 27), the statistical properties of particular paintings (28) and artists (29–31), art movements (32), and many other visual expressions (33–35). The most recent advances of this emerging and rapidly growing field of research are comprehensively documented in several conference proceedings and special issues of scientific journals (36–38), where contributions have also focused on artwork restoration tools, authentication problems, and stylometry assessment procedures.

To date, relatively few research efforts have been dedicated to studying paintings from a large-scale art historical perspective. In 2014, Kim et al. (39) analyzed 29,000 images, finding that the color-use distribution is remarkably different among historical periods of western paintings, and, moreover, that the roughness exponent associated with the grayscale representation of these paintings displays an increasing trend over the years. In a more recent work, Lee et al. (40) have analyzed almost 180,000 paintings, focusing on the evolution of the color contrast. Among other findings, they have observed a sudden increase in the diversity of color contrast after 1850, and showed also that the same quantity can be used to capture information about artistic styles. Notably, there is also innovative research done by Manovich and coworkers (41–43) concerning the analysis of large-scale datasets of paintings and other visual art expressions by means of the estimation of their average brightness and saturation.

However, except for the introduction of the roughness exponent, preceding research along similar lines has been predominantly focused on the evolution of color profiles, while the spatial patterns associated with the pixels in visual arts remain poorly understood. Here, we present a large-scale investigation of local order patterns over almost 140,000 visual artwork images that span several hundred years of art history. By calculating two complexity measures associated with the local order of the pixel arrangement in these artworks, we observe a clear and robust temporal evolution. This evolution is characterized by transitions that agree with different art periods. Moreover, the observed evolution shows that these periods are marked by distinct degrees of order and regularity in the pixel arrangements of the corresponding artworks. We further show that these complexity measures partially encode fundamental concepts of art history that are frequently used by experts for a qualitative description of artworks. In particular, the complexity measures distinguish different artistic styles according to their average order in the pixel arrangements, enable a hierarchical organization of styles, and are also capable of automatically classifying artworks into artistic styles.

## Results

Our results are based on a dataset comprising 137,364 visual artwork images (mainly paintings), obtained from the online visual arts encyclopedia WikiArt (https://www.wikiart.org). This webpage is among the most significant freely available sources for visual arts. It contains artworks from over 2,000 different artists, covering more than a hundred styles, and spanning a period on the order of a millennium. Each one of these image files has been converted into a matrix representation whose dimensions correspond to the image width and height, and whose elements are the average values of the shades of red, green, and blue (RGB) of the pixels in the RGB color space. For further details, we refer to *Materials and Methods*.
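The conversion described above can be illustrated with a minimal sketch. It assumes the image has already been decoded into rows of (R, G, B) tuples; the function name `rgb_to_average_matrix` and the tiny 2×2 example image are hypothetical and serve only to show the averaging step.

```python
def rgb_to_average_matrix(pixels):
    """Map rows of (r, g, b) tuples (each 0..255) to a matrix of channel averages."""
    return [[(r + g + b) / 3 for (r, g, b) in row] for row in pixels]


# Hypothetical 2x2 image: a red pixel, a black pixel, a bluish-gray pixel, a white pixel.
image = [
    [(255, 0, 0), (0, 0, 0)],
    [(30, 60, 90), (255, 255, 255)],
]
matrix = rgb_to_average_matrix(image)
print(matrix)
```

Each matrix element is a single number per pixel, so all later steps operate on one two-dimensional array per artwork.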

From this matrix representation of the artwork images, we calculate two complexity measures: the normalized permutation entropy H (44) and the statistical complexity C (45). As described in *Materials and Methods*, both measures are evaluated from the ordinal probability distribution P, which quantifies the occurrence of the ordinal patterns among the image pixels at a local scale. Here, we have estimated this distribution by considering sliding partitions of size

## Evolution of Art

A careful comparison of different artworks is one of the main methods used by art historians to understand whether and how art has evolved over the years. Works by Heinrich Wölfflin (49) and Alois Riegl (50), for example, can be considered fundamental in this regard. They have proposed to distinguish artworks from different periods through a few visual categories and qualitative descriptors. Visual comparison is undoubtedly a useful tool for evaluating artistic style. However, it is impractical to apply at scale. This is when computational methods show their greatest advantage. Nevertheless, to be useful, it is important that derived metrics are still easily interpreted in terms of familiar and disciplinary-relevant categories.

We note that the complexity–entropy plane partially (and locally) reflects Wölfflin’s dual concepts of linear versus painterly and Riegl’s dichotomy of haptic versus optic artworks. According to Wölfflin, “linear artworks” are composed of clear and outlined shapes, while, in “painterly artworks,” the contours are subtle and smudged for merging image parts and passing the idea of fluidity. Similarly, Riegl considers that “haptic artworks” depict objects as tangible discrete entities, isolated and circumscribed, whereas “optic artworks” represent objects as interrelated in deep space by exploiting light, color, and shadow effects to create the idea of an open spatial continuum. The notions of order/simplicity versus disorder/complexity in the pixel arrangements of images captured by the complexity–entropy plane partially encode these concepts. Images formed by distinct and outlined parts yield many repetitions of a few ordinal patterns, and, consequently, linear/haptic artworks are described by small values of H and large values of C. On the other hand, images composed of interrelated parts delimited by smudged edges produce more random patterns, and, accordingly, painterly/optic artworks are expected to yield larger values of H and smaller values of C. It is also worth mentioning that Wölfflin’s and Riegl’s dual concepts are limiting forms of representation that demarcate the scale of all possibilities (51). In this regard, the continuum of H and C values may help art historians to grade this scale.

In this context, we ask whether the scale defined by H and C values is capable of unveiling any dynamical properties of art. To answer this question, we estimate the average values of H and C after grouping the images by date. Because the artworks are not uniformly distributed over time (see *Materials and Methods*), we have chosen time intervals containing nearly the same number of images in each time window. Fig. 1 shows the joint evolution of the average values of C and H over the years (i.e., the changes in the complexity–entropy plane), where a clear and robust (*SI Appendix*, Fig. S1) trend is observed. This trajectory of H and C values shows that the artworks produced between the ninth and the 17th centuries are, on average, more regular/ordered than those created between the 19th and the mid-20th century. Also, the artworks produced after 1950 are even more regular/ordered than those from the two earlier periods. We observe further that the pace of changes in the complexity–entropy plane intensifies after the 19th century, a period that coincides with the emergence of several artistic styles (such as Neoclassicism and Impressionism), and also with the increase in the diversity of color contrast observed by Lee et al. (40).
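The grouping into time windows with nearly equal numbers of images can be sketched as follows. This is only an illustration under stated assumptions: the function name `equal_count_windows`, the record layout `(year, H, C)`, and the numbers in the example are made up and are not values from the dataset.

```python
def equal_count_windows(records, n_windows):
    """Split (year, H, C) records into chronological windows of nearly equal size.

    Returns one tuple (first_year, last_year, mean_H, mean_C) per window.
    """
    if not 0 < n_windows <= len(records):
        raise ValueError("n_windows must be between 1 and the number of records")
    ordered = sorted(records)  # chronological order
    size, extra = divmod(len(ordered), n_windows)
    windows, start = [], 0
    for w in range(n_windows):
        # The first `extra` windows receive one additional record.
        stop = start + size + (1 if w < extra else 0)
        chunk = ordered[start:stop]
        mean_h = sum(r[1] for r in chunk) / len(chunk)
        mean_c = sum(r[2] for r in chunk) / len(chunk)
        windows.append((chunk[0][0], chunk[-1][0], mean_h, mean_c))
        start = stop
    return windows


# Placeholder records, not measured values.
records = [
    (1500, 0.8, 0.2), (1600, 0.6, 0.4),
    (1880, 0.9, 0.1), (1900, 0.9, 0.1),
    (1970, 0.5, 0.25), (1990, 0.5, 0.25),
]
windows = equal_count_windows(records, 3)
```

Windows defined by count rather than by fixed duration become shorter in calendar time wherever the dataset is denser, which keeps the statistical weight of each average comparable.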

The three regions in Fig. 1 defined by the values of H and C correspond well with the main divisions of art history. The first period (black rectangle) corresponds to Medieval Art, the Renaissance, Neoclassicism, and Romanticism, which developed until the 1850s (52). The second period (red rectangle) corresponds to Modern Art, marked by the birth of Impressionism in the 1870s, and by the development of several avant-garde artistic styles (such as Cubism, Expressionism, and Surrealism) during the first decades of the 20th century. Finally, the latest period corresponds to the transition between Modern Art and Contemporary/Postmodern Art. The specific date marking the beginning of the Postmodern period is still an object of fierce debate among art experts (52). Nevertheless, there is some consensus in that Postmodern Art begins with the development of Pop Art in the 1960s (52).

By carrying the analogy between the complexity–entropy plane and the concepts of Wölfflin and Riegl forward, the transition between the art produced before Modernism and Modern Art represents a change from linear/haptic to painterly/optic in the representation modes. This thus agrees with the idea that artworks from the Renaissance, Neoclassicism, and Romanticism usually represent objects rigidly distinguished from each other and separated by flat surfaces (49, 53, 54), while modern styles such as Impressionism, Fauvism, Pointillism, and Expressionism are marked by the use of looser and smudged brushstrokes to avoid the creation of pronounced edges (49, 53, 54). Intriguingly, the transition between Modern and Postmodern Art is marked by an even more intense and rapid change from painterly/optic to linear/haptic representation modes. This fact appears to agree with the Postmodern idea of art as being instantly recognizable, made of ordinary objects, and marked by the use of large and well-defined edges [such as in Hard Edge Painting and Op Art artworks (53, 54)].

The conceptions of art history proposed by Wölfflin and Riegl consider that art develops through a change from the linear/haptic to the painterly/optic mode of representation, which agrees with the first transition observed in Fig. 1. However, for Riegl, this development occurs through a single and continuous process (56), while Wölfflin has a cyclical conception of this transition that seems more consistent with the overall dynamical behavior of H and C. On the other hand, this cyclical conception is not compatible with the local persistent behavior of the changes in the complexity–entropy plane. Indeed, recent studies of art historians, such as the work of Gaiger (51), argue that neither of these conceptions holds when analyzing the entire development of art history. For Gaiger, the dual categories of Wölfflin and Riegl should be treated as purely descriptive concepts and not linked to a particular change over time.

Another possibility for understanding the underlying mechanisms of the dynamical behavior unveiled by the complexity–entropy plane is evolutionary theories of art (56, 57). These recently proposed theories consider art from different perspectives, such as adaptation, a by-product of the brain’s complexity, or sexual and natural selection aimed at sharing attention, and suggest that art’s evolutionary contribution was to foster social cohesion and creativity. According to these theories, art history is driven by the interplay between audience preference and the artist’s desire to engage attention and expand these preferences. This feedback mechanism among artists and the public would be responsible for propelling art toward its unprecedented degree of specialization, innovation, and diversity, and could also explain what has driven artists and artistic movements to follow the historical path depicted in Fig. 1.

## Distinguishing Among Artistic Styles

We now ask whether the complexity–entropy plane is capable of discriminating among different artistic styles in our dataset. To do so, we calculate the average values of H and C after grouping the images by style. We also limit this analysis to the 92 styles having more than 100 images each (corresponding to ∼90% of data; see *SI Appendix*, Fig. S5 for name and number of images of each one) to obtain reliable values for the averages. Fig. 2 shows that the artistic styles are spread over the complexity–entropy plane, and the average values of H and C are significantly different for the majority of the pairwise comparisons (∼92%; see *SI Appendix*, Fig. S7). However, we also observe styles with statistically indistinguishable average values.

We note further that the arrangement of styles is in agreement with the general trend in the average values of H and C over time in which most Postmodern styles are localized in a region of smaller entropy and larger complexity values than are modern styles (such as Expressionism, Impressionism, and Fauvism). This arrangement maps the different styles into a continuum scale whose extreme values partially reflect the dichotomy of linear/haptic versus painterly/optic modes of representation. Among the styles displaying the highest values of C and the smallest values of H, we find Minimalism, Hard Edge Painting, and Color Field Painting, which are all marked by the use of simple design elements that are well-delimited by abrupt transitions of colors (53, 54). Styles displaying the smallest values of C and the highest values of H (such as Impressionism, Pointillism, and Fauvism) are characterized by the use of smudged and diffuse brushstrokes, and also by blending colors to avoid the creation of sharp edges (53, 54).

## Hierarchical Structure of Artistic Styles

The values of H and C capture the degree of similarity among artistic styles regarding the local ordering of image pixels. This fact enables us to test for a possible hierarchical organization of styles with respect to this local ordering. To do so, we have considered the Euclidean distance between a pair of styles in the complexity–entropy plane as a dissimilarity measure between them. Thus, the closer the distance between two artistic styles, the more significant is the similarity between them, whereas pairs of styles separated by large distances are considered more dissimilar from each other. Fig. 3*A* shows the matrix plot of these distances, where we qualitatively observe the formation of style groups.
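A minimal sketch of this dissimilarity matrix is given below. The function name `distance_matrix`, the generic style labels, and the coordinates are hypothetical; real inputs would be the average (H, C) of each style.

```python
import math


def distance_matrix(style_points):
    """Pairwise Euclidean distances between styles in the (H, C) plane.

    style_points maps a style name to its average (H, C) pair.
    Returns the sorted names and the matching square matrix of distances.
    """
    names = sorted(style_points)
    matrix = [
        [math.dist(style_points[a], style_points[b]) for b in names]
        for a in names
    ]
    return names, matrix


# Placeholder coordinates, not measured values.
points = {"style_a": (0.0, 0.0), "style_b": (0.3, 0.4), "style_c": (0.3, 0.0)}
names, dist = distance_matrix(points)
```

The matrix is symmetric with zeros on the diagonal, so only the upper triangle carries information.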

To investigate the clustering between artistic styles systematically, we use the minimum variance method proposed by Ward (58) to construct a dendrogram representation of the distance matrix. This method is a hierarchical clustering procedure that uses the within-cluster variance as the criterion for merging pairs of clusters. Fig. 3*B* depicts this dendrogram, unveiling an intricate relationship among the artistic styles in our dataset. By maximizing the silhouette coefficient (59) (as described in *Materials and Methods* and *SI Appendix*, Fig. S8), we find that 0.03 is the optimal threshold distance that maximizes the cohesion and separation among the clusters of styles. This threshold distance yields 14 groups of styles indicated by the different colors in Fig. 3.
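The merging criterion behind Ward's method can be illustrated with a naive sketch. At each step it joins the two clusters whose union gives the smallest increase in the within-cluster sum of squares, which for clusters of sizes n_a and n_b with centroids c_a and c_b equals n_a n_b / (n_a + n_b) · |c_a − c_b|². The function name `ward_merges` and the example points are hypothetical, and the merge cost reported here is not necessarily on the same scale as the threshold distance of 0.03 quoted above, since library implementations may rescale it.

```python
def ward_merges(points):
    """Naive Ward agglomeration; returns a list of (members_a, members_b, cost)."""
    # Each cluster stores its member indices and its centroid.
    clusters = {i: ([i], tuple(p)) for i, p in enumerate(points)}
    next_id = len(points)
    merges = []
    while len(clusters) > 1:
        best = None
        ids = sorted(clusters)
        for pos, a in enumerate(ids):
            for b in ids[pos + 1:]:
                (ma, ca), (mb, cb) = clusters[a], clusters[b]
                na, nb = len(ma), len(mb)
                sq_dist = sum((x - y) ** 2 for x, y in zip(ca, cb))
                cost = na * nb / (na + nb) * sq_dist  # increase in within-cluster SSE
                if best is None or cost < best[0]:
                    best = (cost, a, b)
        cost, a, b = best
        ma, ca = clusters.pop(a)
        mb, cb = clusters.pop(b)
        na, nb = len(ma), len(mb)
        centroid = tuple((na * x + nb * y) / (na + nb) for x, y in zip(ca, cb))
        clusters[next_id] = (ma + mb, centroid)
        merges.append((ma, mb, cost))
        next_id += 1
    return merges


# Placeholder points forming two tight pairs.
merges = ward_merges([(0.0, 0.0), (0.0, 0.1), (1.0, 1.0), (1.0, 1.1)])
```

Cutting the resulting merge sequence at a chosen cost level yields flat clusters; the cubic running time of this sketch is acceptable only for a small number of styles.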

These groups partially reflect the temporal localization of different artistic styles and their evolution reported in Fig. 1. In particular, several styles that emerged together or close in time are similar regarding the local arrangement of pixels and thus belong to the same group. For instance, the first five groups of Fig. 3*B* contain mainly Postmodern styles. On the other hand, these groups and their hierarchical structure organize the styles regarding their mode of representation in the scale delimited by the dichotomy of linear/haptic versus painterly/optic. This fact is more evident when examining groups in both extremes of order and regularity in the complexity–entropy plane. The right-most group of Fig. 3*B*, for example, contains styles that use relatively small brush strokes and avoid the creation of sharp edges. This fact is particularly evident in artworks of Impressionism, Pointillism, and Divisionism, but it is also evident in Neo-Baroque and Neo-Romanticism, and in the works of muralists (such as David Siqueiros and José Orozco), as well as in the abstract paintings of P&D (Pattern and Decoration). While devoted to patterning paintings (such as printed fabrics), P&D is considered a “reaction” to Minimalism and Conceptual Art (which are located in the other extreme of the complexity–entropy plane) that avoids restrained compositions by means of a subtle modulation of colors as in the works of Robert Zakanitch, who is considered one of the founders of P&D (60). As we move to groups characterized by high complexity and low entropy, we observe the clustering of styles marked by the presence of sharp edges and very contrasting patterns, usually formed by distinct parts isolated or combined with unrelated materials. That is the case for the group containing Op Art, Pop Art, and Constructivism, but also for the group formed by Kinetic Art, Hard Edge Painting, and Concretism (53, 54).

We can also verify the meaningfulness of these groups by comparing the clustering of Fig. 3*B* with an approach based on the similarities among the textual content of the Wikipedia pages of each artistic style. To do so, we have obtained the textual content of these webpages and extracted the top 100 keywords of each one by applying the term frequency–inverse document frequency approach (61). We consider the inverse of 1 plus the number of shared keywords between two styles as a measure of the distance between them. Thus, styles having no common keywords are at the maximum distance of 1, while styles sharing several keywords are at a closer distance.
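The keyword-based measure is simple enough to state as code. In this sketch the function name `keyword_distance` and the example keyword sets are hypothetical; the keyword extraction itself (term frequency–inverse document frequency) is assumed to have been done beforehand.

```python
def keyword_distance(keywords_a, keywords_b):
    """Inverse of one plus the number of keywords shared by two styles."""
    shared = len(set(keywords_a) & set(keywords_b))
    return 1 / (1 + shared)


# Placeholder keyword lists, not extracted from any real page.
style_x = ["color", "light", "brushstroke", "landscape"]
style_y = ["color", "light", "brushstroke", "geometry"]
style_z = ["sculpture", "metal"]

d_close = keyword_distance(style_x, style_y)  # three shared keywords
d_far = keyword_distance(style_x, style_z)    # no shared keywords
```

With no shared keywords the value is 1, and it decreases toward 0 as the overlap grows, so it can be fed to the same hierarchical clustering procedure as a distance.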

By using a similar hierarchical clustering procedure to the one used in Fig. 3, we obtain 24 clusters of artistic styles from the Wikipedia text analysis (*SI Appendix*, Fig. S9). This number of clusters is much larger than the 14 clusters obtained from the complexity–entropy plane. However, both clustering approaches share similarities, which can be quantified by using the clustering evaluation metrics homogeneity *h*, completeness *c*, and v measure (62). Perfect homogeneity (

## Predicting Artistic Styles

Another possibility of quantifying the information encoded by the values of H and C is trying to predict the style of an image based only on these two values. To do so, we have implemented four well-known machine learning algorithms (63, 64) (nearest neighbors, random forest, support vector machine, and neural network; see *Materials and Methods* for details) for the classification task of predicting the style of images for all 20 styles that contain more than 1,500 artworks each. For each method, we estimate the validation curves for a range of values of the main parameters of the algorithms with a stratified n-fold cross-validation (63) strategy. Fig. 4*A* shows the validation curves for the k-nearest neighbors as a function of the number of neighbors. We note that this method overfits the data if the number of neighbors is smaller than ∼250. Conversely, the cross-validation score saturates at ∼0.18 if the neighbors are 300 or more, and there is no overfitting up to 500 neighbors. Another relevant issue for statistical learning is related to the amount of data necessary to properly train the model. To investigate this, we again use a stratified n-fold cross-validation strategy. Fig. 4*B* shows the training and cross-validation scores for the k-nearest neighbors, where we observe that both scores increase with training size. However, this enhancement is very small when more than ∼50% of the data are used for training the model. *SI Appendix*, Fig. S10 shows results analogous to those presented in Fig. 4 as obtained with the other three machine learning algorithms.

By combining the previous analysis with a grid search algorithm, we determine the best combination of parameters enhancing the performance of each statistical learning method. Fig. 4*C* shows that the four algorithms display similar performances, all exhibiting accuracies close to 18%. We have further compared these accuracies with those obtained from two dummy classifiers. In the stratified classifier, style predictions are generated by chance but respecting the distribution of styles, while predictions are drawn uniformly at random when using the uniform classifier. The results in Fig. 4*C* show that all machine learning algorithms have a significantly larger accuracy than is obtained by chance. This result thus confirms that the values of H and C encode important information about the style of each artwork. Nevertheless, the achieved accuracy is quite modest for practical applications. Indeed, there are other approaches that are more accurate. For instance, Zujovic et al. (65) achieved accuracies of ∼70% in a classification task with 353 paintings from five styles, and Agarwal et al. (66) reported an accuracy of ∼60% in a classification task with 3,000 paintings from 10 styles. However, our results cannot be directly compared with those works, since they use a much smaller dataset with fewer styles and several image features, while our predictions are based only on two features. Our approach represents a severe dimensionality reduction, since images with roughly 1 million pixels are represented by two numbers related to the local ordering of the image pixels. In this context, an accuracy of 18% in a classification with 20 styles and more than 100,000 artworks is not negligible. Moreover, the local nature of H and C makes these complexity measures very fast, easy to parallelize, and scalable from the computational point of view. Thus, in addition to showing that the complexity–entropy plane encodes important information about the artistic styles, we believe that the values of H and C, combined with other image features, are likely to provide better classification scores.
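To put the chance levels in perspective, the expected accuracies of the two dummy classifiers can be computed directly from the label distribution. This is a sketch under the assumption that the class proportions are the same in training and test data; the function name `dummy_baselines` and the toy labels are hypothetical. A uniform guess among K styles is right with probability 1/K (5% for 20 styles), while a stratified guess is right with probability equal to the sum of the squared class proportions.

```python
from collections import Counter


def dummy_baselines(labels):
    """Expected accuracy of uniform and stratified random guessing."""
    counts = Counter(labels)
    total = len(labels)
    uniform = 1 / len(counts)                                    # 1 / K
    stratified = sum((c / total) ** 2 for c in counts.values())  # sum of p_k^2
    return uniform, stratified


# Toy label set with three unbalanced classes (not the real style counts).
labels = ["a"] * 6 + ["b"] * 3 + ["c"] * 1
uniform_acc, stratified_acc = dummy_baselines(labels)
```

The stratified baseline is never below the uniform one and exceeds it whenever the classes are unbalanced, which is why both are useful reference points.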

## Discussion and Conclusions

We have presented a large-scale characterization of a dataset composed of almost 140,000 artwork images that span the latest millennium of art history. Our analysis is based on two relatively simple complexity measures (permutation entropy H and statistical complexity C) that are directly related to the ordinal patterns in the pixels of these images. These measures map the local degree of order of these artworks into a scale of order–disorder and simplicity–complexity that locally reflects the qualitative description of artworks proposed by Wölfflin and Riegl. The limits of this scale correspond to two extreme modes of representation proposed by these art historians, namely, to the dichotomy between linear/haptic and painterly/optic.

By investigating the dynamical behavior of the average values of the complexity measures used, we have found a clear and robust trajectory of art over the years in the complexity–entropy plane. This trajectory is characterized by transitions that agree with the main periods of art history. These transitions can be classified as linear/haptic to painterly/optic (before and after Modern Art) and painterly/optic to linear/haptic (the transition between Modern and Postmodern Art), showing that each of these historical periods has a distinct degree of entropy and complexity. While Wölfflin’s conception of art history in terms of a cyclical transition between linear and painterly does not withstand the local time persistence in the values of H and C nor the critical scrutiny of Gaiger (51) and other contemporary art historians, it is quite consistent with the global evolution depicted in the complexity–entropy plane. For Wölfflin, the transition from linear to painterly is governed by a “natural law in the same way as physical growth,” and “to determine this law would be a central problem, the central problem of history of art” (ref. 49, p. 17). However, the return to the linear “lies certainly in outward circumstances” (ref. 49, p. 233), and, in the context of Fig. 1, it is not difficult to envisage that the transition from Modern to Postmodern was driven by the end of World War II, the event that usually marks the beginning of Postmodernism in history books.

In addition to unveiling this dynamical aspect of art, the values of H and C are capable of distinguishing between different artistic styles according to the average degree of entropy–complexity in the corresponding artworks. We emphasize that the location of each style in the complexity–entropy plane partially reflects the duality linear/haptic versus painterly/optic, and thus can be considered as a ruler for quantifying the use of these opposing modes of representation. Also, the distances between pairs of styles in the complexity–entropy plane represent a similarity measure regarding these art history concepts. By using these distances, we find that different styles can be hierarchically organized and grouped according to their position on the plane. We have verified that these groups reflect well the textual content of Wikipedia pages used for describing each style, and they also reflect some similarities among them, in particular regarding the presence of soft/smudged/diffuse or well-defined/sharp/abrupt transitions. We have further quantified the amount of information encoded in these complexity measures by means of a classification task in which the style of an image is predicted based solely on the values of H and C. The obtained success rate of approximately 18% outperforms dummy classifiers, in turn showing that these two measures carry meaningful information about artwork style.

Since our two complexity measures are based entirely on the local scale of an artwork, they, of course, cannot capture all of the uniqueness and complexity of art. However, our results nevertheless demonstrate that simple physics-inspired metrics can be connected to concepts proposed by art historians and, more importantly, that these measures do carry relevant information about artworks, their style, and their evolution. In the context of Wölfflin’s metaphor about the evolution of art: “A closer inspection certainly soon shows that art even here did not return to the point at which it once stood, but that only a spiral movement would meet the facts.” (ref. 49, p. 234), we may consider the complexity–entropy plane as one of the possible projections of Wölfflin’s spiral.

## Materials and Methods

### Data.

The digital images used in this study were obtained from the visual arts encyclopedia WikiArt (https://www.wikiart.org/), which is one of the largest online and freely available datasets of visual artworks available to date. By crawling the web pages of WikiArt in August of 2016, we downloaded 137,364 digitalized images and metadata related to each artwork, such as painter (there are 2,391 different artists), date, and artistic style (e.g., Impressionism, Surrealism, and Baroque). The style labels provided by WikiArt are generated and collaboratively maintained by the users of that webpage. For the analysis of the temporal evolution, we have excluded all images whose composition dates were not specified (33,724 files). Fig. 5*A* depicts the number of images per year in our dataset, where we observe that these artworks were created between the years 1031 and 2016. Fig. 5*B* shows that the cumulative fraction of artworks in our dataset is well approximated by an exponential growth with the characteristic time equal to

### Matrix Representation of Image Files.

All image files are in JPEG format with 24 bits per pixel (8 bits each for red, green, and blue colors in the RGB “color space”), meaning that each pixel of the image is characterized by 256 shades of red, green, and blue, which, in total, allows *SI Appendix*, Fig. S2). We have therefore resorted to using the simple average value.
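As a minimal sketch of the "simple average value" mentioned above, assuming it refers to the per-pixel mean of the red, green, and blue channels, an RGB array can be collapsed into a single matrix as follows. The function name `rgb_to_average_matrix` and the tiny synthetic image are illustrative and not taken from the original study.

```python
import numpy as np

def rgb_to_average_matrix(rgb):
    """Collapse a (height, width, 3) RGB array into a 2D matrix by averaging the channels."""
    rgb = np.asarray(rgb, dtype=float)  # convert first so uint8 values cannot overflow
    if rgb.ndim != 3 or rgb.shape[2] != 3:
        raise ValueError("expected an array of shape (height, width, 3)")
    return rgb.mean(axis=2)

# Synthetic 2x2 image with 8-bit channels (hypothetical data, not a WikiArt file)
image = np.array([[[255, 0, 0], [0, 255, 0]],
                  [[30, 60, 90], [255, 255, 255]]], dtype=np.uint8)

A = rgb_to_average_matrix(image)
print(A)
```

A real JPEG file would first have to be decoded into such an array with an image library; that step is omitted here.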

### The Complexity–Entropy Plane.

By using the matrix representation of all images, we calculate the normalized permutation entropy H and statistical complexity C for each one. This technique was originally proposed for characterizing time series (44, 48) and has only recently been generalized for use with higher-dimensional data such as images (46, 47). Here, we present this technique through a simple example (for a more formal description, we refer to the original articles). Let the matrix

Having the probability distribution

Despite the value of H being a good measure of randomness, it cannot adequately capture the degree of structural complexity present in A (47). Because of that, we further calculate the so-called statistical complexity (45, 68, 69)
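For orientation, the two measures can be written in the form commonly used in the permutation entropy literature. This is a reference sketch and may not match the notation of the original equations: the symbols $P = \{p_i\}$ (distribution of ordinal patterns), $U$ (uniform distribution), $n$ (number of possible patterns), $d_x$ and $d_y$ (embedding dimensions), $S$ (Shannon entropy), and $D$ (Jensen–Shannon divergence) are notational assumptions made here.

```latex
\begin{align}
S(P) &= -\sum_{i=1}^{n} p_i \ln p_i, \qquad n = (d_x d_y)!, \\
H(P) &= \frac{S(P)}{\ln n}, \\
D(P,U) &= S\!\left(\frac{P+U}{2}\right) - \frac{S(P)}{2} - \frac{S(U)}{2}, \\
D^{*} &= -\frac{1}{2}\left[\frac{n+1}{n}\ln(n+1) + \ln n - 2\ln(2n)\right], \\
C(P) &= \frac{D(P,U)\,H(P)}{D^{*}} .
\end{align}
```

Here $D^{*}$ is the largest value that $D(P,U)$ can take, reached when a single pattern has probability one, so that both $H$ and $C$ lie between 0 and 1.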

We estimate the values of *SI Appendix*, Fig. S3), thus practically limiting our choice to
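The procedure described in this section can be sketched in a few lines of Python. This is an illustrative implementation under stated assumptions, not the code used in the study: the function names are hypothetical, the default window size `dx = dy = 2` is chosen only for the example, windows are overlapping and flattened row by row, and ties between equal pixel values are broken by keeping their original order.

```python
import math
from itertools import permutations

import numpy as np

def ordinal_distribution(A, dx=2, dy=2):
    """Relative frequency of each ordinal pattern over all overlapping dy-by-dx windows of A."""
    A = np.asarray(A, dtype=float)
    ny, nx = A.shape
    counts = {p: 0 for p in permutations(range(dx * dy))}
    for i in range(ny - dy + 1):
        for j in range(nx - dx + 1):
            window = A[i:i + dy, j:j + dx].ravel()  # row-by-row flattening (assumption)
            # Stable sort: equal values keep their original order (assumption)
            pattern = tuple(np.argsort(window, kind="stable").tolist())
            counts[pattern] += 1
    total = sum(counts.values())
    return np.array([c / total for c in counts.values()])

def shannon(p):
    """Shannon entropy with natural logarithm; zero probabilities contribute nothing."""
    p = p[p > 0]
    return float(-np.sum(p * np.log(p)))

def entropy_complexity(A, dx=2, dy=2):
    """Return (H, C): normalized permutation entropy and statistical complexity of A."""
    P = ordinal_distribution(A, dx, dy)
    n = P.size
    U = np.full(n, 1.0 / n)
    H = shannon(P) / math.log(n)
    # Jensen-Shannon divergence between P and the uniform distribution
    D = shannon((P + U) / 2) - shannon(P) / 2 - shannon(U) / 2
    # Maximum of D, attained when one pattern has probability one
    D_max = -0.5 * ((n + 1) / n * math.log(n + 1) + math.log(n) - 2 * math.log(2 * n))
    return H, D * H / D_max

# Two synthetic extremes: a perfectly ordered gradient and uniform random noise
ordered = np.arange(100).reshape(10, 10)
noise = np.random.default_rng(0).random((100, 100))
print("ordered:", entropy_complexity(ordered))
print("noise:  ", entropy_complexity(noise))
```

The ordered gradient produces a single ordinal pattern, so both measures vanish, whereas the random matrix yields an entropy close to one and a complexity close to zero. Intermediate structures are the ones that can reach high complexity.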

### Independence of H and C in Relation to Image Dimensions.

The image files obtained from WikiArt do not have the same dimensions. *SI Appendix*, Fig. S3 shows that image width and height have a similar distribution, with average values equal to 895 pixels for width and 913 pixels for height. Also, 95% of the images have width between 313 and 2,491 pixels, and height between 323 and 2,702 pixels. Because of these different dimensions, we have tested whether the values of *SI Appendix*, Fig. S4 shows scatter plots of the values of

### Finding the Number of Clusters with the Silhouette Coefficient.

The hierarchical organization of artistic styles presented in Fig. 3*B* enables the determination of clusters of styles. To do so, we must choose a threshold distance for which styles belong to different clusters. The number of clusters naturally depends on this choice. A way of determining an optimal threshold distance is by calculating the silhouette coefficient (59). This coefficient evaluates both the cohesion and the separation of data grouped into clusters. The silhouette coefficient is defined by the average value of*SI Appendix*, Fig. S8 shows the silhouette coefficient as a function of the threshold distance used for determining the clusters in Fig. 3*B*. We observe that the coefficient displays a maximum (of 0.57) when the threshold distance is 0.03. Thus, this threshold distance is the one that maximizes the cohesion and separation among the artistic styles. By using this value, we find 14 different groups of artistic styles shown in Fig. 3. Also, a similar approach yields the 24 different clusters that are associated with the similarities among the Wikipedia pages of the styles reported in *SI Appendix*, Fig. S9.
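The threshold selection described above can be sketched with SciPy and scikit-learn. The example below is a hedged illustration on synthetic points standing in for the average (H, C) values of styles. The Ward linkage, the Euclidean metric, the grid of thresholds, and the helper name `best_threshold` are assumptions made for this sketch and are not taken from this section.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from sklearn.metrics import silhouette_score

rng = np.random.default_rng(0)
# Synthetic stand-in for per-style average (H, C) values: three tight groups
centers = np.array([[0.80, 0.10], [0.90, 0.05], [0.85, 0.20]])
points = np.vstack([c + 0.005 * rng.standard_normal((10, 2)) for c in centers])

# Hierarchical clustering (Ward linkage is an assumption of this sketch)
Z = linkage(points, method="ward")

def best_threshold(points, Z, thresholds):
    """Return (threshold, silhouette, n_clusters) for the threshold maximizing the silhouette."""
    best = None
    for t in thresholds:
        labels = fcluster(Z, t=t, criterion="distance")
        k = len(np.unique(labels))
        if k < 2 or k >= len(points):
            continue  # the silhouette coefficient is undefined in these cases
        score = silhouette_score(points, labels)
        if best is None or score > best[1]:
            best = (t, score, k)
    return best

thresholds = np.linspace(0.01, 0.5, 50)
t_best, s_best, k_best = best_threshold(points, Z, thresholds)
print(f"threshold={t_best:.2f}, silhouette={s_best:.2f}, clusters={k_best}")
```

Scanning a grid of thresholds and keeping the one with the highest silhouette coefficient mirrors the selection rule described in the text. The thresholds here are in the units of the Ward linkage distance, so they are not directly comparable to the value of 0.03 reported above.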

### Implementation of Machine Learning Algorithms.

All machine learning algorithms used for predicting the artistic styles from an image are implemented by using functions of the Python scikit-learn library (71). For instance, the function sklearn.neighbors.KNeighborsClassifier implements the k-nearest neighbors. In statistical learning, a classification task involves inferring the category of an object by using a set of explanatory variables or features associated with this object, and the knowledge of other observations (the training set) in which the categories of the objects are known. For the results presented in the main text, we have included 20 different styles that each have more than 1,500 images in our analysis (see *SI Appendix*, Fig. S5 for names). However, similar results are obtained by considering a larger number of styles. For example, the overall accuracy of different learning algorithms is approximately 13% if we consider all of the styles with more than 100 images each (*SI Appendix*, Fig. S11) (compared with ∼18% if only 20 styles are considered).

Thus, our classification task involves identifying the artistic style (the categories) of an image from its entropy H and complexity C (the set of features). To perform the classification, data are randomly partitioned into n equally sized samples that preserve the total fraction of occurrences in each category. One of the samples is used for validating the algorithm, and the remaining
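A stratified partition of this kind, together with a comparison against a dummy classifier, can be sketched with scikit-learn as follows. The data are synthetic (four hypothetical styles with two features standing in for H and C), and the number of folds, the number of neighbors, and the dummy strategy are illustrative choices rather than the settings of the study.

```python
import numpy as np
from sklearn.dummy import DummyClassifier
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(1)
# Synthetic stand-in for (H, C) features of four hypothetical styles
centers = np.array([[0.75, 0.10], [0.85, 0.10], [0.80, 0.20], [0.92, 0.05]])
X = np.vstack([c + 0.03 * rng.standard_normal((200, 2)) for c in centers])
y = np.repeat(np.arange(len(centers)), 200)

# n equally sized samples that preserve the class proportions (n = 10 is illustrative)
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)

knn_scores = cross_val_score(KNeighborsClassifier(n_neighbors=15), X, y, cv=cv)
dummy_scores = cross_val_score(
    DummyClassifier(strategy="stratified", random_state=0), X, y, cv=cv
)

print(f"k-nearest neighbors accuracy: {knn_scores.mean():.3f}")
print(f"dummy classifier accuracy:    {dummy_scores.mean():.3f}")
```

The dummy classifier ignores the features and draws labels according to the class frequencies, which gives the baseline that any informative feature set should exceed.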

We estimate the training and the cross-validation scores for each machine learning algorithm as a function of their main parameters (the validation curves). This is common practice for estimating the best trade-off between bias and variance errors. Bias errors occur when the learning methods are not properly taking into account all of the relevant information about the explanatory variables that describe the data (underfitting). Variance errors, on the other hand, usually happen when the complexity of the learning model is too high, that is, high enough even for modeling the noise in the training set (overfitting). *SI Appendix*, Fig. S10 shows the validation curves for the four learning methods that we use in our study. The parameters that we have studied are the number of neighbors in the case of the k-nearest neighbors algorithm, the number and the maximum depth of trees in the case of the random forest method, the parameter associated with the width of the radial basis function kernel and the penalty parameter for the support vector machine classification, and the so-called *SI Appendix*, Fig. S10 and used for obtaining the results shown in Fig. 4*C*.
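A validation curve of the kind described here can be computed with `sklearn.model_selection.validation_curve`. The sketch below does so for the number of neighbors of a k-nearest neighbors classifier on synthetic, partially overlapping classes. The data, the parameter grid, and the number of folds are assumptions made for illustration.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold, validation_curve
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(2)
# Three overlapping synthetic classes with two features (stand-ins for H and C)
centers = np.array([[0.78, 0.12], [0.83, 0.12], [0.88, 0.12]])
X = np.vstack([c + 0.03 * rng.standard_normal((200, 2)) for c in centers])
y = np.repeat(np.arange(3), 200)

param_range = [1, 5, 25, 125]  # illustrative grid for the number of neighbors
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

train_scores, cv_scores = validation_curve(
    KNeighborsClassifier(), X, y,
    param_name="n_neighbors", param_range=param_range, cv=cv,
)

# Average over the folds: one training and one cross-validation score per parameter value
train_mean = train_scores.mean(axis=1)
cv_mean = cv_scores.mean(axis=1)
best_k = param_range[int(np.argmax(cv_mean))]

for k, tr, va in zip(param_range, train_mean, cv_mean):
    print(f"n_neighbors={k:4d}  train={tr:.3f}  cross-validation={va:.3f}")
print("best n_neighbors:", best_k)
```

With a single neighbor the training score is perfect while the cross-validation score is much lower, which is the overfitting signature discussed above. The gap typically narrows as the number of neighbors grows.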

In addition to the validation curves, we have also estimated the learning curves, that is, the dependence of the training and the cross-validation scores on the size of the training set. This practice is also common when dealing with statistical learning algorithms, since very small training sets are usually not enough for fitting the model, while adding unnecessary data may introduce noise to the model. The results presented in *SI Appendix*, Fig. S10 show that the cross-validation score increases with the training size for all algorithms. However, this increase becomes practically negligible once the training set exceeds 50% of the data.
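Learning curves of this kind can be obtained with `sklearn.model_selection.learning_curve`. The following sketch uses synthetic data and a k-nearest neighbors classifier. The training-set fractions, the number of folds, and the number of neighbors are illustrative assumptions.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold, learning_curve
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(3)
# Three overlapping synthetic classes with two features (stand-ins for H and C)
centers = np.array([[0.78, 0.12], [0.83, 0.12], [0.88, 0.12]])
X = np.vstack([c + 0.03 * rng.standard_normal((200, 2)) for c in centers])
y = np.repeat(np.arange(3), 200)

cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

# shuffle=True so that each reduced training set contains a mix of all classes
sizes, train_scores, cv_scores = learning_curve(
    KNeighborsClassifier(n_neighbors=15), X, y,
    train_sizes=np.linspace(0.1, 1.0, 5), cv=cv,
    shuffle=True, random_state=0,
)

train_mean = train_scores.mean(axis=1)
cv_mean = cv_scores.mean(axis=1)
for n, tr, va in zip(sizes, train_mean, cv_mean):
    print(f"training size={n:4d}  train={tr:.3f}  cross-validation={va:.3f}")
```

Plotting `cv_mean` against `sizes` shows whether the cross-validation score has saturated, which is the criterion used in the text to judge how much training data is needed.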

## Acknowledgments

This research was supported by Conselho Nacional de Desenvolvimento Científico e Tecnológico, Coordenação de Aperfeiçoamento de Pessoal de Nível Superior Grants 440650/2014-3 and 303642/2014-9 and Slovenian Research Agency Grants J1-7009 and P5-0027.

## Footnotes

^{1}To whom correspondence may be addressed. Email: matjaz.perc{at}uni-mb.si or hvr{at}dfi.uem.br.

Author contributions: H.Y.D.S., M.P., and H.V.R. designed research, performed research, contributed new reagents/analytic tools, analyzed data, and wrote the paper.

The authors declare no conflict of interest.

This article is a PNAS Direct Submission.

This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1800083115/-/DCSupplemental.

Published under the PNAS license.

## References

- Mantegna RN, Stanley HE
- Wang Z, et al.
- Stanley HE
- Michel JB, et al.
- Dodds PS, et al.
- Schich M, et al.
- Deville P, et al.
- Onnela JP, et al.
- Jiang ZQ, et al.
- Saramäki J, et al.
- Hughes JM, Foti NJ, Krakauer DC, Rockmore DN
- Kuhn T, Perc M, Helbing D
- Perc M
- Sinatra R, Wang D, Deville P, Song C, Barabási AL
- Balietti S, Goldstone RL, Helbing D
- Birkhoff GD
- Taylor RP, Micolich AP, Jonas D
- Taylor RP, et al.
- Jones-Smith K, Mathur H, Krauss LM
- De la Calleja EM, Cervantes F, De la Calleja J
- Boon JP, Casti J, Taylor RP
- Alvarez-Ramirez J, Ibarra-Valdez C, Rodriguez E
- Pedram P, Jafari GR
- Taylor R
- Hughes JM, Graham DJ, Rockmore DN
- Shamir L
- Elsa M, Zenit R
- Castrejon-Pita JR, Castrejón-Pita AA, Sarmiento-Galán A, Castrejón-García R
- Montagner C, Linhares JMM, Vilarigues M, Nascimento SMC
- Stork DG, Coddington J. *Computer Image Analysis in the Study of Art*, Proceedings of SPIE (Int Soc Opt Photonics, Bellingham, WA), Vol 6810.
- Stork DG, Coddington J, Bentkowska-Kafel A. *Computer Vision and Image Analysis of Art*, Proceedings of SPIE (Int Soc Opt Photonics, Bellingham, WA), Vol 7531.
- Stork DG, Coddington J, Bentkowska-Kafel A. *Computer Vision and Image Analysis of Art II*, Proceedings of SPIE (Int Soc Opt Photonics, Bellingham, WA), Vol 7869.
- Lee B, Kim D, Jeong H, Sun S, Park J
- Ushizima D, Manovich L, Margolis T, Douglass J
- Manovich L
- Yazdani M, Chow J, Manovich L
- Zunino L, Ribeiro HV
- Wölfflin H
- Riegl A
- Gaiger J
- Danto AC, Goehr L
- Kleiner FS
- Hodge AN
- Blatt SJ, Blatt ES
- Gottschall J, Wilson DS
- Boyd B
- Tinio PPL, Smith JK
- Nadal M, Gómez-Puerto G
- Swartz A
- Chowdhury GG
- Rosenberg A, Hirschberg J
- Hastie T, Tibshirani R, Friedman J. *The Elements of Statistical Learning: Data Mining, Inference, and Prediction*, Springer Series in Statistics (Springer, New York).
- Müller A, Guido S
- Zujovic J, Gandy L, Friedman S, Pardo B, Pappas TN
- Agarwal S, Karnick H, Pant N, Patel U
- Lamberti PW, Martin MT, Plastino A, Rosso OA
- Reshef DN, et al.
- Pedregosa F, et al.

## Article Classifications

- Physical Sciences
- Applied Physical Sciences