Nature 463, 1027–1028 (25 February 2010) | Published online 24 February 2010
Bruno A. Olshausen & Michael R. DeWeese
A mathematical method has been developed that distinguishes between the drawings of Pieter Bruegel the Elder and those of his imitators. But can the approach be used to spot imitations of works by any artist?
What makes the style of an artist unique? For paintings or drawings, what comes to mind are specific brush or pen strokes, the manner in which objects are shaded, or how characters or landscapes are portrayed. Art historians are skilled at identifying such details through visual inspection, and art collectors and museums currently rely on this type of expert analysis to authenticate works of art. Might it be possible to automate this process, teaching a computer to analyse art and so provide a more objective assessment? In an article in Proceedings of the National Academy of Sciences, Hughes et al.1 demonstrate that subtle stylistic differences between the drawings of Pieter Bruegel the Elder and those of his imitators, some of which were at one time misattributed by art historians, may be reliably detected by statistical methods.
Hughes and colleagues' work is the latest in a stream of research findings that have emerged over the past few decades in the field of 'image statistics'. The players in this field are an unlikely cadre of engineers, statisticians and neuroscientists who are seeking to characterize what makes images of the natural environment different from unstructured or random images (such as the 'static' on a computer monitor or television). Answering this question is central to the problem of coding and transmitting images over the airwaves and the Internet, and, it turns out, it is just as important for understanding how neurons encode and represent images in the brain.
The first image statisticians were television engineers, who, as early as the 1950s, were trying to exploit correlations in television signals to compress the signals into a more efficient format. Around the same time, pioneering psychologists and neuroscientists such as Fred Attneave and Horace Barlow were using ideas from information theory to work out how the particular structures contained in images shape the way that information is coded by neurons in the brain. Since then, others have succeeded in developing specific mathematical models of natural-image structure — showing, for example, that the two-dimensional power spectrum varies with spatial frequency, f, roughly as 1/f² (ref. 2), and that the distribution of contrast in local image regions is invariant across scale (refs 3–5).
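The 1/f² property is easy to check numerically. The sketch below (our illustration, not taken from ref. 2) synthesizes an image with 1/f Fourier amplitudes, then recovers the expected log-log slope of about −2 from the radially averaged power spectrum:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 256

# Synthesize an image with 1/f Fourier amplitudes, hence a 1/f^2
# power spectrum, mimicking the second-order statistics of natural scenes.
fy = np.fft.fftfreq(n)[:, None]
fx = np.fft.fftfreq(n)[None, :]
f = np.hypot(fx, fy)
f[0, 0] = 1.0                                  # avoid dividing by zero at DC
image = np.real(np.fft.ifft2(np.fft.fft2(rng.standard_normal((n, n))) / f))

# Radially average the measured power spectrum and fit a log-log slope.
power = np.abs(np.fft.fft2(image)) ** 2
r = np.rint(n * np.hypot(fx, fy)).astype(int)  # integer radial frequency
freqs = np.arange(1, n // 2)
sums = np.bincount(r.ravel(), weights=power.ravel())
counts = np.bincount(r.ravel())
radial = sums[freqs] / counts[freqs]
slope = np.polyfit(np.log(freqs), np.log(radial), 1)[0]
print(slope)                                   # close to -2
```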
Investigators also began applying these and related models to characterize the statistical structure of paintings by particular artists. It was shown, for example, that Jackson Pollock's drip paintings have fractal structure6, and that Bruegel's drawings could be distinguished from those of his imitators by the shape of the histogram of wavelet filter outputs, which represent how much spatial structure is present at different scales and orientations7. It is this latter work that formed the basis for Hughes and colleagues' study1. Instead of using standard wavelet filters, they apply a set of filters that are adapted to the statistics of Bruegel's drawings through a method known as sparse coding.
In a sparse-coding model, local regions of an image are encoded in terms of a 'dictionary' of spatial features; importantly, the dictionary is built up, or trained, from the statistics of an ensemble of images, so that only a few elements from the dictionary are needed to encode any given region. Essentially, sparsity forces the elements of the dictionary to match spatial patterns that tend to occur in the images with frequencies significantly higher than chance, thus providing a snapshot of structure contained in the data. Neuroscientists have shown that such dictionaries, when trained on a large ensemble of natural scenes, match the measured receptive-field characteristics of neurons in the primary visual cortex of mammals. These and other empirical findings have lent support to the idea that sparse coding may be used by neurons for sensory representation in the cortex8.
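To make the encoding step concrete, here is a minimal sketch of greedy sparse encoding by matching pursuit. It is illustrative only, not the authors' code: the tiny 2×2 'edge' dictionary is hand-built rather than learned, but it shows how a patch lying in the span of a few dictionary elements receives a sparse code.

```python
import numpy as np

def matching_pursuit(patch, dictionary, n_iter=10, tol=1e-6):
    """Greedy sparse encoding: repeatedly pick the unit-norm dictionary
    element best matched to the residual, until the residual is negligible."""
    residual = patch.astype(float).copy()
    coeffs = np.zeros(dictionary.shape[0])
    for _ in range(n_iter):
        projections = dictionary @ residual
        k = np.argmax(np.abs(projections))
        if abs(projections[k]) < tol:
            break
        coeffs[k] += projections[k]
        residual -= projections[k] * dictionary[k]
    return coeffs

# Hypothetical 4-element dictionary of 2x2 patterns, flattened to
# length-4 unit vectors (rows are dictionary elements).
D = np.array([[1.0,  1.0,  1.0,  1.0],    # uniform patch
              [1.0,  1.0, -1.0, -1.0],    # horizontal edge
              [1.0, -1.0,  1.0, -1.0],    # vertical edge
              [1.0, -1.0, -1.0,  1.0]])   # checkerboard
D = D / np.linalg.norm(D, axis=1, keepdims=True)

# A patch built from two elements gets a 2-sparse code.
patch = 3 * D[1] + 2 * D[2]
code = matching_pursuit(patch, D)
print(np.count_nonzero(code))  # 2 — only two elements are needed
```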
Rather than attempting to form a generic code adapted to natural scenes, Hughes et al.1 asked what sort of dictionary results from training on one specific class of image — the drawings of Pieter Bruegel the Elder. The dictionary that emerges, not surprisingly, differs from that adapted for natural scenes. In some sense, Hughes et al. have evolved an artificial visual system that is hyper-adapted to Bruegel's drawings. Such a visual system will be adept at representing other drawings from this class — that is, other authentic drawings by Bruegel — because they result in sparse encodings. However, it will not be so adept at representing images outside this class, such as drawings by other artists and even those attempting to imitate Bruegel, because they will result in denser encodings — more dictionary elements will be needed to describe each image region (Fig. 1). To put it another way, a picture may be worth a thousand words, but if it's an authentic Bruegel, it should take only a few Bruegel dictionary elements to represent it faithfully.
Figure 1 | Hughes and colleagues1 show that small image patches taken from a collection of authentic works by Pieter Bruegel the Elder (a) can be used to generate a 'dictionary' of visual elements attuned to the statistics of his style (b). A test image (c) can then be authenticated by recreating it with a combination of dictionary elements. If recreation of the test image requires only a few dictionary elements, it is sparse, and labelled 'authentic', whereas if accurate encoding of the test image requires many dictionary elements, it is labelled as an 'imitation'.
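The decision rule described above can be illustrated with a toy example. The sketch below is our illustration, not the authors' procedure: it uses a made-up orthonormal stand-in dictionary, defines 'authentic' patches as combinations of just two of its elements, and scores each patch by how many elements are needed to capture most of its energy.

```python
import numpy as np

rng = np.random.default_rng(1)

# Orthonormal toy stand-in for a learned "Bruegel" dictionary.
D = np.array([[1.0,  1.0,  1.0,  1.0],
              [1.0,  1.0, -1.0, -1.0],
              [1.0, -1.0,  1.0, -1.0],
              [1.0, -1.0, -1.0,  1.0]]) / 2.0

def n_elements_needed(patch, dictionary, frac=0.95):
    """Smallest number of dictionary elements whose coefficients capture
    `frac` of the patch's energy (assumes an orthonormal dictionary)."""
    c2 = np.sort((dictionary @ patch) ** 2)[::-1]
    return int(np.searchsorted(np.cumsum(c2), frac * c2.sum()) + 1)

# "Authentic" patches: combinations of just two dictionary elements.
authentic = [rng.normal() * D[1] + rng.normal() * D[2] for _ in range(50)]
# "Imitation" patches: generic random images, dense in this dictionary.
imitation = [rng.standard_normal(4) for _ in range(50)]

mean_auth = np.mean([n_elements_needed(p, D) for p in authentic])
mean_imit = np.mean([n_elements_needed(p, D) for p in imitation])
print(mean_auth < mean_imit)  # sparser codes flag the in-class patches
```

A real system would of course threshold a sparsity statistic computed over many patches of a single work, but the separation between the two classes is the same in spirit.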
Can such an approach be used to authenticate works by any artist? And how robust can one expect it to be in practice? Key to the success of this study1 is the fact that all of the analyses were performed on one particular type of artwork produced by Bruegel — drawings of landscapes. However, Bruegel worked in a variety of media, and his subject matter spanned a wide range. Moreover, an individual artist may use various styles. Developing algorithms capable of generalizing across these variations presents a much more challenging problem. Another concern is that it may be possible to defeat this method by generating images that are sparse for a wide range of dictionaries. For example, a geometrical abstract painting by Piet Mondrian would presumably yield a highly sparse representation using a dictionary trained on nearly any artist. Worse still, images randomly generated from the learned dictionary elements would also exhibit high sparsity but would look nothing like a real Bruegel. Thus, sparsity alone may be too fragile a measure for authentication.
One might question other technical choices made by the authors, such as the exclusive use of kurtosis (a statistical measure often used to quantify the degree of 'peakedness' of a probability distribution) to characterize the sparsity of filter outputs; and the analysis of statistical significance is at times puzzling. But Hughes and colleagues have taken a bold step. This is an exciting area of research that goes even beyond forgery detection. Indeed, it raises the question of whether it might be possible to fully capture the style of an artist using statistics. The field of natural-image statistics has advanced beyond the simple sparse-coding models used here, and it is now possible to characterize complex relationships among dictionary elements9, 10. Intriguingly, all of these models are generative — that is, they can be used to synthesize images matching the statistics captured by the model, as has already been done successfully with textures11. One exciting possibility is that computers could generate novel images that convincingly emulate the style of a particular artist. Perhaps someday the best Bruegel imitators will be computers.
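As a footnote on the kurtosis statistic mentioned above: the intuition is that sparse filter outputs follow a peaked, heavy-tailed distribution, whose fourth standardized moment exceeds the Gaussian value of 3. The sketch below is illustrative only (not the authors' analysis), contrasting synthetic Gaussian (dense, noise-like) and Laplacian (sparse) outputs:

```python
import numpy as np

def kurtosis(x):
    """Fourth standardized moment: 3 for a Gaussian, larger for the
    peaked, heavy-tailed distributions typical of sparse codes."""
    x = np.asarray(x, dtype=float)
    z = (x - x.mean()) / x.std()
    return float(np.mean(z ** 4))

rng = np.random.default_rng(0)
gaussian_outputs = rng.standard_normal(100_000)  # dense, noise-like code
laplacian_outputs = rng.laplace(size=100_000)    # sparse, peaked code

print(kurtosis(gaussian_outputs))   # about 3, the Gaussian value
print(kurtosis(laplacian_outputs))  # about 6, markedly heavier-tailed
```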