How do Photographs Represent Things?

Of puppets, philosophy and pictorial representation

Here’s a photograph of a young man performing a puppet show in the square before Notre Dame Cathedral in Paris. He calls himself Bululu Theatre. On-screen, this is just a small, glassy-looking, two-dimensional collection of independently resolved, variously toned pixels. If you have printed this off, it is just a small, flat, flimsy piece of paper marked by a rectangular smattering of inkjet (or similar) dots. I have a nice, larger version, picked out by a scattering of silver-halide particles on rag-content paper, framed on my wall.


A man stands behind a tall, thin screen with a puppet on his raised right arm - in the square in front of Notre Dame Cathedral, Paris, France. He calls his act: Bululu Theatre.


In sharp contrast, the real puppeteer that I ran across in Paris was nothing like this. He was fleshy and animated, robustly three-dimensional and full of colour. By walking around I could view him from different sides, angles and distances. (The back of him wasn’t flat and white with diagonally printed lines of ‘Kodak’!) As I shifted, the background shifted. When I looked at the puppeteer, the buildings became fuzzy. When I looked at the buildings, he became fuzzy.

Given that seeing the real puppeteer in Paris is so little like seeing the puppeteer in a photograph, how exactly do we know the photograph shows a puppeteer? This is the philosophical approach – to ask the ‘little kid’ questions and try to explain what appears obvious to everyone else.

If you think this is daft, escape now.

What follows are thumbnail sketches of the five most popular philosophical responses to the conundrum of pictorial representation – how a bunch of lifeless dots on paper (or paint strokes on canvas, or pencil marks on sketch pad) can show us things in the real (or even fictional) world.

One theory suggests that the pattern of dots resembles the puppeteer. Another states that the dots create a sort of illusion of the puppeteer. The Semiotic theory holds that the pattern of dots acts like a symbol, which we read and interpret as a puppeteer. The Make-Believe theory, not surprisingly, states that we make believe we see a puppeteer in the photograph. The Seeing-In theory suggests that something much more fundamental is going on and that we see a puppeteer in the dots.

Let’s take each of these theories in turn. My opinions should be taken with a grain of salt. Not everyone will be happy to apply all aspects of these theories to photographs – they are traditionally applied to paintings and other hand-made pictures unmediated by the optical-mechanical-digital actions of a camera. I hope the article will get the mental machinery rumbling and let you judge for yourself which theory might best capture our everyday experience of things through photographs.


Resemblance Theory

Large red heart painted on the back of a narrowboat reflected in the water.

A relatively simple version of the resemblance theory is based on the objective similarity between a picture and its subject matter out there in the real world – puppeteers, footballers, landscapes, movie stars, buildings, pet cats and so on. This view holds that the person looking at a picture compares something which is in front of them (the picture in their hand, say) with something that is absent (the real thing – which may be on the other side of town, at the bottom of the sea, or long dead). How this comparison is made, though, is not clearly specified in the theory. Some might say that it is a ‘mental picture’ to which a photograph is compared. But this seems to simply pass the buck to a middleman, for now we are faced with explaining how a ‘mental picture’ resembles (is objectively similar to) the subject matter (the real thing). And, unfortunately, mental items seem even more obscure than photographs. Even if you could resolve this, you would still need to say a lot more about how you compare something mental with something in your hand.

There is also a difference in logic between something resembling something else and it representing something else. If A resembles B, then B resembles A. If A represents B, it doesn’t follow that B represents A. The silhouette of children on a school crossing sign and real school children may resemble each other, and the silhouette may represent real children, but the real school children don’t represent the silhouette on the sign.

Another problem concerns the wide variety of non-figurative, artistic, highly distorted and fictional pictures we create. It’s difficult to see how a photograph of a blurred bus or a painting of a unicorn visually resembles, or is objectively similar to, anything. For we cannot see buses this way without a camera, and we cannot see unicorns at all.


Illusion Theory

In the illusion theory a picture represents its subject matter by delivering to us an illusion of that subject matter. The art historian and aesthetic theorist E. H. Gombrich explains it in this way: a viewer’s attention rapidly flickers back and forth between seeing the picture and seeing a perceptual illusion of its subject matter. The theory is a psychological one; to a lesser or greater degree we are inclined to believe that the thing in the picture is the real thing, while at the same time we (somehow) know it to be a picture. Human imagination obviously plays a central role. In Gombrich’s view these illusions arise not through resemblance, but essentially through a process of convention based on the evolving historical practice of people making pictures and our experience of those conventions. As the historical aspects of the theory could easily form the basis for another article, I won’t go into them here.

Jastrow’s Duck-rabbit: Outline figure that can be seen as both a duck facing left and a rabbit looking up.

I like to think human psychology and imagination play a role in understanding what pictures represent. Perhaps it is scalable: for example, we use less imagination in understanding realistic photographs than impressionistic paintings. Imagination certainly seems to be present when we look at things like Rorschach ink-blots and illusions like Jastrow’s duck-rabbit, which can be seen as a duck or a rabbit. But ink-blots and illusions are intended to be ambiguous and to be interpreted in multiple ways. Ordinary pictures, like photographs and paintings and billboards, usually convey their subject matter pretty straightforwardly.

I think what bothers me about the notion of illusion is that it is tied up with deception; illusions give us a false or mistaken belief about something present to our senses. If, when looking at a photograph, I increasingly found myself in the grip of a pictorial illusion (believing it to be the thing itself), surely I would also increasingly lose my grip on the fact that it was a photograph. But common experience suggests that we are simply not deceived in this way. Does Gombrich’s rapid flickering of our visual senses between photograph and illusion explain this? Possibly, but it’s difficult to say.

It seems that to run with the illusion theory you would need to have a good story – probably deeply rooted in human psychology and cognitive processing – as to how pictorial illusion can be so different from ordinary illusion.


Semiotic Theory

Close-up view of cuneiform tablet characters

Strict semiotic views, such as that advanced by Nelson Goodman, hold that pictures, like all signs, are conventional – they represent their subject matter by belonging to a symbol system similar to how natural languages represent their subject matter. As a consequence, understanding what a picture represents is not grounded in visual experience but in understanding the symbol system. For Goodman, the difference between how pictures represent subject matter and how sentences (and maps, graphs and wiring diagrams) represent subject matter is that pictures are symbolically ‘dense’ (they have no parts, such as words and sentences) and ‘replete’ (they have more visual features which are ‘relevant’ to how they represent subject matter).

The theory is complex, and although there are some nice, broad parallels between how we experience pictures and language, there are some serious disanalogies. Take a landscape photograph. The photo delivers a visual experience of a view; a sentence describing a landscape does not. This difference in the experience seems (to me) to be not just a difference in degree – of symbolic ‘denseness’ and visual ‘repleteness’ – but one of kind. It is a different kind of experience. Also, the way in which we learn to ‘read’ a picture is radically different from how we learn to read a language. Children can identify subjects in pictures from a very early age without ‘learning’ the ‘symbol system’. We can work out unfamiliar subject matter from unfamiliar representations of it. None of this is possible with even the simplest written language or symbol set without a good deal of work and practice. To speak of a ‘visual language’ in anything more than very general terms (such as in discussing art and artistic practice) seems misplaced.

In the semiotic view, any object (including any picture) – just like any surface marking (including letters, numbers, sentences, pictograms and symbols) – can represent anything we wish to denote, by convention. We simply baptise things and supply a key to inform others. I think it is in this respect that the counterintuitive nature of the view can be seen most sharply. For example, surely no convention could get us to see the Mona Lisa as visually representing anything other than a woman. It is difficult to imagine a convention in which we understand the Mona Lisa as visually representing three Siberian garage mechanics.



Make-Believe Theory

Detail of dodo bird and Alice in Wonderland panel on a fairground ride

The philosopher Kendall Walton developed the make-believe theory of pictorial representation, which holds that when we look at a picture we are participating in a fiction. The successful picture is a prop which allows the viewer to enter into a ‘game of make-believe’. Pictures represent their subject matter by our (correctly) making believe what is represented is actually there before us. Resemblance plays no part and, like the illusion view, it seems to capture well the role imagination plays in understanding pictures.

A seemingly fatal problem arises, though, in identifying exactly what is to be make-believedly seen when we look at a picture. For to get this right we need to know what the picture represents – and this is precisely what the theory is meant to be explaining. Also, like the semiotic view above, make-believe has difficulties in explaining how, for example, young children can learn to recognise objects by first seeing them in picture books.



Seeing-In Theory

Seeing-in is a psychologically based theory, developed by Richard Wollheim, which places visual experience at the centre of pictorial representation and exploits the human mind’s innate capacity to generate visual experiences out of itself. Think of the last time you lay on your back on a summer day and mused on the passing clouds overhead. Seeing a face in the clouds, like seeing a man in the moon or a battle scene in the stains on a wall, is an example of seeing-in. The difference between seeing a man in the moon and a man in a picture, though, is that there is a correct way to understand the picture, which is set by the intentions of the person who made it, in conjunction with our observation of the marks on the surface of the picture (inkjet dots, paint blobs, pencil lines, and so on).

Our natural capacity for seeing-in – seeing things in pictures which may be absent from view (such as the puppeteer who is in Paris, or, 16 years after the photograph was taken, is probably no longer standing there) or fictional things (such as a unicorn) – is based on what Wollheim calls ‘twofoldness’. Twofoldness is our ability to see simultaneously the surface marks of a picture and the effects of these marks (the subject matter) as two elements of one and the same perceptual experience. Note that there is no 'flickering' here.

Wollheim contrasts this psychological account with the semiotic theory. He says that we don’t ‘read’ a picture to perceive what is visually represented; seeing-in is more basic and logically prior to this. For example, in a painting we may ‘read’ the lamb at the foot of a cross as a symbol for Christ, but that ‘reading’ cannot build into the experience of representational seeing because we must first see the lamb in the picture before we can recognise it as a symbol.

The weakness of the seeing-in view lies in whether you are willing to accept the ‘twofoldness’ of visual experience. It is difficult to see how you can say much more about this at present. Wollheim himself is openly sceptical of any deeper explication of twofoldness through either psychology or biology.

The strength of the seeing-in thesis is its bedrock grounding in visual experience. We do have an innate ability to generate visual experiences: dreams are the most dramatic examples; seeing faces in clouds seems a good waking example. There is no need to deal with the seemingly slippery concepts of resemblance, illusion and make-believe, and it avoids the unintuitive aspects of the semiotic view.

The seeing-in view also packs a bonus. Some pictures are difficult to understand. In art, conceptual, abstract, cubist and naive pictures, for example, can take time and work to understand and see what is represented. Similarly, what is represented in some ‘unusual’ photographs can also be difficult to understand – such as photographs produced using long exposures, multiple exposures, strobe lighting, microscopes, unusual light sources or digital manipulation (which don’t really look like their subjects under ‘normal viewing conditions’). The resemblance and illusion views often find it difficult or impossible to explain how we see what is represented in these difficult types of pictures – some of which simply stand marooned outside of the theories.

Because the seeing-in theory states that the picture maker sets an ‘intentional standard of correctness’ (either consciously or unconsciously), this gives us a firm position from which to begin in trying to understand difficult or obscure pictures. The measure of success for a photographer or other picture maker is the extent to which we (ordinary viewers) can understand what their picture represents.


And so …

I find the seeing-in thesis a persuasive and natural account of how photographs and other pictures represent things in the world and outside it. I think it captures in a natural manner how we experience ordinary, realistic photographs, and it makes sense of how we come to understand more unusual and difficult non-record ones. The twofoldness of perceptual experience is still an unknown quantity, but one I am willing to accept until it is disproved or superseded by some further refinement.

Is this your experience of photographs?


Jim Batty


Text and photographs copyright © 2021 by Jim Batty. All rights reserved.

Feel free to link to or share this page. No part of this work may be reproduced, stored in a retrieval system, or transmitted, in whole or in part, in any form or by any means, without signed written permission from the author. A standard publishing fee is payable in advance for any editorial or commercial use.


For a fuller philosophical treatment of photographic understanding and the role of intention, try my MPhil thesis, An Intentional Understanding of Photographs (2002).
