Tuesday, December 11, 2007
Last week (3-6th December, 2007) I attended a Mini-Symposium on Representations of the Visual World in the Brain organised by the Rank Prize Funds. The symposium was a small gathering of highly respected established researchers and young researchers across the field of Visual Science. Attendees included Ron Rensink (UBC), Nancy Kanwisher (MIT), Mike Land (Sussex), Ben Tatler and Ben Vincent (Dundee), Jens Helmert (Dresden), Jim Brockmole (Edinburgh), Melissa Vo (Munich), and many others from across Europe and North America.
The symposium was an incredibly stimulating, intense experience and I have to express my immense gratitude to the Rank Prize Funds for organising it (aside: the Fund was established by the late Lord Rank to support scientific research in his two main interests, Nutrition and Optoelectronics; Rank was also the founder of the Rank Organisation film company....a rather coincidental overlap with my interests). The symposium was held in the wonderfully picturesque Wordsworth Hotel, Grasmere, Cumbria. Not that I got to appreciate much of it as I was too busy being intellectually stimulated.
My enjoyment of the symposium was rounded off by my being awarded the Prize for the Best Presentation by a Young Researcher for my presentation entitled 'Facilitation of Return'. It is a great compliment for the work I am doing with John Henderson to be acknowledged by such a distinguished group of researchers.
Thursday, November 15, 2007
Richard Wiseman, the magician, psychologist, science communicator and author of Quirkology: the Curious Science of Everyday Lives asked me and John Henderson to measure viewers’ eye movements whilst they watched one of his magic tricks. The Colour Changing Card Trick is a very clever use of a perceptual phenomenon known as ‘inattentional blindness’. Check out the trick before reading on:
Inattentional Blindness is an absence of awareness of some detail or event in the visual world due to a failure to attend to it. This absence can often be strikingly large, as in Richard’s card trick or, most famously, in Simons & Chabris’s (1999) ‘Gorillas in our Midst' experiment. In the Simons & Chabris experiment subjects were told to watch a video of two teams, one wearing white, the other wearing black, passing basketballs within their teams. Half of the subjects were told to count the number of passes made by the team wearing white. The other half were told to count the passes made by the team wearing black. Half way through the video a man wearing a black gorilla suit walked through the scene, stopped in the middle of the scene, waved at the camera and then walked out of shot. When asked after the video if they had noticed anything odd, the majority of subjects failed to report the gorilla! The probability of noticing the gorilla was greater when the subjects had been instructed to attend to the black team (58% detection) compared with the white team (27%), indicating that the task had biased the subjects’ attention either towards black or white objects. The subject’s selective attention shapes the details of the scene that reach the level of conscious awareness and subsequent memory but, importantly, the subject is not aware that their awareness is in any way partial. This mismatch between what the subject thinks they see and what they actually see is what creates the shock at the end of the gorilla experiment or the Colour Changing Card Trick.
The Colour Changing Card Trick uses a simple card trick to distract viewer attention from what is actually going on, namely the changing of both presenters’ T-shirts, the backdrop, and the table cloth. Such misdirection is a classic tool of any magic performance. All changes are made off camera when the continuous camera shot zooms in to a close-up. This removes the actual change itself from view, leaving only the result. In order for viewers to notice the change they must have previously attended to the object that has changed and have sufficient memory of that object to notice that its current form is different. By measuring viewer eye movements during the trick we can see whether viewers attend to the objects before the changes and whether there is any increase in attention to the objects after the change. Such an increase may indicate the precise moment at which the viewer notices the change.
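The before/after comparison described above can be sketched in a few lines of Python. This is purely an illustration, not the actual analysis: the gaze samples, the change time, and the region-of-interest box for the changed T-shirt are all invented example values.

```python
# A minimal sketch of counting attention to a changed object before and after
# the change. Gaze samples, the ROI box, and the change time are hypothetical.

def in_roi(x, y, roi):
    """Return True if the gaze point (x, y) falls inside the ROI box."""
    left, top, right, bottom = roi
    return left <= x <= right and top <= y <= bottom

def attention_to_roi(samples, roi, change_time):
    """Count gaze samples on the ROI before and after the change."""
    before = sum(1 for t, x, y in samples if t < change_time and in_roi(x, y, roi))
    after = sum(1 for t, x, y in samples if t >= change_time and in_roi(x, y, roi))
    return before, after

# Hypothetical gaze samples: (time in seconds, x, y) in screen pixels.
samples = [(0.5, 100, 100), (1.0, 420, 310), (2.5, 430, 305), (3.0, 50, 40)]
shirt_roi = (400, 280, 480, 340)          # assumed box around the T-shirt
before, after = attention_to_roi(samples, shirt_roi, change_time=2.0)
```

A sustained rise in the "after" count relative to the "before" count would be the kind of signal that might mark the moment of noticing.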
The results are still being analysed but for now Richard has posted a video illustrating the eye movements of 9 subjects whilst they watched the trick (5 men and 4 women). The one red spot is the gaze location of a woman who detected the Female presenter’s T-shirt change.
This video was created using my own Gazeatron software.
As can be clearly seen from the video most viewers look in roughly the same parts of the scene at the same time. This close control over where viewers are looking is exactly the intention of the magician. By ensuring such systematic viewing the magician can hide changes/manipulations in the unattended spaces. It is only once one of the viewers notices the change (the red spot) that their gaze location begins to differ from everyone else’s.
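One simple way to put a number on the "swarming" visible in the video is the dispersion of gaze across viewers at each moment: the mean distance of each viewer's gaze from the group centre. The coordinates below are made up for illustration.

```python
# A rough sketch of quantifying how tightly viewers' gaze clusters at one
# moment in time. All coordinates are invented example values.
from math import hypot

def gaze_dispersion(points):
    """Mean Euclidean distance of each gaze point from the group centroid."""
    cx = sum(x for x, y in points) / len(points)
    cy = sum(y for x, y in points) / len(points)
    return sum(hypot(x - cx, y - cy) for x, y in points) / len(points)

# Tight clustering while the magician holds everyone's attention...
tight = gaze_dispersion([(300, 200), (305, 198), (298, 203)])
# ...versus one viewer (the "red spot") looking elsewhere after noticing.
spread = gaze_dispersion([(300, 200), (305, 198), (500, 400)])
```

A sudden jump in this dispersion measure would correspond to the moment a viewer's gaze breaks away from the pack.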
If you want to know more about the psychology of misdirection and its relationship to eye movements check out Gustav Kuhn’s research at the
Tuesday, July 10, 2007
The technology available for recording the direction of a person’s gaze has taken massive leaps forward in the last few years. The cost, precision, usability and, most importantly, comfort of eye trackers have now reached a level where any psychology research lab or usability/HCI assessment centre can use eye tracking. Recording the focus of a person’s overt attention (as well as other measures such as pupil dilation, blink rate and eyelid closure) can provide a real-time measure of their experience and an indication of their cognitive processes. Such an insight can complement existing methodologies and allow researchers to understand the dynamics of human experience.
At the recent Experimental Psychology Society conference held here in Edinburgh, eye tracking was clearly rising in popularity as many different researchers applied it to areas such as reading studies, lexical processing of speech, facial expression recognition, social attention, object perception, working memory, visual search, scene perception, and, of course, attention research. The current popularity of eye tracking can be directly related to the advances in the technology. Most of the researchers currently using eye tracking are not primarily attention researchers. The tools provided by the eye trackers allow them to use eye movements as an index of other phenomena such as real-time cognitive processes. Previously, a research lab would be required to build its own display and analysis tools from scratch (using programming environments such as C or Matlab). This placed eye tracking research clearly out of reach for most people and required a considerable understanding of basic oculomotor control and attention. I believe the current renaissance of eye tracking (of which
One big obstruction to the advancement of eye tracking into other areas of psychology is its current incompatibility with dynamic scenes. Many areas of psychology are interested in understanding human behaviour in realistic settings such as social interactions, conversations in the real-world, moving through the world, performing actions, complex tasks such as driving, and even watching TV (or is that just me :). As anybody who has ever tried to record eye movements in one of these settings will know, it is phenomenally complicated and time consuming. For example, Mike Land’s influential research on goal-oriented attention during tea making required the construction of novel eye movement technology and the hand coding of every frame of the resulting video! It is no wonder that eye tracking of dynamic scenes is so uncommon when the analysis is so laborious.
The situation has recently got a lot better with the introduction of new eye tracking software. The Tobii and Eyelink (SR Research) eye tracking systems now come with software for displaying dynamic scenes (e.g. videos, animations, etc). However, being able to display the videos is pointless if there is no easy way to analyse the resulting data. No systems that I am currently aware of assist in the analysis of eye movements in dynamic scenes.
This lack of support for eye movement researchers interested in using dynamic stimuli has motivated me to make a call-to-arms. I know of a growing number of researchers, both in academia and industry, who are struggling with the problems, both practical and theoretical, associated with recording eye movements in dynamic scenes. No support structure exists for these researchers, no common source of knowledge or tools, and nowhere they can go to ask for help. Because of this I’ve decided to put out a call to all researchers using or interested in performing eye tracking in dynamic scenes. The Eyetracking and Dynamic Scenes [EDS] Interest Group will comprise a mailing list to which members can post information relating to their research, ask for help, and post useful resources (e.g. software) and references.
If you are interested in signing up for the Eyetracking and Dynamic Scenes [EDS] mailing list, please e-mail me at tim.smith [at] ed.ac.uk with the subject “[EDS] registration” and include the following content in the body of your e-mail:
Position e.g. Research Fellow
Affiliation e.g. name of University or business
Short summary of your research interests.
Existing eye tracking equipment you use.
Together we can make the experience of researching eye movements in dynamic scenes pleasurable, practical, and painless.
Saturday, March 17, 2007
I’ve been invited to give a presentation to the Centre of Cognitive Neuroscience and Cognitive Systems at the
The following morning I’m giving a guest lecture as part of Murray Smith’s Cognition and Emotion in Film course. This sounds like a fantastic course. I wish there had been a similar course when I was an undergraduate. Murray Smith is a very active member of the cognitive film theory community and particularly the Society for Cognitive Studies of Moving-Images. His work on emotion and empathy in film viewing is very influential.
I’m looking forward to bouncing ideas around with his students.
Now that David has alluded to my findings and given a brief description of my presentation you’re probably interested in finding out more. Sadly, I’m going to have to ask you to watch this space just a while longer. As is the way in academia, the publication of academic papers takes a long time and the paper that describes my findings is not yet ready for public distribution…..I know, I know: I’m a big tease. I promise you it will be worth the wait and I’ll publish the paper on my blog as soon as it is available.
In the meantime I can give you a glimpse of the “little yellow dots” David refers to.
This image is a screengrab of a software tool I created called (rather cheesily) Gazeatron. The tool allows me to plot the gaze position of multiple viewers onto the video they were viewing. The image above illustrates where 17 people were looking during this frame of the film (the yellow spots were viewers who could hear the audio and the pink were viewers who could not). Gazeatron allows you to see the same data in real-time as the video plays. By observing the swarming behaviour of the gaze positions whilst multiple viewers watch a film you gain an incredibly detailed insight into the viewing experience. Gazeatron also provides automated analysis of features of the eye movements to provide objective measures to supplement the subjective observations.
Existing eye tracking tools do not allow you to analyse film viewing in this way and, I would argue, reducing viewer attention to a film to static screenshots or data points does not give you a feel for the dynamics of the viewing experience. I’ll work on posting a video of Gazeatron so you can all see what I mean.
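To give a flavour of what a tool like this does at its core (this is a toy sketch, not Gazeatron's actual code), the basic overlay step is: for each video frame, stamp a coloured disc at every viewer's gaze position. Here the "frame" is just a grid of RGB triples; in a real tool the frames would come from a video decoder, and the frame size, colours, and spot radius below are arbitrary example choices.

```python
# Toy sketch of a gaze-overlay step: draw a filled disc of a given colour at
# each viewer's (x, y) gaze position on a frame of RGB pixel triples.

def overlay_gaze(frame, gaze_points, colour, radius=2):
    """Stamp a filled disc of `colour` at each (x, y) gaze point on `frame`."""
    height, width = len(frame), len(frame[0])
    for gx, gy in gaze_points:
        for y in range(max(0, gy - radius), min(height, gy + radius + 1)):
            for x in range(max(0, gx - radius), min(width, gx + radius + 1)):
                if (x - gx) ** 2 + (y - gy) ** 2 <= radius ** 2:
                    frame[y][x] = colour

YELLOW, PINK = (255, 255, 0), (255, 105, 180)
frame = [[(0, 0, 0) for _ in range(64)] for _ in range(48)]   # tiny black frame
overlay_gaze(frame, [(10, 10), (12, 11)], YELLOW)   # viewers with audio
overlay_gaze(frame, [(40, 30)], PINK)               # viewers without audio
```

Repeating this for every frame, with each viewer's gaze sample for that frame, gives the swarming-dots video described above.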
A bit of background on eye tracking. Each spot in the image above represents the point where a single viewer is looking. This is important as it tells us, roughly, the part of the visual field they are attending to and, therefore, processing at any moment in time. You may think you are aware of the whole visual field but in fact you are only able to process a very small portion to a high degree of accuracy at any one time. When you want to process a new part of the visual field you shift your eyes (perform a saccadic eye movement) so that the light from the new target is projected onto the region of highest sensitivity in the eye, referred to as the fovea. These saccadic eye movements are very quick and we are not aware of them as our brains “stitch” the images from either side of the movement together to create the impression of a stable visual world. By recording these eye movements we can infer the moment-by-moment experience of a viewer.
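Because saccades are so much faster than the drifts within a fixation, analysis software typically separates the two by velocity. The toy detector below illustrates the idea; the sampling rate, velocity threshold, and pixels-per-degree conversion are illustrative values I have picked for the example, not the settings of any particular tracker.

```python
# A toy velocity-threshold saccade detector: flag each inter-sample interval
# whose eye velocity exceeds a threshold. All parameter values are
# illustrative assumptions, not real tracker settings.
from math import hypot

def detect_saccades(samples, rate_hz=500, threshold_deg_s=30, px_per_deg=35):
    """Return True for each interval where velocity exceeds the threshold."""
    flags = []
    for (x0, y0), (x1, y1) in zip(samples, samples[1:]):
        deg = hypot(x1 - x0, y1 - y0) / px_per_deg      # amplitude in degrees
        flags.append(deg * rate_hz > threshold_deg_s)   # degrees per second
    return flags

# Steady fixation, then one large jump (a saccade), then steady fixation.
trace = [(200, 200), (201, 200), (202, 201), (400, 350), (401, 350)]
flags = detect_saccades(trace)
```

Real systems (including the Eyelink parser) use more sophisticated combinations of velocity and acceleration criteria, but the principle is the same.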
Eye movements can be recorded using a technique known as eye tracking. There are a variety of ways to track somebody’s eyes, such as the scleral search coil and the dual-Purkinje tracker (some clearly more scary than others). The most common technique used today, and the one I use, is corneal reflection. These trackers shine infrared lights onto the eye and film the reflected image using an infrared camera. By locating the pupil and the infrared light reflected off the cornea the gaze of the viewer can be calculated. The gaze is simply a vector pointing out from the viewer’s eye into space. Therefore, eye trackers can be used to tell us where people are looking on a computer screen, a table top, in a real-world interaction or….whilst watching a film.
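A heavily simplified sketch of that idea: the tracker measures the vector from the corneal reflection (the "glint") to the pupil centre in the eye-camera image, and a calibration procedure maps that vector to screen coordinates. Below, the mapping is a plain linear fit from two hypothetical calibration points; real systems use richer models and many more calibration targets.

```python
# Simplified corneal-reflection gaze mapping: a linear calibration from
# pupil-minus-glint vectors to screen coordinates. Calibration values and
# measurements are hypothetical examples.

def calibrate(eye_vecs, screen_pts):
    """Fit per-axis gain and offset from two calibration vectors."""
    (ex0, ey0), (ex1, ey1) = eye_vecs
    (sx0, sy0), (sx1, sy1) = screen_pts
    gx = (sx1 - sx0) / (ex1 - ex0)
    gy = (sy1 - sy0) / (ey1 - ey0)
    return gx, sx0 - gx * ex0, gy, sy0 - gy * ey0

def gaze_on_screen(pupil, glint, cal):
    """Map one pupil/glint measurement to a screen coordinate."""
    gx, ox, gy, oy = cal
    vx, vy = pupil[0] - glint[0], pupil[1] - glint[1]
    return gx * vx + ox, gy * vy + oy

# Hypothetical calibration: eye vectors (-8, -8) and (8, 8) correspond to the
# screen corners (0, 0) and (1024, 768).
cal = calibrate([(-8, -8), (8, 8)], [(0, 0), (1024, 768)])
x, y = gaze_on_screen(pupil=(105, 84), glint=(105, 84), cal=cal)  # zero vector
```

With a zero pupil-glint vector this calibration puts gaze at the centre of the screen, which is what the symmetric calibration points imply.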
The eye trackers I use are the Eyelink II head-mounted and the Eyelink 1000 tower mounted tracker, both from SR Research. These trackers are located around the
Tracking a viewer’s eyes whilst they watch a film is not as simple as you might think. The Eyelink trackers all come with software that allows you to present videos but they do not currently have accompanying tools for analysing the eye movement data in the way I’ve described above. Most other trackers do not provide assistance in presenting films and a lot of previous researchers have resorted to tracking viewers using a head-mounted real-world tracker and recording a video to see what they are looking at (a similar technique is used in driving studies). The only other tracker I have used that is suitable for presenting films is the Tobii. This system is incredibly easy to use as it is aimed at usability studies and at providing a hands-free interface for disabled users. The Tobii eye trackers are incredibly well designed but their price, ~£17,000, puts them out of the reach of most users (the price issue is the same with all eye trackers). Their accuracy is also not as good as the Eyelink systems’, which is why most vision researchers don’t use them.
If you’re looking for a cheaper option there is the option of building your own eye tracker. Derrick Parkhurst has developed open-source software and instructions for how to construct the necessary hardware to build your own eye tracker. The openEyes project is a great idea although I’m yet to have a go. If you have a go, best of luck and tell me how it goes.
If anybody has any further questions about eye tracking and film please either post a comment below or e-mail me.
As for what I have found by eye tracking film viewers well……that’ll still have to wait. Sorry. For the time being I hope you enjoy the picture of little yellow and pink spots. Who’d have thought seeing spots could be so useful!
Tuesday, March 13, 2007
I am writing this blog post from a hotel room in
Thanks to The Comm. Arts department for being so incredibly hospitable, special thanks to Jeff Smith for being my guide and David Bordwell for being so receptive to my imposition.
Now some background: the reason why this visit to
Who knows, maybe the technology will suddenly take both a cost and technological leap forward and it’ll become accessible to all. Watch this space……
Returning to SCMS, the conference was held in the Chicago Hilton (very swish….well, the lobby is anyway; the conference presentation rooms/bedrooms are a tad odd). I was a complete conference geek, attending almost every session. Considering that the days ran 8:15am-8pm this is quite an achievement. The reason I attended so many sessions was the incredible range of interesting presentations. Everything from a bit of Cognitive Film Theory (Jonathan Frome, Joe Kickasola, Mark Minett), masses on New Media, Interactive Media, and Videogames, emotions and film, including discussion of automatic facial expression recognition (Kelly Gates), and even a presentation on the Queering of Kevin Smith (it doesn’t take much ;)…Carter Soles). This year there seemed to be a lot of presentations on the impact that on-line distribution, web video, and interactive TV and media such as videogames were having on our classical theories of film and television. Fascinating stuff. One of the most satisfying panels for me debated the implications of interfaces for interactive TV content, e.g. TiVo and PVRs, and their effect on our relationship to the film/TV content. Does the interface, which is meant to empower the viewer by allowing them access to the content, actually compete with the content itself?
So, all-in-all a great conference and trip to
Monday, January 08, 2007
Following on from my discussion of Lars Von Trier’s Automavision and Lookey I thought you would like to see some examples of Automavision. David Bordwell, the author of a phenomenal number of outstanding books on the subject of film has written a very interesting blog post on the subject here. He has had the good fortune of viewing The Boss Of It All, unlike myself and capturing some screenshots. The effect is intriguing. Automavision appears to create unmotivated, and classically imperfect framings which Von Trier accentuates by cutting rapidly between very similar shots. Bordwell notes that the result is that almost every cut is a Jump Cut, a violation of the 30 degree rule that causes the image to jump uncomfortably and creates ambiguous temporal relationships between the shots. The same effect occurred in Dancer in the Dark (see a video example here) and can be said to have contributed to the overall discomfort felt by viewers of the film.
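The 30 degree rule mentioned above can be stated almost as a one-line check: if the camera's angle toward the subject changes by less than roughly 30 degrees between two shots (and the shot scale stays similar), the cut tends to read as a jump cut. A sketch of that check, with the 30-degree threshold as the conventional rule-of-thumb value:

```python
# A minimal check of the 30 degree rule: a cut between two camera setups whose
# angles toward the subject differ by under ~30 degrees tends to read as a
# jump cut. The threshold is the conventional rule-of-thumb value.
def is_jump_cut(angle_a_deg, angle_b_deg, threshold=30):
    """True if the change in camera angle falls under the threshold."""
    diff = abs(angle_a_deg - angle_b_deg) % 360
    return min(diff, 360 - diff) < threshold

near = is_jump_cut(10, 25)   # only 15 degrees of change: jumps
far = is_jump_cut(0, 90)     # a clear change of setup: reads as a normal cut
```

Von Trier's cutting between Automavision framings violates exactly this check, since consecutive framings of the same subject differ only slightly.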
What is most interesting about Von Trier’s use of Jump Cuts is that, whilst they abandon the classic continuity style’s preservation of temporal continuity within scenes, they still retain a clear cohesion that allows the viewer to understand the action represented. It is almost as if, in consciously violating the dimensions of continuity prescribed by the classic continuity style, Von Trier is revealing extra dimensions of continuity that he uses to create extra significance within his films.
Also, as noted by Antithesis Boy in his comment on my last post, the Automavision technique shares a lot of similarities with virtual camera control in videogames. Automavision uses a computer to randomly generate framings for the camera and the result is non-classical framings. Videogames often take place in a 3D virtual environment and require a virtual camera to follow the action within this space. This camera is computer controlled and the objective is to create the best framings possible (e.g. Jhala and Young, 2006), but the result is often unacceptable, bizarre framings. The difference between the two systems is that a Virtual Camera is positioned relative to the objects within a scene whereas Automavision (as far as I can make out) does not care what the scene is. This is why Automavision loses significant objects off the edge of the screen or frames them oddly. In his blog post, Bordwell notes that Von Trier does not always choose the most outrageous framing generated by Automavision, indicating that he realises that the unconventional framings have a particular effect on the viewer and should be mixed with more conventional compositions to create the intended viewing experience. If Automavision is truly random, as implied by Von Trier, then there is no way to control the degree to which the resulting framings are unconventional. This means that Von Trier must keep refilming and regenerating framings with Automavision in order to get suitable shots.
The next generation Automavision system could improve its ability to generate unconventional framings by incorporating some of the intelligence of Virtual Cameras. If it is able to apply the classical framing conventions then it can knowingly violate them. This would also allow it to modify the conventionality of its framings by varying the relative influence of chance and the framing conventions. To do this it would need to begin processing the visual scene, which is no easy task, and deciding where the significant objects should be positioned in the frame. An interesting reverse application of this technique would then be to apply Automavision 2.0 to a videogame. Anybody up for a game of Halo directed by Lars Von Trier?
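The "Automavision 2.0" idea can be sketched playfully: blend a conventional framing (here, placing the subject on a rule-of-thirds power point) with a purely random one, with a weight controlling how unconventional the result is. All of this is speculative illustration on my part, not how Von Trier's actual system works, and the frame dimensions are arbitrary.

```python
# Speculative sketch of "Automavision 2.0": interpolate between a conventional
# rule-of-thirds framing and a random framing, weighted by `chance`.
# Frame size and the left-third power point are arbitrary assumptions.
import random

def conventional_framing(subject, frame_w=1024, frame_h=576):
    """Position the frame so the subject lands on the left-third power point."""
    return subject[0] - frame_w / 3, subject[1] - frame_h / 3

def automavision_2(subject, chance=0.5, frame_w=1024, frame_h=576, rng=random):
    """Blend a conventional and a random framing; chance=1 ignores the subject."""
    cx, cy = conventional_framing(subject, frame_w, frame_h)
    rx = rng.uniform(-frame_w, frame_w)
    ry = rng.uniform(-frame_h, frame_h)
    return (1 - chance) * cx + chance * rx, (1 - chance) * cy + chance * ry

# chance=0 reproduces the textbook framing; chance=1 is pure Automavision.
x, y = automavision_2(subject=(900, 400), chance=0.0)
```

Varying `chance` over the course of a scene would give the director exactly the dial between conventional and outrageous framings that the paragraph above imagines.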