Scene Classification by Humans and Artificial Vision Systems

Dr. Lester Loschky

Department of Psychology

Dr. Loschky's work revolves around improving artificial vision systems. Recent work in his laboratory has investigated the relative importance, for scene classification, of information from the center of the field of view versus the visual periphery. This question bears on the use of multi-resolutional imagery, in which the highest image resolution is placed at the center of vision and lower resolution in the visual periphery, as a means of economizing on image-processing resources and transmission bandwidth (Loschky, McConkie, Yang, & Miller, 2005; Reingold et al., 2003). Our research has shown that imagery at the center of vision is more useful, on a per-pixel basis, than imagery in the visual periphery. Specifically, viewers need roughly twice as many pixels in the periphery as in the center of vision to achieve equal scene classification performance. To reach these conclusions, we presented scene images to viewers in which either central or peripheral scene information was blocked from view, as shown in Figure 1.
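As a rough illustration of the multi-resolutional idea, the sketch below (in Python with NumPy) composites a full-resolution center with a block-averaged periphery. The function name, the downsampling factor, and the grayscale-image assumption are ours for illustration and are not part of the study's method.

import numpy as np

def two_level_multires(image, cx, cy, radius, downsample=4):
    # image: H x W grayscale array; cx, cy, radius define the high-resolution
    # central region; downsample is the factor of resolution loss in the periphery.
    img = image.astype(float)
    h, w = img.shape

    # Crude low-resolution periphery: block-average, then repeat back to full size.
    hb, wb = h - h % downsample, w - w % downsample
    lo = img[:hb, :wb].reshape(hb // downsample, downsample,
                               wb // downsample, downsample).mean(axis=(1, 3))
    lo = np.repeat(np.repeat(lo, downsample, axis=0), downsample, axis=1)
    lo = np.pad(lo, ((0, h - hb), (0, w - wb)), mode="edge")

    # Keep full resolution inside the circle, low resolution outside.
    yy, xx = np.mgrid[0:h, 0:w]
    inside = (xx - cx) ** 2 + (yy - cy) ** 2 <= radius ** 2
    return np.where(inside, img, lo)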

Figure 1

In the “window” conditions, viewers saw normal image information only within a circular region at the center of the image; all information outside that circle was blocked from view, i.e., replaced with gray. The “scotoma” conditions were the reverse: viewers saw normal scene information only outside a circular gray blocked-out region at the center of the image. Various sizes of the exposed region were explored, and the critical radius (illustrated in Figure 1) was the one at which a window condition containing 30% of the image pixels produced performance equal to a scotoma condition showing the remaining 70% of the image pixels.
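The geometry behind the 30%/70% split can be made concrete: a centered circle covering a fraction f of an image's pixels has radius sqrt(f·W·H/π). The sketch below (Python/NumPy; the example image size and the gray fill value are illustrative assumptions, not values taken from the study) computes that radius and builds the corresponding window and scotoma stimuli.

import numpy as np

def critical_radius(width, height, pixel_fraction=0.30):
    # Radius of a centered circle covering the given fraction of image pixels.
    return np.sqrt(pixel_fraction * width * height / np.pi)

def window_and_scotoma(image, radius, fill=128):
    # window:  normal pixels inside the circle, gray outside
    # scotoma: gray inside the circle, normal pixels outside
    h, w = image.shape
    yy, xx = np.mgrid[0:h, 0:w]
    inside = (xx - w / 2) ** 2 + (yy - h / 2) ** 2 <= radius ** 2
    return np.where(inside, image, fill), np.where(inside, fill, image)

# For example, in a 1024 x 768 image a centered circle holding 30% of the
# pixels has a radius of about 274 pixels.
r = critical_radius(1024, 768)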

These results were observed for ground-based imagery, but different viewing perspectives may have different requirements for optimal scene classification. Specifically, it is possible that classifying aerial views of scenes requires a broader view, with more peripherally located information, whereas classifying oblique ground-based views requires a narrower view, with more centrally located information.

In addition, the methods used to keep subjects' visual attention centered may affect the size of the peripheral view necessary for classification. We hypothesize that the more broadly a subject distributes their attention, the larger the critical radius will be. We therefore propose to conduct further research using an eye tracker to ensure that subjects are fixating the center of the screen at the moment the image is flashed on every trial. We are requesting funding to purchase an eye tracker and a dedicated computer to run its software.
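A gaze-contingent trial of the kind described might be gated roughly as follows. This is only a sketch: get_gaze_position, the fixation tolerance, the hold duration, and the polling rate are hypothetical placeholders rather than any particular eye tracker's API or the laboratory's actual protocol.

import time

def fixation_gated_onset(get_gaze_position, screen_center,
                         tolerance_px=30, hold_s=0.2, poll_hz=250):
    # Block until gaze has stayed within tolerance_px of screen_center for
    # hold_s seconds, then return so the scene image can be flashed.
    # get_gaze_position stands in for whatever the tracker's API provides.
    held_since = None
    while True:
        gx, gy = get_gaze_position()
        on_center = ((gx - screen_center[0]) ** 2 +
                     (gy - screen_center[1]) ** 2) <= tolerance_px ** 2
        now = time.monotonic()
        if on_center:
            if held_since is None:
                held_since = now
            elif now - held_since >= hold_s:
                return  # central fixation confirmed; present the stimulus
        else:
            held_since = None
        time.sleep(1.0 / poll_hz)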
