Experimental Procedure

In the version of the “visual-world” paradigm used here, participants see a visual display consisting of four black-and-white line drawings representing four distinct objects, arranged within a grid. An utterance presented through headphones mentions one of the objects. The participants’ task is to click on the mentioned object and move it to another location on the screen, using the computer mouse.

Participants’ eye movements are monitored throughout the trial using a head-mounted eye tracker. (We are currently using the EyeLink II eye tracker, which samples eye coordinates at a frequency of 250 or 500 Hz.) The eye tracker consists of two miniature cameras that record the size of the pupil and the corneal reflection for each eye (which allows us to infer the position of the eye with respect to the head), as well as a camera that records the light reflected by a set of markers positioned on the monitor that displays the visual stimuli. By computing the position of the eyes with respect to the position of the markers, we can infer, in real time, the direction of the participant’s gaze within a predefined area (the screen on which the visual stimuli are displayed).
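
As a small worked example of what the two sampling rates mean in practice, the sketch below converts a sampling rate into the time between successive eye-position samples (4 ms at 250 Hz, 2 ms at 500 Hz). The function names are hypothetical and chosen for illustration; they are not part of the eye tracker's software.

```python
def sample_interval_ms(rate_hz):
    """Time between two successive samples, in milliseconds."""
    return 1000.0 / rate_hz


def sample_time_ms(index, rate_hz):
    """Time of the sample at position `index`, counting from 0 at the first sample."""
    return index * sample_interval_ms(rate_hz)


for rate in (250, 500):
    print(rate, "Hz:", sample_interval_ms(rate), "ms between samples")
```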

The eye movements we are interested in are those generated as participants hear the name of the picture they must click on, that is, from the onset of the spoken word candle (taken to be time 0). At that point, they may be fixating any picture on the screen (e.g., the picture of the necklace). As the speech signal unfolds and people gain access to the sounds of the picture’s name, they typically initiate an eye movement to a picture with a name that is consistent with the spoken information available so far. For example, the eyes may move to the picture of the candle, landing on it at time 60, moving away from it at time 340, landing on the picture of the candy at time 376, moving away from it at time 598, and finally landing on the target picture candle at time 642.
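
One way to picture such a trial is as a list of fixation records, each with a picture name, a landing time, and a departure time. The sketch below encodes the example sequence using the time values given in the text; the `Fixation` class and the `picture_at` helper are hypothetical names introduced for illustration, not the format actually produced by the eye tracker. The initial fixation (e.g., on the necklace) is left out because the example gives no times for it.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Fixation:
    picture: str          # name of the fixated picture
    start: int            # time the eyes land on the picture (0 = word onset)
    end: Optional[int]    # time the eyes move away; None if never left in the record


# The example sequence from the text, timed from the onset of the spoken word.
trial = [
    Fixation("candle", 60, 340),
    Fixation("candy", 376, 598),
    Fixation("candle", 642, None),
]


def picture_at(fixations: List[Fixation], t: int) -> Optional[str]:
    """Return the picture fixated at time t, or None if no recorded fixation
    covers t (for instance, while the eyes are moving between pictures)."""
    for f in fixations:
        if f.start <= t and (f.end is None or t < f.end):
            return f.picture
    return None


print(picture_at(trial, 100))  # candle
print(picture_at(trial, 400))  # candy
print(picture_at(trial, 350))  # None (eyes in transit between pictures)
```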

Thus, the pattern and timing of the eye movements performed as the name of the target picture is heard are taken to reflect the ongoing interpretation of the spoken word. For instance, in the example above, both the candy and candle interpretations appear to be entertained simultaneously for a brief period of time.

If we average a large number of trials from many participants, we can generate a plot showing how the proportion of fixations to each of the pictures changes over time. Here, the probabilities of fixating the candy and candle pictures increase simultaneously, while the probabilities of fixating the pear and necklace pictures decrease. Later, presumably when the spoken input begins to disambiguate between the candle and candy interpretations, the probability of fixating the candle picture continues to increase while that of fixating the candy picture begins to decrease.
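
The averaging step can be sketched as follows: at each time point of interest, count the trials in which each picture is being fixated and divide by the number of trials. This is a minimal illustration under stated assumptions, not our actual analysis code. Each trial is assumed to be a list of (picture, start, end) tuples, the function names are hypothetical, and the four trials are made up purely to show the computation.

```python
from collections import Counter

PICTURES = ["candle", "candy", "pear", "necklace"]


def fixated_picture(trial, t):
    """Return the picture fixated at time t in one trial, or None."""
    for picture, start, end in trial:
        if start <= t < end:
            return picture
    return None


def fixation_proportions(trials, times):
    """For each time point, compute the proportion of trials in which
    each picture is being fixated."""
    result = {}
    for t in times:
        counts = Counter(fixated_picture(trial, t) for trial in trials)
        result[t] = {p: counts[p] / len(trials) for p in PICTURES}
    return result


# Made-up trials for illustration only; these are NOT real data.
trials = [
    [("necklace", 0, 200), ("candle", 250, 1000)],
    [("pear", 0, 180), ("candy", 220, 500), ("candle", 550, 1000)],
    [("candle", 0, 1000)],
    [("necklace", 0, 300), ("candy", 340, 600), ("candle", 640, 1000)],
]

props = fixation_proportions(trials, [100, 400, 800])
for t, by_picture in props.items():
    print(t, by_picture)
```

With real data, the same computation would be run over many trials and participants and at finely spaced time points, and the resulting proportions plotted as one curve per picture.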

Armed with this procedure for collecting and analyzing eye-movement data, we can address a large range of questions on spoken-word processing (see Publications for further detail).
