The cognitive interview: Does it successfully avoid the dangers of forensic hypnosis?



University of Pennsylvania School of Medicine

Seventy-two undergraduates viewed a videotape of a bank robbery that culminated in the shooting of a young boy. Several days later, participants were interviewed about their recollection of events in the film through baseline oral and written narrative accounts followed by random assignment to a hypnosis (HYP) condition, the cognitive interview (CI), or a motivated, repeated recall (MRR) control interview. Participants also completed a forced interrogatory recall test, which indexed potential report criterion differences between the interview conditions. In terms of information provided for the first time during treatment interviews, HYP led to greater productivity than the CI or the MRR interview, which did not differ significantly from each other. Evidence that these differences in recall resulted primarily from report criterion differences rather than differences in accessible memory was obtained from the forced interrogatory recall test. In this test, no differences were observed between the three interview conditions. Finally, the data revealed that participants’ hypnotic ability was associated with the recall of erroneous and confabulatory material for those tested in the HYP and CI conditions but not those in the MRR condition. This suggests that some CI mnemonics may invoke hypnotic-like processes in hypnotizable people.

Eyewitness accounts provide a crucial source of evidence in criminal investigations. Accordingly, the law enforcement community has had a longstanding interest in the development of techniques to increase the accuracy and detail of reports by eyewitnesses and victims of crimes. Forensic hypnosis is one such technique, with a number of advocates (e.g., Arons, 1967; Hibbard & Worring, 1996; Kroger & Douce, 1979; Niehaus, 1998; Reiser, 1980; Schafer & Rubio, 1978) who have provided persuasive anecdotal evidence of its value for enhancing eyewitness recall. Nevertheless, their claims have proven difficult to evaluate because the criteria for determining how helpful hypnosis was are subjectively determined and vary from case to case, much of the information reported consists of recollections that cannot be verified, and the new information obtained might have been provided eventually without the use of hypnosis.



The majority of laboratory findings regarding the effectiveness of hypnosis for memory enhancement are also at odds with the claims of supporters of forensic hypnosis. Many controlled studies found hypnosis to have no unique hypermnesic properties, and often it yields less accurate information than normal waking recall efforts (Dinges et al., 1992; Dywan & Bowers, 1983; Nogrady, McConkey, & Perry, 1985; Sheehan, 1988). A common interpretation of these findings is that hypnosis elicits additional recall by relaxing the hypnotized person’s report criterion or standard of certainty, causing him or her to become willing to report doubtful information that might otherwise have been withheld (Klatzky & Erdelyi, 1985; Orne, Whitehouse, Dinges, & Orne, 1988; Whitehouse, Dinges, Orne, & Orne, 1988). That is, various features of the induction, experience, and context of hypnosis encourage people to suspend their critical judgment, which in turn allows them to report a greater number of recollections, memory fragments, and inferences with certainty. Of course, there are some instances in which such information might facilitate a criminal investigation by providing new leads that can be corroborated by other, independent evidence. At other times, however, an eyewitness’s hypnotically derived recall might actually hamper an investigation (e.g., State v. Mack, 1980) or lead to the arrest of an innocent person (see Connors, Lundregan, Miller, & McEwen, 1996). Moreover, because jurors often are persuaded of the truthfulness of witnesses by the amount of detail provided and the confidence with which testimony is given in court, hypnotically influenced testimony presents a serious challenge to the judicial system (Orne, 1979). Thus, jurors become burdened with the responsibility of determining whether the additional information elicited by hypnosis is factual or erroneous, a distinction that previously hypnotized witnesses often are unable to make. This has led some courts to confine the testimony of witnesses exposed to hypnosis to only the information known to them before hypnosis. Unfortunately, even this determination cannot be made reliably (Whitehouse, Orne, Orne, & Dinges, 1991). In view of these concerns, the majority of US state supreme courts that have considered the question of hypnotically elicited testimony have either banned or severely restricted its admissibility in court.

As an alternative to the use of hypnosis in criminal investigations, Geiselman et al. (1984) developed the cognitive interview (CI), a promising interrogation procedure that combines a number of retrieval mnemonics that have been validated in laboratory studies of human memory. The objectives of these mnemonics are to maximize the featural overlap of the retrieval context with the memory trace of the incident in question and to encourage the witness to use several memory search strategies (i.e., mentally reconstructing the circumstances that surrounded the incident, including cognitive, physiological, and emotional states associated with the original event; reporting every detail, regardless of its perceived importance; recalling events in a different order from the way in which they occurred; and adopting the perspective of someone else involved in the incident in order to describe the events from that person’s point of view). At relevant points in the interview, more specific prompts may be suggested to help the witness to remember, for example, the suspect’s appearance and speech or other important details such as a license plate number.

Follow-up research in the United States and Europe has been carried out to determine the active components of the CI (Memon, Cronin, Eaves, & Bull, 1996), its efficacy in actual criminal investigations (Fisher, Geiselman, & Amador, 1989) and with child, adult, and older adult eyewitnesses (Mello & Fisher, 1996; Memon, Holley, Wark, Bull, & Köhnken, 1996; Saywitz, Geiselman, & Bornstein, 1992), and the extent to which certain refinements, such as the addition of rapport-building strategies, use of imagery, and attention to the witness’s emotional circumstances and communication skills, have further increased the effectiveness of the CI (Fisher, Geiselman, Raymond, Jurkevich, & Warhaftig, 1987; Fisher & Geiselman, 1992; Geiselman & Fisher, 1997). These latter modifications, which have come to be known collectively as the enhanced CI (ECI), were intended to supplement the cognitive components of the CI with techniques that focus on optimizing the communication aspects of the interview (e.g., transfer of control from the interviewer to the witness, not interrupting the witness, sequencing questions to accommodate the witness’s report rather than using a standardized questioning procedure).

A recent meta-analysis of 42 studies (Köhnken, Milne, Memon, & Bull, 1999), which represented both versions of the CI, found the average effect size for correctly recalled information after a CI as opposed to a control interview to be quite strong (d = .87), despite large differences in interviewer variables, retention interval, interviewee population, and type of control condition across studies. However, the CI also produced more incorrect details than control interviews, although the effect size estimate (d = .28) was much smaller than it was for correct details. Interestingly, the overall accuracy rate (proportion of correct information relative to total information reported) across these studies was essentially the same for CI (85%) and control (82%) interviews. Also of interest was the comparison of CI interview type (original CI vs. ECI), in which a significantly greater number of incorrect details was associated with the ECI, although the effect sizes for correct information did not differ significantly between the two versions of the CI when compared with control interviews.

In an early study using films of simulated crimes as stimuli and experienced law enforcement personnel as interviewers (Geiselman, Fisher, MacKinnon, & Holland, 1985), the original CI was evaluated against a hypnosis interview and a standard police interview. The findings revealed that hypnosis and the CI produced comparable results and that both yielded a greater amount of correct information than did the standard interview. At the same time, neither resulted in more incorrect information than was elicited by the standard interview.

Although the latter finding is not entirely consistent with the evidence from numerous scientific studies that hypnotically elicited recall tends to be unreliable (see McConkey & Sheehan, 1995; Orne et al., 1988), there are certain unique procedural aspects of the study by Geiselman et al. (1985) that should be taken into account. For instance, the law enforcement interviewers in their study were permitted wide latitude in the conduct of the hypnosis interviews. The unknown variation in technique and suggestive metaphors used in the hypnosis interviews makes it difficult to identify features that might have contributed to their finding of equivalence between hypnosis and the CI. Another concern is their use of a brief hypnotizability assessment instrument, which has been shown to correlate poorly with standardized scales of hypnotic ability (Orne et al., 1979). The adoption of such an instrument renders their sample essentially unselected for hypnotizability and precludes evaluation of any mediational role of hypnotic ability in influencing response to the interview procedures. This issue is important because, for the CI to be considered a viable and distinct alternative to the forensic use of hypnosis, its mnemonics should be shown to selectively elicit additional accurate recall rather than the wholesale productivity enhancement that is characteristic of hypnosis and typically is correlated with the participant’s hypnotic ability. Without this assessment it is difficult to know whether certain CI instructions serve merely as surrogates for hypnotic procedures, thereby unleashing attendant concerns about the accuracy and credibility of subsequent eyewitness testimony.

In this regard, several clinicians and researchers have raised questions about the distinctiveness of the CI techniques from hypnosis and kindred procedures. For example, in a conceptual critique of CI mnemonics, Allison (1996) outlined the correspondence of the various CI components with techniques and suggestions often used in forensic hypnosis interviews. Because hypnosis can occur in hypnotizable people without a formal induction procedure (Estabrooks, 1948), the possibility exists that certain CI directives (e.g., context reinstatement) may inadvertently induce hypnotic experiences (e.g., age regression) that elicit recollections based on combinations of fact, fantasy, and confabulation, thereby creating the same potential for historical revision that plagues hypnotic interviews. Another expressed concern (e.g., Memon & Stevenage, 1996) is that the CI instruction to “report everything” might encourage witnesses to lower their subjective criteria for reporting information as factual, one of the mechanisms by which hypnosis is believed to increase memory output (Dinges et al., 1992; Klatzky & Erdelyi, 1985; Whitehouse et al., 1988). Still other investigators (Dobson & Markham, 1993; Douglas, 1996; Schooler & Loftus, 1986) have called attention to the importance of related individual difference variables such as imagery ability, developmental factors, and suggestibility in determining the effectiveness of interrogation procedures.

Thus far, only two studies (Dasgupta, Juza, White, & Maloney, 1994-1995; Geiselman et al., 1985) attempted to compare the CI with hypnotic interviews, and both reported the two techniques to be equally effective. At the same time, however, both studies failed to observe the characteristic increase in erroneous information and confabulations associated with the use of hypnosis for memory retrieval (McConkey & Sheehan, 1995; Orne et al., 1988), and both used the same brief hypnotizability assessment, which has poor validity (Orne et al., 1979). Given that hypnotically elicited testimony has been a major source of concern in the judicial system for several decades, we believe that more research, which devotes attention to these details, is needed to ensure that the CI truly overcomes limitations associated with the investigative use of hypnosis and does not itself compromise evidence in similar or other ways.

The current study is an initial step in this direction, attempting to replicate the findings of Geiselman et al. (1985) with respect to the recall productivity elicited by hypnosis and the cognitive retrieval techniques of the original CI. We adopted for study the original CI used by Geiselman et al. (1985) and by Dasgupta et al. (1994-1995), as opposed to the ECI, for three reasons: (a) the ECI retains the major cognitive retrieval mnemonics of the original version (i.e., reinstate context, report everything, change order, and change perspective); (b) our implementation of the CI included several features of the ECI (e.g., rapport building, active witness participation, promoting detailed responses, and sequencing), although it did not include asking witnesses to close their eyes and visualize the events that they described; and (c) with the exception of one report by Fisher et al. (1987), a nearly equal number of studies involving either the ECI or the original CI show no significant difference in effect size when compared with a control interview (Köhnken et al., 1999).

One distinctive methodological departure in the present study is the use of a within-subject research design in which all participants were encouraged to provide a thorough oral and written report during a baseline interview before the treatment interview condition, which included similar oral and written components, was introduced. This design permits an evaluation of the extent to which the treatment interviews elicit additional information that was not previously accessible to the participant during the baseline interview. The study also incorporates a widely used, standardized assessment of hypnotizability in the selection of the study sample and assignment to interview conditions, which makes it possible to determine the contribution, if any, of hypnotizability to interview outcomes. At the same time, we modified our control condition from Geiselman et al.’s (1985) standard police interview, in which experienced law enforcement officers were instructed to carry out their usual interview procedures, to one that involved a plausible and motivating rationale to engage the participant in repeated retrieval attempts, a procedure we call a motivated, repeated recall (MRR) control condition. This control condition is similar to the structured interview (Köhnken, Thürer, & Zorberbier, 1994) that is used as a comparison condition in many European studies of the CI: It includes rapport building, attention to communication flow, and multiple retrieval opportunities, but it lacks the specific cognitive mnemonics of the CI.¹ Finally, the current study provides a between-subject forced interrogatory recall assessment after participants have completed both baseline and treatment interviews. The purpose of this assessment is to establish the amount of information available to participants pertaining to investigationally relevant aspects of the witnessed event while obtaining ratings of their level of certainty. This permits an evaluation of whether the various interview conditions elicit additional recall beyond that which can be obtained by requiring the participant to report information without regard to certainty.

The design of the current study was also guided by efforts to reproduce conditions encountered in actual forensic investigations. For example, a common criticism of laboratory studies of eyewitness memory is that they do not address the kind of memory deficit that may occur in real crime situations, in which a witness or victim may have poor recollection because of traumatic amnesia or, possibly, repression. Accordingly, after extensive pilot testing of various stimuli, we used a filmed crime scenario that is documented (Loftus & Burns, 1982) and has been confirmed by us in pilot research to produce relative amnesia in some participants for specific details of the witnessed event.




Participants were 72 volunteers (36 men, 36 women) selected from a larger sample (N = 168) who participated in one of 17 small group screening sessions. The Harvard Group Scale of Hypnotic Susceptibility, Form A (HGSHS:A) of Shor and Orne (1962) and a number of psychological inventories² and research questionnaires were administered to all participants. The final study sample had a mean HGSHS:A score of 8 (range 4–12).


All participants viewed the stimulus film, “3:57 Friday Afternoon,” which depicts an armed bank robbery, followed by a chase through a parking lot in which a young boy is unexpectedly shot in the face, and ends with the robber’s escape in a getaway car driven by an accomplice. In previous research, Loftus and Burns (1982) demonstrated that witnessing the boy being shot resulted in a relative amnesia for certain details of the film, which was not observed among participants who saw the identical film but without the shooting segment.

Preliminary research carried out by our laboratory confirmed that memory was indeed poorer for events in the film that were proximal to the shooting incident. In addition, the film was rated significantly more emotionally upsetting than other films that we had screened, including the Los Angeles Police Department’s training film that had been used in past research on the CI by Geiselman et al. (1985).


Initial sessions, which included the group assessment of hypnotic responsiveness, were conducted in 17 groups, ranging in size from 4 to 14 volunteers, M = 10. Each session began with a research questionnaire period with one experimenter, followed by administration of the tape-recorded HGSHS:A with a second experimenter and two trained observers. After the hypnotizability assessment, participants completed their self-report response booklets, which were then collected for scoring by a third experimenter and two research assistants in another room to determine which participants would be asked to return for a second session. After a brief intermission, a fourth experimenter and assistant showed participants the 2.25-min film “3:57 Friday Afternoon.” Thus, each stage of the session used different members of the staff to maintain blindness. Immediately after the film, subjects completed a 13-item Film Evaluation Questionnaire, which inquired about their reactions to the film. The questionnaire was intended to provide closure and reduce the expectation that any future sessions might be concerned with participants’ memory of the events depicted in the film, thereby minimizing rehearsal or collaboration with other participants. When this was finished, participants were thanked and paid for their participation, and appointments were made for those who qualified for the second session.

Participants returning for Session 2 were randomly assigned to one of the three interview conditions, with the constraint that the groups were equivalent with respect to hypnotizability and sex. Session 2 consisted of a baseline phase and a treatment phase. The three experimenters (interviewers) were trained to administer the baseline and each of the treatment interviews. To ensure that potential differences between experimenters did not influence the baseline and treatment conditions, two procedures were used: The baseline was administered by one experimenter and the treatment by another, and each experimenter completed a third of all baselines and a third of all treatment interviews, randomly determined.
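The constrained random assignment described above can be illustrated with a short sketch. It is not the authors' procedure, which is not specified beyond the constraint itself. The function name, the blocking scheme (sorting each sex by HGSHS:A score and randomizing conditions within consecutive blocks of three), and the toy roster are all assumptions introduced here for illustration.

```python
import random
from collections import Counter

CONDITIONS = ("HYP", "CI", "MRR")


def assign_conditions(participants, seed=None):
    """Assign each (pid, sex, hgshs_score) tuple to one interview condition.

    Hypothetical blocking scheme: within each sex, participants are sorted
    by hypnotizability and split into consecutive blocks of three; each
    block receives the three conditions in random order.
    """
    rng = random.Random(seed)
    assignment = {}
    for sex in sorted({p[1] for p in participants}):
        stratum = sorted((p for p in participants if p[1] == sex),
                         key=lambda p: p[2])
        for start in range(0, len(stratum), len(CONDITIONS)):
            block = stratum[start:start + len(CONDITIONS)]
            order = list(CONDITIONS)
            rng.shuffle(order)
            for person, condition in zip(block, order):
                assignment[person[0]] = condition
    return assignment


# Toy roster (made-up scores): 36 men and 36 women, HGSHS:A scores 4 to 12.
roster = [(i, "M" if i < 36 else "F", 4 + i % 9) for i in range(72)]
assignment = assign_conditions(roster, seed=1)

group_sizes = Counter(assignment.values())
men_per_group = Counter(assignment[pid] for pid, sex, _ in roster if sex == "M")
print(group_sizes, men_per_group)
```

Because every block contributes exactly one participant to each condition, this scheme yields equal group sizes and an equal sex split by construction, and it keeps hypnotizability scores closely matched across groups.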


Participants were tested individually during the second session, which occurred between 3 and 13 days (M = 6.5 days) after the initial group session. Each participant was escorted by an experimenter to a quiet room, seated in a comfortable chair, and equipped with a small clip-on microphone. They were then informed that the purpose of the session was to learn everything they could remember about the film they had viewed during the group session and were encouraged to report everything that they could recall about the details of the film. Upon completion of their oral report, participants were given paper and pencil and asked to write a detailed, comprehensive account of everything they could recall about the film. Once participants had completed their written narratives, they were returned to a waiting room and asked to complete a few questionnaires, after which they were introduced to a second experimenter.

The second experimenter escorted participants to a different room, where they were again seated in a comfortable chair and equipped with a small microphone. At this point, each participant was administered one of three treatments: hypnosis (HYP), the CI, or the MRR control condition.

Hypnosis interview. Participants in this group were initially informed that hypnosis would help them to relax and would enable them to tap subconscious levels of their minds, allowing them to remember even more details of the robbery film. After questions were answered, they were administered a hypnotic induction followed by deepening instructions and were then told that they were to watch “in their minds” a replay of the film as if it were appearing on a large television screen. They were told that during this replay, they would remain calm and comfortable, and they would be able to direct the replay as if by remote control. That is, they could slow the film down, speed it up, freeze the frame, and even zoom in on small details. This “TV technique” and the metaphors used in hypnosis drew heavily from the procedures outlined by Reiser (1980) and used in actual investigative situations. Participants were then asked to describe everything they were seeing and hearing. At the end of their oral recall, participants were asked a number of focused questions concerning specific details of the film (e.g., “How tall was the robber?” “Did anyone follow the robber out of the bank?”). Participants were then aroused from hypnosis, asked to rate how deeply hypnotized they had become (using a 10-point scale), and then given a brief mood scale to complete. Then participants were again given paper and pencil and were asked to provide a detailed final written statement of everything they could recall about the film.

Cognitive interview. Participants were initially informed that they would be able to remember additional details about the film if they were willing to try different ways of searching their memory. They were then presented with a 3" x 5" card that outlined four basic memory enhancement techniques (cf. Geiselman et al., 1985). After the participant had read the card, the experimenter described in more detail each of the techniques, which included reconstructing the circumstances surrounding the incident, reporting everything regardless of apparent importance, recalling events in a different order, and taking the perspective of another person in the film. Once participants indicated that they understood the various techniques, they were asked to think back to the last time they had been in the laboratory and had seen the film and were encouraged to reconstruct the circumstances and events of that visit. They were then asked to report everything they could recall about the film, using the different techniques they had learned. After their oral recall, participants were asked a number of focused questions, completed a mood scale, and were asked to provide a comprehensive final written statement of everything they could recall about the film.

Control (MRR) interview. The MRR condition involved neither hypnosis nor key elements of the CI, such as reconstructing the circumstances surrounding the incident, changing perspectives, and recalling in a different order. Instead, participants in the control condition were given a credible rationale in an attempt to establish both an expectancy set and a level of motivation comparable to that of the HYP and CI conditions. This was done to ensure that whatever memory effects might accrue differentially from the interview procedures were not simply the result of a control group poorly motivated to recall and report information. Thus, participants in the MRR condition were informed that they would be instructed in the use of a number of basic memory retrieval techniques that would enable them to recall additional details about the film. (In fact, no specific instructions were given beyond vague allusions to information available in contemporary college textbooks.) For instance, they were told that being back in the laboratory (although in a different room from the one in which they originally viewed the film) might facilitate recall.³ They were also told that searching their memory repeatedly could produce new information. Finally, they were told that by engaging in a variety of distractor tasks, they could overcome mental blocks, which would allow them to remember additional information. The distractor tasks were selected to be interesting and challenging. Participants completed five 1-min trials of a two-handed pursuit-rotor tracking task, after which they were asked to report everything they could remember about the film. After their oral recall, participants completed the Stroop Color-Word Interference Test and were then asked a number of focused questions, followed by completion of a brief mood scale. Participants then engaged in additional trials of the pursuit-rotor task, after which they were asked to provide a complete final written statement of everything they could recall about the film.

Forced interrogatory recall test. Upon completion of the second written recall, participants in all three interview conditions were escorted back to the waiting room and given a number of questionnaires to complete. Once finished, they were introduced to a new experimenter (not involved in the baseline or treatment interviews) who took them to a new room, seated them in a comfortable chair, and asked them to review their general perceptions and experiences as they related to participation in the experiment. During this postexperimental interview, participants were once again asked a series of questions concerning the film that called for specific responses. They were instructed to provide an answer for every question, even if they had to guess (i.e., forced interrogatory recall), and to rate each answer in terms of their confidence using a four-point scale ranging from 0 (just guessing) to 3 (certain). At the completion of the debriefing, participants were thanked and paid for their participation.

Scoring and data analysis

Data were obtained from the oral and written narratives provided during the baseline interview and the corresponding narratives produced during the treatment interview and from responses to the forced interrogatory recall test conducted during the postexperimental interview. Narrative information was transcribed by three research assistants to typewritten form in a specially constructed format that was used to facilitate the derivation of scorable information units (IUs).⁴ Four scorers who were blind to both treatment (i.e., HYP, CI, MRR) and classification (i.e., hypnotizability and sex) variables parsed the typewritten protocols into IUs and assigned them to three categories for each recall attempt: descriptions of people (e.g., the robber was a white man), actions (e.g., two bank employees ran after the robber), and objects (which included surroundings, e.g., the holdup note was yellow). A catalog of correct information about the film was developed by collating the detailed reports provided independently by three trained assistants, each of whom viewed the film seven times. The few inconsistencies across these reports were resolved by consensus of two additional viewers.

The recall protocols were scored by comparison of IUs against the catalog of correct information. In a few cases, participants reported details that could not be verified by reference to the catalog; these were resolved by reexamination of the film. Dependent variables submitted to analysis consisted of nonredundant information obtained from the oral and written recalls completed at baseline and that obtained during the treatment interviews: total correct, total incorrect, confabulations (i.e., filling in gaps with information that was not contained in the film), and attributions or inferences (e.g., “The teller was upset”). In the case of the treatment interviews, however, only new information obtained during oral and written recalls (i.e., not previously reported during baseline) was extracted for analysis.
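The extraction of nonredundant and new information can be sketched in a few lines. This is only an illustration of the logic described above, assuming that each scored IU can be represented as a (unit text, score type) pair and that repeated units match exactly after transcription. The function names and the example IUs are hypothetical and are not taken from the actual protocols or from the film catalog.

```python
from collections import Counter

SCORE_TYPES = ("correct", "incorrect", "confabulation", "attribution")


def tally(ius):
    """Count nonredundant information units (IUs) by score type."""
    unique = set(ius)  # an IU given in both oral and written recall counts once
    counts = Counter(score for _, score in unique)
    return {score_type: counts.get(score_type, 0) for score_type in SCORE_TYPES}


def new_information(baseline_ius, treatment_ius):
    """Return treatment IUs whose text was not already reported at baseline."""
    baseline_texts = {text for text, _ in baseline_ius}
    return sorted((text, score) for text, score in set(treatment_ius)
                  if text not in baseline_texts)


# Hypothetical IUs with made-up scores, for illustration only.
baseline = [("robber wore a hat", "correct"),
            ("teller was upset", "attribution")]
treatment = [("robber wore a hat", "correct"),        # repeated from baseline
             ("getaway car was blue", "incorrect"),
             ("boy ran across the lot", "correct"),   # oral recall
             ("boy ran across the lot", "correct"),   # repeated in written recall
             ("robber had a scar", "confabulation")]

baseline_counts = tally(baseline)
new_counts = tally(new_information(baseline, treatment))
print(baseline_counts)
print(new_counts)
```

In this sketch, the baseline tallies would serve as the baseline dependent measures, and the tallies of new information would serve as the treatment measures.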

Interscorer reliability was established on 20 written recalls and 20 oral recalls obtained from 10 pilot participants run through the entire protocol. The mean percentage agreement between the four scorers was 93.75%, range = 85-100%.
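One common way to obtain a mean percentage agreement among several scorers is to average the agreement of every pair of scorers. The sketch below assumes that approach; the text does not state how agreement was actually computed, and the function names and example codes are hypothetical.

```python
from itertools import combinations


def percent_agreement(codes_a, codes_b):
    """Percentage of units that two scorers coded identically."""
    if len(codes_a) != len(codes_b):
        raise ValueError("both scorers must code the same units")
    matches = sum(a == b for a, b in zip(codes_a, codes_b))
    return 100.0 * matches / len(codes_a)


def mean_pairwise_agreement(codes_by_scorer):
    """Average percent agreement over all pairs of scorers."""
    pairs = list(combinations(codes_by_scorer.values(), 2))
    return sum(percent_agreement(a, b) for a, b in pairs) / len(pairs)


# Made-up codes from four scorers for the same four information units.
example_codes = {
    "scorer_1": ["correct", "incorrect", "correct", "attribution"],
    "scorer_2": ["correct", "incorrect", "correct", "attribution"],
    "scorer_3": ["correct", "incorrect", "correct", "correct"],
    "scorer_4": ["correct", "incorrect", "correct", "attribution"],
}
example_agreement = mean_pairwise_agreement(example_codes)
print(example_agreement)  # 87.5 for these made-up codes
```

Simple percent agreement does not correct for chance agreement, which is one reason chance-corrected indices are sometimes reported alongside it.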

Responses on the forced interrogatory recall test were scored as correct or incorrect, and the mean confidence ratings for both response types were computed for each participant.
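The per-participant scoring of the forced interrogatory recall test can be sketched as follows, assuming each response is recorded as a correct/incorrect flag plus a confidence rating on the 0 (just guessing) to 3 (certain) scale. The function name and the example responses are hypothetical.

```python
from statistics import mean


def score_forced_recall(responses):
    """Summarize one participant's forced-recall responses.

    responses: list of (is_correct, confidence) pairs, confidence in 0..3.
    """
    for _, confidence in responses:
        if confidence not in (0, 1, 2, 3):
            raise ValueError("confidence must be an integer from 0 to 3")
    correct = [conf for ok, conf in responses if ok]
    incorrect = [conf for ok, conf in responses if not ok]
    return {
        "n_correct": len(correct),
        "n_incorrect": len(incorrect),
        # A mean is undefined when a participant has no responses of that type.
        "mean_conf_correct": mean(correct) if correct else None,
        "mean_conf_incorrect": mean(incorrect) if incorrect else None,
    }


# Made-up responses for one participant.
example_summary = score_forced_recall(
    [(True, 3), (True, 2), (False, 0), (False, 1), (True, 3)]
)
print(example_summary)
```

Keeping the confidence means separate for correct and incorrect answers makes it possible to see whether a participant is as confident in errors as in accurate responses.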


Baseline recall

A preliminary analysis of variance found no significant differences in mean retention interval among participants destined to receive the HYP (M = 6.25 days, SD = 2.05), CI (M = 7.00 days, SD = 2.69), or MRR (M = 6.33 days, SD = 2.41) condition, F(2, 69) = 0.72, ns. Each baseline dependent measure (i.e., total correct, total incorrect, confabulations, attributions) was submitted to a 2 (male, female) x 3 (HYP, CI, MRR) analysis of covariance, with the number of days since viewing the film serving as the covariate. With the exception of attributional information, all effects were nonsignificant (all Fs < 1.71), indicating that the various subgroups were equivalent with respect to the information that they provided at baseline (Table 1). With regard to attributional information, a sex x interview condition interaction was identified, F(2, 65) = 4.73, p = .012. Tukey HSD post hoc tests (alpha = .05) revealed that men who were about to be interviewed in the HYP condition reported nearly twice as much attributional information as women destined for the HYP interview.

Table 1. Mean (SD) correct, incorrect, confabulatory, and attributional or inferential material and accuracy rates for information reported during the baseline interview and new information reported for the first time during the treatment interview for hypnosis (HYP), cognitive interview (CI), and motivated repeated recall (MRR) control conditions

Information              HYP              CI               MRR
Baseline interview
  Correct                71.71 (17.71)    64.92 (19.53)    74.21 (18.34)
  Incorrect               9.25 (5.79)     11.04 (7.17)     12.17 (5.10)
  Confabulations          0.58 (0.88)      0.88 (1.08)      1.17 (1.58)
  Attributions            7.38 (5.27)      8.25 (4.27)      7.25 (3.39)
  Accuracy rate          81%              76%              78%
Treatment interview
  New correct            32.96 (10.40)    23.50 (11.01)    27.88 (8.61)
  New incorrect          30.13 (18.35)    26.38 (14.61)    21.96 (9.84)
  New confabulations      2.88 (2.58)      2.21 (2.13)      1.25 (1.75)
  New attributions        5.83 (5.48)      2.75 (4.42)      2.42 (2.45)
  Accuracy rate          46%              43%              52%
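The accuracy rates reported in Table 1 can be checked against the tabled means. The sketch below assumes that the accuracy rate is the mean number of correct units divided by the sum of the four tabled categories (correct, incorrect, confabulations, attributions). The text does not state whether the authors computed the rate from group means or per participant, so this is a consistency check rather than a description of their method.

```python
# Group means copied from Table 1, ordered as
# (correct, incorrect, confabulations, attributions).
TABLE_1_MEANS = {
    ("baseline", "HYP"): (71.71, 9.25, 0.58, 7.38),
    ("baseline", "CI"): (64.92, 11.04, 0.88, 8.25),
    ("baseline", "MRR"): (74.21, 12.17, 1.17, 7.25),
    ("treatment", "HYP"): (32.96, 30.13, 2.88, 5.83),
    ("treatment", "CI"): (23.50, 26.38, 2.21, 2.75),
    ("treatment", "MRR"): (27.88, 21.96, 1.25, 2.42),
}


def accuracy_rate(correct, incorrect, confabulations, attributions):
    """Proportion of all reported information that was correct.

    Assumption: the denominator is the sum of all four categories.
    """
    total = correct + incorrect + confabulations + attributions
    return correct / total


accuracy_percent = {
    key: round(100 * accuracy_rate(*means))
    for key, means in TABLE_1_MEANS.items()
}
for (phase, condition), percent in accuracy_percent.items():
    print(f"{phase:9s} {condition:3s} {percent}%")
```

Under this assumption the computed values reproduce the tabled percentages: 81%, 76%, and 78% at baseline and 46%, 43%, and 52% for new information from the treatment interviews.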

Treatment interview recall

Novel information from the oral and written recall attempts derived from the treatment interviews and not previously reported during the baseline interview was partitioned into the following treatment measures: new correct, new incorrect, new attributions, and new confabulations. In turn, these measures were analyzed by 2 × 3 (sex × interview condition) analyses of covariance in which the number of days since viewing the film and the corresponding performance measure from the baseline phase were used as covariates. The analysis of new correct information derived from the treatment interview identified a significant difference between interview conditions, F(2, 64) = 5.24, p = .008. Tukey HSD tests (alpha = .05) determined that the HYP interview elicited significantly more new correct information than the CI. At the same time, the amount of new correct information elicited by the MRR interview was quantitatively intermediate, not differing significantly from either the HYP interview or the CI (Table 1). Analysis of new incorrect information obtained from the treatment interviews revealed a significant effect of sex, F(1, 64) = 8.13, p = .006, whereby men reported significantly more incorrect new information than women. There was also a significant effect of interview condition on the reporting of new incorrect information, F(2, 64) = 4.17, p = .02, in which the HYP condition yielded reliably more new recall errors than the MRR condition, with the CI producing an intermediate number of new errors. Type of interview condition also influenced the production of new attributions, F(2, 64) = 3.66, p = .031, and new confabulations, F(2, 64) = 2.83, p = .067. In both cases, the HYP interview was found to elicit significantly more of these types of information than the MRR interview, and the CI produced an intermediate amount of such information. In the case of new confabulatory material, however, a sex × interview condition interaction also occurred, F(2, 64) = 5.62, p = .006, with men in the HYP condition producing a greater number of such responses than their counterparts in the MRR condition.
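The denominator degrees of freedom in these tests are consistent with the usual bookkeeping for a factorial analysis of covariance, in which error df equals the sample size minus the number of design cells minus the number of covariates. The sketch below assumes N = 72 (from the abstract), a fully crossed 2 × 3 design, and one degree of freedom per covariate. The function name is illustrative.

```python
def ancova_error_df(n_participants, levels_a, levels_b, n_covariates):
    """Error df for a two-way factorial ANCOVA that includes the interaction."""
    cells = levels_a * levels_b
    return n_participants - cells - n_covariates


# Treatment analyses: two covariates (days since viewing, baseline measure).
print(ancova_error_df(72, 2, 3, 2))  # 64, as in F(2, 64) and F(1, 64)
```

With a single covariate the same function returns 65, which would be consistent with the F(2, 65) reported for the baseline analysis, although the covariates used there are not restated in this passage.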

Forced interrogatory recall

The aforementioned analyses document the characteristic effect of hypnosis on recall productivity, but a more important question concerns whether any of the treatment interviews can be shown to be superior when productivity is controlled methodologically. This was assessed using a forced interrogatory recall test, which consisted of a fixed set of questions about the film that participants were required to answer by guessing if necessary. In addition to providing an answer to every question, subjects rated their confidence in each response. Under these circumstances, no differences between interview conditions could be detected in terms of either the mean number of correct responses, HYP = 37.5, CI = 38.0, MRR = 36.6, or the mean confidence ratings assigned to responses overall, HYP = 2.3, CI = 2.4, MRR = 2.2.

Hypnotizability and performance during treatment interviews

A one-way analysis of variance (ANOVA) determined that the three treatment groups were equivalent with regard to hypnotizability, just as the sampling scheme was intended to accomplish, F(2, 69) = .315, ns, M = 7.79 for HYP, 8.25 for CI, and 7.83 for MRR. Next we conducted a series of hierarchical regression analyses to examine the extent to which each of the recall performance measures (correct, incorrect, attributions, and confabulations) was affected by the interaction of hypnotizability with interview condition, after controlling for the corresponding baseline measure (Table 2). The results of these analyses indicate no significant effects of hypnotizability, or its interaction with interview condition, in the retrieval of new correct information or new attributions. On the other hand, hypnotizability showed a significant main effect, and, more importantly, it interacted significantly with interview condition in predicting new incorrect and confabulatory material. Post hoc Tukey HSD (alpha = .05) analyses of residual scores generated by the latter two regression models showed that the HYP condition produced significantly higher scores than either CI or MRR, but CI also yielded higher incorrect recall than the MRR condition. The nature of the interaction of hypnotizability with interview condition was determined by examining the relationship of hypnotizability to the generation of incorrect and confabulatory recall in each of the three conditions while controlling for initial baseline performance (Table 3). For both types of recall errors, it is evident from the magnitude of the β coefficients (or semipartial correlations) that hypnotizability has its most profound impact on participants interviewed in the HYP condition, followed, in turn, by those interviewed with the CI. On the other hand, the same classes of recall errors produced using the MRR interview were unrelated to participants’ hypnotic ability.

Table 2. Hierarchical regression analyses examining the contribution of baseline performance, hypnotizability, interview condition, and the interaction of hypnotizability and interview condition (hyp × cond) on treatment memory performance

Dependent variable   Predictor variable    R2     Delta R2   F       p
Correct recall       Baseline              .218   .218       19.57   .01
                     Hypnotizability       .223   .004       0.39    .53
                     Interview condition   .272   .050       4.63    .04
                     Hyp × cond            .274   .002       0.14    .71
Incorrect recall     Baseline              .083   .083       6.35    .01
                     Hypnotizability       .163   .080       6.63    .01
                     Interview condition   .247   .084       7.56    .01
                     Hyp × cond            .289   .042       3.93    .05
Attributions         Baseline              .020   .020       1.45    .23
                     Hypnotizability       .025   .005       0.35    .55
                     Interview condition   .123   .098       7.56    .01
                     Hyp × cond            .124   .001       0.08    .78
Confabulations       Baseline              .029   .029       2.06    .16
                     Hypnotizability       .117   .089       6.92    .01
                     Interview condition   .190   .073       6.10    .02
                     Hyp × cond            .269   .079       7.21    .01

Table 3. Influence of hypnotizability on the production of new incorrect and confabulatory information in each interview condition (after controlling for corresponding baseline performance)

Dependent variable     Condition                   β coefficient   t        p
New incorrect recall   Hypnosis                    .407            2.097    .048
                       Cognitive interview         .388            2.080    .050
                       Motivated repeated recall   -.070           -0.404   .690
New confabulations     Hypnosis                    .551            3.032    .006
                       Cognitive interview         .348            1.725    .099
                       Motivated repeated recall   -.142           -0.666   .512
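The F values in Table 2 can be approximated from the tabled R2 and Delta R2 with the standard F-change formula for hierarchical regression. The sketch below assumes N = 72 and that each step adds a single predictor, so that interview condition enters as one variable. Both assumptions are inferences from the reported values rather than statements in the text, and small discrepancies are expected because R2 is rounded to three decimals.

```python
def f_change(r2_full, delta_r2, n, k_full, k_added=1):
    """F test for the increment in R^2 when k_added predictors enter the model.

    r2_full : R^2 of the model after the step
    k_full  : number of predictors in the model after the step
    """
    df_error = n - k_full - 1
    return (delta_r2 / k_added) / ((1.0 - r2_full) / df_error)


N = 72  # sample size stated in the abstract

# Incorrect recall rows of Table 2: (R2, Delta R2, reported F)
INCORRECT_STEPS = [
    (0.083, 0.083, 6.35),  # baseline
    (0.163, 0.080, 6.63),  # hypnotizability
    (0.247, 0.084, 7.56),  # interview condition
    (0.289, 0.042, 3.93),  # hyp x cond
]

if __name__ == "__main__":
    for k, (r2, d_r2, reported) in enumerate(INCORRECT_STEPS, start=1):
        print(k, round(f_change(r2, d_r2, N, k), 2), "reported:", reported)
```

Under these assumptions the recomputed values fall within rounding error of the reported ones, which is why the interaction step for incorrect recall sits right at the conventional .05 threshold.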


The current study was intended as an independent replication and extension of the investigation by Geiselman et al. (1985), which found that the CI produced gains comparable to those of forensic hypnosis when both were compared with a standard police interview. The practical implication of their finding is that the CI should be considered the preferred investigative interviewing technique, particularly in light of the many legal challenges that have been mounted, justifiably, against hypnotically elicited evidence over the past several decades. Indeed, enthusiasm for the CI procedure appears to have fueled its adoption by numerous law enforcement agencies throughout North America, Europe, and Australia. However, the findings of the present research raise some concerns that suggest that its use by law enforcement may warrant caution. Here we outline these findings and discuss their relevance to this issue.

In this study, as in many prior studies (e.g., Dinges et al., 1992; Dywan & Bowers, 1983; Whitehouse et al., 1988), hypnosis was associated with a major increase in overall productivity rather than a selective enhancement of correct recall. Thus, although hypnosis did produce gains in new correct information, there were corresponding increases in new incorrect, attributional, and confabulatory material as well. Further evidence for this interpretation is provided by the results of the forced interrogatory recall test: When productivity was held constant by requiring all participants to provide one response (excluding “I don’t know”) to each question posed, group differences involving correct and incorrect information disappeared. In sum, the evidence suggests that hypnosis did not enhance recollection relative to nonhypnotic treatments; it merely augmented participants’ willingness to report information, irrespective of its accuracy.

Because the CI served as a second active treatment condition in the present study, and one that was specifically designed to facilitate memory retrieval, one must consider why the results pertaining to correct recall achieved with this technique were smaller than those obtained with hypnosis. One plausible explanation is based on the observation that hypnotic procedures lead participants to relax their report criteria. That is, as documented in prior work by our laboratory (Whitehouse et al., 1988; Whitehouse, Dinges, Bates, Orne, Powell, & Orne, 1991), hypnosis often elicits information that was actually available to the participant beforehand but that he or she was previously too uncertain about to report. Although our data suggest that the CI may also lower participants’ report criteria somewhat (see also Bekerian & Dennett, 1993; Memon & Stevenage, 1996), as both correct and incorrect information increased for this condition, it appears that the criterion shift for the CI may be smaller than that produced by hypnosis.

The findings of the present study are at odds with the report by Geiselman et al. (1985), which failed to find reliable differences between hypnosis and the CI. Although there are numerous methodological differences between the two studies that may have contributed to this discrepancy (e.g., differences in film stimuli, professional background of interviewers, assessment of hypnotic ability, hypnotic procedures), the respective experimental designs used may be among the most significant disparities. Although we cannot be certain what results Geiselman et al. would have obtained had they used a within-subject design, we were able to reanalyze our own data in a manner comparable to their approach, without regard to baseline performance. Accordingly, we examined total correct, total incorrect, total confabulations, and total attributions after the treatment interview, using one-way ANOVAs. No statistically significant differences emerged on any parameter between our three conditions. Notwithstanding the lack of a difference between the MRR condition and either HYP or CI at this one recall, a discussion of which follows, the apparent equivalence of HYP and the CI is consistent with the report of Geiselman et al. However, it is an illusion because our analyses involving changes from baseline scores clearly demonstrate the greater effect of hypnosis on recall productivity. Thus, evaluating the effects of hypnosis and the CI relative to a baseline measure of recall provides a more sensitive test when we consider the unique contribution of specific interview procedures on witness recall.

The equivalent performance of participants in the CI and MRR conditions was somewhat surprising given previous work with the CI, which consistently demonstrated the superiority of the technique over standard police interviews and the laboratory-derived structured interview (e.g., Köhnken et al., 1994). Although it is possible that the CI was not administered properly in the current study, this seems unlikely. To begin with, we obtained a number of training materials directly from Geiselman and Fisher, which included published and unpublished detailed descriptions of the procedure, audiotaped examples of actual interviews, and a training film illustrating the technique. Additionally, our interviewers received approximately 20 hr of training to conduct each type of interview, were unobtrusively monitored through a one-way screen by senior research staff in order to help them refine their technique, and worked with a number of pilot participants before the main study. Finally, if one ignores the problems of increased productivity and focuses only on correct information, then our use of the CI resulted in a 36% increase in new correct information over baseline, which is a higher rate than that obtained in more than half of the studies reviewed in the meta-analysis by Köhnken et al. (1999) in which the CI was compared with either a standard police interview or a structured interview.
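The 36% figure can be checked against Table 1 by expressing new correct units as a percentage of baseline correct units. The sketch below works from the group means, which is an assumption. The authors may instead have averaged participant-level ratios, which would give slightly different values.

```python
# (baseline correct, new correct) group means taken from Table 1
MEANS = {
    "HYP": (71.71, 32.96),
    "CI": (64.92, 23.50),
    "MRR": (74.21, 27.88),
}


def percent_gain(baseline, new):
    """New correct units as a percentage of baseline correct units."""
    return 100.0 * new / baseline


if __name__ == "__main__":
    for condition, (baseline, new) in MEANS.items():
        print(condition, round(percent_gain(baseline, new), 1))
```

For the CI the ratio of means is 23.50 / 64.92, which rounds to 36%.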

Again, it is possible that our use of a within-subject design, which required all participants to complete an exhaustive oral and written baseline narrative recall, might have mitigated the value of the “report everything” mnemonic of the CI, thereby rendering it less effective than it would otherwise be. However, baseline interview instructions were the same for all interviewees, regardless of their slated treatment interview conditions, yet all three groups were capable of producing more than a 40% increase in new accurate details (see Table 1). Perhaps the example used with participants in the MRR condition (i.e., the suggestion that their recollection would benefit simply because they returned to the laboratory, although to a different room from the one in which they viewed the film) to establish a credible basis for the various tasks they would be asked to perform had more than its intended motivating effect. For some participants, this casual remark might have encouraged a covert attempt to achieve context reinstatement, which, if successful, would have borrowed one of the cognitive mnemonics of the CI. This possibility also seems unlikely because the physical environments of the encoding and retrieval contexts were quite different, the experimenters afforded no time in their instructions for effective context reinstatement to take place, and the tasks between recall attempts were both physically and mentally engaging. Nevertheless, some form of spontaneous reinstatement of context occurring among some control participants cannot be ruled out.

Nonetheless, a more probable explanation is that the repeated-recall feature of our MRR condition enhanced its effectiveness relative to other control interviews that have been studied. Research on hypermnesia (e.g., Erdelyi, 1988; Roediger & Challis, 1989) has demonstrated convincingly that when participants engage in repeated retrieval efforts, they are able to produce substantial increments in correct recall over time. Perhaps most important in the current investigation was the great care taken to maximize participants’ motivation and to establish the control condition as a plausible memory enhancement technique. These measures were undertaken in order to ensure sustained effort across trials and lead subjects to expect their memories to improve. When evaluated against such an active control condition, it is perhaps not surprising that the CI did not fare as well as it has in other studies, especially those in which it was compared against a standard police interview.


Apart from these methodological issues that we believe bear on the current pattern of results, there remains the question of whether the CI can be regarded as distinct from hypnosis. As Allison (1996) pointed out, there are numerous parallels between the two, and many of the retrieval directives used in the CI, though based on principles derived from research on normal human memory, could readily induce the condition of hypnosis in hypnotizable people who perceive the context as sufficiently hypnotic-like. For instance, the instruction to reinstate the context of the originally witnessed event might be understood as a suggestion for age regression by a hypnotizable person, whereas recalling events from another’s perspective or in different temporal sequences might be considered permission for the hypnotizable person to set aside his or her reality orientation and to engage in uncensored fantasy and imagination.

To evaluate such issues, the participant’s hypnotizability, in addition to the context and procedures of the interview, must be considered. It is noteworthy in the current study that hypnotizability was not associated with increases in correct recall in any of the three interview conditions. Thus, the enhancement of correct recall among participants in the hypnosis condition was not a result of hypnosis per se. On the other hand, we did obtain the expected result that hypnotizability would be substantially associated with the outcomes of hypnosis interviews that routinely contribute to their unreliability (i.e., increases in erroneous and confabulatory information). This combination of outcomes further underscores the need for forensic investigators to avoid hypnosis or undue suggestion entirely. However, the novel and in many respects more important finding of this study was that the same problematic features were present as a function of hypnotizability for the original CI, even though they were significantly lower in magnitude than during hypnosis. Only the MRR control interview, which lacks most features of hypnotic interactions, showed no relationship to the participants’ hypnotic ability.

The most parsimonious explanation for the correspondences between the performance of hypnotically interviewed and cognitively interviewed participants seems to be that at least some (i.e., those with greater hypnotizability) in the CI condition were responding on the basis of hypnotic processes. This suggests that the CI has the potential to induce a hypnotic-like condition in some people and therefore is not immune to the concerns of the justice system currently directed toward the forensic use of hypnosis. The findings also indicate that the CI procedure, as implemented in the current study, did not engender the extent of poorly censored recall output that the hypnosis interview produced, but in terms of the amount of accurate information yielded, it was not superior to the MRR control condition, which was based on nonguided, repeated effortful recollection. Accordingly, real benefit appears to accrue from repeated recall attempts (Erdelyi & Becker, 1974; Payne, 1987) whether or not supplemental cognitive mnemonics are also implemented.

Finally, it is important to note that we sought to establish positive expectancies for successful recovery of additional recollections in each of the three treatment conditions by asserting that engaging in the prescribed techniques would help participants enhance their recollection. However, it is possible that our attempt to equalize motivation across the comparison interview conditions, which is important from a methodological standpoint, might have inadvertently altered the witnesses’ responsiveness to the CI, which does not normally include such instructions. If so, this may account, to some extent, for the similarities that we found between the effects of hypnosis and those of the CI. In its favor, the CI did not exhibit as strong an association with hypnotizability as did the hypnotic interview in generating erroneous and confabulatory material. On the other hand, if such a simple communication between the interviewer and interviewee can substantially affect the outcome of the CI, then there are sound reasons for caution. After all, survey studies have shown that trained law enforcement officials do not reliably implement the CI in the prescribed manner, nor do they necessarily use all of its components (Clifford & George, 1996; Kebbell, Milne, & Wagstaff, 1999). Consequently, we do not yet know the range of circumstances under which hypnotic-like responses to the CI, such as those observed in this study, will manifest in other contexts, including its use in actual investigative situations. Only further systematic research, taking into account a number of related individual difference variables, such as imagery ability, absorption, fantasy proneness, suggestibility, and hypnotic capacity, can inform us whether the CI constitutes a reliably acceptable alternative to the dangers of forensic hypnosis in the justice system.


This research was supported in part by grant #87-IJ-CX-0052 from the National Institute of Justice and in part by a grant from the Institute for Experimental Psychiatry Research Foundation. The authors express their gratitude to Bernard Auchter of the National Institute of Justice and to R. Edward Geiselman, Ronald P. Fisher, and Elizabeth F. Loftus for their expert advice and for providing various stimulus and instructional materials that greatly facilitated this project. We are also grateful to Mary F. Auxier, Barbara R. Barras, Stacia Bates, Jennifer Button, Michele M. Carlin, Noel F. Carota, John W. Powell, and Mae C. Weglarski for their valuable administrative and technical contributions to this study.

Correspondence about this article should be addressed to Wayne G. Whitehouse, Unit for Experimental Psychiatry, University of Pennsylvania Medical School, 1013 Blockley Hall, 423 Guardian Drive, Philadelphia, PA 19104 (e-mail: wayne.whitehouse@temple.edu). Received for publication December 22, 2003; revision received June 24, 2004.

1. There is substantial evidence that multiple recall attempts can significantly enhance memory retrieval beyond a single recall opportunity, a phenomenon called hypermnesia (Erdelyi, 1988; Erdelyi & Becker, 1974; Payne, 1987). In actual investigative situations, witnesses tend to carry out repeated (covert as well as overt) recall attempts in anticipation and during the course of formal interviews. Furthermore, it is likely that repeated retrieval opportunities are implicitly arranged in most forensic hypnosis interviews, and they are explicitly provided for in the CI; therefore, an appropriate control condition must use the same potent retrieval medium, devoid of the types of suggestions and memory guidance instructions that characterize the comparison interview procedures.

2. To increase the sensitivity of the paradigm to the impact on memory of witnessing an emotionally upsetting event, we sought to include participants who appear to rely on repression as a coping mechanism. Following other research in operationalizing repression (Davis & Schwartz, 1987; Weinberger, Schwartz, & Davidson, 1979), we used personality measures that assess trait anxiety (a short form of the Taylor Manifest Anxiety Scale; Bendig, 1956) and psychological defensiveness (Social Desirability Scale of Crowne & Marlowe, 1964) to identify a subsample of participants who exhibited the repressor profile (i.e., low anxiety coupled with high defensiveness). Using the criteria outlined by Davis (1987) and Davis and Schwartz (1987), 37 participants qualified as repressors (defined as a score of 7 or less on the Manifest Anxiety Scale, coupled with a score of 15 or more on the Social Desirability Scale); the remaining 35 participants were classified as nonrepressors because their scores on the latter scales did not meet these criteria. Unfortunately, this proved to be a fruitless excursion because the only significant difference was an interaction with interview condition that occurred during baseline, before administration of the treatment interviews. Several possibilities present themselves as to why we did not find any meaningful effect of repressor status: Repression is not the mechanism responsible for memory deficits after exposure to depictions of violent acts, the combination of measures used to operationalize repression has shortcomings in validity or reliability, or repression selectively operates on particular types of stimulus material, such as affectively toned autobiographical memories (cf. Davis, 1987). In any case, repressor status was not included as an independent variable in any of the statistical analyses described in this report.
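The repressor classification in this note amounts to a two-cutoff rule, which can be sketched as follows. The cutoffs are those given in the note, and the function and argument names are illustrative.

```python
def is_repressor(manifest_anxiety, social_desirability):
    """Apply the cutoffs from the note: anxiety <= 7 and defensiveness >= 15."""
    return manifest_anxiety <= 7 and social_desirability >= 15


# Low anxiety with high defensiveness meets the repressor profile.
print(is_repressor(5, 18))   # True
# Low anxiety alone does not.
print(is_repressor(5, 10))   # False
```

Any participant failing either cutoff falls into the nonrepressor group, so that group pools several distinct anxiety and defensiveness profiles.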

3. The point about returning to the laboratory, albeit to a different room from the one in which the film was originally shown, was intended to provide a credible basis for the types of techniques that were introduced in the MRR interview. No active attempt at context reinstatement occurred beyond this simple assertion. The likelihood of spontaneous context reinstatement actually occurring was minimal because the interview rooms (retrieval context) consisted of a soft reclining chair facing a desk where the interviewer was seated, with a one-way screen and picture along one wall, an occluded window, and a clock mounted on a floor stand. In contrast, the stimulus viewing room (encoding context) was approximately six times larger, with a wide-screen television, several rows of straight-back chairs for the participants, and library books lining the walls.


4. Transcripts of participants’ narratives were decomposed into scorable information units (IUs) according to a predetermined scheme that counted only relevant, nonredundant information. For example, if the participant reported that the robber wore a “blue polyester suit,” this information would be decomposed into six scorable units: [jacket] [blue] [polyester]; [pants] [blue] [polyester]. To determine their accuracy, such claims would then be compared with a catalog of correct information concerning the film that was developed by three laboratory assistants who independently provided detailed transcriptions of events in the film on each of seven viewing occasions. On very few occasions, a participant’s report necessitated reviewing the actual stimulus film for verification.
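The decomposition described in this note can be illustrated with a small sketch. It is not the laboratory's scoring scheme: the lookup table, the tuple representation of units, and the catalog entries are all hypothetical, and only the correct versus incorrect distinction is modeled (attributions and confabulations are left out).

```python
# Hypothetical lookup: composite items and the garments they imply.
COMPOSITE_ITEMS = {"suit": ["jacket", "pants"]}


def decompose(item, attributes):
    """Expand a described item into nonredundant scorable information units."""
    parts = COMPOSITE_ITEMS.get(item, [item])
    units = []
    for part in parts:
        units.append((part, None))                         # the object itself
        units.extend((part, attr) for attr in attributes)  # each of its attributes
    return list(dict.fromkeys(units))                      # drop repeats, keep order


def score(units, catalog):
    """Split units into (correct, incorrect) by comparison with a catalog."""
    correct = [u for u in units if u in catalog]
    incorrect = [u for u in units if u not in catalog]
    return correct, incorrect


if __name__ == "__main__":
    units = decompose("suit", ["blue", "polyester"])
    print(len(units))  # 6, as in the note's example

    # Hypothetical catalog in which the suit was blue but its fabric is unlisted.
    catalog = {("jacket", None), ("jacket", "blue"), ("pants", None), ("pants", "blue")}
    correct, incorrect = score(units, catalog)
    print(len(correct), len(incorrect))  # 4 2
```

Representing each unit as an (object, attribute) pair keeps an attribute tied to the item it describes, so a correct color reported for the wrong garment would not be scored as correct.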



