Orne, M.T. Communication by the total experimental situation: Why it is important, how it is evaluated, and its significance for the ecological validity of findings. In P. Pliner, L. Krames, & T. Alloway (Eds.) Communication and affect. New York: Academic Press, 1973. Pp. 157-191.

COMMUNICATION BY THE TOTAL EXPERIMENTAL SITUATION: WHY IT IS IMPORTANT, HOW IT IS EVALUATED, AND ITS SIGNIFICANCE FOR THE ECOLOGICAL VALIDITY OF FINDINGS

Martin T. Orne

Institute of the Pennsylvania Hospital and University of Pennsylvania

It is often assumed that knowledge of a message permits a meaningful analysis of communication. Not only may a message fail to adequately reflect the sender's intent, but its interpretation will depend upon the context and the perceptions of the receiver. This paper will attempt an analysis of communication in the psychological experiment. This is of interest for two reasons: first, as a substantive example of a situation in which the communication as intended and perceived by the investigator may be quite different from the communication as perceived by the subject and, second, since much of our knowledge about psychology in general and communication in particular derives from experimental data, distortions in our understanding of such data become especially meaningful.

The experimental setting is generally conceptualized as a standard situation which permits systematic study of the subject's response to the independent variables under controlled conditions. It is customary in reports of empirical research to provide the detailed instructions given to the subject because of an implicit assumption that knowledge of these instructions -- the message -- will clarify the nature of the communication for the reader. It is further assumed that the subject's response under the conditions of the experiment will be representative of the individual's response in other circumstances. This chapter will suggest that the manner in which the experimental situation is generally conceptualized may be largely responsible for errors in communication and consequently for errors in interpretation, and, further, that it is essential for the investigator to understand how a particular experimental situation is perceived by the subject in order to draw sensible inference from the subject's responses. Some of the means for accomplishing this goal will be discussed as well as the implications of an alternative conception of the psychological experiment for the interpretation of research findings in general and especially for an understanding of attempted replications.

The Consequences of Being in an Experiment: The Psychological Experiment as a Unique Form of Social Interaction

When a subject is asked to participate in an experiment and agrees to do so, his perception of the situation and, consequently, his response to instructions tend to be altered drastically. This can easily be documented with a simple demonstration.

You take a group of casual acquaintances and ask them, "Will you do me a favor?" On receiving an affirmative response, you say, "Do five pushups." Typically, the individual will reply, "Why?" "Are you crazy?" "What's the point?" and so on. A matched group of casual acquaintances is then taken and again asked to do you a favor. On receiving an affirmative reply, you ask, "Are you willing to participate in a psychological experiment?" When another affirmative reply is given, you again present the instruction, "Do five pushups." The typical question now becomes, "Where?"

A slightly different version of the same experiment was done by a student 1 who took some whole fried grasshoppers and asked casual acquaintances if they would eat them, eliciting a large number of refusals. Then he went to another group and asked if they would help him with a psychological experiment. On receiving affirmative responses, he took out a stopwatch and placed a platter of fried grasshoppers in front of the subjects and said, "I want to see how many of these you can eat in thirty seconds. Start!" Practically all of the subjects began to gobble the fried grasshoppers as rapidly as possible.

1 The informal study was carried out by George J. Smiltens.

Apparently unambiguous communications tend to alter their meaning radically when they occur in the context of an experimental situation. This was particularly well illustrated by a series of substantive experiments on the question of whether hypnotized subjects can be compelled to carry out antisocial or self-destructive actions. First Rowland (1939) and then Young (1952) had independently shown that deeply hypnotized subjects could be compelled to pick up a poisonous snake with their bare hands, to remove a penny from fuming nitric acid with their bare fingers, and, finally, to throw a beaker of concentrated nitric acid at a research assistant -- actions that are clearly both self-destructive and antisocial. The face validity of this experiment was heightened by the report that the same subjects who were compelled to carry out such actions during hypnosis, when asked in the wake state whether they would be willing to carry them out, recoiled with horror. In a further replication of these studies, Orne and Evans (1965) used not only deeply hypnotized subjects but in addition included a special type of quasicontrol group (to be discussed more fully later) consisting of nonhypnotized individuals instructed to simulate hypnosis. In line with the previous findings, we also observed that five out of six deeply hypnotized experimental subjects could indeed be compelled to carry out these actions. Of greater importance, however, was our observation that six out of six nonhypnotized subjects asked to pretend to be hypnotized also performed the identical dangerous and antisocial acts. This led us to try the same procedure with random subject volunteers, and we noted that, depending upon the amount and type of social pressure involved, these behaviors could be elicited as easily from naive volunteers as from simulators or real hypnotized subjects. These were startling observations for us because we realized, as do most people, that no one in his right mind would perform such acts unless he had substantial grounds for so doing.

Only by analyzing the nature of the communication in context did it become clear that it was entirely different to ask a subject whether he would be willing to remove a penny from fuming nitric acid with his bare fingers from demanding that he do so with a clear expectation of compliance. The appropriate response to the question of whether the subject would be willing to carry out this obviously dangerous action is a resounding "No!" However, once the investigator communicates that he expects the subject actually to carry out the behavior, he inevitably also communicates that the behavior is, in fact, harmless, regardless of appearances. As careful postexperimental discussions with our subjects substantiated, the demand to carry out actions apparently dangerous to the self or others communicated that it must be safe to comply. Even though appearances might have indicated otherwise, their common sense helped them to realize that we could not and would not have a subject hurt in our experiments nor, for that matter, were research assistants sufficiently expendable to allow someone really to hurt them; subjects therefore assumed, correctly of course, that appropriate safeguards had been taken and complied with the demand. The implicit communication, then, to all subjects was one of expectation and safety.

 

Certainly the meaning of the behavior as well as its implications were almost entirely determined by the context of "experiment." Instances such as these demonstrate the difficulties of interpreting the true meaning of both the instructions and the subjects' behavior without a full appreciation of the context, even when one is dealing with dramatic events which seem to have face validity. Interestingly enough, none of our colleagues on the faculty could be persuaded to carry out any of these tasks. They, however, had not agreed to participate in any experiment.2 Their interpretation of the requested actions was very much like that of the previous investigators as well as readers of scientific articles -- but significantly different from that of the subjects who actually participated in the experiment. Perception is not only in the eye of the beholder; it is also in the position of the beholder -- whether observer or actor in the situation.

This difference between how an experiment may be perceived by the investigator versus how it may be perceived by the subject was again dramatically illustrated when we attempted to find an experimental task that subjects would rapidly discontinue, not because it was painful or excessively fatiguing, but rather because it was utterly meaningless and trivial (Orne, 1962). A serial additions task which most subjects found onerous to begin with was deliberately made progressively less meaningful. First, the subject was merely asked to carry out the task by the experimenter who took away the subject's watch and said he would be back "eventually. . . ." The next step was to provide an instruction card which told subjects to tear up the page they had just completed, throw it away, and then continue to go on working as rapidly and accurately as possible. When subjects continued to work for long periods even under these circumstances, the one-way screen used for monitoring was eliminated and the experiment was restructured so that it was clear the instructions would not change throughout the duration of the session.

2 In an entirely different setting, Silverman (1968) has shown that a persuasive communication presented in an experimental context increases acquiescence significantly more than it does in a nonexperimental context.

Considerable care was taken to establish both the accuracy and rate of work under these circumstances in a manner that subjects were, in fact, unable to detect. Nonetheless, subjects continued to work both accurately and rapidly though apparently alone and unmonitored. Postexperimental discussion revealed, however, that subjects uniformly assumed -- correctly, of course -- that they were actually monitored throughout their performance. Furthermore, while we tried our best to make the task meaningless, it was not interpreted as such by any of our subjects, who generously assumed that we must have had good and sufficient reasons for asking them to engage in such ridiculous behavior. The fact that the experimenter contrived the task to appear meaningless did not prevent the subject from inferring sensible purpose and responding accordingly.

Subjects were paid an hourly rate and it could be argued that this fact accounted for our difficulties in designing a meaningless experimental task which subjects would discontinue despite its legitimization by the experimental context itself. Anyone doubting the uniqueness of the experimental context or overly impressed with the effectiveness of small amounts of money as reinforcers may wish to try an analogous experiment in real life with his secretary. Simply ask her to type a perfect letter and, on completing it, have her proofread it and, having done so, tear it up. Then instruct her to repeat the procedure. Two, or possibly three, such trials should provide an opening for a new secretary -- despite the fact that secretaries are paid considerably more than experimental subjects, and their work is by the hour, not by the page. The crucial difference here is that the experimental subject, lacking appropriate means for determining why he is asked to perform certain actions, tends to assume that there is an important and legitimate purpose in the requests being made of him, whereas, in most nonexperimental situations, the individual is better equipped to evaluate the appropriateness of the request.

The validity of conclusions drawn from any experiment will depend upon the degree to which the experiment as it is conceived by the investigator corresponds to the way it is perceived by the subject. Unfortunately, the assumption of a one-to-one congruence is rarely met. An experience told to me by a well-known sleep researcher serves as a particularly graphic illustration. One of the investigators in his laboratory noticed that a subject had considerable difficulty in falling asleep. After about an hour, he inquired of the subject whether anything was bothering him. He was told no, everything was fine. After another hour had passed, he again inquired, but was told by the subject who still failed to fall asleep that he was quite all right. This continued until the early hours of the morning when the investigator finally entered the subject's room and insisted that something must be bothering him since he still was not asleep. At this point, the subject asked, somewhat incredulously, whether he meant that the mouse in his bed was not really part of the experiment!

In each of the examples cited, the appropriate interpretation of the experimental situation was not possible until the investigator had understood the experiment from the subject's point of view. It seems self-evident that the experiment as it is experienced by the subject, rather than how it is conceived by the experimenter, will determine how the subject behaves.

Though these instances have tended to emphasize cooperativeness on the part of the subject in the context of an experimental interaction, it would be quite inappropriate to assume that subjects are necessarily compliant or willing to tolerate discomfort or indignity if such is perceived as unnecessary to the experimental purpose. In other words, subjects will tolerate pain, discomfort, or boredom so long as they can feel it is essential for the experiment in question (Orne & Watson, 1957). They become quite annoyed and angry if they feel that their discomfort is the consequence of carelessness or ineptness on the part of the experimenter. For example, in some studies we have found it necessary to take repeated blood samples, requiring several venipunctures. Subjects tolerate this mildly uncomfortable procedure with remarkably good humor. However, in a study which required only a single venipuncture, subjects would become visibly annoyed if the assistant failed to hit the vein on his first attempt and found it necessary to repeat the procedure two or three times in order to obtain a blood sample. In this instance, it was obvious to the subject that the repeated discomfort was not necessary for the purpose of the experiment and was rather the fault of the experimenter or the staff he chose.

Again, this example illustrates how it is the subject's perception of the meaning of a procedure which determines his response, or, in other words, what meaning the total experimental context communicates to him. While the psychological experiment tends to shape the subject's perception and makes it likely that he attributes meaning and purpose to whatever tasks are required of him, this is true so long as the subject can assume that the requests are legitimate and necessary for the purpose of the research. When he is in a position to recognize that this is not the case, the subject will respond as would an individual in a nonexperimental setting. Thus, for example, asking a subject to sit for 30 min while an experimental drug takes effect rarely causes annoyance, whereas waiting the same amount of time for an experimenter who is unaccountably "delayed" is quite a different matter.

The Motivation of the Experimental Subject

Human subjects do not just appear in an experiment; rather they must be motivated or coerced to participate. How they respond to the experimental situation and their attitude toward the total enterprise will, to a considerable extent, be determined by their motives for participation.

A good deal of research has been carried out using coerced volunteers as subjects. It is a common practice to require students in introductory psychology courses to participate in several experiments as subjects, ostensibly as part of their education. Though some of these students would have freely chosen to participate, subject populations of coerced volunteers are liable to respond differently from true volunteers. On the one hand, there is the effect of being forced to participate and, on the other, there is the absence of self-selection. The relationship between the "coerced" volunteer and the "true" volunteer is a complex one still requiring further research.

Subjects volunteering to participate in psychological experiments do so for a wide variety of reasons. They may be solicited by a friend, they may be curious and hope to learn something about psychology or about themselves, they may choose this manner of seeking help for personal problems, they may, if paid, see this as an easy way to make a few dollars, and so on. Despite the different idiosyncratic motives which may be involved, there are certain general characteristics of volunteer subjects that hold for most individuals. As I pointed out some ten years ago (Orne, 1962), over and above idiosyncratic motives,

College students tend to share (with the experimenter) the hope and expectation that the study in which they are participating will in some material way contribute to science and perhaps ultimately to human welfare in general. . . . Both subject and experimenter share the belief that whatever the experimental task is, it is important. . . . If we assume that much of the motivation of the subject to comply with any and all experimental instructions derives from an identification with the goals of science in general and the success of the experiment in particular, it follows that the subject has a stake in the outcome of the study in which he is participating. For the volunteer subject to feel that he has made a useful contribution, it is necessary for him to assume that the experimenter is competent and that he himself is a "good subject." . . . We might well expect then that as far as the subject is able, he will behave in an experimental context in a manner designed to play the role of a "good subject" or, in other words, to validate the experimental hypothesis. Viewed in this way, the student volunteer is not merely a passive responder in an experimental situation but rather he has a very real stake in the successful outcome of the experiment. . . . The subject's performance in an experiment might almost be conceptualized as problem-solving behavior; that is, at some level he sees it as his task to ascertain the true purpose of the experiment and respond in a manner which will support the hypotheses being tested. Viewed in this light, the totality of cues which convey an experimental hypothesis to the subject become significant determinants of subjects' behavior. We have labeled the sum total of such cues as the "demand characteristics of the experimental situation" [pp.778-779].

 

A number of observations appear to support such an analysis of the subject's motivation. These observations include the remarkable willingness of subjects to comply with experimental instructions, their interest in learning more about an experiment in which they participated some time ago even though their own performance may not individually be identified as such, their willingness to return for follow-up experiments even at considerable cost of time and effort, the uniformly negative response subjects yield when they are led to believe that their data were not properly recorded due to equipment failure, their tendency not only to accept but also to exaggerate the importance of any research in which they were a participant, and so on.

While such an analysis of the experimental situation tends to hold under most circumstances, the subject's stake in the outcome will be maximized if he is a true volunteer,3 if he has been scheduled to participate in the study well in advance, if it has taken some effort of his own to take part in the experiment, if the amount of monetary reward has been relatively small, and if he has been treated in a professional fashion by an experimenter who appears truly interested in and concerned about the experiment itself. Finally, there appears to be a relationship between the amount of discomfort involved in participating in an experiment and the importance a subject attaches to it. Discomfort, provided it is perceived as an essential and unavoidable aspect of an experiment, tends to make him even more convinced about the importance of the experiment [an observation which could also have been predicted from Festinger's (1957) theory]. Experiments which expose the subject to stress are rarely, if ever, seen as negative experiences regardless of the transient discomfort or pain elicited by the procedure, so long as the subject sees his own response as appropriate and is able to see the situation as a mastery experience.

3 Our analysis of the experimental situation was based primarily on experiences with true volunteer subjects, since this is the population involved in the bulk of our research. In the absence of further data, caution is advisable in generalizing these observations to samples of coerced volunteers, though their similarity to true volunteers is likely to be greater than generally recognized. In considering differences between true and coerced volunteers, it is essential that the manner in which subjects are treated be held constant. Thus, there is a tendency for coerced volunteers, because of their ready availability, to be treated in a more offhand fashion and to be taken for granted in a manner that may be offensive to many subjects. When a human subject is treated as an object, this will have consequences for his behavior independent of the motivation which brings him to the research. Therefore, care must be taken in interpreting differences between settings as necessarily being a function of coerced volunteers. It seems likely that independent effects due to how subjects are treated as opposed to how they are solicited could be demonstrated.

Conversely, of course, subjects participating in a study may well feel annoyed and used. The more the above-mentioned factors are absent, especially if the individuals with whom the subject has contact seem to view the experiment as trivial and if the experimenter himself seems bored, the more likely the subjects are to feel put upon and angry. Thus, if the setting is one where subjects come to feel like objects (which is almost assured if the experimenter himself conveys the feeling that the study is trivial), their responses may well be negativistic. Under such circumstances, any discomfort associated with the experiment will become highly aversive. Even in the absence of any discomfort, the subject's annoyance may be reflected in his experimental behavior, which has led to Masling's (1966) observation about a "screw you" effect in psychological research. In our laboratory, we have failed to observe such a phenomenon, but we believe this to be a function of how subjects come to us and the manner in which they are treated. No doubt such a phenomenon could readily and predictably be produced by an appropriate -- or more correctly termed, inappropriate -- experimental setting.

One other characteristic of the experimental subject deserves special mention. Most subjects have a real concern about their own performance and also wish to be good subjects in the sense of behaving appropriately, the right way, the way normal individuals behave, and so on. As Rosenberg (1969) aptly points out, subjects see psychologists as individuals able to judge the competence, the intelligence, as well as the emotional adjustment of others, and they therefore value the experimenter's opinion of them. Rosenberg argues that "evaluation apprehension" is a powerful determinant of subjects' behavior. There can be no doubt of the importance of such a variable, and subjects will, of course, strive to look good, both in their own eyes and in the view of the investigator. To the extent that a subject sees one or another behavior as being interpreted by the experimenter as more healthy or more correct, such a perception will affect his response in the situation (Orne, 1969, 1970).

Recently, there has been some effort to specify precisely what are the most important motivational factors shared by experimental subjects. Rather than seeking to find one set of motives regnant in all experimental contexts, it would seem more appropriate to recognize that in some settings one or another of these factors will become operative; it would also seem appropriate to recognize that the nature of the subject population,4 how subjects are solicited, and how they are treated, will enhance one or another of these factors. Moreover, it would seem most important to recognize that whatever the matrix of motivations underlying the subject's perception, he will inevitably take some position vis-à-vis the experimental situation. This position will tend to shape his perception of what is communicated by the experimental situation and may consequently have profound effects upon his behavior. It seems inappropriate to conceptualize these motivational variables as simply another set of stimuli which impinge upon the subject, as this would tend to ignore the subject's active participation in the unique form of social interaction which is called the psychological experiment.5

4 Subjects' preexisting attitudes toward psychology will also vary, and these may have profound effects on their behavior in experiments (see Adair & Fenton, 1971).

Cues That Determine the Subject's Perception of the Experimental Instructions

In order to understand the experimental procedure from the subject's point of view, it is necessary first to keep in mind that subjects are aware that they are not supposed to know too much about an experiment and have reason to distrust what they are told about the procedure. They listen to instructions, they respond to instructions, but they also recognize that the experimenter may have to be less than totally truthful. Subjects are usually quite comfortable in accepting the possibility that they may be deceived in an experiment since they recognize that such deception may be necessary for the purposes of research. Their understanding in this regard is perhaps more comprehensive than that of some of our colleagues who have argued that all deception is inherently degrading and repugnant.

Regardless of whether subjects are willing to tolerate being deceived, or are in fact told the truth, it would be foolhardy to assume that they accept what they are told uncritically and at face value. Subjects tend to treat experimental instructions in a fashion closely analogous to the way most people treat the used car salesman's assertion that "this particular 1962 Dodge has been driven only 2000 miles a year by a little old lady who used it only to go to church on Sunday."

It is for this reason that it is necessary to recognize that the subject's participation in the psychological experiment begins when he first hears about the experiment and contemplates whether he will or will not participate. A number of studies of cognitive dissonance (see Brehm & Cohen, 1962) have shown that the manner in which a decision to participate is arrived at may in and of itself have profound effects on how the subject subsequently behaves. Of even greater importance, however, may be the information which is made available to the subject when he is solicited.


5 See Rosnow and Aiken (in press) for an interesting alternative way of conceptualizing the mediation of demand characteristics.

Solicitation Cues

In order to have subjects participate in an experiment, they must somehow be solicited. Investigators do this by an announcement in class, on the bulletin board, by means of an ad, or by word of mouth. The experiment must somehow be described, and that description itself may profoundly affect who volunteers and how he responds. For example, hypnotizability among subjects who respond to an ad seeking "participants for a psychological experiment" is significantly lower than among subjects who respond to an ad seeking "subjects to participate in hypnosis experiments" (Hilgard, 1965; Shor & E. Orne, 1963).

Not infrequently, one finds a member of a fraternity volunteering, and, after he has participated, a large number of members from the same fraternity promptly volunteer. It would be naive to assume that the experience of participation for what might be considered the scout is the same as that of his fellow members. The latter have been reassured by statements such as "It is really nothing," or "It is a lot of fun," or "It is an easy way to make a few dollars," or "Don't worry about the electric shock they talk about -- they never really give it to you," or "It is really a put-on," and so on. Not only are they less anxious, but also they are more informed -- not necessarily correctly informed, but more informed.

Perhaps the most troublesome aspect of the kind of cues under discussion here is the fact that the cues are generally made available to subjects without the awareness of the experimenter. Indeed, most experiments are implicitly based on the premise that subjects arrive at the experimental room pristine and unaware of any aspects of the procedure. Yet inevitably a great deal of information has already been made available to the subjects. In order to run a psychological experiment, it is necessary first to schedule the subjects. Most frequently this is arranged by asking them to call, or by a sign-up list and later contacting them about an appointment time. Often the scheduling is arranged by a secretary or a research assistant, rarely by the experimenter himself unless he happens to be a graduate student. It is simply not possible to arrange an appointment time with a subject without providing a good deal of information about the experiment in which he is expected to participate. The amount and kind of information made available during this transaction inevitably will help shape the subject's response. Unfortunately, since the investigator is not generally present, he rarely if ever knows about what is actually communicated.

The importance of assessing solicitation cues became forcibly clear to us some years back in the process of evaluating the sensory deprivation phenomenon.6 We noticed that the first studies used 2 weeks of sensory deprivation and reported that various signs of breakdown could be observed after approximately 10 days. Following these, other studies involving only 1 week of deprivation were run, and evidence of severe disturbance was noted after about 5 days. Subsequent studies which involved 3 days of isolation showed similar effects after 2 days. Twenty-four-hour studies led to the observation of deleterious effects after only 16 hr. Finally, investigators using 8 hr of deprivation observed psychic disturbances after only 6 hr. The effect might best be described as a "two-thirds of the way through the experiment" breakdown, yet the investigators reported that their subjects were not told how long the period of total isolation would be.
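
The pattern can be checked with a few lines of arithmetic. The sketch below is an illustrative addition rather than part of the original analysis: it uses only the durations quoted above, takes "approximately 10 days" and "about 5 days" at face value, and converts everything to hours. The names `studies` and `onset_fraction` are hypothetical.

```python
# (label, scheduled hours of deprivation, hours until disturbance was reported)
# Figures are the approximate ones quoted in the text, converted to hours.
studies = [
    ("2 weeks", 14 * 24, 10 * 24),
    ("1 week", 7 * 24, 5 * 24),
    ("3 days", 3 * 24, 2 * 24),
    ("24 hours", 24, 16),
    ("8 hours", 8, 6),
]


def onset_fraction(scheduled_hours, onset_hours):
    """Fraction of the scheduled session elapsed when disturbance was reported."""
    return onset_hours / scheduled_hours


fractions = {label: onset_fraction(s, o) for label, s, o in studies}

for label, fraction in fractions.items():
    print(f"{label:>8}: disturbance reported {fraction:.2f} of the way through")
```

On these figures the reported onset falls between roughly two-thirds and three-quarters of the scheduled period in every case, although the absolute onset times range from 6 hr to 10 days.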

As we tried to set up such an experiment, it soon became obvious that it was absolutely impossible to schedule a subject without indicating to him at least approximately how long the experiment would take. While subjects might well be willing to volunteer for an experiment of undetermined length, a subject being scheduled for a Tuesday experiment would say, "Well, if the experiment isn't over on Wednesday, I can't participate because I have an important exam." The helpful research assistant, faced with the task of persuading subjects to take part, will respond, "Oh, don't worry about it. Of course you will be finished." Indeed, this information was usually volunteered only indirectly; for example, subjects might be told, "You will be paid $2 an hour and the experiment pays $20." As we became more alert to this source of information, we became aware that one could not meaningfully demand of a scheduling assistant that appointments be arranged "without giving the subject any information." While the investigator may insist on being told that this was actually accomplished, in fact, it rarely can be.
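
The leak in the payment example is a matter of simple division. As a hedged illustration, using only the two figures quoted above (the function name `implied_duration_hours` is hypothetical):

```python
def implied_duration_hours(total_pay, hourly_rate):
    """Session length a subject can infer once both pay figures are known."""
    return total_pay / hourly_rate


# "$2 an hour" and "the experiment pays $20", as quoted in the text.
print(implied_duration_hours(20, 2))  # 10.0
```

A subject told both figures therefore knows the session is expected to last about 10 hr, even though no one has stated its length.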

Largely through the efforts of Emily Orne, our laboratory has gradually evolved scheduling procedures which do provide reasonable control over the precise information given. Unfortunately, it requires a great deal of effort and continuing attention to detail. Thus, at any given time, several experiments may be going on within the laboratory. We are careful not only to monitor the precise nature of the initial solicitation, but also to control the information that is provided to the subject by telephone. To do this, advertisements for each experiment ask the potential subject to call a specific individual at a specific telephone number. We use several noninterchangeable telephone lines, and a particular research assistant is assigned to answer calls from a particular line. Which ad the subject has read is identified both by the telephone number on which his call is received, and by the individual for whom he asks, as well as by how he describes the study. The designated individual is the only one permitted to handle subject calls for a given study.


6 Orne, M.T., & Scheibe, K.E. The effects of sensory deprivation: A critical review, unpublished manuscript, 1962.

In advance of the experiment, a detailed list of the information which the research assistant is to make available to the subject is worked out, and, while scheduling, the assistant not only arranges the time, but also provides the specified information about the study in a casual conversation. This includes such aspects as what electrodes must be attached, whether shock may be involved, and so on. Furthermore, every effort is made to work out the kinds of questions the subject might ask, and they are gone over with the research assistant in advance, together with a set of answers designed to keep constant the amount of total information that is made available to each individual. Any additional questions which a particular subject then asks, as well as the answers given to them, must be noted. Thus, the process of handling all possible questions is continually monitored. A checksheet is provided for the research assistant where she notes each point as it is discussed in an apparently spontaneous, but actually carefully preprogrammed fashion.

Despite all of these precautions, we do not, of course, have complete control over the material communicated. The procedures do make certain that subjects have a reasonably stable amount of information and prevent the inadvertent communication of information which is likely to have a profound effect on subjects' experimental behavior. 7

The significance of the information provided by the scheduling assistant is also apparent in an entirely different context. The information given informally by the research assistant is likely to be viewed by the subject as even more telling than that provided by the experimenter. Subjects see the scheduling assistant's task as merely arranging an appointment time, and therefore information which is casually and apparently inadvertently given has a very high degree of credibility. Subjects tend to perceive the scheduling assistant as uninvolved and thus unbiased; comments from such a communication source become especially trustworthy.

An additional source of information usually overlooked is the interaction with the technician, whose job is too often viewed as one of merely attaching electrodes or setting up the subject with complex equipment. It is almost impossible for an individual to attach electrodes and interact with a subject for perhaps half an hour without responding to questions about the study. Because the technician is seen as a disinterested person who must be aware of what is going on, but who has little stake in concealing it, his casual remarks may carry a great deal of weight. This is somewhat analogous to the everyday phenomenon encountered in a doctor's office where a casual gratuitous comment by the nurse or technician tends to be accepted as gospel -- often in preference to what the physician says -- because the patient is afraid that the doctor may be trying to conceal his real illness from him.


7 If a subject asks such pointed or unique questions that the scheduling assistant is forced to provide additional information or rework old information in a new manner, a special telephone contact form is prepared for the experimenter so that a decision can be made in advance of the experimental session as to whether the subject's run should be disqualified. Because the subject volunteered in good faith, he is still run, however, and from his point of view his data are meaningful to the laboratory.

Cues Arising from the Experimental Procedure Itself

The most important single source of cues available to the subject about the purpose of an experiment is provided by the experimental procedure itself. This reflects the well-known adage that actions speak louder than words, especially in a context where there is reason to distrust words. Thus, the experimental protocol inevitably must communicate something about the purpose of an experiment. Attitude research has had to come to terms with the fact that if a subject is given a test, presented with an intervening treatment, and then retested, the procedure clearly communicates that the investigator expects some kind of change, regardless of what the subject is told about the procedure. A variety of complex and sophisticated controls have been developed to evaluate pretest sensitization (see Lana, 1969; Solomon, 1949).

Despite its obvious importance as a source of communication, investigators have tended to overlook the significance of the experimental procedure in most fields of psychology. To take another example from work in hypnosis, Hull (1933) recognized the importance of order effects, and for this reason, in evaluating the effect of hypnosis on physical performance, he used the standard ABBA-BAAB procedure. In a number of studies, highly significant differences could be observed. Note that the logic of such a design is one where the performance of all subjects is averaged in order to statistically eliminate practice effects. In several studies, we have found very striking differences between these two orders which cannot be ascribed to practice (Evans & Orne, 1965). Subjects who are run in hypnosis first apparently perform much better in hypnosis than in the wake state, whereas a very small improvement can be achieved in subjects who are run in the wake state first. Subjects in the former case recognize from the design -- not from anything they are told -- that it is the intent of the investigator to compare the hypnotic performance with the waking performance and apparently at some level suppress their waking performance. On the other hand, subjects who are run in the wake state first yield a waking performance of the same order of magnitude as the hypnotic performance of other groups, with very little if any increment in the subsequent hypnotic performance.

It should be noted that the modification in experimental procedure not only alters simple order effects, but also changes the ease with which subjects can recognize and respond to the intent of the investigator. Zamansky, Scharf, and Brightbill (1964) documented the importance of the subjects' awareness of whether or not they would subsequently be tested in hypnosis. Thus, if subjects knew in the first waking performance that they would subsequently be tested in hypnosis, they yielded a significantly higher threshold, or worse performance, than subjects of equal hypnotizability who were not aware that they would subsequently be tested in hypnosis.

Unfortunately, the subject's perception of an experimental protocol depends upon many subtle factors in interaction with the protocol itself. Frequently, minor changes in experimental conditions or in order of presentation will result in major changes in a subject's perception of the experimental purpose and consequently major changes in his responses.

The Study of Demand Characteristics

While it is possible to specify some of the sources of cues responsible for a subject's perception of an experiment, it would be a matter of empirical inquiry to determine precisely what subjects do in fact perceive. The investigator will often want to determine directly how the subject perceives a given experimental study.

The experimenter is, of course, not as interested in some aspects of the subject's perception as in others. He is specifically concerned with those perceptions which are likely to have a systematic effect on the individual's behavior during an experiment. It is this specific aspect of the subject's perception -- the demand characteristics of the experimental situation -- which becomes an important area to explore. In reading a protocol of a psychological experiment and thinking about how a subject might behave, one implicitly makes assumptions about how the subject perceives the situation. As we have tried to underline, all too often these implicit assumptions are incorrect. With sufficient experience -- and sufficient knowledge about the kind of details about the experiment discussed earlier -- the investigator may make some inference about how the situation is perceived. Colleagues, regardless of their seniority or experience, can never be the final arbiter of this question; only subjects who have been in the experimental situation can really elucidate how and what subjects perceive in the situation. The actor in the situation is likely to perceive cues differently in context than an outside observer trying to evaluate them. Consequently, subjects may fail to perceive what seems completely obvious to the investigator; at other times, they will easily see through the most byzantine deceptions devised by the investigator.

The Postexperimental Inquiry

The most straightforward kind of quasi-control procedure is the postexperimental inquiry. At the conclusion of the experiment, the subject's perceptions about the procedure are elicited in order to learn how he perceived the experimental situation: what he believed the purpose of the experiment to be, what he believes the experimenter hoped or expected to find, how he thinks other individuals did, and how he thinks he did in relation to what he believes to be the experimenter's hypotheses.

In conducting an inquiry, it is necessary to keep in mind that subjects in psychological experiments tend to be aware that they ought not to catch on to some aspects of the experimental procedure, that if they indicate that they know too much, their data cannot be used. Since subjects tend to have considerable investment in having their data be useful, there is a tendency for them not to volunteer their awareness of the investigator's hypotheses, especially if these have not been discussed. This tendency is, of course, reinforced by a subject's reluctance to appear wrong.

It should be recognized that the needs of the subject in this regard tend to mesh with those of the investigator, who is by no means eager to learn that subjects actually caught on to what he assumed to be a clever deception or that subjects perceived the experiment as different from the way he intended. Consequently, there is a tendency for a "pact of ignorance" (Orne, 1962) to evolve where the subject answers the casual question: "What do you think the experiment was all about?" with "I don't know," and the investigator all too gladly terminates his inquiry. However, if the investigator persists, he will be rewarded -- or punished, as the case may be -- by learning that most subjects form a very clear picture of what the experiment is about and why the experiment is being carried out, though this picture may or may not resemble the experimenter's actual hypotheses. 8

A postexperimental inquiry, carried out with tact and persistence, communicating to the subject that information is truly desired, while taking care, of course, not to shape his response, may yield a great deal of meaningful information. To carry out an effective inquiry, the investigator should strive to alter the subject-experimenter relationship and restructure it in order to make the subject a coinvestigator. The change in role from subject to coinvestigator tends to have a significant effect on the nature of the communication. It is particularly helpful from this point of view for the inquiry to be conducted by someone other than the experimenter himself. A second experimenter does not need to alter his role relationship with the subject; rather he begins the interview with the subject as a coequal whose task now becomes helping the scientist to understand what was communicated to him by the experimenter, how he perceived the experimental procedures, and what might be done to improve the experiment in future work. Most important, perhaps, are the subtle, but often significant alterations in the nature of the communication which tend to take place when someone other than the experimenter discusses what occurred. In this different context, the subject can view his experience more dispassionately and may feel more free to discuss what he might construe as implied criticisms of the procedures or the experimenter. In conducting an inquiry, it is important to avoid any judgmental comments and rather to take the role of an individual seeking to learn and understand. As is the case in the therapeutic context, the more the subject is asked to explain and spell out what he means, the more likely that he will be in a position to communicate significant aspects of his experience. The more the interviewer is willing to accept ambiguous comments and such phrases as "You know what I mean," the less likely is he to learn what the subject really meant.


8 A number of recent studies have shown that inquiries may still fail to break down the subject's reluctance to admit to "forbidden information" (see Golding & Lichtenstein, 1970).

The inquiry procedure, even optimally carried out and objectified by the use of judges, still suffers from several limitations. The most important, perhaps, is that the subject's conclusions at the end of the experiment are determined by everything that has taken place, and it is very difficult to establish at what point in time within the experiment, or even the inquiry, the subject arrived at his conclusions. The relationship between behavior and the subject's awareness of a particular hypothesis is complex. Awareness may not be an all-or-none phenomenon; rather it may gradually evolve and it may even be partially shaped by the subject's observing his own behavior. On the whole, however, a particular hypothesis a subject may form about the experiment is likely to affect his behavior more after he has arrived at it. Consequently, it becomes very important to establish when in the course of the experiment, cues are available to permit him to form a particular hypothesis.

Solomon 9 has proposed the use of "sacrifice" subjects where the experiment is terminated at different points in time in order to establish the kind of hypotheses that are formulated at the various crucial points within a given study. Despite the cost of such a procedure, it is nonetheless extremely powerful in helping to clarify the nature of the cues which affect subjects' perceptions of an experiment.


9 Richard Solomon, personal communication, 1958.

The "Nonexperiment" Procedure

Another equally troublesome difficulty is the fact that subjects may derive their hypotheses about the experiment from their own behavior during the session. In order to avoid this difficulty, we have employed another procedure which was suggested independently by Riecken (1962) and which we have called the preinquiry or nonexperiment (Orne, 1962) in which all experimental communications are available to the subject.

In the nonexperiment, a subject is told everything about the experiment, given the experimental instructions, shown the experimental equipment, and is told about the treatment to which he would be exposed, but he is not actually exposed to any of the treatment procedures. For example, in the sensory deprivation study, he would be given the instructions, administered the pretests, shown the setup, put in the situation momentarily, and then told, "If you were a subject in this experiment, you would now be left alone for an indeterminate period -- that is, you would not know how long it would be; actually, however, it would be 8 hours." He is then told, "Now, I want you to behave as if you had actually been in this experiment." He is then administered the posttests as if he were an actual sensory deprivation subject. In the nonexperiment, the subject comes from the same population and shares the same background information as other members of the subject pool who would ultimately participate in the actual study. Such a subject is not quite an actor in the actual experimental context, but he is considerably more than an observer. He is in an experimental situation, though not the identical experimental situation as an actual subject, and tends to yield information that goes beyond what can be surmised intuitively by the investigator in planning his experiment.

The nonexperiment is an extremely interesting procedure because it depends exclusively on the subject's ability to think himself into the situation. Again, the very kind of mental mechanism which tends to complicate human research is turned to advantage in this procedure. The nonexperiment could easily be conceptualized as an extension of the inquiry procedure, but it has the virtue of yielding data in the same form as those given by subjects who actually participate in the experiment. It is no longer necessary to make an inference about how a given perception might affect a subject's behavior -- the subject himself makes that inference and provides the behavior. If preinquiry data are identical to those obtained from subjects in the actual experiment, an alternative explanation for the subject's behavior must be considered. Thus, the subject in the actual experiment might be responding in terms of his expectations rather than to the experimental variables. Such a finding would suggest that the experimental procedure was not adequate as a rigorous test. It is for this reason that we think of these controls as procedural controls which help design better experiments.

Simulation Techniques

A further extension of this principle can be seen in the use of simulators as controls (Orne, 1959). This technique, which was originally designed for use in hypnosis research but can be extended to other kinds of treatments, asks the subject to simulate a state, for instance hypnosis, for an experienced hypnotist who is actually blind as to the true status of the subject. Again, we are asking the subject to utilize his cognitive processes in the service of the control procedure, to try his best to respond to all the subtle cues in the situation in order to yield the kind of behavior a highly hypnotizable, deeply hypnotized individual would give. To the extent that an unhypnotizable, simulating subject is able to do this without special training, we must conclude that an alternative hypothesis could explain the behavior. It thus indicates that the experimental test, designed to prove something as necessarily due to hypnosis, is not adequate. It of course says nothing about the treatment itself. Thus, the simulator may merely know enough to predict accurately how a given treatment would affect the subject.

The nature of inference based on simulating subjects' responses must be very carefully thought through since only counterexpectational findings can be expected to yield differences. Consequently, we will often find no differences in behavior between hypnotized and simulating subjects even though there are differences in the mechanism by which the behavior is elicited. On the other hand, when differences do emerge, we will have considerably more faith in the experimental conclusions based upon them. [For a more detailed discussion of the conceptual issues, see Evans (1971), Orne (1971, 1972), and Sheehan (1971).]

The Concept of Quasi-Controls

In order to carry out any experimental research, it is necessary for the investigator to translate his theoretical concepts into operational terms, which then define the intended meaning of the experimental procedures. For meaningful experimental research in any field of science, it is crucial that what is operationalized in the experiment validly reflect the more general theoretical construct being examined. There are, however, no fixed rules by which the validity of the operational definition is evaluated, and, at times, this appropriately becomes a source of controversy. For the most part, in the physical sciences the reader of a paper is able to evaluate the adequacy of the translation.

As one applies the operational approach to psychological research, however, a new set of problems emerges. For example, in studying physiological effects of stress, the technique of requiring subjects to count backward rapidly by sevens has often been used as a moderate stressor. In order to compare the physiological response of the subject in his resting state with his response under stress, physiological recordings are obtained, first during a "base line" period and then after asking the subject to count backward as rapidly as possible by sevens -- the latter procedure being operationally defined as stress. In a recent experiment (Paskewitz & Orne, 1971), we were surprised to note that of nine subjects, two showed absolutely no objective evidence of stress when counting backward by sevens. These subjects reported that they were not troubled by this task, and, indeed, their objective performance was such that there was hardly any difference in either rate or accuracy when these individuals counted backward by sevens versus when they counted backward by ones.

While a great many techniques have been developed to make certain that differences observed in experimental situations are not random or chance events, very little, if any, attention has been given to the validity of the operational definition (Evans, 1971). It is obvious in this instance that for two of our subjects counting backward by sevens under time pressure was simply not stressful and that, to the extent that one is seeking to study the effect of stress on physiological responsivity, they should be excluded from the data analysis.

While most readers would agree in this instance, the more general implication would be that the investigator, defining a given procedure as a stressor, must ascertain whether in fact it serves the intended purpose for all of his subjects. If it fails to stress some individuals, they should not be considered in the same experimental situation as those who are in fact stressed. In other words, some aspect of the subject's response must a priori be used to ascertain whether the situation is for that subject what it was intended to be by the investigator.

There are a considerable number of research contexts where the subject's perception of the situation leads to radically different responses. In the example just given, the problem arose due to individual differences in numerical skills. More subtle, but equally serious, problems result from purely cognitive factors leading to differing perceptions of the experimental situation. In translating the work on avoidance conditioning with dogs to humans, Turner and Solomon (1962) used a simple manipulandum which required subjects to move a lever in order to avoid shock. Some subjects failed to learn and continued to accept shocks for very large numbers of trials. These subjects had been solicited for a shock experiment, and the initial experimental procedure included considerable emphasis on the level of shock that could be tolerated. It turned out that some individuals perceived the purpose of the study to be to establish whether they would be able to tolerate the discomfort of the shock, and they behaved accordingly, ignoring the manipulandum. Clearly, the subjects who learned perceived the experimental situation differently from those who failed to do so. A minor modification in the instructions, including the statement that it was possible for subjects to learn to avoid the shock, led to rapid learning with all subjects. The experiment, as originally conceived by the investigators, concerned the learning of avoidance conditioning in man, but the perception of some subjects defined it as an endurance study for them, leading to very different and initially somewhat surprising behavior. Once the investigators explored the subjects' perception, it became possible to alter the experiment, leading to far more stable and predictable results. This was possible only after the experiment was analyzed from the subject's point of view.

Another kind of problem occurs when the subject sees the experimental situation in a way totally different from the way it is perceived by the investigator. This was the case in the studies on antisocial behavior and hypnosis to which we have alluded earlier. Only after the use of simulating subjects demonstrated that unhypnotized individuals would be willing to pick up a poisonous snake and throw acid at an assistant, did it become fully clear how differently the situation was seen by the subjects and the investigators. Once this discrepancy was recognized and resolved, it became a simple matter to run other control subjects and demonstrate that compliance in this situation varies directly with the degree of conviction with which the instructions are given.

Deception as a Control Procedure

The problems inherent in how subjects perceive the experimental situation are implicitly recognized by all investigators. It is this recognition that has made the use of deception in psychological research so commonplace. Subjects are deceived in the hope of finding out how they would really behave in a situation not confounded by their awareness of the variable actually under investigation. Typically, investigators take considerable pride in the subtlety of their techniques and are satisfied to present the outline of how subjects were deceived, going on to assume that this was actually the case. Obviously, these investigators would not have utilized the deception manipulation had they not believed it would make a difference. It seems remarkable, therefore, that both they and their professional audience often uncritically assume that the subject is in fact deceived (Orne & Holland, 1968).

In each of the examples previously cited, the meaning of an experimental finding can be understood only when the investigator evaluates it from the subject's point of view. This allows him to determine the validity of the procedures as operational definitions of the psychological process under investigation. In the study from stress research where two subjects failed to be stressed by the demand to count backward rapidly by sevens, these individuals differed from others in their ability to manipulate numbers. As a consequence, however, their perception of the situation was drastically altered. Thus, the operational definition of the counting procedure as a stressor was not valid for these particular subjects. In the conditioning study, subjects who perceived the experiment as a test of their masculinity were clearly responding to an entirely different psychological context from those who perceived the situation as a test of their ability to learn to avoid electric shock. Inquiry procedures clarified that situation and made it a simple matter to design instructions which assured that all subjects were in a learning experiment. In the study on antisocial behavior, simulating subjects were used to clarify that the act of using one's bare hands either to remove a penny from fuming nitric acid or to pick up a poisonous snake need not necessarily be defined by the subject as self-destructive behavior. Rather, these subjects showed that they clearly recognized the procedure as safe. Once the validity of the procedure as an operational definition of self-destructive behavior is questioned, our understanding of the previous experimental findings is radically altered.

In any deception experiment, it is clearly crucial to determine whether the deception manipulation was effective in order to evaluate the validity of the experimental procedure. Thus, if subjects are tipped off either as to the nature of the deception or as to the fact that deception is part of the experiment, they ought to behave differently from those who lack such information. If they fail to do so, one may seriously question, in the absence of other data, whether it would not be best to assume that the original subjects might somehow have seen through the deception manipulation. Hence, the particular deception procedures are not likely to help in creating a valid operational definition (see Golding & Lichtenstein, 1970; Holland, 1967; Levy, 1967).

Several other relevant examples have been given earlier in this chapter, and many more will undoubtedly come to mind when the reader considers studies where doubts can be raised regarding the validity of the experimental approach. It would seem clear that, for the experimental technique to become more effective in psychology, it is essential that we consider this problem at least as carefully as that of statistical significance. It is particularly relevant to the ecological validity of an experimental study to establish whether it validly reflects the mechanism it is meant to explore.

Quasi-Controls as Procedures to Evaluate the Total Experimental Communication

The usual concept of control is intended to clarify the nature of the independent variables. However, the problems we have been discussing here all result from the way the subject perceives the situation, what the experiment communicates or means to him, and how he is affected by the demand characteristics. These issues are at a different level from those generally considered in texts on methodology. Clearly, the discrepancy between how the subject perceives the experimental context and how the experimenter intended for it to be perceived will be more serious in some research studies than in others. Further, the effects of the discrepancy, if any, will be greater in some situations than in others. While we can never be certain about the extent to which these factors may be confounding variables, we can and need to use techniques to estimate their possible effects.

We need to think in terms of controls built into the experiment to evaluate the effect of the experimental situation itself. These procedures are designed to explore how the subject perceives the situation and to determine whether his perception is that which the experimenter intended. Findings from these procedures will tell us little or nothing about the independent variables, and will help to shed light only on the extent to which the experiment is a valid means of examining them. These techniques are controls to evaluate the procedures of an experiment and the extent to which they validly reflect what they are intended to reflect. It is difficult to find an appropriate name for them. They could be considered "active controls" since they are concerned with the unwanted contributions to the experimental situation which the subject inevitably adds that are the result of the subject's active thinking processes. One might call them "validity controls" except, of course, that all controls speak to the validity of findings. Again, one might call them "procedural controls" because they test the adequacy of the experimental procedures, yet this seems hardly an adequate description. For want of a better term, I have called them quasi-controls (Orne, 1969). One would hope that, in addition to the quasi-control procedures discussed earlier (which include the nonexperiment, the use of simulating subjects, and the use of subjects alerted to possible deception), other quasi-controls appropriate to special experimental problems will be developed. These control procedures need to be interpreted with great care because they involve the subject in a special and different way from simply asking him to be a passive responder. They require responses from the subject as basic data to evaluate the communication of the total experiment.

When Are Quasi-Controls Needed?

For didactic reasons, a number of examples have been given where the subject's perception of the experiment, his beliefs about what constitutes appropriate behavior, and his perception of the research aims account for the major portion of the variance in his responses. Though these variables are important in some experimental situations, they have little or no effect on the outcome of others. Unfortunately, in the absence of quasi-control data one can hardly ever be certain whether this is the case in any particular experiment. Since clear instances can be documented in such diverse areas as conditioning, perception, and psychophysiology as well as social psychology where subjects' perceptions have played unexpectedly important roles, one may well ask to what extent an investigator needs to concern himself with these issues. Is it essential to set up elaborate quasi-controls in every study with human subjects? Such a course would increase the cost of research geometrically and slow research to a snail's pace. On the other hand, if the investigator assumes that these factors are unimportant, is he likely to be deceiving himself and wasting his time on laboratory artifacts?

There are a number of hints, however, which ought to alert an investigator, in the course of reviewing the literature as well as during his pilot research, to the likelihood that demand characteristics are important variables in the particular phenomenon under investigation. If the same procedure appears to lead to dramatically differing results in different laboratories, or, alternatively, if widely different procedures used in the same kind of context produce similar results whereas these procedures in other contexts do not elicit such findings, the subject's perceptions are likely to be of paramount importance.

During pilot studies, most subjects may yield data consistent with the literature, but a few produce totally different findings. Under such circumstances, it will be most productive to explore in depth the nature of the aberrant subjects' responses and the possible reasons for them. From this point of view, it is not important whether the number of subjects who behave appropriately is sufficiently great to yield statistically significant findings despite several deviant subjects. The purpose of exploring the reasons for subjects' aberrant responses is to obtain crucial hints about the nature of the demand characteristics in the situation. The task, of course, is to establish whether the deviant responses are a function of peculiarities of the subject or of alternative perceptions of the experimental situation. When the latter is the case, it is usually possible to alter the experimental results drastically by subtle variations in procedures and instructions. Furthermore, the findings from the experiment can be generalized only to those individuals who perceive the experiment in a particular manner which, in turn, may be contingent upon subtle and ephemeral cues. Similarly, a bimodal set of responses should alert the investigator to explore differences in subjects' perceptions. This is especially true when subjects split on such parameters as psychology majors versus others, previous experience with experimental studies, subjects run early in the experiment versus subjects run late in the experiment, IQ, and so on.
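
The screening for split responses described above can be illustrated with a short Python sketch. It is not part of the procedures reported here: the function name, the field names, the pilot data, and the cutoff of 0.8 pooled standard deviations are all assumptions chosen for illustration. The sketch groups pilot responses by each recorded subject characteristic and flags those characteristics on which the two groups differ markedly in mean response.

```python
from statistics import mean, stdev

def flag_split_factors(records, outcome, factors, threshold=0.8):
    """Flag two-level subject characteristics on which responses split.

    For each factor, the absolute difference between the two group means
    is divided by the pooled standard deviation; factors at or above
    `threshold` (an arbitrary illustrative cutoff) are returned.
    """
    flagged = {}
    for factor in factors:
        groups = {}
        for rec in records:
            groups.setdefault(rec[factor], []).append(rec[outcome])
        # Only handle factors with exactly two levels and >= 2 subjects each.
        if len(groups) != 2 or any(len(v) < 2 for v in groups.values()):
            continue
        a, b = groups.values()
        pooled_var = ((len(a) - 1) * stdev(a) ** 2 +
                      (len(b) - 1) * stdev(b) ** 2) / (len(a) + len(b) - 2)
        if pooled_var == 0:
            continue
        effect = abs(mean(a) - mean(b)) / pooled_var ** 0.5
        if effect >= threshold:
            flagged[factor] = effect
    return flagged

# Hypothetical pilot data: one response score per subject.
records = [
    {"psych_major": True,  "run_early": True,  "score": 9},
    {"psych_major": True,  "run_early": False, "score": 10},
    {"psych_major": True,  "run_early": True,  "score": 11},
    {"psych_major": True,  "run_early": False, "score": 10},
    {"psych_major": False, "run_early": True,  "score": 2},
    {"psych_major": False, "run_early": False, "score": 3},
    {"psych_major": False, "run_early": True,  "score": 1},
    {"psych_major": False, "run_early": False, "score": 2},
]

flagged = flag_split_factors(records, "score", ["psych_major", "run_early"])
print(sorted(flagged))
```

A flagged characteristic says nothing about mechanism. It only indicates which subjects should be questioned in depth about how they perceived the experiment.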

As investigators, we should seek to understand insofar as possible every response of every single individual. In order to accomplish such a goal we need to focus on how each particular subject perceives the experiment. It should be clear that such efforts to understand and explain responses have little or no place in the published account of an experiment; rather, they are intended to serve the vital function of helping to increase the investigator's sensitivity to alternative explanations of subjects' responses. Such an understanding is likely to permit the investigator to redesign his experiment in such a manner that the subject perceives the situation in a way that makes the operational definitions more congruent with the mechanisms they are intended to reflect.

Common Pitfalls in Attempts to Control for Demand Characteristics

With the growing concern about issues of demand characteristics, studies in some areas of psychology have attempted to deal with the problems in a variety of ways. Two procedures widely believed to be effective in controlling demand characteristics deserve mention because they simply fail to come to terms with the problems. Under the misapprehension that demand characteristics are communicated primarily by instructions, some investigators have tried to eliminate verbal instructions, attempting to treat their subjects as though they were laboratory animals. It is foolhardy to think, however, that simply because the subject is not told verbally why an experiment is being conducted or how he is expected to behave, he will therefore fail to interpret or to use the cues available from the experimental procedure itself as the basis for his experimental performance. No verbiage simply does not equal no communication! Regardless of what the subject is told or not told, he will inevitably form perceptions about the experimental purpose, which will, in turn, affect his responses.

Partially in reaction to the use of deception in psychological experiments, some investigators have made a special point of telling subjects precisely what is wanted and why. In some instances, they have made the unrealistic assumption that the fact of their honesty will assure how the subject perceives the situation. Unfortunately, some honest explanations about experiments are less plausible to subjects than some of the better deception instructions. At times, subjects find it hard to believe that the investigator is really interested only in what he honestly describes as his purpose. When this occurs, the subject's belief about what he is doing and his inaccurate perception of the experimental purpose will affect his behavior. The consequence is exactly the same as in an unsuccessful deception study: It is what the subject perceives which determines his behavior, and this need not be what the experimenter had intended.

Demand Characteristics as a Spoiler Variable

As one gradually becomes aware of the extent of the subject's active participation in experimental situations and how this may serve to distort and modify the effects of the independent variables, it becomes necessary to rethink the enterprise of research with human subjects. At first one tends to be appalled by the difficulties. The initial dismay may gradually turn into irritation, as students and colleagues are all too ready to base criticisms of serious research on questions of demand characteristics. It is always possible to raise doubts about whether subjects really perceived the situation as they were supposed to and, consequently, whether demand characteristics might not be a better way to account for the subject's behavior than the independent variable. Such criticisms when leveled glibly and without much thought become a source of considerable annoyance to those of us actively engaged in empirical research. A preoccupation with these problems may even serve to discourage colleagues from concerning themselves with substantive psychological issues.

Unfortunately, the investigator trying to answer substantive questions by carrying out laboratory research with human subjects must somehow deal with the infinite regress implicit in criticisms arguing that a given effect is due to demand characteristics. There can be no simple answer since any experiment, regardless of the caution with which it is conducted and the careful use of quasi-controls, can still be subjected to such criticisms. While there is little doubt that research with human subjects is far more difficult and complex than has generally been recognized, no adequate viable alternative to the experimental method is available. Field experiments, experiments in nature, the technique of the participant observer may all serve useful purposes, but none permits the systematic exploration of psychological mechanisms, nor do they provide for the kind of control which is potentially possible in the laboratory. The recognition that the experiment is a complex interaction and that subjects' cognitive processes may interfere with our efforts to investigate these very processes should not negate the potential usefulness of the experiment. An understanding of the concept of demand characteristics should ultimately allow us to design better experiments, permitting ecologically valid inference to the real world, because we have learned how to analyze the communication of the total experiment from the subject's point of view.

It does follow, however, that it will be necessary to take the potential problems of demand characteristics into account in any interpretation of psychological research. Perhaps this will help us recognize that no single experiment can ever resolve major issues -- that the effort to design the single perfect, definitive experiment is essentially futile. Instead, experimental findings need to be conceptualized as a useful source of relevant information concerning the psychological phenomena they purport to investigate. The controlled circumstances under which the observations are obtained help in their interpretation; however, certain inevitable limitations also stem from the context in which the observations are carried out.

When data from experimental studies appear to contradict repeated reliable observations in nonexperimental settings, further work seems essential to resolve such a paradox. We can ill afford to accept experimental data because "it was obtained under carefully controlled laboratory conditions" and reject other observations out of hand; nor is it appropriate to reject experimental data because of conflicting field observations. Science must be able to account for apparently contradictory findings, and the kinds of variables we have been discussing here are particularly likely to be responsible for discrepant observations.

The experimental investigation of psychological phenomena remains an exceedingly powerful and effective way of answering important questions. Nonetheless, it is essential that the limitations of findings from any given experiment be recognized. This seems particularly urgent if it is our intent to generalize our observations so that policy decisions in the real world can be based upon them. Under such circumstances, we should require a matrix of experimental data as well as congruent, systematic observations gathered in nonexperimental settings in order to feel reasonably confident about the ecological validity of our conclusions. If the concept of demand characteristics is useful only in raising doubts about premature generalizations from single experiments (regardless of how congruent they may be with our value system) to policy decisions affecting the lives of individuals, such an outcome would still seem highly desirable for psychology, for those who would apply its findings, and particularly for those who would be affected by the decisions.

Even though an investigator is careful in his research, tries to see experimental observations in the broader perspective of observations obtained in other contexts, and tries to use quasi-control procedures as they seem relevant, he may still find his experiments subject to criticisms arguing that a given effect is due to demand characteristics. How can he deal with such critiques?

While any experiment may be criticized on the basis of presumed demand characteristics, such criticism eventually boils down to a controversy about how the experiment is seen from the subject's point of view. Since it is far more difficult to specify how subjects actually perceive an experiment than to specify the nature of instructions and physical stimuli, it is hardly surprising that such criticism is more difficult to deal with. The concept of demand characteristics forces us to attend not so much to what is done to the subject, but rather to how the situation is perceived by the subject. Implicitly, all investigators make definite assumptions about how their subjects see the experimental situation, assumptions which are generally taken for granted and rarely challenged. A criticism based on demand characteristics challenges this implicit view and thereby provides an alternative hypothesis to explain the observed findings, which is not different in kind from a myriad of other alternate hypotheses which can be devised to account for experimental observations. Clearly, no investigator can be required to answer every possible alternate interpretation of his findings; he needs to concern himself only with plausible alternatives. Thus, if a critic argues that an effect is due to demand characteristics, he is implicitly stating that the experiment is perceived by the subjects in a particular fashion. Such an assertion is readily made explicit and should in the final analysis be resolved empirically, much as any other dispute about possible causal mechanisms.

The Peculiar Nature of the Psychological Experiment and How It Affects Replication of Prior Research

This discussion has consistently emphasized the importance of what is communicated by the total experimental situation to the subject and the significance of his perceptions of this total context for an understanding of his behavior. The experimental method, however, was developed in the physical sciences and assumes the object of study is passively responsive and that it is possible to specify all relevant variables that impinge on the object of study in a given experiment. In utilizing the experimental procedure in psychology, the major difficulty is the subject's status as an active participant rather than as a passive responder to stimuli.

Unfortunately, we do not have a viable alternative conceptual model for the experiment. It is all too tempting to criticize its use in psychology because the assumptions of the physical sciences are not satisfied. However, it is an extremely productive exercise to force ourselves to specify all relevant variables that may affect the experimental outcome -- even though the subject is not a passive responder -- and it is likely that the assumptions of the experimental method are not significantly violated provided the subject perceives the total experimental situation in the manner in which the investigator had intended him to.

It is worth noting that in each and every instance where the demand characteristics of the experiment appear to have been the significant determinant of the subjects' behavior (rather than the independent variable which had previously been assumed to determine the responses), there was a major discrepancy between the manner in which investigators had thought subjects perceived the experimental situation and the manner in which they actually did perceive it.

Despite the shortcomings of the experimental method as used in psychology, every effort should be made to design experiments so that the relevant variables can be objectively stated, and the primary role of quasi-controls is in helping to design better experiments. Specifically, they make it possible to evaluate whether the subject's perception of the experimental situation is isomorphic with how the investigator had intended it to be and therefore address the validity of the experimental procedure.

One special characteristic of human experimentation deserves particular attention. Not infrequently, very subtle and apparently trivial changes in subject selection, experimental setting, instructions, or procedure may cause major differences in the manner in which the experiment is perceived by the subject. Sometimes such changes can be anticipated in advance, but at other times the effects are puzzling and unpredictable. The tendency for communication implicit in the total experimental setting to be radically altered by some apparently trivial modification of procedure has, however, crucial implications for an understanding of replication.

In contrast to the physical sciences, detailed replications are rarely carried out in psychology and many widely quoted, frequently cited studies have never been replicated. When a noncontroversial finding is replicated, it is not seen as a particularly significant contribution to the field. On the other hand, a failure to replicate is also not readily published. The contrasting attitude in psychology, as opposed to the physical sciences where replication is seen as an essential and valuable enterprise, cannot be ascribed to the amount of effort involved in carrying out the research, since this has not prevented replication of major findings in other sciences. Rather, it seems related to the difficulty of coming to terms with the inherent problems of the experimental paradigm itself as it is used in psychology.

The most serious problem seems related to the implicit assumption that it is possible in psychological experiments to specify adequately all relevant variables in a scientific communication. However, due to the limitations of space, the procedure section is invariably cut to a bare minimum, and while the casual reader may find even such a shortened version pedantic and detailed, anyone seriously attempting to replicate finds innumerable questions where details of procedure are simply not spelled out.

A failure to replicate a psychological experiment is, in fact, usually a trivial observation which makes little or no contribution to the field, because this failure is likely to be due to subtle, unspecified changes in procedure. The fault is as likely to be with the attempted replication in not reproducing the circumstances under which the original findings were obtained as with the original study. The only way in which an investigator may have some reasonable certainty that he has created an experimental situation which truly replicates that of the original study is when, after obtaining answers from the author of the original study concerning the myriad of unpublished details, he repeats the study and actually obtains findings similar to those of the original paper. If he fails to do so, it becomes necessary to systematically study the experimental procedure, subject selection, amount of information available to the subject, and so forth, modifying these parameters in subtle ways consistent with the published description as elaborated by the author's commentary, until a situation is created where the original findings are replicated (see footnote 10). Only after this has been accomplished is it possible to clarify the mechanisms by which these findings were obtained. Only then is it appropriate to show that varying the independent variable involved in the original study fails to affect the experimental results, whereas varying some other variable to which the original author had ascribed little or no importance is sufficient to alter the findings drastically.

For example, consider the antisocial behavior and hypnosis study previously discussed. It would have been meaningless to carry out this study without demonstrating that hypnotized subjects could be compelled to carry out antisocial and self-destructive behavior while other subjects, when asked whether they would carry out these behaviors, indicated that they would not. Only after the essential observations of Rowland (1939) and Young (1952) had been replicated in fine detail, with the authors' clarification of the procedural nitty-gritty, did the experimental demonstration that simulating subjects would carry out the same behaviors become relevant and, subsequently, the demonstration that other subjects could easily be persuaded to do likewise. One can conceive of an experiment done in a way that would have made the safety precautions so transparent that subjects asked whether they would carry out the behaviors would readily indicate their willingness to do so. If this had been the case, other aspects of the demonstration would have been trivial and meaningless because it would have been very unlikely that a truly analogous situation had been created.

10 The asymmetry of positive versus negative findings in replications has recently been emphasized by Aronson and Carlsmith (1969).

In our present state of development in psychology, then, the only meaningful replication is one where the experimenter is able to reproduce not only the conditions, but also the responses of the subjects. This is necessary as a means of ascertaining that the original situation had, in fact, been replicated; that is, it is not likely that the communication of the total original experiment has been brought under control without being able to reproduce the basic data of the original subjects' responses. Only after this can be accomplished is it possible to demonstrate that the determinants of the subjects' responses might not be what the previous investigator believed them to be, but rather something else. Such an experiment is, of course, more difficult to carry out than a casual replication. It forces the investigator to try to understand how a given finding might have been obtained and to come to terms with the likelihood that a failure to replicate is probably due to his inability to reproduce the relevant conditions (in terms of the total experimental communication). Such a level of understanding is essential, however, in order to be able to explain previous findings definitively and add significantly to our knowledge.

In our work, we have attempted this type of procedure and in several instances have been fortunate enough -- after a considerable period of experimentation -- to replicate the original findings of others and then to dissect the mechanisms which were involved (Gustafson & Orne, 1965; O'Connell, Shor, & Orne, 1970; Orne, 1959; Orne & Evans, 1965; Orne, Sheehan, & Evans, 1968). This has led us to a progressively greater emphasis on understanding the experimental communication from the subject's point of view.

There are, of course, disadvantages in requiring that findings be replicated. Since the likelihood of replicating erroneous findings reported in the literature would be small indeed -- despite the most assiduous efforts to vary the conditions in a plausible manner -- the requirement that this be done would tend to allow an incorrect finding to stand.

Perhaps even more important is the possibility that even if a finding is replicated by varying some of the subtle demand characteristics which could presumably have played a role in the original observations, one can never be certain that the mechanisms responsible for the data obtained in the replication are the same as those which were operant in the original study. Thus, even if an investigator is able to show that by approximating the situation and varying the manner in which the subject might have perceived the experiment he is able to manipulate the presence or absence of previous observations, he can never conclusively prove that these were in fact the mechanisms which accounted for the previously reported observations. He has, of course, documented a plausible and viable alternative hypothesis, and, hopefully, future work will permit the scientific community to decide between his conclusions and those of the original report.

Despite these limitations, this approach toward replication seems the best available compromise with the realities of psychological experimentation. Though it makes it more difficult to refute erroneous observations and it does not eliminate the possibility that similar data can be produced by multiple mechanisms, the proposed ground rules still seem worthwhile. Hopefully they provide a framework within which it becomes meaningful to build upon the work of others. Further, they spell out how critiques based on the concept of demand characteristics can be applied responsibly (see, for example, Page & Scheidt, 1971). In the final analysis, a realistic view toward the problems and virtues of replication will help psychology in its efforts to develop a reliable hard core of information that is generally accepted by the field.

Summary

In this discussion, an effort has been made to focus on what is communicated by the total experimental situation from the subject's point of view and to show the crucial importance of being certain that the subject's perception is that which the experimenter intends it to be. The significance of subtle factors affecting subjects' perceptions has been emphasized as well as how the experimental model derived from physics must be modified in order to make it appropriate to the study of psychological problems. Despite the discrepancies between how the experiment is ideally conceptualized in the physical sciences and the realities of experiments with human subjects, it remains the single most useful tool for the systematic study of psychological mechanisms. It is, however, essential that the Achilles' heel of the psychological experiment receive careful attention; that is, the extent to which any given experimental procedure serves as a valid operational translation of the psychological process the investigator hopes to study. Appropriate techniques, designed to clarify the validity of the experimental procedure, must be developed and used lest inappropriate and misleading conclusions be drawn from the data.

Since the major difficulty is a function of the subject's active mental processes which make him a participant in the unique interaction we call the psychological experiment rather than merely an object of study, we have tried to suggest special ways in which the possible influences of these active mental processes may be evaluated. Therefore, an effort must be made to specify not only what is done to the subject but, equally important, what is communicated to him by the total experimental situation. As we become more adept at recognizing the contributions which these variables make to our findings, the psychological experiment will become an even more powerful tool. Certainly, such an understanding is essential if we would hope to claim ecological validity for our findings and extend them beyond the confines of the laboratory.

Ultimately, it will be necessary to devote the kind of careful attention to questions concerning the validity of operational definitions that is currently paid to questions concerning the statistical validity of findings. As we gain a better understanding of these issues, the experiment in psychology will not only have the appearance of science but will also serve psychology with the kind of effectiveness which has made the experimental method the hallmark of the physical sciences.

Acknowledgments

I would like to express appreciation to my colleagues at the Unit for Experimental Psychiatry, Mary R. Cook, A. Gordon Hammer, David A. Paskewitz, and Harvey D. Cohen, for their helpful comments in the preparation of this paper. I am particularly grateful to Frederick J. Evans, Charles Graham, and Emily Carota Orne for their detailed criticisms and many incisive suggestions.

The substantive work upon which the theoretical outlook presented in this paper is based was supported in part by Grant #MH 19156 from the National Institute of Mental Health and by a grant from the Institute for Experimental Psychiatry.

References

Adair, J. G., & Fenton, D. P. Subject's attitudes toward psychology as a determinant of experimental results. Canadian Journal of Behavioral Science, 1971, 3, 268-275.

Aronson, E., & Carlsmith, J. M. Experimentation in social psychology. In G. Lindzey & E. Aronson (Eds.), The handbook of social psychology. (2nd ed.) Vol. 2. Research methods. Reading, Massachusetts: Addison-Wesley, 1969. Pp. 1-79.

Brehm, J. W., & Cohen, A. R. Explorations in cognitive dissonance. New York: Wiley, 1962.

Evans, F. J. Simulating subjects: Who is fooling whom? Paper presented at the meeting of the American Psychological Association, Washington, D. C., September 1971.

Evans, F. J., & Orne, M. T. Motivation, performance, and hypnosis. International Journal of Clinical and Experimental Hypnosis, 1965, 13, 103-116.

Festinger, L. Theory of cognitive dissonance. Evanston, Illinois: Row, Peterson, 1957.

Golding, S. L., & Lichtenstein, E. Confession of awareness and prior knowledge of deception as a function of interview set and approval motivation. Journal of Personality and Social Psychology, 1970, 14, 213-223.

Gustafson, L. A., & Orne, M. T. Effects of perceived role and role success on the detection of deception. Journal of Applied Psychology, 1965, 49, 412-417.

Hilgard, E. R. Hypnotic susceptibility. New York: Harcourt, 1965.

Holland, C. H. Sources of variance in the experimental investigation of behavioral obedience. Unpublished doctoral dissertation, Univ. of Connecticut, 1967.

Hull, C. L. Hypnosis and suggestibility. New York & London: Appleton, 1933.

Lana, R. E. Pretest sensitization. In R. Rosenthal & R. L. Rosnow (Eds.), Artifact in behavioral research. New York: Academic Press, 1969. Pp. 119-141.

Levy, L. D. Awareness, learning, and the beneficent subject as an expert witness. Journal of Personality and Social Psychology, 1967, 6, 365-370.

Masling, J. Role-related behavior of the subject and psychologist and its effects upon psychological data. In D. Levine (Ed.), Nebraska symposium on motivation. Lincoln: Univ. of Nebraska Press, 1966.

O'Connell, D. N., Shor, R. E., & Orne, M. T. Hypnotic age regression: An empirical and methodological analysis. Journal of Abnormal Psychology, 1970, 76 (Monogr. Suppl. No. 3), 1-32.

Orne, M. T. The nature of hypnosis: Artifact and essence. Journal of Abnormal and Social Psychology, 1959, 58, 277-299.

Orne, M. T. On the social psychology of the psychological experiment: With particular reference to demand characteristics and their implications. American Psychologist, 1962, 17, 776-783.

Orne, M. T. Demand characteristics and the concept of quasi-controls. In R. Rosenthal & R. L. Rosnow (Eds.), Artifact in behavioral research. New York: Academic Press, 1969. Pp. 143-179.

Orne, M. T. Hypnosis, motivation, and the ecological validity of the psychological experiment. In W. J. Arnold & M. M. Page (Eds.), Nebraska symposium on motivation. Lincoln: Univ. of Nebraska Press, 1970. Pp. 187-265.

Orne, M. T. The simulation of hypnosis: Why, how, and what it means. International Journal of Clinical and Experimental Hypnosis, 1971, 19, 183-210.

Orne, M. T. On the simulating subject as a quasi-control group in hypnosis research: What, why and how. In Erika Fromm & R. E. Shor (Eds.), Hypnosis: Research developments and perspectives. Chicago: Aldine-Atherton, 1972. Pp. 399-443.

Orne, M. T., & Evans, F. J. Social control in the psychological experiment: Antisocial behavior and hypnosis. Journal of Personality and Social Psychology, 1965, 1, 189-200.

Orne, M. T., & Holland, C. H. On the ecological validity of laboratory deception. International Journal of Psychiatry, 1968, 6, 282-293.

Orne, M. T., & Watson, P. D. The motivation of subjects for traumatic experiments. Paper presented at the meeting of the American Psychological Association, New York, September 1957.

Orne, M. T., Sheehan, P. W., & Evans, F. J. Occurrence of posthypnotic behavior outside the experimental setting. Journal of Personality and Social Psychology, 1968, 9, 189-196.

Page, M. M., & Scheidt, R. J. The elusive weapons effect: Demand awareness, evaluation apprehension, and slightly sophisticated subjects. Journal of Personality and Social Psychology, 1971, 20, 304-318.

Paskewitz, D. A., & Orne, M. T. Cognitive effects during alpha feedback training. Paper presented at the meeting of the Eastern Psychological Association, New York, April 1971.

Riecken, H. W. A program for research on experiments in social psychology. In N. F. Washburne (Ed.), Decisions, values and groups. Vol. 2. New York: Pergamon Press, 1962. Pp. 28-42.

Rosenberg, M. J. The conditions and consequences of evaluation apprehension. In R. Rosenthal & R. L. Rosnow (Eds.), Artifact in behavioral research. New York: Academic Press, 1969. Pp. 279-349.

Rosnow, R. L., & Aiken, L. S. Mediation of artifacts in behavioral research. Journal of Experimental Social Psychology, in press.

Rowland, L. W. Will hypnotized persons try to harm themselves or others? Journal of Abnormal and Social Psychology, 1939, 34, 114-117.

Sheehan, P. W. Countering preconceptions about hypnosis: An objective index of the involvement of the hypnotist. Journal of Abnormal Psychology, 1971, 78, 299-322.

Shor, R. E., & Orne, Emily C. Norms on the Harvard Group Scale of Hypnotic Susceptibility, Form A. International Journal of Clinical and Experimental Hypnosis, 1963, 11, 39-47.

Silverman, I. Role-related behavior of subjects in laboratory studies of attitude change. Journal of Personality and Social Psychology, 1968, 8, 343-348.

Solomon, R. L. Extension of control group design. Psychological Bulletin, 1949, 46, 137-150.

Turner, L. H., & Solomon, R. L. Human traumatic avoidance learning: Theory and experiments on the operant-respondent distinction and failures to learn. Psychological Monographs, 1962, 76 (Whole No. 559), 1-32.

Young, P. C. Antisocial uses of hypnosis. In L. M. LeCron (Ed.), Experimental hypnosis. New York: Macmillan, 1952. Pp. 376-409.

Zamansky, H. S., Scharf, B., & Brightbill, R. The effect of expectancy for hypnosis on prehypnotic performance. Journal of Personality, 1964, 32, 236-248.


The preceding paper is a reproduction of the following book chapter (Orne, M.T. Communication by the total experimental situation: Why it is important, how it is evaluated, and its significance for the ecological validity of findings. In P. Pliner, L. Krames, & T. Alloway (Eds.), Communication and affect. New York: Academic Press, 1973. Pp. 157-191). It is reproduced here with the kind permission of Academic Press, now an imprint of Elsevier Science, and the book's co-editors Patricia Pliner, Lester Krames, and Thomas M. Alloway.