How to stay true to our science: three principles to guide our behavior.
Abstract: The scientific method is the method of choice for studying phenomena, and it yields the highest level of objective believability. Without a basic understanding and appreciation of the tenets of science and scientific inquiry, one is at risk of believing in cause and effect relationships that really don't exist and adopting treatments for various concerns that may not in truth be effective. Three principles of science that aid in rational and objective thinking are skepticism, experimentation, and internal validity. Behaving skeptically and requiring experimentation that controls for threats to internal validity increase the chances that one will reject claims that have no proof and adopt claims that have a level of credible evidence.

Keywords: fad treatments, science, experimentation, evidence-based treatment
Article Type: Report
Subject: Evidence-based medicine (Health aspects)
Skepticism (Research)
Human acts (Research)
Human behavior (Research)
Author: Zane, Thomas
Pub Date: 06/22/2010
Publication: Name: The Behavior Analyst Today Publisher: Behavior Analyst Online Audience: Academic Format: Magazine/Journal Subject: Psychology and mental health Copyright: COPYRIGHT 2010 Behavior Analyst Online ISSN: 1539-4352
Issue: Date: Summer, 2010 Source Volume: 11 Source Issue: 3
Topic: Event Code: 310 Science & research Canadian Subject Form: Human behaviour
Product: Product Code: 9911220 Behavior Theory
Geographic: Geographic Scope: United States Geographic Code: 1USA United States
Accession Number: 248578630
Full Text: The ivory-billed woodpecker (Campephilus principalis) was last known to exist in 1944. Unexpectedly, in 2004, it was purportedly seen near Brinkley, Arkansas. This claim resulted in a scientific expedition that produced an inconclusive video that was used to confirm the bird's reemergence from extinction, an article in Science magazine extolling the excitement that the bird was indeed back, and a worldwide fascination with a species supposedly extinct but now here again. Yet, despite over 5 years of searching at a cost of over $10 million, there remains no physical proof that the woodpecker is in fact alive (Radford, 2009).

At a 2004 Florida conference about treatment for Autism Spectrum Disorders (ASD), a medical doctor spoke to a group of parents about electromagnetic fields and their impact on autism. The doctor asked one parent if she used cell phones, to which the parent replied in the affirmative. With a grand wave of the hand, the doctor pronounced, "throw them out!" advocating for the unproven belief that the electrical energy emanating from cellular phones was somehow either responsible for or negatively impacting the symptoms of this neurological disorder.

When confronted with claims that are presented as true, how can we make a reasonable evaluation to ascertain, as confidently as possible, whether the claim has merit? This fundamental question impacts virtually all areas of our society. Claims abound--of alien abductions, the existence of the Loch Ness monster and Bigfoot, and the eating of wild boar meat to cure autism. How can we "separate the wheat from the chaff" in a way that both prevents the acceptance of wildly suspicious claims that have no support and permits the adoption, with some level of certainty and comfort, of claims that are in fact likely to be true?

The best way known to evaluate claims is to adopt the intellectual discipline of science and the scientific method of investigation. This methodology involves (1) adopting "philosophic doubt" or skepticism (e.g., Cooper, Heron, & Heward, 2007) and (2) conducting controlled experiments that (3) minimize threats to internal validity. Practicing skepticism is crucial to protecting oneself from believing unsubstantiated claims. The American public views science's effect on society as positive (in a recent survey, 84% of respondents said that the effect of science was mostly positive, and scientists were ranked third among the professions contributing most to society, after the military and teachers; American Association for the Advancement of Science, 2009). Nevertheless, the continued adoption of unproven beliefs, claims, and bizarre treatments (particularly in the field of autism) remains strong, suggesting that although science is lauded, skepticism--and scientific thinking in general--is not widely practiced. Experimentation is the most rigorous of the levels of science (Cooper, et al., 2007) because it uses systematic manipulation of variables to test for the existence of causal relationships.

Skepticism is not a view that promotes the disbelief of every truth or claim (Normand, 2008). Skepticism is more refined. Merriam-Webster Online (2010) defines it as "an attitude of doubt or a disposition to incredulity either in general or towards a particular object." The word is from the Greek "skeptikos," meaning "inquirer" or "investigator" (DiCarlo, 2009). Pigliucci (2009) defines skepticism, closer to the original Greek meaning, as the suspension of judgment (either to adopt or reject) until sufficient evidence is examined.

Kurtz (2010) stresses this perspective with his discussion of "skeptical inquiry," an approach that encourages the examiner to "...seek, when feasible, adequate evidence and reasonable grounds for any claim to truth in any context" (p. 21, as quoted in Normand, 2008). Claims of all kinds should be, before adoption or rejection, examined for the amount and quality of evidence that supports them. Thus, if there is a particular treatment for which there is valid scientific evidence for support, that treatment should be adopted and viewed as evidence-based. However, when a claim is shown to have no evidence, or evidence that is weak and of poor quality (such as relying solely on the opinion of the claim maker), the claim or position should be rejected. Simply put, skepticism is the position of objectively evaluating, by looking for empirical evidence, the validity of any claim of fact, and basing adoption or rejection on the evidence (or lack thereof; Normand, 2008).

This skeptical attitude, and the corresponding investigatory approach, reduces the possibility of adopting, as true, a claim (or treatment) that may not be true. As is often said, extraordinary claims could be true, but a skeptical approach towards them would require extraordinary evidence and evaluation of that evidence. To reiterate, a skeptical thinker does not reject all claims; nor does s/he accept all claims as true. Rather, the position of a skeptical thinker is one of assessing the validity of the evidence before rendering a decision. The type of evidence is important, and there is an acknowledgement that there exists quite a bit of variation and debate regarding what evidence constitutes "valid" evidence (Zane & Hanson, 2008). But there is general agreement that the methods and criteria used by science are the most acceptable standard to adopt.

Normand (2008) smartly acknowledged that the literature provides little specification on exactly how to behave skeptically. To increase the number of people who are "scientific skeptics" (a term coined by Normand; those who think and act skeptically), several suggestions are offered.

First, study and adopt the methods of science, scientific investigation, and skepticism, as described by numerous textbooks that exist on these subjects (e.g., Cooper, et al., 2007; Sagan, 1996). The scientific perspective and method of inquiry will inoculate against the reflexive acceptance of claims that are baseless.

Second, require that anyone making extraordinary claims provide extraordinary evidence to substantiate those claims. For example, when the practitioners of craniosacral therapy assert that they do not even need to touch the client's body in order to change the course of the cerebrospinal fluid (Zane, 2005), they should be required to present evidence that this is in fact true. When Gutstein, the developer of Relationship Development Intervention, asserts that "The RDI Program is for every age group and for every range of severity, including those who are severely affected by autism" (Connections Center, 2005), he should be required to present the evidence that backs up this extraordinary claim.

Third, don't be gullible--do not accept claims without evaluation. Accepting all claims is not only intellectually dishonest, but potentially dangerous and even fatal (Pigliucci, 2009). For example, promoting holistic remedies for curing AIDS will likely result in the unnecessary deaths of persons with the disease. Gullibly accepting the false claim that vaccines cause autism may lead to parents not vaccinating their children, and such an action puts children at risk for serious diseases. Furthermore, accepting claims without critical evaluation will result in significant costs in money, time, and emotion (Zane, Davis, & Rosswurm, 2009). Gullibility is the opposite of skepticism, so demanding evidence of truth will naturally protect one from gullibly accepting every claim.

Fourth, behave according to this rule--"In science, keeping an open mind is a virtue--just not so open that your brains fall out." (James Oberg; Sagan, 1996). In other words, be intellectually willing to accept any claim, but always seek evidence and proof of truth before acceptance is granted.

Finally, find contexts that promote skepticism. For example, attending meetings of other skeptics and listening to podcasts such as The Skeptics Guide to the Universe will prompt and reinforce skeptical behavior (Loxton, 2009). Consider following some of the suggestions in What Do I Do Next, a call for action on the part of all skeptics (Loxton, 2009).

In addition to behaving skeptically, another prerequisite to accessing clinical treatment is to understand what treatments might be effective and have a chance of delivering positive results. And a prerequisite to determining what treatments have actually been proven to be "evidence-based" is understanding what makes a research design a valid design.

Consider a study published by Rossignol and Rossignol (2006), in which they assessed the effect of a hyperbaric oxygen chamber on a range of symptoms of six children diagnosed with autism. Prior to starting the hyperbaric oxygen therapy (HBOT), the researchers assessed the participants on three measures: the Autism Treatment Evaluation Checklist, the Childhood Autism Rating Scale, and the Social Responsiveness Scale. The children then participated in HBOT for 40 one-hour sessions, and the researchers then re-assessed the participants using the same measures as in the pretest. For most of the children, the posttest scores were lower on each assessment (for these instruments, a lower score suggests fewer symptoms of autism and improved functioning). Thus, the authors suggested that the HBOT was responsible for the improvement.

Consider also a study by Gutstein, Burgess, and Montfort (2007), in which they assessed the effectiveness of an autism treatment developed by Gutstein called Relationship Development Intervention (RDI). Here, the authors selected 16 children with autism and reviewed their files, noting their test scores on various measures prior to receiving RDI. The authors noted the scores on a subset of the Autism Diagnostic Observation Schedule and Autism Diagnostic Interview-Revised, and had parents provide information about each child's educational placement (on a continuum of intrusiveness) and level of "flexibility" (i.e., the child's comfort level reacting to change in his/her life and routine). After these measures were obtained, the participants received RDI for an average of 18 months. Following treatment, Gutstein, et al. conducted posttest assessments using the same measures as the pretest. The authors concluded that, for most children, the posttest scores improved over the pretest scores, and suggested that RDI was responsible for the improvement.

Researchers and clinicians often attempt to demonstrate the effectiveness of an autism treatment by using this common "pre-post" test design (also called the "before-after," "AB," and "one-group pretest-posttest" design; e.g., Drew, Hardman, & Hosp, 2008; Fraenkel & Wallen, 2009). The general strategy in a pre-post test study is to obtain some measurement of the critical dependent variable(s) hypothesized to be changed by the treatment, implement the treatment protocol, and then re-administer the same measurement as in the pretest. There is an assumption that if the posttest scores have changed positively from the pretest scores, then the change is due to the treatment. Many researchers and treatment developers use this basic design (e.g., Linderman & Steward, 1999; Rossignol, Rossignol, James, Melnyk, & Mumper, 2007).

The important question is, does this design permit convincing proof that the treatment caused the improvement in the variable(s) being measured? The answer is unambiguous--this basic design never permits confirmation of cause and effect between the treatment and positive changes in the dependent measures (e.g., autism symptomology; Drew, et al., 2008; Fraenkel & Wallen, 2009).

The weakness of this design in demonstrating causal relationships relates to its inability to minimize "internal validity" threats. The internal validity of a research study refers to the level of confidence in believing that changes in the variables being measured are due to the treatment protocol being used. If the research study is designed to eliminate any explanation other than the treatment changing what is being measured, then that study has strong internal validity. On the other hand, if the research study is designed in a way that allows explanations other than the treatment variable to possibly be influencing what is being measured, then that study will have weak internal validity, and the conclusion must be that the treatment may not be the only reason for the change in the dependent measurements. And if variables other than the treatment could have produced the changes in what is measured, one cannot confidently conclude that the treatment caused the changes.

There are several threats to the ironclad belief that a treatment caused the positive changes in participants' behaviors or abilities. Any research study must strive to eliminate these threats from serious consideration as influences on the study and its results. Some of these threats are reviewed here:

Subject characteristics--this threat refers to the possibility that the participants selected to be in the study might have certain characteristics that make them more sensitive to the treatment or more likely to perform better. This threat is highly likely when a study utilizes only one group of participants, or if the researcher did not select the participants in a random fashion.

Loss of subjects--if a researcher reports that many participants failed to finish the treatment, this could bias the eventual results, due to the possibility that these participants, if they had finished the treatment, might not have been impacted by the treatment similarly to other participants.

Location--if participants in a research study are located in different places (such as different classrooms with different materials and resources), then these factors could explain any positive results, as opposed to believing that only the treatment could have influenced the participants.

Instrumentation--this threat refers to several different potential problems. First, if the test has poor reliability and validity, then this threat is a possibility. Second, if data collectors are not trained well, or if there are not occasional "reliability checks" on the primary data collectors, then bias may be introduced into the data collection, and this factor alone could explain any positive results, as opposed to the belief that the treatment was responsible. One example of potential data collector bias is when the person who developed the treatment conducts the study. In these cases, one must carefully study the methodology of the data collection to ascertain that all safeguards are in place.
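The loss-of-subjects threat described above can be illustrated with a small simulation. This is only a sketch under stated assumptions, not a model of any study cited here: the treatment is assumed to have no effect at all, each child's change score is random noise, and the dropout rule (children whose scores worsen by more than 2 points leave before the posttest) is hypothetical. Higher change scores mean improvement.

```python
import random
import statistics


def simulate_attrition(n_enrolled=2000, dropout_threshold=-2.0, seed=2):
    """Compare the mean change of all enrolled children with that of completers.

    Assumptions (illustrative only): the treatment has zero true effect, so
    each child's pretest-to-posttest change is pure noise centered on 0.
    Children whose change falls below `dropout_threshold` drop out and are
    never measured at posttest.
    """
    rng = random.Random(seed)
    all_changes = []
    completer_changes = []
    for _ in range(n_enrolled):
        change = rng.gauss(0.0, 5.0)  # zero average effect by construction
        all_changes.append(change)
        if change >= dropout_threshold:  # hypothetical dropout rule
            completer_changes.append(change)
    return (
        statistics.mean(all_changes),
        statistics.mean(completer_changes),
        len(completer_changes),
    )


all_mean, completer_mean, n_completers = simulate_attrition()
print(f"Mean change, everyone enrolled: {all_mean:.2f}")
print(f"Mean change, completers only ({n_completers} children): {completer_mean:.2f}")
```

Under these assumptions the completers show an apparent average improvement even though the treatment does nothing, because the children who would have pulled the average down are missing from the posttest data.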

The pretest-posttest design is fatally flawed with respect to internal validity. For example, if participants improve from pretest to posttest, the improvement could be due simply to the participants maturing (physically or psychologically) over the course of the experiment. Consider a research project done over the course of a year with preschoolers with autism. An improvement in assessment from pretest to posttest (after one year) could be due simply to the natural maturation of the participants, rather than the influence of the treatment. Another possible threat to believing that a treatment caused any positive changes relates to participants who were chosen on the basis of extremely low scores (or extremely low performance) on the variable(s) being measured in the pretest. Extremely low scores will often improve, and extremely high scores will often decline, given repeated assessments, just because they are so extreme (a phenomenon known as regression toward the mean). Thus, any study that selects participants because they scored very low or very high on the dependent measures, and that uses a pretest-posttest design, is open to this particular threat, and one cannot be confident that the treatment caused any improvement.
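The tendency of extreme scores to drift back toward the average on re-testing (often called regression toward the mean) can be shown with a short simulation. The sketch below rests on assumptions chosen purely for illustration: each child has a stable "true" level, every assessment adds independent measurement noise, higher scores mean better performance, and no treatment of any kind is delivered between the two assessments. The function name and the cutoff value are hypothetical.

```python
import random
import statistics


def simulate_regression_to_mean(n_children=10000, cutoff=-1.0, seed=0):
    """Simulate a pretest and a posttest with NO treatment in between.

    Only children whose pretest score falls below `cutoff` (an extremely
    low score) are "enrolled", mirroring selection on extreme pretest scores.
    """
    rng = random.Random(seed)
    pre_scores = []
    post_scores = []
    for _ in range(n_children):
        true_level = rng.gauss(0.0, 1.0)             # stable ability
        pretest = true_level + rng.gauss(0.0, 1.0)   # noisy measurement 1
        posttest = true_level + rng.gauss(0.0, 1.0)  # noisy measurement 2
        if pretest < cutoff:                         # select extreme low scorers
            pre_scores.append(pretest)
            post_scores.append(posttest)
    return statistics.mean(pre_scores), statistics.mean(post_scores), len(pre_scores)


pre_mean, post_mean, n_selected = simulate_regression_to_mean()
print(f"Selected {n_selected} low scorers")
print(f"Mean pretest score:  {pre_mean:.2f}")
print(f"Mean posttest score: {post_mean:.2f}")
```

In this sketch the selected group's average posttest score is noticeably higher than its average pretest score, although nothing was done to the children. A one-group pretest-posttest study would have no way to tell this artifact apart from a real treatment effect.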

The one-group pretest-posttest design is flawed by several additional internal validity threats not discussed here. The reality is that any attempt to demonstrate the effectiveness of a treatment by using a pretest on one group of participants, then applying a treatment, followed by reassessing the variables being tracked, will always be open to skepticism about linking improvement to treatment. This type of design will never allow strong confidence in the belief of a cause and effect connection between treatment and improvement.
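The maturation threat can be sketched in the same way. The simulation below assumes, purely for illustration, that every child gains about 5 points per year from natural development, that the treatment adds nothing, and that higher scores mean better functioning. It also includes an untreated comparison group, which is one common remedy described in research-design textbooks; the article itself does not prescribe a particular alternative design, and all numbers and names here are hypothetical.

```python
import random
import statistics


def simulate_maturation(n_per_group=500, yearly_growth=5.0,
                        treatment_effect=0.0, seed=1):
    """Return mean pretest-to-posttest gains for a treated and an untreated group.

    Assumptions (illustrative only): all children mature by `yearly_growth`
    points over the study; the treated group additionally receives
    `treatment_effect` points, which defaults to zero.
    """
    rng = random.Random(seed)

    def mean_gain(effect):
        gains = []
        for _ in range(n_per_group):
            baseline = rng.gauss(50.0, 10.0)
            pretest = baseline + rng.gauss(0.0, 3.0)
            posttest = baseline + yearly_growth + effect + rng.gauss(0.0, 3.0)
            gains.append(posttest - pretest)
        return statistics.mean(gains)

    treated_gain = mean_gain(treatment_effect)
    untreated_gain = mean_gain(0.0)
    return treated_gain, untreated_gain


treated_gain, untreated_gain = simulate_maturation()
print(f"Mean gain, treated group:   {treated_gain:.2f}")
print(f"Mean gain, untreated group: {untreated_gain:.2f}")
print(f"Difference between groups:  {treated_gain - untreated_gain:.2f}")
```

Looking at the treated group alone, the gain appears to support the treatment. Only the comparison with children who matured without treatment shows that, under these assumptions, the treatment contributed essentially nothing.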

Antiscience, pseudoscience, and bizarre claims continue to gain influence with the public, and this state of affairs is partly due to the lack of understanding of the nature of science (Lamal, 2009). Skepticism is a key concept in understanding how to assess the level of believability of something. Pigliucci (2009) goes so far as to argue that there is an ethical requirement to be skeptical and question the veracity of claims. He asserts that everyone must seek the truth and this requires a "baloney detection toolkit" (Sagan, 1996). This set of analytic and decision-making procedures and rules allows us to ascertain, as best we are able, what might be true and what does not have evidence of believability. The adoption of healthy skepticism will result in a more informed public and more informed decision making about claims and treatments, and will promote truth and validity, protecting us from extraordinary claims that have little reason to be believed.

Not all research is equal in quality. Just because a research study has been conducted and shows positive changes in some aspects of autism does not necessarily mean that the treatment was responsible for those changes. Since autism is said by some to be a "fad magnet" (e.g., Jacobson, Foxx, & Mulick, 2005), parents and other consumers must critique any research study that purports to show a positive effect of a treatment, and try to determine if internal validity threats are controlled by the experimenter. If threats to internal validity are not controlled, then the positive changes attributed to the treatment could be due to either the treatment or uncontrolled confounding variables. If both are a possibility, then one cannot assume that the treatment is the reason for the change. That is why it is so important that internal validity be assured in any study.

By activating their "baloney detectors" (Sagan, 1996), parents, caregivers, and service providers can avoid adopting treatments that have no proof of effectiveness, and thus be more likely to embrace treatments for which there is a body of well-designed research supporting a cause and effect relationship. Skepticism will inoculate people who are considering what causes certain behaviors or phenomena, protecting them from making assumptions that do not stand up to scientific scrutiny. Research on autism treatments that purportedly shows evidence of effectiveness, but that utilizes only pretest-posttest studies, needs to be viewed with caution and must not be thought of as producing valid conclusions that allow consumers and caregivers to believe that the treatment in fact works. Better understanding of the flaws in this basic and commonly used research design could enhance access to clinical treatment services.


American Association for the Advancement of Science (2009). Retrieved August 20, 2010 at http:/

Connections Center (2005, August). Myths & facts about the RDI® program, part 5, fact: The RDI program is for those severely affected by autism, too! Going to the heart of autism. Retrieved from archive/ newsletters/0816005/default.htm#article.

Cooper, J. O., Heron, T. E., & Heward, W. L. (2007). Applied Behavior Analysis--2nd ed. Englewood Cliffs, NJ: Prentice-Hall.

DiCarlo, C. (2009). The roots of skepticism: Why ancient ideas still apply today. Skeptical Inquirer, 33(3), 51-55.

Drew, C. J., Hardman, M. L., & Hosp, J. L. (2008). Designing and conducting research in education. Thousand Oaks, California: Sage Publications, Inc.

Fraenkel, J. R., & Wallen, N. E. (2009). How to design and evaluate research in education. Seventh edition. New York: McGraw-Hill.

Gutstein, S. E., Burgess, A. F., & Montfort, K. (2007). Evaluation of the Relationship Development Intervention program. Autism, 11, 397-412.

Jacobson, J. W., Foxx, R. M., & Mulick, J. A. (2005). Controversial therapies for developmental disabilities: Fad, fashion, and science in professional practice. Mahwah, NJ: Lawrence Erlbaum Associates, Inc.

Kurtz, P. (2010). Exuberant skepticism (J. R. Shook, Ed.). Amherst, New York: Prometheus Books.

Lamal, P. (2009). Paul Kurtz: A titan of skepticism. Skeptical Inquirer, 34(4), 57-58.

Linderman, T. M., & Steward, K. B. (1999). Sensory integrative-based occupational therapy and functional outcomes in young children with pervasive developmental disorders: A single subject study. American Journal of Occupational Therapy, 53, 207-213.

Loxton, D. (Ed.). (2009). What do I do next: Leading skeptics discuss 105 practical ways to promote science and advance skepticism. Retrieved August 20, 2010 at

Merriam-Webster Online (2010). Retrieved August 16, 2010 at

Normand, M. P. (2008). Science, skepticism, and applied behavior analysis. Behavior Analysis in Practice, 1(2), 42-49.

Pigliucci, M. (2009). The moral duty of a skeptic. Skeptical Inquirer, 33(6), 18-19.

Radford, B. (2009). Chasing the ghost bird: Science, skepticism, and the ivory-billed woodpecker. Skeptical Inquirer, 34(3), 32-34.

Rossignol, D. A., & Rossignol, L. W. (2006). Hyperbaric oxygen therapy may improve symptoms in autistic children. Medical Hypotheses, 67, 216-228.

Rossignol, D. A., Rossignol, L. W., James, S. J., Melnyk, S., & Mumper, E. (2007). The effects of hyperbaric oxygen therapy on oxidative stress, inflammation, and symptoms in children with autism: An open-label pilot study. BMC Pediatrics. Retrieved on March 8, 2010 at

Sagan, C. (1996). The demon-haunted world: Science as a candle in the dark. New York, NY: Random House.

Zane, T. (2005). Fads in special education. In Jacobson, J. W., Foxx, R. M., & Mulick, J. A. (Eds.), Controversial therapies for developmental disabilities: Fad, fashion, and science in professional practice. Mahwah, NJ: Lawrence Erlbaum Associates.

Zane, T., Davis, C., & Rosswurm, M. (2009). The cost of fad treatments in autism. Journal of Early and Intensive Behavior Intervention, 5(2), 44-51.

Zane, T. & Hanson, J. (2008). Evidenced Based Practice: A Review of the Criteria that Constitutes Evidence. Presented at the Florida Association for Behavior Analysis conference, Daytona Beach.

Thomas Zane, Ph.D., BCBA

Author information:

Thomas Zane, Ph.D., BCBA-D

Institute for Behavioral Studies

Endicott College

376 Hale Street, Beverly, MA 01915
Gale Copyright: Copyright 2010 Gale, Cengage Learning. All rights reserved.