by Janet Kahn

In this column, researcher and massage therapist Janet Kahn, Ph.D., visits the major issues, organizations and people involved in research into complementary health care, especially massage, and updates readers on policies related to such research. In this issue: a report on integrative-medicine literature and a study on craniosacral therapy.

A wise feminist sociologist named Pauline Bart once said, “Everything is data, but data isn’t everything.” In this column I present an interesting study that looked at some of the basics of craniosacral therapy, including testing the reliability of practitioners’ palpation of the cranial rhythmic impulse.

Before reviewing that study, however, I want to call your attention to two recent publications that form a backdrop for considering the findings of the craniosacral study. These are both important contributions to the ongoing dialogue about integrated medicine, and each names the question of evidence (or data) as central to the possibility, or impossibility, of integrating complementary and alternative medicine (CAM) and conventional medicine.

In the literature

The first publication is from the Institute of Medicine, “Complementary and Alternative Medicine in the United States.” This grand title is in keeping with the grand mission of the IOM, one of four national academies that collectively are viewed as advisors to the nation on science, engineering and medicine. This will be an influential report. Medical schools are reviewing it for guidance regarding their responsibilities to train physicians for a future that includes multiple forms of medicine. The recommendations to Congress, the National Institutes of Health and private foundations about needed research and infrastructure will be attended to carefully.1

I recommend reading the IOM report. It is a partially successful attempt at an even-handed treatment of CAM and conventional medicine. As such it tells a lot about the challenges of integration.

“The goal should be the provision of comprehensive care that respects contributions from all sources,” states the report. That sounds inviting and patient-centered, yet the next sentence says, “Such care requires decisions based on the results of scientific inquiry … the committee recommends that the same principles and standards of evidence of treatment effectiveness apply to all treatments, whether currently labeled as conventional medicine or CAM.” (Remember this last sentence when we discuss the next article.) Regarding research, the report acknowledges that randomized, controlled trials are not applicable to all aspects of CAM and calls for the development of innovative research methods.

I also recommend an article by Ted Kaptchuk and Franklin Miller, titled, “What is the Best and Most Ethical Model for the Relationship Between Mainstream and Alternative Medicine: Opposition, Integration or Pluralism?”2

This article is a breath of fresh air, because it heads straight into the dilemmas of integrated medicine and asks us to make a conscious choice about what we want this relationship to look like.

The authors describe three approaches to a relationship between forms of medicine they see as different in worldview, as well as in technique. The three approaches are oppositional, integrated and pluralistic.

The oppositional form, they argue, is history. To make their case they provide deliciously extreme quotes from early battles between the American Medical Association and the chiropractic field, with a reminder that chiropractic is now licensed in all states.

Opposition is being replaced by mainstream health-care providers’ rush to include CAM in varying approaches to integrative health care, the authors claim. True integration, however, they see as an impossible ideal.

One of the key chasms to bridge, they say, is the very different standards for what constitutes evidence of effect. Mainstream medicine, as the IOM report reminds us, asserts that it is scientific and seeks to base its treatments upon objective, experimentally derived data.

“Like any science, medical science is suspicious of ‘anecdotal’ or simple empirical experience,” the authors state. “What is observed in everyday circumstances cannot be trusted as much as what is observed (and preferably measured and replicated) under controlled conditions.”

While some call for evidence-based CAM, Kaptchuk and Miller say that “proponents of CAM systems usually assert that they operate within a theoretical and rational understanding of the world validated by the reliability of ordinary human experience … ‘Unimpeachable testimonials’ of cures are acceptable evidence; case reports narrated in the singular are acceptable units of authentication … Immediate and personal experiences are positively valued, while objective detachment and analytic methods are not.”

Given the real epistemological differences between the two worlds of medicine, which the authors describe in much more detail, they argue against even striving for a truly integrated medicine. “Despite the attractive rhetoric, the integration model does not amount to a coherent medical framework … [it] promises that patients are offered the best of both medical worlds; but it seems more likely that patients are being denied the ‘integrity’ of either world.”

Rather than weaken the integrity of each form through modifications that real integration would require and still fail at, the authors suggest we can have the best of both worlds through pluralism—an approach that rests upon tolerance and/or cooperation, but not integration. Pluralism, they say, acknowledges that mainstream medicine and CAM are fundamentally different and that both offer clinical value. It encourages cooperation without blending.

Their idea of pluralism may simply suggest we accept what we already have: a world in which CAM and conventional medicine exist in fairly separate silos and integration is largely done by patients. The improvements that a more formal embrace of pluralism could bring are greater communication, a friendlier atmosphere for both patients and practitioners, and more practical cooperation between CAM and conventional-medicine providers.

The Kaptchuk and Miller article could deepen the dialogue on integrative medicine if it prompts us all to get more specific about what we really mean by integrative or integrated. The devil is in the details, including deciding what evidence we require—as a profession and as individual practitioners. This is an important question for massage-training programs to consider. What determines what you teach, especially about the effects of massage? Consider the following study.

Craniosacral evidence

Two health-sciences faculty from Victoria University in Melbourne, Australia, conducted a study testing some central issues in craniosacral therapy.3

First they attempted to establish the interexaminer reliability of palpation of the cranial rhythmic impulse, the assumption being that if there is a palpable cranial rhythmic impulse that practitioners can be trained to identify accurately, then two people feeling a patient’s cranial rhythmic impulse at the same location (such as the head or sacrum) close in time should describe it similarly. Next they tested intrarater reliability, meaning that if the rhythm is real, a single practitioner tracking the rhythm of a series of patients should find relative consistency within each patient’s cranial rhythmic impulse (if the patient stays in the same state and same position), and greater variation across patients.

Reliability is important. Imagine if different pathologists got unreliably different findings from tumor biopsies so that no one really knew how to proceed in terms of treatment, or if five different X-rays of the same bone showed five different indications about whether or not there was a break. And what if the same pathologist did not always find the same indications regarding malignancy from the same biopsy? Since no modern technology has yet been able to definitively detect the cranial rhythmic impulse, human palpation, subjective as it may be, is our only diagnostic method. Thus it is reasonable to ask how reliable this clinical decision-making tool is. When palpating the cranial rhythmic impulse, do we really know when and how to intervene?

The third issue was an examination of the core-link hypothesis, described as “an involuntary movement of the sacrum caused by a lifting force exerted on the sacrum by the attachments of the spinal dura to the sacrum.” If the core-link hypothesis is true, then two practitioners feeling a patient’s cranial rhythmic impulse at the same time at different parts of the body should describe it identically.

The study used a within-subject repeated-measures design. Two osteopaths, experienced in craniosacral therapy, simultaneously palpated the cranial rhythmic impulse of a sample of 11 healthy subjects in a series of two-minute trials. Palpation results were recorded with one practitioner stationed at the head and another at the sacrum. Every time they felt a subject attain “full flexion,” they depressed a foot-switch interfaced with a computer. The examiners could neither see one another nor tell when the other depressed the switch. Subjects’ heart rates were monitored to ensure that any differences practitioners perceived in the subjects’ cranial rhythmic impulse were not due to differing states of arousal in the subject. Then the practitioners changed positions. Two trials were done in each position.4

Intraclass correlation coefficients, which indicate the extent of agreement between two measures (such as the two practitioners’ independent assessments of full flexion) were used to assess both interrater and intrarater reliability. Pearson product-moment coefficients were used to assess any pattern of association other than strict agreement.
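The difference between agreement and mere association can be illustrated with a minimal sketch. It assumes a two-way, absolute-agreement form of the intraclass correlation (often labeled ICC(2,1)); the column does not say which form the researchers used, and the rates below are made-up numbers, not data from the study. The function names are illustrative only.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def icc_2_1(ratings):
    """Two-way, absolute-agreement, single-rater ICC (an assumed form).

    ratings: one row per subject, one value per rater in each row.
    """
    n = len(ratings)        # number of subjects
    k = len(ratings[0])     # number of raters
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)   # between subjects
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)   # between raters
    ss_total = sum((v - grand) ** 2 for row in ratings for v in row)
    ss_err = ss_total - ss_rows - ss_cols                    # residual
    msr = ss_rows / (n - 1)
    msc = ss_cols / (k - 1)
    mse = ss_err / ((n - 1) * (k - 1))
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)

# Hypothetical rates in cycles per minute; examiner B always reads 3 higher.
examiner_a = [4.0, 5.0, 6.0, 7.0, 8.0]
examiner_b = [7.0, 8.0, 9.0, 10.0, 11.0]

r = pearson_r(examiner_a, examiner_b)
icc = icc_2_1([list(pair) for pair in zip(examiner_a, examiner_b)])
print(f"Pearson r = {r:.2f}, ICC = {icc:.2f}")
```

In this invented example the two examiners rise and fall together perfectly, so the Pearson coefficient is 1, yet they never report the same rate, so the agreement-based coefficient is far lower (5/14, roughly 0.36). That gap is presumably why a study like this would report both statistics.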

The cranial rhythmic impulse rates were calculated from the interval times between the “full flexion” moments when each practitioner depressed their foot switch. Mean rates, expressed in cycles per minute, were then calculated for each practitioner, subject and position. The results of a custom-modeled factorial analysis of variance showed significant differences between cranial rhythmic impulse rates recorded by the two examiners, between the rates recorded in different positions, and between the rates recorded in different subjects.
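As a rough illustration of that arithmetic, the sketch below turns a list of foot-switch press times into a rate in cycles per minute. It assumes the rate is taken as 60 divided by the mean interval between presses; the original article may have averaged differently. The function name and the press times are hypothetical.

```python
from statistics import mean

def cri_rate_cpm(press_times_s):
    """Cycles per minute from foot-switch press times given in seconds.

    Assumes each press marks one "full flexion" and that the rate is
    60 divided by the mean interval between consecutive presses.
    """
    if len(press_times_s) < 2:
        raise ValueError("need at least two presses to form an interval")
    intervals = [later - earlier
                 for earlier, later in zip(press_times_s, press_times_s[1:])]
    return 60.0 / mean(intervals)

# Hypothetical presses every 10 seconds, not data from the study.
presses = [0.0, 10.0, 20.0, 30.0]
print(cri_rate_cpm(presses))  # 6.0 cycles per minute
```

A mean rate per practitioner, subject and position would then be an average of such values over the trials in that condition.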

The intraexaminer reliability scores for a single body position were found to be “fair to good,” according to agreed-upon standards for this statistic, meaning that practitioner A consistently found roughly the same cranial rhythmic impulse rate at the head of a given subject. However, when scores from a single practitioner palpating a single subject at the head and the sacrum were compared, reliability was poor. The core-link hypothesis that a single rhythm could be palpated at various body locations was not supported by these findings.

Similarly, interexaminer reliability was poor to nonexistent. Thinking about clinical decision-making based upon cranial rhythmic impulse palpation, the authors find that in at least one subject one examiner palpated a cranial rhythmic impulse rate that could be considered “low or … associated with significant pathosis,” while the second examiner found it to be normal.

Questions are raised

It is not obvious what we should make of these results. It is obvious that there are certain conclusions we cannot draw from them. They do not support the notion that the cranial rhythm can be palpated reliably simultaneously at different parts of a body. They do not support the notion that trained practitioners will find “the same” cranial rhythmic impulse. The strong intrarater reliability scores lead me to think that each examiner was palpating something—but what was it?

These findings are troublesome, and yet two things are also true. First, we are indebted to the researchers who conducted this study (and others who have conducted similar studies). We need to look at these issues. We cannot simply pass on lore without testing it. Second, although I am a researcher used to making decisions based upon data, I am not inclined to ignore the seemingly beneficial results I see from my own clients and those of many cranial practitioners. I believe we are palpating something—but what?

Thinking about these data in light of the IOM report and the Kaptchuk and Miller article raises questions. Should a physician recommend craniosacral therapy to a patient who has been rear-ended if she has seen previous patients benefit in similar situations, even though studies like this call into question the very basic assumptions of the therapy? Should health-insurance companies pay for it?

Whatever your answer is, would you apply the same standards for the prescription of diabetes medication if the blood tests upon which the diagnosis was based had comparably low reliability scores?


1. The 300-plus-page report can be read and/or purchased online at
2. Academic Medicine, Vol. 80, No. 3, March 2005.
3. Moran, RW and Gibbons, P. “Intraexaminer and Interexaminer Reliability for Palpation of the CRI at the Head and Sacrum.” Journal of Manipulative and Physiological Therapeutics, Volume 24, Number 1, March/April 2001, pp. 183-190.
4. Space does not permit me to describe these procedures in more detail. They are presented fully in the original article, which I encourage you to read. I felt confident that the procedures and the statistical methods employed were adequate to provide a competent test of the hypotheses.

Janet Kahn, Ph.D., has been a massage therapist since 1970, and a researcher since 1978. She is past president of the American Massage Therapy Association Foundation and a current member of the NIH National Advisory Council on Complementary and Alternative Medicine. She is a consultant for hospitals, massage schools and medical schools on complementary-medicine research and curriculum development.


Interrater reliability: The degree to which two raters, observers or examiners, operating independently, assign the same ratings or values for an attribute being measured or observed. For this study, interrater reliability refers to the degree to which two experienced cranial rhythmic impulse palpators, operating independently, observed a subject reaching full flexion at the same rate.

Intrarater reliability: The same concept as interrater reliability, but applied to a single examiner across separate trials. For this study, it was the degree to which one examiner, palpating the cranial rhythmic impulse on the same subject at different moments, observed the subject reaching full flexion at the same rate.

Intraclass correlation coefficients: Statistical calculations that are often used as interrater reliability scores when multiple raters judge the same phenomena. They indicate the extent of correlation or covariance between the ratings of the two judges. For this study, the intraclass correlation coefficients indicate the extent to which examiners A and B tended to agree or disagree about the rate of the cranial rhythmic impulse.

Significant pathosis: Disease or indication of disease.

Core-link hypothesis: An involuntary movement of the sacrum caused by a lifting force exerted on the sacrum by the attachments of the spinal dura to the sacrum.

Within-subjects design: A research design in which a single group of subjects is compared under different conditions or at different points in time. For this study, researchers compared the cranial rhythmic impulse scores of the same subjects under different conditions, such as when they were palpated by examiner A and when they were palpated by examiner B.

—Janet Kahn