INTEREXAMINER RELIABILITY OF THORACIC MOTION PALPATION USING CONFIDENCE RATINGS AND CONTINUOUS ANALYSIS


FROM: J Chiropractic Medicine 2010 (Sep); 9 (3): 99–106 ~ FULL TEXT

OPEN ACCESS

Robert Cooperstein, MA, DC, Michael Haneline, MS, DC, and Morgan Young, DC

Palmer Chiropractic College,
San Jose, CA.

OBJECTIVE: Motion palpation is integral to most chiropractic techniques and can be found in the curricula of almost every chiropractic college. Paradoxically, most studies do not show strong reliability for motion palpation. The purpose of this study was to determine whether allowing motion palpators to rate their confidence in their findings, as well as using a continuous data analytic method, would influence the level of concordance.

METHODS: Subjects were 52 asymptomatic chiropractic student volunteers. Two palpators assessed posterior to anterior glide of T3–10 in the prone position, alternating in their order and blinded as to each other's results. Each examiner identified the location of maximal restriction in this range and also whether they were "very confident" or "not confident" in their finding.

RESULTS: For all subjects combined, interexaminer agreement was "poor": intraclass correlation coefficient [2,1] = 0.3110 (95% CI, 0.0458–0.5358). In contrast, interexaminer agreement was "good" when both examiners were very confident: intraclass correlation coefficient [2,1] = 0.8266 (95% CI, 0.6257–0.9253).

CONCLUSION: When each examiner was "very confident" as to the most fixated thoracic segment, the levels they identified were very close. This corresponds to "good" agreement, a result uncommon in interexaminer motion palpation studies. Thus, the confidence level of examiners had an effect on the interexaminer reliability of thoracic spine motion palpation. Our novel continuous measures, statistical methodology, and subtyping of the subjects according to the confidence of the palpators seem more capable than level-by-level discrete analysis of detecting interexaminer agreement.

From the FULL TEXT Article:

Introduction

The concept of vertebral misalignment and, thus, static listings was present in 1895 at the very beginning of chiropractic. Palmer, [1] describing his first adjustment, said: “An examination showed a vertebra racked from its normal position. I reasoned that if that vertebra was replaced, the man's hearing should be restored.” The concept of joint fixation and, thus, dynamic listings is almost as old, having been described as early as 1906 by Smith et al, [2] “A simple subluxated vertebra differs from a normal vertebra only in its field of motion and the center of its field of motion; because of its being subluxated, its various positions of rest are differently located than when it was a normal vertebra ... its field of motion may be too great in some directions and too small in others.” [3]

Although motion palpation (MP), the examination procedure most targeted at identifying joint fixation, was established early in the profession's history, it traditionally received far less emphasis than models of chiropractic subluxation based on vertebral misalignment. Nonetheless, the European Henri Gillet was a leading proponent of dynamic analysis throughout his career [4, 5] and strongly influenced the practice and teaching of an influential American exponent of motion palpation, Faye. [6] Improper motion (too much, too little, or improper coupling patterns) is now seen as a vitally important component of the chiropractic vertebral subluxation complex. [7]

Motion palpation in one form or another is integral to most chiropractic techniques and is taught within the core curriculum of virtually every chiropractic college. Paradoxically, the preponderance of information from dozens of reliability studies shows MP to be unreliable, in that palpators do not generally show concordance much above chance levels. [8] Indeed, literature reviews on the subject have reported kappa values suggesting only slight interexaminer reliability and moderate intraexaminer reliability. [9-12] On this basis, Troyanovich and Harrison [13] have opined that chiropractic colleges should desist from teaching MP and chiropractors should give up the practice. The subsequent review of Hestbaek and Leboeuf-Yde [14] came to the same conclusion after reviewing 15 studies of motion palpation for the lumbar spine and 6 for the sacroiliac joint: “the esteem chiropractors have for motion palpation in particular has not been substantiated by scientific data.” Panzer [11] and Russell [15] came to similar conclusions in their reviews.

Possible explanations for the generally poor reliability of MP have involved variation in procedure, [16] poor interexaminer spinal level localization leading to possible misreported discrepancies, [17, 18] and incorrect landmark rules. [19-21] It may be theoretically difficult, if not impossible, to determine the reliability of motion palpation. [10, 22] In motion palpating a research subject, the examiner who goes first most likely alters that subject, so we should not be surprised, nor excessively disheartened, to find that the second examiner does not come up with the same result. The first examiner may attenuate spinal restrictions because the palpatory procedure resembles mobilization, which is, after all, a treatment modality, or the first examiner may leave the subject with aggravated restrictions by stirring up guarding reactions from injured joints.

Mior et al [23] reported that lack of experience among examiners did not seem to be a viable explanation, nor did providing true or false information regarding the location of pain improve reliability. [24] Boline et al [25] did not find the use of asymptomatic or slightly symptomatic subjects an apparent confounder compared to using symptomatic subjects. Some researchers have attempted to collapse spinal regions during the analysis of MP data, but this statistical practice has been shown to inflate reliability and be methodologically unacceptable. [26] Since not all motion palpation procedures are the same, we hypothesized that the choice of method might impact upon the degree of reproducibility. After categorizing 44 studies as having used either an excursion or end-feel method, [27] we found that the high-quality studies did not establish that one method outperformed the other.

Against this backdrop of previous studies, we hypothesized that the study designs that had been heretofore used to investigate the reliability of MP may have been wanting. First, many or even most of the subjects in these studies may have lacked a significant fixation. In the absence of a gold standard as to which of the subjects may have been significantly fixated, the study designs could not distinguish between the following 2 possibilities: the examiners were unable to agree on the location of actual fixations, or the subjects simply lacked detectable fixations. Forcing examiners to say “fixated” or “not-fixated” level by level, although in some cases they might have preferred the option to say “I am not sure,” may not have given them enough choices and, thus, may have lowered concordance.

Second, although the profession continues to discuss the etiology of putative fixations, it remains possible that some involve the tethering of spinal segments by muscles and ligaments that span several segments. In such cases, one might expect fixation to manifest as a multilevel rather than strictly segmental finding. Then, asking examiners level by level if a segment is fixated, as every study we have seen has done, may be asking the wrong question. It may be more relevant to define agreement among examiners as having to do with how near their calls are to one another, rather than determining their concordance level by level.

When 2 examiners assess a patient with a musculoskeletal complaint, we would clearly like to distinguish cases in which they completely disagree on the location of a putative vertebral level thought to be clinically relevant and cases in which they almost agree on the location. Previous studies have stated that an examiner finding T8 movable and T9 fixated, and another finding T8 fixated and T9 movable, were in complete disagreement, whereas it would have been more illuminating to state they were in close agreement, although their calls were not identical. This type of assessment would better capture the essence of how MP is actually done: the palpator examines a region of the spine looking for the most fixated place(s).

The objective of this study was to assess the interexaminer reliability of thoracic MP by (a) taking into account the examiners' confidence in their palpation findings and (b) defining agreement as the proximity of the examiners' findings to each other.

Discussion

To accomplish motion palpation, the examiner introduces motion into joints to assess the range, pattern, and quality of movement. Perhaps the broadest distinction that might be made is between the motion palpation of intersegmental range of motion (ie, excursion), as compared with unisegmental motion (ie, end-feel [29]). In palpating for excursion, the examiner usually contacts elements of 2 or 3 vertebrae, using the fingers of the palpating hand to assess intersegmental movements, whereas the other hand imparts motion into the articulation(s). This is quantitative analysis, whereby the examiner estimates the amount of movement and generally categorizes the motion segments as hypermobile, normal, or hypomobile. In palpating end-feel, the examiner contacts a single segment, using the fingers of the palpating hand to apply overpressure into rotation, flexion-extension, and lateral flexion, whereas the other hand either stabilizes or assists in imparting motion. This is a more qualitative analysis whereby the results are interpreted in terms of the unisegmental character of movement; it may lack “springiness” or have a “hard end-feel.” In a previous systematic review of the literature, [27] we noted a trend for the end-feel method to be more reliable than the excursion method, although there was no statistically significant advantage when quality ratings of the relevant literature were taken into account. The trend for the end-feel method to outperform the excursion method determined our choice of palpatory methods in this study.

Visual inspection of the scatterplots in Fig 3 makes the result clear: when each examiner was “very confident” as to the most fixated thoracic segment (upper right scatterplot), the levels they identified were very close (within 1 vertebral level). This corresponds to “good” agreement, a result not seen, to our knowledge, in other interexaminer MP studies. The data for all subjects (upper left scatterplot) show much less concordance, and the lower plots, in which at least 1 doctor lacked confidence, show essentially no concordance. In addition to demonstrating that examiners can, under certain circumstances, agree in their fixation findings, our data suggest that minimally symptomatic and asymptomatic subjects do indeed manifest spinal findings that a palpator may experience as fixation, even in the absence of significant presenting patient complaints. Not surprisingly, among subjects found either barely fixated or significantly fixated at multiple levels, the examiners did not agree above chance levels.
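The stratified comparison described above can be illustrated with a minimal Python sketch. It computes ICC(2,1) from two-way mean squares in the Shrout and Fleiss form, which is assumed here to correspond to the “intraclass correlation coefficient [2,1]” reported in this study, and applies it first to all subjects and then to the subgroup in which both examiners were very confident. The function name `icc_2_1`, the records, and the coding of calls as thoracic level numbers are hypothetical and are not the study data.

```python
def icc_2_1(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    ratings: list of rows, one row per subject, one value per examiner.
    """
    n = len(ratings)       # number of subjects
    k = len(ratings[0])    # number of examiners
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    ss_subjects = k * sum((m - grand) ** 2 for m in row_means)
    ss_examiners = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_subjects - ss_examiners

    bms = ss_subjects / (n - 1)             # between-subjects mean square
    jms = ss_examiners / (k - 1)            # between-examiners mean square
    ems = ss_error / ((n - 1) * (k - 1))    # residual mean square
    return (bms - ems) / (bms + (k - 1) * ems + k * (jms - ems) / n)


# Hypothetical records (NOT study data):
# (examiner 1 call, examiner 2 call, both examiners very confident?)
records = [
    (4, 4, True), (7, 8, True), (9, 9, True), (5, 6, True), (8, 8, True),
    (3, 9, False), (10, 4, False), (6, 3, False), (5, 9, False), (7, 5, False),
]

all_pairs = [[a, b] for a, b, _ in records]
confident_pairs = [[a, b] for a, b, both in records if both]

print("ICC(2,1), all subjects:  ", round(icc_2_1(all_pairs), 3))
print("ICC(2,1), both confident:", round(icc_2_1(confident_pairs), 3))
```

In this made-up data set the confident subgroup yields a much higher coefficient than the pooled data, which mirrors the direction of the contrast reported in the study but says nothing about its magnitude.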

Previous studies that used a forced call paradigm, in which the rater had to find the subjects fixated or not at each spinal level, were very demanding for the examiners, in that they were required to identify all fixations, under the implicit assumption that they were of the same severity. It is unlikely that examiners will agree with one another when the signal-to-noise ratio is very low. These studies analyzed their data using the κ statistic. However, the value of the κ statistic changes when the prevalence of the attribute being tested varies and/or when bias (the degree of disagreement between raters on the proportion of positive or negative cases) is present. [30]
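The prevalence effect can be made concrete with a short Python sketch. It computes Cohen's κ for 2 raters from the 4 cells of a 2 × 2 table and compares 2 hypothetical tables with identical observed agreement (80%) but different prevalence of “fixated” calls. The function name `cohen_kappa` and all counts are invented for illustration and are not drawn from any of the cited studies.

```python
def cohen_kappa(both_pos, a_only, b_only, both_neg):
    """Cohen's kappa for two raters from the four cells of a 2x2 table."""
    n = both_pos + a_only + b_only + both_neg
    observed = (both_pos + both_neg) / n
    a_pos = (both_pos + a_only) / n   # proportion rater A calls positive
    b_pos = (both_pos + b_only) / n   # proportion rater B calls positive
    # Agreement expected by chance, given each rater's marginal rates
    expected = a_pos * b_pos + (1 - a_pos) * (1 - b_pos)
    return (observed - expected) / (1 - expected)


# Hypothetical counts: both tables have 80 agreements out of 100 ratings.
balanced = cohen_kappa(40, 10, 10, 40)   # "fixated" called half the time
rare = cohen_kappa(5, 10, 10, 75)        # "fixated" called 15% of the time

print(round(balanced, 2))  # 0.6
print(round(rare, 2))      # 0.22
```

With the same raw agreement, κ falls from 0.6 to about 0.22 when positive calls become uncommon, which is the sensitivity to prevalence noted above.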

When 2 examiners assess a patient with a musculoskeletal complaint, we would clearly like to distinguish cases in which they completely disagree on the location of a putative vertebral level that is relevant from cases in which they almost agree on the location. Previous studies would have stated that an examiner finding T8 movable and T9 fixated, and another finding T8 fixated and T9 movable, disagreed, whereas in our study, we would find them to be in close agreement, although their calls are not identical.
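The difference between the 2 scoring approaches can be shown in a few lines of Python. This is a minimal sketch in which calls are coded as thoracic level numbers (T8 as 8, and so on); the function names and the example calls are hypothetical, and the coding is an assumption made for illustration rather than the measurement scale used in the study.

```python
def exact_match(level_a, level_b):
    """Discrete scoring: the examiners agree only if they name the same level."""
    return level_a == level_b


def segments_apart(level_a, level_b):
    """Continuous scoring: distance between the two calls, in vertebral levels."""
    return abs(level_a - level_b)


# Hypothetical "most fixated" calls for two subjects (examiner 1, examiner 2)
pairs = {"near miss": (9, 8), "far apart": (3, 10)}

for label, (a, b) in pairs.items():
    print(label, exact_match(a, b), segments_apart(a, b))
```

Discrete scoring records both pairs as the same kind of disagreement, whereas the continuous measure separates calls that are 1 level apart from calls that are 7 levels apart.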

Among the some 4 dozen MP studies we discussed in an annotated review of motion palpation, [8] Potter et al [31] were the only investigators to have used a most-fixated-segment paradigm similar to ours. Like us, these investigators used ICC for the purposes of analysis. Since theirs was an intraexaminer study, unlike ours, and furthermore used findings in addition to MP to assess agreement, we cannot otherwise compare the results of their study with our own.

In our study, the palpators did not have any verbal interaction with the subjects, so that findings of fixation could be considered central to their identification of dysfunctional spinal segments. We wanted to avoid confounding our objective findings with subjective information about pain or tenderness. Although the motion palpation procedures described and tested by Jull et al [32, 33] are sometimes regarded to be valid, their interpretation is questionable because in their work the examiner's finding of restriction is commingled with other findings, such as patient-reported tenderness and soft-tissue textural changes. Thus, it is not clear that the finding of fixation per se is central to their identification of dysfunctional spinal segments.

We should note that at least 1 other MP reliability study acknowledged the excessive stringency of assessing interexaminer agreement at individual motion segments. In a study by Christensen et al, [34] examiners were considered to be in agreement when their calls were within ±1 spinal segment of each other. Intraexaminer reliability was reported to be good (κ = 0.59 to 0.77), whereas interexaminer reliability was low (κ = 0.24 and 0.22). In addition, Humphreys et al [35] used a “most fixated level” method in a validity study, which assessed the accuracy of blinded palpators in detecting fixation in 3 subjects with congenital block vertebrae as a reference standard.

Limitations

The sample size was relatively small after it was stratified by degree of confidence. Nonetheless, there was a robust contrast between reported indices of agreement when examiners were very confident compared with when they were not confident. The use of convenience samples (mostly asymptomatic subjects) in MP studies has been previously criticized [14, 18] although there is some evidence it makes no difference. [24] It appeared our study featured a mix of subjects with varying degrees of fixation, partially satisfying the acknowledged condition that diagnostic studies include a spectrum of subjects with the target disorder. [36]

We did not explore the question as to whether an examiner's lack of confidence was related to the absence of palpable fixations or the existence of multiple fixations (wherein no maximally fixated segment could be identified) because the subgroups were too small. It is not obvious why examiner agreement was limited to level and did not include side-specificity. The examiners did not call out many such multiple fixations, and when they did, poststudy discussion suggested the data were inconsistently recorded, precluding analysis. The results of this study of thoracic MP may not be relevant to studies of cervical and lumbar MP using similar methodology. With data collection in progress, we will report on similar studies of these regions in other publications.

The most glaring limitation of this study, at least as we see it, is not related to its methodology or findings so much as to the uncertain clinical value of MP in determining the optimal locations to target adjustive and other therapeutic procedures. In other words, to our knowledge it has not been demonstrated that the information provided by MP improves the outcome of clinical care. Indeed, at least 1 study by Haas et al [37] suggests that end-play assessment does not contribute to same-day clinical improvement in the cervical spine, although the investigators do not rule out a possible contribution over a longer term. Moreover, that study's design did not allow distinguishing between MP being intrinsically not useful, the adjustor being nonspecific, or the motion palpator being inaccurate.

It is difficult to discern from the literature which type of examination findings would be most clinically informative in deciding where and how to adjust chiropractic patients. The PARTS acronym (pain, asymmetry, range of motion, tone/texture/temperature, and special tests) [38] is widely respected in chiropractic, as is the very similar TART acronym (tissue texture changes, asymmetry, restriction of motion, tenderness) in osteopathy, [39] but assessment of their clinical utility awaits outcome studies. In the meantime, the absolute and relative importance of fixation, pain provocation, tenderness, temperature asymmetry, misalignment, functional leg length inequality, or other types of examination findings remains unclear.

Conclusions

The confidence level of examiners has an effect on the interexaminer reliability of thoracic spine MP, such that agreement is “good” when examiners are “very confident” in their calls and not above chance levels when at least one of them is not. Looking at the data set as a whole, unstratified by degree of examiner confidence, our results resemble those of other investigators, in that the index of agreement is low. Thus, we believe using continuous measures methodology, and defining subgroups according to the confidence of the palpators, is more capable than level-by-level discrete analysis of detecting interexaminer agreement. We also believe our analytic method better reflects what motion palpators, who presumably look for maximally fixated levels within a spinal region logically related to a patient complaint, actually do.

We would suggest that future studies deploying a similar methodology, using confidence ratings and continuous analysis, use a more representative mix of study subjects, some with and some without clinically significant pain. Moreover, we would avoid using the “not confident” rating to refer to 2 very different clinical situations: the finding of multiple fixations and that of not finding any significant fixations at all. This complicates the analysis greatly because it confounds subjects that seem so fixated that multiple levels would be chosen, and others who seem not fixated at all.

Ultimately, it is desirable that chiropractic education mirror, as closely as possible, the clinical situations that graduates are likely to encounter. Since we would expect clinicians to detect, record, and treat the most fixated level(s) within a range including the area of chief complaint, we would think it reasonable to teach chiropractic interns to examine patients in just that way. This would be more relevant than asking them to agree or disagree, level by level, on the segmental motion or lack thereof, with the instructors or with each other.
