Reliability of McKenzie Classification of
Patients with Cervical or Lumbar Pain

FROM:   J Manipulative Physiol Ther 2005 (Feb); 28 (2): 122–127

Helen A. Clare, PT, MAppSc • Roger Adams, PhD • Christopher G. Maher, PT, PhD

School of Physiotherapy,
The University of Sydney,
2141, Australia.

Background:   In the McKenzie system, patients are classified first into syndromes, then into subsyndromes. At present, the reliability of classification with this system is unclear. No study has included patients with cervical pain, and the studies to date have reported conflicting results.

Objective:   The aim of the study is to investigate the interexaminer reliability of the McKenzie classification system for patients with cervical or lumbar pain.

Subjects:   Fifty patients with spinal pain (25 with lumbar pain and 25 with cervical pain) were included in the study.

Method:   The patients were assessed simultaneously by 2 physical therapists, drawn from a pool of 14 therapists trained in the McKenzie method. Agreement was expressed using the multirater kappa coefficient and percent agreement for classification into (i) syndromes and (ii) subsyndromes.

Results:   The reliability for syndrome classification was kappa = 0.84 with 96% agreement for the total patient pool, kappa = 1.0 with 100% agreement for lumbar patients, and kappa = 0.63 with 92% agreement for cervical patients. The reliability for subsyndrome classification was kappa = 0.87 with 90% agreement for the total patient pool, kappa = 0.89 with 92% agreement for lumbar patients, and kappa = 0.84 with 88% agreement for the cervical patients.

Conclusion:   The McKenzie assessment performed by persons trained in the McKenzie method may allow for reliable classification of patients with lumbar and cervical pain.

Key Indexing Terms   Spine • Pain • Physical Examination • Reproducibility of Results

From the FULL TEXT Article:


The McKenzie method for evaluation and treatment of patients is frequently used by clinicians. [1, 2] A number of reviews have concluded that the method is effective for the treatment of low-back pain (LBP) [3-5]; however, at present, there are no reviews that have evaluated efficacy for cervical pain. There is some evidence that treatment given according to subclassification is more effective than treatment given to an unselected population. [6, 7] A key aspect of the McKenzie approach is that the patients receive individualized treatment based upon their clinical presentation. The McKenzie method uses an assessment process, which aims to identify subgroups of patients within the nonspecific spinal pain population whose symptoms behave in a similar way when subjected to mechanical forces within the physical examination. The classification into subgroups then directs treatment. [8]

Patients are classified into 1 of the 3 McKenzie syndromes (derangement, dysfunction, postural), and those whose presentation does not fit 1 of the 3 syndromes are classified in an “other” grouping.

The derangement syndrome is the most prevalent treatment classification. [8] This syndrome is characterized by the centralization and peripheralization of symptoms in response to repeated movements or sustained postures of the lumbar spine within the physical examination. For example, a patient whose leg and back pain is increased with repeated or sustained flexion movements and reduced with repeated or sustained lumbar extension movements would be placed into the category of derangement and the subcategory of posterior derangement. In this example, the patient's directional preference is for extension movements; therefore, the patient would be treated with exercises in this direction and encouraged to avoid flexion movements and postures. With the McKenzie approach, the emphasis is on self-management, with manipulative techniques reserved for those patients who do not respond to self-management measures. [8-10]

The dysfunction syndrome is characterized by intermittent spinal pain that is reproduced at the end range of a restricted movement. Treatment emphasizes mobilizing exercises in the direction of movement that reproduces pain. The treatment rationale is that the exercise will remodel the tissues limiting the movement.

The postural syndrome is characterized by intermittent spinal pain, which is produced with static positioning of the spine and abolished by moving the patient out of the static position. Treatment consists of patient education and avoidance of the provocative postures. The “other” grouping is used for patients in whom a mechanical syndrome cannot be identified after several days of mechanical evaluation. [8] Referral for further medical review is then indicated.

At present, the reliability of the McKenzie classification for patients with spinal pain has not been clearly established. To date, no study has included subjects with cervical pain. This has clinical relevance because cervical pain is a common complaint in primary care. Although 4 studies have investigated lumbar patients, none provides convincing estimates of the reliability to be expected when trained therapists classify patients into the syndromes and subsyndromes described by McKenzie.

Problems include

(i)   the use of minimally trained therapists, [11, 12]

(ii)   the use of a small number of therapists thus limiting the ability to generalize, [11, 13, 14]

(iii)   not using a chance-corrected measure of agreement, [11]

(iv)   not evaluating classification into all subsyndromes, [11, 12] and

(v)   use of categories not part of the McKenzie system. [11]

The aim of the current study was to address this gap in the literature and to clearly establish the intertester reliability of the McKenzie system. We studied formally trained therapists classifying patients with cervical or lumbar pain into both syndromes and subsyndromes.


This study was approved by the University of Sydney Human Ethics Committee.


Patients attending private physical therapy clinics in Australia for treatment of cervical or back pain, with or without radiation into the limb, were invited to participate in the study.


All examiners (n = 14) were physical therapists who had completed the McKenzie credentialing examination. Seven had also completed the McKenzie diploma qualification. Each rater examined between 2 and 8 patients.

Table 1

Treating therapists invited their patients to participate in the study. Patients received an explanation of what was required of them and signed an informed consent form. Before clinical assessment, information was collected from the patients regarding their sex, age, weight, height, location of symptoms, duration of symptoms, working status, previous history of LBP, pain intensity, pain frequency, and functional status (Table 1).

The patients underwent an assessment as described by McKenzie [9, 10] with a standard McKenzie assessment form (cervical or lumbar) used to record the findings. Pairs of therapists simultaneously assessed the patient. One of the examiners was the treating therapist and directed the assessment, and the other was the researcher who observed the assessment and did not speak with the patient beyond the initial greetings.

At the completion of the assessment, the 2 therapists moved to separate rooms and independently recorded their classification of the patient on a form provided by the researcher (Appendix A). The forms were then placed inside an envelope together with a copy of the patient assessment form and sealed. The 2 sealed envelopes were then placed inside a larger envelope together with the patient demographic forms, which were posted to the investigator's office. The classifications were then coded to allow for data analysis (Appendix B).

Interrater reliability was estimated by calculating percent agreement and multirater κ (an unweighted form of κ) using the MKAPPASC.SPS macro in SPSS 10.0 (SPSS, Chicago, Ill). Also calculated were 95% confidence intervals for κ.

Table 2

Table 3

Table 4

The ratings for classification into syndromes and subsyndromes are shown in Tables 2 and 3. The prevalence of the derangement syndrome was 88% for the first rater and 84% for the second rater; dysfunction syndrome was 0% and 4%, respectively; postural syndrome was 0% for both raters; and “other” was 12% and 12%, respectively (Table 2). Within the classification of derangement, 86% of the patients were classified as subsyndromes D1, D3, and D5 (Table 3).

The percent agreement for assignment into syndromes for the total patient pool was 96%, with the point estimate and 95% confidence interval for κ being 0.84 (95% confidence interval [CI] 0.35-1.0). For subsyndromes, the percent agreement was 90% with κ = 0.87 (95% CI 0.71-1.0). For the 25 lumbar patients, the percent agreement was 100% with κ = 1.0 (95% CI 0.35-1.0) for syndrome classification, and 92% agreement with κ = 0.89 (95% CI 0.66-1.0) for subsyndromes. For the 25 cervical patients, the percent agreement was 92% with κ = 0.63 (95% CI -0.11 to 1.0) for syndromes, and 88% agreement with κ = 0.84 (95% CI 0.60-1.0) for subsyndromes (Table 4).
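The syndrome-level figures for the total pool can be checked with a short calculation. The sketch below is an illustration, not the study's SPSS macro. The paired counts are inferred from the percentages reported above (50 patients, 96% agreement, rater 1 assigning no patient to dysfunction); they are not copied from Table 2 and should be treated as an assumption. Two chance-corrected forms are shown: Cohen's kappa, which uses each rater's own category proportions, and a pooled-proportion form, which is assumed here to resemble what a multirater kappa does with 2 raters. The function names are hypothetical.

```python
from collections import Counter

# Paired syndrome ratings (rater 1, rater 2) for the 50 patients.
# ASSUMPTION: counts inferred from the reported percentages, not taken from Table 2.
pairs = (
    [("derangement", "derangement")] * 42
    + [("derangement", "dysfunction")] * 2
    + [("other", "other")] * 6
)

def percent_agreement(pairs):
    """Proportion of patients given the same label by both raters."""
    return sum(a == b for a, b in pairs) / len(pairs)

def cohen_kappa(pairs):
    """Chance-corrected agreement using each rater's own marginal counts."""
    n = len(pairs)
    p_o = percent_agreement(pairs)
    r1 = Counter(a for a, _ in pairs)
    r2 = Counter(b for _, b in pairs)
    p_e = sum(r1[c] * r2[c] for c in set(r1) | set(r2)) / n**2
    return (p_o - p_e) / (1 - p_e)

def pooled_kappa(pairs):
    """Chance-corrected agreement using category proportions pooled over raters."""
    n = len(pairs)
    p_o = percent_agreement(pairs)
    pooled = Counter(a for a, _ in pairs) + Counter(b for _, b in pairs)
    p_e = sum((count / (2 * n)) ** 2 for count in pooled.values())
    return (p_o - p_e) / (1 - p_e)

print(f"agreement    = {percent_agreement(pairs):.2f}")  # 0.96
print(f"Cohen kappa  = {cohen_kappa(pairs):.2f}")        # 0.84
print(f"pooled kappa = {pooled_kappa(pairs):.2f}")       # 0.84
```

Under these inferred counts both forms round to the reported value of 0.84, so the choice between them makes little difference for the total pool.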


The principal finding of this study is that trained McKenzie therapists are able to classify patients with spinal pain into the categories described by McKenzie with good reliability. The reliability measures for lumbar patients (κ = 0.89) and cervical patients (κ = 0.84) were both acceptably high for subsyndrome classification. Reliability for syndrome classification was less clear, with a κ value and percent agreement clearly acceptable for lumbar patients (1.0 and 100%, respectively), whereas for cervical patients, the percent agreement was acceptably high, but the κ value was not. We attribute this pattern of results to the behavior of κ when prevalence is low (the low base-rate problem), where only 1 or 2 disagreements between raters can markedly reduce the κ value. Although there is some ambiguity about the reliability of cervical syndrome classification (because of the conflicting κ and agreement scores), there is no ambiguity about the reliability of cervical subsyndrome classification.
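The low base-rate effect can be illustrated with two hypothetical 2-category tables for 25 patients. Both have the same 92% agreement (2 disagreements), but in the second almost every patient falls into one category. The counts are invented for illustration and are not the study's cervical data; the function name `kappa_2x2` is also hypothetical, and Cohen's two-rater form is used for simplicity.

```python
def kappa_2x2(both_a, both_b, only_r1_a, only_r2_a):
    """Return (percent agreement, Cohen's kappa) for 2 raters and 2 categories.

    both_a:    both raters chose category A
    both_b:    both raters chose category B
    only_r1_a: rater 1 chose A, rater 2 chose B
    only_r2_a: rater 2 chose A, rater 1 chose B
    """
    n = both_a + both_b + only_r1_a + only_r2_a
    p_o = (both_a + both_b) / n
    r1_a = (both_a + only_r1_a) / n  # rater 1's proportion of A
    r2_a = (both_a + only_r2_a) / n  # rater 2's proportion of A
    p_e = r1_a * r2_a + (1 - r1_a) * (1 - r2_a)  # agreement expected by chance
    return p_o, (p_o - p_e) / (1 - p_e)

# Hypothetical counts: categories used about equally often.
balanced = kappa_2x2(both_a=12, both_b=11, only_r1_a=1, only_r2_a=1)
# Hypothetical counts: category B is rare (low base rate).
skewed = kappa_2x2(both_a=22, both_b=1, only_r1_a=1, only_r2_a=1)

print(f"balanced: agreement {balanced[0]:.0%}, kappa {balanced[1]:.2f}")  # 92%, 0.84
print(f"skewed:   agreement {skewed[0]:.0%}, kappa {skewed[1]:.2f}")      # 92%, 0.46
```

With identical observed agreement, the skewed table has much higher chance-expected agreement, so the same 2 disagreements pull kappa down far more.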

Table 5

To date, 4 studies have assessed the reliability of the McKenzie classification for patients with lumbar pain. The first 2 studies reported low reliability, [11, 12] whereas the 2 more recent studies have reported acceptably high reliability. [13, 14] However, direct comparison between studies is difficult because each of the 4 studies used different design and analysis strategies. Studies differed in terms of (i) the number of raters, (ii) the number of patients, (iii) whether raters assessed the patient simultaneously or consecutively, (iv) the level of training of the raters, (v) the classification categories used, and (vi) the coefficient of agreement presented. These differences between the studies and the current study are summarized in Table 5.

The 2 most recent studies [13, 14] provide the highest estimates of the reliability of classification into the syndromes and subsyndromes. The design and analysis of these 2 studies were similar to the present study; however, we believe that the use of a larger pool of raters and the inclusion of both cervical and back pain patients in our study allow greater generalizability of our results. Interestingly, the reliability observed here for the total patient pool (which included patients with symptoms of lumbar and cervical origin) was of a similar magnitude to that reported by Razmjou et al [14] for syndromes and by Kilpikoski et al [13] for subsyndromes, for patients with pain of lumbar origin.

The 2 studies that have reported unacceptably low reliability both recruited raters with minimal training in the McKenzie assessment method. In the Riddle and Rothstein [12] study, two thirds of the raters had not completed any formal McKenzie education, a feature which may explain the low reliability observed in that study. The raters in the Kilby et al [11] study had completed only the basic training (parts A and B) in the McKenzie education program; however, it is not clear whether the low reliability found in that study was the result of insufficient training or the use of a classification system different from that advocated by McKenzie. It would seem that, from a consideration of the studies conducted to date, raters need to have at least reached the level of credentialing to use the classification system with acceptable reliability. As yet, no study has directly addressed this issue, and it would be of interest to conduct further research to establish the minimal level of education required to use the system with good reliability.

This study is the first that has assessed the reliability of the McKenzie classification for cervical pain patients. Obtained point estimates for κ values for the patients with lumbar pain were higher than those for cervical pain: syndromes κ = 1.0 versus 0.63; subsyndromes κ = 0.89 versus 0.84; however, in each case, the 95% confidence intervals overlap, so the difference in reliability could have arisen by chance. To resolve this, a more precise estimate of reliability is required, and further study using a larger pool of raters is recommended.
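The overlap statement can be checked directly from the confidence limits reported in the results. The sketch below is only an informal check (overlapping intervals are not a formal test of the difference between two kappa values), and the helper name `intervals_overlap` is hypothetical.

```python
def intervals_overlap(a, b):
    """True if closed intervals a = (low, high) and b = (low, high) share any point."""
    (a_lo, a_hi), (b_lo, b_hi) = a, b
    return max(a_lo, b_lo) <= min(a_hi, b_hi)

# 95% confidence limits for kappa as reported in the results section.
ci = {
    "syndromes": {"lumbar": (0.35, 1.0), "cervical": (-0.11, 1.0)},
    "subsyndromes": {"lumbar": (0.66, 1.0), "cervical": (0.60, 1.0)},
}

results = {
    level: intervals_overlap(regions["lumbar"], regions["cervical"])
    for level, regions in ci.items()
}
for level, overlap in results.items():
    print(f"{level}: intervals overlap = {overlap}")
# syndromes: intervals overlap = True
# subsyndromes: intervals overlap = True
```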

Table 6

The prevalence of the syndrome classifications found in our study is similar to that reported by both Razmjou et al [14] and Kilpikoski et al [13] (Table 6). The majority of the patients were classified as derangement in all 3 studies, a finding that is consistent with McKenzie, [9] who states “most patients develop pain and seek assistance as the result of derangement.” Within the derangement classification, 86% of the patients fell into the subsyndromes of D1, D3, and D5. McKenzie [9] also states that approximately 89% of derangements will fall into the subsyndromes of D1, D3, and D5. In the patient population of Razmjou et al, [14] 90% were classified in these subsyndromes. This is in contrast to the findings of Kilpikoski et al, [13] who found that the majority (56%) of the patients were classified into derangement D4. Differences in the health care setting for the provision of the patients for the study may provide an explanation for their result.

One limitation of this study was that the 95% confidence intervals for κ were quite broad. A lack of precision is a feature of the traditional reliability study that uses a small number of raters (typically 2) assessing a larger number of patients. The problem with this design is that with only 2 raters, increasing the number of patients provides a very inefficient method of increasing power. A more efficient method is to expand the number of raters beyond 2; however, this may not be practical for clinical assessments. For example, the measure may be reactive (the potential for the attribute being measured to change with repeated measurement), [15] or the time involved in repeated measurements may make participation unattractive to a patient. We are currently considering other designs to address these problems in clinical reliability studies.


The McKenzie assessment, when performed by therapists with training in the McKenzie method, allows for reliable classification of patients with lumbar and cervical pain.

Supplementary Material

APPENDIX A + B


  1. Battie MC, Cherkin DC, Dunn R, Ciol MA, Wheeler K.
    Managing lumbar pain:
    Attitudes and treatment preferences for physical therapists.
    Phys Ther 1994;74:219-26.

  2. Foster NE, Thompson KA, Baxter GD, Allen JM.
    Management of non specific lumbar pain by physiotherapists
    in Britain and Ireland.
    Spine 1999;24:1332-42.

  3. Rebbeck T.
    Position statement on the efficacy of physiotherapy interventions for the treatment of low back pain.
    Melbourne: Australian Physiotherapy Association; 2002.

  4. Maher C, Latimer J, Refshauge K.
    Prescription of activity for low-back pain: What works?
    Aust J Physiother 1999;45:121-32.

  5. Danish Institute for Health and Technology Assessment.
    Low back pain frequency, management and prevention from an HTA perspective.
    Danish Health Technology Assessment 1999;1:1-106.

  6. Fritz JM, Delitto A, Erhard RE.
    Comparison of classification-based physical therapy with therapy based
    on clinical practice guidelines for patients with
    acute low back pain. A RCT.
    Spine 2003;28:1363-72.

  7. Long A, Donelson R.
    Does it matter which exercise? A multicentred RCT of low back pain subgroups.
    Proceedings of the McKenzie Institute 8th International Conference;
    2003 Sep 12-14; Rome.
    Waikanae (New Zealand):
    McKenzie Institute International; 2003.

  8. McKenzie RA, May S.
    The lumbar spine: Mechanical diagnosis and therapy.
    2nd ed. Waikanae (New Zealand): Spinal Publications; 2003.

  9. McKenzie RA.
    The cervical and thoracic spine: Mechanical diagnosis and therapy.
    Waikanae (New Zealand): Spinal Publications; 1990.

  10. McKenzie RA.
    The lumbar spine: Mechanical diagnosis and therapy.
    Waikanae (New Zealand): Spinal Publications; 1981.

  11. Kilby J, Stigant M, Roberts A.
    The reliability of back pain assessment by
    physiotherapists, using a “McKenzie algorithm”.
    Physiotherapy 1990;76:579-83.

  12. Riddle D, Rothstein J.
    Intertester reliability of McKenzie’s classifications of
    the syndrome types present in patients with lumbar pain.
    Spine 1993;18:1333-44.

  13. Kilpikoski S, Airaksinen O, Kankaanpaa M, Leminen P, Videman T, Alen M.
    Interexaminer reliability of lumbar pain assessment using the McKenzie Method.
    Spine 2002;27:E207-14.

  14. Razmjou H, Kramer J, Yamada R.
    Intertester reliability of the McKenzie evaluation in
    assessing patients with mechanical low-back pain.
    J Orthop Sports Phys Ther 2000;30:384-6.

  15. Campbell D, Stanley J.
    Experimental and quasi-experimental designs for research.
    Chicago (Ill): Rand McNally & Company; 1963.

