Interexaminer Reliability of a Multidimensional Battery
of Tests Used to Assess for Vertebral Subluxations

FROM:   Chiropractic Journal of Australia 2016; 46 (1): 100–117

Kelly Holt, David Russell, Robert Cooperstein, Morgan Young, Matthew Sherson, Heidi Haavik

Centre for Chiropractic Research,
New Zealand College of Chiropractic,
Auckland, New Zealand

Objective:   The purpose of this study was to investigate the interexaminer reliability of assessing for vertebral subluxations using a multidimensional battery of tests and continuous measures analysis approach.

Methods:   70 participants were assessed by 2 blinded examiners. Examiners used a multidimensional battery of tests to assess for vertebral subluxations in 3 regions (cervical, thoracic, lumbar) of the spine, and indicated which segment had the most positive test findings in each spinal region. The distance was measured from the segment to marks that had been placed on the spine. Interexaminer reliability was determined by calculating the median absolute examiner difference in vertebral equivalents (VEs), where a VE is the height of a typical vertebra in each region of the spine. If the median examiner difference was ≤ 1VE, there was definite agreement on the motion segment that had the most subluxation findings. Differences > 1VE but ≤2VE suggested agreement on the same motion segment, and differences >2VE precluded agreement on the same motion segment.

Results:   Median absolute examiner differences were 0.5 vertebral equivalents in the lumbar region, 1.0 vertebral equivalent in the cervical and thoracic regions, and 0.6 vertebral equivalents when combined across all regions. In the combined dataset, definite agreement (≤1 vertebral equivalent) occurred 63.3% of the time, possible agreement 19.0% of the time, and definite disagreement 17.6% of the time.


Conclusion:   A multidimensional approach to vertebral subluxation assessment was reliable between examiners for detecting the level of vertebral subluxation in all regions of the spine. Median absolute examiner differences indicated examiners agreed on the motion segment with the most positive vertebral subluxation test findings most of the time. Vertebral subluxation assessment agreement, when analyzed using continuous data, indicates much higher reliability than has previously been associated with assessing agreement using discrete data.

From the FULL TEXT Article:


The primary objective of the chiropractic profession is to improve (primarily) spinal function in order to either improve nervous system function and general health and/or prevent or manage neuromusculoskeletal conditions. [1–3] To do this, chiropractors identify, analyze and correct areas of vertebral subluxation (sometimes referred to as spinal dysfunction) using a variety of chiropractic adjustment techniques, which predominantly involve manual procedures. [1, 3] However, there seems to be little agreement on what constitutes vertebral subluxation or what to call it. [4, 5] It has variously been termed subluxation, vertebral subluxation, the vertebral subluxation complex, the chiropractic subluxation, spinal dysfunction, biomechanical joint dysfunction, or a manipulable or functional spinal lesion. [4, 6–9] The term traditionally, or historically, used by the chiropractic profession to define this dysfunction is vertebral subluxation. [10, 11] Recently a group of chiropractic colleges, known as The Rubicon Group, released a definition of ‘chiropractic subluxation’ that provided a testable model for this clinical entity. [9]

In their definition and position statement this group states that:

“We currently define a chiropractic subluxation as a self-perpetuating, central segmental motor control problem that involves a joint, such as a vertebral motion segment, that is not moving appropriately, resulting in ongoing maladaptive neural plastic changes that interfere with the central nervous system’s ability to self-regulate, self-organize, adapt, repair and heal.”

This definition provides a model that includes joints outside of the spine, so uses the term ‘chiropractic subluxation’, instead of the more exclusive term vertebral subluxation. One of the reasons for the release of this definition was that there is currently little consensus regarding the nature of the vertebral subluxation or its associated neurological manifestations. [6, 9] One issue that has led to this paradox is that the chiropractic profession has struggled to demonstrate that it can reliably identify vertebral subluxations. [7, 12]

Vertebral subluxation assessment generally involves evaluating what have been described as the ‘pathophysiological consequences of manipulable lesions’. [7] These have been loosely aggregated into overlapping categories that are often referred to as a PARTS evaluation. [13] The categories include: Pain, Asymmetry, changes in relative Range of motion, changes in Tissue temperature/texture/tone, and other findings that can be identified using Special tests. [7, 13] Some methods of vertebral subluxation assessment, such as pain provocation at segmental levels, have been described as being reliable and valid. [7] However, many of the methods commonly used by chiropractors to functionally assess the spine have previously been found to have limited interexaminer reliability. [7]

In chiropractic practice a montage of examination tests, in combination with other aspects of the patient presentation, history, and preferences, is generally used to decide where to deliver a chiropractic adjustment, as opposed to a single evaluation method such as motion palpation alone. [14, 15] A number of studies have therefore used a combination of assessment methods to identify areas of vertebral subluxation with reliability results varying from poor to substantial. [7, 12, 16] When considering the results of these trials collectively, and taking study quality into account, it remains unclear whether multidimensional approaches contribute more than their component elements when deciding where to adjust the spine. [7, 12]

A continuous measures system, combined with an assessment of examiner confidence, was found to lead to improved levels of interexaminer reliability for spinal motion palpation assessment. [17–21] The primary objective of this study was to use this continuous measures system to determine whether examiners agree on the vertebral segment with the most indicators for adjustment based on the findings of a multidimensional battery of tests that can be used to assess for vertebral subluxations.


      Design and Setting

This interexaminer reliability trial was conducted at the Chiropractic Centre (student training facility) of the New Zealand College of Chiropractic (NZCC) during regular operating hours. The trial was approved by the NZCC Research Committee and was given exemption from formal external ethical review by the local Ministry of Health ethics committee as it was deemed to be an evaluation of an existing practice against a standard that did not significantly differ from standard practice and/or quality assurance.


A convenience sample was recruited from patients presenting to the Chiropractic Centre. Potential participants eligible for inclusion were all public patients attending the Chiropractic Centre during data collection sessions who were over the age of 18 and verbally consented to participate in the study. Data collection took place during regular shift times in the Chiropractic Centre when study examiners and research assistants were available. All public patients attending the Chiropractic Centre during these shift times were asked to participate as long as it did not interfere with the logistical operation of the Chiropractic Centre (e.g. scheduling clashes or room bookings). No incentives were given for participation and participation was only at the agreement of the participant.


Two chiropractors, each with over 10 years of clinical experience, were the examiners in this study. Both chiropractors were involved in teaching within the technique program at the NZCC and regularly mentored interns as supervising clinicians in the Chiropractic Centre. Frequent consensus training sessions were held over a 3–month period prior to data collection in order to ensure the multidimensional spinal assessments were performed consistently.

      Measurement/rating Process

When a patient over the age of 18 presented to the Chiropractic Centre during a data collection session, a research assistant (RA) assessed whether their participation in the trial would interfere with the logistical flow of operations in the Chiropractic Centre. If not, the RA explained the study to the patient and asked them if they consented to participate. If they agreed to participate they were escorted to an assessment room by the RA. Their age and gender were recorded along with the date of their most recent chiropractic care session and whether they were currently experiencing any bodily pain or not. If they were experiencing symptoms they were asked to describe the location and severity of any symptoms using a numeric pain rating scale ranging from 0 to 10, with 0 described as no pain and 10 being the worst pain imaginable. They were then asked to remove their shirt or change into a gown and marks were placed on their spine over the inferior tip of the C7 and T12 spinous processes while the patient was seated.

Table 1 A

Table 1 B

Table 1 C

The first examiner then entered the room, accompanied by an RA, and performed a multidimensional battery of vertebral subluxation assessment tests (Table 1). The battery of tests included motion palpation, leg length checks, soft tissue palpation, and joint play/end feel assessment. These assessments are all part of the routine spinal assessment package taught at the NZCC. When the examination was complete, the segment in each area of the spine that the examiner believed to have the most positive vertebral subluxation test indicators was identified and the RA measured the distance from the applicable skin marking (C7 for cervical and thoracic regions, and T12 for the lumbar region) to the position indicated by the examiner. All measurements were performed in the same seated position that was used for placing the marks on the skin. If the examiners found 2 levels in a region that had an equal amount of motion, soft tissue and joint play findings (with no additional finding to help make a decision, e.g. leg length inequality, Derifield tests or Cervical Syndrome), the lowest level was recorded as the level of vertebral subluxation. The examiner was also asked to indicate whether they were confident or not confident about their findings. The second examiner then entered the room within 5 minutes with a second RA, and while remaining blind to the findings of the first examiner, repeated the assessment. The examiners were not provided with any clinical information about the participants; they alternated their order in assessing the participants, and they did not converse with the participants.

The paired findings for examiner differences on vertebral subluxation were assessed for normality to determine the appropriate statistical function(s) to be used to assess interexaminer reliability. Following this, interexaminer reliability in this study was determined by calculating Median Absolute Examiner Differences (MedAED). Data dispersion was determined by calculating the Median Absolute Deviation (MAD). [22] Since standard deviation cannot be calculated in the usual manner when working with absolute values, data dispersion was characterized by MAD, the median of the absolute deviations of examiner differences from the median of such differences. MAD is calculated as the median of the absolute value of each value, x_i, minus the median: MAD = median(|x_i – median(x)|). In addition to being provided in “cm” units, MedAED and MAD were transformed into and presented as vertebral equivalents (VEs), where VE is defined as the height of a typical vertebra. Since the height of a typical vertebra varies according to the spinal region, examiner differences reported in cm would misleadingly imply different degrees of examiner reliability depending on the spinal region. For example, a MedAED of 4 cm constitutes a median difference of 1 vertebral body height in the lumbar spine, but in the cervical spine, where the vertebrae are shorter, would constitute median examiner differences of over 2 vertebral body heights. Reporting the data as VEs allows immediate comparisons of examiner reliability, irrespective of spinal region. To convert cm to VE, the following heuristic weighting factors were used: 2.3cm for a typical thoracic segment, [23] 1.8cm for a typical cervical segment, [24] and 4cm for a typical lumbar segment. [25] Calculations were also performed to determine the degree of examiner agreement on the level of vertebral subluxation with the following possible defined outcomes for identifying the most subluxated vertebra or the motion segment including it:

MedAED ≤1.0VE:   definite agreement
MedAED >1.0VE and ≤2.0VE:   indeterminate agreement
MedAED ≤1.5 VE:   acceptable reliability
MedAED >2.0 VE:   definite disagreement
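
The calculation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' analysis code: the function names and the paired distances are hypothetical, and only the segment heights (1.8, 2.3, and 4 cm) come from the text.

```python
from statistics import median

# Heuristic segment heights in cm, as given in the text.
VE_HEIGHT_CM = {"cervical": 1.8, "thoracic": 2.3, "lumbar": 4.0}


def absolute_differences(examiner_a, examiner_b):
    """Absolute difference between paired distance measurements (cm)."""
    return [abs(a - b) for a, b in zip(examiner_a, examiner_b)]


def med_aed(abs_diffs):
    """Median Absolute Examiner Difference."""
    return median(abs_diffs)


def mad(abs_diffs):
    """Median Absolute Deviation: median(|x_i - median(x)|)."""
    centre = median(abs_diffs)
    return median(abs(x - centre) for x in abs_diffs)


def cm_to_ve(value_cm, region):
    """Convert a distance in cm to vertebral equivalents for a region."""
    return value_cm / VE_HEIGHT_CM[region]


# Hypothetical distances (cm) from the T12 mark for five participants.
examiner_a = [4.0, 8.5, 2.0, 6.0, 10.0]
examiner_b = [5.0, 8.0, 6.0, 6.5, 9.0]

diffs = absolute_differences(examiner_a, examiner_b)
print("MedAED (cm):", med_aed(diffs))
print("MAD (cm):", mad(diffs))
print("MedAED (VE, lumbar):", cm_to_ve(med_aed(diffs), "lumbar"))

# The worked example from the text: the same 4 cm difference is 1 VE in the
# lumbar spine but more than 2 VE in the cervical spine.
print(cm_to_ve(4.0, "lumbar"), cm_to_ve(4.0, "cervical"))
```

Note that a single large disagreement (the 4 cm difference in the third pair) does not move the median, which is the robustness property the text attributes to MedAED and MAD.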

Figure 1

Figure 1 illustrates why MedAED values were interpreted in this way. If MedAED is less than 1 VE, it may be stated there was definite agreement on the vertebral subluxation or at least the motion segment containing it. A spinal motion segment is the smallest functional spinal unit, comprising 2 adjacent vertebrae and their accompanying ligaments and intervertebral disc. [26] At the other extreme, if MedAED is greater than 2 VE’s, there was definite disagreement as the examiners could not have agreed on the same vertebra, let alone the motion segment including it. In the range where MedAED is between 1 and 2 VE’s, there was indeterminate agreement, depending on whether an examiner happened to identify the vertebral subluxation near the center of a spinal segment, or rather identified the vertebral subluxation close to the top or bottom of a spinal segment (Figure 1). We thought it reasonable to identify the midpoint of this range, MedAED ≤1.5 VE, as the boundary of acceptable interexaminer reliability, wherein with great likelihood the examiners at least agreed on the motion segment including the vertebral subluxation. It would be very difficult, if not impossible, to reduce the size of the indeterminate zone. Doing so would require untenable assumptions as to exactly where the spinal locations the examiners judged most subluxated were situated in relation to the actual center of the vertebrae.
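
The cut points above can be applied to individual examiner differences to obtain the percentage of assessments in each zone. The sketch below is a hypothetical illustration: the function names and the example differences are assumptions, and only the 1.0, 1.5, and 2.0 VE thresholds come from the text.

```python
from collections import Counter

LABELS = ("definite agreement", "indeterminate agreement", "definite disagreement")


def classify_difference(diff_ve):
    """Map one absolute examiner difference (in VE) to an agreement zone."""
    if diff_ve <= 1.0:
        return "definite agreement"
    if diff_ve <= 2.0:
        return "indeterminate agreement"
    return "definite disagreement"


def is_acceptable(diff_ve):
    """Midpoint of the indeterminate zone, used as the acceptability boundary."""
    return diff_ve <= 1.5


def summarise(diffs_ve):
    """Percentage of differences in each zone, plus the acceptable share."""
    n = len(diffs_ve)
    counts = Counter(classify_difference(d) for d in diffs_ve)
    shares = {label: 100 * counts[label] / n for label in LABELS}
    shares["acceptable"] = 100 * sum(is_acceptable(d) for d in diffs_ve) / n
    return shares


# Hypothetical absolute examiner differences, already converted to VE.
example = [0.2, 0.9, 1.0, 1.4, 1.8, 2.5, 0.5, 3.0, 1.2, 0.7]
summary = summarise(example)
for label, share in summary.items():
    print(f"{label}: {share:.1f}%")
```

The three zones are mutually exclusive and sum to 100%, whereas the acceptable share overlaps them: it contains all of the definite agreement zone and the lower half of the indeterminate zone.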


Table 2

Figure 2

Data collection took place across 21 study sessions between October 2014 and March 2015. Seventy patients were assessed during the trial with between 1 and 6 patients being assessed during each data collection session. All patients who were asked to participate agreed to do so. Fifty-one percent of participants reported the presence of bodily pain with the mean pain level being 4.3/10 (SD = 2.1) if present. A summary of patient characteristics is provided in Table 2.

Figure 2 shows the distributions of the examiners’ determination of the most subluxated segments for each spinal region (calculated using the distance in VE’s from the standardized measurement point). The most common levels identified in each region were L2, T7, and C3.

Shapiro-Wilk testing for the cervical, thoracic, and lumbar spines demonstrated that in none of these regions were the paired examiners’ ratings for vertebral subluxation normally distributed. This result technically precluded analysis using Intraclass Correlation (ICC) for parametric data, as well as calculating Bland-Altman Limits of Agreement. Given how commonly ICC is used to estimate reliability for continuous data, we thought it reasonable, despite this limitation, to provide their values: ICC values for interexaminer agreement were 0.55 in the cervical region, 0.57 in the thoracic region, and 0.61 in the lumbar region. Although these values are considered to be fair to good, [27] they should be interpreted with caution due to the non-normal distribution of examiner differences. Furthermore, there is another reason to be cautious in interpreting these values. ICC values are misleadingly depressed when subject variability is relatively low; i.e., the subjects are relatively homogeneous. [28] This is because ICC is a ratio of the between-subjects variance to the total variance (the sum of within- and between-subject variance). When between-subjects variance is small, as in the present study, where the most subluxated spinal locations were not randomly distributed in the thoracic and lumbar spines, ICC values can be surprisingly low even when the examiners tend to agree.
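
The sensitivity of ICC to subject homogeneity can be illustrated with a small sketch. It assumes a one-way random-effects ICC, often written ICC(1,1); the paper does not state which ICC form was used, and the data and function names below are hypothetical. Both datasets carry exactly the same examiner differences, and only the spread of the subjects' levels changes.

```python
def icc_oneway(ratings):
    """One-way random-effects ICC for a list of per-subject rating tuples."""
    n = len(ratings)        # number of subjects
    k = len(ratings[0])     # number of examiners per subject
    subject_means = [sum(r) / k for r in ratings]
    grand_mean = sum(subject_means) / n
    ms_between = k * sum((m - grand_mean) ** 2 for m in subject_means) / (n - 1)
    ms_within = sum(
        (x - m) ** 2 for r, m in zip(ratings, subject_means) for x in r
    ) / (n * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)


# Identical examiner disagreements (in VE) are applied to both datasets.
OFFSETS = [0.5, -0.5, 1.0, -1.0, 0.5, -0.5, 1.0, -1.0, 0.5, -0.5]


def paired(levels):
    """Examiner A reports the level; examiner B differs by a fixed offset."""
    return [(level, level + d) for level, d in zip(levels, OFFSETS)]


# Heterogeneous subjects: findings spread across ten levels.
wide = paired([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
# Homogeneous subjects: findings clustered around three adjacent levels.
narrow = paired([4, 5, 6, 5, 4, 6, 5, 5, 6, 4])

icc_wide = icc_oneway(wide)
icc_narrow = icc_oneway(narrow)
print(f"ICC, heterogeneous subjects: {icc_wide:.2f}")
print(f"ICC, homogeneous subjects:   {icc_narrow:.2f}")
```

Because the absolute examiner differences are identical in the two datasets, their MedAED is identical too, yet the ICC is lower for the clustered subjects. This is the behavior the paragraph above cautions about.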

Table 3

Figure 3

Table 4

Although data were collected for the examiners’ confidence levels, there were too few examiner ratings (14%) where 1 or both examiners lacked confidence to perform meaningful analysis.

MedAED and MAD values, in both cm and VE’s, are provided (Table 3). Given the differing vertical dimensions of typical vertebrae in the 3 spinal regions, the authors believe it more meaningful to compare the results for the different regions as expressed in VE units. Median examiner differences for vertebral subluxation assessment were smallest in the lumbar region (0.5VE), and equal in the thoracic and cervical regions (1.0VE). For the combined data, including all 3 spinal regions, MedAED was 0.6VE. MAD values, which represent data dispersion, ranged from a low of 0.3VE in the lumbar spine, to 0.8VE in the thoracic spine. Figure 3 summarizes the results of examiner agreement using a box-and-whisker plot, identifying 12 outliers, defined as examiner differences lying outside the box by more than 1.5 times the interquartile range. Subgroup analyses did not indicate a significant effect of patient symptomatology on reliability results.
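
The outlier rule used for the box-and-whisker plot can be sketched as follows. This is a minimal illustration with hypothetical data; quartile conventions differ between software packages, and the "inclusive" method chosen here is an assumption rather than the method used in the study.

```python
from statistics import quantiles


def iqr_outliers(values, whisker=1.5):
    """Return values lying more than `whisker` x IQR beyond the quartiles."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - whisker * iqr
    high_fence = q3 + whisker * iqr
    return [v for v in values if v < low_fence or v > high_fence]


# Hypothetical absolute examiner differences in VE.
diffs_ve = [0.1, 0.3, 0.4, 0.5, 0.6, 0.8, 1.0, 1.2, 1.5, 5.0]
outliers = iqr_outliers(diffs_ve)
print("Outliers:", outliers)
```

Since absolute differences cannot be negative, in practice only the upper fence flags outliers in data of this kind.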

In the regional analyses, examiner agreement on the vertebral subluxation, or the motion segment including it, ranged from a high of 90% in the lumbar spine to a low of 40% in the thoracic spine (Table 4). Definite disagreement on the vertebral subluxation, or the motion segment including it, ranged from a high of 40.0% in the thoracic spine to a low of 0% in the lumbar spine. Acceptable agreement varied from a high of 97.1% in the lumbar spine to a low of 47.1% in the thoracic spine. In the combined dataset, definite agreement was 63.3%, definite disagreement was 17.6%, and acceptable agreement was 73.3%.


The results of this study suggest that a multidimensional approach to vertebral subluxation assessment was reliable between examiners for detecting the level of vertebral subluxation in all regions of the spine. In at least 63% of assessments the examiners agreed on the same motion segment across all regions of the spine. It has been hypothesized that reliability would increase when the examiners were confident in their findings, as had been the case in prior studies by Cooperstein et al in the thoracic [19], cervical [20], and lumbar [17] regions. In the present study there were too few observations in which 1 or both examiners lacked confidence to test this hypothesis.

When a clinical variable can be measured using either discrete or continuous analysis, there are good reasons to expect to find greater reliability when using continuous data. Markon, Chmielewski [29] performed 2 meta-analyses including 58 studies in which both continuous and categorical measures were used to assess psychopathology. They found a 15% increase in reliability and 37% increase in validity using continuous measures, allowing a 50% reduction in sample size for any given power analysis of subject requirements. Baca-Garcia, Perez-Rodriguez [30] suggested that low reliability in assessing patients may result from the use of discrete diagnostic criteria that fail to recognize continuous variation in patients’ presentations.

In assessing vertebral subluxation, the most obvious explanation why strict segmental assessment is less likely to detect agreement than using a most subluxated site paradigm is that when the finding actually lies on a continuum, and is then artificially discretized, information is lost. If vertebral subluxation were understood to involve at least one motion segment, rating individual segments as subluxated or not will fail to identify larger fields of vertebral subluxation and thus miss the overlap of those fields among examiners’ assessments.

Due to the non-parametric nature of the data, rendering conventional ICC analysis suspect, the authors strongly emphasized understanding interexaminer reliability as the median of absolute examiner differences, MedAED, which provides a measure of the typical difference between examiners. In fact, it is especially useful when examiner differences are not normally distributed. [31] MAD is a robust measure of dispersion (functionally similar to standard deviation) that is resilient to outliers and is suitable for datasets that are not normally distributed. Data points at the very extremes of the distribution of examiner differences do not impact the calculation of MedAED any more than less extreme values.

Figure 2 suggests that the examiners found the subjects relatively homogeneous in their most subluxated level in the thoracic and lumbar regions, but in the cervical region the examiners’ findings for the most subluxated level were relatively dispersed throughout the range of the cervical spine.

Most likely the higher agreement (90.0%) seen in the lumbar spine compared to the thoracic spine (40.0%) reflects the fact that there are only 5 lumbar segments vs. 12 thoracic segments to choose among. Had the thoracic spine been divided into upper and lower divisions, in all likelihood, examiner agreement would have increased. Even granting the limitation that the thoracic spine was not subdivided into sections comparable in numbers of segments to the cervical and lumbar spines, acceptable examiner agreement was 73.3% in the combined dataset.

A box-and-whisker plot is provided to summarize the results in the combined dataset (Figure 3). Analysis of the plot leads to the conclusion that 154/210 (73%) of examiner differences were ≤1.5VE, which the authors deem the boundary of acceptable reliability, and 12/210 (6%) of examiner differences were ≥1.5 times the interquartile range, extreme data points generally considered outliers. [21]

Previous research has suggested that the interexaminer reliability of clusters of tests to identify vertebral subluxations in the spine is questionable, with reviews of the literature concluding that evidence for examination montages is either unclear or that no good quality studies exist that show a testing regimen is reliable. [7, 12] Two studies have suggested that by clustering the results of a number of tests for sacroiliac joint dysfunction substantial interexaminer reliability can be demonstrated. [32, 33] However, previous studies that have investigated multidimensional assessment methods across multiple spinal levels, that also met quality standards, [12] showed marginal interexaminer reliability. [15, 16, 34] French, Green [16] used a multidimensional spinal diagnostic method commonly used by chiropractors to assess interexaminer reliability in the lower thoracic spine, lumbar spine, and sacrum, and found fair agreement (kappa (κ) = 0.27) when averaged across all spinal joints tested. Hawk, Phongphua [34] also used a combination of commonly used chiropractic assessment procedures in their interexaminer reliability study of the lumbar spine and reported levels of agreement that averaged less than chance (κ = –0.08), with the maximum level of agreement across 42 comparisons barely reaching acceptable levels (κ = 0.44). Keating, Bergmann [15] also investigated interexaminer reliability of the lumbar spine using a multidimensional approach that included the 4 strongest tests from an 8 test regimen. They reported slightly better results with ICC’s across the levels ranging from 0.34 to 0.62 with an average of 0.46. The average ICC for the multidimensional vertebral subluxation assessment across the spinal regions for the present study was 0.58 which exceeds the values from these previous studies, but must be interpreted with caution due to violation of normality assumptions.

Interestingly, a spinoff study from the present study assessed examiner agreement purely based on the motion palpation assessment included in the present study. [35] In this motion palpation study, the MedAED for examiner agreement in the combined dataset was 1.1 VE, almost twice as large as the MedAED examiner agreement in the combined dataset for the present study, which was 0.6VE. This suggests that using a multidimensional approach to assessing vertebral subluxations is more reliable than using motion palpation alone. Of more clinical relevance is the finding that the examiners in the present study agreed on the same motion segment 73.3% of the time across all spinal regions. This suggests that it is possible to create a multidimensional assessment of vertebral subluxation that is reliable. Interestingly enough, the assessment that was used in the present study did not include pain provocation at segmental levels, which currently has the most convincing favorable evidence for interexaminer reliability. [7] This suggests that the reliability of multidimensional testing approaches may exceed that reported in this trial if more reliable component parts were to be included in the multidimensional approach.

Limitations of the study

As full-time chiropractic educators it could be argued that the examiners in the study were not representative of chiropractors in the field, though both examiners were still active in part-time private practice. Although examiners were blinded to any other prior findings during the study it is possible that they were familiar with some patients’ prior findings or clinical or non-clinical cues based on previous visits to the Chiropractic Centre that they may have supervised; this is also a limitation. It has been hypothesized that reliability would increase when the examiners were confident in their findings, as had been the case in prior studies by Cooperstein et al in the thoracic [19], cervical [20], and lumbar [17] regions. There were too few observations in which 1 or both examiners lacked confidence to test this hypothesis. Data violated normality assumptions which meant neither the more traditional ICC analysis nor Bland-Altman Limits of Agreement could be used in this study. One of the strengths of MedAED for analyzing reliability data such as these is that it is not influenced by the variability amongst possible responses.

We chose the relatively stringent criterion for “acceptable” agreement that the median examiner difference was ≤1.5VE, corresponding to apparent agreement on the motion segment including the vertebral subluxation. It is entirely possible the clinical field of impact for vertebral subluxation includes not only the motion segment including it, but the motion segments adjacent to it, presumably to a lesser extent. Pursuing that logic, clinically relevant examiner agreement in this study may have corresponded to a higher median examiner difference. For example, at the ≤3VE cut point, agreement in this study occurred 90% of the time in the combined data.

This study showed high levels of interexaminer reliability for a multidimensional battery of tests for detecting vertebral subluxations, but it did not address the validity of these tests. A reliable test cannot be assumed to be useful for clinical decision-making if it has not been shown to be valid. Further research is required to assess the validity of the tests that were used in this study. The results of this study indicate that chiropractors can agree on the vertebral level to be adjusted which is important when it comes to clinical practice and teaching examination techniques to chiropractic students. Future research is required to determine whether the findings of the multidimensional battery of tests change after an adjustment is provided at that level, and whether patient clinical outcomes are influenced by adjusting at the spinal level with the most positive test findings as opposed to some other means for determining the preferred adjustment site.


In this study, high levels of interexaminer reliability were observed in each region of the spine when a multidimensional approach to detect vertebral subluxations was used. Since the combined MedAED for vertebral subluxations was 0.6VE, it can be stated with confidence that examiners usually agreed on at least the motion segment containing the most positive vertebral subluxation test indicators, and very frequently on the same segment. Vertebral subluxation assessment, when analyzed using continuous data, indicates much higher levels of agreement than has been heretofore associated with assessing agreement using discrete data and the Kappa statistic.


  1. World Health Organization (WHO)
    WHO Guidelines on Basic Training and Safety in Chiropractic
    Geneva, Switzerland: (November 2005)

  2. Association of Chiropractic Colleges.
    The Association of Chiropractic Colleges Position Paper # 1 July 1996.
    ICA Rev 1996;November/December.

  3. Chiropractic WFC. Definitions of Chiropractic 2015
    Available from:

  4. Gatterman ML.
    Foundations of chiropractic: subluxation. 1st ed.
    St Louis: Mosby-Year Book, Inc; 1995.

  5. Ebrall P.
    Subluxation, what's in a name.
    Chiropr J Aust 2011;41(3):110-2.

  6. Nelson C.
    The subluxation question.
    J Chiropr Humanit 1997;7(1):46-55.

  7. Triano J, Budgell B, Bagnulo A, Roffey B, Bergmann T, Cooperstein R.
    Review of Methods Used by Chiropractors to Determine
    the Site for Applying Manipulation

    Chiropractic & Manual Therapies 2013 (Oct 21); 21 (1): 36

  8. Ebrall P, Draper B, Repka A.
    Towards a 21 century paradigm of chiropractic: stage 1, redesigning clinical learning.
    J Chiropr Educ 2008;22(2):152-60.

  9. Definition and Position Statement on the Chiropractic Subluxation [press release].
    [Online] Available at:
    The Rubicon Group, 22/5/2017 2017.

  10. Gliedt JA, Hawk C, Anderson M, Ahmad K, Bunn D, Cambron J, et al.
    Chiropractic Identity, Role and Future:
    A Survey of North American Chiropractic Students

    Chiropractic & Manual Therapies 2015 (Feb 2); 23 (1): 4

  11. Walker BF, Buchbinder R.
    Most Commonly Used Methods of Detecting Spinal Subluxation and the Preferred Term
    for its Description: A Survey of Chiropractors in Victoria, Australia

    J Manipulative Physiol Ther. 1997 (Nov); 20 (9): 583–589

  12. Gemmell H, Miller P.
    Interexaminer reliability of multidimensional examination regimens used for detecting spinal manipulable lesions:
    A systematic review.
    Clin Chiropr 2005;8:199-204.

  13. Bergmann TF, Finer BA.
    Joint Assessment – P.A.R.T.S.
    Topics in Clinical Chiropractic 2000; 7 (3): 1–10

  14. Walker BF.
    Most common methods used in combination to detect spinal subluxation:
    A survey of chiropractors in Victoria.
    Australas Chiropr Osteop 1998;7(3):109-11.

  15. Keating JC, Jr., Bergmann TF, Jacobs GE, Finer BA, Larson K.
    Interexaminer reliability of eight evaluative dimensions of lumbar segmental abnormality.
    J Manipulative Physiol Ther 1990;13(8):463-70.

  16. French SD, Green S, Forbes A.
    Reliability of chiropractic methods commonly used to detect manipulable lesions in patients with chronic low-back pain.
    J Manipulative Physiol Ther 2000;23(4):231-8.

  17. Cooperstein R, Young M.
    The Reliability of Lumbar Motion Palpation Using Continuous Analysis and Confidence Ratings:
    Choosing a Relevant Index of Agreement

    J Can Chiropr Assoc. 2016 (Jun);   60 (2):   146–157

  18. Cooperstein R, Young M.
    The Reliability of Spinal Motion Palpation Determination of
    the Location of the Stiffest Spinal Site is Influenced by
    Confidence Ratings: A Secondary Analysis of Three Studies

    Chiropractic & Manual Therapies 2016 (Dec 20); 24: 50

  19. Cooperstein R, Haneline M, Young M (2010)
    Interexaminer Reliability of Thoracic Motion Palpation Using
    Confidence Ratings and Continuous Analysis

    J Chiropractic Medicine 2010 (Sep);   9 (3):   99–106

  20. Cooperstein R, Young M, Haneline M (2013)
    Interexaminer Reliability of Cervical Motion Palpation Using Continuous Measures
    and Rater Confidence Levels

    J Can Chiropr Assoc. 2013 (Jun);   57 (2):   156–164

  21. Box-and-Whisker Plots [Available from:

  22. Leys C, Ley C, Klein O, Bernard P, Licata L.
    Detecting outliers: Do not use standard deviation around the mean, use absolute deviation around the median.
    J Exp Soc Psychol 2013;49(4):764-6.

  23. Gray H.
    Anatomy of the human body 1918 [Available from:

  24. Gilad I, Nissan M.
    Sagittal evaluation of elemental geometrical dimensions of human vertebrae.
    J Anat 1985;143:115-20.

  25. Terazawa K, Akabane H, Gotouda H, Mizukami K, Nagao M, Takatori T.
    Estimating stature from the length of the lumbar part of the spine in Japanese.
    Medicine, science, and the law 1990;30(4):354-7.

  26. White AA, Panjabi MM.
    Clinical biomechanics of the spine.
    Philadelphia: Lippincott; 1990.

  27. Cicchetti DV.
    Guidelines, criteria, and rules of thumb for evaluating normed and standardized
    assessment instruments in psychology.
    Psychological Assessment 1994;6(4):284-90.

  28. Lee KM, Lee J, Chung CY, Ahn S, Sung KH, Kim TW, et al.
    Pitfalls and important issues in testing reliability using intraclass correlation coefficients
    in orthopaedic research.
    Clinics Orthop Surg 2012;4(2):149-55.

  29. Markon KE, Chmielewski M, Miller CJ.
    The reliability and validity of discrete and continuous measures of psychopathology:
    a quantitative review.
    Psychological Bull 2011;137(5):856-79.

  30. Baca-Garcia E, Perez-Rodriguez MM, Basurte-Villamor I, Fernandez del Moral AL, Jimenez-Arriero MA.
    Diagnostic stability of psychiatric disorders in clinical practice.
    Br J Psychiatry 2007;190:210-6.

  31. Rouse MW, Borsting E, Deland PN.
    Reliability of binocular vision measurements used in the classification of convergence insufficiency.
    Optom Vis Sci 2002;79(4):254-64.

  32. Cibulka MT, Delitto A, Koldehoff RM.
    Changes in innominate tilt after manipulation of the sacroiliac joint in patients with low back pain.
    An experimental study.
    Phys Ther 1988;68(9):1359-63.

  33. Kokmeyer DJ, Van der Wurff P, Aufdemkampe G, Fickenscher TC.
    The reliability of multitest regimens with sacroiliac pain provocation tests.
    J Manipulative Physiol Ther 2002;25(1):42-8.

  34. Hawk C, Phongphua C, Bleecker J, Swank L, Lopez D, Rubley T.
    Preliminary study of the reliability of assessment procedures for the indications for
    chiropractic adjustments of the lumbar spine.
    J Manipulative Physiol Ther 1999;22(6):382-9.

  35. Holt, K., Russell, D., Young, M., Sherson, M., Haavik, H., 2018b.
    Interexaminer Reliability of Seated Motion Palpation for the Stiffest Spinal Site
    J Manipulative Physiol Ther. 2018 (Sep); 41 (7): 571–579


