BACKGROUND
Shortly after the introduction of the Model SHLCP-5
Precision Adjustor, reports were received from clinicians that the sound of the
impulse changed during use. Some clinicians began using the change in sound as
an indicator that the adjustment phase should be terminated.4 Another group of
clinicians5 began to use the instrument as a percussor to help locate points on
the spine as candidates for adjustment.
The use of percussive techniques
as an aid to diagnosis is well known in the medical arts. The image of the
physician tapping our body with a small hammer is a vivid childhood memory for
most.
As commonly practiced, this technique is subject to wide
variability in both the application of the percussive force (was the last
impulse the same as the current impulse?) and the interpretation of the results
(you can hear the difference, can't you?). This variability and the total
subjectivity of the interpretation of the results require a certain minimum
training and individual aptitude to make the technique useful and the results
transferable across patients and examiners.
The development of the Force
Recording and Analysis System presents an opportunity to improve the percussive
technique since:
The patient's body is represented by the spring with spring constant K and
the damper C. In the equation of motion for this system, the mass used would
include the mass of the anvil as well as a virtual mass of tissue. Assuming that
the damping and virtual mass are essentially constant, the frequency response of
the system and, therefore, the shape of the impulse would be determined
primarily by the spring constant or stiffness of the substrate.
If this
simple analysis is fundamentally sound, the peak force of each impulse would be
expected to vary with the stiffness (or inversely with the compliance) of the
substrate and would be expected to be constant at a point on a
substrate.
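The dependence of peak force on stiffness suggested by this model can be illustrated numerically. The sketch below is not taken from the original analysis. It assumes a lumped mass (anvil plus virtual tissue mass) striking a spring K and damper C with a given initial velocity, and every numerical value in it (mass, damping, impact velocity, stiffness) is hypothetical. For light damping, energy conservation gives a peak force of roughly v0*sqrt(K*m), so a stiffer substrate should produce a higher peak.

```python
def peak_force(stiffness, mass=0.05, damping=2.0, impact_velocity=1.0,
               dt=1e-6, duration=0.02):
    """Peak force (N) on a spring-damper substrate struck by a lumped mass.

    Integrates m*x'' + C*x' + K*x = 0 with x(0) = 0 and x'(0) = impact_velocity
    using semi-implicit Euler steps. All default values are hypothetical.
    """
    x, v = 0.0, impact_velocity
    peak = 0.0
    steps = round(duration / dt)
    for _ in range(steps):
        force = stiffness * x + damping * v  # force carried by spring and damper
        peak = max(peak, force)              # only the compressive peak is kept
        a = -force / mass                    # the same force decelerates the mass
        v += a * dt
        x += v * dt
    return peak


soft = peak_force(stiffness=1.0e4)   # more compliant substrate (assumed value)
stiff = peak_force(stiffness=4.0e4)  # less compliant substrate (assumed value)
print(f"compliant substrate: {soft:.1f} N, stiff substrate: {stiff:.1f} N")
```

With the mass and damping held constant, quadrupling the stiffness roughly doubles the simulated peak force, which is the qualitative behavior the analysis predicts.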
In addition, the mean force varies in a predictable and expected manner when
the impulse is applied to surfaces of differing compliance; that is,
low-compliance substrates (high resistance) result in higher peak forces, while
high-compliance substrates (low resistance) result in lower peak forces.
Furthermore, the variability of the data indicates that differences
in peak force of greater than ten percent (fifteen pounds at full scale) have a
ninety-five percent chance of representing valid differences in the underlying
substrate.
CONCLUSIONS
The results of the testing and
analysis indicate that fundamental engineering analysis methodologies may be
productively applied to the problem of quantifying the analysis of the
compliance of the human body. In particular, single point vibration analysis
techniques utilizing impulse loading are useful in explaining and analyzing the
force output of the adjusting head of the Sense Technology Force Recording and
Analysis System. The peak force has been shown to be related to the compliance
of the substrate to which the adjusting head is applied. The peak force has been
shown to be essentially constant for a single substrate. Variations of greater
than ten percent in the peak force indicate a high probability that the output
was derived from two different substrates.
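The ten percent criterion can be written as a simple comparison. This is a minimal sketch only: the text does not say which reading the percentage is taken against, so the code assumes it is measured relative to the larger of the two peak forces, and the function name is hypothetical.

```python
def likely_different_substrates(peak_a, peak_b, threshold=0.10):
    """Return True when two peak forces differ by more than `threshold`.

    The difference is taken relative to the larger reading, which is an
    assumed convention rather than one stated in the text.
    """
    larger = max(peak_a, peak_b)
    if larger <= 0:
        raise ValueError("peak forces must be positive")
    return abs(peak_a - peak_b) / larger > threshold


print(likely_different_substrates(100.0, 120.0))  # difference well above 10%
print(likely_different_substrates(100.0, 105.0))  # difference below 10%
```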
APPENDIX II
Intra- and Inter-Examiner Reliability Studies
The compliance of the human spine may be thought of as the ease of movement
of each individual vertebra. For the purposes of this paper, compliance is
defined as the displacement response of a structure when subjected to a unit
force. It is the inverse of stiffness and intuitively can be thought of as the
flexibility of a structure. This paper describes reliability studies on an
instrument which measures the compliance of the human spine, before, during and
after adjustment.
Sense Technology, Inc. has developed a unique
chiropractic adjusting system which incorporates a percussive adjusting head.
This percussor is instrumented with a force transducer that supplies data to a
computer system. The computer stores and displays the force data for clinical
evaluation. This system, referred to as the Force Recording and Analysis System
(FRAS), is an extension of previous adjustors which are marketed without the
force instrumentation.9 The clinician uses the Force Recording and Analysis
System to challenge each vertebra with a low energy impulse. The system records
and displays the peak force measured at each vertebra. The compliance of the
vertebra is inversely proportional to the peak force. The results of the
compliance assessment are used by the clinician, in conjunction with other
diagnostic techniques (such as palpation, X-ray examination, thermal analysis,
and analysis of patient complaint and history) to determine appropriate
adjustment locations.
BACKGROUND
Sigler and Howe [20] and
Jackson et al. [10] have pointed out that when measurement systems are used as
the basis for diagnostic techniques, the reliability of the measurement system
directly influences the reliability of the diagnostic system. Sigler and Howe
found that the errors involved in measuring very small changes in atlas position
on X-ray films were of such magnitude that the validity of the entire diagnostic
and therapeutic regime was open to question. Jackson et al. found, using a
somewhat different measurement system, that changes in angles between vertebrae
could be reliably obtained from X-ray films. A series of studies examining the
reliability of palpation and motion palpation for the detection of "somatic
dysfunction" has produced mixed results [4,14,15,17,25,27]. Where
statistical significance has been achieved, the level or strength of the
findings has been relatively low. DeBoer et al. [4] point out that such
findings are not surprising, since even well-established procedures such as
reading X-rays, EKGs, or blood pressure give maximum intra-examiner
correlations in the range of .40 to .60. Others [19,21-24] have attempted to
assess the reliability of simple instrumentation (inclinometers, temperature
measurement instruments and penetration devices) as aids to diagnosis, again
with mixed results. Several authors [7,8,16] have critiqued the methodology used
in intra- and inter-examiner reliability studies. Their critiques may be useful
as general guides and to raise questions regarding statistical
methodology.
In previous work [6] we documented the reliability and
applicability of force measurement techniques for measuring differences in
compliance of various substrates, including the human body. That study concluded
that the peak force of the impulse of the adjusting head varies directly with
the stiffness of the substrate (inversely with the compliance). The measures
obtained showed that the system produced a repeatable result and that
differences of greater than ten percent in peak force were
significant.
Clinicians are currently using the Model SHLCP-5 Precision
Adjustor to percussively test the spine both before and after adjustment [3].
The force levels used are generally higher (twenty-five pounds) than used for
the initial assessment of the Force Recording and Analysis System (fifteen
pounds). The proposal to apply these techniques to the analysis of compliance
along the human spine raises some fundamental questions. This study examined the
following:
TRIAL TWO
Determine the reproducibility of repeated compliance readings taken by the same
examiner.
Method: Using the 30 mm dual prong attachment to straddle the spinous of the
patient's cervical vertebrae, the examining chiropractor stabilized the head and
neck of the seated patient in flexion. The examining chiropractor chose the line
of drive and positioned the tips of the instrument on the occiput and vertebrae
C1-7 and T1-3. Twenty patients participated in the study. The examiner obtained
a second set of readings on each patient immediately after the first set. There
were no markings on the patient to serve as position references.
Research Hypothesis: The patterns of compliance measurements obtained using the
Force Recording and Analysis System in consecutive measurements of the human
cervical spine by the same examiner show no significant differences from one
trial to the next.
Statistical Analysis: Each set of consecutive readings was examined for
similarity using the following statistical techniques. First, the probability
that the two sets of data were different was computed using a χ² statistic. This
method was chosen since an estimate of the measurement error at each force level
was available [6]. The χ² statistic was computed using the formula:
$$\chi^2 = \sum_{i} \frac{(x_i - y_i)^2}{\sigma_{x_i}^2 + \sigma_{y_i}^2}$$
where: x_i = the first set of data collected from the patient
y_i = the second set of data
σ_xi = the standard deviation at the force level x_i
σ_yi = the standard deviation at the force level y_i
σ_xi and σ_yi are calculated using the empirical relationship developed in [6],
whose two constants are .14 and .035.
The probability of obtaining a χ² value at least as large as that observed,
assuming the measurements were drawn from the same distribution, was computed
from an incomplete gamma function.10 If this probability is smaller than .05,
then the hypothesis that the data sets are the same may be rejected with a
confidence of 95%.
In addition, the Pearson product moment correlation (r) was computed on the data
sets using the formula:
$$r = \frac{\sum_{i} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i} (x_i - \bar{x})^2 \, \sum_{i} (y_i - \bar{y})^2}}$$
where: x_i = the first set of data collected from the patient
y_i = the second set of data
\bar{x}, \bar{y} = the means of the first and second sets of data
Finally, the intra-class correlation coefficient (ICC) was computed by
performing a one way analysis of variance and formulating the ICC from:
MSW = the variance within the data
MSB = the variance between the data sets
m = the number of levels of the analysis
The chi square metric computes the square of the difference
between each paired observation (the value obtained at level C1 on the first
observation is subtracted from the value obtained at C1 on the second
observation and the difference squared) and then compares that value to an
estimate of the measurement error, determined through empirical repetitive
testing on known substrates of constant compliance [6].
If the measurements are
exactly the same, the chi square returns a value of zero and the probability of
obtaining a poorer (larger) result is necessarily equal to one. The larger the
chi square, the less likely it is that the two sets of measurements are "the
same" or that the measurement is repeatable within an acceptable degree of
error.
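The chi square comparison described above can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' program. It assumes the statistic is the sum of squared paired differences divided by the sum of the two measurement variances, that the degrees of freedom equal the number of paired readings, and that the caller supplies the standard deviations, since the empirical relationship from [6] is not reproduced here. The function names are hypothetical. The probability is the regularized upper incomplete gamma function Q(dof/2, χ²/2), evaluated with a standard series and continued fraction.

```python
import math


def chi_square(x, y, sigma_x, sigma_y):
    """Sum of squared paired differences scaled by the combined variance."""
    return sum((a - b) ** 2 / (sa ** 2 + sb ** 2)
               for a, b, sa, sb in zip(x, y, sigma_x, sigma_y))


def gammq(a, x, eps=1e-12, max_iter=1000):
    """Regularized upper incomplete gamma function Q(a, x)."""
    if a <= 0 or x < 0:
        raise ValueError("require a > 0 and x >= 0")
    if x == 0:
        return 1.0
    prefactor = math.exp(-x + a * math.log(x) - math.lgamma(a))
    if x < a + 1:
        # Series expansion for P(a, x); Q = 1 - P.
        term = total = 1.0 / a
        n = a
        for _ in range(max_iter):
            n += 1
            term *= x / n
            total += term
            if abs(term) < abs(total) * eps:
                break
        return 1.0 - total * prefactor
    # Continued fraction for Q(a, x), modified Lentz method.
    tiny = 1e-300
    b = x + 1 - a
    c = 1 / tiny
    d = 1 / b
    h = d
    for i in range(1, max_iter + 1):
        an = -i * (i - a)
        b += 2
        d = an * d + b
        d = tiny if abs(d) < tiny else d
        c = b + an / c
        c = tiny if abs(c) < tiny else c
        d = 1 / d
        delta = d * c
        h *= delta
        if abs(delta - 1) < eps:
            break
    return h * prefactor


def chi_square_probability(chi2, dof):
    """Probability of a chi square at least this large under the null."""
    return gammq(dof / 2.0, chi2 / 2.0)


# Identical readings give a chi square of zero and a probability of one.
same = [20.0, 22.0, 25.0]
unit_sigma = [1.0, 1.0, 1.0]          # hypothetical standard deviations
print(chi_square(same, same, unit_sigma, unit_sigma))
print(chi_square_probability(0.0, len(same)))
```

With eleven paired readings per patient (occiput, C1-7 and T1-3), treating the degrees of freedom as eleven appears to give a probability close to the .484 listed in TABLE 1 for a chi square of 10.52.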
The chi square is a good measure of the significance of the
difference between two sets of measures. Because its numeric values are not
subject to direct interpretation, it does not provide a measure of the strength
of the association. Some form of correlation coefficient is generally used as a
measure of the strength of a significant association. The most widely used is
the linear correlation (also referred to as the product-moment correlation or
Pearson's r). The use of this statistic has been criticized [7,8,16], primarily
because of artifacts such as obtaining a high correlation even though the two
sets of measures differ by some constant offset or factor (e.g., a correlation
equal to one would be obtained from two data sets even though each value in the
second data set was twice the value of its mate in the first data set). In our
case, where the pattern of data within the set may be more important than the
exact values, this limitation may not be important.
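The artifact described above is easy to reproduce. The sketch below computes the standard product-moment correlation for a set of hypothetical peak force readings and for the same readings doubled; the data values are invented for illustration only.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


first = [12.0, 15.0, 9.0, 20.0, 14.0]   # hypothetical readings
doubled = [2.0 * value for value in first]

# The correlation is one (up to rounding) although no pair of values agrees.
print(pearson_r(first, doubled))
```

Because r is unchanged by any positive rescaling or shift of either data set, it measures agreement in pattern rather than agreement in value.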
The intra-class
correlation coefficient (ICC) has been proposed [7,16] as a more reliable
indicator of the strength of a significant association for reliability studies
involving continuous measurements. This coefficient can be constructed from the
results of a one way analysis of variance. The formulation uses:
MSW = the variance within the data
MSB = the variance between the data sets
m = the number of levels of the analysis
If the
data sets are exactly the same then MSB equals zero and the ICC equals one. If
the variance between the data sets is greater than the variance within the data
set then the ICC will be negative (no agreement between levels). If the variance
between the data sets is less than the variance within the data then the ICC
will be positive and there is said to be some agreement between levels. The
results are summarized in TABLE 1.
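The behavior described above can be illustrated with the standard one way ICC for two repeated readings per spinal level. This is a sketch under assumptions and may differ in detail from the formulation used in this study. In the code, `ms_levels` (the spread of the level means within a data set) is taken to correspond to the variance within the data, and `ms_repeat` (the disagreement between the two readings at each level) to the variance between the data sets; that mapping, the function name, and the example values are all assumptions.

```python
def icc_one_way(first, second):
    """One way ICC for two repeated readings of the same levels."""
    n = len(first)
    k = 2  # number of repeated readings per level
    if n < 2 or len(second) != n:
        raise ValueError("need two equal-length data sets with at least two levels")
    level_means = [(a + b) / k for a, b in zip(first, second)]
    grand_mean = sum(level_means) / n
    # Mean square across levels (pattern within a data set).
    ms_levels = k * sum((m - grand_mean) ** 2 for m in level_means) / (n - 1)
    # Mean square between the repeated readings at each level.
    ms_repeat = sum((a - m) ** 2 + (b - m) ** 2
                    for a, b, m in zip(first, second, level_means)) / (n * (k - 1))
    denominator = ms_levels + (k - 1) * ms_repeat
    if denominator == 0:
        raise ValueError("ICC is undefined when all readings are identical")
    return (ms_levels - ms_repeat) / denominator


print(icc_one_way([1.0, 2.0, 3.0], [1.0, 2.0, 3.0]))  # identical data sets
print(icc_one_way([1.0, 2.0, 3.0], [3.0, 2.0, 1.0]))  # reversed pattern
```

Identical data sets make the between-readings term vanish and give an ICC of one, while readings that disagree more than the levels differ from one another give a negative value.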
TABLE 1

Examiner  Patient  χ²      p       r
1         1        10.52   .484    .98
1         2        41.23   <.0000  .86
1         3        12.01   .363    .85
1         4        11.94   .368    .80
1         5        12.62   .319    .85
1         6        9.47    .579    .80
1         7        10.89   .452    .82
1         8        12.11   .355    .74
1         9        3.73    .977    .97
1         10       15.56   .158    .82
1         11       14.82   .191    .80
1         12       11.5    .402    .84
1         13       20.61   .038    .91
1         14       15.68   .153    .82
1         15       15.35   .167    .86
Linear Correlation Across Patients for Examiner One = .88
ICC Across Patients for Examiner One = .90

2         1        18.58   .069    .89
2         2        9.21    .602    .95
2         3        14.43   .210    .91
2         4        49.91   <.0000  .72
2         5        19.94   .046    .86
Linear Correlation Across Patients for Examiner Two = .85
ICC Across Patients for Examiner Two = .93

Linear Correlation Across Patients for Examiner One and Two = .96
ICC Across Patients for Examiner One and Two = .96
Analysis: The two consecutive readings taken in the cervical area were highly
correlated for each chiropractor. In addition, the χ² probability indicates that
the null hypothesis (the measurements are the same) could be rejected in only
three cases. It appears that the intra-examiner reliability of the tests is
high, by any of the metrics. The chi-squared and ICC metrics sometimes disagree
since chi-squared normalizes the observed variability against an empirical
estimate of the measurement error, and the ICC normalizes against the
variability in the patient's measurements. If the ICC is calculated for all of
the data, a very high value is returned. This is done in the overall ICC
measurements reported in TABLE 1 above.
Sources of Variability: There are two obvious sources of variability which may
influence the outcome of this intra-examiner reliability study. The first is the
error introduced due to imperfect placement. By placement, we mean the position
of the dual prong tips used for contacting the patient during the examination as
well as the line of drive chosen by the examiner. Even with markings on the
cervical area (which were not used in this study), the angle of the instrument
and the positioning of the tips would be impossible to duplicate exactly from
the first examination to the second.
Another source of error is a result of the measurement itself. In our case, it
is not difficult to understand that the second set of measurements may differ
from the first due to the act of measuring, because the energy used in the
measurement may well be sufficient to cause changes in the underlying structure
of the spine. Such changes would be expected to result in differences in
response to the energy of the test impulse. That these changes are relatively
small is attested to by the excellent agreement found. This agreement may well
be improved by training and/or lowering the energy of the impulse.
Conclusions: The intra-examiner reliability of compliance measurements obtained
in the cervical spine with the Force Recording and Analysis System is
consistently high.
TRIAL THREE
Determine the reproducibility of repeated compliance readings taken by two
different examiners.
Method: Using the 30 mm dual prong attachment to straddle the spinous of the
patient's cervical vertebrae, an examining chiropractor stabilized the head and
neck of the seated patient in flexion. The first examining chiropractor chose
the line of drive and positioned the tips of the instrument on the occiput and
vertebrae C1-7 and T1-3. Three patients participated in the study. A second
examiner obtained a second set of readings on each patient immediately after the
first set. There were no markings on the patient to serve as position
references.
Research Hypothesis: The patterns of compliance measurements obtained using the
Force Recording and Analysis System in consecutive measurements of the human
cervical spine by two different examiners show no significant differences from
one trial to the next.
Statistical Analysis: This analysis was conducted in the same manner as the
previous analysis for the intra-examiner reliability.
Examiner  Patient  χ²      p       r
1-3       21       10.95   .447    .80
1-3       22       8.45    .673    .87
1-3       23       11.4    .432    .75
Linear Correlation Across Patients for Examiner One and Three = .89
ICC Across Patients for Examiner One and Three = .65
Analysis: The two consecutive readings taken in the cervical area were highly
correlated between the two chiropractors. In addition, the χ² probability
indicates that the null hypothesis (the measurements are the same) could not be
rejected in any of the cases. It appears that the inter-examiner repeatability
of the tests is high.
Discussion: The purpose of this trial is to determine whether or not the
compliance measurements obtained by two different clinicians on a single patient
are the same. The statistics suggest that the measurements are quite
reproducible across clinicians.
CONCLUSION
This study
suggests that the measurements of the FRAS reflect the actual compliance
of the patient's spine. Further, the intra- and inter-examiner reproducibility
of these measurements is good. A subsequent study will quantify our observations
that these compliance measurements change after chiropractic adjustment with the
instrument. Clinicians are just beginning to develop diagnostic and treatment
rules to use the information provided by this new instrument.
Sense
Technology Inc.
12/15/94
This yields a set of values for the segments tested that are normalized
relative to the maximum value recorded. These values are then displayed for the
investigator as a bar graph with eleven elements.
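The normalization and display step can be sketched as follows. This is only an illustration of the idea: the function names, the labels, the text rendering of the bars, and the readings are all hypothetical, and the actual display used by the system is not described here.

```python
def normalize_to_max(peak_forces):
    """Scale readings so that the largest recorded value becomes 1.0."""
    largest = max(peak_forces)
    if largest <= 0:
        raise ValueError("at least one reading must be positive")
    return [force / largest for force in peak_forces]


def text_bar_graph(values, labels, width=40):
    """Render normalized values as one text bar per element."""
    lines = []
    for label, value in zip(labels, values):
        bar = "#" * round(value * width)
        lines.append(f"{label:>4} {bar} {value:.2f}")
    return "\n".join(lines)


# Eleven hypothetical peak force readings, one per tested segment.
readings = [18.0, 21.5, 19.0, 24.0, 22.0, 20.5, 23.0, 25.0, 19.5, 18.5, 21.0]
labels = [str(i + 1) for i in range(len(readings))]
print(text_bar_graph(normalize_to_max(readings), labels))
```

Normalizing to the maximum makes the bars comparable within one examination, but it also means that absolute force levels are not visible in the display.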
Typical results obtained from sequential testing of sutura joint, condyloid
joint, and ginglymus joint. First four measurements obtained from the sutura
joint; second four measurements from the condyloid; last three measurements from
the ginglymus.