J Manipulative Physiol Ther 2008 (Sep); 31 (7): 491–502
Howard Vernon, DC, PhD
Canadian Memorial Chiropractic College,
Toronto, Ontario, Canada
BACKGROUND: Published in 1991, the Neck Disability Index (NDI) was the first instrument designed to assess self-rated disability in patients with neck pain. This article reviews the history of the NDI and the current state of the research into its psychometric properties -- reliability, validity, and responsiveness -- as well as its translations. Focused reviews are presented into its use in studies of the prognosis of whiplash-injured patients as well as its use in clinical trials of conservative therapies for neck pain.
SPECIAL FEATURES: The NDI is a relatively short, paper-pencil instrument that is easy to apply in both clinical and research settings. It has strong psychometric characteristics and has proven to be highly responsive in clinical trials. As of late 2007, it has been used in approximately 300 publications; it has been translated into 22 languages, and it is endorsed for use by a number of clinical guidelines.
SUMMARY: The NDI is the most widely used and most strongly validated instrument for assessing self-rated disability in patients with neck pain. It has been used effectively in both clinical and research settings in the treatment of this very common problem.
From the Full-Text Article:
History Of The Neck Disability Index
Before 1991, no instrument was available to assess the
self-rated disability of patients with neck pain. In the
previous decade, a few such instruments for patients with
low back pain had been developed, chiefly the Oswestry
Low Back Pain Index (OI) and the Roland-Morris Low
Back Pain Questionnaire. Recognizing the deficiency with
respect to neck pain, Vernon undertook to develop a similar
instrument suitable for patients with neck pain. It was
decided to model this instrument on the OI, so permission
from its primary author, J. Fairbank, was obtained for that
purpose. Most of the items in the OI could be regarded as
specific “activities of daily living.” The inclusion of this type of item distinguished the OI and similar instruments from the simpler measures of pain severity, location, or duration that were more commonly used at that time.
The first phase in the development of the new instrument,
later named the Neck Disability Index (NDI), consisted of
item selection. First, items from the OI were reviewed for
appropriateness and retained if deemed applicable to patients
with neck pain. Six items were initially thought to be
suitable: ‘pain intensity,’ ‘personal care,’ ‘lifting,’ ‘sleep,’
‘driving,’ and ‘sex life.’ Descriptive studies of patients with
chronic neck pain were reviewed to
identify additional daily activities or health aspects reported
to be substantially affected in these patients. Informal surveys
of patients and a small consulting team of health practitioners
supplemented this search for items upon which neck pain
was considered to have a significant impact. The consulting
team then provided consensus ratings that resulted in the
addition of 4 new items: ‘headaches,’ ‘concentration,’
‘reading,’ and ‘work.’
Rating scales for these items were then developed and
refined. Drafts were submitted to patients and health
practitioners for feedback, resulting in further revisions to
the wording of the items. A pilot test was then launched
using 5 whiplash-injured patients. This resulted in unanimous rejection of the OI item “sex life,” which was replaced
by an item for ‘recreation.’ A final revision involved changes
to the wording of 2 of the original OI items: pain intensity
and sleep. In the original OI, these items were rated with
respect to the “use of tablets” (ie, medication for pain or
sleep). This was deemed unsuitable because many subjects
might not be taking such medications. The wording of all the
response options in these 2 items was then revised to reflect either
intensity, for pain, or duration, for sleep. This final version was submitted to the pilot group and was unanimously
endorsed as relevant and easy to use.
Since the original publication in 1991, only 1 small
change has been made to the original English version:
the qualifier “neck” was added in all
places where the sole term “pain” had been present,
clarifying that the item was concerned with the patient's
“neck pain” (items 1, 2, and 3).
Scoring And Interpretation
Each item is scored out of 5 for a maximum total score of
50. Care should be taken in reporting the score as either out of
50 or as a percentage out of 100. Most studies have reported the
scores out of 50. Several strategies for dealing with missing
data or noncompliance with an item have been developed.
When only 1 item is missing, some authors have scored the
NDI out of 45 and converted the score to a percentage. When
several items are missing, some authors have used the mean
value of the scored items and inserted this into the missing
items. If 3 or more items are missing, the overall score may be
suspect and, especially in research studies, may be invalid.
The scoring interpretation for the NDI differs slightly
from that for the OI, as follows: 0-4 = none; 5-14 = mild; 15-24 =
moderate; 25-34 = severe; over 34 = complete. These 5
categories have been revised by several authors in
subsequent studies, especially in the effort to determine a
dichotomous cutoff for “disabled” vs “not disabled” or
“recovered” vs “not recovered” (see below).
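The scoring rules above can be sketched in a few lines of Python. This is a minimal illustration, not a validated scoring tool: the function name `score_ndi`, the use of `None` for an unanswered item, the 0 to 5 item range, and the way the category bands are applied to fractional prorated scores are all assumptions of the sketch.

```python
from typing import Optional, Sequence

N_ITEMS = 10        # number of items in the final instrument
MAX_PER_ITEM = 5    # each item is scored out of 5 (assumed range 0-5)


def score_ndi(responses: Sequence[Optional[int]]) -> dict:
    """Score one completed NDI form; None marks an unanswered item."""
    if len(responses) != N_ITEMS:
        raise ValueError("expected exactly 10 item responses")
    answered = [r for r in responses if r is not None]
    if not answered:
        raise ValueError("no items were answered")
    if any(r < 0 or r > MAX_PER_ITEM for r in answered):
        raise ValueError("item scores must lie between 0 and 5")

    n_missing = N_ITEMS - len(answered)
    # Percentage of the maximum attainable on the answered items,
    # for example out of 45 when exactly 1 item is missing.
    percent = 100.0 * sum(answered) / (MAX_PER_ITEM * len(answered))
    # Equivalent score out of 50. Filling each missing item with the
    # mean of the answered items yields the same number.
    out_of_50 = percent / 2.0

    # Category bands from the text. Using "<" cutoffs for fractional
    # prorated scores is an assumption of this sketch.
    if out_of_50 < 5:
        category = "none"
    elif out_of_50 < 15:
        category = "mild"
    elif out_of_50 < 25:
        category = "moderate"
    elif out_of_50 < 35:
        category = "severe"
    else:
        category = "complete"

    return {
        "out_of_50": out_of_50,
        "percent": percent,
        "category": category,
        "n_missing": n_missing,
        # 3 or more missing items: the total may be suspect or invalid.
        "suspect": n_missing >= 3,
    }
```

Returning both the score out of 50 and the percentage makes it explicit which scale is being reported. Note that scoring out of 45 and mean substitution are arithmetically equivalent once the result is expressed as a percentage, so the sketch uses a single calculation for both.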
The Original 1991 Report
The original study reported on test-retest reliability over
a 2-day period, obtaining a value of 0.89 (P < .05). Internal
consistency was measured using Cronbach α, with a total
index value of .80. The highest scoring items (average out
of 5) were the following: headaches = 2.6; lifting = 2.2;
recreation = 2.2; reading = 2.1; and driving = 2.0. The total
index scores of the study sample were normally distributed,
as follows: 0 to 4 (none) = 2%; 5 to 14 (mild) = 35%; 15 to
24 (moderate) = 48%; 25 to 34 (severe) = 15%; and greater
than 34 (complete) = none. The convergent validity was
assessed by comparing the NDI scores to the scores of
the McGill Pain Questionnaire (MPQ): NDI/MPQ total
score = 0.70; NDI/MPQ-number of words = 0.69. The
responsiveness of the NDI was assessed by comparing, in a
small group of patients with whiplash who were undergoing
chiropractic treatment, the change in NDI scores over 3
weeks to a Visual Analogue Scale (VAS) for “pain
improvement” at 3 weeks. These scores were moderately
strongly correlated (0.60). The average change in NDI score
was 33.2%; the average VAS improvement score was 56%.
Since 1991, 22 additional publications have reported on the psychometric properties of the NDI. [5–26] Eight of these were published before 2002, [5–12] and most of these were included in the only systematic review to date. In that review, it was acknowledged that (1) the NDI was the most widely used of the several scales for self-rating disability in patients with neck pain that had been developed since 1991, and (2) the NDI was the best validated of these instruments.
With regard to reliability, 8 studies in addition to the original paper have reported test-retest correlations between 0.90 and 0.93. [5, 12, 14, 18, 20–22, 24, 26] Hains et al reported that item order did not affect the responses. The internal consistency has been reported in 7 additional studies, with Cronbach α values ranging from .74 to .93. [5, 14, 18, 20, 22, 24, 26] Four studies have calculated the factor structure of the NDI, [8, 14, 18, 26] with 3 agreeing that only 1 factor (physical disability) is present. The reliability, internal consistency, and factor structure of the NDI are now considered to be well described in the literature and to be of very high quality.
With regard to responsiveness, the minimum detectable change (MDC) reported in 2 studies of patients with neck pain is less than 2 points (out of 50, <4%), [21, 26] although Pool et al reported an MDC of 10.4 points. Cleland et al reported a much larger MDC in a small sample of patients with cervical radiculopathy; however, because the NDI was not specifically designed for use in this clinical group, these findings do not reflect on the NDI in usual use. The minimum clinically important difference or change (MCID/C) has been reported in 3 studies. [11, 19, 25] Stratford et al determined an MDC and MCIC of 5 points (out of 50) by comparing NDI change scores with a physician-rated change scale. Cleland et al [19] reported an MCID of 10 points in the small sample of radiculopathy patients. This clinical problem is generally more refractory to treatment, so a larger MCID is not surprising. Pool et al obtained a value of 3.5 points from an area under the curve (AUC) analysis comparing NDI change with global perceived change. This was deemed by these authors to be the more appropriate value for MCIC.
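For readers unfamiliar with the MDC, one widely used formulation derives it from the standard error of measurement: SEM = SD × √(1 − r), and MDC = z × √2 × SEM, where r is a test-retest reliability coefficient. The sketch below assumes that formulation; the studies cited above may have used other methods, and the function names and the input values in the usage note are hypothetical.

```python
import math


def standard_error_of_measurement(sd: float, reliability: float) -> float:
    """SEM = SD * sqrt(1 - r), with r a test-retest reliability coefficient."""
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must lie between 0 and 1")
    if sd < 0:
        raise ValueError("sd must be non-negative")
    return sd * math.sqrt(1.0 - reliability)


def minimum_detectable_change(sd: float, reliability: float,
                              z: float = 1.96) -> float:
    """MDC = z * sqrt(2) * SEM; z = 1.96 gives the 95% confidence version."""
    return z * math.sqrt(2.0) * standard_error_of_measurement(sd, reliability)


def as_percent_of_scale(points: float, scale_max: float = 50.0) -> float:
    """Express a change in NDI points as a percentage of the 50-point scale."""
    return 100.0 * points / scale_max
```

As a usage illustration with made-up inputs, `minimum_detectable_change(8.0, 0.90)` evaluates the formula for a hypothetical sample standard deviation of 8 NDI points and a reliability of 0.90. These inputs are not taken from any of the cited studies.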
Effect sizes, standardized response means, and responsiveness ratios have been reported by 7 studies, with the findings ranging from 0.80 to 1.82, all of which are large by usual standards. These studies report on variable treatments over variable times and doses. The data from the treatment studies reviewed below give more precise effect sizes for the different treatment approaches.
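The two most common of these indices can be sketched as follows, assuming the conventional definitions: the effect size divides the mean change by the standard deviation of the baseline scores, and the standardized response mean divides it by the standard deviation of the change scores. The function names and the sign convention (baseline minus follow-up, so that improvement is positive) are choices made for this sketch, and the cited studies may have computed their indices differently.

```python
from statistics import mean, stdev
from typing import Sequence


def change_scores(baseline: Sequence[float],
                  follow_up: Sequence[float]) -> list:
    """Per-patient improvement: baseline minus follow-up NDI score."""
    if len(baseline) != len(follow_up):
        raise ValueError("baseline and follow-up must be paired")
    if len(baseline) < 2:
        raise ValueError("at least 2 patients are needed")
    return [b - f for b, f in zip(baseline, follow_up)]


def effect_size(baseline: Sequence[float],
                follow_up: Sequence[float]) -> float:
    """Mean change divided by the SD of the baseline scores."""
    return mean(change_scores(baseline, follow_up)) / stdev(baseline)


def standardized_response_mean(baseline: Sequence[float],
                               follow_up: Sequence[float]) -> float:
    """Mean change divided by the SD of the change scores."""
    changes = change_scores(baseline, follow_up)
    return mean(changes) / stdev(changes)
```

Both functions take paired lists of NDI scores (out of 50) for the same patients at baseline and at follow-up. Both indices are undefined when the relevant standard deviation is zero, in which case the division raises `ZeroDivisionError`.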
The NDI change scores correlate well with measures of global change, with r values ranging from 0.30 to 0.76. [9, 20, 23, 26]
Finally, the NDI has been employed in numerous studies of other instruments designed to evaluate patients with neck pain. In the following list, the NDI was used as one of the primary measures for determining the construct validity of these new instruments. As all of these studies reported acceptably high correlation coefficients, these studies provide evidence for the convergent validity of the NDI with other instruments whose purpose is more or less equivalent (the reference cited is the first one to use the NDI for comparison):
The Copenhagen Neck Functional Disability Scale (Jordan et al)
The Patient-Specific Scale (Neck) (Riddle and Stratford)
The Neck Pain and Disability Scale (Wheeler et al)
The Functional Rating Index (Feise et al)
The Aberdeen Back Scale (Neck) (Williams et al)
The Cervical Spine Outcomes Questionnaire (BenDebba et al)
The Bournemouth Questionnaire—Neck (Bolton et al)
The Whiplash-Specific Disability measure (Pinfold et al)
The Core Outcomes for Neck Pain (White et al)
The Whiplash Disability Questionnaire (Willis et al)
The NHANES-ADL (neck) (Cook et al)
As of late 2007, there were 6 published translations of the NDI: French, Dutch, Swedish, Korean, Brazilian Portuguese, and Persian (Iranian). In addition to these published translations, the author has worked with the MAPI Company of France to produce the numerous translations that are available at the MAPI website (www.proqolid.com). All of the MAPI translations were conducted using standardized methodologies of linguistic validation, including forward and backward translations by linguistic experts, pilot testing with clinicians and nonexperts in multiple iterations, final confirmation by the original author, and standard formatting and proofreading. No separate psychometric studies were conducted by the MAPI group on any of these translations. Overlap exists between the separate French and Dutch studies, [14, 15] which did conduct psychometric testing in their respective languages.
As of December 2007, 2 other translations involving the author are in preparation: Greek and Gujarati (personal communication, Sabapathy, 2007).
The NDI is explicitly endorsed as the instrument of choice in the following guidelines for the treatment of whiplash-associated disorder (WAD):
1. NHS Library: Clinical Knowledge Summaries
2. Transport Accident Commission, Victoria State, Australia
3. New South Wales Motor Accidents Authority, Guidelines for the Management of Acute Whiplash-Associated Disorders, 2nd Edition, 2007
4. Clinical Practice Guidelines for the Physiotherapy Treatment of Patients with Whiplash-Associated Disorders. Leigh et al, 2004. British Columbia Physiotherapy Association
5. Clinical Practice Guidelines for Physical Therapy in Patients with Whiplash-Associated Disorders. Bekkering et al, 2003, Royal Dutch Society for Physical Therapists
Whiplash-Associated Disorder: Prognosis Studies Using the NDI
There have been 41 studies involving patients with WAD that have used the NDI. Seventeen of these involved the prognosis of patients with whiplash, 14 of which reported original data. [39-52] These studies were rated according to Sackett et al. Studies rated in categories 3-5/5 were excluded. The quality of these studies ranged from 2b to 2c according to the classification by Sackett et al (all were acceptable for inclusion). Data retrieval included numbers of subjects, baseline NDI scores, follow-up NDI scores, prognostic indicators, and where applicable, correlation scores between NDI and other variables.
The groups within these reports were classified according to categories by Côté et al, as follows: by source of data, emergency department (n = 11), general practice (3), insurance database (2), population study (1); by study design, univariate (9), multivariate (7), or explanatory (modeling) (1). The median follow-up time in these studies was 6 months (1-204 months). The mean (SD) sample size was 137 (120). One case-control study included 931 controls. Several recovery categorizations were reported, all of which correlated with the original NDI categories (Table 3).
Recovery cutoffs range from 10 to 20 out of 50, with the average being 15 out of 50. Odds or relative risk ratios for high initial NDI and poor recovery were reported from 1.1 to 11.2. Several predictive models including high initial NDI scores were reported, accounting for up to 84.6% of variability in recovery status. Several studies reported that NDI score was the best predictor of outcome: low initial NDI predicts recovery; high initial NDI predicts chronicity.
The NDI has been shown to be highly useful in the prognostication of outcome after WAD injury either alone or within multivariable models. The NDI appears better than ‘pain level’ as a measure of symptom/disability status for prognostic purposes. High NDI scores (>15/50) at 3 to 36 months postaccident are strongly correlated with several important measures of physiologic dysfunction and physical impairment, indicating that psychosocial and accident-related factors are not the only correlates of high self-rated disability in patients who have chronic WAD and that attention to pathophysiologic factors such as muscular dysfunction and central sensitization is warranted.
The NDI in Randomized Clinical Trials (RCTs) of Conservative Treatment
In nonsurgical treatment studies, treatment groups were classified as follows: manipulation, mobilization, physiotherapy, exercise, acupuncture, medication, cervical pillow, laser, and relaxation therapy. Each trial report was rated for quality by 2 raters using the Maastricht-Amsterdam Rating Scale, which gives a score out of 19. Studies attaining a score of 50% or more were considered of high quality.
Tables 4-14 display data on the different groups reported in trials of conservative treatments for neck pain [56-83] that employed the NDI as an outcome measure (no. of groups > no. of trials). Table 14 shows the mean (SD) and the SEM of change scores at various intervals postbaseline for several of these treatment modalities.
This review has focused only on those treatment studies that have used the NDI; the purpose was not to conduct a systematic review of all RCTs of conservative treatments for neck pain. The primary purpose of this review was to describe (not systematically analyze) the responsiveness of the NDI as an outcome measure in these trials. The mean changes obtained in the categories shown in Table 14 all exceed the MCIC reported by Stratford et al, [11] although the groups receiving medications appear to improve the least. These mean changes range from 5 to 10 points or from 10% to 20%. By way of interpreting these changes, Farrar et al have reviewed the change scores on the 11-point pain scale in 10 clinical trials for a variety of chronic pain complaints (2724 subjects) and have determined that a 2-point change, or a change of 20 mm out of 100 mm (20%), is clinically relevant for chronic pain patients.
It could be argued that these change scores represent the natural history of chronic neck pain or the placebo effect within a trial and therefore do not reflect the influence of the treatments provided. Vernon et al [83] investigated the average change in pain scores in a separate group of controlled clinical trials of conservative treatments for chronic neck pain and found that these changes are not generally greater than 15 mm on a 100-mm VAS (around 15% improvement). In several of these studies, there was no change at all in pain scores in the control groups over up to 10 weeks posttreatment. Considering the findings of Farrar et al [82] and Vernon et al [83] with respect to changes in pain scores of patients with chronic pain, the changes in disability/NDI scores obtained in this descriptive review would appear to exceed what could be ascribed to either the natural history or the placebo effect.
Other Treatment Modalities
The NDI has been used as a primary outcome measure in 57 surgical trials and 3 trials of injection-type therapies (references available from author on request).