|  | Methodological quality of studies and strength of recommendations

                        A grading system was used for the strength of the 
                        recommendations. This grading system is simple and easy 
                        to apply, and shows a large degree of consistency 
                        between the grading of therapeutic and preventive, 
                        prognostic and diagnostic studies. The system is based 
                        on the original ratings of the AHCPR Guidelines (1994) 
                        and levels of evidence used in systematic (Cochrane) 
                        reviews on low back pain. 
                         Strength of recommendations:

                         1. Therapy and prevention: 
                          
                          Systematic review: 
                        systematic methods of selection and inclusion of 
                        studies, methodological quality assessment, data 
                        extraction and analysis.
                            | Level A : | Generally consistent findings provided by (a 
                              systematic review of) multiple high quality 
                              randomised controlled trials (RCTs). |  
                            | Level B : | Generally consistent findings provided by (a 
                              systematic review of) multiple low quality RCTs or 
                              non-randomised controlled trials (CCTs). |  
                            | Level C : | One RCT (either high or low quality) or 
                              inconsistent findings from (a systematic review 
                              of) multiple RCTs or CCTs. |  
                            | Level D: | No RCTs or CCTs. |  
                         2. Prognosis: 
                         
                          
                          High quality 
                        prognostic studies: prospective cohort studies
                            | Level A : | Generally consistent findings provided by (a 
                              systematic review of) multiple high quality 
                              prospective cohort studies. |  
                            | Level B : | Generally consistent findings provided by (a 
                              systematic review of) multiple low quality 
                              prospective cohort studies or other low quality 
                              prognostic studies. |  
                            | Level C : | One prognostic study (either high or low 
                              quality) or inconsistent findings from (a 
                              systematic review of) multiple prognostic 
                            studies. |  
                            | Level D, no evidence: | No prognostic studies. |  
                          Low quality prognostic studies: retrospective cohort 
                        studies, follow-up of untreated control patients in an 
                        RCT, case series.
                         3. Diagnosis: 
                         
                          
                          High quality 
                        diagnostic study: Independent blind comparison of 
                        patients from an appropriate spectrum of patients, all 
                        of whom have undergone both the diagnostic test and the 
                        reference standard. (An appropriate spectrum is a cohort 
                        of patients who would normally be tested for the target 
                        disorder. An inappropriate spectrum compares patients 
                        already known to have the target disorder with patients 
                        diagnosed with another condition.)
                            | Level A : | Generally consistent findings provided by (a 
                              systematic review of) multiple high quality 
                              diagnostic studies. |  
                            | Level B : | Generally consistent findings provided by (a 
                              systematic review of) multiple low quality 
                              diagnostic studies. |  
                            | Level C : | One diagnostic study (either high or low 
                              quality) or inconsistent findings from (a 
                              systematic review of) multiple diagnostic 
                            studies. |  
                            | Level D, no evidence: | No diagnostic studies. |  
                          Low quality diagnostic study: Study performed in a 
                        set of non-consecutive patients, or confined to a narrow 
                        spectrum of study individuals (or both), all of whom have 
                        undergone both the diagnostic test and the reference 
                        standard, or if the reference standard was not objective, 
                        not blinded or not independent, or if positive and 
                        negative tests were verified using separate reference 
                        standards, or if the study was performed in an 
                        inappropriate spectrum of patients, or if the reference 
                        standard was not applied to all study patients. 
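
The three grading tables above share one decision rule, which can be sketched in code. The following is an illustrative sketch only, not part of the guideline: the function name `grade_evidence` and its parameters are hypothetical, and "generally consistent" remains a reviewer judgement passed in as a flag. For therapy and prevention, CCTs are counted with the low quality studies, as in Level B. Two cases the tables leave open are filled by assumption: one high quality study combined with low quality studies is graded B, and two or more high quality studies are graded A even when low quality studies are also present.

```python
from enum import Enum


class Level(Enum):
    A = "A"
    B = "B"
    C = "C"
    D = "D"


def grade_evidence(n_high: int, n_low: int, consistent: bool) -> Level:
    """Map a body of evidence to a recommendation level (A to D).

    n_high: number of high quality studies of the relevant type
    n_low: number of low quality studies (for therapy, CCTs count here)
    consistent: reviewer judgement that findings are generally consistent
    """
    if n_high < 0 or n_low < 0:
        raise ValueError("study counts must be non-negative")
    total = n_high + n_low
    if total == 0:
        return Level.D  # no studies of the relevant type
    if total == 1 or not consistent:
        return Level.C  # a single study, or inconsistent findings
    if n_high >= 2:
        return Level.A  # multiple high quality studies, consistent
    # Assumption: at most one high quality study plus low quality studies.
    return Level.B


if __name__ == "__main__":
    print(grade_evidence(n_high=3, n_low=0, consistent=True).value)
    print(grade_evidence(n_high=0, n_low=4, consistent=True).value)
```
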
  
 The methodological quality of additional studies will 
                        only be assessed in areas that have not yet been covered 
                        by a systematic review, or for the non-English literature. 
                         The methodological quality of trials is usually 
                        assessed using relevant criteria related to the internal 
                        validity of trials. High quality trials are less likely 
                        to be associated with biased results than low quality 
                        trials. Various criteria lists exist, but differences 
                        between the lists are subtle. 
                         Quality assessment should ideally be done by at least 
                        two reviewers, independently, and blinded with regard to 
                        the authors, institution and journal. However, as 
                        experts are usually involved in quality assessment it 
                        may often not be feasible to blind studies. Criteria 
                        should be scored as positive, negative or unclear, and 
                        it should be clearly defined when criteria are scored 
                        positive or negative. Quality assessment should be pilot 
                        tested on two or more similar trials that are not 
                        included in the systematic review. A consensus method 
                        should be used to resolve disagreements, and a third 
                        reviewer should be consulted if disagreements persist. If 
                        the article does not contain information on the 
                        methodological criteria (score 'unclear'), the authors 
                        should be contacted for additional information. This 
                        also gives authors the opportunity to respond to 
                        negative or positive scores. 
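
The two-reviewer procedure described above can be sketched as a small comparison step. This is a hypothetical illustration, not a tool referenced by the guideline: the function name `compare_reviewers`, the score labels and the criterion names in the example are all assumptions. It lists the criteria that need a consensus discussion (and, if that fails, a third reviewer) and the criteria both reviewers scored as unclear, for which the authors could be contacted.

```python
from typing import Dict, List, Tuple

VALID_SCORES = {"positive", "negative", "unclear"}


def compare_reviewers(
    first: Dict[str, str], second: Dict[str, str]
) -> Tuple[Dict[str, str], List[str], List[str]]:
    """Compare two independent quality assessments of one study.

    Returns (agreed scores, criteria in disagreement, criteria agreed unclear).
    """
    if first.keys() != second.keys():
        raise ValueError("both reviewers must score the same criteria")
    for score in list(first.values()) + list(second.values()):
        if score not in VALID_SCORES:
            raise ValueError(f"invalid score: {score!r}")
    # Criteria on which both reviewers gave the same score.
    agreed = {item: s for item, s in first.items() if second[item] == s}
    # Criteria to resolve by consensus, then by a third reviewer if needed.
    disagreements = sorted(item for item in first if first[item] != second[item])
    # Criteria where the article gave too little information to score.
    contact_authors = sorted(item for item, s in agreed.items() if s == "unclear")
    return agreed, disagreements, contact_authors


if __name__ == "__main__":
    reviewer_1 = {"randomisation": "positive", "concealment": "unclear",
                  "intention_to_treat": "negative"}
    reviewer_2 = {"randomisation": "positive", "concealment": "unclear",
                  "intention_to_treat": "positive"}
    agreed, to_discuss, to_ask = compare_reviewers(reviewer_1, reviewer_2)
    print("discuss:", to_discuss)
    print("contact authors about:", to_ask)
```
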
                         The following checklists are recommended: 
                         Checklist for methodological quality of therapy / 
                        prevention studies 
                          
                          
                            | Items: |  |  
                            | 1) | Adequate method of randomisation, |  
                            | 2) | Concealment of treatment allocation, |  
                            | 3) | Withdrawal / drop-out rate described and 
                              acceptable, |  
                            | 4) | Co-interventions avoided or equal, |  
                            | 5) | Blinding of patients, |  
                            | 6) | Blinding of observer, |  
                            | 7) | Blinding of care provider, |  
                            | 8) | Intention-to-treat analysis, |  
                            | 9) | Compliance, |  
                            | 10) | Similarity of baseline 
                          characteristics. |  
 Checklist for methodological quality of prognosis 
                        (observational) studies 
                          
                          
                            | Items: |  |  
                            | 1) | Adequate selection of study population, |  
                            | 2) | Description of inclusion and exclusion 
                          criteria, |  
                            | 3) | Description of potential prognostic 
                            factors, |  
                            | 4) | Prospective study design, |  
                            | 5) | Adequate study size (> 100 
                            patient-years), |  
                            | 6) | Adequate follow-up (> 12 months), |  
                            | 7) | Adequate loss to follow-up (< 20%), |  
                            | 8) | Relevant outcome measures, |  
                            | 9) | Appropriate statistical 
                        analysis. |  
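
Items 5 to 7 of the prognosis checklist carry numeric thresholds, so they can be checked mechanically. The sketch below is illustrative only: the function name `check_prognosis_thresholds` and its parameters are hypothetical, and it assumes that loss to follow-up is expressed as the fraction of enrolled patients who were lost. The comparisons are strict, matching the ">" and "<" signs in the checklist.

```python
from typing import Dict


def check_prognosis_thresholds(
    patient_years: float, follow_up_months: float, lost: int, enrolled: int
) -> Dict[str, bool]:
    """Check checklist items 5 to 7 against their stated thresholds."""
    if enrolled <= 0:
        raise ValueError("enrolled must be positive")
    if not 0 <= lost <= enrolled:
        raise ValueError("lost must be between 0 and enrolled")
    loss_fraction = lost / enrolled
    return {
        "adequate_study_size": patient_years > 100,       # item 5
        "adequate_follow_up": follow_up_months > 12,      # item 6
        "adequate_loss_to_follow_up": loss_fraction < 0.20,  # item 7
    }


if __name__ == "__main__":
    print(check_prognosis_thresholds(
        patient_years=150, follow_up_months=24, lost=10, enrolled=100))
```
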
 Checklist for methodological quality of diagnostic 
                        studies 
                          
                          
                            | Items: |  |  
                            | 1) | Was at least one valid reference test 
                          used? |  
                            | 2) | Was the reference test applied in a 
                              standardised manner? |  
                            | 3) | Was each patient submitted to at least one 
                              valid reference test? |  
                            | 4) | Were the interpretations of the index test and 
                              reference test performed independently of each 
                              other? |  
                            | 5) | Was the choice of patients who were assessed 
                              by the reference test independent of the results 
                              of the index test? |  
                            | 6) | When different index tests are compared in the 
                              study: were the index tests compared in a valid 
                              design? |  
                            | 7) | Was the study design prospective? |  
                            | 8) | Was a description included regarding missing 
                              data? |  
                            | 9) | Were data adequately presented in enough 
                              detail to calculate test characteristics 
                              (sensitivity and 
                      specificity)? |  |
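
Item 9 of the diagnostic checklist asks whether the reported data allow sensitivity and specificity to be calculated. As a minimal sketch, assuming the study reports a complete 2x2 table of index test results against the reference standard, the calculation is as follows. The function name `diagnostic_accuracy` and the counts in the example are hypothetical.

```python
from typing import Tuple


def diagnostic_accuracy(tp: int, fp: int, fn: int, tn: int) -> Tuple[float, float]:
    """Return (sensitivity, specificity) from 2x2 table counts.

    tp: index test positive, reference standard positive
    fp: index test positive, reference standard negative
    fn: index test negative, reference standard positive
    tn: index test negative, reference standard negative
    """
    if min(tp, fp, fn, tn) < 0:
        raise ValueError("counts must be non-negative")
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one diseased and one non-diseased patient")
    sensitivity = tp / (tp + fn)  # proportion of diseased patients detected
    specificity = tn / (tn + fp)  # proportion of non-diseased patients cleared
    return sensitivity, specificity


if __name__ == "__main__":
    sens, spec = diagnostic_accuracy(tp=90, fp=20, fn=10, tn=80)
    print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

If any of the four counts cannot be recovered from the article, item 9 would be scored negative or unclear rather than estimated.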