Cochrane Database of Systematic Reviews

Manipulative therapies for infantile colic

Abstract

Background

Infantile colic is a common disorder, affecting around one in six families, and in 2001 was reported to cost the UK National Health Service in excess of £65 million per year (Morris 2001). Although it usually remits by six months of age, there is some evidence of longer‐term sequelae for both children and parents.

Manipulative therapies, such as chiropractic and osteopathy, have been suggested as interventions to reduce the severity of symptoms.

Objectives

To evaluate the results of studies designed to address efficacy or effectiveness of manipulative therapies (specifically, chiropractic, osteopathy and cranial manipulation) for infantile colic in infants less than six months of age.

Search methods

We searched the following databases: CENTRAL (2012, Issue 4), MEDLINE (1948 to April Week 3 2012), EMBASE (1980 to 2012 Week 17), CINAHL (1938 to April 2012), PsycINFO (1806 to April 2012), Science Citation Index (1970 to April 2012), Social Science Citation Index (1970 to April 2012), Conference Proceedings Citation Index ‐ Science (1990 to April 2012) and Conference Proceedings Citation Index ‐ Social Science & Humanities (1970 to April 2012). We also searched all available years of LILACS, PEDro, ZETOC, WorldCat, TROVE, DART‐Europe, ClinicalTrials.gov and ICTRP (May 2012), and contacted over 90 chiropractic and osteopathic institutions around the world. In addition, we searched CentreWatch, NRR Archive and UKCRN in December 2010.

Selection criteria

Randomised trials evaluating the effect of chiropractic, osteopathy or cranial osteopathy alone or in conjunction with other interventions for the treatment of infantile colic.

Data collection and analysis

In pairs, five of the review authors (a) assessed the eligibility of studies against the inclusion criteria, (b) extracted data from the included studies and (c) assessed the risk of bias for all included studies. Each article or study was assessed independently by two review authors. One review author entered the data into Review Manager software and the team's statistician (PP) reviewed the chosen analytical settings.

Main results

We identified six studies for inclusion in our review, representing a total of 325 infants. There were three further studies that we could not find information about and we identified three other ongoing studies. Of the six included studies, five were suggestive of a beneficial effect and one found no evidence that manipulative therapies had any beneficial effect on the natural course of infantile colic. Tests for heterogeneity imply that there may be some underlying difference between this study and the other five.

Five studies measured daily hours of crying and these data were combined, suggesting that manipulative therapies had a significant effect on infant colic ‐ reducing average crying time by one hour and 12 minutes per day (mean difference (MD) ‐1.20; 95% confidence interval (CI) ‐1.89 to ‐0.51). This conclusion is sustained even when considering only studies with a low risk of selection bias (sequence generation and allocation concealment) (MD ‐1.24; 95% CI ‐2.16 to ‐0.33); those with a low risk of attrition bias (MD ‐1.95; 95% CI ‐2.96 to ‐0.94), or only those studies that have been published in the peer‐reviewed literature (MD ‐1.01; 95% CI ‐1.78 to ‐0.24). However, when combining only those studies with a low risk of performance bias (parental 'blinding'), the improvement in daily crying hours was not statistically significant (MD ‐0.57; 95% CI ‐2.24 to 1.09).
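The arithmetic behind these figures can be checked with a short Python sketch. It converts the pooled mean difference from decimal hours to hours and minutes, and applies the usual reading that a 95% CI lying entirely on one side of zero corresponds to statistical significance at the 5% level. The function names are illustrative only and are not part of the review's methods.

```python
def hours_to_h_min(hours):
    """Convert decimal hours to (sign, whole hours, minutes)."""
    total_minutes = round(abs(hours) * 60)
    sign = -1 if hours < 0 else 1
    return sign, total_minutes // 60, total_minutes % 60


def ci_excludes_zero(lower, upper):
    """True when a confidence interval lies entirely on one side of zero."""
    return lower > 0 or upper < 0


# Pooled estimates quoted in the paragraph above (hours of crying per day).
all_studies = {"md": -1.20, "ci": (-1.89, -0.51)}
low_performance_bias = {"md": -0.57, "ci": (-2.24, 1.09)}

print(hours_to_h_min(all_studies["md"]))              # (-1, 1, 12): 1 h 12 min less crying
print(ci_excludes_zero(*all_studies["ci"]))           # True
print(ci_excludes_zero(*low_performance_bias["ci"]))  # False
```

The last line reflects the point made in the text: once only the studies with parental 'blinding' are combined, the interval spans zero.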

One study considered whether the reduction in crying time was clinically significant. This found that a greater proportion of parents of infants receiving a manipulative therapy reported clinically significant improvements than did parents of those receiving no treatment (reduction in crying to less than two hours: odds ratio (OR) 6.33; 95% CI 1.54 to 26.00; more than 30% reduction in crying: OR 3.70; 95% CI 1.15 to 11.86).

Analysis of data from three studies that measured 'full recovery' from colic as reported by parents found that manipulative therapies did not result in significantly higher proportions of parents reporting recovery (OR 11.12; 95% CI 0.46 to 267.52).

One study measured infant sleeping time and found manipulative therapy resulted in statistically significant improvement (MD 1.17; 95% CI 0.22 to 2.12).

The quality of the studies was variable. There was a generally low risk of selection bias but only two of the six studies were evaluated as being at low risk of performance bias, three at low risk of detection bias and one at low risk of attrition bias.

One of the studies recorded adverse events and none were encountered. However, with a sample of only 325 infants, we have too few data to reach any definitive conclusions about safety.

Authors' conclusions

The studies included in this meta‐analysis were generally small and methodologically prone to bias, which makes it impossible to arrive at a definitive conclusion about the effectiveness of manipulative therapies for infantile colic.

The majority of the included trials appeared to indicate that the parents of infants receiving manipulative therapies reported fewer hours crying per day than parents whose infants did not, based on contemporaneous crying diaries, and this difference was statistically significant. The trials also indicate that a greater proportion of those parents reported improvements that were clinically significant. However, most studies had a high risk of performance bias due to the fact that the assessors (parents) were not blind to who had received the intervention. When combining only those trials with a low risk of such performance bias, the results did not reach statistical significance. Further research is required where those assessing the treatment outcomes do not know whether or not the infant has received a manipulative therapy.

There are inadequate data to reach any definitive conclusions about the safety of these interventions.

Plain language summary

Manipulative therapies for infantile colic

Infantile colic is a distressing problem, characterised by excessive crying of infants, and it is the most common complaint seen by physicians in the first 16 weeks of a child's life.

It is usually considered a benign disorder because the symptoms generally disappear by the age of five or six months. However, the degree of distress caused to parents and family life is such that physicians often feel the need to intervene. Some studies suggest that there are longer‐lasting effects on the child, and estimates in 2001 put the cost to the UK National Health Service at over £65 million.

It has been suggested that certain gentle (low velocity, low amplitude) manipulative techniques (such as those used in osteopathy and chiropractic therapies) might safely reduce the symptoms associated with infantile colic, specifically excessive crying time. This review included six randomised trials involving 325 infants who received manipulative treatment or had been part of a control group.

The studies involved too few participants and were of insufficient quality to draw confident conclusions about the usefulness and safety of manipulative therapies.

Although five of the six trials suggested crying is reduced by treatment with manipulative therapies, there was no evidence of manipulative therapies improving infant colic when we only included studies where the parents did not know if their child had received the treatment or not.

No adverse effects were found, but they were only evaluated in one of the six studies.

Further rigorous research is required where (a) infants are randomly allocated to receive either treatment or no treatment and (b) those assessing the treatment outcomes do not know whether or not the infant has received a manipulative therapy.

Authors' conclusions

Implications for practice

The authors conclude that:

  • quality of the evidence is mixed: the studies that we have included in this meta‐analysis were generally small and methodologically prone to bias, which makes it impossible to arrive at a definitive conclusion about the effectiveness of manipulative therapies for infantile colic

  • taken together, the evidence seems to suggest that there may be benefits in terms of reduction in crying hours: the evaluation using GRADEPro (GRADEPro 2008) indicates low quality of evidence for a reduction in daily hours of crying of over one hour and for a greater proportion of parents reporting resolution of their infants' colic symptoms. The majority of the included trials appear to result in significant reductions in reported crying hours per day and in a greater proportion of parents reporting clinically significant reduction in daily crying.

  • if one excludes the poorer quality evidence, these benefits do not reach statistical significance: most studies had a high risk of performance bias introduced owing to the fact that the assessors (parents) were not blind to who had received the intervention and when combining only those trials with a low risk of such performance bias, the results did not reach statistical significance.

  • we cannot quantify any risk of adverse effects when using manipulative therapies for the treatment of infantile colic.

Further rigorous randomised trials with adequate blinding are required to evaluate the role of manipulative therapies in the treatment of infantile colic.

Implications for research

The current evidence for the effectiveness of manipulative therapies for infantile colic is based on studies that are generally small and methodologically prone to bias.

Further suitably powered studies of good quality are needed, especially those where parents do not know if their child has received manipulation. Future research should focus specifically on understanding:

  • the effect of 'blinding' of parents;

  • the reporting and evaluation of incomplete outcome data;

  • the safety of manipulation in infants.

Additionally, qualitative factors of value to the parents of colicky infants should also be investigated.

Economic evaluation of any benefits would also be needed to inform the guidance provided to and by physicians.

Summary of findings

Summary of findings for the main comparison. Manipulative therapy compared to control condition for infant colic

Manipulative therapy compared to no treatment or sham for infant colic

Patient or population: infants with colic
Settings: teaching clinics and private practice
Intervention: manipulative therapy
Comparison: no treatment, sham treatment or usual treatment

| Outcomes | Illustrative comparative risks* (95% CI): assumed risk (no treatment or sham) | Illustrative comparative risks* (95% CI): corresponding risk (manipulative therapy) | Relative effect (95% CI) | No of participants (studies) | Quality of the evidence (GRADE) | Comments |
|---|---|---|---|---|---|---|
| Change in daily hours of crying. Crying diary completed by parents | The mean change in daily hours of crying ranged across control groups from ‐0.5 to ‐2.3 hours (reduction) | The mean change in daily hours of crying in the intervention groups was 1.2 hours (1.89 to 0.51 hours greater reduction than control) | | 223 (5 studies) | ⊕⊕⊝⊝ low (1,2) | |
| Presence/absence of colic. Global impression of change scale completed by parents | Study population: 135 per 1000. Moderate: 133 per 1000 | Study population: 634 per 1000 (67 to 977). Moderate: 630 per 1000 (66 to 976) | OR 11.12 (0.46 to 267.52) | 185 (3 studies) | ⊕⊕⊝⊝ low (1,2) | Data were either highest category ('completely recovered' or similar) from Likert‐style questionnaire completed by parents or from records of infants discharged well |
| Adverse events. Incidents reported by parents | See comment | See comment | Not estimable | 102 (1 study) | See comment | |

*The basis for the assumed risk (e.g. the median control group risk across studies) is provided in footnotes. The corresponding risk (and its 95% confidence interval) is based on the assumed risk in the comparison group and the relative effect of the intervention (and its 95% CI).
CI: Confidence interval; RR: Risk ratio; OR: Odds ratio.

GRADE Working Group grades of evidence
High quality: Further research is very unlikely to change our confidence in the estimate of effect.
Moderate quality: Further research is likely to have an important impact on our confidence in the estimate of effect and may change the estimate.
Low quality: Further research is very likely to have an important impact on our confidence in the estimate of effect and is likely to change the estimate.
Very low quality: We are very uncertain about the estimate.

1 Several of the studies did not attempt to blind participants
2 There is an unexplained heterogeneity between the Olafsdottir study and the other studies
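As a worked illustration of how a corresponding risk can follow from an assumed risk and an odds ratio, the Python sketch below applies the standard conversion, corresponding risk = OR × ACR / (1 − ACR + OR × ACR), where ACR is the assumed control risk. The figures it prints agree with the per‐1000 values reported above for presence/absence of colic, although we are only assuming that this is the calculation used to produce them. The function names are illustrative.

```python
def corresponding_risk(assumed_risk, odds_ratio):
    """Risk in the intervention group implied by an odds ratio and a control-group risk."""
    return odds_ratio * assumed_risk / (1 - assumed_risk + odds_ratio * assumed_risk)


def per_1000(risk):
    """Express a proportion as a whole number of events per 1000."""
    return round(risk * 1000)


# Odds ratio and 95% CI for presence/absence of colic, as reported in the table.
odds_ratio, ci_low, ci_high = 11.12, 0.46, 267.52

for label, assumed in [("Study population", 0.135), ("Moderate", 0.133)]:
    point = per_1000(corresponding_risk(assumed, odds_ratio))
    low = per_1000(corresponding_risk(assumed, ci_low))
    high = per_1000(corresponding_risk(assumed, ci_high))
    print(f"{label}: {point} per 1000 ({low} to {high})")
```

The very wide range (roughly 67 to 977 per 1000) is a direct consequence of the wide confidence interval around the odds ratio.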

Background

Description of the condition

Infantile colic ‐ which presents as excessive crying in healthy, thriving infants ‐ is a common problem during the first months of childhood. Studies on the occurrence of colic have reported incidence rates that vary widely from 2% to 40%, in part as a result of differences in the criteria used to define the condition (Lucassen 2001; Soltis 2004). Crying typically occurs in the evenings, episodes starting in the first weeks of life and ending at the age of four or five months (Illingworth 1985). In studies, the condition is typically defined as crying that lasts at least three hours a day and occurs at least three days per week over a period of at least three weeks, a definition first proposed by Wessel (Wessel 1954). A number of other definitions exist, possibly reflecting different conditions with other risk factors (Reijneveld 2002). In some definitions the duration criterion relates to both crying and fussing behaviour.
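The Wessel 'rule of threes' can be expressed as a simple check on a crying diary. The Python sketch below is a minimal illustration only: the function name and the diary layout (a list of weeks, each a list of daily crying hours) are hypothetical, and it assumes that the three qualifying weeks must be consecutive, which the definition as quoted above does not state explicitly.

```python
def meets_wessel_criteria(weekly_diaries, min_hours=3.0, min_days=3, min_weeks=3):
    """Return True if a diary satisfies the 'rule of threes'.

    weekly_diaries: list of weeks, each a list of daily crying hours.
    Assumption: qualifying weeks must be consecutive.
    """
    consecutive_weeks = 0
    for week in weekly_diaries:
        days_over_threshold = sum(1 for hours in week if hours >= min_hours)
        if days_over_threshold >= min_days:
            consecutive_weeks += 1
        else:
            consecutive_weeks = 0
        if consecutive_weeks >= min_weeks:
            return True
    return False


# Hypothetical diary: three weeks, each with three days of at least three hours' crying.
diary = [[3.5, 1.0, 4.0, 0.5, 3.0, 2.0, 1.5]] * 3
print(meets_wessel_criteria(diary))      # True
print(meets_wessel_criteria(diary[:2]))  # False: only two qualifying weeks
```

Definitions that count fussing as well as crying could be accommodated by changing what is recorded in the diary rather than the check itself.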

In some literature, symptoms other than crying are mentioned, such as: crying in 'bouts' ('paroxysms'); changes to the acoustics of the cry (higher pitch) (Lester 1992), which can be differentiated by the parents (Gustafson 2000); flushing of the face; passing of gas, abdominal distension and difficulty with passing stools (Wessel 1954); drawing up the legs, arching the back and other signs that the infant may be experiencing pain (Illingworth 1985). It is not clear whether these symptoms are important features of a colic syndrome or relate to other disorders (Reijneveld 2002; Soltis 2004).

The aetiology of infantile colic is unclear. Infantile colic may not have a single cause but rather be the result of a number of different problems, with excessive crying as the final common pathway. Several main causes are suggested in the literature (Lucassen 1998; Savino 2007). Firstly, it is suggested that infantile colic may arise from a problem with the digestive system (Lindberg 1999; Kirjavainen 2001). According to this view, excessive crying is the result of painful gut contractions caused, for example, by allergy to cow's milk (Miller 1991; Hill 2000; Iacono 2005), intestinal microflora (Lehtonen 1994; Savino 2004; Savino 2005; Rhoads 2009; Savino 2009), neutrophilic infiltration (Rhoads 2009), transient lactase deficiency (Kanabar 2001) or motility dysfunction (Hipperson 2004). The word 'colic', derived from the Greek word 'kolikos' (large intestine), reflects this hypothesis (St James‐Roberts 1991). Secondly, colic can be viewed as a behavioural problem. According to this model, colic may be the result of an infant's perceived 'difficult' temperament (Canivet 2000) leading to inadequate parental reactions, or due to parental distress or depression leading to less‐than‐optimal interaction (Carey 1984; Akman 2006). Thirdly, some have argued that excessive crying in an infant is not an illness, but merely the extreme end of normal crying (Barr 1991; Soltis 2004). Finally, some believe that infantile colic is merely a collection of aetiologically different problems, which are difficult to disentangle (Treem 1994).

The mean onset of colic is 1.8 weeks of age (Paradise 1966), and infants whose colic begins in the first two weeks of life seem to have a longer duration of symptoms than those whose symptoms start later (Pinyerd 1989). The average duration of symptoms is 13.6 weeks (Paradise 1966). Symptoms generally increase over the first few weeks of life, peaking in both the amount of crying and the intensity of the early‐evening diurnal pattern around the sixth week before reducing until the age of 12 weeks (Brazelton 1962; St James‐Roberts 1991b).

Unexplained crying is the most common presentation to paediatricians in the first 16 weeks of life (Miller 2007), with around one in six families seeking professional advice for a colicky infant (Husereau 2003), despite the fact that most infants with colic no longer have symptoms by the age of four to five months. Because of the serious impact of the condition on parents, most doctors and nurses feel the need to intervene ‐ at a cost to the UK National Health Service in excess of £65 million (Morris 2001). Although many observers characterise infantile colic as a benign and self‐limiting problem, there is growing evidence to indicate that there may be serious sequelae to the disorder, such as shaken baby syndrome, child abuse, neglect (for example, Overpeck 1998; Reijneveld 2004; Lee 2007), increased maternal stress (Neu 2003; Miller‐Loncar 2004), later behavioural problems or lower academic achievement (Canivet 2000; Rao 2004).

Support mechanisms for parents are also changing, with a general reduction in peer, family and community support since the 1960s. These changes may also play a role in how children are diagnosed, cared for and treated. For example, in the last century, an infant may have cried excessively, but colic may not have been diagnosed and the presence of an extended family may have meant better support for the new mother than is generally available today.

Description of the intervention

The first person to suggest manipulative therapy as an effective treatment for infantile colic seems to have been AT Still in 1910. For many years, chiropractors, osteopaths and others have reported apparently good results from the treatment of infants with symptoms of colic (Still 1910; Nilsson 1985; Klougart 1989; Biederman 1992). It has been reported that as many as 63% of paediatric patients presenting to chiropractors may do so with prolonged crying (Miller 2007).

Chiropractic and osteopathy are health professions concerned with the diagnosis, treatment and prevention of disorders of the musculoskeletal system, and the effects of these disorders on the nervous system and general health. The broad model of health care is holistic, based on the theory that bony misalignments or soft tissue tensions within the body can result in visceral symptoms and that well‐being is dependent on the skeleton, muscles, ligaments and connective tissues functioning smoothly together. Health is viewed as a complex process integrating all parts and systems of the body (Peterson 2002; General Osteopathic Council 2010; McTimoney 2010).

Both professions focus on palpatory techniques to diagnose dysfunction, then use physical manipulation or adjustments, stretches and mobilisation techniques to improve the functioning of joints, to relieve muscle tension, to enhance the blood and nerve supply to tissues, and to help the body's own healing mechanisms (General Chiropractic Council 2010; General Osteopathic Council 2010).

Such manipulatory techniques may include adjustments to dysfunctional vertebrae (identified, for example, by misalignment or by reduced motion), achieved by introducing an impulse into the spinal column to correct the dysfunction. The speed and force applied vary between different techniques (Kawchuk 1992; Kawchuk 1993; Colloca 2009), although the techniques used for infants are more similar to one another than are those used for adults. In general, very light fingertip pressure 'hold and release' adjustments are the norm when treating infants.

Practitioners of different techniques may also recommend different numbers of treatments, or deliver them over a different time frame.

Both professions also use cranial adjustments ‐ usually following specialist training in cranial osteopathy, craniosacral therapy techniques or applying cranial techniques taught on specialist paediatrics courses (Craniosacral Therapy Association of the UK 2010; McTimoney College of Chiropractic 2010; Sutherland Society 2010). These involve very light pressure to the cranium and associated soft tissue to allow or encourage the bones to achieve a normal physiological balance.

Adverse outcomes

The wider literature concerning the use of high‐velocity manipulative therapies for adults suggests that there may be some evidence of serious adverse reactions (for example, Smith 2003), although other authors (for example, Cassidy 2009) have found no evidence of causal links, concluding that adverse events are most likely to be related to patients consulting therapists with symptoms indicative of incipient stroke or vertebral artery dissection, such as headaches and neck pain.

The safety of manipulative techniques in infant populations has also been questioned, based on anecdote or case reports with no denominators (for example, Holla 2009, who reported on the death of an infant following treatment for infantile colic by a 'so‐called craniosacral therapist' who used a technique not apparently taught by any of the schools of osteopathy, chiropractic or CranioSacral Therapy). The present review authors are not aware of any studies concerning infants specifically, but there are reports of adverse events in children (to 18 years of age). For example, one survey (Alcantara 2009) found three adverse events reported by chiropractors in 5438 visits involving 577 children (between one day and 18 years of age) and two adverse events reported by parents from 1735 visits involving 239 children. The adverse events reported by chiropractors were 'muscle stiffness', 'spine soreness' or 'stiff and sore'; those reported by parents were 'soreness of the knee' and 'stiffness of the cervical spine'. In a systematic review of 13 studies (two randomised trials, 11 observational reports) (Vohra 2007), 14 cases of adverse events involving neurological or musculoskeletal events were identified between 1959 and 2006, of which nine were considered serious, two were moderate and three were minor. The causative relationship between these single case reports and the manipulation received by the children is unclear, since retrospective case controls were not used. One of the children in whom a serious adverse event was reported was a neonate and none were reported as receiving treatment for colic. The exact nature of the manipulation used in the 14 case reports was unclear; and Vohra 2007 concluded that "serious adverse events may be associated with paediatric spinal manipulation", although "neither causation nor incidence rates can be inferred…"

Adverse effects were one of our outcomes but we did not conduct a separate search for these outside of our core search strategy for RCTs that compared manipulative therapy with no treatment, placebo/sham treatment or standard care. Most trials we found did not measure adverse effects and therefore we may consider conducting an additional search specifically for any adverse consequences from the use of manipulative therapies when this review is updated.

How the intervention might work

There are several theories for how manipulative therapies might work to relieve infantile colic. Many are based on the belief that the birth process can cause extreme pressures to be exerted on the infant's head, leading to cranial moulding, or that poor positioning in the uterus can create cervical dysfunction, such as poor vertebral alignment if the head is maintained at an angle during pregnancy. These may be uncomfortable for the infant, and may contribute to colic symptoms (Craniosacral Therapy Association of the UK 2010; Sutherland Society 2010). Once such biomechanical problems are resolved, the discomfort is relieved and the symptoms of crying abate.

Other proposed therapeutic mechanisms suggest somatovisceral or spino‐craniovisceral reflex involvement (Hipperson 2004; Biedermann 2005).

However, there is little research evidence to support such arguments so the mechanisms of action of manipulative therapies remain unsubstantiated.

It might also be that different techniques have different mechanisms of action. For example, an adjustment to the upper cervical vertebrae may affect the vagus nerve, whereas re‐alignment of the cranial bones or soft tissue release in the occipital area may relieve sensations of stiffness or soreness.

The number and frequency of treatments required for resolution of the problem is also in question. There is some evidence that chiropractic does have a dose response in other disorders (for example, see Haas 2004), although it is not known whether this is the case with infantile colic.

Why it is important to do this review

There have been a number of systematic reviews published in recent years, either focusing on chiropractic interventions for colic specifically (Husereau 2003; Ernst 2009) or on manual therapies for non‐musculoskeletal conditions (Ernst 2003; Hawk 2007; Gotlib 2008), but only one (for massage) (Underdown 2006) has been undertaken to the exacting standards of The Cochrane Collaboration or with a meta‐analysis in mind. Carrying out a systematic review on the literature on all the manipulative therapies that use very similar techniques (chiropractic, osteopathy, cranial‐ and spinal‐manipulative therapy) will help to bridge this gap and to create the basis for the methodologically informed, clinically relevant and rigorous incorporation of future studies.

Objectives

To examine the effectiveness of manipulative therapies (specifically, chiropractic, osteopathy and cranial manipulation) for infants less than six months of age with infantile colic.

Methods

Criteria for considering studies for this review

Types of studies

Randomised controlled trials.

Types of participants

Infants younger than six months of age (at entry to study) who were assessed by clinicians as suffering from colic, defined as 'crying excessively'. As there is no consensus on the criteria for crying excessively, we accepted all definitions of excessive unexplained crying for inclusion in this review. Studies of infants with crying of normal duration or where the aetiology of illness causing the crying had been identified were excluded.

Types of interventions

We included active interventions consisting of the manipulative therapies of chiropractic, osteopathy, cranial osteopathy, craniosacral therapy and cranial manipulation, compared with no treatment, placebo/sham, standard care or waiting list control. Interventions could be applied either on their own or as an adjunct to conventional treatments (for example, counselling/advice and prescription medication), provided that the same adjunct treatment applied to all participants in the study.

Types of outcome measures

In the 'Summary of findings' table, we present results for our primary outcomes.

Primary outcomes

  1. Change in hours crying time per day (post‐treatment versus baseline).

  2. Presence/absence of colic after treatment or at later follow‐up, or both, that is, the number of infants in which excessive crying resolved (using the definition of those conducting the trial).

  3. Any reported adverse outcomes, for example, injury, stroke, arterial dissection, worsening of symptoms.

Secondary outcomes

  1. Changes in frequency of crying bouts (number of crying episodes per day) (post‐treatment versus baseline).

  2. Measures of parental or family quality of life.

  3. Measures of parental stress, anxiety or depression.

  4. Sleeping time, that is, change in duration of peaceful sleeping (post‐treatment versus baseline).

  5. Parental satisfaction.

Timing of outcome assessment

We planned to evaluate outcomes:

  • at the completion of any treatment protocol (that is, any period, any number of treatments), and

  • at a later follow‐up, where data existed and follow‐up periods were sufficiently homogenous.

We considered the different timings separately for each outcome, that is, we included studies that reported an outcome at both time points in both the 'post‐treatment' analysis and the 'follow‐up' analysis.

Search methods for identification of studies

The Trials Search Coordinator for the Cochrane Developmental, Psychosocial and Learning Problems Group ran the first searches in July 2011 and updated the searches in April 2012. Duplicate records were identified in EndNote and eliminated. No date or language restrictions were imposed.

Electronic searches

We searched the following databases.

  • The Cochrane Central Register of Controlled Trials (CENTRAL) 2012, Issue 4 of 12. Last searched 30 April 2012

  • Ovid MEDLINE(R), 1948 to April Week 3 2012. Last searched 30 April 2012

  • EMBASE (Ovid), 1980 to 2012 Week 17. Last searched 30 April 2012

  • CINAHL Plus (EBSCOhost), 1939 to current. Last searched 1 May 2012

  • PsycINFO (Ovid), 1806 to April Week 3 2012. Last searched 1 May 2012

  • PsycINFO (EBSCOhost), 1806 to current. Last searched 19 July 2011

  • LILACS: Latin American and Caribbean Health Sciences Literature, all available years. Last searched 1 May 2012

  • PEDro: Physiotherapy Evidence Database, all available years. Last searched 1 May 2012

  • Science Citation Index, 1970 to current. Last searched 1 May 2012

  • Social Science Citation Index, 1970 to current. Last searched 1 May 2012

  • Conference Proceedings Citation Index ‐ Science, 1990 to current. Last searched 1 May 2012

  • Conference Proceedings Citation Index ‐ Social Science & Humanities, 1990 to current. Last searched 1 May 2012

  • ZETOC (limited to conference proceedings), all available years. Last searched 1 May 2012

  • WorldCat (limited to theses), all available years. Last searched 1 May 2012

  • ClinicalTrials.gov, all available years. Last searched 1 May 2012

  • National Research Register Archive (UK). Last searched December 2010

  • Center Watch Clinical Trials Listing Service (USA). Last searched December 2010

  • UKCRN Portfolio Database. Last searched December 2010

We devised a search strategy for each database by adapting a MEDLINE search: lines 1‐11 (adapted from Lucassen 2003) to retrieve studies about infantile colic; lines 12‐23 (adapted and extended from Proctor 2006) to retrieve studies about relevant manual therapies, and an RCT filter (lines 24‐32), which is the Cochrane highly sensitive search strategy for identifying randomised trials in MEDLINE (sensitivity maximising version) from the Cochrane Handbook for Systematic Reviews of Interventions (Lefebvre 2008). The search histories are reported in Appendix 1.

Searching other resources

A search for citations included in previous reviews and systematic reviews (Lucassen 1998; Garrison 2000; Hughes 2002; Ernst 2003; Husereau 2003; Brand 2005; Hawk 2007; Gotlib 2008; Ernst 2009; Bronfort 2010; Lucassen 2010; Dobson 2011; Alcantara 2011) yielded eight citations, all but one of which had been identified through the electronic searches. A search through bibliographies of articles identified through the search strategy identified one additional citation.

In order to minimise the potential for publication bias, and so avoid reporting a spurious beneficial intervention effect or missing an important adverse effect (as can happen where the results of negative trials are not submitted for publication), we extended the search to the grey literature. This included contacting as many institutions and fellow researchers in the chiropractic and osteopathic world as possible. These were identified in the following manner in November and December 2010:

  • chiropractic colleges listed by the Council on Chiropractic Education (US) (15 institutions) (CCE 2010);

  • chiropractic colleges listed on the webpages of the Association of Chiropractic Colleges (Canada, US, New Zealand) (19 institutions) (ACC 2010);

  • chiropractic colleges listed on the webpages of the European Council of Chiropractic Education (six institutions) (European Council on Chiropractic Education 2010);

  • the three chiropractic education councils themselves (three institutions);

  • chiropractic colleges listed on Wikipedia (31 institutions) (in Europe, South America, North America) (Wikipedia 2010);

  • osteopathic colleges listed by the General Osteopathic Council (GOsC) (nine institutions) (General Osteopathic Council 2010b);

  • osteopathic colleges listed on Wikipedia (21 institutions) (Wikipedia 2010b);

  • osteopathic colleges listed on the American Association of Colleges of Osteopathic Medicine website (30 institutions) (AACOM);

  • all of the members of the Osteopathic International Alliance (54 institutions) (Osteopathic International Alliance 2010);

  • all craniosacral courses identified via Google searches, and the Sutherland Society (four institutions).

We also undertook searches of Google Scholar using the search terms identified above.

After removing duplicates and following up forward recommendations, we identified some 91 osteopathy‐related institutions and 45 chiropractic‐related institutions, and attempted to contact all of them.

Data collection and analysis

Methodological decisions planned in the protocol but not used in this version of the review are summarised in Table 1.

Table 1. Methods not used in this version of the review

Section

Methodological aspect omitted

Types of studies   

We found no cluster‐randomised or cross‐over trials.

Types of interventions  

We found no craniosacral therapy or cranial manipulation trials, and no trials that used waiting list controls.

Types of outcome measures  

We found no studies reporting frequency of crying bouts, measures of parental or family quality of life, measures of stress, anxiety or depression or parental satisfaction.

Timing of outcome assessment

We did not find studies that adequately reported data from any later follow‐up.

Measures of treatment effect  

We had planned to use standardised mean difference if authors had used different measures for the same outcome. However, all studies used the same outcome measures.

Unit of analysis issues  

If cluster‐randomised trials had been included, we had planned to use the intraclass correlation coefficient (ICC) to convert trials to their effective sample size before incorporating them into the meta‐analysis, as recommended in the Cochrane Handbook for Systematic Reviews of Interventions (Higgins 2008b).

If studies containing three or more intervention arms had been included, we planned to incorporate appropriate pair‐wise comparisons, providing there was no evidence of bias, such as the authors introducing the additional groups after seeing the data (that is, the groups were determined a priori in the protocol) or of selective reporting (that is, data for all cohorts were reported). However, no eligible studies were identified that had this design.

Dealing with missing data  

Data from the studies were generally presented on an available case basis. All studies reported analysis of participants based on the group to which they were allocated, with none reporting participants who did not receive the allocated intervention, so we used no default data points or outcomes for participants who dropped out of the study.

Assessment of reporting biases  

We had planned to use funnel plots to investigate any relationship between effect estimates and study size/precision, had we found more than 10 studies (Sterne 2008). However, the number of studies was too small to warrant such analysis.

Data synthesis  

We had planned to undertake a meta‐analysis of all manipulative therapies together and then to group and analyse by subgroups based on common study characteristics if there were sufficient studies. However, with only five studies included in the main meta‐analysis, we determined that there were too few studies on which to do this.

Data synthesis

We had planned to include 'adjusted' estimates of treatment effect that included the baseline outcome measurements as a covariate in a regression analysis (ANCOVA). However, only one of the studies (Miller 2010) included any regression analysis and this study included other factors (age and gender), alongside baseline outcome, as covariates. We therefore elected to include the raw scores from this study.

Data synthesis

We had planned to use the standardised mean difference approach had we found studies that measured the same outcome using different scales, but this was not required.

Data synthesis

We had planned to use risk ratios to report dichotomous outcomes. However, we used odds ratios of significant improvement in the analysis, since we considered these more appropriate for reporting improvements.

Subgroup analysis and investigation of heterogeneity  

We had planned to investigate any significant levels of heterogeneity, where there were sufficient observations (at least 10 studies for each characteristic modelled (Deeks 2011)), using the following subgroups:

  • type of intervention (different techniques may impact the outcomes). Our main outcome (change in daily hours of crying) had data from three chiropractic and two osteopathic studies.

  • treatment dose (total number of treatments, number of treatments per week, or overall duration of treatment protocol). Our main outcome had data from two studies with a four‐week intervention period (both osteopathic), and three studies with an 8‐ to 15‐day intervention period (all chiropractic).

  • mean age of the participants at onset of colic (earlier onset may imply greater severity of symptoms). This was reported in only one of the studies.

As there were no outcomes with 10 or more studies, we elected not to do a subgroup analysis.

Selection of studies

After removal of obvious duplicates by one review author (DD), each of the abstracts was independently assessed by two of the five review authors. Workload was allocated so that no paper was reviewed by its own authors. Any citation deemed potentially relevant by at least one author was retrieved in full text and, again, independently assessed by the same two review authors. Any disagreements were resolved through discussion or, if required, by consultation with the remaining team.

We translated one study that was written in German.

Data extraction and management

A data extraction form was designed specifically for the purposes of this review and piloted prior to use. For each study, the same two review authors extracted key characteristics and outcomes into the form and identified where any additional information or clarification was required. We made attempts to contact relevant authors.

Any differences in extracted data were discussed between review authors and resolved. Differences occurred for approximately 17% of the main study data points (21 of 120) and of the risk of bias assessments (8 of 48). The main causes of misalignment were missed information when reading the article and lack of familiarity with the Cochrane guidance for evaluating risks of bias.

DD entered data into the Review Manager 5.1 software (RevMan) (RevMan 2011) and the other authors checked the accuracy of the data for the studies they were most familiar with (those for which they had done the data extraction). For one outcome (presence/absence of colic) that had been omitted from the proforma, DD extracted the data and entered it into RevMan, and the other authors checked this for accuracy.

Assessment of risk of bias in included studies

Each study was independently evaluated for risk of bias by the same two review authors using the criteria recommended in the Cochrane Handbook for Systematic Reviews of Interventions (Higgins 2008). Review authors did not assess the risk of bias of any trial they had been involved in.

We rated risk of bias as low, high or unclear across the following domains: (1) sequence generation, (2) allocation concealment, (3) parental blinding, (4) blinding of outcome assessors, (5) incomplete outcome data, (6) selective outcome reporting and (7) other bias.

In the case of differently scored items, the two review authors tried to reach agreement by discussion. Any remaining disagreements were resolved by discussion with the rest of the team. Where the risk of bias was unclear from published information, we attempted to contact study authors for clarification.

DD entered these judgements into a 'Risk of bias' table in RevMan with a brief rationale for each judgement, which was validated by the other review authors responsible for the original extraction.

In order to minimise publication bias, we attempted to obtain the results of any unpublished studies.

Measures of treatment effect

Where outcomes were reported as dichotomous variables, we present the results as odds ratios (OR) with 95% confidence intervals (CI).
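
As an illustration only, the calculation of an OR with a 95% CI from a two‐by‐two table can be sketched as follows. The function name and the example counts are hypothetical and are not taken from any included study; the sketch assumes the standard log‐scale (Woolf) interval and a 0.5 continuity correction when a cell is zero, which may differ from the settings used in RevMan.

```python
import math


def odds_ratio_ci(events_trt, n_trt, events_ctl, n_ctl, z=1.96):
    """Return (OR, lower, upper) for a 2x2 table using a log-scale Wald interval.

    Hypothetical helper for illustration; z=1.96 gives an approximate 95% CI.
    """
    a = events_trt              # improved, intervention group
    b = n_trt - events_trt      # not improved, intervention group
    c = events_ctl              # improved, control group
    d = n_ctl - events_ctl      # not improved, control group
    if min(a, b, c, d) == 0:
        # Assumed convention: add 0.5 to every cell when any cell is empty.
        a, b, c, d = a + 0.5, b + 0.5, c + 0.5, d + 0.5
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Made-up counts: 20 of 30 improved with treatment, 10 of 30 with control.
or_, lo, hi = odds_ratio_ci(20, 30, 10, 30)
print(f"OR = {or_:.2f} (95% CI {lo:.2f} to {hi:.2f})")
```

An interval that excludes 1 would indicate a difference between groups at the conventional 5% level.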

Where outcomes were reported as continuous variables, we compared the mean differences (MD) of change scores. The standard deviations of the change scores were reported in all but one study (Olafsdottir 2001), for which we were able to impute the standard deviation of change scores based on the correlation coefficient of other, similar, studies.
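
A minimal sketch of this kind of imputation is given below. It assumes the usual relationship between the standard deviations at baseline and follow‐up, their correlation, and the standard deviation of the change score; the function names and the numbers in the example are hypothetical, not the values used in this review.

```python
import math


def corr_from_sds(sd_baseline, sd_final, sd_change):
    """Correlation implied by a study that reports all three standard deviations."""
    return (sd_baseline**2 + sd_final**2 - sd_change**2) / (2 * sd_baseline * sd_final)


def impute_sd_change(sd_baseline, sd_final, corr):
    """Standard deviation of change scores, given a borrowed correlation."""
    if not -1.0 <= corr <= 1.0:
        raise ValueError("correlation must lie between -1 and 1")
    variance = sd_baseline**2 + sd_final**2 - 2 * corr * sd_baseline * sd_final
    return math.sqrt(max(variance, 0.0))


# Step 1: derive a correlation from a similar study that reports everything
# (made-up values).
r = corr_from_sds(sd_baseline=1.0, sd_final=1.0, sd_change=1.0)

# Step 2: apply it to the study that lacks a change-score SD (made-up values).
print(round(impute_sd_change(sd_baseline=1.2, sd_final=0.9, corr=r), 3))
```

Because the imputed value depends on the borrowed correlation, it is sensible to check how the result changes across a plausible range of correlations.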

We had planned to use standardised MD if authors had used different measures for the same outcome.

Unit of analysis issues

For each included study, we determined whether the unit of analysis was appropriate for the unit of randomisation and the design of each study (that is, whether the number of observations matched the number of 'units' that were randomised ‐ Deeks 2011).

Dealing with missing data

Where data were missing, we sought additional data, and information on the reasons for the missing data, from study authors (Olafsdottir 2001; Heber 2003; Miller 2010), although we were not always successful (Olafsdottir 2001). Missing data and, where known, the reasons, are described. We assessed the likely risk of bias and impact of missing data and drop‐outs for each study.

It is to be expected that, with a disorder that causes such distress to parents, there will be a certain impatience with any interventions. We therefore expected fairly high drop‐out rates, with parents likely to withdraw their child either:

  1. if they see no improvement, or a worsening of symptoms ‐ in order to try a different solution to the problem; or

  2. if the child recovers completely ‐ in order that they can resume normal family life as quickly as possible.

Assessment of heterogeneity

We assessed statistical heterogeneity for each meta‐analysis using the Chi² test from the analysis in RevMan (RevMan 2011), according to the current recommendations in the Cochrane Handbook for Systematic Reviews of Interventions, using a P value of 0.10 for statistical significance (Deeks 2011). To assess the impact of any heterogeneity on the meta‐analysis, we also calculated the I² statistic (Higgins 2002; Deeks 2011).
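
For readers unfamiliar with these statistics, the following sketch shows how the heterogeneity statistic (Cochran's Q, which is referred to a Chi² distribution) and I² can be computed from study effect estimates and their standard errors. It is a simplified illustration with made‐up inputs and a hypothetical function name, not a reproduction of the RevMan calculation; the P value is not computed because that would need a statistics library.

```python
def cochran_q_i2(effects, std_errors):
    """Return (Q, degrees of freedom, I2 as a percentage).

    Uses inverse-variance weights, as in a fixed-effect analysis.
    """
    weights = [1.0 / se**2 for se in std_errors]
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    # I2 is the share of variability beyond chance, truncated at zero.
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return q, df, i2


# Made-up mean differences (hours of crying per day) and standard errors.
q, df, i2 = cochran_q_i2([-1.0, -0.5, -2.0], [0.4, 0.5, 0.6])
print(f"Q = {q:.2f} on {df} df, I2 = {i2:.0f}%")
```

Q would then be compared with a Chi² distribution on the stated degrees of freedom, using the 0.10 threshold described above.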

Assessment of reporting biases

See Table 1.

Data synthesis

We conducted the data synthesis using RevMan 5.1 software (RevMan 2011).

We used random‐effects methods as described in the Cochrane Handbook for Systematic Reviews of Interventions (Deeks 2011), and as outlined in the protocol, because we could not be sure that the studies were estimating the same underlying treatment effect (for example, where the trials have different interventions), nor could we be sure that the trials' populations or methods would be sufficiently similar. This method also has the advantage of providing more conservative estimates. Since this was the method of analysis, we present the results as the average treatment effect with a 95% CI and estimates of the Tau² and I² statistics.
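
The sketch below illustrates one common random‐effects approach, the DerSimonian and Laird method, under the assumption that each study contributes an effect estimate and a standard error. The function name and input values are hypothetical, and the code is intended only to show how Tau² enters the study weights, not to reproduce the analysis in this review.

```python
import math


def dersimonian_laird(effects, std_errors, z=1.96):
    """Return (pooled effect, CI lower, CI upper, tau2) for a random-effects model."""
    k = len(effects)
    w = [1.0 / se**2 for se in std_errors]            # fixed-effect weights
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    c = sum_w - sum(wi**2 for wi in w) / sum_w
    # Method-of-moments estimate of between-study variance, truncated at zero.
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0
    w_star = [1.0 / (se**2 + tau2) for se in std_errors]  # random-effects weights
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    se_pooled = math.sqrt(1.0 / sum(w_star))
    return pooled, pooled - z * se_pooled, pooled + z * se_pooled, tau2


# Made-up mean differences and standard errors for three studies.
est, lo, hi, tau2 = dersimonian_laird([-1.0, -0.5, -2.0], [0.4, 0.5, 0.6])
print(f"MD = {est:.2f} (95% CI {lo:.2f} to {hi:.2f}), Tau2 = {tau2:.2f}")
```

When Tau² is greater than zero, the weights become more equal across studies and the CI widens, which is why the random‐effects estimate is usually the more conservative one.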

For dichotomous outcomes, we performed a meta‐analysis of ORs and calculated the number needed to treat for an additional beneficial outcome, since these (in combination) provide good consistency, mathematical properties and ease of interpretation (Deeks 2011). We based the assumed control group risks on an assessment of typical risks, calculated from the control groups.
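
As a hedged illustration of how a pooled OR and an assumed control group risk combine to give a number needed to treat for an additional beneficial outcome (NNTB), consider the following sketch. The function names and the input values are hypothetical, and the outcome is assumed to be a beneficial one (such as recovery), so that a higher risk in the intervention group represents benefit.

```python
import math


def risk_with_intervention(odds_ratio, control_risk):
    """Convert an odds ratio and an assumed control risk into the intervention-group risk."""
    return (odds_ratio * control_risk) / (1.0 - control_risk + odds_ratio * control_risk)


def nntb(odds_ratio, control_risk):
    """Number needed to treat for one additional beneficial outcome (unrounded)."""
    difference = risk_with_intervention(odds_ratio, control_risk) - control_risk
    if difference <= 0:
        raise ValueError("no benefit at this odds ratio and control risk")
    return 1.0 / difference


# Made-up values: pooled OR of 4.0 and a control-group recovery risk of 20%.
value = nntb(4.0, 0.20)
print(f"NNTB = {value:.2f}, conventionally rounded up to {math.ceil(value)}")
```

Because the NNTB depends on the assumed control group risk, the same OR gives different values in populations with different baseline risks.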

For the main primary outcome, the data were imported from RevMan into GRADEPro and the quality of evidence was adjusted based on the assessment of risk of bias appropriate to the outcome and in accordance with the Cochrane Handbook (Schunemann 2008) (see also Quality of the evidence). The resulting Summary of Findings table was then imported into RevMan.

Subgroup analysis and investigation of heterogeneity

Our plans for investigating heterogeneity are summarised in Table 1. Given the paucity of studies, we were able only to inspect the CIs of the studies visually, seeking non‐overlapping CIs as indicators of potentially significant differences in the treatment effects between the subgroups.

Sensitivity analysis

We planned to conduct sensitivity analyses for each outcome to determine whether findings were sensitive to restricting inclusion to studies judged to be at low risk of bias. In these analyses we re‐evaluated the findings, limiting the inclusion to those studies that:

  • had a low risk of selection bias (associated with sequence generation or allocation concealment);

  • had a low risk of performance bias (associated with issues of blinding);

  • had a low risk of attrition bias (associated with completeness of data);

  • were published (peer reviewed).

It was only possible to do this analysis for one of our pre‐specified outcomes: change in daily hours of crying.

Results

Description of studies

In this review, we included six trials involving 325 infants (see Characteristics of included studies). We identified three ongoing studies (see Characteristics of ongoing studies) and found references to three further studies for which we have been unable to source further information (see Characteristics of studies awaiting classification).

Results of the search

Searches of electronic databases were initially carried out in July 2011 and yielded 227 records. A further search in May 2012 yielded a further 12 records, of which two were related to a study that had already been identified.

A search for citations included in previous reviews and systematic reviews (Lucassen 1998; Garrison 2000; Hughes 2002; Ernst 2003; Husereau 2003; Brand 2005; Hawk 2007; Gotlib 2008; Ernst 2009; Bronfort 2010; Lucassen 2010; Dobson 2011; Alcantara 2011) yielded eight citations, all but one of which had been identified through the electronic searches. A search through bibliographies of articles identified through the search strategy identified one additional citation.

The search for studies known to chiropractic and osteopathic institutions yielded an additional nine potential studies and articles, including two in progress (Friis 2009; Stangl), and one completed but not analysed (Mills 2010).

After removal of obvious duplicates, there were a total of 216 records that were allocated to two review authors for initial review. Of these, both review authors agreed to exclude the record from further analysis in 133 cases, leaving 83 records for which we attempted to obtain full‐text versions. Seventy‐one of these were excluded on further investigation (eight that readers might expect to have been included are listed in the Characteristics of excluded studies table with the reasons, which were primarily study design). Of the remaining 12 studies, six were confirmed for inclusion, three were ongoing (see Characteristics of ongoing studies) and full texts could not be sourced for three (see Studies awaiting classification).

In April 2012 we exchanged emails with the authors and institutions of those studies that had been identified as 'ongoing', but no further study results were available.

The original citations therefore resulted in six completed studies that were included in a meta‐analysis and three studies that were ongoing.

See Figure 1 for the study flow diagram.

Included studies

In all, six studies met the criteria for inclusion in this review (see Characteristics of included studies).

Three were undertaken and published as undergraduate or graduate theses (Mercer 1999; Heber 2003; Hayden 2006). Hayden's was also published in a peer‐reviewed journal and Mercer's was published as a conference abstract.

Wiberg 1999, Olafsdottir 2001 and Miller 2010 were published in peer‐reviewed journals.

The studies were conducted between 1999 (Wiberg 1999) and 2010 (Miller 2010), and there were some important differences between them (summarised below).

Outcomes

All studies reported at least one of the outcomes that were pre‐defined in the protocol.

  • Change in hours crying time per day. In all, five studies (Wiberg 1999; Olafsdottir 2001; Heber 2003; Hayden 2006; Miller 2010) measured changes in daily hours of crying. All of these were based on 'crying diaries' completed contemporaneously by parents, except one (Heber 2003), which asked parents to estimate daily crying hours retrospectively. One study (Miller 2010) also presented this as a dichotomous outcome "clinically‐significant reduction in crying hours per day", defined as improvement to below two hours of crying per day and as a greater than 30% improvement, based on data extracted from the crying diary.

  • Presence/absence of colic after treatment or at later follow‐up, or both. Two studies reported on 'full recovery' based on the highest response in a five‐point Likert scale completed by parents ('completely recovered', Mercer 1999; 'completely well', Olafsdottir 2001), and this information was made available for a third study (Miller 2010).

  • Adverse outcomes. One study (Miller 2010) reported adverse events (none encountered), based on questionnaires administered during the study.

  • Change in duration of peaceful sleeping was reported by one study (Hayden 2006), based on measurements taken from the 'crying diary'.

Four of our outcomes (all secondary) were not reported by any of the included studies:

  • changes in frequency of crying bouts;

  • measures of parental or family quality of life;

  • measures of parental stress, anxiety or depression;

  • parental satisfaction.

Additional outcomes reported by the studies:

  • one study (Heber 2003) reported mean change in crying intensity, based on a 10‐point Likert scale administered at the start (week 1) and completion (week 5) of the study;

  • one study (Hayden 2006) reported the daily hours of rocking that parents used to quieten their child (assumed to represent a low level of colic), based on data extracted from the crying diary;

  • three studies (Mercer 1999; Olafsdottir 2001; Miller 2010) evaluated significant improvements in colic symptoms reported by parents using the top two responses (combined) for global improvement based on a five‐point Likert scale (i.e. 'completely recovered' and 'somewhat better' (Mercer 1999); 'completely well' and 'marked improvement' (Olafsdottir 2001); 'much improvement' and 'moderate improvement' (Miller 2010)).

All studies reported fairly high drop‐out rates and therefore performed available case analyses for daily hours of crying with no imputed data points. One study (Olafsdottir 2001) reported that an intention‐to‐treat analysis made no difference to the outcomes, but did not report the results of this analysis.

Interventions

Four of the studies applied chiropractic interventions (Mercer 1999; Wiberg 1999; Olafsdottir 2001; Miller 2010) and two used osteopathy or cranial osteopathy (Heber 2003; Hayden 2006). Details of the interventions were not clear in all cases. Miller 2010 treated the occiput and spine; Wiberg 1999 treated the spine and pelvis; Mercer 1999 treated the spine (with most adjustments to cervical and thoracic regions); Hayden 2006 treated the "head and other areas"; Heber 2003 treated parietal, visceral and craniosacral. Olafsdottir 2001 did not specify what adjustments were used.

Overall, studies reported fair compliance with the study protocol, but there was a fairly high loss either to discharge or withdrawal.

Duration and frequency of treatments

The treatment regimens varied between the studies:

  • two studies used five visits, spaced weekly over a period of four weeks, although one (Hayden 2006) commenced treatment at the first visit (and hence had no pre‐treatment baseline) and the other (Heber 2003) commenced treatment at the second visit;

  • two studies used a two‐week duration ‐ one of which (Wiberg 1999) did three to five treatments in that time and the other (Mercer 1999) did two to three treatments each week, with a follow‐up one month later;

  • one study (Miller 2010) had a 10‐day duration with treatment 'as needed' and early discharge where appropriate;

  • one study (Olafsdottir 2001) delivered three treatments over an eight‐day period, at two‐ to five‐day intervals.

Participants

  • The number of participants initially randomised varied between 28 and 100 (28 in Hayden 2006; 32 in Mercer 1999; 46 in Heber 2003; 50 in Wiberg 1999; 102 in Miller 2010 ‐ although only two of these groups are included in this analysis, representing 69 infants; and 100 in Olafsdottir 2001). In most studies, participants were excluded from the analysis because of missing data towards the end of the study period. Overall, 325 participants were randomised, of whom 292 completed the protocols. The outcome with the largest number available for the meta‐analysis was improvement in daily hours of crying, which included 231 participants.

  • The age of infants admitted to the studies varied from birth to 12 weeks of age, although different studies variously used a lower limit of: zero (Mercer 1999); one week (Hayden 2006; Miller 2010); two weeks (Wiberg 1999; Heber 2003); three weeks (Olafsdottir 2001); and upper age limits of eight weeks (Mercer 1999; Miller 2010); nine weeks (Olafsdottir 2001); 10 weeks (Wiberg 1999) or 12 weeks/three months (Heber 2003; Hayden 2006). This is an important consideration in colic, where authors generally concur that the mean onset is 1.8 weeks and symptoms of colic usually remit by three months (Paradise 1966).

  • Four studies reported an average age at entry to the study: Wiberg 1999 reported 5.9 weeks (control) and 4.9 weeks (experimental); Mercer 1999 reported a six‐week average of both groups; Hayden 2006 reported 6.3 weeks (control) and 6.6 weeks (experimental); and Miller 2010 reported 5.25 weeks (control) and 4.9 weeks (experimental).

  • Age at onset of colic symptoms was reported in only one study (2.2 weeks in Wiberg 1999). This is important in colic studies, as some authors suggest that earlier onset is linked to more severe symptoms (Stahlberg 1984).

  • Gender percentages varied (where reported) between 34% and 93% male (mean 56%, median 47%). For those studies that reported sex of the infant for both control and experimental groups, the mean was 61% male in control and 53% in experimental groups (Table 2).

    Table 2. Male/female ratio in studies

    Study              % male in control group   % male in treatment group
    Heber 2003         53%                       53%
    Hayden 2006        64%                       93%
    Olafsdottir 2001   67%                       43%
    Wiberg 1999        45%                       67%
    Mercer 1999        47%                       67%
    Miller 2010        68%                       34%

  • Mean gestational age (where reported) varied between 39 and 40 weeks. Hayden 2006 reported 39.2 weeks (control) and 39.6 weeks (experimental); Wiberg 1999 reported 39.6 weeks (control) and 40 weeks (experimental); Miller 2010 reported 39.0 weeks (control) and 39.3 weeks (experimental).

  • Other demographics and participant characteristics were inconsistently reported.

Settings

Studies were conducted in Europe and South Africa in:

  • private osteopathic clinics (Heber 2003; Hayden 2006) ‐ one trial in Gloucester, UK; one trial split between two clinics, in Vienna (Austria) and in Wolfsburg (Germany);

  • private chiropractic clinics (Wiberg 1999; Olafsdottir 2001) ‐ one in Bergen, Norway, one in Copenhagen, Denmark;

  • teaching chiropractic clinics (Mercer 1999; Miller 2010) ‐ one in Technikon Natal, South Africa; one in Bournemouth, UK.

Definition of colic

The generally accepted definition of colic within the broader literature is of inconsolable crying for more than three hours per day for more than three days per week over a three‐week period or longer. This was used by two studies (Olafsdottir 2001; Heber 2003). As is common in colic studies, the three‐week time frame was omitted by the other authors, and various interpretations were applied. Mercer 1999 had an unclear definition of colic. Other authors used the following definitions.

  • Crying for more than three hours, on at least three days per week for at least three weeks (Olafsdottir 2001; Heber 2003), although it is unclear how this reconciles with the age range of infants that included a two‐week‐old (Heber 2003).

  • Ninety minutes of inconsolable crying per 24‐hour period in five of the previous seven days, with normal behaviour outside these periods (Hayden 2006).

  • Colic diagnosis "based on mothers' diagnosis of excessive crying" and verified by baseline crying diary. Note: additional criteria were applied to separate infants diagnosed with ISMO (irritable infant syndrome of musculoskeletal origin), IFCIDS (infant cry‐irritability with sleep disorder syndrome) and KISS (kinetic imbalance due to suboccipital strain) (Miller 2010).

  • At least one violent, inconsolable, crying spell (including motor unrest) lasting for more than three hours on at least five of the seven days of the baseline week (Wiberg 1999).

Study design

All studies were randomised trials. Five out of six studies were of a two‐group design and the other study (Miller 2010) incorporated a third group to evaluate the effects of blinding. This third group was ignored for the purposes of the meta‐analysis.

Randomisation was done by: random number table (Hayden 2006), pre‐populated randomisation list (Heber 2003), permutated groups (of four in Mercer 1999 or 18 in Miller 2010), sealed envelopes (Olafsdottir 2001) or by the drawing of a ticket (Wiberg 1999).

Control conditions

The control condition for three of the studies was 'no treatment' (Olafsdottir 2001; Hayden 2006; Miller 2010). Of the other three studies, one used a sham treatment (non‐functional de‐tuned ultrasound machine ‐ Mercer 1999), one used a dimethicone medication as a placebo (since it has been shown to be no better than placebo ‐ Wiberg 1999), and one used a control that was not clearly specified, described only as 'conventional medical intervention' (Heber 2003).

Excluded studies

We excluded seven studies because they were not randomised (Koonin 2002; Gludovatz 2003; Karpelowski 2004; Mills 2010) or they were uncontrolled (Denckens 1996; Davies 2007). One study (Browning 2008) was a comparison of two potentially active treatments. See Characteristics of excluded studies.

Risk of bias in included studies

Details of the 'Risk of bias' assessments are shown in the Characteristics of included studies table and Figure 2.


Risk of bias summary: review authors' judgements about each risk of bias item for each included study


Allocation

Most of the studies used a simple but acceptable method of randomisation ‐ typically based on random number tables (Heber 2003; Hayden 2006), the blinded drawing of a ticket (Wiberg 1999) or using permutated groups (Mercer 1999; Miller 2010). One study (Olafsdottir 2001) did not specify the method used.

Mechanisms for concealing allocation were sealed opaque envelopes (Olafsdottir 2001; Miller 2010) or allocation by an independent person (Heber 2003). Three studies did not describe the method used (Mercer 1999; Wiberg 1999; Hayden 2006).

We assessed all studies as either low or unclear risk of bias for both random sequence generation and allocation concealment.

Blinding

Blinding of participants and personnel

Parental blinding was attempted in two of the studies (Olafsdottir 2001; Miller 2010) and, although an intent to blind parents was implied in another (Mercer 1999), it was not entirely clear that was the case. Three studies made no attempt to blind parents so were judged to be at high risk of bias for this domain: Wiberg 1999; Heber 2003; Hayden 2006.

It is not, of course, possible to blind the practitioner in a trial of any manual (or surgical) therapy, which creates a clear theoretical risk of bias resulting from a difference in the credibility with which the real and control interventions are delivered. If present, such bias may overstate the benefits of the real intervention, while potentially under‐stating any improvements in the control group. The effect size of such risk of bias in chiropractic and osteopathy is not known, although the review authors are aware of studies in acupuncture where patient‐reported treatment credibility has been studied in some depth and related to outcome (for example, White 2012 who concluded "The ANCOVA [analysis of covariance] showed that treatment credibility … scores had no effect on outcome, implying that equipoise between groups was achieved"). These studies indirectly suggest that the effect of lack of clinician blinding is minimal, if it exists at all.

We considered, therefore, that studies were at low risk of performance bias if there was adequate parental blinding (Olafsdottir 2001; Miller 2010).

This evaluation notwithstanding, the two studies (Olafsdottir 2001; Miller 2010) that achieved low risk of performance bias both attempted to mitigate any effects of the lack of clinician blinding. Miller 2010 created a script to be delivered by the clinicians and Olafsdottir 2001 minimised the contact between parents and clinician by using a nurse as intermediary to remove the infants to a separate room (away from the parents) ‐ both thereby minimising the opportunity for clinician beliefs to be communicated to the parent.

Blinding of outcome assessors

Since parents were filling out the crying diaries, their blinding is considered as part of 'blinding of participants'. We considered outcome assessors, therefore, as those who were interpreting the crying diaries. Blinding of other study personnel (for example, those doing data extraction, or performing the statistical analysis) was reported in three studies (Wiberg 1999; Olafsdottir 2001; Miller 2010) so we rated these studies as being at low risk of bias for this domain. While outcome assessment blinding was not mentioned in the remaining studies, we determined that it was unlikely (given the status of other blinding and the usual limitations of undergraduate theses) and so assessed Mercer 1999, Heber 2003 and Hayden 2006 as at high risk of bias.

Incomplete outcome data

Details of the incomplete outcome data in each study are provided in the Characteristics of included studies.

Incomplete outcome data troubled most of the studies in this analysis. Generally, greater numbers of withdrawals were reported from the control group than from the intervention groups (Wiberg 1999; Olafsdottir 2001; Hayden 2006; Miller 2010). In one study (Heber 2003), drop‐outs were equal between the groups and, in the remaining study (Mercer 1999), it was not clear which groups the drop‐outs were from.

Where data on the reasons for these drop‐outs were given, infants were often removed from the study control group because of worsening symptoms or lack of improvement (Wiberg 1999; Hayden 2006; Miller 2010) or discharged well from the treatment group (Miller 2010).

We judged Wiberg 1999 as at high risk of bias, Heber 2003 as low risk and the other studies as unclear.

Selective reporting

We did not have access to study protocols for four of the studies and therefore evaluated reporting bias as unclear. For the two studies where protocols were available, one (Hayden 2006) contained insufficient details of any intentions concerning outcomes or analysis, and was therefore determined as also being at unclear risk of bias. The other protocol (for Miller 2010) was much more detailed, containing clear intentions on outcome data but not specifying analysis techniques, and we evaluated it as low risk of bias.

Other potential sources of bias

We found indications of other potential sources of bias in only one study (Heber 2003), where, in contrast to the other studies, which all used a 24‐hour crying diary completed contemporaneously by parents, the authors had instructed parents to "informally register a daily value (of crying hours) and to determine an average value at the end of the week". We felt that this could lead to recall bias, although the effect of this on the study outcomes is unclear.

Effects of interventions

See: Summary of findings for the main comparison Manipulative therapy compared to control condition for infant colic

In this review we included six trials involving 325 infants. The results are summarised by primary and secondary outcomes.

Several of the studies reported multiple outcomes as detailed in Table 3.

Table 3. Outcomes reported in each study

Columns: Outcome; Hayden 2006; Heber 2003; Mercer 1999; Miller 2010; Olafsdottir 2001; Wiberg 1999; Figure

Primary outcomes

  • Change in hours crying time per day (Figure 3)

  • Reduction to < 2 hours of crying per day

  • > 30% improvement in daily hours of crying

  • Presence/absence of colic ('full recovery') (Figure 4)

  • Adverse effects

Secondary outcomes

  • Sleeping time

Other outcomes measured

  • Significant global improvement

  • Change in intensity of crying

  • Change in daily hours of rocking and holding

Primary outcomes

Change in hours crying time per day

Five studies (Wiberg 1999; Olafsdottir 2001; Heber 2003; Hayden 2006; Miller 2010) provided data for a total of 231 infants (130 in manipulative therapy groups and 101 in control groups), although this represents around 284 infants initially randomised (147 manipulative therapy, 137 controls). All of the studies reported available case analyses.

Four of the studies reported change‐from‐baseline data. However, one (Olafsdottir 2001) reported mean crying times at baseline and at each visit, but no change scores. Accordingly, although we could calculate the mean change, we had no data for the standard deviations of the change scores, so we imputed these based on the correlation coefficients of other similar studies, as follows.

The imputation of missing standard deviations for changes from baseline should only be done from studies in which the same measurement scale was used, where the same degree of measurement error was present and which had the same time periods (between baseline and final value measurement) (Higgins 2008b).

There were four studies from which we could have imputed correlation coefficients. All used the same measurement scale (hours of crying per day), although the duration of the studies and data recording mechanisms (and hence potential errors) differed as follows:

  • Heber 2003: approximately four weeks' duration, data based on a parental estimation of mean crying hours per day and not from crying diary.

  • Miller 2010: approximately 10 days' duration with data from a detailed crying diary completed contemporaneously by the parent.

  • Wiberg 1999: 12 to 15 days' duration, data from a detailed crying diary completed contemporaneously by the parent.

  • Hayden 2006: four weeks' duration, data from a detailed crying diary completed contemporaneously by the parent.

In selecting which studies would therefore be sufficiently comparable to Olafsdottir 2001 (duration 12 to 15 days, data from a detailed crying diary completed by a parent), we determined that the Wiberg 1999 and Miller 2010 studies were the most appropriate. These yielded correlation coefficients of:

Based on these, a correlation of 0.6 seems reasonable. This translates into standard deviations of 2.6 and 2.7 for the experimental and control groups, respectively, in Olafsdottir 2001. Similar correlations (0.5 (treatment); 0.68 (control)) and standard deviations were also imputed when we included Heber 2003 in the analysis.
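The imputation described above rests on the standard relationship between the standard deviations at baseline and at final measurement, their correlation, and the standard deviation of the change score. The following is a minimal sketch in Python; the function names and the numeric inputs are illustrative assumptions and are not values taken from the included studies.

```python
import math


def correlation_from_sds(sd_baseline, sd_final, sd_change):
    """Back-calculate the baseline/final correlation from a study
    that reports all three standard deviations."""
    return (sd_baseline**2 + sd_final**2 - sd_change**2) / (
        2 * sd_baseline * sd_final
    )


def impute_sd_change(sd_baseline, sd_final, corr):
    """Impute the SD of the change score from the baseline and final SDs
    and an assumed correlation between the two measurements."""
    return math.sqrt(
        sd_baseline**2 + sd_final**2 - 2 * corr * sd_baseline * sd_final
    )


# Hypothetical inputs, for illustration only (not study data)
sd_change = impute_sd_change(sd_baseline=3.0, sd_final=2.5, corr=0.6)
```

A higher assumed correlation gives a smaller imputed standard deviation and therefore more weight in the meta‐analysis, which is why the choice of comparable studies matters.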

The data from the trials (Figure 3; Analysis 1.1) favoured manipulative therapies, indicating a statistically significant reduction of one hour 12 minutes in crying time (MD ‐1.20; 95% CI ‐1.89 to ‐0.51). The test of heterogeneity was not significant (Chi2 = 9.08; P = 0.06; I2 = 56%), the estimate of the between‐study variance (Tau2) was 0.34 and the test for overall effect (Z) was 3.41 (P = 0.0007).
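The heterogeneity and overall‐effect statistics quoted above can be cross‐checked from the reported figures. The following is a minimal sketch, assuming a 95% CI built on a normal approximation (critical value 1.96) and I2 computed from the Chi2 statistic with degrees of freedom equal to the number of studies minus one; the function names are illustrative and are not part of Review Manager.

```python
def i_squared(q, df):
    """I2 (%) from the heterogeneity statistic Q (Chi2) and its degrees of freedom."""
    if q <= 0:
        return 0.0
    return max(0.0, (q - df) / q) * 100.0


def z_from_ci(estimate, lower, upper, z_crit=1.96):
    """Z statistic recovered from a point estimate and its symmetric 95% CI."""
    standard_error = (upper - lower) / (2 * z_crit)
    return estimate / standard_error


# Reported values for the main analysis: five studies, so df = 4
i2_main = i_squared(9.08, 4)            # about 56%
z_main = z_from_ci(-1.20, -1.89, -0.51)  # about -3.41 (sign follows the MD)
```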


Forest plot of comparison: 1 Manipulative therapies versus control, outcome: 1.1 Change in daily hours of crying


Visual inspection of the CIs indicated substantial levels of overlap between the studies, with the exception of Olafsdottir 2001 ‐ indicative of generally low levels of heterogeneity, with one potential outlier (see discussion in Quality of the evidence).

We conducted sensitivity analyses to assess the impact of study risk of bias on the result. The result of the meta‐analysis (see Figure 3) remained robust for the following.

  • Studies with a low risk of selection bias (sequence generation and allocation concealment) (Wiberg 1999; Olafsdottir 2001; Heber 2003; Miller 2010): mean reduction in crying of one hour 14 minutes per day (MD ‐1.24; 95% CI ‐2.16 to ‐0.33). The test for heterogeneity was significant (Chi2 = 8.66; P = 0.03; I2 = 65%). Tau2 was 0.57 and the test for overall effect, Z, was 2.66 (P = 0.008).

  • Studies with a low risk of attrition bias (associated with completeness of data) (Heber 2003 only): reduction in crying of one hour and 57 minutes per day (MD ‐1.95; 95% CI ‐2.96 to ‐0.94); test for overall effect: Z was 3.77 (P = 0.0002).

  • Studies that had been published/peer reviewed (Wiberg 1999; Olafsdottir 2001; Heber 2003; Miller 2010): reduction in crying of one hour (MD ‐1.01; 95% CI ‐1.78 to ‐0.24); the test for heterogeneity was not significant (Chi2 = 6.70; P = 0.08; I2 = 55%). Tau2 was 0.34; test for overall effect: Z was 2.57 (P = 0.01).

However, when we included only those studies with a low risk of performance bias (parental blinding) (Olafsdottir 2001; Miller 2010), the analysis indicated a non‐significant reduction in crying hours of 34 minutes (MD ‐0.57; 95% CI ‐2.24 to 1.09; heterogeneity: Chi2 = 3.99; P = 0.05; I2 = 75%; Tau2 = 1.08; test for overall effect: Z = 0.67; P = 0.50).

One of these studies (Miller 2010) included a third group in order to evaluate the effect of parental blinding on reported improvements in crying hours. They found "no statistically significant differences in the mean change in crying time from baseline at any of the time points between the patients of parents who were and were not blinded to treatment."

Clinically significant reduction in daily hours of crying

The distinction between a statistically significant reduction in daily hours of crying and a clinically significant reduction is important, as it is possible to get a strong statistically significant improvement that makes no meaningful difference to family life.

This subset of the change in crying time outcome was not anticipated in the protocol, but we have reported it here because of its potential importance to this particular disorder.

As far as we are aware, there are no accepted definitions of what would constitute a clinically significant reduction in crying for infantile colic. One study (Miller 2010), which included results for 52 infants, attempted to identify the proportions of infants whose crying reduced by a clinically relevant amount, defined a priori as: (a) a reduction in crying to below two hours per day and (b) a reduction of 30% or more. This study reported the following results:

  • a significantly greater proportion of infants receiving chiropractic care saw a reduction of crying to two hours per day or less than did those in the control group (OR 6.33; 95% CI 1.54 to 26.00; test for overall effect: Z = 2.56; P = 0.01; NNTB 2.75) (Analysis 1.2);

  • a significantly greater proportion of infants receiving chiropractic care saw a greater than 30% reduction in crying from baseline than did those in the control group (OR 3.70; 95% CI 1.15 to 11.86; test for overall effect: Z = 2.20; P = 0.03; NNTB 3.2) (Analysis 1.3).
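The Z values reported for these odds ratios can be recovered from the OR and its 95% CI by working on the log scale, where the interval is symmetric. The following is a minimal sketch, assuming a normal approximation with a critical value of 1.96; the function name is illustrative.

```python
import math


def z_from_or_ci(odds_ratio, lower, upper, z_crit=1.96):
    """Z statistic for an odds ratio, computed from log(OR) and the
    standard error implied by the width of the log-scale 95% CI."""
    se_log_or = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    return math.log(odds_ratio) / se_log_or


# Reported values from Miller 2010 (Analysis 1.2 and Analysis 1.3)
z_two_hours = z_from_or_ci(6.33, 1.54, 26.00)   # about 2.56
z_thirty_pct = z_from_or_ci(3.70, 1.15, 11.86)  # about 2.20
```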

While the other studies did not report the proportion of participants achieving any definition of clinically significant reduction in crying time, the results for the groups as a whole are shown in Table 4.

Table 4. Clinical significance of reduction in crying time

Mean crying time in hours at baseline and at end of study, with % reduction

                   Treatment group                        Control group
Study              Baseline   End of study   % reduction   Baseline   End of study   % reduction
Hayden 2006        2.39       0.89*          63%*          2.06       1.56*          23%
Heber 2003         4.68       1.98*          58%*          4.65       3.9            16%
Miller 2010        5.4        3.0            44%*          5.5        4.5            18%
Olafsdottir 2001   5.1        3.1            39%*          5.4        3.1            42%*
Wiberg 1999        4.3        1.9*           63%*          5.2        4.2            19%

* Reduction to less than 2 hours of crying per day or reduction by more than 30%

Three of the studies reported reductions to an average of less than two hours of crying per day (Wiberg 1999; Heber 2003; Hayden 2006) in the treatment groups compared to one study (Hayden 2006) where this was achieved in the control group. All five studies reported an average reduction in hours crying of more than 30% in the treatment groups, with one (Olafsdottir 2001) achieving comparable reductions in the control group.
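The two thresholds used above can be applied to group means with simple arithmetic. The following is a minimal sketch using two treatment‐group rows from Table 4; the function name and the dictionary keys are illustrative assumptions.

```python
def reduction_summary(baseline_hours, end_hours):
    """Percentage reduction in mean crying time, plus the two thresholds
    used for a clinically relevant reduction."""
    pct = 100.0 * (baseline_hours - end_hours) / baseline_hours
    return {
        "pct_reduction": pct,
        "below_two_hours": end_hours < 2.0,
        "over_30_pct": pct > 30.0,
    }


# Treatment-group means taken from Table 4
heber_treatment = reduction_summary(4.68, 1.98)
miller_treatment = reduction_summary(5.4, 3.0)
```

Note that this applies the thresholds to group averages only; it says nothing about the proportion of individual infants who met either definition.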

Presence/absence of colic

Data were available for 'recovery' from infantile colic (as reported by parents on a Likert scale) for a total of 168 infants in three studies (Mercer 1999; Olafsdottir 2001; Miller 2010). The OR for absence of colic (top category of Likert scale) did not reach statistical significance (OR 11.12; 95% CI 0.46 to 267.52; heterogeneity: Tau2 = 6.92; Chi2 = 17.91; degrees of freedom (df) = 2; P = 0.0001; I2 = 89%; test for overall effect: Z = 1.48; P = 0.14) (Figure 4).


Forest plot of comparison: 1 Manipulative therapies versus control, outcome: 1.5 Presence/absence of colic


Visual inspection of the CIs indicated no overlap between the Olafsdottir 2001 study and either the Mercer 1999 or Miller 2010 study, indicating high levels of heterogeneity, which was supported by the high value of I2 and the non‐significant probability in the test for overall effect.

Sensitivity analyses were performed for those studies that had a low risk of selection bias, low risk of performance bias and had been peer reviewed (Olafsdottir 2001; Miller 2010) (OR 4.32; 95% CI 0.12 to 157.98; heterogeneity: Tau2 = 6.06; Chi2 = 9.50; df = 1; P = 0.002; I2 = 89%; test for overall effect: Z = 0.80; P = 0.43) (Figure 4). There were no studies with a low risk of attrition bias.

Adverse outcomes

Only one study (Miller 2010; N = 102) reported findings for adverse outcomes. None were recorded.

A case report was incidentally drawn to our attention during the review process. This report outlines the case history of an individual infant who died following treatment for infantile colic by a "so called CranioSacral Therapist" (Holla 2009) who appears to have used an unrecognised technique. We have not undertaken a systematic search for safety studies, although we have introduced the debate in the background section. We may consider a comprehensive search specifically for adverse effects in the update of this review.

Secondary outcomes

One study (Hayden 2006) reported changes in sleeping time for 26 infants, finding an increase of one hour and 10 minutes' sleeping in the treated group (MD 1.17; 95% CI 0.22 to 2.12; test for overall effect: Z = 2.42; P = 0.02).

No studies reported on:

  • changes in frequency of crying bouts;

  • measures of parental or family quality of life;

  • measures of parental stress, anxiety or depression;

  • parental satisfaction.

Other outcomes measured in the studies

Three other outcomes reported in the studies, which we had not anticipated in the protocol, are shown in Table 5.

Table 5. Additional outcomes measured in the studies

  • Change in intensity of crying (measured on scale of 1 to 10): Heber 2003; 45 infants; MD ‐2.10; 95% CI ‐3.00 to ‐1.20.

  • Change in daily hours of rocking and holding: Hayden 2006; 28 infants; MD ‐1.17; 95% CI ‐2.12 to ‐0.22.

  • Significant global improvement (measured as top two categories on parent‐rated Likert scale): Mercer 1999; Olafsdottir 2001; Miller 2010; 177 infants in total. Meta‐analysis: OR 16.62; 95% CI 0.63 to 441.60 (heterogeneity: Tau2 = 7.37; Chi2 = 26.52; I2 = 92%; test for overall effect: Z = 1.68; P = 0.09). Sensitivity analysis was performed to include only studies with a low risk of selection bias, low risk of performance bias and which had been peer reviewed (Olafsdottir 2001; Miller 2010): OR 5.5; 95% CI 0.16 to 192.43 (heterogeneity: Tau2 = 6.22; Chi2 = 18.29; I2 = 95%; test for overall effect: Z = 0.94; P = 0.35).

Discussion

Summary of main results

This review evaluates the effects of manipulative therapies in the treatment of infants with infantile colic. It includes six trials (with a total of 325 infants) that compared manipulative therapies with no treatment or placebo. Overall, in comparison with the group that received either no intervention or a placebo:

  • manipulative therapies had a significant effect on the daily hours of crying ‐ reducing crying by an average of one hour and 12 minutes. This difference is sustained when considering subgroups of studies with a low risk of selection bias (sequence generation and allocation concealment), those with a low risk of attrition bias and those studies that have been published/peer reviewed. However, when considering only those studies with a low risk of performance bias (parental blinding) the result was not significant;

  • a greater proportion of those infants receiving manipulative therapies saw a clinically significant reduction in daily hours of crying (defined either as reduction to less than two hours per day or reduction of more than 30%) than did those receiving no treatment;

  • manipulative therapies did not result in significantly higher proportions of 'full recovery' as reported by parents using a parental global improvement scale;

  • only one study reported adverse events (Miller 2010; N = 102), and none were encountered;

  • manipulative therapies resulted in significant improvements in (longer) sleeping time.

See summary of findings Table for the main comparison

Overall completeness and applicability of evidence

Most included studies reported on the primary outcomes of interest ‐ change in daily hours of crying and presence/absence of colic after treatment. However, with only six studies, four of which randomised 50 infants or fewer, it is difficult to determine the generalisability of the findings.

Included studies were conducted in both teaching clinics and private practice, and in different European and Southern African countries, which would suggest that the findings would be applicable in these contexts. Studies also included treatments carried out by both interns and by experienced practitioners, with no apparent impact on outcomes.

However, as observed by a consumer referee for this review, the outcomes considered represent a very narrow view of the experience of parents with a colicky baby. Management by the family and extended family may be as important as diagnosis and treatment. The impact of therapy on a family's quality of life or the development of the infant was not measured in any of the included studies.

Further evaluation of the impact of manipulative therapies on parent‐child relationships, attachment and other aspects of parental behaviour and mood should be considered.

Quality of the evidence

The methodological quality of the included studies was mixed, with:

In relation to the quality of evidence currently available for the impact of manipulative therapies, there are three issues that are worthy of further discussion, namely: parental blinding, attrition rates and statistical heterogeneity.

Parental blinding

Only two studies (Olafsdottir 2001; Miller 2010) blinded parents to treatment. Miller 2010 included a third group to evaluate the effect that parental blinding might have on the outcomes. In this comparison, both groups received treatment. In one, parents were aware that their infant was receiving chiropractic treatment and in the other parents had been told that their infant would be allocated to one of two groups, but did not know whether their infant received treatment or sham. Comparison between these groups showed a small difference in outcome of 0.4 hours (95% CI ‐1.0 to 1.8 hours in favour of the unblinded group), which was not statistically significant.

These findings are difficult to interpret in the absence of other studies that have compared blinded and unblinded results but may indicate that parentally‐reported crying times are at lower risk of performance bias than was previously thought. The review authors have taken a conventional approach and therefore interpreted the impacts of parental blinding conservatively.

Attrition rates

The effects of attrition rates (risk of bias owing to incomplete outcome data) on the change in daily hours of crying are difficult to evaluate. The included studies reported fairly high drop‐out rates, generally unbalanced between the groups ‐ with greater drop‐out rates in the control groups (Wiberg 1999; Olafsdottir 2001; Hayden 2006; Miller 2010).

Where analysed (Wiberg 1999; Hayden 2006; Miller 2010), parents tended to withdraw their infants from the studies for two reasons:

  • because of lack of improvement or worsening symptoms in the control group (17 out of the 23 (74%) were lost from control groups due to worsening symptoms). This may tend to under‐estimate the hours crying per day of the group as a whole at the end of the study, which may over‐estimate any improvement;

  • because of full recovery or significant improvement in the treatment group (7 out of 7 (100%) were lost from treatment groups due to being discharged well). This may tend to over‐estimate the hours crying per day for the group as a whole at the end of the study, which may under‐estimate any improvement.

If the patterns of dropouts in the Wiberg 1999, Hayden 2006 and Miller 2010 findings were replicated across other studies the net result would reduce the apparent differences between the two groups, thereby introducing bias against the intervention. We have reflected this in the assessment of 'unclear' risk of bias due to attrition in the risk of bias tables for these studies (Characteristics of included studies).

Heterogeneity

The statistical tests suggest substantial heterogeneity. From visual inspection of the forest plots, one study has substantially lower overlap of CIs with the others (Olafsdottir 2001). On further inspection, the difference in outcomes is due in large part to the response of the control group in the Olafsdottir study, which was larger than any of the other studies (see Table 6, which shows the differences between the change scores in each study). A sensitivity analysis that excluded this study reduced the I2 statistic to zero, indicating that this study is the main contributor to the heterogeneity.

Table 6. Mean changes in crying time between treatment and control groups

Mean changes in crying time

Trial              Treatment group mean   Control group mean   Difference
Hayden 2006        ‐1.5                   ‐0.5                 ‐1.00
Heber 2003         ‐2.7                   ‐0.75                ‐1.95
Miller 2010        ‐2.4                   ‐1.0                 ‐1.40
Olafsdottir 2001   ‐2.0                   ‐2.3                 0.30
Wiberg 1999        ‐2.7                   ‐1.0                 ‐1.70

Significant findings for such heterogeneity might either be because of methodological diversity or clinical diversity, which leads the review authors to four questions:

  • Methodological diversity

    • Might the differences between the Olafsdottir study and others be due to parental blinding? The effects of blinding in this population and for these outcomes are unclear, as discussed earlier.

  • Clinical diversity

    • Did Olafsdottir select from a different population? Certainly, this study applied a different set of selection criteria from the other studies: not recruiting infants who had any response to four days of cow's milk protein withdrawal, who had signs of lactose intolerance (evaluated by pH and reducing substances in stools) or who had 'insufficient effect' of sucrose on crying.

    • Did Olafsdottir use different intervention or sham techniques? In all of the other studies, we have clear indications that active adjustments included the entire spinal column. Although we have attempted to contact the authors of the Olafsdottir study to confirm which adjustments they used, we have as yet had no response.

    • Was there some other clinical factor that affected outcomes, especially in the control group? The review authors were unable to identify any factors from the information available, and would welcome any views on this.

In the summary of findings Table for the main comparison, the quality of evidence was downgraded due to risk of bias (limitations in study design or execution) and inconsistency (unexplained heterogeneity), which resulted in a grading of 'low quality' for the evidence for improvement in daily hours of crying and for presence/absence of colic (Schunemann 2008).

Potential biases in the review process

The potential for conflicts of interest has been noted and addressed by the allocation of workload, by duplicate evaluation of studies by two authors and by the system of checking used.

The main potential cause of bias in the review process was minimised by ensuring that the decisions regarding eligibility for inclusion and data extraction were completed independently by two review authors, disagreements between whom were resolved by discussion and consensus.

Comprehensive searches, including extensive enquiries for any grey literature sources, were conducted to identify all relevant studies to avoid publication bias.

The review introduced a more detailed analysis of the hours of crying time, by discussing clinically significant improvements.

We considered all manipulative therapies together due to the similarity in techniques when used with infants. Some practitioners may argue that different types of manipulative therapies should be considered separately. We may consider this in future revisions of this review, if there are sufficient high quality studies to make it meaningful.

The review itself was conducted in accordance with the published protocol and we have clearly indicated any deviations or additions.

Lastly, this review has received no direct funding, although the authors acknowledge assistance from their associates and institutions (please see Acknowledgements and Sources of support below).

Agreements and disagreements with other studies or reviews

To the review authors' best knowledge, this is the first meta‐analysis to be attempted on this subject, so there are few direct comparisons with other findings. Previous systematic reviews (as distinct from meta‐analyses) have found no evidence that either:

have beneficial impacts on infantile colic. Most recently, the review of systematic reviews by Bronfort 2010 stated, "All four systematic reviews concluded there is no evidence manual therapy is more effective than sham therapy for the treatment of colic."

The more detailed and rigorous analysis reported in this review indicates that there is some evidence that manipulative therapies may have a beneficial effect on the natural course of infantile colic.

This difference in the conclusions derives mainly from the interpretation of the Olafsdottir 2001 study. This was the largest and best quality (parentally blinded) study included in the systematic reviews (which have not included Miller 2010), hence its negative findings outweighed the other studies' positive conclusions. However, while this meta‐analysis confirms that statistical significance is not reached when considering only those studies with low risk of performance bias (parental blinding), it also highlights high heterogeneity between the Olafsdottir study and the others, which may also account for the differences.

Figure 1. Study flow diagram

Figure 2. Risk of bias summary: review authors' judgements about each risk of bias item for each included study

Figure 3. Forest plot of comparison: 1 Manipulative therapies versus control, outcome: 1.1 Change in daily hours of crying

Figure 4. Forest plot of comparison: 1 Manipulative therapies versus control, outcome: 1.5 Presence/absence of colic

Analysis 1.1. Comparison 1 Manipulative therapy versus control, Outcome 1 Change in daily hours of crying.

Analysis 1.2. Comparison 1 Manipulative therapy versus control, Outcome 2 Reduction to less than two hours crying per day.

Analysis 1.3. Comparison 1 Manipulative therapy versus control, Outcome 3 Greater than 30% improvement in daily hours of crying.

Analysis 1.4. Comparison 1 Manipulative therapy versus control, Outcome 4 Presence/absence of colic.

Analysis 1.5. Comparison 1 Manipulative therapy versus control, Outcome 5 Change in mean daily hours of sleeping.

Summary of findings for the main comparison. Manipulative therapy compared to control condition for infant colic

Manipulative therapy compared to no treatment or sham for infant colic

Patient or population: infants with colic
Settings: teaching clinics and private practice
Intervention: manipulative therapy
Comparison: no treatment, sham treatment or usual treatment

Columns: Outcomes; Illustrative comparative risks* (95% CI), given as assumed risk (no treatment or sham) and corresponding risk (manipulative therapy); Relative effect (95% CI); No of participants (studies); Quality of the evidence (GRADE); Comments

Change in daily hours of crying (crying diary completed by parents)

  • Assumed risk (no treatment or sham): the mean change in daily hours of crying ranged across control groups from ‐0.5 to ‐2.3 hours (reduction)

  • Corresponding risk (manipulative therapy): the mean change in daily hours of crying in the intervention groups was 1.2 hours (1.89 to 0.51 hours greater reduction than control)

  • No of participants (studies): 223 (5 studies)

  • Quality of the evidence (GRADE): ⊕⊕⊝⊝ low (footnotes 1, 2)

Presence/absence of colic (global impression of change scale completed by parents)

  • Study population: assumed risk 135 per 1000; corresponding risk 634 per 1000 (67 to 977)

  • Moderate: assumed risk 133 per 1000; corresponding risk 630 per 1000 (66 to 976)

  • Relative effect (95% CI): OR 11.12 (0.46 to 267.52)

  • No of participants (studies): 185 (3 studies)

  • Quality of the evidence (GRADE): ⊕⊕⊝⊝ low (footnotes 1, 2)

  • Comments: data were either highest category ('completely recovered' or similar) from Likert‐style questionnaire completed by parents or from records of infants discharged well

Adverse events (incidents reported by parents)

  • Relative effect (95% CI): not estimable

  • No of participants (studies): 102 (1 study)

  • Other cells (risks and quality of the evidence): see comment

*The basis for the assumed risk (e.g. the median control group risk across studies) is provided in footnotes. The corresponding risk (and its 95% confidence interval) is based on the assumed risk in the comparison group and the relative effect of the intervention (and its 95% CI).
CI: Confidence interval; RR: Risk ratio; OR: Odds ratio.

GRADE Working Group grades of evidence
High quality: Further research is very unlikely to change our confidence in the estimate of effect.
Moderate quality: Further research is likely to have an important impact on our confidence in the estimate of effect and may change the estimate.
Low quality: Further research is very likely to have an important impact on our confidence in the estimate of effect and is likely to change the estimate.
Very low quality: We are very uncertain about the estimate.

1 Several of the studies did not attempt to blind participants
2 There is an unexplained heterogeneity between the Olafsdottir study and the other studies
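The corresponding risks for presence/absence of colic in the table above follow from the assumed risk and the odds ratio. The following is a minimal sketch of that conversion (risk to odds, apply the OR, odds back to risk); the function name is an illustrative assumption and the inputs are the figures reported in the table.

```python
def corresponding_risk(assumed_risk, odds_ratio):
    """Convert an assumed (control) risk and an odds ratio into the
    corresponding risk in the intervention group."""
    control_odds = assumed_risk / (1.0 - assumed_risk)
    intervention_odds = control_odds * odds_ratio
    return intervention_odds / (1.0 + intervention_odds)


# Study population: assumed risk 135 per 1000; OR 11.12 (95% CI 0.46 to 267.52)
per_1000 = [
    round(1000 * corresponding_risk(0.135, ratio))
    for ratio in (11.12, 0.46, 267.52)
]
# per_1000 == [634, 67, 977]: point estimate, lower and upper CI limits
```

The very wide interval (67 to 977 per 1000) reflects the imprecision of the pooled odds ratio rather than any property of the conversion itself.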

Table 1. Methods not used in this version of the review

Section

Methodological aspect omitted

Types of studies   

We found no cluster‐randomised or cross‐over trials.

Types of interventions  

We found no craniosacral therapy or cranial manipulation trials, and no trials that used waiting list controls.

Types of outcome measures  

We found no studies reporting frequency of crying bouts, measures of parental or family quality of life, measures of stress, anxiety or depression or parental satisfaction.

Timing of outcome assessment

We did not find studies that adequately reported data from any later follow‐up.

Measures of treatment effect  

We had planned to use standardised mean difference if authors had used different measures for the same outcome. However, all studies used the same outcome measures.

Unit of analysis issues  

If cluster‐randomised trials had been included, we had planned to use the intraclass correlation coefficient (ICC) to convert trials to their effective sample size before incorporating them into the meta‐analysis, per the recommendation in the Cochrane Handbook for Systematic Reviews of Interventions (Higgins 2008b).

If studies containing three or more intervention arms had been included, we planned to incorporate appropriate pair‐wise comparisons, providing there was no evidence of bias, such as the authors introducing the additional groups after seeing the data (that is, the groups were determined a priori in the protocol) or of selective reporting (that is, data for all cohorts were reported). However, no eligible studies were identified that had this design.

Dealing with missing data  

Data from the studies were generally presented on an available case basis. All studies reported analysis of participants based on the group to which they were allocated, with none reporting participants who did not receive the allocated intervention, so we used no default data points or outcomes for participants who dropped out of the study.

Assessment of reporting biases  

We had planned to use funnel plots to investigate any relationship between effect estimates and study size/precision, had we found more than 10 studies (Sterne 2008). However, the number of studies was too small to warrant such analysis.

Data synthesis  

We had planned to undertake a meta‐analysis of all manipulative therapies together and then to group and analyse by subgroups based on common study characteristics if there were sufficient studies. However, with only five studies included in the main meta‐analysis, we determined that there were too few studies on which to do this.

Data synthesis

We had planned to include 'adjusted' estimates of treatment effect that included the baseline outcome measurements as a covariate in a regression analysis (ANCOVA). However, only one of the studies (Miller 2010) included any regression analysis and this study included other factors (age and gender), alongside baseline outcome, as covariates. We therefore elected to include the raw scores from this study.

Data synthesis

We had planned to use the standardised mean difference approach had we found studies that measured the same outcome using different scales, but this was not required.

Data synthesis

We had planned to use risk ratios to report dichotomous outcomes. However, we used odds ratios of significant improvement in the analysis, since this is more appropriate to reporting improvements.
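
To illustrate the difference between the two measures named above, the following minimal sketch computes a risk ratio and an odds ratio from the same 2 × 2 table. The counts (20 of 26 infants improved in the treatment group, 9 of 26 in the control group) are invented for illustration and do not come from any included study.

```python
def risk_ratio(events_t: int, total_t: int, events_c: int, total_c: int) -> float:
    """Risk of the event in the treatment group divided by risk in the control group."""
    return (events_t / total_t) / (events_c / total_c)


def odds_ratio(events_t: int, total_t: int, events_c: int, total_c: int) -> float:
    """Odds of the event in the treatment group divided by odds in the control group."""
    odds_t = events_t / (total_t - events_t)
    odds_c = events_c / (total_c - events_c)
    return odds_t / odds_c


# Hypothetical counts for illustration only (not from the review).
rr = risk_ratio(20, 26, 9, 26)
or_ = odds_ratio(20, 26, 9, 26)
print(f"Risk ratio: {rr:.2f}")
print(f"Odds ratio: {or_:.2f}")
```

When the event is common, as improvement often is, the odds ratio lies further from 1 than the risk ratio computed from the same counts.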

Subgroup analysis and investigation of heterogeneity  

We had planned to investigate any significant levels of heterogeneity, where there were sufficient observations (at least 10 studies for each characteristic modelled; Deeks 2011), using the following subgroups:

  • type of intervention (different techniques may impact the outcomes). Our main outcome (change in daily hours of crying) had data from three chiropractic and two osteopathic studies.

  • treatment dose (total number of treatments, number of treatments per week, or overall duration of treatment protocol). Our main outcome had data from two studies with a four‐week intervention period (both osteopathic), and three studies with an 8‐ to 15‐day intervention period (all chiropractic).

  • mean age of the participants at onset of colic (earlier onset may imply greater severity of symptoms). This was reported in only one of the studies.

As there were no outcomes with 10 or more studies, we elected not to do a subgroup analysis.

Table 1. Methods not used in this version of the review

Table 2. Male/female ratio in studies

| Study | % male in control group | % male in treatment group |
|---|---|---|
| Heber 2003 | 53% | 53% |
| Hayden 2006 | 64% | 93% |
| Olafsdottir 2001 | 67% | 43% |
| Wiberg 1999 | 45% | 67% |
| Mercer 1999 | 47% | 67% |
| Miller 2010 | 68% | 34% |

Table 3. Outcomes reported in each study

Columns: Outcome; Hayden 2006; Heber 2003; Mercer 1999; Miller 2010; Olafsdottir 2001; Wiberg 1999; Figure

Primary outcomes

  • Change in hours crying time per day (Figure 3)

  • Reduction to < 2 hours of crying per day

  • > 30% improvement in daily hours of crying

  • Presence/absence of colic ('full recovery') (Figure 4)

  • Adverse effects

Secondary outcomes

  • Sleeping time

Other outcomes measured

  • Significant global improvement

  • Change in intensity of crying

  • Change in daily hours of rocking and holding

Table 4. Clinical significance of reduction in crying time

| Study | Treatment group: mean crying time at baseline (hours) | Treatment group: mean crying time at end of study (hours) | Treatment group: % reduction | Control group: mean crying time at baseline (hours) | Control group: mean crying time at end of study (hours) | Control group: % reduction |
|---|---|---|---|---|---|---|
| Hayden 2006 | 2.39 | 0.89 | 63% | 2.06 | 1.56 | 23% |
| Heber 2003 | 4.68 | 1.98 | 58% | 4.65 | 3.9 | 16% |
| Miller 2010 | 5.4 | 3.0 | 44% | 5.5 | 4.5 | 18% |
| Olafsdottir 2001 | 5.1 | 3.1 | 39% | 5.4 | 3.1 | 42% |
| Wiberg 1999 | 4.3 | 1.9 | 63% | 5.2 | 4.2 | 19% |

Reduction to less than 2 hours of crying per day and reduction by more than 30% shown in bold
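
The percentage reductions and the two thresholds used in this table (crying reduced to less than 2 hours per day, and a reduction of more than 30%) can be sketched as follows. This is an illustrative calculation only, using the Heber 2003 row; the function names are hypothetical, and the published percentages may have been derived from unrounded data.

```python
def percent_reduction(baseline_hours: float, end_hours: float) -> float:
    """Reduction in mean daily crying time as a percentage of baseline."""
    return (baseline_hours - end_hours) / baseline_hours * 100.0


def meets_thresholds(baseline_hours: float, end_hours: float) -> tuple:
    """Return (end below 2 hours per day, reduction greater than 30%)."""
    return (end_hours < 2.0, percent_reduction(baseline_hours, end_hours) > 30.0)


# Heber 2003, values as given in Table 4.
treatment = (4.68, 1.98)
control = (4.65, 3.9)

for label, (baseline, end) in (("treatment", treatment), ("control", control)):
    pct = percent_reduction(baseline, end)
    print(f"Heber 2003 {label}: {pct:.0f}% reduction, thresholds met: "
          f"{meets_thresholds(baseline, end)}")
```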

Table 5. Additional outcomes measured in the studies

| Outcome | Studies | Participants | Effect estimate |
|---|---|---|---|
| Change in intensity of crying (measured on scale of 1 to 10) | Heber 2003 | 45 infants | MD ‐2.10; 95% CI ‐3.00 to ‐1.20 |
| Change in daily hours of rocking and holding | Hayden 2006 | 28 infants | MD ‐1.17; 95% CI ‐2.12 to ‐0.22 |
| Significant global improvement (measured as top two categories on parent‐rated Likert scale) | Mercer 1999; Olafsdottir 2001; Miller 2010 | 177 infants in total | Meta‐analysis: OR 16.62; 95% CI 0.63 to 441.60 (heterogeneity: Tau² = 7.37; Chi² = 26.52; I² = 92%; test for overall effect: Z = 1.68; P = 0.09) |

Sensitivity analysis was performed to include only studies with a low risk of selection bias, low risk of performance bias and which had been peer reviewed (Olafsdottir 2001; Miller 2010): OR 5.5; 95% CI 0.16 to 192.43 (heterogeneity: Tau² = 6.22; Chi² = 18.29; I² = 95%; test for overall effect: Z = 0.94; P = 0.35)
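
The I² values quoted above follow from the Chi² heterogeneity statistic through the standard relation I² = (Q − df) / Q × 100%, truncated at zero. The sketch below assumes the degrees of freedom equal the number of studies minus one (2 for the three‐study meta‐analysis, 1 for the two‐study sensitivity analysis); the function name is hypothetical.

```python
def i_squared(chi2: float, df: int) -> float:
    """I-squared (%) from the Chi-squared heterogeneity statistic and its degrees of freedom."""
    if chi2 <= 0:
        return 0.0
    return max(0.0, (chi2 - df) / chi2) * 100.0


# Values as reported in Table 5; df assumed to be (number of studies - 1).
main = i_squared(26.52, df=2)
sensitivity = i_squared(18.29, df=1)
print(f"Main analysis I-squared: {main:.0f}%")
print(f"Sensitivity analysis I-squared: {sensitivity:.0f}%")
```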

Table 6. Mean changes in crying time between treatment and control groups

| Trial | Treatment group mean change in crying time | Control group mean change in crying time | Difference |
|---|---|---|---|
| Hayden 2006 | ‐1.5 | ‐0.5 | ‐1.00 |
| Heber 2003 | ‐2.7 | ‐0.75 | ‐1.95 |
| Miller 2010 | ‐2.4 | ‐1.0 | ‐1.40 |
| Olafsdottir 2001 | ‐2.0 | ‐2.3 | 0.30 |
| Wiberg 1999 | ‐2.7 | ‐1.0 | ‐1.70 |
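
The difference column is the treatment group mean change minus the control group mean change, so a negative value favours the treatment group. A minimal sketch reproducing the column from the values in Table 6 (the variable and function names are illustrative):

```python
# Mean change in daily crying time (hours): (treatment group, control group),
# as listed in Table 6.
mean_changes = {
    "Hayden 2006": (-1.5, -0.5),
    "Heber 2003": (-2.7, -0.75),
    "Miller 2010": (-2.4, -1.0),
    "Olafsdottir 2001": (-2.0, -2.3),
    "Wiberg 1999": (-2.7, -1.0),
}


def between_group_differences(changes: dict) -> dict:
    """Treatment change minus control change, rounded to two decimals."""
    return {trial: round(t - c, 2) for trial, (t, c) in changes.items()}


for trial, diff in between_group_differences(mean_changes).items():
    print(f"{trial}: {diff:+.2f} hours")
```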

Comparison 1. Manipulative therapy versus control

| Outcome or subgroup title | No. of studies | No. of participants | Statistical method | Effect size |
|---|---|---|---|---|
| 1 Change in daily hours of crying | 5 | | Mean Difference (IV, Random, 95% CI) | Subtotals only |
| 1.1 Change in daily hours crying for all included studies | 5 | 231 | Mean Difference (IV, Random, 95% CI) | ‐1.20 [‐1.89, ‐0.51] |
| 1.2 Change in daily hours crying for studies with low risk of selection bias (random sequence generation) | 5 | 231 | Mean Difference (IV, Random, 95% CI) | ‐1.20 [‐1.89, ‐0.51] |
| 1.3 Change in daily hours crying for studies with low risk of selection bias (allocation concealment) | 4 | 205 | Mean Difference (IV, Random, 95% CI) | ‐1.24 [‐2.16, ‐0.33] |
| 1.4 Change in daily hours crying for studies with a low risk of performance bias (parental blinding) | 2 | 124 | Mean Difference (IV, Random, 95% CI) | ‐0.57 [‐2.24, 1.09] |
| 1.5 Change in daily hours crying for studies with low risk of attrition bias (selective reporting) | 1 | 40 | Mean Difference (IV, Random, 95% CI) | ‐1.95 [‐2.96, ‐0.94] |
| 1.6 Change in daily hours crying for studies that have been peer reviewed/published | 4 | 191 | Mean Difference (IV, Random, 95% CI) | ‐1.01 [‐1.78, ‐0.24] |
| 2 Reduction to less than two hours crying per day | 1 | 52 | Odds Ratio (M‐H, Random, 95% CI) | 6.33 [1.54, 26.00] |
| 3 Greater than 30% improvement in daily hours of crying | 1 | 52 | Odds Ratio (M‐H, Random, 95% CI) | 3.70 [1.15, 11.86] |
| 4 Presence/absence of colic | 3 | | Odds Ratio (M‐H, Random, 95% CI) | Subtotals only |
| 4.1 Presence/absence of colic for all included studies | 3 | 185 | Odds Ratio (M‐H, Random, 95% CI) | 11.12 [0.46, 267.52] |
| 4.2 Presence/absence of colic for studies with peer review, low risk of selection bias, low risk of performance bias | 2 | 155 | Odds Ratio (M‐H, Random, 95% CI) | 4.32 [0.12, 157.98] |
| 5 Change in mean daily hours of sleeping | 1 | 26 | Mean Difference (IV, Random, 95% CI) | 1.17 [0.22, 2.12] |
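
For readers who want to relate an effect size in this table to its precision, a standard error can be back‐calculated from a 95% confidence interval as (upper − lower) / (2 × 1.96), working on the log scale for odds ratios. The sketch below is an approximation under a normal‐distribution assumption and is not the method used by Review Manager; the function name and its parameters are hypothetical.

```python
import math


def se_and_z_from_ci(estimate: float, lower: float, upper: float,
                     ratio_measure: bool = False) -> tuple:
    """Approximate standard error and z statistic from a 95% CI.

    For ratio measures (e.g. odds ratios) the calculation is done on the
    natural-log scale, where the interval is symmetric.
    """
    if ratio_measure:
        estimate, lower, upper = (math.log(x) for x in (estimate, lower, upper))
    se = (upper - lower) / (2 * 1.96)
    return se, estimate / se


# 1.1 Change in daily hours crying, all included studies: -1.20 [-1.89, -0.51]
se_md, z_md = se_and_z_from_ci(-1.20, -1.89, -0.51)
print(f"Mean difference: SE = {se_md:.3f}, z = {z_md:.2f}")

# 2 Reduction to less than two hours crying per day: OR 6.33 [1.54, 26.00]
se_or, z_or = se_and_z_from_ci(6.33, 1.54, 26.00, ratio_measure=True)
print(f"Log odds ratio: SE = {se_or:.3f}, z = {z_or:.2f}")
```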
