BACKGROUND

Lumbar radiculopathy, also referred to as sciatica, is a syndrome of radiating pain in a lumbar nerve root distribution that may also include motor weakness and sensory disturbances.1 Nerve root compression is typically caused by intervertebral disc herniation or degenerative changes and less commonly by infection, inflammation, neoplasm, vascular disease, or congenital abnormalities.1, 2 Prevalence ranges from 1.6 to 13.4%;3 studies suggest that the highest prevalence was between ages 45 and 64, and men are more commonly affected than women.1, 4, 5 A meta-analysis of 11 studies reported that 59% of individuals with low back pain seek medical care.6, 7 General medical practitioners, chiropractors, and orthopedists are the most commonly consulted clinicians for low back pain in North America.7

Treatment may involve surgical management to address the underlying cause, nonsurgical management of symptoms, or both. The objective of this review was to determine the efficacy, safety, costs, and cost-effectiveness of surgical versus nonsurgical interventions in adults with symptomatic lumbar radiculopathy.

METHODS

We searched PubMed from January 1, 2007, to April 10, 2019, using relevant medical subject headings (MeSH) and keywords (Appendix A of the Online Supplement). We also searched the Cochrane Library and clinicaltrials.gov. We hand-searched systematic reviews and key articles to identify studies published prior to 2007. We included English-language randomized or controlled trials in adults with symptomatic lumbar radiculopathy not related to infection, cancer, or congenital or major traumatic etiologies. Eligible interventions included discectomy, laminectomy, laminotomy, foraminotomy, nucleotomy, and nucleoplasty, including micro- and minimally invasive approaches. Eligible nonsurgical comparators included physical therapy, pharmacologic treatment, spinal manipulation, chiropractic treatment, or combinations thereof. We selected studies that reported pain, functioning, psychological distress, quality of life (QOL), neurologic symptoms, or return to work at 4 weeks or later after randomization. We also selected studies that reported surgical mortality, surgery-related morbidity, reoperations, persistent opioid use, or cost outcomes. We excluded studies conducted in countries not rated “very high” on the 2016 United Nations Human Development Index.8

Titles and abstracts were screened by a single team member after dual independent review of the first 50 titles and abstracts showed high interrater reliability (Light’s kappa = 0.84). Full-text articles were dually reviewed for eligibility. Data abstraction was completed by one team member and reviewed for accuracy by a second. The risk of bias of included studies was independently assessed by two team members using the Cochrane Risk of Bias (RoB 2.0) tool9 for trials and the Quality of Health Economic Studies (QHES) instrument10 for cost studies; discrepancies were resolved by discussion. We rated the risk of bias at the study level unless outcomes within the study warranted different ratings.
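To illustrate the agreement statistic used for the screening check, the following is a minimal Python sketch of Light’s kappa, computed as the mean of Cohen’s kappa over all pairs of raters (with two raters it reduces to Cohen’s kappa). The function names and the include/exclude decisions are hypothetical and are not the review’s screening data.

```python
from collections import Counter
from itertools import combinations


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who labeled the same items."""
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("both raters must label the same non-empty set of items")
    n = len(rater_a)
    # Proportion of items on which the two raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance from each rater's marginal label frequencies.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / n ** 2
    if expected == 1.0:
        # Both raters used a single identical label throughout; treating this
        # as perfect agreement is a convention assumed for this sketch.
        return 1.0
    return (observed - expected) / (1.0 - expected)


def light_kappa(ratings_by_rater):
    """Light's kappa: mean of Cohen's kappa over all rater pairs."""
    pairs = list(combinations(ratings_by_rater, 2))
    if not pairs:
        raise ValueError("at least two raters are required")
    return sum(cohen_kappa(a, b) for a, b in pairs) / len(pairs)


# Hypothetical screening decisions for six abstracts.
reviewer_1 = ["include", "exclude", "exclude", "include", "exclude", "exclude"]
reviewer_2 = ["include", "exclude", "exclude", "exclude", "exclude", "exclude"]
print(round(light_kappa([reviewer_1, reviewer_2]), 3))
```

Because kappa corrects raw agreement for chance agreement, it can be well below the simple proportion of matching decisions when one label (here, exclude) dominates.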

We considered outcomes reported at less than 12 weeks to be short-term, outcomes reported from 12 weeks to less than 52 weeks to be medium-term, and outcomes reported at 52 weeks or later to be long-term. For cost outcomes, we converted outcomes in foreign currency to United States (U.S.) dollars based on the U.S. Department of Treasury mid-year exchange rate for the year reported and then used the chain-weighted consumer price index to adjust to 2010 U.S. dollars.11, 12 We calculated absolute mean differences (AMD) between groups and, where possible, conducted statistical significance testing when it was not reported. We synthesized findings in narrative and tabular formats. When we had at least three studies reporting similar outcomes and follow-up timing, we used Stata (version 15) to calculate a pooled standardized mean difference (SMD) using a random effects model.
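The two-step cost adjustment described above (currency conversion, then inflation adjustment) can be sketched in Python as follows. This is an illustration under stated assumptions: the function name, the exchange rate, and the index values are hypothetical placeholders, not the Treasury rates or chain-weighted CPI values used in the review.

```python
def to_2010_usd(amount, usd_per_unit, cpi_source_year, cpi_2010):
    """Convert a cost to 2010 U.S. dollars.

    amount           cost in the original currency and reporting year
    usd_per_unit     U.S. dollars per unit of the original currency in the
                     reporting year (use 1.0 for costs already in U.S. dollars)
    cpi_source_year  price index value for the reporting year
    cpi_2010         price index value for 2010 (same index series)
    """
    if usd_per_unit <= 0 or cpi_source_year <= 0 or cpi_2010 <= 0:
        raise ValueError("exchange rate and index values must be positive")
    # Step 1: convert to nominal U.S. dollars of the reporting year.
    usd_nominal = amount * usd_per_unit
    # Step 2: rescale by the ratio of price index values to reach 2010 dollars.
    return usd_nominal * (cpi_2010 / cpi_source_year)


# Hypothetical inputs: 1,000 units of a foreign currency, an exchange rate of
# 1.25 U.S. dollars per unit, and index values of 200 (reporting year) and 220 (2010).
print(round(to_2010_usd(1000.0, 1.25, 200.0, 220.0), 2))
```

Converting currency first and then inflating with a U.S. index follows the order given in the text; reversing the order would require a price index for the source country instead.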

We assessed the strength of evidence for between-group differences in outcomes at short-, medium-, and long-term follow-up using a modified GRADE approach.13 We modified GRADE by allowing “insufficient” ratings for single-study bodies of evidence with very serious concerns in one or more domains or when we were unable to draw a conclusion about the treatment effect because of very inconsistent findings.

RESULTS

We screened 1954 citations (Fig. 1). We excluded 1717 records after title and abstract review and 225 records after full-text review yielding a total of 8 studies from 12 articles for inclusion.

Figure 1 Study flow diagram.

Study Characteristics

Of the eight studies identified, seven randomized controlled trials (RCT) reported efficacy or safety outcomes for discectomy, microdiscectomy, and percutaneous disc decompression with or without coblation (Table 1).4, 14,15,16,17,18,19 Nonsurgical comparators included spinal manipulation, physiotherapy, epidural steroid injection, and conservative management.

Table 1 Characteristics of Included Randomized Controlled Trials Comparing Surgery to Conservative Management for Lumbar Radiculopathy

Nearly all studies required correlation of clinical symptoms with imaging for participant enrollment. Most studies excluded participants with serious neurologic deficits and required 6 weeks of conservative treatment without improvement before enrollment. The number of participants randomized ranged from 40 to 501. All studies enrolled males and females; the mean age ranged from 36 to 48 years. Four studies reported the mean duration of symptoms at baseline, which ranged from 8.6 weeks to 2 years. We rated five studies as high risk of bias,4, 14, 17,18,19 one as some concerns for bias,16 and one as high risk of bias for efficacy outcomes later than 12 weeks and some concerns for bias for efficacy outcomes less than 12 weeks.15

We identified three studies comparing surgical to nonsurgical interventions that reported at least one cost outcome.20,21,22 Two studies21, 22 reported cost-effectiveness data based on RCTs conducted in the USA17 and the Netherlands.16 The third study was a cost-effectiveness analysis using inputs from U.S. sources.20 All compared discectomy or microdiscectomy with nonsurgical management. Two reported findings from both a societal perspective and a payor perspective.21, 22 We rated two studies as good quality21, 22 and one study as fair quality.20 Detailed findings related to efficacy and safety outcomes, cost outcomes, and risk of bias assessments are in Appendix B, C, and D of the Online Supplement, respectively.

Efficacy Outcomes

Pain

Six RCTs reported short- or medium-term pain outcomes14,15,16,17,18,19 and five RCTs reported long-term pain outcomes.4, 14, 17,18,19 Key study results are in Table 2; detailed results are in Appendix Table B-1 of the Online Supplement. We were unable to quantitatively synthesize findings for pain because we did not have at least three studies that reported a similar measure.

Table 2 Key Individual Study Findings Related to Pain, Physical Function and Disability, and Quality of Life

Three RCTs comparing discectomy or microdiscectomy to conservative management with medication and physical therapy reported leg pain and back pain using the visual analog scale (VAS) (0 millimeters (mm) [no pain] to 100 mm [worst pain]).15, 17, 18 Leg pain decreased from baseline by 41 to 57 mm with surgery compared to a 20- to 36.5-mm decrease for conservative management through short- and medium-term follow-up. The AMD between groups at 26-week follow-up were generally statistically significant and ranged from − 6 to − 26 mm.15, 17, 18 VAS back pain outcomes followed a similar pattern at all time points.15, 17, 18 Two of the RCTs also reported long-term leg pain outcomes;17, 18 although within-group improvements persisted, between-group differences were no longer statistically significant.

Four RCTs reported SF-36 Bodily Pain (0 [worst pain] to 100 [no pain]) and observed improvement in both groups at short- and medium-term follow-up (range 14.1 to 40.9 point increase for surgery, 17.3 to 30.5 point increase for comparator), but between-group differences were mixed.15, 16, 18, 19 Weinstein et al.19 reported an AMD at 12 weeks of 2.9 (95% CI, − 2.2 to 8.0), and Peul et al.18 reported an AMD at 8 weeks of 8.4 (95% CI, 3.2 to 13.5) and at 26 weeks of 3.3 (95% CI, − 1.8 to 8.4). Both of these studies compared discectomy or microdiscectomy to medication and physical therapy. In contrast, Gerszten et al.,15 which compared plasma disc decompression to epidural steroid injection, reported a significant between-group difference favoring surgery at 26 weeks (actual values not reported [NR], P = 0.004). Finally, McMorland et al.16 reported no difference in the repeated measure AMD from 6 to 12 weeks (actual values NR, P = 0.34) comparing microdiscectomy to spinal manipulation. Similar to the VAS, within-group improvements in the SF-36 Bodily Pain measure persisted, but between-group differences were not significant in the two RCTs reporting long-term outcomes.18, 19

Two studies comparing microdiscectomy or discectomy to medication and physical therapy reported short-, medium-, and long-term pain using the Sciatica Index (scale 0 [no pain] to 24 [worst pain]).18, 19 Scores improved in both groups at short-term follow-up; between-group AMDs were statistically significant and ranged from − 2.1 to − 4.0 points favoring surgery. Within-group improvements persisted, but between-group differences were mixed at long-term follow-up.

Other Pain Outcomes

Three RCTs reported other pain outcomes.4, 14, 16 Findings were similar to those for the pain measures reported above. We did not use these outcomes in our strength of evidence ratings for the pain outcome because each measure was used in only one study. Individual study results are available in Appendix Table B-1 of the Online Supplement.

Physical Function and Disability

Five RCTs15,16,17,18,19 reported a short- or medium-term and three RCTs16,17,18,19 reported a long-term physical function and disability outcome (Table 2). Detailed study results are in Appendix Table B-2 of the Online Supplement.

Three RCTs reported short- and medium-term outcomes for the Oswestry Disability Index (0 [no impairment] to 100 [worst impairment]). Scores improved in both groups but significantly more in the surgery groups.15, 17, 19 Weinstein et al.19 reported larger improvements for microdiscectomy or discectomy compared to medication and physical therapy at 12 weeks (AMD − 4.7 points; 95% CI, − 9.3 to − 0.2). Gerszten et al.15 reported statistically significant between-group AMDs of − 8, − 9, and − 10 points at 6, 12, and 26 weeks, respectively, for plasma disc decompression compared with epidural steroid injection. Similar AMDs were reported by Osterman et al.,17 which compared microdiscectomy to physical therapy. Although the within-group improvements persisted over time, the two RCTs that reported long-term outcomes reported between-group differences that were no longer significant.17, 19
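A reported AMD and its 95% confidence interval are enough to recover an approximate standard error and a two-sided P value, which is useful when a study does not report them directly. The Python sketch below illustrates this with the 12-week Oswestry Disability Index result from Weinstein et al. quoted above. It assumes a symmetric, normal-approximation interval; the function names are hypothetical, and the derived values are approximations rather than figures reported by the study.

```python
import math


def se_from_ci(lower, upper, z_crit=1.96):
    """Approximate standard error from a symmetric 95% confidence interval."""
    if upper <= lower:
        raise ValueError("upper bound must exceed lower bound")
    return (upper - lower) / (2.0 * z_crit)


def two_sided_p(estimate, se):
    """Two-sided P value for estimate / se under a standard normal reference."""
    z = abs(estimate / se)
    # For a standard normal variable Z, P(|Z| > z) equals erfc(z / sqrt(2)).
    return math.erfc(z / math.sqrt(2.0))


# AMD of -4.7 points with 95% CI -9.3 to -0.2, as reported in the text.
se = se_from_ci(-9.3, -0.2)
p = two_sided_p(-4.7, se)
print(f"SE = {se:.2f}, P = {p:.3f}")
```

An interval that excludes zero corresponds to P below 0.05 under this approximation, which matches the statistically significant difference described for this comparison.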

Two RCTs reported the Roland–Morris Disability Questionnaire (0 [no impairment] to 24 [worst impairment]) in the short-, medium-, or long-term.16, 18 Peul et al.,18 which compared microdiscectomy to medication and physical therapy, reported significant between-group differences at 8 weeks (AMD − 3.1 points; 95% CI, − 4.3 to − 1.7), but not at 26 weeks, 52 weeks, 2 years, or 5 years. In contrast, McMorland et al.16 reported a repeated measure AMD from 6 to 12 weeks that was not significant (actual value NR, P = 0.199) comparing microdiscectomy to spinal manipulation. Three RCTs reported short- and medium-term SF-36 Physical Functioning subscales (0 [worst impairment] to 100 [no impairment]) with mixed findings.16, 18, 19

We pooled data from these five studies for the Roland–Morris Disability Score and the Oswestry Disability Index. The SMD over the short- to medium-term was − 0.32 (95% CI, − 0.63 to − 0.01; 5 RCTs, 941 participants; I2 = 75.7%). Because of heterogeneity as indicated by the I2 statistic, we removed Gerszten et al.15 from the analysis. This study used a different surgical intervention and different comparator group compared with the other studies, and the treatment effect was markedly different. Without Gerszten et al., the SMD was − 0.16 (95% CI, − 0.30 to − 0.03; 4 RCTs, 851 participants; I2 = 0%). Over the long-term (2 years), the SMD was − 0.06 (95% CI, − 0.20 to 0.07; 3 RCTs; 811 participants; I2 = 0%) (Fig. 2).
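For readers unfamiliar with how a random effects pooled estimate and the I2 statistic are obtained, the following Python sketch shows one common approach, the DerSimonian-Laird estimator. This is an assumption made for illustration: the analysis reported here was run in Stata, and the specific estimator is not stated. The function name and the input values are hypothetical and are not the study-level SMDs from the included trials.

```python
import math


def pool_random_effects(effects, variances):
    """DerSimonian-Laird random effects pooling.

    Returns (pooled effect, 95% CI as a (lower, upper) tuple, tau^2, I^2 in percent).
    """
    if len(effects) != len(variances) or len(effects) < 2:
        raise ValueError("need matching effect and variance lists with at least two studies")
    # Fixed effect (inverse-variance) weights and pooled estimate.
    w = [1.0 / v for v in variances]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w
    # Cochran's Q measures dispersion of study effects around the fixed estimate.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = len(effects) - 1
    # Method-of-moments estimate of between-study variance, truncated at zero.
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    # Random effects weights add tau^2 to each study's variance.
    w_star = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    se = math.sqrt(1.0 / sum(w_star))
    # I^2: share of total variability attributed to heterogeneity rather than chance.
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return pooled, (pooled - 1.96 * se, pooled + 1.96 * se), tau2, i2


# Hypothetical SMDs and variances for four studies.
pooled, ci, tau2, i2 = pool_random_effects(
    [-0.10, -0.20, -0.15, -0.70], [0.010, 0.020, 0.015, 0.040]
)
print(f"SMD = {pooled:.2f} (95% CI {ci[0]:.2f} to {ci[1]:.2f}), I2 = {i2:.1f}%")
```

When one study's effect differs markedly from the others, Q rises above its degrees of freedom and I2 increases; removing that study, as in the sensitivity analysis described above, can bring I2 back toward 0%.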

Figure 2 Meta-analysis of RCTs comparing surgery to conservative management.

Quality of Life

Two RCTs16, 17 reported short-term and one RCT17 reported long-term quality of life outcomes (Table 2). Detailed study results are in Appendix Table B-1 of the Online Supplement. McMorland et al.16 reported no between-group difference in repeated measures cumulative total SF-36 scores through 12 weeks (actual values NR, P = 0.382). For Osterman et al.,17 which compared microdiscectomy to physical therapy, the calculated between-group AMDs using the 15D QOL measure (0 [worse] to 1 [better]) ranged from 0.01 to 0.05 at 6 weeks, 12 weeks, 26 weeks, 52 weeks, and 2 years (P values NR). The repeated measure AMD from 6 weeks to 2 years was − 0.03 (95% CI, − 0.07 to 0.01).

Neurologic Symptoms

Two RCTs reported short- or medium-term neurologic symptoms.15, 17 Detailed study results are in Appendix Table B-1 of the Online Supplement. Neither RCT observed significant between-group differences. Gerszten et al.,15 which compared plasma disc decompression to epidural steroid injection, reported similar full muscle strength and normal tactile sensitivity at 6 weeks on each side and at each lumbosacral nerve root level between groups. Osterman et al.17 reported a similar rate of muscle weakness at 6 weeks (53.8 vs. 46.2%), 12 weeks (42.3 vs. 46.2%), and 52 weeks (28.6 vs. 30%) for participants allocated to microdiscectomy compared to physiotherapy.

Return to Work

Five RCTs reported a return to work outcome.4, 14, 15, 17, 19 Detailed study results are provided in Appendix Table B-2 of the Online Supplement. Return to work outcomes were measured by actual return to work, self-reported ability to work, receipt of disability benefits, or other related measures. Except for one RCT rated high risk of bias and conducted in Greece,14 no between-group differences in return to work outcomes were observed.

Safety Outcomes

Seven RCTs reported at least one safety outcome. Detailed findings are in Appendix Table B-3 of the Online Supplement.

None of the six RCTs that reported surgical mortality reported any deaths among participants allocated to surgery.4, 15,16,17,18,19 All-cause mortality was rare and was similar between surgical and nonsurgical participants in the three RCTs reporting this outcome.4, 15, 19

Six RCTs reported surgical morbidity outcomes;14,15,16,17,18,19 dural tears were the most commonly reported morbidity within and among studies. Weinstein et al.19 reported 10 (4.0%) dural tears or spinal fluid leaks, 4 (1.6%) superficial postoperative wound infections, 1 (0.40%) vascular injury, 2 (0.81%) other intraoperative complications, and 9 (3.6%) other unspecified postoperative complications among participants who underwent discectomy or microdiscectomy within 2 years. Gerszten et al.15 reported five (11%) and seven (18%) procedure-related adverse events among participants allocated to plasma disc decompression and epidural steroid injection, respectively (calculated P = 0.55); the authors used a broad definition of adverse events including increased radicular pain, pain at the injection site, and increased weakness. Other surgical morbidities reported by studies included one case of urosepsis (3.6%)17 and one wound hematoma and two dural tears (combined 1.6%).18 Two RCTs reported no surgical complications.14, 16

Five RCTs reported reoperation at a follow-up ranging from 52 weeks to 2 years; the incidence of reoperations varied between 0 and 10.1%.14, 16,17,18,19 Weinstein et al.19 reported 25 (10.1%) reoperations at 2 years among participants who underwent discectomy or microdiscectomy. Peul et al.18 reported that nine (7%) participants allocated to the surgical intervention and eight (12%) participants allocated to conservative management who crossed over had reoperations for recurrent sciatica by 5 years. Osterman et al. reported two reoperations (7.1%) at 2 years among participants who underwent microdiscectomy;17 in the remaining two RCTs, one RCT reported one reoperation (3.2%) at 2 years14 and another RCT reported no reoperations at 52 weeks16 among participants allocated to the surgical intervention.

Only one RCT reported on persistent opioid use. Gerszten et al.15 reported that reduction in use of narcotics at 26 weeks was not significantly different between participants who were allocated to plasma disc decompression compared to those who were allocated to epidural steroid injections (actual values NR).

Cost Outcomes

Three studies provided data related to the cost-effectiveness of surgery compared with nonsurgical treatment;20,21,22 the mean cost per quality-adjusted life year gained from the payor perspective ranged from $51,156 to $83,322 in 2010 U.S. dollars. Additional information related to cost outcomes is available in Appendix C and Table C-1 of the Online Supplement.

DISCUSSION

A summary of our strength of evidence ratings using a modified GRADE approach is in Table 3 with details in Appendix Table E-1 of the Online Supplement. Among adults with symptomatic lumbar radiculopathy, surgery resulted in a meaningfully greater reduction in pain than conservative management in the short- and medium-term (low strength of evidence) but not the long-term (very low strength of evidence). Improvements in physical function and disability were small to trivial in the short- and medium-term (very low strength of evidence) and not meaningfully different in the long-term (very low strength of evidence). Quality of life measures were not different in the short- and medium-term (very low strength of evidence) and inconsistent in the long-term (insufficient evidence). Neurological symptoms (very low strength of evidence) and return to work (very low strength of evidence) were not different at any time point.

Table 3 Strength of Evidence Assessment Comparing Surgery to Nonsurgical Interventions in Persons with Symptomatic Lumbar Radiculopathy

Strength of evidence ratings were downgraded for efficacy outcomes largely due to serious or very serious concerns in the risk of bias and imprecision domains. Five of the included RCTs were rated as high risk of bias and studies were generally underpowered for outcomes, resulting in imprecise estimates. We rated inconsistency as serious for some pain and physical function and disability measures. We assessed the evidence on return to work outcomes as indirect for three reasons: the definitions used for this measure varied across studies, differences in work culture between the USA and Europe might affect this measure, and the measure may reflect variation in surgeons’ practice for releasing patients back to work rather than an individual’s actual ability to resume work functions.

Significant crossover may explain the lack of statistically significant between-group differences, particularly in the long-term. For example, Peul et al.18 reported that of the 142 participants randomized to conservative management, 55 (39%) underwent surgery by 52 weeks, 62 (44%) underwent surgery by 2 years, and 66 (46%) by 5 years. Crossover occurred between groups in both directions in Weinstein et al.;19 46.1% of participants allocated to surgery did not receive surgery by 26 weeks follow-up, and 36.3% of participants allocated to conservative management received surgery. Another possible explanation is that radicular symptoms may improve over time on their own due to the natural course of disc herniations. Of the few between-group differences that were statistically significant, most were smaller than the minimally important clinical difference reported in the broader literature.

Based on the evidence, surgery may be safe; none of the studies that reported surgical mortality observed any surgical deaths (low strength of evidence), and surgical morbidity was infrequent and largely limited to dural tears (low strength of evidence). The rate of reoperations ranged from 0 to 10% (very low strength of evidence). For participants allocated to surgery, there was no difference in all-cause mortality (low strength of evidence) compared with conservative management, but the evidence on persistent opioid use was insufficient. Strength of evidence ratings for safety outcomes were mostly downgraded for imprecision due to inadequate sample sizes for what are often rare events. Evidence on persistent opioid use was limited to one study with a high risk of bias and was thus rated as insufficient. The findings related to safety outcomes may not be generalizable to clinical practice because participants enrolled in RCTs often have fewer comorbidities than patients treated in usual practice.

The cost-effectiveness of surgery compared to conservative management depends on the willingness to pay threshold used. Among the three studies, cost per QALY gained ranged from $51,156 to $83,322 in 2010 U.S. dollars from a payor perspective (very low strength of evidence); the rating was downgraded for risk of bias and imprecision. Although no definitive consensus exists, costs per QALY gained of less than $50,000 are generally considered cost-effective, costs between $50,000 and $150,000 are considered of intermediate value, and costs more than $150,000 per QALY gained are considered low value, though we note these thresholds are typically applied to costs from a societal perspective.23, 24
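The value categories described above can be expressed as a simple threshold rule. The Python sketch below applies the $50,000 and $150,000 per QALY cut points from the text to the two ends of the reported range. The function name and the category labels are hypothetical, and treating the boundary values as intermediate is an assumption, since the text does not say how exact boundary values are handled.

```python
def classify_cost_per_qaly(cost_per_qaly, low_threshold=50_000, high_threshold=150_000):
    """Classify a cost per QALY gained using willingness to pay thresholds."""
    if cost_per_qaly < 0:
        raise ValueError("cost per QALY gained must be non-negative for this rule")
    if cost_per_qaly < low_threshold:
        return "cost-effective"
    if cost_per_qaly <= high_threshold:
        # Boundary values are assigned to the intermediate category (assumption).
        return "intermediate value"
    return "low value"


# Ends of the payor-perspective range reported in the text (2010 U.S. dollars).
for cost in (51_156, 83_322):
    print(cost, classify_cost_per_qaly(cost))
```

Under these thresholds both ends of the reported range fall in the intermediate category, which is why the conclusion depends on the willingness to pay threshold a decision maker adopts.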

Relevant clinical practice guidelines generally recommend considering surgical intervention, particularly discectomy or microdiscectomy and related decompressive procedures, when selected criteria are met. The National Institute for Health and Care Excellence (2016) (UK)25 recommends spinal decompression for sciatica when nonsurgical treatment has not improved pain or function and radiological findings are consistent with sciatica symptoms. Both the American Pain Society (2009)26 and the American Society of Interventional Pain Physicians (2013)56 recommend surgery for cases with lumbar disc prolapse. The North American Spine Society (2012)27 recommends discectomy for patients with lumbar disc herniation with radiculopathy whose symptoms warrant surgical intervention; for patients with less severe symptoms, they note surgery or conservative management appears effective for both short- and long-term relief.

Limitations

We rated five of the included RCTs as high risk of bias. Common sources of bias included inadequate methods of randomization or allocation concealment, lack of blinding (participants, clinicians, or outcome assessors), crossover, and attrition. Though blinding is challenging to achieve for RCTs of surgical interventions, a lack of blinding remains a source of bias, particularly for patient-reported outcomes that are subjective, such as pain and quality of life. Attrition was often high for long-term and occasionally short-term outcomes.

Our inclusion criteria limited eligible studies to those publicly available in English and excluded efficacy outcomes reported before 4 weeks. We only included “intent-to-treat” analyses because “as treated” analyses have a higher risk of bias, as participants generally cross over for reasons that are related to outcomes. Weinstein et al.19 reported an as-treated analysis in addition to the intent-to-treat analysis and found favorable effects for discectomy and microdiscectomy compared with conservative management through 2 years of follow-up; the between-group difference at 52 weeks was 15.0 (95% CI, 10.9 to 19.2) for the SF-36 Bodily Pain subscale, 17.5 (95% CI, 13.6 to 21.5) for the SF-36 Physical Functioning subscale, and − 15.0 (95% CI, − 18.3 to − 11.7) for the Oswestry Disability Index. Finally, because of variations in work culture and healthcare systems between the USA and other countries, the applicability of return to work and cost-effectiveness outcomes is unknown.

Conclusion

Surgery probably reduces pain and possibly improves function more than nonsurgical interventions in the short- and medium-term, but these differences do not persist in the long-term. Although surgery may be safe, it may or may not be cost-effective when compared with nonsurgical interventions, depending on a decision maker’s willingness to pay threshold. For patients presenting with symptomatic lumbar radiculopathy, these findings can be used to inform decision-making on surgical versus nonsurgical management for symptom relief.