This article has been amended to include a factual correction. An error was identified subsequent to its original printing (2011; 34:87), which was acknowledged in an erratum printed in 2012; 35(1):27. The online article and its erratum are considered the version of record.
Femoral neck fractures are common in the elderly; however, agreement on classification and treatment varies. It was hypothesized that computed tomography (CT) would increase agreement for Garden Classification and treatment plan over plain radiographs alone. This article presents results of an online survey completed by 32 respondents at a single institution. The survey comprised 5 cases of elderly patients with femoral neck fractures, presented using plain radiographs and CT images. Cases were randomly presented in 3 formats: (1) plain radiograph, (2) CT, and (3) plain radiograph and CT together. Patients were described as having sustained low-energy trauma, being 65 years or older, and being cleared for surgery. Garden Classification and treatment plans were queried. A single case was repeated for intraobserver reliability. Kappa was calculated for inter- and intraobserver reliability. The addition of CT and modification of the Garden Classification (nondisplaced vs displaced) improved interobserver agreement in all cases. Participants were 1.7× more likely (P=.042) to change their Modified Garden Classification when CT was added to plain radiograph compared to plain radiograph added to CT. Treatment agreement was slight to fair. Intraobserver agreement varied from slight to moderate. The rate of arthroplasty recommendations was similar across attending subspecialties; however, arthroplasty-trained surgeons were 20 to 60 times more likely to recommend total hip arthroplasty (P=.009) over hemiarthroplasty compared to nonarthroplasty-trained surgeons. The addition of CT to plain radiograph after femoral neck fracture improves Garden Classification agreement. However, treatment agreement was not impacted by CT. Factors other than improved classification agreement appeared to direct surgeons’ treatment recommendations.
For nearly a century, femoral neck fractures have been thought of as the unsolved fracture owing to the controversies among physicians over the treatment and attempts at classification for this fracture. Currently in North America, femoral neck fractures are classified by plain radiographs of the hip and pelvis according to the Garden system.1
Garden originally described 4 stages of femoral neck fractures based on an anteroposterior (AP) radiograph of the hip.1 Despite widespread use of this system, the inter- and intraobserver reliability has been questioned.2-4 Some authors have suggested the addition of a lateral hip radiograph or collapsing the classification into a binary system of nondisplaced or minimally displaced fractures versus displaced fractures to improve agreement among observers.3,4 In addition, little agreement remains among surgeons regarding the treatment of specific fractures.4 Surgical treatment plans may be influenced by factors such as age, rather than by fracture classification.4 Nevertheless, surgeons continue to seek a classification that is simple, reproducible, and that predicts treatment.
Computed tomography (CT) has been demonstrated to improve classification agreement and treatment plans for various fracture patterns.5,6 It was hypothesized that the addition of CT to the radiographic evaluation of femoral neck fractures would improve the interobserver agreement for the Garden Classification and improve agreement for the surgical treatment plan.
Materials and Methods
Following institutional review board approval, surgical case logs from 3 hospitals within a single health system were retrospectively reviewed to identify patients treated for femoral neck fractures from January 1, 2008 to June 1, 2009. The first 5 cases of low-energy, nonpathologic femoral neck fractures in patients 65 years or older with high-quality AP pelvis, AP hip, and lateral hip radiographs as well as a CT scan through the fracture were selected for inclusion in the study.
The survey was limited to 5 cases after a pilot of the study interface found that 5 cases translated to an average of 12 minutes to complete the survey. This limit was intended to encourage participation by keeping the survey from becoming excessively long.
We then designed an online survey. The survey consisted of images for each of the 5 cases presented randomly in 3 formats:
- Plain radiograph only, consisting of an AP pelvis radiograph, AP hip radiograph, and a lateral hip radiograph (Figure 1).
- Computed tomography only, consisting of 10 consecutive axial images through the fracture site and a single coronal image centered at the fracture (Figure 2).
- Plain radiograph and CT together.
Figure 1: AP pelvis radiograph (A). AP hip radiograph (B). Lateral hip radiograph displayed in survey interface (C).
Figure 2: Axial CT cut (A). Coronal CT cut displayed in survey interface (B).
For each set of images, 2 questions were presented to the study participants:
- What is the Garden Classification of the fracture?
- What treatment do you recommend?
A depiction of each of the 4 Garden Classifications with a verbal description was located below the corresponding radio button used to select a response for Garden Classification. Additionally, a radio button used for response selection was located below the 4 available treatment choices: (1) closed reduction and percutaneous pinning, (2) open reduction and internal fixation (ORIF), (3) hemiarthroplasty, or (4) total hip arthroplasty (THA).
The online survey prevented a participant from recording a response without viewing all available images, and participants were not able to review previously presented cases. The plain radiograph-only images and CT-only images from 1 randomly selected patient were repeated at a random point during the survey to assess intraobserver reliability.
Participation in the study was voluntary and was offered to all 16 fourth- and fifth-year residents, 3 fellows, and 26 attending physicians, regardless of subspecialty, at a single institution via email invitation. The study was available online to the participants for 6 weeks. Upon following the email link, participants created a unique user name and password allowing them the option of completing the survey in more than 1 sitting. Informed consent was obtained prior to gaining access to the survey.
The clinical scenario presented to the participants was that of a patient 65 years or older sustaining low-energy trauma, who was medically stable for anesthesia and surgery. No additional clinical information was provided.
The free-marginal kappa (+1.0 representing total agreement and 0 representing agreement no better than chance) was used for statistical analysis to measure interobserver and intraobserver agreement.7 This measure is used to assess agreement between multiple raters for categorical variables when raters are not required to assign a certain number of responses to each category.7 Classification results were analyzed using the original Garden classification (4 categories) and the modified Garden classification (displaced or nondisplaced). Additionally, treatment plans were analyzed as all treatment choices (ORIF, closed reduction and percutaneous pinning [CRPP], hemiarthroplasty, or total hip arthroplasty) as well as a modified treatment plan of either fixation or arthroplasty. Kappa values were classified according to Landis and Koch8 (Table 1). Differences between groups were calculated using a chi-square test with Yates’ correction or Fisher’s exact test in cases where cell values were <5. All statistics were calculated with SPSS version 16.0 (SPSS Inc, Chicago, Illinois).
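As a rough illustration of the agreement statistic described above, the following Python sketch computes a free-marginal multirater kappa, assuming that every case is rated by the same number of raters and that chance agreement is fixed at 1/k for k categories. The function names and the example ratings are hypothetical and are not study data, and the analysis reported here was run in SPSS, not with this code. The descriptive bands follow the cutoffs published by Landis and Koch; the article's Table 1 is not reproduced in this text and may word the lowest bands differently.

```python
from collections import Counter


def free_marginal_kappa(ratings, n_categories):
    """Free-marginal multirater kappa (chance agreement assumed to be 1/k).

    ratings: one list per case, holding one category label per rater.
    This sketch assumes the same number of raters for every case.
    """
    if not ratings:
        raise ValueError("at least one case is required")
    n_raters = len(ratings[0])
    if n_categories < 2 or n_raters < 2:
        raise ValueError("need at least 2 categories and 2 raters")
    if any(len(case) != n_raters for case in ratings):
        raise ValueError("every case must have the same number of raters")

    # Count ordered rater pairs that chose the same category, case by case.
    agreeing_pairs = 0
    for case in ratings:
        agreeing_pairs += sum(c * (c - 1) for c in Counter(case).values())
    total_pairs = len(ratings) * n_raters * (n_raters - 1)

    p_observed = agreeing_pairs / total_pairs
    p_chance = 1 / n_categories  # free marginals: chance agreement is 1/k
    return (p_observed - p_chance) / (1 - p_chance)


def landis_koch_label(kappa):
    """Map a kappa value to the descriptive bands of Landis and Koch."""
    if kappa < 0:
        return "poor"
    bands = [(0.20, "slight"), (0.40, "fair"),
             (0.60, "moderate"), (0.80, "substantial")]
    for upper, label in bands:
        if kappa <= upper:
            return label
    return "almost perfect"


# Hypothetical Garden stages (1-4) from 4 raters on 3 cases; not study data.
example = [[1, 2, 2, 1], [3, 3, 4, 3], [4, 4, 4, 4]]
kappa = free_marginal_kappa(example, n_categories=4)
print(round(kappa, 3), landis_koch_label(kappa))
```

Because chance agreement depends only on the number of categories, the same ratings can be rescored with `n_categories=2` after collapsing to displaced versus nondisplaced.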
Results
Fourteen senior residents, 1 fellow, and 17 attending orthopedic surgeons completed the survey. The attending subspecialties included: adult reconstruction (4), spine (2), hand (2), general (2), trauma (2), tumor (1), pediatrics (1), sports (1), foot and ankle (1), and other (1). The overall interobserver agreement for the Garden Classification and the Modified Garden Classification is shown in Table 2. For plain radiograph alone, the agreement for Garden Classification was rated as poor (Kappa=0.137); however, agreement improved to fair with CT alone or CT and plain radiograph together (Kappa=0.216 and Kappa=0.223, respectively). Agreement also improved with modification of the Garden Classification in all patients, achieving moderate agreement when CT, either alone or combined with plain radiograph, was employed to view the fracture (Kappa=0.467 and Kappa=0.430, respectively). Computed tomography, either alone or combined with plain radiograph, improved agreement over plain radiograph alone in all cases.
Agreement also improved for the Garden Classification and modified Garden Classification when attending orthopedic surgeons were analyzed separately. These results are shown in Table 3. The greatest agreement attained was by attending orthopedic surgeons with CT, either alone or combined with plain radiograph, and modification of the Garden classification, achieving moderate agreement (Kappa=0.547 and Kappa=0.505, respectively).
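The effect of modifying the classification can be sketched in a few lines of Python. The example below assumes the binary grouping described in this article (Garden stages I and II as nondisplaced, stages III and IV as displaced) and uses hypothetical ratings, not study data; the helper names are illustrative only.

```python
from itertools import combinations

# Assumed grouping: stages I-II nondisplaced, stages III-IV displaced.
MODIFIED_GARDEN = {1: "nondisplaced", 2: "nondisplaced",
                   3: "displaced", 4: "displaced"}


def collapse_to_modified(stages):
    """Convert a list of Garden stages (1-4) to the binary modified form."""
    return [MODIFIED_GARDEN[stage] for stage in stages]


def pairwise_agreement(case):
    """Fraction of rater pairs that gave the same label for one case."""
    pairs = list(combinations(case, 2))
    return sum(a == b for a, b in pairs) / len(pairs)


# Hypothetical stages assigned by 5 raters to a single fracture.
case = [1, 2, 2, 1, 3]
raw_four_stage = pairwise_agreement(case)
raw_modified = pairwise_agreement(collapse_to_modified(case))
print(raw_four_stage, raw_modified)
```

Raw agreement can only stay the same or rise when categories are merged, and agreement expected by chance rises as well (from 1/4 to 1/2 under free marginals), which is why a chance-corrected statistic such as kappa is the more informative comparison.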
Treatment agreement varied from poor to fair and was independent of the imaging modality employed. Treatment agreement did not improve when the treatment options were collapsed to fixation versus arthroplasty, indicating considerable variation in treatment recommendations among surgeons. Likewise, the improved agreement in fracture classification with the addition of CT did not improve treatment agreement.
The percentage of participants who changed their Garden Classification or Modified Garden Classification after the addition of CT to plain radiograph or plain radiograph to CT was investigated. Subjects were 1.7× more likely to change their Modified Garden Classification when CT was added to plain radiograph compared to plain radiograph added to CT (P=.042), indicating that CT was more likely to guide a participant’s fracture classification than plain radiograph.
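A comparison of this kind reduces to a 2×2 table (for example, classification changed or unchanged, by which modality was added). The Python sketch below shows one way to apply the rule stated in the Methods: a chi-square test with Yates' correction, or Fisher's exact test when any cell is below 5. It uses only the standard library, applies the threshold to observed counts (an assumption, since the text does not say whether observed or expected counts were meant), and is not the SPSS procedure used in the study. The counts in the usage lines are hypothetical.

```python
from math import comb, erfc, sqrt


def yates_chi_square(a, b, c, d):
    """Chi-square statistic with Yates' correction and its P value (1 df)."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        raise ValueError("a row or column total is zero")
    chi2 = n * max(0.0, abs(a * d - b * c) - n / 2) ** 2 / denom
    # Survival function of the chi-square distribution with 1 df.
    return chi2, erfc(sqrt(chi2 / 2))


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact P value: sum of tables no likelier than observed."""
    row1, row2, col1 = a + b, c + d, a + c
    total = comb(row1 + row2, col1)
    low, high = max(0, col1 - row2), min(row1, col1)
    probs = [comb(row1, x) * comb(row2, col1 - x) / total
             for x in range(low, high + 1)]
    p_observed = probs[a - low]
    # Small tolerance so ties are not lost to floating-point rounding.
    return min(1.0, sum(p for p in probs if p <= p_observed * (1 + 1e-9)))


def compare_groups(a, b, c, d):
    """Pick the test by the <5 rule and return (test name, P value)."""
    if min(a, b, c, d) < 5:
        return "Fisher exact", fisher_exact_two_sided(a, b, c, d)
    return "chi-square with Yates correction", yates_chi_square(a, b, c, d)[1]


# Hypothetical counts, not study data.
print(compare_groups(10, 20, 20, 10))
print(compare_groups(3, 1, 1, 3))
```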
In some cases, a participant’s classification changed when a different imaging modality was presented, and that change in classification led to a change in treatment plan. The rate of change in treatment plan after a change in classification varied little across imaging modality. However, a change in Modified Garden Classification for an individual surgeon led to a change in treatment for that surgeon 73% to 81% of the time. This would indicate that for an individual surgeon, a change in perceived displacement frequently impacts the treatment plan.
Finally, the effect of an attending surgeon’s subspecialty focus on treatment was analyzed. The rate of recommendation for arthroplasty according to attending subspecialty (arthroplasty trained, trauma trained, and other subspecialty) for both hemiarthroplasty and THA combined as well as for THA alone is shown in Table 4. The arthroplasty recommendation rate was similar across subspecialty (P>.8) and imaging modality (P>.8). However, when arthroplasty was felt to be indicated, arthroplasty-trained surgeons were more likely than trauma-trained surgeons and surgeons with other subspecialty training to recommend THA rather than hemiarthroplasty: approximately 20× more likely with plain radiograph only (OR 20.7; 95% CI: 2.4, 153.5; P<.010) and approximately 60× more likely with CT (OR 60.7; 95% CI: 6.5, 501.1; P<.001).
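For readers less familiar with odds ratios, the Python sketch below shows how an odds ratio and an approximate 95% confidence interval can be derived from a 2×2 table. It assumes the common log-based (Woolf) interval; the article does not state which interval method SPSS applied, and the counts shown are hypothetical, not the study's data.

```python
from math import exp, log, sqrt


def odds_ratio_with_ci(a, b, c, d, z=1.96):
    """Odds ratio (a*d)/(b*c) with a Woolf log-based confidence interval.

    a, b: THA and hemiarthroplasty recommendations in group 1
    c, d: THA and hemiarthroplasty recommendations in group 2
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive for this method")
    odds_ratio = (a * d) / (b * c)
    se_log_or = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = exp(log(odds_ratio) - z * se_log_or)
    upper = exp(log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts: group 1 = arthroplasty-trained, group 2 = all others.
estimate, ci_low, ci_high = odds_ratio_with_ci(10, 5, 2, 20)
print(round(estimate, 1), round(ci_low, 1), round(ci_high, 1))
```

With small cell counts the standard error of the log odds ratio is large, so the interval spans a wide range, as it does for the intervals reported above.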
Discussion
Femoral neck fractures are common in the elderly and can have devastating effects on the functional status of these patients.9,10 It is estimated that with the increasing elderly population, there will be >650,000 hip fractures per year by 2050.11 Despite the frequency of these fractures, the lack of a reliable and easily reproducible fracture classification predicting treatment highlights our continued lack of understanding of this type of fracture. In many respects, it remains the unsolved fracture.12
Multiple femoral neck fracture classifications have been proposed, but the classification according to Garden remains the most commonly employed classification in North America. Over the years, however, its reliability, reproducibility, and ability to predict treatment have been questioned. Frandsen et al2 concluded that observers had poor ability to delineate the stages of the Garden classification after presenting 100 preoperative radiographs of femoral neck fractures to orthopedic surgeons and radiologists in various stages of training. Complete agreement on classification among all observers was found in only 22 cases, while in 33 fractures, the observers disagreed as to whether the fracture was displaced. Kappa values were not reported.
Thomsen et al3 investigated the reliability of the Garden classification by presenting 96 sets of radiographs (AP and lateral) to 6 reviewers on 2 occasions. Only fair agreement (Kappa=0.39 and Kappa=0.40) was found for each reading. However, this improved to substantial agreement (Kappa=0.68 and Kappa=0.67) when Stage I and II were combined and compared to Stage III and IV combined. It was concluded that the Garden classification had poor reliability.
Oakes et al4 investigated the effect of the Garden classification on the proposed treatment plan. Fair agreement was found for the Garden Classification (Kappa=0.43) and this changed little with the addition of a lateral radiograph (Kappa=0.43). Similar to Thomsen et al,3 agreement improved to substantial when Stage I and II were combined and compared to Stage III and IV combined (Kappa=0.68). Additionally, it was found that treatment plan was rarely impacted by a change in classification. To improve reliability, it was suggested that the Garden classification be modified to 2 stages: valgus impacted or nondisplaced versus displaced. Additionally, it was concluded that patient age may play a greater role in determining treatment plan than fracture classification.
While not previously used for femoral neck fractures, CT has been evaluated in the classification and treatment of other fractures. Chan et al5 presented 21 cases of tibial plateau fractures to orthopedists and radiologists. The participants were first queried on their classification and treatment when presented with plain radiographs. This was followed by similar questioning when the CT scans were presented. It was found that while the addition of CT did not improve agreement for the Schatzker classification, the addition of CT improved the ability to perceive comminution and depression, ultimately leading to improved agreement in treatment plan. Katz et al6 found similar increased agreement in treatment plan, as well as improved sensitivity to detection of comminution and fracture gapping when CT was compared to plain radiograph in the evaluation of 15 distal radius fractures.
As in the report by Chan et al5 on tibial plateau fractures and the Schatzker classification, this study evaluated the effect of CT on a fracture classification originally devised from plain radiographs. In fact, Garden’s original description was based on AP radiographs with a focus on the trabeculae within the acetabulum and femoral head to guide staging. However, Garden’s descriptions of the fracture patterns do not preclude the addition of imaging in other planes in determining the classification stage, and one could suppose that additional imaging perspectives should increase understanding of the fracture and, likely, classification agreement. Previous authors have nonetheless reported little effect on Garden classification agreement with the addition of a lateral radiograph.4 In contrast, this study found that additional imaging in the form of axial CT cuts and a single coronal cut (simulating the AP radiograph), whether viewed alone or in addition to plain radiographs, improved agreement in all cases. This improvement was enhanced when the Garden classification was modified, as had been previously described, to minimally displaced versus displaced. This agrees with previous reports and a survey by Zlowodzki et al13 that found that only 39% of surgeons surveyed thought that they could distinguish between all 4 Garden classification types, while 96% felt they could differentiate between Garden I/II (minimally displaced) and Garden III/IV (displaced). Computed tomography provides improved spatial understanding of the fracture and its effect is most profound in detecting displacement, as had been suggested by Chan et al5 and Katz et al.6
Level of training also impacts agreement for the Garden classification. Residents and fellows achieved only fair agreement when CT was used in the Modified Garden classification, while attending surgeons of all types of subspecialty training achieved moderate agreement with CT and modification of the Garden Classification. This agrees with previous fracture classification studies, which have found improved agreement among attending surgeons.5
With regard to treatment plan, the data demonstrate poor overall agreement regardless of imaging or modification of the treatment to binary categories (fixation and arthroplasty). Even when interobserver agreement for classification improved for the study group as a whole, especially for the Modified Garden Classification, this did not translate to improved agreement in treatment plan. This distinction between minimally displaced versus displaced has been suggested as an important factor in determining treatment in the elderly. However, it is clear that factors other than fracture classification, such as surgeon preference, subspecialty training, patient age, and activity level, must have a greater influence on treatment plan than fracture classification alone. This is best illustrated by reviewing patient 3. This case achieved almost perfect agreement for the Modified Garden Classification with nearly universal agreement that the fracture was displaced; however, treatment agreement changed little, ranging from only slight to fair. Even when surgeons could agree on fracture classification, they could not agree on method of treatment.
In an effort to evaluate the contribution of plain radiograph compared to CT on the Garden classification and Modified Garden classification, the percentage of classifications that changed when CT was added to plain radiograph and when plain radiograph was added to CT was analyzed. No difference in the number of classification changes existed for the Garden classification; however, reviewers were 1.7× more likely (P=.042) to change their Modified Garden classification when CT was added to plain radiograph compared to plain radiograph added to CT. This suggests that CT played a more significant role in determining the Modified Garden Classification (ie, perception of displacement) when viewed in combination with plain radiograph.
In some instances, changes in classification were found that led to changes in treatment plan. For the Garden classification, a change in classification resulted in a change in treatment plan 50% to 55% of the time, and no statistical difference existed across imaging modalities. However, for the modified Garden Classification, a change in classification resulted in a change in treatment plan 73% to 81% of the time, regardless of imaging modality. This would suggest that despite poor agreement on treatment plans across surgeons, a change in Modified Garden Classification for an individual surgeon, representing a change in perceived displacement, contributes substantially to an individual’s treatment plan. Thus, surgeons may consider their perceived displacement of the fracture when formulating a treatment plan despite disagreeing on displacement and preferred treatment with other surgeons.
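The percentages in this paragraph are conditional rates: among responses in which a surgeon's classification changed between imaging modalities, the share in which the treatment plan also changed. A minimal Python sketch of that tally follows; the record layout and the example responses are hypothetical and are not drawn from the study data.

```python
def treatment_change_rate(responses):
    """Share of classification changes that came with a treatment change.

    responses: tuples of (classification_before, classification_after,
    treatment_before, treatment_after) for one surgeon and one case.
    Returns None when no classification changed.
    """
    changed = [r for r in responses if r[0] != r[1]]
    if not changed:
        return None
    with_treatment_change = sum(1 for r in changed if r[2] != r[3])
    return with_treatment_change / len(changed)


# Hypothetical paired responses (modified classification, treatment choice).
responses = [
    ("nondisplaced", "displaced", "CRPP", "hemiarthroplasty"),
    ("displaced", "nondisplaced", "THA", "CRPP"),
    ("nondisplaced", "displaced", "ORIF", "ORIF"),
    ("displaced", "displaced", "THA", "THA"),
]
rate = treatment_change_rate(responses)
print(rate)
```

Responses with an unchanged classification are excluded from the denominator, so the rate describes individual surgeons' behavior and says nothing about agreement between surgeons.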
Other studies have suggested that CT would better detect subtle comminution and displacement not revealed on plain radiograph.5,6 It was hypothesized that for femoral neck fractures, this would lead to an increase in the number of fractures with higher Garden classification. The data do not support this supposition, as there was no significant difference across imaging modality in the number of fractures that were reclassified to a higher Garden classification. It is possible that in certain cases of femoral neck fracture, plain radiograph may exaggerate the perception of fracture displacement.
When reviewing treatment of femoral neck fractures, a growing body of literature suggests that THA may provide improved outcomes for specific populations of elderly patients.14,15 It has been thought that surgeons not trained in arthroplasty would be less likely to perform THA for fracture; however, the effect of subspecialty training on treatment plan has not been previously studied. While the number of attending surgeons in our sample is small, clear associations were evident. The rate of recommendation of arthroplasty was similar across all imaging modalities and subspecialties (arthroplasty-trained, trauma-trained, other); however, arthroplasty-trained surgeons were 20 to 60× more likely (P<.009) to recommend THA over hemiarthroplasty than nonarthroplasty-trained surgeons when arthroplasty was the preferred treatment. At this subspecialized academic medical center, this finding most likely results from the differences in the scope of practice and familiarity with inserting an acetabular cup by surgeons who rarely perform THA. However, a multicenter investigation of both community and academic institutions would best address this finding.
The relatively low intraobserver agreement certainly limits this study. In an effort to keep this voluntary, uncompensated survey to a reasonable completion time frame, a single randomly repeated case, consisting of plain radiograph only and CT only, was selected to evaluate intraobserver reliability. This case also proved to be one of the most controversial cases, with very poor interobserver reliability as well. Repeating the study in its entirety at an additional sitting would likely have improved intraobserver agreement. Additionally, to our knowledge, this is the first study to assess classification agreement through an online survey that can be completed at any time and even allows the participant to stop participation and return to the survey at a later time via a personal login and password. Since the setting in which the participant took the survey is unknown, it is possible that the participants’ attention to the study was compromised compared to administering the survey in a more controlled environment, as previous studies had. Moreover, while the study was voluntary, it was offered to surgeons at the authors’ home institution. Subjects may have felt obligated to participate, which potentially would have affected their true interest in the study and ultimately the attention and focus participants devoted. These factors could also have contributed to the low intraobserver agreement.
Our study design also differed from previous investigations of classification agreement by surveying numerous participants with relatively few cases. This prevented an analysis of the effect of imaging modality on specific fracture patterns. For instance, imaging modality may be of little benefit in widely displaced or nondisplaced fractures, but an effect may be evident for those with intermediate displacement. Additionally, it presented the possibility that a single, outlying case could greatly affect the analysis of the group.
As with any survey with repeated images, recall bias is possible; however, the random presentation of images for each participant as well as the difficulty in correlating an independent axial CT image with the corresponding plain radiograph likely limited this effect.
Finally, the brief clinical scenario presented with the survey may have been too broad. Oakes et al4 had concluded that age was likely of greater importance than fracture pattern in determining treatment plan. Our description of a patient over the age of 65 without including activity level likely presents an unrealistic situation, as many surgeons would treat an active 65-year-old patient differently from a sedentary 90-year-old patient.
Despite methodological limitations, this study provides evidence that CT, either alone or in combination with plain radiograph, improves agreement for the Garden and Modified Garden Classifications. Additionally, it supports previous work that demonstrated greatly improved agreement through modification of the Garden Classification. However, this study highlights the vast disparities in the treatment of femoral neck fractures, even when near universal agreement for classification can be attained for a specific fracture. These disparities, especially the differences in treatment patterns between subspecialties, should be further investigated.
- Garden RS. Low-angle fixation in fractures of the femoral neck. J Bone Joint Surg Br. 1961; 43(4):647-663.
- Frandsen PA, Andersen E, Madsen F, Skjødt T. Garden’s classification of femoral neck fractures. An assessment of inter-observer variation. J Bone Joint Surg Br. 1988; 70(4):588-590.
- Thomsen NO, Jensen CM, Skovgaard N, et al. Observer variation in the radiographic classification of fractures of the femur using Garden’s system. Int Orthop. 1996; 20(5):326-329.
- Oakes DA, Jackson KR, Davies MR, et al. The impact of the Garden classification on proposed operative treatment. Clin Orthop Relat Res. 2003; (409):232-240.
- Chan PS, Klimkiewicz JJ, Luchetti WT, et al. Impact of CT scan on treatment plan and fracture classification of tibial plateau fractures. J Orthop Trauma. 1997; 11(7):484-489.
- Katz MA, Beredjiklian PK, Bozentka DJ, Steinberg DR. Computed tomography scanning of intra-articular distal radius fractures: does it influence treatment? J Hand Surg Am. 2001; 26(3):415-421.
- Brennan RL, Prediger DJ. Coefficient Kappa: Some uses, misuses, and alternatives. Educ Psychol Meas. 1981; 41(3):687-699. http://epm.sagepub.com/content/41/3/687.abstract. Accessed May 17, 2010.
- Landis JR, Koch GG. The measurement of observer agreement for categorical data. Biometrics. 1977; 33(1):159-174.
- Bentler SE, Liu L, Obrizan M, et al. The aftermath of hip fracture: discharge placement, functional status change, and mortality. Am J Epidemiol. 2009; 170(10):1290-1299.
- Maggi S, Siviero P, Wetle T, et al. A multicenter survey on profile of care for hip fracture: predictors of mortality and disability. Osteoporos Int. 2010; 21(2):223-231.
- Shah AK, Eissler J, Radomisli T. Algorithms for the treatment of femoral neck fractures. Clin Orthop Relat Res. 2002; (399):28-34.
- McCarroll HR. Has a solution for the unsolved fracture been found? Problems and complications of fractures of femoral neck. J Am Med Assoc. 1953; 153(6):536-540.
- Zlowodzki M, Bhandari M, Keel M, Hanson BP, Schemitsch E. Perception of Garden’s classification for femoral neck fractures: an international survey of 298 orthopaedic trauma surgeons. Arch Orthop Trauma Surg. 2005; 125(7):503-505.
- Ravikumar KJ, Marsh G. Internal fixation versus hemiarthroplasty versus total hip arthroplasty for displaced subcapital fractures of femur – 13 year results of a prospective randomized study. Injury. 2000; 31(10):793-797.
- Skinner P, Riley D, Ellery J, Beaumont A, Coumine R, Shafighian B. Displaced subcapital fractures of the femur: a prospective randomized comparison of internal fixation, hemiarthroplasty and total hip replacement. Injury. 1989; 20(5):291-293.
Dr Melvin is from the Department of Orthopedic Surgery, Carolinas Medical Center, Charlotte, North Carolina; Dr Matuszewski is from the Department of Orthopedic Surgery, University of Maryland, College Park, Maryland; and Drs Scolaro, Baldwin, and Mehta are from the Department of Orthopedic Surgery, University of Pennsylvania, Philadelphia, Pennsylvania.
Drs Melvin, Matuszewski, Scolaro, Baldwin, and Mehta have no relevant financial relationships to disclose.
Correspondence should be addressed to: J. Stuart Melvin, MD, Department of Orthopedic Surgery, Carolinas Medical Center, 1320 Scott Ave, Charlotte, NC 28203 (firstname.lastname@example.org).