Dr Wilder is from the Department of Physical Medicine and Rehabilitation, University of Virginia, Charlottesville, Va; Drs Heather Vincent and Kevin Vincent are from the Division of Physical Medicine and Rehabilitation, Department of Orthopaedics and Rehabilitation, UF & Shands Orthopaedics and Sports Medicine Institute, University of Florida, Gainesville, Fla; Dr Stewart is from Norlanco Medical Associates, Elizabethtown, and Dr Pack is from the Department of Physical Medicine and Rehabilitation, The University of Pittsburgh, Pittsburgh, Pa.
Address correspondence to Kevin R. Vincent, MD, PhD, Division of Physical Medicine and Rehabilitation, Department of Orthopaedics and Rehabilitation, UF & Shands Orthopaedics and Sports Medicine Institute, University of Florida, PO Box 112727, Gainesville, FL 32610; e-mail: firstname.lastname@example.org.
The accumulation of bone microdamage and bone remodeling, both resulting from bone strain, contribute to the formation of stress fractures.1 Stress fractures account for 0.7% to 20% of all injuries presenting to sports medicine clinics.2 Track and field athletes have the highest incidence of stress fractures, compared with athletes in other major sports.3 Although stress fracture prevalence varies among bony sites, the most common site is the tibia, followed by the metatarsals and fibula.4 The site of stress fractures also appears to vary among sports. For example, among track athletes, navicular stress fractures predominate; tibial stress fractures are most common in distance runners, and metatarsal stress fractures predominate in dancers.2,5,6
Patients with stress fractures typically present with a history of insidious onset of activity-related pain that progressively worsens over time. The most obvious physical examination feature is localized bony tenderness. Special clinical tests, such as the one-legged hop test and the fulcrum test, may be useful in detecting stress fractures of the lower extremities. Other commonly used tests include imaging studies, such as radiographs, bone scans, and magnetic resonance imaging (MRI). In approximately two-thirds of symptomatic patients, radiographs are initially negative, and only half ever develop positive radiographic findings.7 The most common sign in early stress fracture is a region of focal periosteal bone formation. The gray cortex sign (a cortical area of decreased density) may also be seen.8,9 Where clinically indicated, advanced imaging such as a bone scan or MRI should be used to confirm the diagnosis. Bone scans can confirm the stress fracture diagnosis as early as 2 to 8 days after the onset of symptoms.10,11 Magnetic resonance imaging has also shown promise in grading progressive stages of stress fracture severity.8,10
In cases where these tools are not readily available, the application of ultrasound or a vibrating tuning fork may be helpful in the diagnosis of stress fractures by increasing pain at the fracture site; however, there is a paucity of data regarding the clinical utility of these commonly used tools. Although the use of tuning forks to diagnose stress fractures is a widely held practice, only 1 scientific investigation has examined this diagnostic technique. Lesho12 compared the performance of a 128-Hz tuning fork test with nuclear scintigraphy for the identification of tibial stress fractures. Sensitivity and specificity of the tuning fork test were 75% and 67%, respectively. The positive and negative predictive values were 77% and 63%, respectively. It was concluded that the tuning fork test was not sensitive enough to rule out a stress fracture on the basis of a negative test. However, it was recommended that in a setting in which there was a moderate to high risk of stress fractures, it might be reasonable to avoid bone scan by instituting treatment for tibial stress fractures when the tuning fork test was positive. The validity and reliability of the tuning fork test in the detection of stress fractures have not been thoroughly examined. Therefore, the purpose of this pilot study was to examine the diagnostic ability, sensitivity, and specificity of tuning forks (of 3 different frequencies) on lower limb stress fractures in trained runners.
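The recommendation to act on a positive test only where the risk of stress fracture is moderate to high reflects the dependence of predictive values on pre-test probability. The following is a minimal sketch of that relationship, using the sensitivity and specificity reported by Lesho; the function name and the three prevalence values are illustrative assumptions and are not taken from either study.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a test applied at a given pre-test probability."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    true_neg = specificity * (1 - prevalence)
    false_neg = (1 - sensitivity) * prevalence
    return true_pos / (true_pos + false_pos), true_neg / (true_neg + false_neg)


# Sensitivity (75%) and specificity (67%) as reported by Lesho.
# The prevalence values below are hypothetical, chosen only for illustration.
for prevalence in (0.10, 0.30, 0.60):
    ppv, npv = predictive_values(0.75, 0.67, prevalence)
    print(f"prevalence {prevalence:.0%}: PPV {ppv:.1%}, NPV {npv:.1%}")
```

With fixed sensitivity and specificity, the positive predictive value rises and the negative predictive value falls as prevalence increases, which is why a positive result carries more weight in a higher-risk setting.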
Materials and Methods
A total of 45 consecutive runners suspected of having stress fractures were evaluated through the Runners Clinic at our institution (Table 1). Each participant read and signed an informed consent document. Following a clinical examination for leg pain symptoms, each athlete underwent the tuning fork test and appropriate imaging (Table 2). Patients were included in the study if a stress fracture was suspected based on clinical history and examination. Typical findings include a change in exercise volume prior to the start of symptoms, pain that occurs progressively earlier in an exercise bout, pain with a one-legged hop, and point tenderness with palpation. All consecutive patients with a suspected stress fracture who met these criteria were offered enrollment. The only exclusion criterion was failure to meet the requirements of clinical suspicion, as described. All study procedures were approved by the institutional review board, and all procedures complied with the treatment of human subjects as specified by the American College of Sports Medicine.
Table 1: Participant and Fracture Profile of Runners Presenting with Leg Pain Symptoms (N = 45)
Table 2: Imaging Techniques Performed and Time Frame of Imaging Techniques from the Onset of Pain in Runners with Leg Pain Symptoms (N = 45)
Plain radiographs initially were obtained from 44 of the 45 participants. Advanced imaging (bone scan or MRI) was also obtained in 38 of the participants. For negative radiographs, this was indicated clinically to evaluate for the presence or absence of stress fracture. For positive plain radiographs, advanced imaging was additionally useful in confirming acuity and grading the lesion, as well as allowing correlation of the tuning fork test with positive imaging results.
Tuning Fork Testing
The tuning fork test consisted of placing a vibrating tuning fork on the skin directly above the point of maximal bone tenderness for 10 seconds. Three trials were performed on each participant, one with each of 3 tuning forks of different vibration frequencies (128, 256, and 512 Hz). The order of tuning fork application was randomized for each participant, using a computer program that generated a list of random fork orders. Every effort was made by the examiner to apply equal pressure during each trial. All tuning fork measurements were obtained by the first author (R.P.W.).
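The randomization program itself is not described. As a hedged illustration only, a per-participant random ordering of the 3 forks could be generated as follows; the function name, the fixed seed, and the use of Python's standard `random` module are assumptions, not the software used in the study.

```python
import random

FREQUENCIES_HZ = (128, 256, 512)


def assign_fork_orders(n_participants, seed=0):
    """Return one randomly ordered tuple of fork frequencies per participant."""
    rng = random.Random(seed)  # fixed seed so the allocation list can be regenerated
    return [
        tuple(rng.sample(FREQUENCIES_HZ, k=len(FREQUENCIES_HZ)))
        for _ in range(n_participants)
    ]


orders = assign_fork_orders(45)
for participant_id, order in enumerate(orders[:3], start=1):
    print(participant_id, order)
```

Drawing a full permutation for each participant ensures that every participant receives all 3 frequencies exactly once, with only the order varying.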
Each participant used a standardized pain rating scale to grade his or her perception of any increased pain as a result of the application of the vibrating tuning fork, compared with an initial trial without vibration. Standard directions for the rating scale were provided by the examiners to each participant on subjective pain ratings to reduce intraparticipant error. Pain was rated as follows: 0 = no pain; 1 = mild pain; 2 = moderate pain; 3 = severe pain.
Statistical Analysis
Data were analyzed using SPSS software (version 13.0; SPSS, Inc, Chicago, Ill). Frequencies, means, and standard deviations were determined for specific descriptive variables of the participant pool. A Kruskal-Wallis test was used to determine whether differences existed among the pain ratings generated by the 3 tuning forks. Univariate analyses of variance were performed to determine whether differences in pain ratings occurred with each tuning fork. Tukey post hoc tests were used when appropriate to determine where differences existed among group comparisons. Odds ratios were calculated to determine the odds of a positive imaging finding in the presence of a mild, moderate, or severe pain rating.
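For readers unfamiliar with the Kruskal-Wallis test, the sketch below computes the H statistic with a tie correction for 3 groups of ordinal ratings. It is a standard-library illustration, not the SPSS procedure used here, and the pain ratings in it are hypothetical values invented for the example. The closed-form P value holds only for 3 groups (2 degrees of freedom) under the chi-square approximation.

```python
import math
from collections import Counter


def kruskal_wallis_3(group_a, group_b, group_c):
    """Return (H, p) for 3 groups, with tie correction.

    With 2 degrees of freedom the chi-square tail probability is exp(-H / 2).
    """
    groups = (group_a, group_b, group_c)
    counts = Counter(value for group in groups for value in group)
    n = sum(counts.values())
    # Midranks: tied values share the average of the ranks they occupy.
    ranks, position = {}, 0
    for value in sorted(counts):
        ties = counts[value]
        ranks[value] = position + (ties + 1) / 2
        position += ties
    h = 12 / (n * (n + 1)) * sum(
        sum(ranks[v] for v in group) ** 2 / len(group) for group in groups
    ) - 3 * (n + 1)
    correction = 1 - sum(t**3 - t for t in counts.values()) / (n**3 - n)
    if correction == 0:
        raise ValueError("all observations are identical")
    h /= correction
    return h, math.exp(-h / 2)


# Hypothetical 0-3 pain ratings, NOT data from this study.
ratings_128 = [0, 1, 1, 2, 0, 3, 1, 2]
ratings_256 = [2, 1, 3, 2, 2, 3, 1, 2]
ratings_512 = [0, 0, 1, 0, 2, 1, 0, 1]
h, p = kruskal_wallis_3(ratings_128, ratings_256, ratings_512)
print(f"H = {h:.2f}, P = {p:.4f}")
```

The tie correction matters for a 4-point scale, because most observations share a rating with many others.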
Cox proportional hazards regression analysis was used to examine the association between the risk of stress fracture diagnosis and the independent variables of interest.13 Cox proportional hazards regression models for stress fracture development were used to assess each tuning fork frequency, and separate regression models were generated for each tuning fork. Pain ratings (0–3) were analyzed as time-dependent covariates whose values could change during the injury course. The level of significance was set at P < .05 for all statistical tests.
Results
The study included 45 participants (19 men, 26 women) (Table 1). The majority of stress fractures were located in the tibia, followed by the metatarsals (Table 1). All participants underwent ≥1 imaging technique following the pain assessment with the tuning forks. A total of 44 of 45 participants underwent radiographic imaging, whereas 84% of all participants had advanced imaging tests (MRI or bone scan) (Table 2). The time from the initial onset of symptoms to time of the imaging assessment (lag time) ranged from 0 to 432 days. Of the 45 participants enrolled in this study, 35 had stress fractures, confirmed by radiographic imaging.
Pain Ratings Induced by the Tuning Forks
The 3 tuning forks generated different average pain ratings. The 256-Hz tuning fork produced overall higher average pain scores (1.73±0.96) than either the 128-Hz or 512-Hz tuning forks (1.17±1.05 and 0.82±1.05, respectively; P = .0001). Mean pain ratings produced by each tuning fork in participants with positive and negative imaging findings for stress fractures are presented in Figure 1. The average pain ratings for all 3 tuning forks combined tended to be higher in participants with positive imaging findings than in those with negative imaging findings, although the difference did not reach the prespecified level of significance (1.33±.08 versus 0.8±1.01, respectively; P = .056).
Figure 1. Pain Ratings Induced by the 3 Tuning Forks (mean±SD).
Positive Pain Ratings Predict Positive Imaging Findings
Regardless of the tuning fork used, higher pain ratings tended to be associated with a positive radiographic, MRI, or bone scan imaging finding, although the correlation was weak and did not reach the prespecified level of significance (r = 0.156, P = .056). The odds ratio of any positive pain score (rating of 1–3) to positive imaging evidence of stress fracture was 2.67, compared with a pain score of 0. A score of 3 (severe pain) from any tuning fork was associated with a higher risk of imaging evidence of stress fracture, compared with scores 0 through 2; the odds ratio of demonstrating imaging evidence of stress fractures was 5.91 with a pain rating of 3, compared with pain ratings of <3.
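As an illustration of how an odds ratio of this kind is obtained from a 2×2 table, the sketch below uses the 128-Hz fork versus bone scan counts from Table 3. It does not reproduce the pooled odds ratios of 2.67 and 5.91 reported above, whose underlying counts are not given; the function names and the Woolf (logit) confidence interval are assumptions added for illustration and are not part of the reported analysis.

```python
import math


def odds_ratio(pain_img_pos, pain_img_neg, nopain_img_pos, nopain_img_neg):
    """Odds of positive imaging with pain, relative to the odds without pain."""
    return (pain_img_pos * nopain_img_neg) / (pain_img_neg * nopain_img_pos)


def woolf_interval(a, b, c, d, z=1.96):
    """Approximate 95% confidence interval for the odds ratio (logit method)."""
    log_or = math.log(odds_ratio(a, b, c, d))
    standard_error = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return math.exp(log_or - z * standard_error), math.exp(log_or + z * standard_error)


# Table 3, 128-Hz fork vs bone scan:
# positive pain: 12 scan-positive, 2 scan-negative; no pain: 5 scan-positive, 3 scan-negative.
cells = (12, 2, 5, 3)
estimate = odds_ratio(*cells)
lower, upper = woolf_interval(*cells)
print(f"OR = {estimate:.2f} (95% CI, {lower:.2f}-{upper:.2f})")
```

With cell counts this small, the interval is wide, which is consistent with the caution about sample size expressed in the study limitations.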
Sensitivity and Specificity of the Tuning Fork Measures
Both sensitivity and specificity of the 3 tuning forks to detect imaging evidence of a stress fracture from each imaging technique were determined (Table 3). Among the 3 tuning forks, the 256-Hz fork generated the highest sensitivity scores, whereas the 512-Hz fork produced the lowest sensitivity scores. In contrast, the specificity scores of the 256-Hz fork were the lowest of the 3 tuning forks. Therefore, the false-positive rate of pain to predict stress fracture imaging was high with the 256-Hz fork. Positive predictive values (PPV) and negative predictive values (NPV) were determined for each tuning fork for each separate imaging test type and for all imaging types combined. Although all of the tuning forks had good PPVs for MRI, the PPVs for radiographs and bone scans were inconsistent. Negative predictive values were highest for all 3 tuning forks for the radiograph but were inconsistent for the MRI and bone scan tests. The overall PPV and NPV of each tuning fork to predict positive stress fracture findings were not high, regardless of imaging technique (range, 51.4%–58.6%).
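The measures described above follow directly from the 2×2 counts in Table 3. The sketch below recomputes them for the radiograph columns; the class and attribute names are illustrative assumptions, and only the counts come from the table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """Fork response (pain vs no pain) cross-tabulated against an imaging result."""
    tp: int  # positive pain, positive imaging
    fp: int  # positive pain, negative imaging
    fn: int  # no pain, positive imaging
    tn: int  # no pain, negative imaging

    @property
    def sensitivity(self) -> float:
        return self.tp / (self.tp + self.fn)

    @property
    def specificity(self) -> float:
        return self.tn / (self.tn + self.fp)

    @property
    def ppv(self) -> float:
        return self.tp / (self.tp + self.fp)

    @property
    def npv(self) -> float:
        return self.tn / (self.tn + self.fn)


# Radiograph columns of Table 3, keyed by fork frequency in Hz.
RADIOGRAPH = {
    128: TwoByTwo(tp=10, fp=20, fn=2, tn=12),
    256: TwoByTwo(tp=12, fp=25, fn=1, tn=6),
    512: TwoByTwo(tp=10, fp=11, fn=3, tn=20),
}

for hz, table in RADIOGRAPH.items():
    print(f"{hz} Hz: sens {table.sensitivity:.1%}, spec {table.specificity:.1%}, "
          f"PPV {table.ppv:.1%}, NPV {table.npv:.1%}")
```

The same class can be filled with the MRI or bone scan columns of Table 3 to compare the forks against each imaging technique.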
Table 3: Sensitivity and Specificity of the Tuning Forks to Detect True Positive Stress Fracture Imaging Results
The cumulative hazard of developing a positive stress fracture imaging finding in participants who responded with any pain value to any of the 3 tuning forks over the time since initial pain symptoms is shown in Figure 2. Greater duration of pain symptoms detected by a tuning fork was associated with an increased risk of a positive diagnosis of stress fracture. Cumulative hazards for a positive stress fracture imaging finding are shown for each tuning fork on the basis of the pain ratings obtained from the tuning fork (Figures 3A through 3C). With the 256-Hz fork, stress fractures were formally diagnosed more frequently in participants reporting mild to severe pain than in those reporting no pain (Figure 3B). Similar hazard patterns were not consistently observed with the 128-Hz and 512-Hz forks.
Figure 2. Cox Regression Hazard of Developing a Positive Stress Fracture Imaging Finding in Participants Who Responded to Any Tuning Fork with Pain Ratings of 1 to 3 (N = 36; “Fork Responders”).
Figure 3. Cox Regression Hazard of Developing a Positive Stress Fracture Imaging Finding when Using the 128-Hz Tuning Fork (A), the 256-Hz Tuning Fork (B), and the 512-Hz Tuning Fork (C).
Discussion
This pilot study provided novel data regarding the diagnostic value of tuning forks for stress fractures in runners. The evidence suggests that a pain response (rating of 1–3) induced by tuning forks at the site of bone pain is related to a positive imaging finding; a pain rating of 3 further strengthens this relationship. Given that the 256-Hz fork induced higher average pain ratings than did the 128-Hz or 512-Hz forks, and that high pain scores can better predict a stress fracture finding, we can infer from these data that the 256-Hz fork may be more useful in predicting actual stress fractures than the remaining tuning forks. Compared with the 2 other tuning forks, the 256-Hz fork showed a promising pattern over the duration of symptoms: higher tuning fork-induced pain ratings were related to an increased risk of a positive stress fracture finding, especially as the duration of symptoms increased. The other 2 tuning forks did not elicit this same response.
Earlier studies attempted to find a cost-efficient, expeditious test for stress fracture diagnosis. Several studies have examined the association between ultrasound-induced pain and stress fractures.14–16 The accuracy of ultrasound to correctly identify stress fractures has been mixed. Two studies reported accuracy as high as 92.6% to 96%.15,16 In contrast, other ultrasound studies have produced less promising results, with sensitivity to detect stress fractures ranging from 43% to 75%.14,17,18 Romani et al19 compared visual analog pain rating values following application of a 1-MHz continuous ultrasound to Fredericson grading ratings on MRI. Overall, participants were correctly classified by ultrasound in only 40% of cases. None of the participants found to have stress fracture by MRI were correctly classified by ultrasound. The authors concluded that pain induced by ultrasound was not particularly sensitive for detecting tibial stress fractures.19
Magnetic resonance imaging has been touted as the single best technique for assessment of patients with suspected tibial stress injuries.20 The presence of a periosteal reaction on radiographs at the site of pain symptoms may predict a high-grade stress injury by MRI criteria.21 Gaeta et al20 compared the sensitivity and specificity of MRI, computed tomography, and bone scintigraphy (bone scan) for detecting tibial stress fractures. The sensitivity of the bone scan was 74%. Magnetic resonance imaging sensitivity, specificity, accuracy, PPV, and NPV for tibial stress fractures were 88%, 100%, 90%, 100%, and 62%, respectively.20 The sensitivity, specificity, accuracy, PPV, and NPV for the computed tomography scan were 42%, 100%, 52%, 100%, and 26%, respectively. Although MRI is highly accurate, it is costly, is time-consuming for the patient, and may not be available in all communities.
The tuning fork technique in this study produced a mixed range of specificity and sensitivity, which does not match MRI capability. Depending on the specific tuning fork, the PPVs for stress fractures compared with MRI and bone scan were 72.2% to 90% and 66.6% to 85.7%, respectively. Positive predictive values were lowest when radiograph results were used (31.5%–47.6%). Lesho12 reported similar ranges of specificity and sensitivity for detecting fractures found by positive bone scans when a 128-Hz tuning fork was used. That study also concluded that this technique cannot necessarily replace imaging techniques for detection of stress fractures but may be useful when accessibility to imaging equipment is low. Advantages of the tuning fork technique for identifying stress fractures include cost-efficiency, absence of side effects, accessibility to all patients, and ease of administration. The variation in specificity and sensitivity among the 3 forks may be a consequence of varying lag times of the imaging assessments, differing medical histories of the patients, and several different anatomical sites of fracture.
Study Limitations and Strengths
Limitations to this study deserve comment. Despite promising results, the small sample size precluded a conclusion as to which tuning fork best predicted a positive imaging finding. These findings need to be repeated in a larger cohort with similar sites of injury (eg, all tibial bone sites) or in a sample large enough to provide adequate statistical cell sizes for the different bone types. Long lag times before imaging for some participants may have prevented accurate assessment of bone fracture, and this may have lowered the specificity. Finally, the small sample size precluded conclusions about bone type and imaging sensitivity because there were too few observations at the different anatomical bone sites. The strength of the study was the identification of a relationship between tuning fork-induced pain and positive imaging for stress fractures despite the wide variation within the study group.
Conclusion
These results suggest that the tuning fork test could be used as an adjunct to clinical examination only and that a negative test should not supplant appropriate imaging for the diagnosis of stress fractures in runners. In the clinic, patients could be requested to quantify their perception of pain using the 4-point scale (no pain increase, mild, moderate, severe) or with a standard 10-point scale. These data suggest that any tuning fork-induced pain rating is associated with a positive imaging finding, but a reported pain value of 3 can be considered more highly predictive for the presence of stress fracture. The tuning fork test may be most useful in those cases where imaging may not be readily available, such as in the field where a decision must be made regarding participation in running. Given its ability to produce a greater pain response, we propose that the 256-Hz tuning fork may be the most useful frequency in the clinical setting.
References
1. Brukner PD, Bennell KL. Stress fractures. In: O’Connor FG, Wilder RP, eds. Textbook of Running Medicine. New York, NY: McGraw-Hill; 2001:227–256.
2. Bergman AG, Fredericson M. MR imaging of stress reactions, muscle injuries, and other overuse injuries in runners. Magn Reson Imaging Clin N Am. 1999;7:151–174, ix.
3. Johnson AW, Weiss CB Jr, Wheeler DL. Stress fractures of the femoral shaft in athletes--more common than expected. A new clinical test. Am J Sports Med. 1994;22:248–256.
4. Brukner P, Bennell K, Matheson G. Stress Fractures. Carlton, Victoria, Australia: Blackwell Science; 1999.
5. Brukner P, Bradshaw C, Khan KM, White S, Crossley K. Stress fractures: A review of 180 cases. Clin J Sport Med. 1996;6:85–89.
6. Matheson GO, Clement DB, Mckenzie DC, Taunton JE, Lloyd-Smith DR, MacIntyre JG. Stress fractures in athletes: A study of 320 cases. Am J Sports Med. 1987;15:46–58. doi:10.1177/036354658701500107
7. Daffner RH, Pavlov H. Stress fractures: Current concepts. Am J Roentgenol. 1992;159:245–252.
8. Fredericson M. Stress fractures of the lower extremities. In: O’Connor FG, Wilder RP, Salis RE, St. Pierre P, eds. Sports Medicine: Just the Facts. New York, NY: McGraw-Hill; 2005:390–396.
9. Mulligan ME. The “gray cortex”: An early sign of stress fracture. Skeletal Radiol. 1995;24:201–203. doi:10.1007/BF00228923
10. Fredericson M, Bergman AG, Hoffman KL, Dillingham MS. Tibial stress reaction in runners. Correlation of clinical symptoms and scintigraphy with a new magnetic resonance imaging grading system. Am J Sports Med. 1995;23:472–481. doi:10.1177/036354659502300418
11. Roub LW, Gumerman LW, Hanley EN Jr, Clark MW, Goodman M, Herbert DL. Bone stress: A radionuclide imaging perspective. Radiology. 1979;132:431–438.
12. Lesho EP. Can tuning forks replace bone scans for identification of tibial stress fractures? Mil Med. 1997;162:802–803.
13. Kleinbaum DG. Survival Analysis: A Self-Learning Text. New York, NY: Springer-Verlag; 1996.
14. Giladi M, Nili E, Ziv Y, Danon YL, Aharonson E. Comparison between radiology, bone scan, and ultrasound in the diagnosis of stress fractures. Mil Med. 1984;149:459–461.
15. Moss A, Mowatt AG. Ultrasonic assessment of stress fractures. Br Med J. 1983;286:1479–1480. doi:10.1136/bmj.286.6376.1479
16. Nitz AJ, Scoville CR. Use of ultrasound in early detection of stress fractures of the medial tibial plateau. Mil Med. 1980;145:844–846.
17. Boam WD, Miser WF, Yuill SC, Delaplain CB, Gayle EL, MacDonald DC. Comparison of ultrasound examination with bone scintiscan in the diagnosis of stress fractures. J Am Board Fam Pract. 1996;9:414–417.
18. Devereaux MD, Parr GR, Lachmann SM, Page-Thomas P, Hazleman BL. The diagnosis of stress fractures in athletes. JAMA. 1984;252:531–533. doi:10.1001/jama.252.4.531
19. Romani WA, Perrin DH, Dussault RG, Ball DW, Kahler DM. Identification of tibial stress fractures using therapeutic continuous ultrasound. J Orthop Sports Phys Ther. 2000;30:444–452.
20. Gaeta M, Minutoli F, Scribano E, et al. CT and MR imaging findings in athletes with early tibial stress injuries: Comparison with bone scintigraphy findings and emphasis on cortical abnormalities. Radiology. 2005;235:553–561. doi:10.1148/radiol.2352040406
21. Kijowski R, Choi J, Mukharjee R, de Smet A. Significance of radiographic abnormalities in patients with tibial stress injuries: Correlation with magnetic resonance imaging. Skeletal Radiol. 2007;36:633–640. doi:10.1007/s00256-006-0272-4
Participant and Fracture Profile of Runners Presenting with Leg Pain Symptoms (N = 45)
|Bone tested (no. of cases in group)|
Imaging Techniques Performed and Time Frame of Imaging Techniques from the Onset of Pain in Runners with Leg Pain Symptoms (N = 45)
|IMAGING TECHNIQUE||NO. OF CASES IN GROUP||DAYS FROM SYMPTOM ONSET TO IMAGING TEST|
|Radiograph||44||75.4±95.3 (range, 0–420)|
|Magnetic resonance imaging||16||125.3±122.4 (range, 15–432)|
|Bone scan||22||103.3±104.9 (range, 10–388)|
Sensitivity and Specificity of the Tuning Forks to Detect True Positive Stress Fracture Imaging Results
|TUNING FORK AND RESPONSE||RADIOGRAPH (N = 44): − RESULT||RADIOGRAPH: + RESULT||MRI (N = 16): − RESULT||MRI: + RESULT||BONE SCAN (N = 22): − RESULT||BONE SCAN: + RESULT|
|128-Hz tuning fork|
| No pain||12||2||3||2||3||5|
| Positive pain||20||10||3||8||2||12|
| Sensitivity (true positive)||Radiograph 83.3%||MRI 80.0%||Bone scan 70.5%|
| Specificity (true negative)||Radiograph 37.5%||MRI 50.0%||Bone scan 60.0%|
|256-Hz tuning fork|
| No pain||6||1||1||1||1||4|
| Positive pain||25||12||5||9||3||14|
| Sensitivity (true positive)||Radiograph 92.3%||MRI 90.0%||Bone scan 77.7%|
| Specificity (true negative)||Radiograph 19.3%||MRI 20.0%||Bone scan 25.0%|
|512-Hz tuning fork|
| No pain||20||3||5||5||2||11|
| Positive pain||11||10||1||5||3||6|
| Sensitivity (true positive)||Radiograph 76.9%||MRI 50.0%||Bone scan 35.3%|
| Specificity (true negative)||Radiograph 64.5%||MRI 83.3%||Bone scan 40.0%|