Stereopsis is the perception of depth and is strictly a binocular phenomenon. It is expressed in seconds of arc (arcsec) and is considered the most developed and complex type of binocular vision.1,2
Stereoacuity depends on both sensory and motor abilities. Eyes are required to make convergent and fixating movements for standard near testing, and cortical fusion mechanisms, sensitive to retinal disparity, are also required.1 Because many factors influence stereopsis, its measurements can provide clinicians with essential information regarding the general health of the visual function, making it a significant vision screening tool.3
Fluctuations in stereoacuity have clinical implications. Improvements in stereoacuity are a key outcome measure after strabismus surgery.4 Deteriorations in stereoacuity are a central factor when monitoring the control of intermittent exotropia and are part of the considerations determining timing for strabismus surgery.1,5 Birch et al.6 reported that children who were stereoblind after infantile esotropia surgery were 3.6 times more likely to need repeat surgery later in childhood. Following accommodative esotropia surgery, the odds increased 17.4- to 32.2-fold.6 Normal stereoacuity is essential for multiple visuomotor tasks and has been linked to better reading ability.7,8
Many popular and accessible stereoacuity tests require polarized glasses.1–3 However, testing can be challenging9 in certain populations, particularly patients with short attention spans (eg, children and patients with intellectual disability), which may lead to underestimation of, or the inability to evaluate, “true” stereoacuity. In contrast, Ciner et al.10 and Birch et al.11 found that testability was high even in young children, underscoring how variable compliance can be in this age group.
This pilot study investigated the new Bernell Evaluation of Stereopsis Test (BEST) (Bernell Corporation, Mishawaka, IN), a filter-free method, and compared it to the Randot Stereotest (Randot) (Stereo Optical, Inc., Chicago, IL) in a pediatric cohort. The BEST uses lenticular technology for stereoacuity testing.11 Cylindrical lenses refract light from the targets, forming binocular parallax and the illusion of depth without filter glasses.11 The BEST uses colorful animal targets (eg, tiger, cat, dog, bear, bird, bee, and fish), which may appeal to younger children, to measure stereoacuity between 400 and 40 arcsec. Gross stereopsis is tested with a dinosaur target (Figures 1–2).
A young girl tested with the Randot Stereotest (Stereo Optical, Inc., Chicago, IL).
A young girl tested with the Bernell Evaluation of Stereopsis Test (BEST) (Bernell Corporation, Mishawaka, IN).
Patients and Methods
Children aged 3 to 18 years who were examined at the Center for Pediatric Ophthalmology from July to November 2018 participated in the study. This retrospective study included all healthy children, with and without strabismus, who were evaluated with the Randot and BEST. Children with intellectual disability and children who demonstrated no stereoacuity on either test were excluded from the study. The research was approved by the Hadassah-Hebrew University Medical Center institutional review board.
Each child was assessed with the BEST and Randot in an unmasked fashion at the same clinic visit by one of two pediatric ophthalmologists (OL or IA). Manufacturer instructions were followed by using standard fluorescent room lighting in all sessions, and the testing distance was fixed at 40 cm. Test plates were presented perpendicular to the visual axes, and no twisting of test plates or head movements was allowed. Stereoacuity outcome was defined as the lowest (finest) disparity a child could reliably distinguish. Tests were performed while each child was wearing his or her habitual correction.
Statistical analysis was performed using SPSS software for Windows (version 22.0; SPSS, Inc., Chicago, IL). Mean visual acuity was converted to logarithm of the minimum angle of resolution (logMAR) acuity prior to statistical analysis based on common conversion units.13 Stereoacuity values were converted to log arcsec for analysis, because stereoacuity thresholds are not on a linear scale. Unless otherwise specified, stereoacuity values were presented in log arcsec. Differences in agreement were represented in Bland–Altman plots, and measurements of agreement were calculated accordingly.14 A difference of 10 arcsec was considered clinically significant. Two-sided P values less than .05 were considered statistically significant. Clinical parameters were tested for normality by the Shapiro–Wilk test. Independent and paired t tests were used for continuous variables with a normal distribution. The Wilcoxon signed–rank and Mann–Whitney U tests were used for variables without a normal distribution. Unless otherwise specified, results are presented as mean ± standard deviation.
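As an illustration of the agreement analysis described above, the following Python sketch converts paired stereoacuity thresholds to log arcsec and computes the Bland–Altman bias and 95% limits of agreement. It is a minimal sketch only: the study itself used SPSS, the function names are hypothetical, and the sample thresholds are invented for illustration and are not study data.

```python
import math
from statistics import mean, stdev

def to_log_arcsec(arcsec):
    """Convert a stereoacuity threshold in arcsec to log10 arcsec."""
    return math.log10(arcsec)

def bland_altman(randot_arcsec, best_arcsec):
    """Return (bias, lower_loa, upper_loa) in log arcsec for paired scores.

    Differences are taken as Randot minus BEST, matching the convention
    used in the agreement plots of this article.
    """
    diffs = [to_log_arcsec(r) - to_log_arcsec(b)
             for r, b in zip(randot_arcsec, best_arcsec)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation of the differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd

# Hypothetical paired thresholds (arcsec), not study data
randot = [20, 40, 50, 100, 200, 400]
best = [40, 40, 60, 80, 100, 200]
bias, lower, upper = bland_altman(randot, best)
```

Working on the log scale means that each step between threshold levels (eg, 40 to 80 arcsec or 200 to 400 arcsec) contributes the same difference, which is the rationale given above for not analyzing raw arcsec values.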
This study included 107 participants. Seven participants did not meet the inclusion criteria and were excluded from the analysis. The analysis included 100 children between 3.3 and 17.8 years of age. The mean age was 8.52 ± 3.18 years and 53% of the children were female. Mean best corrected visual acuity (BCVA) was 0.178 ± 0.16 logMAR, and 64% of patients had no strabismus on the alternate cover test. Patient characteristics are presented in Table 1.
Patient Characteristics (N = 100)
Mean BEST stereoacuity scores were 1.772 ± 0.27 log arcsec (78.5 ± 84.7 arcsec) and mean Randot scores were 1.778 ± 0.39 log arcsec (95.2 ± 111.03 arcsec) (P = .835) (Figure 3). Similarly, stratifying by age and gender revealed no significant differences between the tests, which is represented in Table 2 and Figure 4. Presence of strabismus and visual acuity did not affect differences between test methods (all P > .05, Table 2).
Scatterplot of stereopsis results measured on the Bernell Evaluation of Stereopsis Test (BEST) (Bernell Corporation, Mishawaka, IN) versus those measured on the Randot Stereotest (Stereo Optical, Inc., Chicago, IL). Stereoacuity scores by BEST compared to Randot (in seconds of arc [arcsec]). Notice the group of children on the right side of the figure, who scored much better on the BEST compared to the Randot.
BEST and Randot Mean Stereoacuities According to Subgroup (log arcsec)
Difference between stereopsis tests according to age. Comparison of stereopsis tests (in seconds of arc [arcsec]) according to age (years). Note the difference in stereoacuity scores (Y axis = Randot Stereotest [Stereo Optical, Inc., Chicago, IL] minus the Bernell Evaluation of Stereopsis Test [BEST] [Bernell Corporation, Mishawaka, IN]) in the younger ages, which declines as age increases. Also visible is the tendency for higher results in the Randot for all ages.
The Bland–Altman agreement analysis revealed an overall bias of 0.0073 log arcsec (1.02 arcsec; 95% confidence interval [CI]: 0.04219 to 0.05679, 1.102 to 1.1396 arcsec) with limits of agreement of −0.4816 log arcsec (95% CI: −0.5664 to −0.3967) to 0.4962 log arcsec (95% CI: 0.4113 to 0.5810) (0.3299 to 3.1347 arcsec), which was within the predetermined minimal clinically significant difference of 10 arcsec. However, significant proportional bias was noted between tests (t = 5.566, P < .001), because the differences were significantly larger with higher log arcsec mean test results (Figure 5).
A Bland–Altman agreement plot showing stereoacuity scores between the Randot Stereotest (Stereo Optical, Inc., Chicago, IL) and the Bernell Evaluation of Stereopsis Test (BEST) (Bernell Corporation, Mishawaka, IN). Points were jittered for clarity. The plot illustrates a significant proportional bias between the tests (t = 5.566, P < .001), with larger differences at higher mean values. Note that the mean difference is represented as Randot–BEST. Values were higher in the Randot with higher mean values. Solid line = mean bias; dashed lines = limits of agreement; dotted line = regression line of difference with 95% confidence intervals; SD = standard deviation.
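The two calculations behind this agreement analysis can be sketched in Python as follows. The first back-transforms the reported log arcsec bias and limits of agreement; arithmetically, the antilog of a difference between two log values is a ratio between the two scores. The second assumes that proportional bias was assessed by regressing the pair differences on the pair means and testing the slope against zero, which is a common approach but is not confirmed by the text. Function names and the sample values passed to the regression are hypothetical and are not study data.

```python
import math
from statistics import mean

def back_transform(log_value):
    """Antilog of a log10 difference; read it as a Randot/BEST ratio."""
    return 10 ** log_value

def proportional_bias(diffs, means):
    """Least-squares regression of pair differences on pair means.

    Returns (slope, intercept, t_stat), where t_stat tests slope == 0
    with len(diffs) - 2 degrees of freedom.
    """
    n = len(diffs)
    mean_x, mean_y = mean(means), mean(diffs)
    sxx = sum((x - mean_x) ** 2 for x in means)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(means, diffs))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    residual_ss = sum((y - (intercept + slope * x)) ** 2
                      for x, y in zip(means, diffs))
    se_slope = math.sqrt(residual_ss / (n - 2) / sxx)
    return slope, intercept, slope / se_slope

# Values reported in the text, back-transformed from log arcsec
bias_ratio = back_transform(0.0073)
loa_ratios = (back_transform(-0.4816), back_transform(0.4962))

# Hypothetical pair means and differences (log arcsec), not study data
means = [1.5, 1.7, 1.9, 2.1, 2.3, 2.5]
diffs = [-0.2, -0.1, 0.0, 0.05, 0.2, 0.3]
slope, intercept, t_stat = proportional_bias(diffs, means)
```

A positive slope in this sketch corresponds to the pattern described above, in which Randot minus BEST differences grow as the mean log arcsec value increases. Converting the t statistic to a P value would require a t distribution, which is omitted here.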
Stereoacuity measurements were better in orthophoric children (n = 65) than in children with strabismus (n = 35) on the Randot (1.6529 ± 0.32 vs 2.0097 ± 0.40 log arcsec, P < .001) and BEST (1.6906 ± 0.17 vs 1.9248 ± 0.36 log arcsec, P < .001).
No significant difference was noted in children with monocular amblyopia with a difference of two or more lines of BCVA between eyes (n = 15) compared to children with a difference of fewer than two lines of BCVA between eyes (n = 85). Randot scores were 1.9356 ± 0.36 versus 1.7499 ± 0.39 log arcsec (P = .090), whereas BEST scores were 1.8114 ± 0.22 versus 1.7657 ± 0.28 log arcsec (P = .556). Likewise, scores did not differ significantly between children with a difference of three or more lines of BCVA between eyes (n = 8) and those with a difference of fewer than three lines (n = 92): stereoacuity scores were 1.8644 ± 0.33 versus 1.7702 ± 0.39 log arcsec (P = .516) for the Randot and 1.7710 ± 0.21 versus 1.7727 ± 0.28 log arcsec (P = .987) for the BEST. In children with (n = 4) and without (n = 96) a difference of four or more lines of BCVA between eyes, BEST stereoacuity scores were 1.8960 ± 0.23 versus 1.7674 ± 0.27 log arcsec (P = .363) and Randot stereoacuity scores were 2.0730 ± 0.19 versus 1.7655 ± 0.39 log arcsec (P = .124), respectively.
Equalized Finest Scores. We compared two stereoacuity tests that differ in their finest measurable stereoacuity values (20 arcsec by Randot and 40 arcsec by BEST). A child with “true” stereoacuity of 20 arcsec may score 20 arcsec on the Randot, but only 40 arcsec on the BEST. Therefore, we repeated the analysis after converting any result better than 40 arcsec (eg, 20 or 30 arcsec) on the Randot to 40. The results remained consistent because the Bland–Altman analysis showed a similar small bias (0.0788 log arcsec, 1.198 arcsec) and narrow limits of agreement (−0.2975 to 0.4551 log arcsec, 0.5040 to 2.8516 arcsec) with a significant proportional bias (t = 2.357, P = .013) (Figure 6). Crude values obtained by the BEST differed significantly from those attained by the Randot (1.8514 ± 0.32 log arcsec [71.0 ± 2.0 arcsec] vs 1.7726 ± 0.27 log arcsec [59.23 ± 1.86 arcsec], respectively) (P < .001).
A Bland–Altman agreement plot of stereoacuity scores between the Randot Stereotest (Stereo Optical, Inc., Chicago, IL) and the Bernell Evaluation of Stereopsis Test (BEST) (Bernell Corporation, Mishawaka, IN) after equalizing minimum score results to 40 for both tests. The plot again illustrates significant proportional bias between the tests (t = 2.357, P = .013). SD = standard deviation.
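The equalization step described in this subsection amounts to raising any Randot threshold finer than 40 arcsec to 40 arcsec before the analysis is repeated. A minimal Python sketch, with a hypothetical function name and invented sample scores that are not study data:

```python
def equalize_finest(scores_arcsec, floor_arcsec=40):
    """Raise any threshold finer (smaller) than floor_arcsec to floor_arcsec."""
    return [max(score, floor_arcsec) for score in scores_arcsec]

# Hypothetical Randot thresholds (arcsec), not study data
randot = [20, 30, 40, 70, 200]
equalized = equalize_finest(randot)  # [40, 40, 40, 70, 200]
```

Scores at or coarser than 40 arcsec are left unchanged, so only children who outperformed the finest BEST level on the Randot are affected.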
Excluded Patients. Seven patients were excluded from the analysis because they did not meet our inclusion criteria. Two patients were adults (older than 18 years of age) who were stereoblind (no measurable stereopsis on either test), and five were children with a mean age of 7.68 ± 3.57 years and BCVA of 0.30 ± 0.29; all but one was heterophoric. One child was stereoblind on the Randot but measured 200 arcsec on the BEST. One child had gross stereopsis on the BEST (approximately 3,000 arcsec) but scored 400 arcsec on the Randot. Two children had gross stereopsis (approximately 3,000 arcsec) on both tests, and one child was stereoblind on both tests.
This study demonstrates that the BEST stereoacuity measurements were comparable to those of the Randot, with minimal bias and narrow limits of agreement. However, a significant proportional bias was noted at lower stereoacuities, with the BEST demonstrating better values. The mean stereoacuities were not significantly affected by age or gender. Both tests were influenced by strabismus, but were not significantly influenced by monocular amblyopia with a difference up to four lines of BCVA between eyes.
Testing for Stereoacuity
Stereoacuity tests can be divided into those that do and do not require a filter (Table 3). Tests that use filter glasses allow one image to be presented to one eye, whereas a disparate image is presented to the other eye.3 The vectography method uses polarized test plates with polarized glasses to present disparity.3 Anaglyph methods are based on the same principle, using a red/green target and filter to isolate disparate images for each eye.3
Comparison of Commonly Used Stereoacuity Tests
Although popular, filter-requiring techniques have several drawbacks. Because they require filter glasses, compliance may be an issue in very young children. In a study by Pai et al.,9 preschool children had better testability (86% to 98.1%) with filter-free tests compared with tests that required filter glasses (30.9% to 94.6%).3 The differences were mostly attributed to the observation that younger children struggled to wear filter glasses during the examinations.3 Filter-requiring techniques may hinder fusion and result in dissociation (eg, the anaglyphs method), possibly manifesting a latent nystagmus or an intermittent strabismus, resulting in underestimations of stereoacuity.1,3 Other limitations of filter glasses include the time spent wearing the glasses, the necessity to disinfect the glasses after every use, the risk of infection when such disinfection is inadequate, and the inability to monitor the patient's eye alignment during testing. Based on our clinical experience, testing with the BEST was quicker than the Randot, which is a possible advantage when testing patients with shorter attention spans (eg, younger children).
Challenges With Test Modalities
Monocular cues exist in many forms of stereopsis tests, whether a filter is required or not,3,7 most notably in the lower levels of stereoacuity targets, where the amount of horizontal disparity is substantial.4 Some observers may use monocular cues to pass the test, resulting in overestimation of their stereoacuity.1,3 The common vectography technique can be divided into the random dot (“cyclopean” or “global”) and the contour (“local”) methods.10 The former is considered more effective in detecting vision abnormalities because fewer monocular cues exist.10 Some authors suggest that good stereoacuity measurements with random dot targets are the most demanding achievement in binocular vision.7 First introduced by Julesz15 in 1960, the random dot stereogram is believed to avoid monocular cues by positioning the tested targets within a random dot background. The modern Randot test differs from Julesz's original design. It combines vectographic random dot shapes on a random dot background (500 to 250 arcsec) with figures of contoured stereoscopic patterns on a random dot background (circles: 400 to 20 arcsec, animals: 400 to 100 arcsec). Despite a random dot background, monocular cues exist in both the circle and animal Randot tests.3,7 However, children could not use monocular cues to pass the Randot circle test.16 Furthermore, Fawcett and Birch17 observed that when stereoacuity was 160 arcsec or better, stereoacuities measured with the Randot circle test demonstrated good agreement with those measured with random dot stereograms (which lack monocular cues).
Filter-free techniques (eg, the Frisby and Lang stereotests) also have depth cues, such as motion parallax or light and shadow differences.3,7 Thus, testing conditions must be controlled (eg, using standardized room lighting conditions, presenting test plates perpendicularly to the visual axes, and not allowing twisting of the test plates or head movements), which can be hard to ensure in children.3,7 Neither the Lang nor the Frisby test is colorful, and both may be less attractive to younger children. Moreover, the range of stereoacuities measured by these tests is somewhat limited (Table 3).
Depth cues such as motion parallax also exist in the new BEST. However, this test was distinctively planned to reduce these cues. In each row, one target appears binocularly closer and one appears binocularly deeper. A child can use motion parallax (eg, twisting the test plates or moving the head) to determine which target is “different” or “jumping,” but determining whether the target protrudes or is depressed requires sufficient stereopsis. Thus, a notable weakness of this test design is its potential high false-positive rate, with a 50% chance of guessing, monocularly, which of the two “different” targets appears closer or deeper. We found it useful to use one set of test targets (either the tiger, cat, dog, and bear, or the bird, bee, and fish) to demonstrate what we mean by “closer” or “deeper” and ask the child to tell us which targets in the second set look similar to the targets that we presented in the first set, thus increasing test understanding and decreasing test guessing and overestimation.
The results of this study show that the BEST stereoacuity measurements did not differ significantly in crude values from those attained with the Randot. Bias was minimal (0.0073 log arcsec; 1.02 arcsec) and limits of agreement were narrow (−0.4816 to 0.4962 log arcsec; 0.3299 to 3.1347 arcsec) and below what would reasonably be considered clinically significant. However, a significant trend for better results on the BEST with higher mean log arcsec values was observed (Figure 5). Although this proportional bias was statistically significant and represents a bias of the BEST compared to the Randot, the difference was clinically insignificant.
Sensitivity to Amblyopia and Strabismus
Differences in BCVA between eyes may potentially reduce stereoacuity scores.1,7 In anisometropic amblyopia, a weak linear correlation was observed between monocular visual acuity and Randot stereoacuity scores.18 A decrease in one line of BCVA resulted in the reduction of stereoacuity by approximately 6 arcsec.18 Therefore, stereopsis tests may not be sensitive to monocular anisometropic amblyopia, unless there are substantial differences in BCVA between eyes. However, it is unlikely for a child with a severe vision disorder to demonstrate a stereoacuity of 60 arcsec or better.10
Conversely, stereopsis tests demonstrated a strong sensitivity to strabismus.19 Strabismus in early childhood prevents normal development of binocular sensory neurons in the visual cortex, severely damaging stereopsis.20 The oculomotor system in normal children ensures correct fixation on an object at all viewing distances, whereas this system is flawed in children with strabismus. Even a misalignment of 0.25 degrees would add a disparity of 900 arcsec to all parts of the stimulus, greatly damaging stereoacuity.7 Leske and Holmes21 observed that true stereopsis is only possible with deviations of 4 prism diopters or less.
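The 900 arcsec figure quoted above follows directly from the unit conversion of 3,600 arcsec per degree. A short Python sketch of that arithmetic, using a hypothetical helper name:

```python
ARCSEC_PER_DEGREE = 3600  # 60 arcmin per degree x 60 arcsec per arcmin

def degrees_to_arcsec(degrees):
    """Convert an angle in degrees to seconds of arc."""
    return degrees * ARCSEC_PER_DEGREE

# Misalignment of 0.25 degrees, as quoted in the text
added_disparity = degrees_to_arcsec(0.25)  # 900.0 arcsec
```

For comparison, the finest levels measured by the tests in this study are 20 to 40 arcsec, so an added disparity of this size is more than 20 times the finest measurable threshold.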
The current study demonstrated similar findings. We showed that both the Randot and BEST are influenced by strabismus, with resultant reduced mean stereoacuities in children who were not orthophoric. However, they were not sensitive to monocular amblyopia, even when there was a difference of four or more lines of BCVA between eyes.
Our study had several limitations. First, it was retrospective in design, which has inherent drawbacks. Second, the study only included children who demonstrated measurable numeric scores on both tests. Children with gross stereopsis or who were stereoblind on either test were excluded. Third, the cohort included a few children with monocular amblyopia. Fourth, we measured strabismus with the alternating prism cover test, thus combining phorias with tropias and possibly explaining why children with strabismus had measurable stereopsis. Consequently, this study may not reflect differences in children with such characteristics.
We observed that BEST stereopsis values were comparable to those of the Randot, with no significant difference in crude values, minimal bias, and narrow limits of agreement. Although the difference was minimal, BEST scores were better than Randot scores in lower stereoacuities. Based on these results, we could not confirm that the BEST was equivalent to the Randot. However, the BEST may continue to be regarded as comparable. When equalizing the finest scores, the difference between the tests was statistically significant, although clinically minimal. Both tests were influenced by the presence of strabismus but were not significantly affected by monocular amblyopia. Further studies are required to understand the BEST's testability, testing times, normative scores, and accuracy when determining fluctuations in stereoacuity measurements after treatment. The BEST may be an effective tool for measuring stereopsis, and it may have several advantages over methods that require filter glasses.