Retinopathy of prematurity (ROP) is a significant cause of childhood blindness in many middle-income countries1 and is the leading preventable and treatable cause of childhood blindness in the United States.2 Although the burden of childhood blindness from ROP could be reduced by appropriate screening and treatment, there are many barriers to effective ROP screening, including a shortage of ophthalmologists skilled at ROP screening3 and a lack of access to these ophthalmologists. Current guidelines in the United States state that ROP screening examinations should be performed by an ophthalmologist trained in ROP screening using binocular indirect ophthalmoscopy,3 but alternative screening approaches are needed because the supply of ophthalmologist ROP screeners is unlikely to meet the need in many parts of the world.1
A previous study found that the Vantage Plus LED Digital Binocular Indirect Ophthalmoscope system (Keeler Instruments, Inc., Broomall, PA), a binocular indirect ophthalmoscope with an integrated camera that can capture and store still images, dynamic digital images, or both during the examination, could capture still images of sufficient quality to demonstrate the presence of posterior pole disease (pre-plus or plus disease).4 Because the field of view obtained by the Keeler system is similar to that seen during examination with binocular indirect ophthalmoscopy, one could infer that the Keeler system would be able to capture valuable information about not only the posterior pole but also the retinal periphery. If the Keeler system could capture images of both the posterior pole and retinal periphery that were of sufficient quality to demonstrate the zone and stage of ROP, this system would be a potential tool for both ROP screening and teaching.
The primary purpose of this study was to determine whether digital video images of the retina obtained using an indirect ophthalmoscopy imaging system could be accurately graded for zone and stage of ROP, and the presence of plus or pre-plus disease. The secondary aim was to determine whether these digital retinal video images could be accurately graded to detect the presence of disease requiring treatment (ie, type 1 ROP) by comparing two predefined criteria for referral.
Patients and Methods
This study was approved by the Duke Health System Institutional Review Board and conformed to the requirements of the U.S. Health Insurance Portability and Accountability Act. A retrospective chart review of infants screened for ROP in the Duke University Neonatal Intensive Care Unit (NICU) was performed. Infants were screened for ROP per recommended guidelines at the time of screening.5 All examinations were performed by one of two pediatric ophthalmologists (SFF or DKW), who were both experienced ROP examiners. As part of routine ROP screening, we digitally recorded every examination using the Vantage Plus LED Digital Binocular Indirect Ophthalmoscope system and a 28-diopter condensing lens. Follow-up examinations occurred according to current published guidelines at the time of the examination.5,6 The presence or absence of ROP and the zone, stage of ROP, and presence or absence of plus or pre-plus disease were documented for each eye according to current international classification guidelines.7,8
We extracted demographic data including birth date, gestational age, birth weight, and date of ROP examinations. Post-menstrual age was calculated based on examination date and birth date. Clinical examination findings were recorded as the lowest zone and highest stage of ROP documented for each eye during a given examination session.
To be eligible for inclusion in this study, infants had to have been hospitalized in the Duke University NICU; screened for ROP from November 1, 2009, to November 16, 2011; have digital video retinal images acquired using the Keeler system during their routine ROP examination(s); and have a birth weight of less than 1,500 g or gestational age of 30 weeks or younger. We excluded infants if they had received laser or anti-vascular endothelial growth factor treatment prior to having an examination recorded by the Keeler system during the study period.
We chose video images from one examination date for each infant. We enhanced the sample with images of stage 3 ROP by preferentially choosing examination dates where infants were diagnosed as having stage 3 ROP. If an infant had stage 3 ROP on more than one examination date, then one date was randomly chosen from those dates. Otherwise, for each infant, an examination date was randomly selected from all examination dates in which there was a video recording for that infant. Any examination performed after laser or anti-vascular endothelial growth factor treatment was excluded from the study.
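The selection rule described above can be sketched in Python. This is only an illustration, not the authors' procedure: the function name `choose_exam_date`, the dictionary layout, and the use of a seeded `random.Random` are assumptions, and the input is assumed to already exclude examinations performed after laser or anti-vascular endothelial growth factor treatment.

```python
import random
from typing import Dict, Optional


def choose_exam_date(stage_by_date: Dict[str, Optional[int]],
                     rng: random.Random) -> str:
    """Pick one recorded examination date for a single infant.

    stage_by_date maps an examination date (ISO string, hypothetical layout)
    to the highest ROP stage documented that day (None means no ROP).
    """
    if not stage_by_date:
        raise ValueError("infant has no recorded examinations")
    # Dates on which stage 3 ROP was diagnosed are preferred to enrich the sample.
    stage3_dates = sorted(d for d, s in stage_by_date.items() if s == 3)
    # Fall back to every recorded date when stage 3 was never diagnosed.
    candidates = stage3_dates or sorted(stage_by_date)
    return rng.choice(candidates)


rng = random.Random(0)  # fixed seed so the selection can be reproduced
exams = {"2010-01-05": 1, "2010-01-19": 3, "2010-02-02": 2}
chosen = choose_exam_date(exams, rng)  # the only stage 3 date is selected
```

Sorting the candidate dates before drawing keeps the result reproducible for a given seed regardless of dictionary ordering.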
After the examination date was chosen, one of the authors (SGP) reviewed and edited all videos recorded by the Keeler system for the infant on that date using a video editing program (Windows Movie Maker 2.6; Microsoft Corp., Redmond, WA). All videos were edited to remove extraneous non-retinal images (eg, images of placing or removing the eyelid speculum) or images in which the examiner was pointing at ocular pathology. Only images from the right eye were included for each infant.
After the videos were edited, these images were randomly numbered and presented to the graders without any accompanying demographic or clinical information.
Masked to demographic information and clinical examination findings, two ophthalmologists (one expert [SFF] and one non-expert [RSD] in ROP screening) independently reviewed the video images and evaluated them for (1) image quality, (2) zone, (3) stage of ROP, and (4) presence of pre-plus or plus disease. The non-expert in ROP screening was a general ophthalmologist who received approximately 6 months of training in ROP screening during 4 years of ophthalmology residency, had no additional fellowship training, and had worked for approximately 7 months after residency prior to participating in this study. The non-expert did not have any specific training to participate in this study.
In this study, pre-plus disease was defined according to the International Classification of ROP revisited guidelines as “vascular abnormalities of the posterior pole that are insufficient for the diagnosis of plus disease but that demonstrate more arterial tortuosity and more venous dilatation than normal” (p. 995)7 and plus disease was defined as the presence of sufficient vascular dilation and tortuosity in two or more quadrants of the eye as compared to a standard photograph.7,8 Based on the ability of the grader to determine the stage of ROP in the video selected for each infant, image quality was graded as follows: “good” (a video in which the grader could easily discern the stage of ROP), “fair” (a video in which it was difficult to clearly discern the stage of ROP), and “poor” (a video in which the grader was unable to discern the stage of ROP). The zone (I, II, III) and stage (none, 1, 2, or 3) of ROP were defined according to the International Classification of ROP revisited guidelines.7 Because the expert performed ROP screening for 50% of the time period of image acquisition, she pledged to not grade any videos that she recognized.
We used SAS 9.3 (SAS Institute, Inc., Cary, NC) for all statistical analyses. Prior to the commencement of this study, a sample size calculation indicated that to appropriately power our study to detect a sensitivity of 0.80 (95% confidence interval: 0.725 to 0.875), a sample size of 114 was required. Before analyzing the data, we defined the “reference standard” as the diagnosis of ROP by indirect ophthalmoscopy during the clinical examination. For the primary analysis of accuracy, we evaluated the ability of each grader to accurately identify the zone and stage of ROP and the presence or absence of pre-plus or plus disease on reviewing the Keeler videos compared to the “reference standard” (ie, clinical examination diagnosis). In our secondary analysis, we determined the accuracy (ie, sensitivity and specificity) of two predefined criteria for referral in detecting disease requiring treatment (ie, type 1 ROP). Because we wanted to evaluate whether grading the zone and stage of ROP on reviewing the video images would increase sensitivity and specificity of screening for type 1 ROP compared to the grading for the presence of pre-plus or plus disease alone, our first criterion for referral was defined as the presence of pre-threshold disease, pre-plus disease, or plus disease, and our second criterion for referral was defined as the presence of only pre-plus or plus disease. Type 1 ROP is the presence of stage 3 in zone I, any stage ROP with plus disease in zone I, or stage 2 or 3 with plus disease in zone II.6 Pre-threshold disease is any ROP in zone I, stage 2 in zone II with plus disease, or stage 3 in zone II.9
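The definitions of type 1 ROP, pre-threshold disease, and the two referral criteria given above can be written as small predicates. The sketch below is in Python rather than SAS, and the function names and the string encoding of the posterior pole grade are assumptions made for illustration; only the clinical rules themselves come from the text.

```python
from typing import Optional

POSTERIOR_POLE_GRADES = ("none", "pre-plus", "plus")


def is_type1(zone: int, stage: Optional[int], plus: bool) -> bool:
    """Type 1 ROP: stage 3 in zone I, any stage with plus disease in zone I,
    or stage 2 or 3 with plus disease in zone II."""
    s = stage or 0  # None is treated as no ROP
    if zone == 1:
        return s == 3 or (s >= 1 and plus)
    if zone == 2:
        return s in (2, 3) and plus
    return False


def is_prethreshold(zone: int, stage: Optional[int], plus: bool) -> bool:
    """Pre-threshold disease: any ROP in zone I, stage 2 in zone II with
    plus disease, or stage 3 in zone II."""
    s = stage or 0
    if zone == 1:
        return s >= 1
    if zone == 2:
        return s == 3 or (s == 2 and plus)
    return False


def refer_posterior_pole_and_periphery(zone: int, stage: Optional[int],
                                       posterior_pole: str) -> bool:
    """First criterion: pre-threshold, pre-plus, or plus disease."""
    if posterior_pole not in POSTERIOR_POLE_GRADES:
        raise ValueError(f"unknown posterior pole grade: {posterior_pole}")
    return (posterior_pole != "none"
            or is_prethreshold(zone, stage, plus=(posterior_pole == "plus")))


def refer_posterior_pole_only(posterior_pole: str) -> bool:
    """Second criterion: pre-plus or plus disease only."""
    if posterior_pole not in POSTERIOR_POLE_GRADES:
        raise ValueError(f"unknown posterior pole grade: {posterior_pole}")
    return posterior_pole != "none"
```

Under this encoding, an eye graded as stage 3 in zone I with a normal posterior pole is referred by the first criterion but not by the second, which is the kind of difference the secondary analysis was designed to expose.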
A total of 114 infants were included (median gestational age: 26 weeks, range: 23 to 33 weeks; median birth weight: 840 g, range: 450 to 2,300 g; median post-menstrual age at examination: 35 weeks, range: 29 to 46 weeks). As diagnosed by indirect ophthalmoscopy, our enhanced sample of images comprised 14% of images with retinal vascularization that ended in zone I, 74% in zone II, and 12% in zone III; 15% with stage 1, 20% with stage 2, and 26% with stage 3 ROP; and 15% with pre-plus disease and 9% with plus disease (Table 1).
Based on the ability of each grader to determine the stage of ROP in the video images, the expert judged 60% (n = 68) and the non-expert judged 55% (n = 63) of the images to have fair or good image quality (Table 1). The proportion of images that the expert judged to be of fair or good quality was highest for zone II, followed by zones I and III; by stage, it ranked stage 3 > 2 > 1 > immature vasculature; and by posterior pole disease, it ranked plus disease > pre-plus disease > a normal posterior pole (Table 1). The expert did not recognize any of the images used in this study and thus did not recuse herself from grading any of them. The videos were graded at least 1.5 years after the expert acquired the images. For the non-expert, the proportion of images judged to be of fair or good quality was highest for zone I, followed by zones III and II; by stage, it ranked stage 3 > 1 > 2 > immature vasculature; and by posterior pole disease, it ranked plus disease > pre-plus disease > a normal posterior pole (Table 1).
Of the images that the expert believed were of fair or good quality, the expert and non-expert correctly identified zone (75% vs 74% of images, respectively), stage of ROP (75% vs 40% of images, respectively) (Figure 1), and the presence of pre-plus or plus disease (79% of images) (Table 2).
Figure 1. Representative video indirect ophthalmoscopy system (Keeler Instruments, Inc., Broomall, PA) images with arrows showing (A) stage 1, (B) stage 2, and (C) stage 3 retinopathy of prematurity.
Using the reference standard of indirect ophthalmoscopy-reported type 1 ROP, the sensitivity of grading Keeler images for the presence of pre-threshold disease, pre-plus disease, or plus disease for both the expert and non-expert was 100%, and the specificity was 75% and 79%, respectively (Table 3).
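As a check on the arithmetic, the sensitivity and specificity reported here can be recomputed from the counts in Table 3 (images graded "present" by each grader, and the column totals of 12 eyes with type 1 ROP and 56 without). The helper name and argument order below are assumptions made for this sketch; the study itself used SAS.

```python
from typing import Tuple


def sensitivity_specificity(true_pos: int, false_pos: int,
                            n_disease: int, n_no_disease: int) -> Tuple[float, float]:
    """Return (sensitivity, specificity) from the 'graded present' counts
    and the reference-standard column totals."""
    true_neg = n_no_disease - false_pos
    return true_pos / n_disease, true_neg / n_no_disease


# Counts from Table 3: graded present among type 1 / no type 1 eyes.
expert = sensitivity_specificity(12, 14, 12, 56)       # (1.0, 0.75)
non_expert = sensitivity_specificity(12, 12, 12, 56)   # specificity = 44/56
```

The non-expert's specificity of 44/56 rounds to the 79% quoted in the text.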
Using the reference standard of indirect ophthalmoscopy-reported type 1 ROP, the sensitivity of grading Keeler videos for pre-plus or plus disease was 92% for the expert and 100% for the non-expert, and the specificity was 77% for the expert and 82% for the non-expert (Table 4). Of the cases of clinically diagnosed type 1 ROP by indirect ophthalmoscopy, the non-expert graded all cases as either pre-plus or plus disease, whereas the expert judged one case as having a normal posterior pole. On further examination of this case, we had the expert evaluate the video images of the other eye from the same examination session; she graded the video images as having good quality and identified the eye as having stage 3 ROP in zone I with pre-plus disease.
Looking at only the images judged by the expert to be of good or fair quality, the inter-grader reliability was 68% (Kappa = 0.1) for zone; 56% (Kappa = 0.4) for stage of ROP; 79% (Kappa = 0.6) for the presence or absence of pre-plus or plus disease; 91% (Kappa = 0.8) for the presence of pre-threshold, pre-plus, or plus disease; and 88% (Kappa = 0.7) for the presence of pre-plus disease or plus disease.
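Percent agreement and kappa can diverge sharply, as they do for zone above. The sketch below computes both from a square table of paired grades. It assumes unweighted Cohen's kappa for two graders, which the text does not specify, and the example counts are hypothetical and are not study data.

```python
from typing import List, Tuple


def agreement_and_kappa(table: List[List[int]]) -> Tuple[float, float]:
    """Return (observed agreement, unweighted Cohen's kappa).

    table[i][j] counts images placed in category i by grader A and
    category j by grader B.
    """
    k = len(table)
    n = sum(sum(row) for row in table)
    observed = sum(table[i][i] for i in range(k)) / n
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(k)) for j in range(k)]
    # Agreement expected by chance from the two graders' marginal totals.
    expected = sum(row_totals[i] * col_totals[i] for i in range(k)) / n ** 2
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return observed, (observed - expected) / (1.0 - expected)


# Hypothetical counts: balanced categories versus one dominant category.
balanced = agreement_and_kappa([[40, 5], [5, 50]])    # agreement 0.90, kappa about 0.80
skewed = agreement_and_kappa([[60, 15], [15, 10]])    # agreement 0.70, kappa about 0.20
```

When one category dominates, chance agreement is already high, so a moderately high raw agreement can still correspond to a low kappa.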
This study found that digital retinal video images of fair or good quality obtained using the Keeler system could be graded for ROP. Compared with the entire set of video images, those that the expert and non-expert considered to have fair or good image quality were more likely to include eyes with stage 3 or plus disease (Table 1). It is unclear whether the clinical examiner dwelled longer on severe than on mild disease or whether more severe ROP was more distinct and thus easier to capture on imaging.
The non-expert had a more difficult time identifying the stage of ROP, whereas the ROP expert was able to accurately grade zone, stage of ROP, and presence of pre-plus or plus disease when grading the video images (Table 2). Neither grader misclassified zone by more than one zone (ie, zone I disease may have been graded as zone I or II, but never as zone III disease, and zone III disease may have been graded as zone II or III, but never as zone I disease). The expert never misclassified ROP stage by more than 1 stage, but the non-expert had a more difficult time identifying stage 3 ROP. For the presence of pre-plus or plus disease, neither grader misclassified by more than one category (ie, “no posterior pole disease” may have been graded as no disease or pre-plus disease, but never as plus disease; conversely, plus disease may have been graded as pre-plus or plus disease, but never as “no posterior pole disease”). More importantly, plus disease was always identified as pre-plus disease or worse by both the expert and non-expert.
Looking at the Keeler system as a true “screening tool” for ROP, we wished to evaluate whether, in addition to grading for the presence of pre-plus or plus disease, the grading of zone and stage of ROP would further enhance the identification (or decrease the chance of false-negative classification) of infants with type 1 ROP.
In a previous study, we found that ROP experts could grade still images of the posterior pole acquired by the Keeler system with high accuracy for pre-plus or plus disease compared to the results of the clinical examination, and that the grading of pre-plus or plus disease was highly sensitive and specific for the presence of plus disease.4
We wanted to evaluate how inclusive we needed to be in our screening criteria to make the grading of videos captured by the Keeler system an acceptable “screening test” for ROP (ie, whether the grading of zone and stage of ROP, in addition to the grading of the presence of pre-plus or plus disease, would further increase sensitivity and specificity of screening for type 1 ROP). Thus, in our secondary analysis, we compared the sensitivities and specificities of two scenarios. In the first scenario (which we will refer to as “posterior pole only”), we found that if we had graders evaluate video images in only one eye for pre-plus or plus disease, the expert missed one case of type 1 ROP (sensitivity = 92%), whereas the non-expert was able to correctly identify all cases of type 1 ROP. Both graders had high specificity (77% for the expert and 82% for the non-expert) for identifying type 1 ROP. In the second scenario (which we will refer to as “posterior pole and periphery”), we found that if we had the graders evaluate the videos in one eye for zone, stage of ROP, and presence of pre-plus or plus disease to diagnose the presence of pre-threshold disease, pre-plus disease, or plus disease, both the expert and non-expert had 100% sensitivity, whereas their specificity was 75% (expert) and 79% (non-expert) for identifying type 1 ROP.
Both scenarios showed high sensitivity and specificity for identifying infants with treatment-requiring (type 1) ROP compared to the clinical examination, suggesting that the Keeler system shows promise as an ROP screening tool. A good “screening test” must have a high sensitivity so that those with disease requiring treatment are not missed.
Of the cases of clinically diagnosed treatment-requiring (type 1) ROP by indirect ophthalmoscopy in the posterior pole-only scenario, one case (1 of 12, 8%, an eye with stage 3 in zone I with pre-plus disease) was judged to have a normal posterior pole by review of the image by the expert (Table 4); the video images were judged by that grader to be of fair quality. We also had the expert evaluate the video images captured by the Keeler system of the other eye on the same infant from the same examination session and they judged it to be of good quality and to have stage 3 ROP in zone I with pre-plus disease. In this example, the infant would have failed screening criteria if both eyes were evaluated because of the presence of pre-plus disease noted in the left eye on review of the video images. Because ROP can present asymmetrically in the same individual, it is important that each infant have both eyes examined in a true “screening” scenario so that the more severely affected eye is not missed on screening. Thus, if screening criteria using the Keeler system required the evaluation of video images from both eyes and image quality to be fair or good, the presence of pre-plus or plus disease in either eye would trigger a standard diagnostic examination by an ophthalmologist trained in ROP screening using indirect ophthalmoscopy to reevaluate the infant not only for the presence of pre-plus or plus disease, but also for the zone and stage of ROP in the retinal periphery. Further research is needed to evaluate the true sensitivity and specificity of screening for type 1 ROP if both eyes are evaluated simultaneously and prospectively.
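The two-eye screening rule proposed above could be expressed as follows. This is a sketch under stated assumptions, not a validated protocol: the `EyeGrade` type, the outcome labels, and the "reimage" branch for poor-quality videos are illustrative choices (the last one borrowed from the reimaging protocol suggested for future prospective studies).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EyeGrade:
    quality: str          # "good", "fair", or "poor"
    posterior_pole: str   # "none", "pre-plus", or "plus"


def screening_outcome(right: EyeGrade, left: EyeGrade) -> str:
    """Return "refer", "reimage", or "routine" for one examination session."""
    eyes = (right, left)
    gradable = [e for e in eyes if e.quality in ("good", "fair")]
    # Pre-plus or plus disease in either gradable eye triggers a diagnostic
    # examination by an ophthalmologist using indirect ophthalmoscopy.
    if any(e.posterior_pole in ("pre-plus", "plus") for e in gradable):
        return "refer"
    # Assumption: a poor-quality video in either eye prompts reimaging.
    if len(gradable) < len(eyes):
        return "reimage"
    return "routine"


# The case discussed above: right eye graded normal on a fair-quality video,
# left eye graded pre-plus on a good-quality video.
outcome = screening_outcome(EyeGrade("fair", "none"), EyeGrade("good", "pre-plus"))
```

With both eyes considered, the infant in this example is referred even though the right eye alone would have passed.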
Although a good “screening test” must have a high sensitivity so that those with disease requiring treatment are not missed, high specificity is also desirable so that those without disease are not subjected to unnecessary examinations. Both scenarios described above had high specificities for ruling out type 1 ROP when it was not present according to the clinical examination (Tables 3–4).
Although the expert and non-expert showed poor inter-grader reliability for grading zone and stage of ROP, they both showed high inter-grader reliability for grading the presence of pre-threshold disease, pre-plus or plus disease, and pre-plus or plus disease as a dichotomous and trichotomous variable. Inter-grader reliability for grading the presence of pre-plus or plus disease was similar to that reported in a previous study evaluating the ability to grade for the presence of pre-plus or plus disease in still images of the posterior pole acquired by the Keeler system.4
If it served adequately as a pure “screening test,” examination of only the posterior pole for the presence of pre-plus or plus disease (not also of the zone and stage of ROP) would provide certain advantages. Namely, “posterior pole only” screening (ie, evaluating only for the presence or absence of pre-plus or plus disease in the posterior pole) would save time and may be less stressful on infants requiring screening. In addition, the posterior pole is easier to image than the retinal periphery, can be captured in a still (which has a smaller file size) versus a dynamic video image, and lends itself to a wider variety of imaging modalities (including other narrow-field imaging devices such as the NIDEK NM200-D camera10 [Nidek, Inc., Gamagori, Japan] and Pictor camera11 [Volk Optical, Inc., Mentor, OH]) and expertise level among potential imagers and graders. As shown by our study, it may be more difficult for a non-expert (compared to an ROP expert) to accurately identify stage of ROP from a review of video images. Therefore, by omitting assessment of stage of ROP from the image grading process, one could theoretically improve reliability of graders identifying infants needing a diagnostic examination by an ophthalmologist with binocular indirect ophthalmoscopy. This “posterior pole disease only” grading strategy might, in turn, allow acceptable image grading for treatment-requiring ROP by those with a wider range of training and expertise.
This study’s findings must be considered in light of several limitations. With respect to graders’ ability to identify the stage of ROP, many videos in this study were judged to have poor image quality. Because this was a retrospective study, the videos evaluated were not captured for the purposes of this study. Thus, the quality of videos included in this study likely underestimates the quality obtainable if the acquisition of videos was purely for ROP “screening” (rather than clinical teaching) purposes. Also, all videos were obtained by two pediatric ophthalmologists who are experienced ROP examiners. A prospective study evaluating the ability of ophthalmologists with varying degrees of expertise to acquire videos of adequate quality for ROP screening to include all screened infants, with a protocol in place for reimaging those with poor quality video images within a reasonable time frame, is needed. Also, our study only evaluated images acquired from one eye of an infant on one examination date. Because ROP can have an asymmetric presentation in the same individual, we believe it is important that each infant have both eyes examined in a true “screening” scenario.
The limited number of ophthalmologists trained and willing to screen for ROP underscores the need for a true ROP “screening test” to help decrease the burden of screening on these experts and to hopefully increase access to screening for infants at risk of ROP. Because high-quality video images of the retina captured by the Keeler system can be graded for the presence of type 1 ROP with high accuracy, less experienced ophthalmologists able to use the indirect ophthalmoscope to capture high-quality video images (but less confident in their diagnosis of ROP) could help screen for type 1 ROP with the guidance of ROP experts from a distance. ROP experts could evaluate videos captured by non-experts to rule out an urgent need for a bedside examination by the expert without traveling to the bedside, which could decrease the amount of time an expert ROP examiner spends performing ROP examinations (both at the bedside and from a distance). The results of this and a previous study4 suggest that the Keeler system or a comparable system may be suitable not only for remote ROP screening, but also to help educate ophthalmologists in the nuances of ROP evaluation, especially in the identification of stage of ROP and the presence of pre-plus or plus disease.
High-quality video images of the retina obtained by the Keeler system can be read with high sensitivity and specificity to screen for type 1 ROP. The Keeler system holds promise as a tool for ROP screening and teaching.
1. Gilbert C. Retinopathy of prematurity: a global perspective of the epidemics, population of babies at risk and implications for control. Early Hum Dev. 2008;84:77–82. doi:10.1016/j.earlhumdev.2007.11.009
2. Kong L, Fry M, Al-Samarraie M, Gilbert C, Steinkuller PG. An update on progress and the changing epidemiology of causes of childhood blindness worldwide. J AAPOS. 2012;16:501–507. doi:10.1016/j.jaapos.2012.09.004
3. Kemper AR, Wallace DK. Neonatologists’ practices and experiences in arranging retinopathy of prematurity screening services. Pediatrics. 2007;120:527–531. doi:10.1542/peds.2007-0378
4. Prakalapakorn SG, Freedman SF, Wallace DK. Evaluation of an indirect ophthalmoscopy digital photographic system as a retinopathy of prematurity screening tool. J AAPOS. 2014;18:36–41. doi:10.1016/j.jaapos.2013.10.018
5. Fierson WM. Screening examination of premature infants for retinopathy of prematurity. Pediatrics. 2006;117:572–576. doi:10.1542/peds.2005-2749
6. Early Treatment for Retinopathy of Prematurity Cooperative Group. Revised indications for the treatment of retinopathy of prematurity: results of the Early Treatment for Retinopathy of Prematurity Randomized Trial. Arch Ophthalmol. 2003;121:1684–1694. doi:10.1001/archopht.121.12.1684
7. International Committee for the Classification of Retinopathy of Prematurity. The international classification of retinopathy of prematurity revisited. Arch Ophthalmol. 2005;123:991–999. doi:10.1001/archopht.123.7.991
8. Capone A Jr, Ells AL, Fielder AR, et al. Standard image of plus disease in retinopathy of prematurity. Arch Ophthalmol. 2006;124:1669–1670. doi:10.1001/archopht.124.11.1669-c
9. Cryotherapy for Retinopathy of Prematurity Cooperative Group. Multicenter trial of cryotherapy for retinopathy of prematurity: preliminary results. Arch Ophthalmol. 1988;106:471–479. doi:10.1001/archopht.1988.01060130517027
10. Skalet AH, Quinn GE, Ying GS, et al. Telemedicine screening for retinopathy of prematurity in developing countries using digital retinal images: a feasibility project. J AAPOS. 2008;12:252–258. doi:10.1016/j.jaapos.2007.11.009
11. Prakalapakorn SG, Wallace DK, Freedman SF. Retinal imaging in premature infants using the Pictor noncontact digital camera. J AAPOS. 2014;18:321–326. doi:10.1016/j.jaapos.2014.02.013
Table 1. Retinopathy of Prematurity Diagnosis by Indirect Ophthalmoscopy and Image Quality of Keeler Videos
|Parameter||Diagnosis by Indirect Ophthalmoscopy of All Images (n = 114), No. (% of All Images)||Images Judged by Expert in ROP Screening as Fair or Good Image Quality (n = 68), No. (% of Those With That Zone, Stage, or Degree)||Images Judged by Non-expert in ROP Screening as Fair or Good Image Quality (n = 63), No. (% of Those With That Zone, Stage, or Degree)|
|Zone|
| I||16 (14)||7/16 (44)||11/16 (69)|
| II||84 (74)||56/84 (67)||44/84 (52)|
| III||14 (12)||5/14 (36)||8/14 (57)|
|Stage of ROP|
| None||44 (39)||18/44 (41)||18/44 (41)|
| 1||17 (15)||8/17 (47)||10/17 (59)|
| 2||23 (20)||15/23 (65)||12/23 (52)|
| 3||30 (26)||27/30 (90)||23/30 (77)|
|Posterior pole disease|
| None||87 (76)||44/87 (51)||42/87 (48)|
| Pre-plus||17 (15)||14/17 (82)||11/17 (65)|
| Plus||10 (9)||10/10 (100)||10/10 (100)|
|Total||114||68/114 (60)||63/114 (55)|
Table 2. Keeler Video Indirect Ophthalmoscopy Images Considered as Fair or Good Image Quality by an Expert in Retinopathy of Prematurity Screening (n = 68)
|Expert in ROP screening||I||3||1||0|
|Non-expert in ROP screening||I||5||9||0|
|Expert in ROP screening||None||16||2||2||0|
|Non-expert in ROP screening||None||13||4||1||1|
|Posterior pole disease||None||Pre-plus||Plus|
|Expert in ROP screening||None||40||4||0|
|Non-expert in ROP screening||None||41||5||0|
Table 3. Accuracy of Identifying Type 1 ROP by Grading Keeler Video Images of the Retina
|Grader||Presence of Pre-threshold Disease, Pre-plus Disease, or Plus Disease||Reference Standard: Type 1 ROP||Reference Standard: No Type 1 ROP|
|Expert in ROP screening||Present||12||14|
|Non-expert in ROP screening||Present||12||12|
|Total||12 (18%)||56 (82%)|
Table 4. Accuracy of Identifying Type 1 ROP by Grading the Presence or Absence of Posterior Pole Disease on Keeler Video Images
|Grader||Disease Type||Reference Standard: Type 1 ROP||Reference Standard: No Type 1 ROP|
|Expert in ROP screening||Plus||8||2|
|Non-expert in ROP screening||Plus||6||0|
|Total||12 (18%)||56 (82%)|