OBJECTIVE

Diabetic macular edema (DME) is the primary cause of vision loss among individuals with diabetes mellitus (DM). We developed, validated, and tested a deep learning (DL) system for classifying DME using images from three common commercially available optical coherence tomography (OCT) devices.

RESEARCH DESIGN AND METHODS

We trained and validated two versions of a multitask convolutional neural network (CNN) to classify DME (center-involved DME [CI-DME], non-CI-DME, or absence of DME) using three-dimensional (3D) volume scans and two-dimensional (2D) B-scans, respectively. For both the 3D and 2D CNNs, we used the residual network (ResNet) as the backbone. For the 3D CNN, we used a 3D version of ResNet-34 with the last fully connected layer removed as the feature extraction module. A total of 73,746 OCT images were used for training and primary validation. External testing was performed using 26,981 images across seven independent data sets from Singapore, Hong Kong, the U.S., China, and Australia.

RESULTS

In classifying the presence or absence of DME, the DL system achieved area under the receiver operating characteristic curves (AUROCs) of 0.937 (95% CI 0.920–0.954), 0.958 (0.930–0.977), and 0.965 (0.948–0.977) for the primary data set obtained from CIRRUS, SPECTRALIS, and Triton OCTs, respectively, in addition to AUROCs >0.906 for the external data sets. For further classification of the CI-DME and non-CI-DME subgroups, the AUROCs were 0.968 (0.940–0.995), 0.951 (0.898–0.982), and 0.975 (0.947–0.991) for the primary data set and >0.894 for the external data sets.

CONCLUSIONS

We demonstrated excellent performance with a DL system for the automated classification of DME, highlighting its potential as a second-line screening tool for patients with DM that may enable more effective triaging to eye clinics.

Diabetic macular edema (DME) is the primary cause of vision loss among individuals with diabetes mellitus (DM) and can develop at any stage of diabetic retinopathy (DR) (1). Although substantial international guidelines and national programs for DR screening already exist to prevent vision loss among such individuals (2–7), these programs mostly run on two-dimensional (2D) retinal fundus photographs, which have demonstrated limited performance in screening for DME. Because DME is a three-dimensional (3D) condition involving edematous thickening of the macula, screening for DME using 2D retinal fundus photographs has reportedly led to very high false-positive rates (e.g., >86% in Hong Kong and >79% in the U.K.), increasing the number of non-DME cases unnecessarily referred to ophthalmologists and straining clinical resources (8,9). Furthermore, to obtain the most cost-effective outcomes for patients with DM, there is increasing awareness of the need to differentiate eyes with center-involved DME (CI-DME), which are more likely to have visual impairment and require timely management (e.g., intravitreal injections of anti–vascular endothelial growth factor), from eyes with non-CI-DME, for which treatment needs may be less urgent (6).

Optical coherence tomography (OCT), particularly spectral domain or Fourier domain OCT, is a noninvasive technique for imaging 3D layered retinal structures within seconds. It has been proposed as an alternative screening tool for DME (10,11), particularly as a second-line screening tool for those who screen positive on 2D retinal fundus photographs (12). However, the identification of DME from OCT images, as well as its classification into the CI-DME and non-CI-DME subtypes, still requires human assessment, either by ophthalmologists or by professionally trained technicians and graders, who may need to manually review multiple cross-sectional OCT B-scan images from the volumetric scan slice by slice. Although OCT viewing platforms have some built-in automated features (e.g., macular thickness, central subfield thickness, and comparison against normative databases), these outputs cannot be compared across different commercial OCT devices because each manufacturer uses proprietary algorithms and normative databases (13,14).

Over the past few years, several automated deep learning (DL) systems for DME detection and fluid segmentation from OCT images have been developed (15–20). Studies investigating these systems demonstrate that DL algorithms can accurately detect DME from OCT images and that they have the potential to enhance and speed up clinical workflows through automated image interpretation (21). However, several critical gaps remain. First, most of the proposed DL algorithms have been trained and tested on OCT images obtained from a single commercial device in a single center, with a lack of external data sets to test generalizability. Second, and perhaps more importantly, no studies to date have tested these algorithms on the classification of DME into CI-DME and non-CI-DME subgroups, which is important for triaging patients into timely referral intervals or specialized clinics such as retina clinics.

To address these gaps, we developed a novel multitask DL system, applying a segmentation-free classification approach, for the automated classification of DME from OCT images obtained from three common commercially available OCT devices (CIRRUS OCT, SPECTRALIS OCT, and Triton OCT). Specifically, according to the different scanning protocols of each device, we first trained a deep convolutional neural network (CNN) to screen for DME using 3D volume scans from CIRRUS OCT, followed by another CNN using a series of 2D B-scans from SPECTRALIS OCT and Triton OCT. Second, we developed algorithms to classify DME cases into CI-DME and non-CI-DME subgroups. Third, we trained CNNs to simultaneously detect retinal abnormalities other than DME using images from all three OCT devices.

Data Sets

Primary Development, Testing, and Validation Data Set

The primary data set for training, testing, and primary validation was retrospectively drawn from the Chinese University of Hong Kong-Sight Threatening Diabetic Retinopathy (CUHK-STDR) study from November 2015 to June 2019 (22). Briefly, this is an ongoing prospective, observational cohort study aimed at identifying new risk factors for DR progression. Inclusion criteria were as follows: patients with type 1 or type 2 DM, age >18 years, and treatment naive at baseline. Exclusion criteria were eyes with prior retinal surgery, intravitreal injection, macular laser photocoagulation, or pan-retinal laser photocoagulation, and eyes with pathology that interferes with imaging (e.g., dense cataract, corneal ulcer). We extracted macular OCT images, obtained with the following devices and protocols, from all participants in the CUHK-STDR study: 1) CIRRUS OCT (Carl Zeiss Meditec, Dublin, CA) with a 6 mm × 6 mm 3D macular cube (512 A-scans per B-scan; 128 B-scans over 1,024 samplings) scanning protocol; 2) SPECTRALIS OCT (Heidelberg Engineering, Heidelberg, Germany) with high-resolution 6.3 mm × 6.3 mm (1,024 A-scans per B-scan; 25 B-scans) and high-speed 6.5 mm × 4.9 mm (1,024 A-scans per B-scan; 19 B-scans) scanning protocols; and 3) Triton OCT (Topcon Corp., Tokyo, Japan) with a high-resolution radial 9 mm × 30° (1,024 A-scans per B-scan; 12 B-scans) scanning protocol.

External Testing Data Sets

We identified seven independent, retrospectively collected data sets of macular OCT images from patients with DM at different centers to test the performance of the DL system in the classification of DME and non-DME retinal abnormalities. The images were selected by the site investigators from defined time periods representative of the DM cohort at each site. Data sets 1–3 (External 1, 2, and 3) were collected, respectively, from the Singapore Integrated Diabetic Retinopathy Program (SiDRP), Singapore; the Eye Clinic at Alice Ho Miu Ling Nethersole Hospital (AHNH); and the Byers Eye Institute at Stanford, Stanford University Medical Center. All three centers used CIRRUS OCT with the same scanning protocol as the primary data set. Data sets 4–6 (External 4, 5, and 6) were collected, respectively, from the Eye Clinic of the Aier medical group in Guangzhou, China; the Westmead Institute for Medical Research; and the Eye Clinic at United Christian Hospital. All three centers used SPECTRALIS OCT with the same scanning protocols as the primary data set. Finally, data set 7 (External 7) was collected from the Retina Clinic of Joint Shantou International Eye Center (JSIEC); this center used Triton OCT with the same scanning protocol as the primary data set.

Ground Truth Labeling

All anonymized OCT scans were labeled by well-trained graders (F.T. and Z.T.) on full-screen, high-resolution 27-inch monitors (Koninklijke Philips N.V.) in the CUHK Ophthalmic Reading Centre, following the reference standards and grading definitions listed below. The intragrader Cohen κ values of F.T. and Z.T. were 0.947 and 0.924, respectively, for the presence or absence of DME, and 0.940 and 0.923, respectively, for the presence or absence of non-DME retinal abnormalities. The intergrader Cohen κ was 0.889 for the presence or absence of DME and 0.862 for the presence or absence of non-DME retinal abnormalities. A panel of retina specialists adjudicated the positive cases during ground truth labeling.
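
For readers unfamiliar with the agreement statistic reported above, the following minimal Python sketch shows how a Cohen κ of this kind can be computed with scikit-learn; the grader labels are dummy values for illustration, not study data.

```python
from sklearn.metrics import cohen_kappa_score

# Dummy DME labels from two graders for six scans (illustrative only).
grader_1 = ["DME", "no DME", "DME", "DME", "no DME", "no DME"]
grader_2 = ["DME", "no DME", "DME", "no DME", "no DME", "no DME"]

# Cohen's kappa corrects the raw agreement rate for chance agreement.
print(round(cohen_kappa_score(grader_1, grader_2), 3))  # -> 0.667
```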

A gradable scan was defined as one assessable for macular morphology, with good image quality and free of image artifacts. An acceptable scan was defined as one assessable for macular morphology and subsequent pathology labeling despite fair image quality due to artifacts (e.g., low signal strength). An ungradable scan was defined as one not assessable for macular morphology because of insufficient image quality or artifacts. Ungradable scans were excluded from system training.

The presence of DME was defined as either perceptible retinal thickening or the presence of DME features (e.g., intraretinal cystoid spaces, subretinal fluid, and hard exudates) in the macula. For eyes with DME, CI-DME was defined as retinal thickening or the presence of DME features in the macula involving the central subfield zone (1 mm in diameter), whereas non-CI-DME was defined as retinal thickening or the presence of DME features in the macula not involving the central subfield zone. Retinal thickening was defined according to DRCR.net protocol–defined thresholds (≥320 µm for men and ≥305 µm for women on SPECTRALIS OCT; ≥305 µm for men and ≥290 µm for women on CIRRUS OCT) and the threshold of the Moorfields DME study (≥350 µm on a Topcon OCT) (23,24). The absence of DME was defined as the absence of both retinal thickening and any DME features. Finally, the presence of non-DME retinal abnormalities was defined as any abnormal appearance in the OCT scan other than DME (e.g., age-related macular degeneration, epiretinal membrane, central serous chorioretinopathy, and macular holes).
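
The device- and sex-specific thickening thresholds quoted above reduce to a simple lookup. The Python sketch below is an illustrative transcription of those cut points; the dictionary keys and function name are ours, not part of the study's software.

```python
# Central subfield thickness (CST) thresholds from the DRCR.net protocol and
# the Moorfields DME study, as quoted in the text, in micrometers.
THICKENING_THRESHOLDS_UM = {
    ("SPECTRALIS", "male"): 320,
    ("SPECTRALIS", "female"): 305,
    ("CIRRUS", "male"): 305,
    ("CIRRUS", "female"): 290,
    ("Triton", "male"): 350,    # Topcon/Moorfields threshold is not sex-specific
    ("Triton", "female"): 350,
}

def has_retinal_thickening(device: str, sex: str, cst_um: float) -> bool:
    """True if the CST meets the device- and sex-specific threshold."""
    return cst_um >= THICKENING_THRESHOLDS_UM[(device, sex)]

print(has_retinal_thickening("CIRRUS", "female", 295))  # -> True
```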

This study was conducted in accordance with the 1964 Declaration of Helsinki and was approved by the local research ethics committees. Because the study is a retrospective analysis of fully anonymized OCT images, the ethics committees waived the requirement for informed consent.

Development of the DL System

A detailed description of the development of the DL system can be found in the Supplementary Material. Briefly, we built a 3D multitask CNN for analyzing 3D volume scans imaged by CIRRUS OCT and a 2D multitask CNN for analyzing a series of 2D B-scans imaged by SPECTRALIS OCT and Triton OCT (25). Fig. 1 illustrates the architecture of both versions of the CNN, which comprises three components: a shared feature extraction module, a “DME classification” module, and a “non-DME retinal abnormalities classification” module.
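
As a rough illustration of this shared-backbone multitask design, the PyTorch sketch below pairs one feature extractor with two classification heads. It is a minimal sketch under stated assumptions: torchvision's r3d_18 (an 18-layer 3D ResNet) stands in for the 3D ResNet-34 described in the article, and the class names, head names, and input size are illustrative.

```python
import torch
import torch.nn as nn
from torchvision.models.video import r3d_18  # 3D ResNet; stands in for ResNet-34

class MultitaskOCTNet(nn.Module):
    """Shared 3D feature extractor feeding two task-specific heads."""

    def __init__(self, num_dme_classes: int = 3, num_abnormality_classes: int = 2):
        super().__init__()
        backbone = r3d_18(weights=None)
        feat_dim = backbone.fc.in_features
        backbone.fc = nn.Identity()  # drop the last fully connected layer
        self.features = backbone     # shared feature extraction module
        # DME head: no DME / non-CI-DME / CI-DME.
        self.dme_head = nn.Linear(feat_dim, num_dme_classes)
        # Non-DME retinal abnormality head: absence / presence.
        self.abnormality_head = nn.Linear(feat_dim, num_abnormality_classes)

    def forward(self, x: torch.Tensor):
        # x: (batch, channels, depth, height, width) OCT volume.
        z = self.features(x)
        return self.dme_head(z), self.abnormality_head(z)

# Shape check on a dummy volume.
model = MultitaskOCTNet()
dme_logits, abn_logits = model(torch.randn(1, 3, 16, 112, 112))
print(dme_logits.shape, abn_logits.shape)  # torch.Size([1, 3]) torch.Size([1, 2])
```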

Figure 1

The proposed 3D multitask convolutional neural network (CNN) for 3D OCT volumetric scans (A) and 2D multitask CNN for 2D B-scan images (B). Both our networks have three components: a shared feature extraction module, a DME classification (no DME, non-CI-DME, and CI-DME) module, and an abnormality classification module. We used a 3D version of residual network (ResNet)-34 for the 3D CNN and ResNet-18 for the 2D CNN. We applied the following presence-based strategy to obtain per-scan (volume) level results for the 2D CNN: 1) If any B-scans are predicted as CI-DME, the whole scan is classified as CI-DME; 2) if 1 does not hold and at least one B-scan is predicted as non-CI-DME, the whole scan is classified as non-CI-DME; and 3) if both 1 and 2 do not hold, the whole scan is classified as non-DME. Conv, convolution; GAP, global average pooling; Max, maximum; Norm., normalization; ReLU, rectified linear unit.
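
The presence-based rule in the caption above can be written down directly. This short Python sketch is a literal transcription of rules 1–3; the label strings are illustrative.

```python
from typing import List

def aggregate_scan(bscan_predictions: List[str]) -> str:
    """Collapse per-B-scan predictions into one per-scan (volume) label."""
    if "CI-DME" in bscan_predictions:      # rule 1: any B-scan is CI-DME
        return "CI-DME"
    if "non-CI-DME" in bscan_predictions:  # rule 2: otherwise, any non-CI-DME
        return "non-CI-DME"
    return "no DME"                        # rule 3: neither class present

print(aggregate_scan(["no DME", "non-CI-DME", "no DME"]))  # -> non-CI-DME
```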

Statistical Analysis

Numerical data were analyzed with the Wilcoxon rank sum test. The χ2 test was used for the analysis of categorical data, including comparisons of demographic characteristics across data sets. The discriminative performance of the DL system (classifying the presence or absence of DME, CI-DME vs. non-CI-DME, and the presence or absence of non-DME retinal abnormalities) was evaluated using the area under the receiver operating characteristic curve (AUROC), in addition to sensitivity, specificity, and accuracy expressed as percentages. All statistical analyses were performed with RStudio (version 1.1.463, 2009–2018; RStudio, Inc.). We checked that the sample size of each retrospective external testing data set provided sufficient power to estimate the performance of the DL system; positive cases ranged across data sets from 2–92%, 57–94%, and 3–26% for any DME, CI-DME, and non-DME retinal abnormalities, respectively. To detect an AUROC of ≥0.7 with >80% power (α = 0.05), we estimated that 17–174, 36–252, and 17–23 cases would be required from each data set for each classification task.
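
The article does not state how the 95% CIs around the AUROCs were obtained, so the following Python sketch shows one common approach, a percentile bootstrap over resampled cases, using synthetic labels and scores purely for illustration.

```python
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
y_true = rng.integers(0, 2, size=500)           # synthetic binary ground truth
y_score = 0.6 * y_true + 0.7 * rng.random(500)  # synthetic classifier scores

auroc = roc_auc_score(y_true, y_score)
boot = []
for _ in range(2000):
    idx = rng.integers(0, len(y_true), size=len(y_true))  # resample with replacement
    if np.unique(y_true[idx]).size < 2:                   # skip one-class resamples
        continue
    boot.append(roc_auc_score(y_true[idx], y_score[idx]))
lo, hi = np.percentile(boot, [2.5, 97.5])
print(f"AUROC {auroc:.3f} (95% CI {lo:.3f} to {hi:.3f})")
```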

A total of 100,727 OCT images, representing 4,261 eyes from 2,329 subjects with DM, were used for development, primary validation, and external testing. These images include 7,006 volume scans from CIRRUS OCT, 48,810 B-scans from SPECTRALIS OCT, and 44,911 B-scans from Triton OCT. Table 1 shows the characteristics of the study participants in both the primary data set and the external testing data sets.

Table 1

Summary of primary and external testing data sets for training, validating, and testing the multitask DL system

Data set | OCT device | Measure | Sample for DME classification module (No DME / Non-CI-DME / CI-DME) | Sample for non-DME retinal abnormalities classification module (Absence / Presence)
Total sample: n = 100,727 images
Primary CIRRUS HD-OCT No. of OCT volumes 2,580 413 795 3,405 383 
  No. of eyes 655 69 215 828 111 
   No. of subjects 295 38 141 402 72 
  Male sex, n (%) 136 (46.1) 21 (55.3) 82 (58.2) 209 (52.0) 30 (41.7) 
  Age (years), mean (SD) 60.8 (15.4) 62.7 (11.6) 63.7 (9.2) 60.6 (13.8) 68.0 (10.6) 
 SPECTRALIS OCT No. of OCT volumes 864 203 328 1,040 355 
  No. of OCT B-scans 25,073 4,392 1,050 25,815 4,700 
  No. of eyes 368 74 103 383 162 
  No. of subjects 164 36 74 171 103 
  Male sex, n (%) 84 (51.2) 17 (47.2) 50 (67.6) 95 (55.6) 56 (54.4) 
  Age (years), mean (SD) 59.5 (15.6) 64.2 (10.1) 62.2 (11.2) 58.3 (14.9) 65.4 (10.7) 
 Triton OCT No. of OCT volumes 2,121 421 790 2,163 1,169 
  No. of OCT B-scans 31,639 2,171 5,633 30,280 9,163 
  No. of eyes 575 89 296 538 427 
  No. of subjects 267 37 186 263 227 
  Male sex, n (%) 133 (49.8) 16 (43.2) 92 (49.5) 117 (44.5) 124 (54.6) 
  Age (years), mean (SD) 60.4 (14.9) 63.5 (12.7) 64.5 (12.7) 61.9 (12.9) 62.3 (13.6) 
External 1 (Singapore, SERI) CIRRUS HD-OCT No. of OCT volumes 2,320 19 26 2,292 73 
  No. of eyes 1,349 11 18 1,330 48 
  No. of subjects 663 25 654 43 
  Male sex, n (%) 395 (59.6) 5 (55.6) 11 (44.0) 381 (58.3) 30 (69.8) 
  Age (years), mean (SD) 59.5 (9.2) 53.8 (6.1) 58.3 (7.9) 58.3 (9.1) 61.6 (9.8) 
External 2 (Hong Kong, AHNH) CIRRUS HD-OCT No. of OCT volumes 412 69 178 486 173 
  No. of eyes 355 64 154 422 151 
  No. of subjects 141 34 117 186 106 
  Male sex, n (%) 86 (61.0) 23 (67.6) 72 (61.5) 116 (62.4) 65 (61.3) 
  Age (years), mean (SD) 69.1 (9.81) 61.2 (15.6) 64.8 (9.0) 65.5 (11.2) 69.2 (8.4) 
External 3 (U.S., Stanford) CIRRUS HD-OCT No. of OCT volumes 96 42 56 151 43 
  No. of eyes 96 42 56 151 43 
  No. of subjects 48 18 40 71 35 
  Male sex, n (%) 22 (45.8) 6 (33.3) 16 (40.0) 24 (33.8) 20 (57.1) 
  Age (years), mean (SD) 60.44 (12.0) 62.8 (10.7) 65.5 (10.8) 65.9 (11.3) 68.8 (9.2) 
External 4 (China, Aier) SPECTRALIS OCT No. of OCT volumes 172 27 172 30 
  No. of OCT B-scans 3,352 309 81 3,541 201 
  No. of eyes 172 27 172 30 
  No. of subjects 104 21 101 25 
  Male sex, n (%) 61 (58.7) 1 (33.3) 10 (47.6) 58 (57.4) 14 (56.0) 
  Age (years), mean (SD) 57.2 (10.0) 69.8 (5.41) 63.0 (11.6) 58.2 (10.2) 61.2 (13.1) 
External 5 (Australia, WIMR) SPECTRALIS OCT No. of OCT volumes 46 36 121 178 25 
  No. of OCT B-scans 3,578 1,151 361 4,588 502 
  No. of eyes 46 36 121 178 25 
  No. of subjects 11 11 81 87 16 
  Male sex, n (%) 6 (54.5) 5 (45.5) 53 (65.4) 55 (63.2) 9 (56.3) 
  Age (years), mean (SD) 54.2 (14.7) 54.5 (5.7) 62.2 (8.4) 59.5 (9.7) 65.6 (6.9) 
External 6 (Hong Kong, UCH) SPECTRALIS OCT No. of OCT volumes 53 50 296 327 72 
  No. of OCT B-scans 3,980 4,484 999 8,361 1,102 
  No. of eyes 53 50 296 327 72 
  No. of subjects 11 14 191 160 56 
  Male sex, n (%) 5 (45.5) 9 (64.3) 119 (62.3) 84 (52.5) 35 (62.5) 
  Age (years), mean (SD) 66.5 (11.6) 61.7 (9.1) 67.2 (9.8) 65.9 (10.5) 69.4 (7.6) 
External 7 (China, JSIEC) Triton OCT No. of OCT volumes 36 23 394 116 337 
  No. of OCT B-scans 1,295 507 3,666 4,872 596 
  No. of eyes 36 23 394 116 337 
  No. of subjects 29 17 280 54 272 
  Male sex, n (%) 11 (37.9) 4 (23.6) 128 (45.7) 20 (37.0) 120 (44.1) 
  Age (years), mean (SD) 57.2 (12.8) 60.0 (7.7) 58.9 (8.8) 59.0 (10.4) 58.7 (9.0) 

Aier, Aier School of Ophthalmology; SERI, Singapore Eye Research Institute; Stanford, Byers Eye Institute at Stanford; UCH, United Christian Hospital; WIMR, Westmead Institute for Medical Research.

Table 2 shows the discriminative performance of the DL system in DME classification (presence vs. absence of DME) for the primary validation and external testing data sets at the volume scan level. For the primary data set, the DL system achieved AUROCs of 0.937 (95% CI 0.920–0.954), 0.958 (95% CI 0.930–0.977), and 0.965 (95% CI 0.948–0.977) among images obtained from the CIRRUS, SPECTRALIS, and Triton OCTs, respectively, with sensitivities of 87.4%, 92.7%, and 94.3%; specificities of 100%, 98.9%, and 98.6%; and accuracies of 96.4%, 96.3%, and 96.9%. For classifying CI-DME and non-CI-DME among eyes with any DME, the DL system achieved AUROCs of 0.968 (95% CI 0.940–0.995), 0.951 (95% CI 0.898–0.982), and 0.975 (95% CI 0.947–0.991) among images obtained from the CIRRUS, SPECTRALIS, and Triton OCTs, with sensitivities of 95.8%, 92.3%, and 98.9%; specificities of 97.8%, 97.9%, and 96.2%; and accuracies of 96.3%, 94.4%, and 98.0%. For the external data sets, the discriminative performance of the DL system with different OCT devices was similar to that for the primary data set. For the classification of any DME, the ranges for AUROC, sensitivity, specificity, and accuracy were 0.906–0.956, 81.4–100.0%, 89.7–100.0%, and 92.6–99.5%, respectively. For the classification of CI-DME and non-CI-DME, the ranges were 0.894–1.000, 87.1–100.0%, 85.7–100.0%, and 91.3–100.0%, respectively.

Table 2

Discriminative performance of the multitask DL system in the classification of DME and CI-DME across primary validation and external testing data sets

Classification task and OCT device | Data set | AUROC (95% CI) | Sensitivity, % (95% CI) | Specificity, % (95% CI) | Accuracy, % (95% CI)
Presence vs. absence of DME      
 CIRRUS HD-OCT Primary 0.937 (0.920–0.954) 87.4 (82.7–91.6) 100.0 (100.0–100.0) 96.4 (95.1–97.6) 
 External 1 0.906 (0.947–0.968) 81.4 (69.8–93.0) 99.8 (99.6–100.0) 99.5 (99.2–99.7) 
 External 2 0.929 (0.907–0.947) 89.5 (85.1–93.0) 96.3 (93.9–97.9) 93.6 (91.5–95.4) 
 External 3 0.930 (0.894–0.965) 86.0 (78.5–92.5) 100.0 (100.0–100.0) 93.5 (90.0–96.5) 
 SPECTRALIS OCT Primary 0.958 (0.930–0.977) 92.7 (86.9–96.4) 98.9 (96.3–99.9) 96.3 (93.7–98.1) 
 External 4 0.956 (0.935–0.978) 100.0 (100.0–100.0) 91.3 (86.6–95.4) 92.6 (88.6–96.0) 
 External 5 0.936 (0.879–0.994) 97.6 (95.2–99.4) 89.7 (75.9–100.0) 96.4 (93.8–98.5) 
 External 6 0.949 (0.922–0.977) 96.4 (94.2–98.4) 93.4 (87.9–97.8) 95.7 (93.7–97.5) 
 Triton OCT Primary 0.965 (0.948–0.977) 94.3 (90.9–96.8) 98.6 (96.9–99.5) 96.9 (95.4–98.1) 
 External 7 0.954 (0.930–0.971) 99.3 (97.9–99.9) 91.7 (77.5–98.3) 98.6 (97.1–99.5) 
CI-DME vs. non-CI-DME      
 CIRRUS HD-OCT Primary 0.968 (0.940–0.995) 95.8 (92.3–95.6) 97.8 (93.3–100.0) 96.3 (93.1–98.9) 
 External 1 0.939 (0.851–1.000) 95.5 (86.4–100.0) 92.3 (76.9–100.0) 94.3 (85.7–100.0) 
 External 2 0.894 (0.847–0.931) 87.1 (81.1–91.8) 91.7 (81.6–97.2) 88.3 (83.5–92.2) 
 External 3 1.000 (1.000–1.000) 100.0 (100.0–100.0) 100.0 (100.0–100.0) 100.0 (100.0–100.0) 
 SPECTRALIS OCT Primary 0.951 (0.898–0.982) 92.3 (84.0–97.1) 97.9 (88.9–99.9) 94.4 (88.9–97.7) 
 External 4 0.929 (0.863–0.995) 88.9 (71.0–97.6) 100.0 (29.2–100.0) 90.0 (73.5–97.9) 
 External 5 0.899 (0.851–0.947) 94.0 (89.5–97.7) 85.7 (76.2–93.7) 91.3 (87.2–94.9) 
 External 6 0.934 (0.905–0.962) 94.2 (91.4–96.6) 92.5 (86.9–97.2) 93.7 (91.5–96.0) 
 Triton OCT Primary 0.975 (0.947–0.991) 98.9 (95.9–99.9) 96.2 (89.2–99.2) 98.0 (95.4–99.4) 
 External 7 0.975 (0.955–0.988) 100.0 (99.1–100.0) 95.0 (75.1–99.9) 99.8 (98.7–100.0) 

For identification of External 1, 2, 3, 4, 5, 6, and 7, see Table 1.

Table 3 shows the performance of the DL system in classifying the presence or absence of non-DME retinal abnormalities at the volume scan level. For the primary data set, the AUROCs were 0.948 (95% CI 0.930–0.963), 0.949 (95% CI 0.901–0.996), and 0.938 (95% CI 0.915–0.960) for images obtained from the CIRRUS, SPECTRALIS, and Triton OCTs, respectively, with sensitivities of 93.0%, 93.1%, and 97.2%; specificities of 89.4%, 96.6%, and 90.3%; and accuracies of 89.9%, 96.3%, and 91.0%. Performance in the external data sets remained excellent, with ranges for AUROC, sensitivity, specificity, and accuracy of 0.901–0.969, 84.2–99.6%, 80.6–98.8%, and 91.0–98.0%, respectively.

Table 3

Discriminative performance of the multitask DL system in the classification of presence vs. absence of non-DME retinal abnormalities across primary validation and external testing data sets

OCT device | Data set | AUROC (95% CI) | Sensitivity, % (95% CI) | Specificity, % (95% CI) | Accuracy, % (95% CI)
CIRRUS HD-OCT Primary 0.948 (0.930–0.963) 93.0 (86.8–96.9) 89.4 (86.7–91.7) 89.9 (87.6–92.0) 
 External 1 0.969 (0.941–0.996) 90.4 (83.6–97.3) 97.9 (89.4–96.3) 97.7 (89.6–99.3) 
 External 2 0.915 (0.891–0.935) 91.0 (85.2–95.1) 88.7 (85.4–91.3) 89.2 (86.7–91.5) 
 External 3 0.898 (0.830–0.966) 84.2 (71.1–94.7) 92.6 (88.3–96.3) 91.0 (87.0–94.5) 
SPECTRALIS OCT Primary 0.949 (0.901–0.996) 93.1 (82.8–100.0) 96.6 (94.3–98.7) 96.3 (94.2–98.2) 
 External 4 0.940 (0.901–0.979) 96.7 (90.0–100.0) 91.3 (82.6–95.4) 92.1 (85.2–95.5) 
 External 5 0.960 (0.912–1.000) 93.1 (82.8–100.0) 98.8 (97.0–100.0) 98.0 (95.9–99.5) 
 External 6 0.901 (0.867–0.935) 99.6 (98.9–100.0) 80.6 (73.9–87.3) 93.2 (90.5–95.5) 
Triton OCT Primary 0.938 (0.915–0.960) 97.2 (93.1–100.0) 90.3 (87.8–92.6) 91.0 (88.9–93.1) 
 External 7 0.926 (0.897–0.955) 90.5 (85.3–95.7) 94.7 (92.0–96.7) 93.6 (91.4–95.8) 

For identification of External 1, 2, 3, 4, 5, 6, and 7, see Table 1.

Figure 2 and Videos 1–3 show examples of images from eyes with DME for each of the three OCT devices, together with their corresponding heat maps, demonstrating our DL system’s ability to attend to features relevant to DME identification. In additional analyses (Supplementary Material), because the volume scan–level results for SPECTRALIS OCT and Triton OCT were generated by 2D CNNs at the B-scan level using the presence-based strategy, we further tested performance in the classification of any DME (Supplementary Table 1) and non-DME retinal abnormalities (Supplementary Table 2) at the B-scan level. We also tested performance when only one scan from each eye was included in the primary data set (Supplementary Tables 3 and 4). Furthermore, we tested performance in classifying any DME among eyes with non-DME retinal abnormalities (Supplementary Tables 5 and 7) and in classifying non-DME retinal abnormalities among eyes with DME (Supplementary Tables 6 and 8) in the primary data set. Performance was similar to that in the entire primary data set. Heat maps of each false-negative and false-positive case at the per-scan level for SPECTRALIS OCT and Triton OCT in the primary testing data set for DME classification were reviewed, and examples are presented in Supplementary Figs. 1 and 2.

Figure 2

Examples of eyes with DME and corresponding heat maps. DME was identified among images from CIRRUS HD-OCT (A), SPECTRALIS OCT (B), and Triton OCT (C). In the heat maps, the colored area indicates the gradient of discriminatory power for classifying the presence or absence of DME. An orange-red color indicates the greatest relative discriminatory power, whereas a green-blue color indicates the least relative discriminatory power.

Video 1. A case with DME on Cirrus OCT scan. Available from https://bcove.video/3kNBT8X.

Video 2. A case with DME on Spectralis OCT scan. Available from https://bcove.video/3iELLPv.

Video 3. A case with DME on Triton OCT scan. Available from https://bcove.video/3y2eCnh.

Regular screening for DR remains a cornerstone of the management of diabetic eye disease and has been shown to reduce blindness at the population level. A major shift in the last decade has been the widespread use of 2D retinal fundus photography for DR screening. However, because DME is the primary cause of vision loss among patients with DM, OCT has been suggested for the timely identification and treatment of DME, particularly CI-DME, to prevent vision loss (3,7,26). In the current study, we developed and validated a novel DL system for the fully automated classification of DME based on both 3D and 2D OCT images from three commonly used devices, yielding volume scan–level results for each eye. We externally tested the DL system using diverse, independent data sets collected from different centers, across different racial/ethnic groups, and in different settings (i.e., community-based screening and tertiary care settings). We showed that the proposed DL system had excellent discriminative performance in classifying any DME, in distinguishing CI-DME from non-CI-DME, and in identifying other non-DME retinal abnormalities on OCT images captured by the three widely used devices.

Our study substantially extends existing research in the following ways. First, most reported DL algorithms for DME detection on OCT have been trained and tested using cross-sectional B-scans (16–20,27,28). In the current study, we trained a CNN model to detect DME using 3D volume scans obtained from CIRRUS OCT, achieving comparable performance. Using 3D volume scans has substantial merits for DL algorithm training (29,30). For instance, labeling numerous B-scans for supervised training is laborious and time-consuming for experts; training the proposed CNN with volume scan–level labels reduced this labeling burden while maintaining excellent performance. En face slab imaging is also available from OCT; however, en face images may not be informative for DME detection by a DL system, as DME is a 3D condition. It should be noted that the volume scan–level results for SPECTRALIS OCT and Triton OCT were obtained from predictions made for a series of B-scans according to the presence-based strategy described in Research Design and Methods, as only individual B-scans could be exported from the raw files. By contrast, entire volumetric cubes could be exported directly from CIRRUS OCT.

Our study’s second novel feature is the classification of DME into CI-DME and non-CI-DME subgroups by the DL system. This subgroup categorization is clinically important for DR screening, as it determines the reexamination frequency, the necessity and timing of referral to ophthalmologists, and the treatment recommendations in different resource settings, according to the International Council of Ophthalmology (ICO) Guidelines for Diabetic Eye Care (6). For example, in low- to intermediate-resource settings, patients with CI-DME identified by the DL system should be referred to ophthalmologists and considered for either intravitreal anti–vascular endothelial growth factor therapy or laser photocoagulation as soon as possible, whereas patients with non-CI-DME can be referred to ophthalmologists less urgently (6). Subclassification of DME will help triage patients more effectively in DR screening programs by reducing false positives, conserving resources (especially in low- or intermediate-resource regions or countries), and enabling ophthalmologists to prioritize patients who need prompt treatment to prevent vision loss, allowing better use of costly specialist care and shorter hospital wait times. However, DME is only one of many factors that determine whether a given patient requires treatment or what that treatment should be, and the current DL system alone is not intended for making therapeutic decisions. In addition to the subgroup classification of DME, we intend to further develop the DL system to include other outputs related to visual acuity and retinal structure (e.g., the ellipsoid zone and external limiting membrane). Such outputs may further indicate the likelihood of visual recovery and improve the prioritization of patients who need prompt treatment, making the system more valuable for screening and for assisting clinicians.

Our third novelty is the ability of the DL system to detect non-DME retinal conditions. Most previously reported DL systems focus on only one disease besides DME (17,18,27,31). Our study goes beyond previously published work by detecting non-DME retinal abnormalities among individuals with DM, enhancing the applicability of our DL system in real-world screening settings. Training such a system to achieve good performance on OCT images with multiple diseases besides DME can be difficult, given that some diseases are uncommon and some ocular changes share features with DME. However, detecting multiple diseases is relevant to, and representative of, the populations targeted by DR screening. Our proposed DL system showed excellent performance in detecting DME not only among all cases but also among those with non-DME retinal abnormalities, and vice versa (detecting non-DME retinal abnormalities among eyes with DME; results shown in Supplementary Tables 3–6). Both results suggest that the system has great potential for real-world clinical utility.

The fourth novel aspect of our DL algorithm is its applicability to three commonly used OCT devices. Previous studies have focused on one or, at most, two commercial OCT devices (15,16,18,20,27,28,32). In this study, we trained CNNs to detect DME using images obtained from three commercial OCT devices, making the screening more generalizable. A common research challenge is that, although the Digital Imaging and Communications in Medicine (DICOM) standard ensures reasonable consistency among OCT images from different manufacturers, OCT images are often stored in a compressed format that may result in loss of information. We therefore trained the DL system using raw data (i.e., IMG files from CIRRUS OCT, E2E files from SPECTRALIS OCT, and FDS files from Triton OCT) exported from each OCT manufacturer’s software. Our DL system thus represents a machine-agnostic platform applicable to a wide range of OCT modalities.

Finally, in most previous studies, DL systems were trained to detect retinal pathologies by focusing either on the segmentation of relevant pathological markers such as macular fluid (15,18,20) or on the classification of the presence of specific pathologies on OCT scans (16,17,19). Although the segmentation approach to fluid quantification provides visual and quantitative outcomes, it requires a vast number of B-scan–level ground truths (i.e., pathologies delineated on each B-scan) labeled by skilled technicians. Such detailed labels are often unavailable, limiting the usability of the approach. In the current study, we trained our DL system to classify DME and non-DME retinal abnormalities by applying a segmentation-free classification approach, reducing the time and personnel required for labeling and making it possible to train and validate the DL system on large data sets (>100,000 OCT images) while maintaining performance.

Our DL system showed excellent performance in detecting DME in the primary validation data set and in all external data sets; all AUROC values were >0.90, demonstrating high generalizability. Nevertheless, sensitivities and specificities dropped slightly for certain external data sets. We identified two possible reasons for these decreases. First, the drops might be due to inter–data set variation in the OCT images, including differences in calibration and image intensity (e.g., background noise and brightness), even though scanning protocols were kept the same. Second, discrepancies might arise from the racial/ethnic diversity of the primary and external data sets. Previously reported ethnic variability in retinal structure (e.g., foveal architecture [33] and vascular morphology [34]) might influence the DME classification performance of the DL system. Despite these issues, the performance of our proposed DL system, with sensitivities >80%, should be adequate for screening purposes (26,35).

The current study has several strengths. First, our DL system can analyze OCT images from three commercially available OCT devices and has been successfully tested on unseen multicenter data sets comprising images from different racial/ethnic backgrounds and geographic locations. Second, the DL system generated heat maps to visualize discriminative image regions, allowing a better understanding of the features the DL system used to perform its classification tasks. Several limitations should also be considered. First, we trained and tested our DL system only on gradable OCT images. An initial gradability assessment is essential before disease classification, as it combines quality checks of the acquired images with decisions on image inclusion. We have begun developing a separate 3D CNN for the automated filtering of ungradable OCT volume scans obtained from CIRRUS OCT (36). In a preliminary analysis of macular OCT scans, this CNN achieved an AUROC of 0.884 and an accuracy of 0.873 in distinguishing gradable from ungradable OCT volumes (unpublished data, A.R. Ran, Z.Q. Tang, F.Y. Tang, J. Shi, A.K. Ngai, V. Yuen, N. Kei, and C.Y. Cheung). We will incorporate this CNN into the next version of the DL system. Second, the types of non-DME retinal abnormalities considered were relatively limited, and their numbers in the primary training data set were few; indeed, performance in the external data sets dropped slightly. The DL system therefore needs to be expanded and refined to cover a greater variety of abnormalities. Third, distinguishing DME from macular edema caused by other retinal abnormalities on the basis of OCT images alone may be difficult. Nevertheless, the proposed DL system is intended not as a diagnostic tool but as a second-line screening tool for more effective patient triage in DR screening programs. Notably, all data sets were retrospectively collected from patients with DM in eye clinics or DR screening programs, so the majority of macular edema was likely due to DM or DR, minimizing confusion over its cause. Fourth, spectral domain OCT remains specialized equipment in most countries, and its limited availability in the community may constrain implementation of the DL system. Further evidence on how the DL system performs as a second-line screening tool, or even as a first-line screening or diagnostic support tool, in improving clinical workflows will be essential before eventual real-world clinical use. In addition, future development should unify all types of OCT scans into one framework to achieve effective DME classification in a device-agnostic manner.

In summary, we developed, validated, and externally tested a multitask DL system to identify any DME, the subtypes of CI-DME and non-CI-DME, and non-DME retinal abnormalities from images obtained using three commercial OCT devices. The system showed excellent performance across diverse study populations in different settings. Our study extends the promise of incorporating OCT into current retinal fundus photography–based DR screening programs as a second-line screening tool, allowing for the efficient and reliable detection of DME, which may lead to reductions in overreferrals and increased clinical use of DL systems.

This article contains supplementary material online at https://doi.org/10.2337/figshare.14710284.

Acknowledgments. The authors thank the study participants and staff of the following institutes: Department of Ophthalmology and Visual Sciences, CUHK; Department of Computer Science and Engineering, CUHK; Hong Kong Eye Hospital; Department of Ophthalmology and Visual Sciences, Prince of Wales Hospital; AHNH; United Christian Hospital; Singapore Eye Research Institute; JSIEC, Aier School of Ophthalmology; Byers Eye Institute at Stanford; Department of Ophthalmology, Westmead Institute for Medical Research; and Macquarie University Hearing, Department of Linguistics, Macquarie University.

Funding. This study was funded by the Research Grants Council General Research Fund, Hong Kong (no. 14102418); the Innovation and Technology Fund, Hong Kong (MRP/056/20X); Research to Prevent Blindness; and the National Institutes of Health (P30-EY-026877).

The funder had no role in study design, data collection, data analysis, data interpretation, or report writing.

Duality of Interest. No potential conflicts of interest relevant to this article were reported.

Author Contributions. F.T. and C.Y.C. contributed to literature search. F.T., C.P.P., C.C.T., and C.Y.C. contributed to the study design. X.W. developed and validated the DL system, under the supervision of H.C. and P.-A.H. F.T., A.-r.R., C.K.M.C., M.H., W.Y., A.L.Y., J.L., S.S., J.C., F.Y., R.W., Z.T., D.Y., D.S.N., L.J.C., M.B., V.C., K.L., T.H.T.L., G.S.T., D.S.W.T., H.H., H.C., J.H.M., T.L., S.K., S.S.M., R.T.C., G.L., B.G., T.Y.W., S.B.T., and C.Y.C. contributed to data collection. F.T. and A.-r.R. contributed to data analysis. F.T. and X.W. contributed to figure design. F.T., X.W., T.Y.Y.L., P.H.S., and C.Y.C. contributed to data interpretation. The manuscript was critically revised and approved by all authors. C.Y.C. obtained funding. C.Y.C. is the guarantor of this work and, as such, had full access to all the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis.

1. Tan GS, Cheung N, Simó R, Cheung GC, Wong TY. Diabetic macular oedema. Lancet Diabetes Endocrinol 2017;5:143–155
2. Scanlon PH. The English National Screening Programme for diabetic retinopathy 2003-2016. Acta Diabetol 2017;54:515–525
3. Garvican L, Clowes J, Gillow T. Preservation of sight in diabetes: developing a national risk reduction programme. Diabet Med 2000;17:627–634
4. Nguyen HV, Tan GS, Tapp RJ, et al. Cost-effectiveness of a national telemedicine diabetic retinopathy screening program in Singapore. Ophthalmology 2016;123:2571–2580
5. Wang LZ, Cheung CY, Tapp RJ, et al. Availability and variability in guidelines on diabetic retinopathy screening in Asian countries. Br J Ophthalmol 2017;101:1352–1360
6. Wong TY, Sun J, Kawasaki R, et al. Guidelines on diabetic eye care: the International Council of Ophthalmology recommendations for screening, follow-up, referral, and treatment based on resource settings. Ophthalmology 2018;125:1608–1622
7. Solomon SD, Chew E, Duh EJ, et al. Diabetic retinopathy: a Position Statement by the American Diabetes Association. Diabetes Care 2017;40:412–418
8. Wong RL, Tsang CW, Wong DS, et al. Are we making good use of our public resources? The false-positive rate of screening by fundus photography for diabetic macular oedema. Hong Kong Med J 2017;23:356–364
9. Jyothi S, Elahi B, Srivastava A, Poole M, Nagi D, Sivaprasad S. Compliance with the quality standards of National Diabetic Retinopathy Screening Committee. Prim Care Diabetes 2009;3:67–72
10. Goh JK, Cheung CY, Sim SS, Tan PC, Tan GS, Wong TY. Retinal imaging techniques for diabetic retinopathy screening. J Diabetes Sci Technol 2016;10:282–294
11. Olson J, Sharp P, Goatman K, et al. Improving the economic value of photographic screening for optical coherence tomography-detectable macular oedema: a prospective, multicentre, UK study. Health Technol Assess 2013;17:1–142
12. Leal J, Luengo-Fernandez R, Stratton IM, Dale A, Ivanova K, Scanlon PH. Cost-effectiveness of digital surveillance clinics with optical coherence tomography versus hospital eye service follow-up for patients with screen-positive maculopathy. Eye (Lond) 2019;33:640–647
13. Bressler SB, Edwards AR, Chalam KV, et al.; Diabetic Retinopathy Clinical Research Network Writing Committee. Reproducibility of spectral-domain optical coherence tomography retinal thickness measurements and conversion to equivalent time-domain metrics in diabetic macular edema. JAMA Ophthalmol 2014;132:1113–1122
14. Giani A, Cigada M, Choudhry N, et al. Reproducibility of retinal thickness measurements on normal and pathologic eyes by different optical coherence tomography instruments. Am J Ophthalmol 2010;150:815–824
15. De Fauw J, Ledsam JR, Romera-Paredes B, et al. Clinically applicable deep learning for diagnosis and referral in retinal disease. Nat Med 2018;24:1342–1350
16. Kermany DS, Goldbaum M, Cai W, et al. Identifying medical diagnoses and treatable diseases by image-based deep learning. Cell 2018;172:1122–1131.e9
17. Lemaître G, Rastgoo M, Massich J, et al. Classification of SD-OCT volumes using local binary patterns: experimental validation for DME detection. J Ophthalmol 2016;2016:3298606
18. Roy AG, Conjeti S, Karri SPK, et al. ReLayNet: retinal layer and fluid segmentation of macular optical coherence tomography using fully convolutional networks. Biomed Opt Express 2017;8:3627–3642
19. Rasti R, Rabbani H, Mehridehnavi A, Hajizadeh F. Macular OCT classification using a multi-scale convolutional neural network ensemble. IEEE Trans Med Imaging 2018;37:1024–1034
20. Schlegl T, Waldstein SM, Bogunovic H, et al. Fully automated detection and quantification of macular fluid in OCT using deep learning. Ophthalmology 2018;125:549–558
21. De Fauw J, Ledsam JR, Romera-Paredes B, et al. Clinically applicable deep learning for diagnosis and referral in retinal disease. Nat Med 2018;24:1342–1350
22. Tang FY, Ng DS, Lam A, et al. Determinants of quantitative optical coherence tomography angiography metrics in patients with diabetes. Sci Rep 2017;7:2575
23. Wells JA, Glassman AR, Ayala AR, et al.; Diabetic Retinopathy Clinical Research Network. Aflibercept, bevacizumab, or ranibizumab for diabetic macular edema. N Engl J Med 2015;372:1193–1203
24. Patrao NV, Antao S, Egan C, et al.; Moorfields Diabetic Macular Edema Study Group. Real-world outcomes of ranibizumab treatment for diabetic macular edema in a United Kingdom National Health Service setting. Am J Ophthalmol 2016;172:51–57
25. Wang X, Tang F, Chen H, et al. UD-MIL: uncertainty-driven deep multiple instance learning for OCT image classification. IEEE J Biomed Health Inform 2020;24:3431–3442
26. Vujosevic S, Aldington SJ, Silva P, et al. Screening for diabetic retinopathy: new perspectives and challenges. Lancet Diabetes Endocrinol 2020;8:337–347
27. Lee CS, Tyring AJ, Deruyter NP, Wu Y, Rokem A, Lee AY. Deep-learning based, automated segmentation of macular edema in optical coherence tomography. Biomed Opt Express 2017;8:3440–3448
28. Tsuji T, Hirose Y, Fujimori K, et al. Classification of optical coherence tomography images using a capsule network. BMC Ophthalmol 2020;20:114
29. Ran AR, Cheung CY, Wang X, et al. Detection of glaucomatous optic neuropathy with spectral-domain optical coherence tomography: a retrospective training and validation deep-learning analysis. Lancet Digit Health 2019;1:e172–e182
30. Ran AR, Tham CC, Chan PP, et al. Deep learning in glaucoma with optical coherence tomography: a review. Eye (Lond) 2021;35:188–201
31. Xu Z, Wang W, Yang J, et al. Automated diagnoses of age-related macular degeneration and polypoidal choroidal vasculopathy using bi-modal deep convolutional neural networks. Br J Ophthalmol 2021;105:561–566
32. Sun Y, Zhang H, Yao X. Automatic diagnosis of macular diseases from OCT volume based on its two-dimensional feature map and convolutional neural network with attention mechanism. J Biomed Opt 2020;25:096004
33. Ctori I, Huntjens B. The association between foveal morphology and macular pigment spatial distribution: an ethnicity study. PLoS One 2017;12:e0169520
34. Wong TY, Islam FM, Klein R, et al. Retinal vascular caliber, cardiovascular risk factors, and inflammation: the multi-ethnic study of atherosclerosis (MESA). Invest Ophthalmol Vis Sci 2006;47:2341–2350
35. Burnett S, Hurwitz B, Davey C, et al. The implementation of prompted retinal screening for diabetic eye disease by accredited optometrists in an inner-city district of North London: a quality of care study. Diabet Med 1998;15:S38–S43
36. Ran AR, Shi J, Ngai AK, et al. Artificial intelligence deep learning algorithm for discriminating ungradable optical coherence tomography three-dimensional volumetric optic disc scans. Neurophotonics 2019;6:041110
Readers may use this article as long as the work is properly cited, the use is educational and not for profit, and the work is not altered. More information is available at https://www.diabetesjournals.org/content/license.