OBJECTIVE

To describe the validity of recorded diabetic retinopathy (DR) and diabetic maculopathy (DMP) diagnoses, including diabetic macular edema (DMO), in The Health Improvement Network (THIN) database.

RESEARCH DESIGN AND METHODS

In two independent computer searches, we detected 20,838 patients with diabetes aged 1–84 years with a first DR computer Read entry in 2000–2008 and 4,064 with a first DMP entry. A two-step strategy was used to validate both outcomes: 1) review of patient profiles, including free-text comments from primary care practitioners (PCPs) (containing referral information and test results), for a random sample of 500 DR patients and all DMP computer-detected patients, whom we classified as probable, possible, or noncase according to the plausibility of the diagnosis based on manual review of the computerized information; and 2) review of questionnaires returned by PCPs and of medical records for a random sample (N = 200 for each outcome, including 36 diabetic macular edema [DMO]). The gold standard was the PCPs’ confirmation.

RESULTS

After profile review, we categorized 418 patients as probable/possible DR. In addition, 3,676 patients were categorized as probable/possible DMP (including 711 DMO). After review of the information sent by PCPs, confirmation rates were 87.3 and 87.2%, respectively (90.3% for DMO). When we applied these rates to the whole sample of computer-detected patients, the weighted confirmation rate was 78.0% for DR and 78.8% for DMP (86.2% for DMO).

CONCLUSIONS

Read codes for DR, DMP, and DMO are moderately accurate in identifying incident case subjects of these ophthalmologic complications. The validity improved when PCPs’ free-text comments were incorporated into the patient’s profile. THIN database proved to be a valuable resource to study ophthalmological diabetes complications.

Diabetic retinopathy (DR), including diabetic maculopathy (DMP), is a microvascular complication of diabetes and a leading cause of visual impairment and loss of working days in middle-aged adults (1). In Europe, it has been estimated that a quarter of diabetic patients have DR (2). The incidence of blindness among diabetic patients has been estimated to be over 20 per 100,000 person-years (3).

According to the National Institute for Health and Clinical Excellence (NICE) guidelines for the management of diabetes (4,5), diabetic patients should be screened annually for eye complications. In particular, in type 2 diabetes, screening should be implemented as soon as diabetes is diagnosed (6). Besides these general recommendations, personalized follow-up and treatment are recommended, and more frequent reviews may be warranted in patients with intermediate retinopathy or maculopathy (7–9).

The current study is part of a broader study that aimed at assessing the burden of retinopathy and maculopathy in a U.K. population of diabetic patients using The Health Improvement Network (THIN) database, a primary health care database. An important preliminary step in using automated health care databases for epidemiological research is to establish the accuracy of diagnoses recorded by the primary care practitioners (PCPs). In this context, we wanted to first assess the validity of DR and DMP records in the THIN database. The purpose of this article is to describe the case ascertainment and validation process used to evaluate the accuracy of recorded DR and DMP diagnoses, including diabetic macular edema (DMO).

Source of data

THIN is a longitudinal primary care medical records database of over 9 million patients in total, which currently covers around 6% of U.K. population (10). THIN database contains individual patient information recorded by PCPs as part of their routine clinical care such as demographic factors, PCP consultations, referrals, hospitalizations, laboratory test results, and prescriptions written by PCPs. Letters from specialist visits and hospital admissions (i.e., discharge letters) are also available. Diagnoses and test procedures are recorded using Read codes (11,12). Prescriptions written by PCPs are generated and recorded automatically in the database using a coded drug dictionary (Multilex).

Several validation studies have been conducted in THIN database reporting high confirmation rates of recorded diagnoses and completeness of data (13–16). This primary care database has already been used for studies of diabetes (17–19). All these data support the suitability of this source of information for epidemiological research.

The study research protocol was approved by the UK Research Ethics Committee (09/H0305/64).

Diabetic cohort ascertainment

The study period encompassed January 2000 through December 2007.

The source population was made up of all individuals aged 1–84 years between January 2000 and December 2007 who were enrolled at least 2 years with a PCP, with 1 year or more elapsed since the first recorded prescription and with some recorded health care contact in the previous 2 years. Only individuals with an enrollment status of “permanent” (currently enrolled with the PCP) or “died” were eligible. Start date was the date when an individual met all above eligibility criteria.

To select the study population of diabetic patients, all members from the source population were followed from start date until the earliest occurrence of first record of diabetes or antidiabetic treatment, death, or 31 December 2007. We then excluded all women with gestational diabetes (because of the specific idiosyncrasy of this diabetes), patients with type 1 diabetes without antidiabetic treatment (considered as possibly misclassified), and individuals with a first-ever diagnosis of diabetes within 30 days of death and no treatment ever recorded (considered as having incomplete information). Finally, we excluded all those with any diagnosis code for retinopathy or maculopathy recorded until the first diagnosis of diabetes resulting in a final study cohort of 121,834 diabetic patients (55% of them were men). Distribution of this cohort by sex, age, and diabetes type is shown in Table 1.
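The exclusion rules described above can be read as an ordered filter. The following is a minimal sketch only: the record structure and every field name are hypothetical and do not correspond to actual THIN variables, and the order in which the criteria are checked is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DiabetesRecord:
    # All field names are hypothetical; they are not THIN variable names.
    gestational_diabetes: bool
    diabetes_type: int                      # 1 or 2
    ever_treated: bool                      # any antidiabetic treatment recorded
    days_diagnosis_to_death: Optional[int]  # None if the patient did not die
    prior_retinopathy_or_maculopathy: bool  # code recorded before first diabetes record


def exclusion_reason(rec: DiabetesRecord) -> Optional[str]:
    """Return the first exclusion criterion met, or None if the patient is kept."""
    if rec.gestational_diabetes:
        return "gestational diabetes"
    if rec.diabetes_type == 1 and not rec.ever_treated:
        return "type 1 diabetes without treatment"
    if (rec.days_diagnosis_to_death is not None
            and rec.days_diagnosis_to_death <= 30
            and not rec.ever_treated):
        return "diagnosis within 30 days of death, never treated"
    if rec.prior_retinopathy_or_maculopathy:
        return "retinopathy or maculopathy before diabetes"
    return None


kept = DiabetesRecord(False, 2, True, None, False)
dropped = DiabetesRecord(False, 1, False, None, False)
print(exclusion_reason(kept), "|", exclusion_reason(dropped))
```

Returning the reason rather than a boolean makes it straightforward to tabulate how many patients each criterion removes.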

Table 1

Diabetes cohort distribution by sex, age, and diabetes type


DR and DMP case ascertainment

We followed our study population of diabetic patients from the date of first record of diabetes or antidiabetic treatment in the study period until first recording of DR or DMP including DMO. Two separate follow-ups were performed: 1) until the earliest occurrence of one of the following end points: DR, 85 years of age, death, or 31 December 2008; 2) until the earliest occurrence of one of the following end points: DMP, 85 years of age, death, or 31 December 2008.
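The censoring rule for each follow-up (earliest of outcome, 85 years of age, death, or 31 December 2008) can be sketched as below. This is an illustration under stated assumptions: the function names are hypothetical, a full date of birth is assumed to be available, and ties are resolved in favor of the outcome.

```python
from datetime import date
from typing import Optional, Tuple

STUDY_END = date(2008, 12, 31)


def eighty_fifth_birthday(birth: date) -> date:
    """Date on which the patient turns 85 (29 February births map to 28 February)."""
    try:
        return birth.replace(year=birth.year + 85)
    except ValueError:
        return birth.replace(year=birth.year + 85, day=28)


def follow_up_end(birth: date,
                  outcome_date: Optional[date],
                  death_date: Optional[date]) -> Tuple[date, str]:
    """Return the end of follow-up and the reason it ended."""
    candidates = []
    if outcome_date is not None:
        candidates.append((outcome_date, "outcome"))
    if death_date is not None:
        candidates.append((death_date, "death"))
    candidates.append((eighty_fifth_birthday(birth), "age 85"))
    candidates.append((STUDY_END, "study end"))
    # min() keeps the first of equal dates, so the outcome wins ties.
    return min(candidates, key=lambda c: c[0])


print(follow_up_end(date(1950, 5, 1), date(2005, 3, 10), None))
```

Running the function once with DR as the outcome and once with DMP as the outcome reproduces the two separate follow-ups.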

Case ascertainment

For DR, we identified patients with a recorded code suggesting an incident diagnosis of retinopathy related to diabetes (Table 2). Diagnoses related to age-related macular degeneration or to any other cause of retinopathy not related to diabetes were excluded.

Table 2

Read codes used to identify DR and DMP


NOS, not otherwise specified; O/E, on examination.

For DMP and DMO, we identified patients with a recorded code suggesting any incident diagnosis of maculopathy related to diabetes, including macular edema, exudative maculopathy, or any other nonspecific maculopathy code (Table 2).

Validation of DR and DMP

A two-step validation strategy was used for both outcomes: DR and DMP (including DMO).

First, we identified all subjects in THIN database that fulfilled the eligibility criteria. We identified 20,838 DR patients (49.5% had only background/nonproliferative DR codes within the first month, 2.3% had a proliferative DR code, and 48.2% had only unspecific DR codes) and 4,064 DMP patients, based on automated diagnosis codes. We then requested from THIN all anonymized free-text comments recorded by the PCPs between 1 month before and 2 months after the recorded diagnosis for a random sample of 500 (2.4%) of DR patients (automated random numbers were generated to obtain the sample) and for all DMP patients. In the free-text entries, PCPs can provide additional information derived from referral letters, diagnostic procedures, and test results such as visual acuity, intraocular pressure, and fundoscopy, among others. We performed a manual review of computer profiles of patients with DR and assigned the event as probable, possible, doubtful, and non-DR case. We defined probable DR case subjects as patients with an objective diagnosis of retinopathy related to diabetes describing the grade of retinopathy or site affected. Possible DR case subjects were all patients with a specific diagnosis but no mention of grade of retinopathy or site affected. Doubtful DR case subjects were patients with only screening appointments or conflicting entries between procedure results and recorded diagnosis. Non-DR case subjects were patients where the diagnosis was explicitly excluded or the date of first DR diagnosis occurred before the date of diabetes ascertained in the study period.
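The text states only that automated random numbers were used to draw the 500 DR patients. A minimal sketch of one reproducible way to draw such a sample is shown below; the seed value, the placeholder identifiers, and the function name are assumptions and not details from the study.

```python
import random


def draw_validation_sample(patient_ids, sample_size, seed):
    """Draw a simple random sample without replacement, reproducible via the seed."""
    rng = random.Random(seed)
    return rng.sample(list(patient_ids), sample_size)


# Placeholder identifiers standing in for the 20,838 computer-detected DR patients.
detected_ids = range(1, 20_838 + 1)
sample = draw_validation_sample(detected_ids, 500, seed=2008)

sampling_fraction = 100 * len(sample) / len(detected_ids)
print(f"{len(sample)} patients sampled ({sampling_fraction:.1f}% of those detected)")
```

Fixing the seed lets the same sample be regenerated later, which is useful when the second-step questionnaire sample has to be drawn from the first-step sample.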

Similarly, we reviewed the computerized profiles of patients with DMP and assigned the event as probable, possible, and non-DMP case. We defined probable DMP case subjects as patients with a diagnosis of maculopathy related to diabetes and recording of the type of maculopathy or site affected. When the diagnosis was not specifically mentioned in the Read code or confirmed in free text, we categorized them as possible DMP. Noncase subjects were patients with an unspecific code and/or confirmation of the absence of maculopathy. We defined the subgroup of probable DMO case subjects as patients with a Read code mentioning macular edema or an indication in the free-text comments of the site, the grade of edema, or retinal thickening.

Then, in a second step, a questionnaire was sent to the PCPs of 200 patients randomly sampled from the 500 DR patients whose computer profiles were manually reviewed and to the PCPs of a random sample of 200 of the 4,064 manually reviewed DMP patients. PCPs were requested to confirm the diagnosis of DR or DMP (and in particular DMO) and were also asked to send a copy of all records related to the event of interest, including referral letters, diagnostic procedures, and test results such as visual acuity, intraocular pressure, or fundoscopy results. The researchers did not contact the practices directly, but did so via the Additional Information Services (AIS), which is part of the THIN organization. Once contacted and having received the questionnaires, each PCP sent the supplementary information and the response to the questionnaire to AIS, and AIS ensured that all personal data were removed before forwarding the information to the researchers. Physician and patient confidentiality were preserved at all stages. After the review of all this supplementary information, a final case status of DR or DMP was assigned to the sampled DR and DMP patients. Additionally, the percentage of confirmed case subjects in each of the categories defined for DR and DMP (i.e., probable, possible, and noncase) was determined. The date of first diagnosis of DR or DMP was also ascertained based on all information available, including that provided by the PCPs in their responses to questionnaires and the copies of ophthalmology records provided.

The confirmation rate of DR and DMP assigned in our first step (review of computerized patient profiles including free-text comments) was computed using as numerator the number of final confirmed case subjects based on all information provided by the PCPs (gold standard) and as denominator the total number of patients considered case subjects after manual review of patient profiles.
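As a worked illustration of this definition, the sketch below computes the confirmation rate from the counts reported in the Results of this article (131 of 150 for DR and 150 of 172 for DMP). The function name is illustrative only.

```python
def confirmation_rate(confirmed: int, reviewed_as_case: int) -> float:
    """Percentage of first-step case subjects confirmed by the PCP (gold standard)."""
    if reviewed_as_case <= 0:
        raise ValueError("denominator must be positive")
    return 100.0 * confirmed / reviewed_as_case


# Numerator: final confirmed case subjects; denominator: probable or possible
# case subjects after manual profile review, among returned questionnaires.
dr_rate = confirmation_rate(confirmed=131, reviewed_as_case=150)
dmp_rate = confirmation_rate(confirmed=150, reviewed_as_case=172)

print(f"DR: {dr_rate:.1f}%  DMP: {dmp_rate:.1f}%")  # DR: 87.3%  DMP: 87.2%
```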

Validation of DR

Five hundred DR computerized patient profiles with free-text comments were manually reviewed by one of the authors. We categorized 331 (66.2%) as “probable DR” and 87 (17.4%) as “possible DR.” In addition, 2.8% (N = 14) were categorized as “doubtful case subjects” and 13.6% (N = 68) as “noncase subjects.” Figure 1 shows the results of the review process and resulting categorization of the sample of DR patients reviewed. In the second step, we received 176 valid questionnaires of the 200 requested (PCP response rate was 88%): a majority of them (71.5%) included copies of anonymized records related to the retinopathy. Among these, 150 had been classified as probable or possible DR case subjects in the first step and 131 of them were finally assigned a diagnosis of DR based on the questionnaires and additional information received (confirmation rate of 87.3%). Among the 131 confirmed DR, 73.3% were background/nonproliferative DR, 2.3% were proliferative DR, and 24.4% had no specification on DR grading. We also received questionnaires for 26 diabetic patients who were originally classified as doubtful or noncase subjects, and the diagnosis of DR was finally confirmed in 30.8%: 50% of doubtful and 25.0% of noncase subjects, respectively. Figure 1 shows the flowchart of all validation steps.

Figure 1

Validation chart of DR detected codes during the years 2000 to 2008 in THIN database.


The final estimated confirmation rate among all initially computer-detected patients was 78.0%, resulting from the confirmation rate weighted by the corresponding percentage in each category (87.3% confirmed DR weighted by 83.6% categorized as “probable DR” and “possible DR” plus 30.8% confirmed DR weighted by 16.4% categorized as “doubtful DR” or “noncases DR”).
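The weighting described above amounts to a stratum-weighted average. A minimal sketch using the rounded percentages reported in the text follows; the function name is illustrative, and recomputing from raw counts instead of the rounded percentages may differ in the last decimal.

```python
def weighted_confirmation_rate(strata):
    """strata: iterable of (share_of_sample, confirmation_rate), both as fractions."""
    strata = list(strata)
    total_share = sum(share for share, _ in strata)
    if abs(total_share - 1.0) > 1e-6:
        raise ValueError("stratum shares must sum to 1")
    return sum(share * rate for share, rate in strata)


# Rounded figures as reported in the text for DR.
dr_strata = [
    (0.836, 0.873),  # probable or possible DR: share of sample, confirmation rate
    (0.164, 0.308),  # doubtful or noncase DR: share of sample, confirmation rate
]

dr_weighted = weighted_confirmation_rate(dr_strata)
print(f"Weighted DR confirmation rate: {100 * dr_weighted:.1f}%")  # 78.0%
```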

The confirmation rates for patients with background/nonproliferative DR, proliferative DR, and unspecified retinopathy grade within 1 month were 92.0, 100, and 65.1%, respectively.

Validation of DMP

We reviewed all 4,064 computerized patient profiles with free-text comments of patients automatically detected with a code of DMP. Of these, 2,460 (60.5%) were categorized as “probable DMP,” 1,216 (29.9%) as “possible DMP,” and 388 (9.5%) as noncases. Figure 2 shows this categorization. Similar to the process of validation of DR, 200 patients were randomly selected in a second step, and questionnaires were sent to their PCPs. Of these, 176 valid questionnaires were returned (response rate of 88%), and 73.2% of them enclosed copies of anonymized records related to the maculopathy. Among the valid questionnaires returned, there were 172 who had been considered probable or possible case subjects of DMP after the manual review of patient profiles. In this subgroup, 150 patients were assigned a final diagnosis of DMP based on the questionnaires, resulting in a confirmation rate of 87.2% (91.9% among probable DMP and 78.7% among possible DMP, respectively). We also received questionnaires for four diabetic patients originally considered as noncase subjects, and all of them were confirmed as noncase subjects.

Figure 2

Validation chart of DMP (and edema type) detected codes during the years 2000 to 2008 in THIN database.


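The final estimated confirmation rate among all computer-detected patients was 78.8%, resulting from the confirmation rate weighted by the corresponding percentage in each category (87.2% confirmed DMP weighted by the 90.4% categorized as “probable DMP” or “possible DMP”; none of the noncase subjects with returned questionnaires was confirmed as DMP, so the remaining 9.5% contributes nothing to the estimate).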

We also validated DMO. Among the 4,064 maculopathies reviewed in the first step, 17.5% (N = 711) were considered “probable DMO.” In the second step, 36 patients were randomly sampled and questionnaires were sent to their PCPs. Of these, 31 valid questionnaires were returned (86%), and 81% of them enclosed copies of anonymized records related to the maculopathy. The PCPs confirmed 28 as incident DMO (90.3%). Figure 2 summarizes the flowchart of all validation steps. Computer codes whose description specifically included the term “oedema” had a confirmation rate of 86.2%. In addition, 33 patients recorded as having maculopathy, without any further specification of type in their code or in free-text comments, were finally confirmed as having macular edema in the second step, representing 54% of all finally confirmed DMO case subjects.
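The DMO percentages can be retraced from the reported counts, as in the sketch below. One step is an assumption rather than something the text states: the 54% is taken here to be 33 divided by the sum of the 28 confirmed probable DMO and the 33 confirmed among unspecified maculopathies, which reproduces the reported figure.

```python
questionnaires_sent = 36
questionnaires_returned = 31
confirmed_probable_dmo = 28     # confirmed among returned probable DMO questionnaires
confirmed_unspecified = 33      # unspecified maculopathies confirmed as macular edema

response_rate = 100 * questionnaires_returned / questionnaires_sent
dmo_confirmation = 100 * confirmed_probable_dmo / questionnaires_returned

# Assumption: denominator is all finally confirmed DMO from both groups (28 + 33).
all_confirmed_dmo = confirmed_probable_dmo + confirmed_unspecified
share_not_flagged = 100 * confirmed_unspecified / all_confirmed_dmo

print(f"Response rate: {response_rate:.1f}%")              # 86.1%
print(f"DMO confirmation rate: {dmo_confirmation:.1f}%")   # 90.3%
print(f"Confirmed DMO not flagged in step 1: {share_not_flagged:.1f}%")  # 54.1%
```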

This is the first study to validate DR and DMP diagnoses in a primary care database. The validation strategy was successful thanks to the high proportion of PCPs who returned completed validation questionnaires (i.e., 90%). Furthermore, most PCPs provided, along with the completed questionnaire, copies of original clinical records that usually included specialist notes after ophthalmologic assessment, which was instrumental for the validation of diagnoses.

The validation of a sample of automatically identified DR patients showed that the proposed set of computer codes was moderately accurate (confirmation rate of 78%) in predicting the actual diagnosis. However, the confirmation rate (87%) increased after the inclusion of free-text comments from the PCP in the computerized patients’ profiles.

Similar results were observed for the maculopathy diagnosis. The selected set of computer codes predicted the diagnosis with moderate accuracy (confirmation rate of 79%), and this increased to 87% when reviewing the patients’ profiles after incorporating free-text comments.

We showed that specific codes of DMO have a high confirmation rate (86%). However, a high proportion of true DMO case subjects would not be identified based on the computer codes alone or with the addition of free-text comments (54%). To ascertain “all” DMO case subjects, additional information from the PCPs needs to be obtained.

Prior studies conducted using claims data have also assessed recorded diagnosis of diabetic eye complications. One study reported good agreement between incidences of DR estimated using claims data versus population-based studies (20), and another one assessed the validity of the diagnoses of DMO and found high sensitivity (88%) and specificity (96%) (20,21). Nevertheless, our study is the first one to test the validity of registration of ophthalmological complications of diabetes in a primary care database. Similar to what has been observed in a previous study with THIN database, the validity of the studied outcome increased with the addition of free-text comments to computerized patient profiles (15).

Our study has some potential limitations. First, we relied on the existing coding system (i.e., Read) in THIN. Yet, the Read dictionary in some instances may not be specific enough to enable the recording of the outcomes of interest, such as the severity scale for retinopathy or the type of maculopathy. PCPs also may not always enter the most appropriate code. However, there were specific codes of DR, DMP, and DMO that PCPs could enter to record the specific grade of diagnosis without having to resort to more unspecific ones (shown in Table 2). Some diabetic patients also could consult optician practices without having to first go through their PCP, and consequently PCPs would not have recorded the relevant information systematically. However, this potential limitation should have had a limited impact in our study because ophthalmologic screening care is offered to all U.K. diabetic patients since it was shown to be more cost-effective to detect cases of sight threatening DR than incidental screening (7). All diabetic patients are invited for Diabetic Eye Complication screening, a national screening program. PCPs participating in THIN database also are encouraged to record results from screening programs as well as information resulting from ophthalmologist referrals, as part of Diabetic Quality Outcomes Framework. According to a recent study, around 3% of all patients screened for DR in the National Diabetic Retinopathy Screening Committee require referral to ophthalmology department because of positive results and 79% of them are referred for maculopathy (22). It is our opinion that this system guarantees that THIN database is a rich data source for studies on ophthalmological diseases in patients with diabetes.

For feasibility and practical reasons, we were not able to validate all potential DR case subjects (N = 20,838) identified with the initial computer search. However, it appears reasonable to extrapolate the results of our manual review of a random sample to the whole DR computer-detected population.

As shown by the current study, recording of Read codes for DR, DMP, and DMO is moderately accurate in identifying true case subjects of these conditions. The validity of these recorded diagnoses improved with the incorporation of free-text comments into the patient’s computerized profile when contrasted with additional information directly provided by the PCP. THIN database has proved to be a valuable resource to perform studies of ophthalmological diabetes complications in the general population when supplemented with information present in free-text notes.

This study was supported by an unrestricted research grant from Novartis Farmacéutica S.A.

E.M.-M. and L.A.G.-R. work in the Centro Español de Investigación Farmacoepidemiológica (CEIFE), Madrid, Spain. J.F. and E.R. are current employees of Novartis Farmacéutica S.A., Barcelona, Spain. No other potential conflicts of interest relevant to this article were reported.

E.M.-M. performed the validation and analysis, designed the study, and wrote the manuscript. J.F., E.R., and L.A.G.-R. conceived and designed the study and wrote the manuscript. All authors had full access to all of the data in the study. L.A.G.-R. is the guarantor of this work and, as such, had full access to all data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis.

The authors thank Ana Ruigómez of Centro Español de Investigación Farmacoepidemiológica of Madrid for reading earlier versions of the manuscript.

1. Bunce C, Wormald R. Causes of blind certifications in England and Wales: April 1999–March 2000. Eye (Lond) 2008;22:905–911
2. Williams R, Airey M, Baxter H, Forrester J, Kennedy-Martin T, Girach A. Epidemiology of diabetic retinopathy and macular oedema: a systematic review. Eye (Lond) 2004;18:963–983
3. Genz J, Scheer M, Trautner C, Zöllner I, Giani G, Icks A. Reduced incidence of blindness in relation to diabetes mellitus in southern Germany? Diabet Med 2010;27:1138–1143
4. National Institute for Health and Clinical Excellence. Type 1 diabetes: diagnosis and management of type 1 diabetes in children, young people and adults. Clinical Guideline 15. London: National Institute for Health, 2004. Available from http://guidance.nice.org.uk/CG15/NICEGuidance/pdf/English. Accessed 28 November 2011.
5. National Institute for Health and Clinical Excellence. Type 2 diabetes: the management of type 2 diabetes. Clinical Guideline 66. London: National Institute for Health, 2008. Available from http://www.nice.org.uk/nicemedia/pdf/CG66NICEGuideline.pdf. Accessed 4 December 2011.
6. Ockrim Z, Yorston D. Managing diabetic retinopathy. BMJ 2010;341:c5400
7. The Royal College of Ophthalmologists. Scientific Department. Guidelines for Diabetic Retinopathy. Screening for diabetic retinopathy. Pages 35–39. London: The Royal College of Ophthalmologists, 2005. Available from http://www.rcophth.ac.uk/page.asp?section=451&sectionTitle=Clinical+Guidelines. Accessed 18 October 2011.
8. Younis N, Broadbent DM, Harding SP, Vora JP. Incidence of sight-threatening retinopathy in Type 1 diabetes in a systematic screening programme. Diabet Med 2003;20:758–765
9. Facey K, Cummins E, MacPherson K, Morris A, Reay L, Slattery J. Organisation of services for diabetic retinopathy screening. In Health Technology Assessment Report 1. Glasgow, Health Technology Board for Scotland, 2002.
10. Data Statistics THIN. Cegedim Strategic Data Medical Research UK. Available from http://csdmruk.cegedim.com. Accessed 6 October 2011.
11. O’Neil M, Payne C, Read J. Read Codes Version 3: a user led terminology. Methods Inf Med 1995;34:187–192
12. Stuart-Buttle CD, Read JD, Sanderson HF, Sutton YM. A language of health in action: Read Codes, classifications and groupings. Proc AMIA Annu Fall Symp 1996:75–79
13. Bourke A, Dattani H, Robinson M. Feasibility study and methodology to create a quality-evaluated database of primary care data. Inform Prim Care 2004;12:171–177
14. Lewis JD, Schinnar R, Bilker WB, Wang X, Strom BL. Validation studies of the health improvement network (THIN) database for pharmacoepidemiology research. Pharmacoepidemiol Drug Saf 2007;16:393–401
15. Ruigómez A, Martín-Merino E, Rodríguez LA. Validation of ischemic cerebrovascular diagnoses in the health improvement network (THIN). Pharmacoepidemiol Drug Saf 2010;19:579–585
16. Margulis AV, García Rodríguez LA, Hernández-Díaz S. Positive predictive value of computerized medical records for uncomplicated and complicated upper gastrointestinal ulcer. Pharmacoepidemiol Drug Saf 2009;18:900–909
17. González EL, Johansson S, Wallander MA, Rodríguez LA. Trends in the prevalence and incidence of diabetes in the UK: 1996-2005. J Epidemiol Community Health 2009;63:332–336
18. Gonzalez-Perez A, Schlienger RG, Rodríguez LA. Acute pancreatitis in association with type 2 diabetes and antidiabetic drugs: a population-based cohort study. Diabetes Care 2010;33:2580–2585
19. Gunathilake W, Song S, Sridharan S, Fernando DJ, Idris I. Cardiovascular and metabolic risk profiles in young and old patients with type 2 diabetes. QJM 2010;103:881–884
20. Sloan FA, Brown DS, Carlisle ES, Ostermann J, Lee PP. Estimates of incidence rates with longitudinal claims data. Arch Ophthalmol 2003;121:1462–1468
21. Bearelly S, Mruthyunjaya P, Tzeng JP, et al. Identification of patients with diabetic macular edema from claims data: a validation study. Arch Ophthalmol 2008;126:986–989
22. Jyothi S, Elahi B, Srivastava A, Poole M, Nagi D, Sivaprasad S. Compliance with the quality standards of National Diabetic Retinopathy Screening Committee. Prim Care Diabetes 2009;3:67–72
Readers may use this article as long as the work is properly cited, the use is educational and not for profit, and the work is not altered. See http://creativecommons.org/licenses/by-nc-nd/3.0/ for details.