We developed and validated a self-assessment score for diabetes risk in Korean adults and compared it with other established screening models.
The Korea National Health and Nutrition Examination Survey (KNHANES) 2001 and 2005 data were used to develop a diabetes screening score. After excluding patients with known diabetes, 9,602 participants aged ≥20 years were selected. Undiagnosed diabetes was defined as a fasting plasma glucose ≥126 mg/dL and/or nonfasting plasma glucose ≥200 mg/dL. The SAS Survey Logistic Regression analysis was used to determine predictors of undiagnosed diabetes (n = 341). We validated our model and compared it with other existing methods using the KNHANES 2007–2008 data (n = 8,391).
Age, family history of diabetes, hypertension, waist circumference, smoking, and alcohol intake were independently associated with undiagnosed diabetes. We calculated a diabetes screening score (range 0–11), and a cut point of ≥5 defined 47% of adults as being at high risk for diabetes and yielded a sensitivity of 81%, specificity of 54%, positive predictive value of 6%, and positive likelihood ratio of 1.8 (area under the curve [AUC] = 0.73). Comparable results were obtained in validation datasets (sensitivity 80%, specificity 53%, and AUC = 0.73), showing better performance than other non-Asian models from the U.S. or European population.
This self-assessment score may be useful for identifying Korean adults at high risk for diabetes. Additional studies are needed to evaluate the utility and feasibility of this score in various settings.
Type 2 diabetes is one of the most common and rapidly increasing chronic metabolic disorders in the world. It causes serious complications and mortality, with a large burden to the public health care system as well as to patients. More than 189 million people had diabetes in 2003 worldwide (1), and this number is expected to rise more rapidly in the future as obesity increases, the population becomes older, and physical activity levels of most people decrease. The overall prevalence of diabetes in Korea was ~9.1% (2.6 million people) in 2005 (2) and increased to 9.7% in 2008, according to an analysis of the 2008 Korea National Health and Nutrition Examination Survey (KNHANES) (3). More strikingly, 32% (0.8 million) of patients with diabetes are undiagnosed; 4.9 million subjects in 2005 were estimated to have impaired fasting glucose, accounting for 17.4% of Korean adults (2,4).
Among middle-aged Korean adults, more than one-half (56%) were first diagnosed with diabetes in the survey (2), indicating that a significant number of individuals potentially may be at risk for undiagnosed diabetes. Considering the large proportion of patients with impaired fasting glucose or undiagnosed diabetes, early screening and detection in these patients is essential to avoid diabetes-related morbidity, reduce the cost of health care, and prevent the deterioration of the quality of life. Many risk score questionnaires and algorithms have been developed and validated in various countries and ethnic groups to identify patients at high risk for diabetes (5–11). However, most have been designed for whites, and there are only a few scoring systems for Asian populations (6,9,11). Risk scores derived from certain populations may not be applicable to other ethnic groups (12,13); therefore, there is a need to establish a diabetes risk score for the Korean population. Moreover, having their own score may make people more motivated to use the method. In addition, the majority of models consist of diverse variables, including laboratory profiles and BMI, which require additional blood assays or mathematical calculations (6–10).
The aim of our study was to develop and validate a self-assessment score for diabetes risk in Korean adults using simple clinical parameters, including anthropometric and lifestyle risk factors, to provide a reliable and easy tool for the layperson without the need for a clinician’s input. We also compared the new algorithm with other existing screening models derived from different ethnic populations.
RESEARCH DESIGN AND METHODS
Data source and subjects
The KNHANES is a nationwide, population-based, cross-sectional health examination and survey regularly conducted by the Division of Chronic Disease Surveillance, Korea Centers for Disease Control and Prevention, Ministry of Health and Welfare, to monitor the general health and nutrition status of South Koreans (14,15). To date, the KNHANES has been performed in the years 1998 (KNHANES I), 2001 (KNHANES II), 2005 (KNHANES III), and 2007–2009 (KNHANES IV). The KNHANES consists of four different surveys: a health interview survey, a health behavior survey, a health examination survey, and a nutrition survey. Similar to NHANES (for the U.S. population), each KNHANES consists of independent sets of individuals from the South Korean population. All individuals were randomly selected from 600 randomly assigned districts of cities and provinces in South Korea. Therefore, it is highly unlikely that the same person would be selected repeatedly in consecutive surveys. Details of the surveys in the KNHANES have been previously described (16).
Data from KNHANES 2001 and 2005 were used to develop a risk score model for diabetes in Korea. In the KNHANES 2001 (which included 8,064 subjects aged ≥10 years), 6,601 individuals aged ≥20 years were selected as subjects for the current study, and 5,501 individuals aged ≥20 years were selected from the KNHANES 2005 (which included 7,597 subjects aged ≥10 years). Of 12,102 people who participated in the 2001 and 2005 surveys, subjects with missing data in key covariates were excluded (family history of diabetes [n = 35], smoking [n = 661], and alcohol consumption [n = 699]; fasting plasma glucose [FPG; n = 346]; and waist circumference [n = 69]). Among the combined sample (n = 10,202), 600 subjects had a previous diagnosis of diabetes by a health care professional and were excluded from the model development. As a result, 9,602 subjects were finally analyzed by logistic regression analyses for prediction modeling.
Data from KNHANES 2007–2008 were used for independent validation of the established model. Subjects aged ≥20 years were selected for the validation study; this included 8,391 of 9,792 individuals from KNHANES 2007–2008 after excluding those who were classified with “known diabetes” (n = 747) or who had unreported clinical variables (n = 654). Among them, 218 subjects were first diagnosed with diabetes in this survey and were classified as having “undiagnosed diabetes.”
Participant data and measurements
We used participant demographics and personal and family medical history data, including information on diabetes, social habits such as smoking and alcohol consumption, physical activity, and anthropometrics. Measurement of waist circumference was conducted by well-trained examiners using a nonstretchable standard tape after normal expiration with the subject standing and was obtained at the minimal point between the lowest rib and iliac crest, usually at the level of the navel. Laboratory parameters, including FPG, total cholesterol, triglycerides, and HDL cholesterol were measured after overnight fasting. Subjects who were identified in the health interview survey with a previous diagnosis of diabetes by a health care professional or who were taking insulin or oral antidiabetes agents were defined as having “known diabetes.” Subjects who were first diagnosed with diabetes by the survey were classified as having “undiagnosed diabetes.” The diagnostic criteria for diabetes were obtained from the 2011 revision of the American Diabetes Association (ADA) guidelines (17). Diabetes was diagnosed in subjects with FPG ≥126 mg/dL or nonfasting glucose ≥200 mg/dL; impaired fasting glucose was defined as an FPG of 100–125 mg/dL.
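The glucose thresholds above can be written as a small classification rule. The following is a minimal sketch, not the survey's actual procedure: the function name and the "normal" label for FPG below 100 mg/dL are assumptions introduced here for illustration, and the text gives no nonfasting threshold other than the one for diabetes.

```python
def classify_glycemia(glucose_mg_dl: float, fasting: bool = True) -> str:
    """Classify a single plasma glucose value using the thresholds in the text.

    Hypothetical helper: the label "normal" for FPG < 100 mg/dL is an
    assumption; the text only names diabetes and impaired fasting glucose.
    """
    if fasting:
        if glucose_mg_dl >= 126:
            return "diabetes"
        if glucose_mg_dl >= 100:
            return "impaired fasting glucose"
        return "normal"
    # Nonfasting samples: only the diabetes threshold (>= 200 mg/dL) is stated.
    return "diabetes" if glucose_mg_dl >= 200 else "not diabetic by this criterion"
```

Under this sketch, a subject with no prior diagnosis whose value is classified as "diabetes" would correspond to the "undiagnosed diabetes" group.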
Family history of diabetes was confined to individuals whose first-degree relatives, such as father, mother, or siblings, had diabetes. Subjects were divided into four physical activity classes: sedentary (nearly bedridden or unable to stand and walk), light (office workers, tech workers, and housewives who do less housework), moderate (housewives who do much housework, salesmen, teachers, workers at a manufacturer, or similar types of occupations), and vigorous (people engaged in agriculture, the fishing industry, civil engineering, the building industry, or similar types of occupations). Individuals who were more than moderately active were considered “active.” Individuals were classified into smoking categories of never smoked, ex-smoker, and current smoker by self-report. The questionnaire for alcohol consumption consists of two categories: assessment of frequency of drinking and amount of alcoholic beverages that the subjects consumed on average. The average daily number of drinks then was calculated regardless of the kind of alcoholic beverages, including beer, whiskey, or Soju, a traditional Korean liquor. One serving of these beverages contains ~8–9 g alcohol, although each drink has a different volume (250 mL for beer, 24 mL for whiskey, and 40 mL for Soju). Alcohol consumption was stratified into three groups according to the daily amount of drinking: none or <1, 1–4.9, and ≥5 drinks daily. Subjects were classified as hypertensive if they had physician-diagnosed hypertension, had blood pressure ≥140/90 mmHg, or were taking antihypertensive medication.
Statistical analyses
Participant characteristics in different diabetes statuses are summarized by descriptive statistics. Continuous variables are expressed as means ± SE, and categorical variables are presented as percentages. For model development, we applied multiple logistic regression analyses with undiagnosed diabetes as the end point. Based on the development dataset (KNHANES 2001 and 2005), we included a comprehensive list of variables considered to be potentially associated with undiagnosed diabetes in an initial risk score model for diabetes. Because the number of predictors was large, we started with continuous variables and later categorized them in the final model. Backward elimination (deleting the covariate with the largest P value, one at a time) was performed from the initial model until we reached a final model with statistically significant covariates. We were guided by the statistical significance of the model building but also considered scientific and qualitative judgments to establish the risk score model by excluding less appropriate variables in a risk assessment questionnaire despite their statistical significance.
We double-checked the final model to ensure that no important covariates were omitted in this sequential process. We intentionally used only categorized variables that captured easy but relevant and validated health information in the prediction model to develop a user-friendly and educational screening score. We created a weighted scoring system by rounding the odds ratios (ORs) in the final model down to an integer. For example, OR 1.52 was rounded to 1 and OR 3.19 was rounded to 3.
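The weighting rule can be illustrated with a short sketch. The helper names below are hypothetical, and only the two ORs quoted in the text are used; the full set of ORs is reported in Table 2 and is not reproduced here.

```python
import math
from typing import Iterable


def or_to_points(odds_ratio: float) -> int:
    """Round an odds ratio down to an integer number of points."""
    return math.floor(odds_ratio)


def total_score(points: Iterable[int]) -> int:
    """Sum the points of all risk factors that apply to one person."""
    return sum(points)
```

With the two examples from the text, `or_to_points(1.52)` gives 1 and `or_to_points(3.19)` gives 3; a person carrying both of those risk factors would score `total_score([1, 3])`, that is, 4.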
In the development dataset, we compared our new risk score with established screening models and other assessment algorithms for undiagnosed diabetes: the ADA diabetes risk questionnaire II (18), a U.S. screening score based on the U.S. population (5); the Qingdao diabetes risk score from China (11); the Thai risk score (9); and the Rotterdam model, derived from a European sample (19). We selected these models to evaluate the generalizability and transferability of validated Asian or non-Asian models for the Korean population. We computed standard validation measures, including the proportion of high-risk individuals, sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), likelihood ratios for a positive test result (sensitivity/[1 − specificity]) and for a negative test result ([1 − sensitivity]/specificity), Youden index (= sensitivity + specificity − 1), and the area under the receiver operating characteristic curve (AUC) as a discrimination statistic (5,20,21).
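The validation measures listed above all follow from a 2 × 2 table of screening result against diabetes status. The sketch below computes them from the four cell counts using the formulas given in the text; the class and function names are assumptions made for illustration, and the AUC is omitted because it requires the full score distribution rather than a single cut point.

```python
from dataclasses import dataclass


@dataclass
class ScreeningCounts:
    tp: int  # high-risk score, undiagnosed diabetes present
    fp: int  # high-risk score, no diabetes
    fn: int  # low-risk score, undiagnosed diabetes present
    tn: int  # low-risk score, no diabetes


def validation_measures(c: ScreeningCounts) -> dict:
    """Return the standard screening measures for one cut point.

    Assumes every margin is nonzero and specificity is not exactly 0 or 1,
    so that none of the ratios divides by zero.
    """
    n = c.tp + c.fp + c.fn + c.tn
    sensitivity = c.tp / (c.tp + c.fn)
    specificity = c.tn / (c.tn + c.fp)
    return {
        "high_risk_proportion": (c.tp + c.fp) / n,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": c.tp / (c.tp + c.fp),
        "npv": c.tn / (c.tn + c.fn),
        "lr_positive": sensitivity / (1 - specificity),
        "lr_negative": (1 - sensitivity) / specificity,
        "youden": sensitivity + specificity - 1,
    }
```

For example, `validation_measures(ScreeningCounts(tp=80, fp=460, fn=20, tn=540))` uses purely hypothetical counts (not study data) and yields a sensitivity of 0.80 and a specificity of 0.54.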
To validate the self-assessment diabetes risk score using independent datasets, we evaluated our scoring system using the KNHANES 2007–2008, and the aforementioned evaluation measures also were calculated. Because KNHANES 2007–2008 did not collect the data on family history of diabetes, the data were imputed using a statistical technique (PROC MI in SAS; SAS Institute, Cary, NC). We repeated the analysis without and with imputation in the validation samples. Statistical analyses were conducted using SAS version 9.2. We used MedCalc version 11.1 for the receiver-operating characteristic analysis (MedCalc Software, Mariakerke, Belgium). For estimation and inference, two-sided hypotheses/tests were used, and a P value <0.05 was considered statistically significant.
RESULTS
Characteristics of subjects in the development dataset (KNHANES 2001 and 2005)
The development dataset comprised 10,202 participants in the KNHANES 2001 and 2005. The baseline clinical and biochemical characteristics of the participants are shown in Table 1, according to diabetes status. The crude prevalence of undiagnosed diabetes based on FPG or nonfasting glucose levels was 3.3% in this study population of adults aged ≥20 years. Participants with diagnosed or undiagnosed diabetes tended to be older, be hypertensive, and have a family history of diabetes compared with those without diabetes. These subjects also tended to have a higher BMI and waist circumference but decreased levels of HDL cholesterol.
Development of the self-assessment score for diabetes risk
After excluding subjects with known diabetes (n = 600) from the development dataset, multiple logistic regression analyses were performed to establish a diabetes risk score for the Korean population. Table 2 describes the final regression model derived from the KNHANES 2001 and 2005 development dataset. Age, family history of diabetes, personal history of hypertension, waist circumference, smoking status, and alcohol consumption were significant predictors of undiagnosed diabetes. Multiple categories (with scores of 0–3) were applied for variables including age, waist circumference, and alcohol consumption to capture the risk gradient, whereas other risk factors were binary (with a score of 0 or 1 assigned). Age range was divided into three levels (<35, 35–44, and ≥45 years), according to the logistic regression results to simplify the risk model. We stratified the subjects into three groups for waist circumference using the 50th and 75th percentile values of waist circumference in the study population to consider the potential impact of central obesity on the risk of diabetes. A logistic regression model was fitted, including both waist circumference and BMI together. BMI range was divided into three groups by the cutoff values of overweight (≥23 kg/m²) and obesity (≥25 kg/m²) based on the definition of obesity in the Asia-Pacific region. Contrary to waist circumference, BMI was not significantly associated with diabetes risk (β-coefficients: −0.105 for BMI ≥23 kg/m² and −0.066 for BMI ≥25 kg/m², both P > 0.5). Therefore, we used waist circumference in the final model. The risk score was assigned according to the OR for each risk factor in the final regression model. The maximum total score for this risk model was 11. The six risk factors jointly yielded an AUC of 0.730 in the development sample (Table 2).
Independent validation of the self-assessment score for diabetes risk
We investigated the diagnostic characteristics of different total score cut points in the KNHANES development and validation datasets. A cut point of ≥5 was selected because it results in the highest value for the Youden index to indicate an individual at high risk for undiagnosed diabetes. In the development dataset, the present model/cut point designated ~47% of participants at high risk for undiagnosed diabetes and yielded a sensitivity of 81%, specificity of 54%, PPV of 6%, and NPV of 99%, with an AUC of 0.73 (Table 3). We also assessed the performance characteristics of the established models and our new screening score. Our screening score (cutoff of ≥5 points) resulted in higher overall test accuracy (reflected in the Youden index) and a larger AUC compared with those of the other models. All models had high NPVs (≥97%). Among existing methods, the Thai risk score seemed to perform the best (Youden index = 31). Of note, the Western models tended to perform worse than the Asian models in the Korean population.
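Selecting the cut point with the highest Youden index amounts to a simple search over candidate thresholds. The sketch below is one way to do this under stated assumptions: the function name is hypothetical, the rule is "high risk if score ≥ cut", candidates span the 0–11 score range, ties go to the lowest cut point, and the data must contain at least one subject with and one without diabetes.

```python
from typing import Iterable, Sequence, Tuple


def best_cut_point(
    scores: Sequence[int],
    has_diabetes: Sequence[bool],
    candidates: Iterable[int] = range(0, 12),
) -> Tuple[int, float]:
    """Return (cut, youden) maximizing sensitivity + specificity - 1."""
    positives = sum(1 for d in has_diabetes if d)
    negatives = len(has_diabetes) - positives
    best = None
    for cut in candidates:
        # True positives: diabetic subjects flagged as high risk.
        tp = sum(1 for s, d in zip(scores, has_diabetes) if d and s >= cut)
        # True negatives: nondiabetic subjects below the cut point.
        tn = sum(1 for s, d in zip(scores, has_diabetes) if not d and s < cut)
        youden = tp / positives + tn / negatives - 1
        if best is None or youden > best[1]:
            best = (cut, youden)
    return best


# Toy data for illustration only (not taken from the KNHANES).
toy_scores = [1, 2, 3, 4, 6, 7, 5, 8, 9, 3]
toy_status = [False, False, False, False, False, False, True, True, True, False]
toy_cut, toy_youden = best_cut_point(toy_scores, toy_status)
```

In practice the Youden index weights sensitivity and specificity equally, so a screening program that prefers fewer missed cases might deliberately choose a lower cut point than the one this search returns.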
Performance of the new and existing diabetes screening methods in the development and validation datasets

Consistent results were observed when we applied this score to the KNHANES 2007–2008 validation dataset. Approximately 46% of the subjects were at high risk, with a sensitivity of 79%, specificity of 55%, PPV of 4%, NPV of 99%, and AUC of 0.73. After imputing the missing data of family history of diabetes, this cutoff defined ~48% of the adult population as being at high risk for undiagnosed diabetes and yielded a sensitivity of 80%, specificity of 53%, PPV of 4%, and NPV of 99%, with an AUC of 0.73. Based on these findings, if we assume that 1,000 new participants will be examined by the risk model and use the cutoff point of five, then 483 subjects (48.3%) would undergo diagnostic testing, 20 new cases of diabetes would be identified, and 4 to 5 subjects with diabetes would remain untested and undetected (22). If the lower cut point of 4 is applied, then ~596 people (59.6%) would undergo diagnostic testing, and we may expect 22–23 cases of diabetes to be newly diagnosed and <3 cases to go untested and unidentified. A critical issue is that PPV directly depends on the prevalence of the specific disease or condition in a population (23), which explains why our screening model, as with other risk scores, has lower PPV (3–8%) for these outcomes.
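The dependence of PPV on prevalence follows from Bayes' rule, and the per-1,000 projection is the same arithmetic scaled up. The sketch below is illustrative only: the function names are assumptions, and because it starts from the rounded sensitivity and specificity quoted in the text, its results come out close to, but not identical with, the reported figures.

```python
def ppv(sensitivity: float, specificity: float, prevalence: float) -> float:
    """Positive predictive value from sensitivity, specificity, and prevalence."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)


def expected_per_n(sensitivity: float, specificity: float,
                   prevalence: float, n: int = 1000) -> dict:
    """Expected numbers flagged, detected, and missed among n screened people."""
    cases = prevalence * n
    detected = sensitivity * cases
    flagged = detected + (1 - specificity) * (n - cases)
    return {"flagged": flagged, "detected": detected, "missed": cases - detected}


# Development data: 341 undiagnosed cases among 9,602 subjects.
dev_ppv = ppv(0.81, 0.54, 341 / 9602)
# Validation data (with imputation): 218 undiagnosed cases among 8,391 subjects.
val_ppv = ppv(0.80, 0.53, 218 / 8391)
val_projection = expected_per_n(0.80, 0.53, 218 / 8391)
```

Rounded to whole percentages, `dev_ppv` and `val_ppv` agree with the PPVs of 6% and 4% reported above. Holding sensitivity and specificity fixed while raising the prevalence argument shows how the same score would yield a higher PPV in a population where undiagnosed diabetes is more common.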
Figure 1 shows a sample of a self-assessment questionnaire that may be used by laypersons as well as health care providers to screen for undiagnosed diabetes or assess the risk. Figure 2 presents the prevalence of undiagnosed diabetes for individual total scores in the KNHANES. The prevalence of undiagnosed diabetes increased dramatically as the risk scores increased gradually to ≥5, indicating a nonlinear relationship. The average prevalence of undiagnosed diabetes was 2, 6, 12, and 19% in individuals with total risk scores of ≤4, 5–7, 8–9, and ≥10, respectively.
Self-assessment screening questionnaire for undiagnosed participants, recommended for use by health care providers and laypersons. y, years.
Estimated prevalence of undiagnosed diabetes according to the risk score. KNHANES 2001 and 2005 comprised 9,602 subjects. The prevalence of undiagnosed diabetes among subjects with scores of 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, and ≥10 was 0.5, 0.7, 0.9, 1.5, 2.3, 4, 4.7, 6.7, 11.1, 12.2, and 19.1% in KNHANES, respectively. The average prevalence of undiagnosed diabetes is 2% in individuals with a risk score ≤4, 6% in those with a risk score of 5–7, 12% in those with a risk score of 8–9, and 19% in those with a total score ≥10.
CONCLUSIONS
We developed and validated a simple and practical tool to identify high-risk subjects for diabetes in a Korean population. The model included age, family history of diabetes, hypertension, waist circumference, smoking status, and alcohol intake as significant variables. We intended to establish a simple risk score model based on clinical and anthropometric information without using laboratory tests or potentially difficult calculations (e.g., BMI), unless these variables are strongly indicated. Components in this model are easily comprehensible and underscore the importance of the modifiable risk factors related to an individual’s habits. Therefore, this risk score model would be easy and convenient for a layperson to perform a self-assessment of diabetes risk in real life. Although there is a guideline for defining high-risk individuals who need to have additional blood assays, including FPG and oral glucose tolerance tests (24), applying this simple diabetes screening score to the general population may serve as a first step to identifying high-risk subjects, who can be referred to additional blood assay and laboratory testing to reveal undiagnosed diabetes.
A recent study in Korea showed that >30% of people with diabetes were unaware of their illness in 2005 (2), implying that a significant number of people are still left untreated. Consistent with this finding, 34% of subjects with diabetes in a U.S. community reportedly had unrecognized diabetes (25). Even subjects with prediabetes or undiagnosed diabetes have been reported to have increased mortality and risk for cardiovascular diseases (26,27). Therefore, if this simple risk score could effectively and efficiently screen high-risk individuals in the Korean population and help promote public health care with early lifestyle intervention, the burden of diabetes and its complications could possibly be reduced in Korea.
To date, most risk assessment scores for diabetes have been derived from white populations (5,7,28–30), and few risk score models are based on Asian ethnic groups (9–11). Diabetes risk assessment models developed in white populations tend to poorly predict high-risk subjects for diabetes in Asian populations (13), because each ethnic group has different and distinctive genetic and environmental characteristics, such as body shape, food and drink, culture, and other lifestyle factors. Therefore, we believe it is ideal to have different models for different populations to screen high-risk individuals for diabetes.
Our risk assessment model has several distinguishing features. First, it is simple and easy to use. It does not require any blood assays or mathematical calculations to derive a diabetes risk score. Thus, laypersons can easily use this model and calculate their own risk score for diabetes without any help from medical caregivers. Because every Korean has national health insurance, access to medical care may not be a major problem. However, type 2 diabetes continues to be undiagnosed as a result of a lack of specific symptoms and limited interest in the public health care sector. More education and health guidance seem to be necessary. Our screening score may be used for an educational purpose in clinical and community settings. Moreover, patients may initiate the discussion about diabetes with health care providers after self-assessment, which may be an example of patient-centered care that empowers patients and assists health care providers (http://www.ahrq.gov/qual/ptcareria.htm). Second, this model includes modifiable risk factors, such as waist circumference, alcohol consumption, and smoking. We tried to emphasize the importance of lifestyle intervention (31) regarding these modifiable risk factors. If subjects who are at high risk for diabetes are aware of these risk factors, such as central obesity, smoking, and heavy drinking, they could possibly reduce their risk by lifestyle modification. Third, age and waist circumference cutoff points proposed by our risk-score model are consistent with those suggested by the current consensus statements regarding type 2 diabetes and metabolic syndrome. For example, the ADA recommends universal screening for individuals at age ≥45 years (24), and the International Diabetes Federation defines central obesity of the South Asian and Chinese population as waist circumference ≥90 cm in men (32). Subjects who met both criteria were assigned a score of 6, indicating high risk for diabetes based on our model.
Waist circumference was selected in the present score instead of BMI because waist circumference was superior to BMI in terms of predicting undiagnosed diabetes in our model/datasets. Waist circumference belongs to the diagnostic criteria for metabolic syndrome and generally is accepted as a surrogate index for visceral obesity, which is a key contributor to the development of cardiovascular disease and disorders such as diabetes (33). Considering its substantial role in the development of diabetes, waist circumference might be a better indicator to reflect not only overall obesity but also visceral obesity, particularly for Asian populations. Usually, both waist circumference and BMI, among other anthropometric variables, are considered important risk factors, and some existing models have included both (5,7,9,34). In contrast, researchers from China proposed a simple risk score using waist circumference but not BMI (11). Others have used only BMI in their models (19,29), whereas several risk scores included waist circumference and height but omitted BMI (10,28,30).
Unlike other risk models, our risk assessment tool considers the potential association of alcohol consumption and smoking habits with undiagnosed diabetes. Controversial evidence indicates that moderate alcohol consumption is inversely associated with diabetes risk, whereas the incidence of diabetes increases in more frequent drinkers (35). Smoking is considered an established risk factor for diabetes (36), consistent with our findings that current smoking was associated with an increased prevalence of undiagnosed diabetes. A model from Germany suggests that moderate drinking is a protective factor and treats both former and current smoking as strong risk factors for the development of diabetes (28). Likewise, Kahn et al. (30) proposed that smoking and nonuse of alcohol were significantly associated with the risk of diabetes, consistent with a report from an Australian group (37). However, use of cigarettes or alcohol was excluded in the Danish model because of statistical insignificance (29). According to the results of our model, the association between diabetes risk and alcohol consumption could be dose dependent. These conflicting findings might be a result of differences in the genetic and ethnic backgrounds of the study populations. Yet, we cannot exclude the possibility that frequent drinkers might tend to be careless with their health.
The current study provides a self-assessment score for diabetes based on the Korean population for the first time. Our risk score consists of only six easily answerable questions, and it would take minimal time to finish and calculate the total score without assistance from health care providers. Among several published risk models, some excluded information from invasive blood tests and only used clinical factors (5,7,9,11,19,28,29,34,38,39). However, most require BMI values (5,7,9,19,29,34,39) or additional complicated calculations to assess risk scores (28,38). Of importance, validation tests confirmed that our risk score performed well in the prediction of diabetes in the independent sample, but additional external validation would be warranted.
The current study has some limitations, which could be addressed by additional investigations. Because this risk score was derived from a national cross-sectional study, the model might be unable to precisely predict the risk of future development of diabetes. Although cross-sectional studies are well suited for prevalent but undiagnosed disease, additional verification or development of a new model for incident diabetes based on prospective studies could be needed for the Korean population. Furthermore, we did not include potentially important risk factors, such as history of gestational diabetes or diet/nutrition (e.g., consumption of fruits, vegetables, and sodas) because of the lack of data. Our study considered only unidentified individuals with diabetes based on a high FPG as an outcome. Because of a lack of oral glucose tolerance data, the prevalence of diabetes might be underestimated in our study population (40).
The present results clearly showed that existing risk models derived from whites or other Asians did not perform well in the Korean population, which may justify the use of the Korean score for Koreans. In addition, our simple self-assessment score for diabetes risk can be applied not only by primary practitioners but also by laypersons to identify high-risk subjects who need additional evaluation and management. This risk model is an alternative approach that can easily be used in communities and clinical settings to (pre)screen individuals at high risk for diabetes. Future research is warranted to verify the usefulness and feasibility of our model and identify ways to improve the accuracy of this score in various practical settings.
Acknowledgments
This study was supported by a grant from the Korea Health 21 R&D Project, Ministry of Health & Welfare, Republic of Korea (A102065-1011-1070100).
No potential conflicts of interest relevant to this article were reported.
Y.-h.L. and D.J.K. researched data, wrote the manuscript, contributed to the discussion, and reviewed and edited the manuscript. H.B. researched data and wrote, reviewed, and edited the manuscript. H.C.K. and H.M.K. researched data and reviewed and edited the manuscript. S.W.P. researched data, contributed to the discussion, and reviewed and edited the manuscript. D.J.K. is the guarantor of this study and, as such, had full access to all the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis.