Introduction

Mental health problems are among the most common causes of the global burden of disease among adolescents and young adults aged 12–24 years (‘young people’ hereafter) (Patel et al., 2007). Most mental health problems begin during youth, with approximately 75% of diagnosed mental disorders among adults having an onset before the age of 24 years (Ruiz and Primm, 2010). Poor mental health in young people is associated with poor educational achievement, substance abuse, and violence (Patel et al., 2007). The burden of the most common mental health problems, such as depressive and anxiety disorders, peaks between the ages of 10 and 29 years (Whiteford et al., 2013). Given this early onset, it is crucial to start assessing and monitoring mental health problems while individuals are young.

The Kessler Psychological Distress Scale is a screening tool to assess the level of distress associated with non-specific psychological symptoms at the population level (Kessler et al., 2002). Two versions of the scale – K6, a 6-item scale embedded in a 10-item scale (K10) – were developed from a pool of 612 items from the then existing scales of psychological distress (Kessler et al., 2002). Because of the strong evidence on the predictive and screening power, and high factorial and construct validity (Kessler et al., 2003, Kessler et al., 2002), the K10 and K6 scales have become increasingly popular as a screening tool to assess nonspecific psychological symptoms (Andersen et al., 2011, Kessler et al., 2002) and have been widely used in large epidemiological surveys in many countries including Australia, Canada, and the United States (Sunderland et al., 2011, Kessler et al., 2010).

The K6 has demonstrated similar sensitivity to its longer version K10 in differentiating between cases and non-cases of serious mental disorders (Arnaud et al., 2010). The K6 is also comparable to the K10 in terms of screening DSM-IV mental disorders (Furukawa et al., 2003), and has very good concordance with the blinded clinical ratings of serious mental illness (Kessler et al., 2010). Given its popularity and brevity, the K6 has been translated into 14 languages and also included in the World Mental Health Surveys (Kessler et al., 2002; Kessler et al., 2010; Mewton et al., 2016).

In the adult population, the psychometric properties of the K6 are well established and robust; however, there is limited evidence of the reliability and validity of the scale among young people. Only a handful of studies in high-income countries have demonstrated the usefulness of the K6 in epidemiological surveys on youth mental health (Chan and Fung, 2014; Mewton et al., 2016). Moreover, there is a paucity of research on the psychometrics of the scale with young people in low-and-middle-income countries (LMICs), where the majority of people have native language(s) other than English. There are about 261.8 million Bangla-speaking people around the world (Ethnologue 2018), with limited tools available in Bangla to assess their psychological distress. A recent validation study of the Bangla version of the K10 in Bangladesh reported that 10 items of the scale were not appropriate for measuring psychological distress in adults (Uddin et al., 2018). However, the psychometrics of the Bangla version of the K6, which performs as well as the K10 with less burden on respondents (Arnaud et al., 2010; Furukawa et al., 2003), are yet to be determined. Hence, there is a need for a validated and reproducible Bangla version of the K6 to accurately identify people at risk of mental health problems whose native language is Bangla and who live in resource-poor settings with limited or no access to mental health services. This is particularly important for young people, among whom comorbid psychological disorders are common and whose mental health needs are unmet (Patel et al., 2007). The aim of the present study was to evaluate the psychometric properties (e.g., internal consistency, factor structure, test-retest reliability, and predictive validity) of the Bangla version of the K6 scale (hereafter referred to as the Bangla K6) in Bangla-speaking young people of Bangladesh.

Methods
Participants

A self-administered questionnaire survey was conducted among students aged 13–24 years from two high schools and two universities in Dhaka, the capital of Bangladesh. These institutions were purposively chosen based on geographical convenience and connection with the researchers in order to maximize participation. Each of the participating schools and universities approved the survey prior to its administration and nominated their representatives (e.g., lecturers/teachers) to facilitate implementation of the study. For high school students, individual written informed consent was obtained from students and one parent of each student prior to administering the survey. The university students completed the informed consent form prior to completing the survey. Students completed the survey in classroom settings in the presence of one of the research team members and representative lecturers/teachers. The survey was repeated in a week. The survey was available in Bangla, the native language of the study participants. It took approximately 10–15 minutes to complete the survey. Data for this study were collected between August 2017 and April 2018.

Measures

Psychological distress of the study participants was assessed using the Bangla K6 scale, which asked participants to rate how often during the past four weeks they felt: (1) nervous; (2) hopeless; (3) restless or fidgety; (4) so depressed that nothing could cheer them up; (5) that everything was an effort; and (6) worthless. A five-point rating scale was used for the response options: 0 (none of the time); 1 (a little of the time); 2 (some of the time); 3 (most of the time); and 4 (all of the time). The Bangla K6 scale used in the present study was translated by New South Wales (NSW) Health, Australia (Transcultural Mental Health Centre 2012). In the translation process, NSW Health made efforts to produce a translated scale that is culturally competent and linguistically appropriate for the target Bangla-speaking population. The Bangla K6 was developed following a standardized translation model developed by the Epidemiology and Surveillance Branch of the NSW Department of Health in collaboration with the NSW Multicultural Health Communication Centre. The model included three major steps: (i) pre-translation preparation, (ii) translation, and (iii) verbal back-translation. Translators accredited by the Australian National Association for the Accreditation of Translators and Interpreters (NAATI) conducted the translations. Verbal back-translation was performed by bilingual interviewers and/or interpreters to enhance the quality of the translation, which ensured that the translation was scientifically sound but neither too formal nor too colloquial. A pilot survey was conducted to detect any obvious language difficulties with the translated version (Williamson et al., 2000). In addition to the Bangla K6, the students in the current study completed the 10-item Center for Epidemiologic Studies Short Depression (CES-D-10) scale twice, with a week’s interval between the administrations. Furthermore, students provided information on selected background characteristics, including age and sex.
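The response coding described above can be illustrated with a minimal scoring sketch. It assumes the conventional K6 total, i.e. the sum of the six item responses (range 0–24); the text does not state how total scores were derived, and the item labels and the function name `k6_total` are hypothetical.

```python
from typing import Sequence

# Short labels for the six items listed above (hypothetical identifiers).
K6_ITEMS = (
    "nervous",
    "hopeless",
    "restless_or_fidgety",
    "so_depressed",
    "everything_an_effort",
    "worthless",
)


def k6_total(responses: Sequence[int]) -> int:
    """Sum six item responses coded 0-4 (assumed scoring rule; range 0-24)."""
    if len(responses) != len(K6_ITEMS):
        raise ValueError("expected exactly six item responses")
    if any(r not in range(5) for r in responses):
        raise ValueError("each response must be an integer from 0 to 4")
    return sum(responses)


# One hypothetical respondent, items in the order listed above.
print(k6_total([1, 0, 2, 0, 3, 1]))  # 7
```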

Data analyses

Cronbach’s alpha was computed to evaluate the internal consistency of the Bangla K6 scale. Cronbach’s alpha measures how closely related a set of items are as a group and examines whether the items measure an underlying construct. Higher alpha values indicate greater internal consistency, with alpha ≥ 0.70 suggesting acceptable consistency (Bland and Altman, 1997). Principal component analysis (PCA) was used to examine variability in the data and to determine whether the items on the Bangla K6 belong to one or more constructs.
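As an illustration of these two steps, the sketch below computes Cronbach's alpha from a respondents-by-items matrix and the eigenvalues of the item correlation matrix, which is one common way of running a PCA on scale items. The analyses in this study were run in Stata; this Python version, the function names, and the simulated data are assumptions for illustration only.

```python
import numpy as np


def cronbach_alpha(items) -> float:
    """Cronbach's alpha for a (respondents x items) matrix."""
    x = np.asarray(items, dtype=float)
    k = x.shape[1]
    item_variances = x.var(axis=0, ddof=1)      # variance of each item
    total_variance = x.sum(axis=1).var(ddof=1)  # variance of the summed score
    return k / (k - 1) * (1.0 - item_variances.sum() / total_variance)


def pca_eigenvalues(items):
    """Eigenvalues of the item correlation matrix (largest first) and
    the proportion of total variance each component explains."""
    corr = np.corrcoef(np.asarray(items, dtype=float), rowvar=False)
    eigenvalues = np.linalg.eigvalsh(corr)[::-1]
    return eigenvalues, eigenvalues / eigenvalues.sum()


# Simulated data (not study data): six noisy indicators of one latent trait.
rng = np.random.default_rng(0)
latent = rng.normal(size=(500, 1))
simulated_items = latent + rng.normal(size=(500, 6))

alpha = cronbach_alpha(simulated_items)
eigenvalues, explained = pca_eigenvalues(simulated_items)
print(f"alpha = {alpha:.2f}")
print("eigenvalues:", np.round(eigenvalues, 2))
print(f"variance explained by first component = {explained[0]:.0%}")
```

Because the eigenvalues of a correlation matrix sum to the number of items, the proportion of variance explained by the first component of a six-item scale is its eigenvalue divided by six.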

Confirmatory factor analysis (CFA), as used elsewhere (Easton et al., 2017; Bessaha, 2017; Lee et al., 2012), was used to examine how well the Bangla K6 items represent one or more latent constructs. CFA tests the hypothesis that a relationship exists between observed variables and their underlying latent constructs. In particular, CFA was conducted to determine the goodness-of-fit between the original factor structure of the K6 and the sample data collected using the Bangla K6. CFA was also used to examine whether the factor structure of the Bangla K6 could be refined to improve the model fit by the addition of pathways. The decision to add a path to a CFA model is based on the examination of modification indices (MI). If an MI between two items is high in relation to other MIs, it suggests that the addition of a path is expected to improve the overall fit of the model. If the addition of a path does not make theoretical or logical sense, then the path should not be included. The following goodness-of-fit indices of CFA, along with their threshold values, were used to assess the degree of fit between the model and the sample: Chi-square test; Tucker Lewis Index (TLI: > 0.90 acceptable, > 0.95 excellent); Comparative Fit Index (CFI: > 0.90 acceptable, > 0.95 excellent); Root Mean Square Error of Approximation (RMSEA: < 0.08 acceptable, < 0.05 excellent); and Standardized Root Mean Square Residual (SRMR: < 0.08 acceptable, < 0.05 good-fit) (Hooper et al., 2008).
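The thresholds above can be applied mechanically, as in the sketch below. It also shows a commonly used single-group formula for RMSEA, sqrt(max(χ² − df, 0) / (df × (N − 1))); the text does not state the exact formula applied by AMOS, so that formula, the helper names, and the "poor" label for values outside the stated ranges are assumptions.

```python
import math


def rmsea(chi_sq: float, df: int, n: int) -> float:
    """RMSEA from the model chi-square, its degrees of freedom, and the
    sample size (common single-group formula; assumed, not from the text)."""
    return math.sqrt(max(chi_sq - df, 0.0) / (df * (n - 1)))


def grade(value: float, acceptable: float, excellent: float,
          higher_is_better: bool) -> str:
    """Label a fit index against an acceptable and an excellent threshold."""
    if higher_is_better:
        if value > excellent:
            return "excellent"
        if value > acceptable:
            return "acceptable"
    else:
        if value < excellent:
            return "excellent"
        if value < acceptable:
            return "acceptable"
    return "poor"  # label for values outside the stated ranges (assumption)


# Time 1 values as reported in the Results: chi-sq(7) = 7.92, n = 718.
rmsea_t1 = rmsea(7.92, 7, 718)
print(round(rmsea_t1, 3))                     # 0.014
print(grade(0.996, 0.90, 0.95, True))         # CFI   -> excellent
print(grade(0.997, 0.90, 0.95, True))         # TLI   -> excellent
print(grade(rmsea_t1, 0.08, 0.05, False))     # RMSEA -> excellent
print(grade(0.012, 0.08, 0.05, False))        # SRMR  -> excellent (termed "good-fit" in the text)
```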

Furthermore, differential item functioning (DIF) analysis (Crane et al., 2006) was conducted to examine whether the relationships between psychological distress and the Bangla K6 item responses were influenced by any participant factor (e.g., student group). DIF is a statistical characteristic of an item that shows the extent to which the item might be measuring different abilities for members of separate subgroups. Thus, DIF analyses examine potential bias of the scale items and evaluate whether different population groups perform differently on items of the scale. In the presence of DIF (which was not desirable), participants in different groups (school students vs. university students) with equal underlying levels of psychological distress would differ in the probabilities of endorsing the Bangla K6 items and their response categories. In the absence of DIF (which was desirable), the probability of endorsing the Bangla K6 items and their response categories would not differ across the student groups.

Prior to computing test-retest reliability coefficients, a Bland-Altman (BA) plot was used to assess and display agreement and systematic differences between the measurements at the two time points. The BA plot examines heteroscedasticity between the measurements and identifies possible outliers. The 95% confidence interval (CI) of the limits of agreement (LOA) was used to examine the magnitude of systematic differences, with a narrower interval indicating higher stability (Bland and Altman, 1999). The intra-class correlation coefficient (ICC) with a two-way mixed effects model was used to estimate the test-retest reliability coefficient of the Bangla K6. In examining test-retest reliability, the Pearson correlation coefficient can be used to measure the correlation between measurements, and a paired t-test can be used to measure agreement; however, neither measures both correlation and agreement. The ICC measures both correlation and agreement between the measurements and has therefore become a widely used measure of test-retest reliability. ICC values of 0.75–0.90 can be interpreted as good reliability, and values > 0.90 as excellent reliability (Koo and Li, 2016).
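A minimal sketch of both quantities follows. The limits of agreement are computed as the mean difference ± 1.96 standard deviations of the differences. For the ICC, the text specifies a two-way mixed effects model but not the type or unit, so the consistency, single-measurement form (often written ICC(3,1)) is assumed here; the function names and example scores are hypothetical.

```python
import numpy as np


def limits_of_agreement(test, retest):
    """Mean difference (bias) and 95% limits of agreement."""
    diff = np.asarray(retest, dtype=float) - np.asarray(test, dtype=float)
    bias = diff.mean()
    sd = diff.std(ddof=1)
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


def icc_consistency(scores) -> float:
    """ICC(3,1): two-way mixed effects, consistency, single measurement.

    `scores` is a (participants x sessions) matrix. The choice of this
    ICC form is an assumption, not stated in the text.
    """
    x = np.asarray(scores, dtype=float)
    n, k = x.shape
    grand_mean = x.mean()
    ss_rows = k * ((x.mean(axis=1) - grand_mean) ** 2).sum()  # participants
    ss_cols = n * ((x.mean(axis=0) - grand_mean) ** 2).sum()  # sessions
    ss_error = ((x - grand_mean) ** 2).sum() - ss_rows - ss_cols
    ms_rows = ss_rows / (n - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (ms_rows + (k - 1) * ms_error)


# Hypothetical total scores for eight participants at test and retest.
test_scores = np.array([4, 10, 7, 15, 2, 9, 12, 6])
retest_scores = np.array([5, 9, 8, 13, 3, 10, 11, 6])

bias, lower, upper = limits_of_agreement(test_scores, retest_scores)
icc = icc_consistency(np.column_stack([test_scores, retest_scores]))
print(f"bias = {bias:.2f}, LOA = ({lower:.2f}, {upper:.2f}), ICC = {icc:.2f}")
```

Under the consistency definition, a constant shift between sessions does not lower the ICC, whereas it does appear as a non-zero bias in the Bland-Altman summary, which is one reason to report both.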

The predictive performance of the Bangla K6 scale was examined using Receiver Operating Characteristic (ROC) curves (Fassaert et al., 2009) with the CES-D-10 as the reference criterion. An ROC curve shows how well a test distinguishes between classes and gives an idea of the benefit of using the test in question. It is a plot of the true positive rate (sensitivity) against the false positive rate (1 − specificity, or 100 − specificity when expressed as a percentage) for different cut-off points of a diagnostic test. Each point on the ROC curve represents a sensitivity/specificity pair corresponding to a particular decision threshold. An ROC curve demonstrates the trade-off between sensitivity and specificity, as any increase in sensitivity is accompanied by a decrease in specificity. We used the CES-D-10 because it is considered one of the most common measures of depressive symptoms (Andresen et al., 2013; Radloff, 1977). Participants with a total score of ≥ 10 on the CES-D-10 were classified as showing depressive symptoms, and those with scores < 10 were classified as non-depressive (Radloff, 1977). The area under the curve (AUC) was calculated as a measure of the extent to which the Bangla K6 scores predicted depressive symptoms using the dichotomized CES-D-10. The AUC is used as a measure of test accuracy, and it shows how well a test can distinguish between two diagnostic groups (positive/negative). An AUC of 0.50 suggests that the Bangla K6 is no better than chance at predicting depressive symptoms, whereas an AUC of 1.0 would indicate that the Bangla K6 predicts depressive symptoms perfectly. AUCs are typically interpreted as follows: 0.90–1.00 excellent, 0.80–0.89 good, 0.70–0.79 fair, and < 0.70 poor (Cicchetti, 2001). Furthermore, the relationships between the Bangla K6 and the CES-D-10 scores were examined using Pearson’s correlation coefficients. CFA was conducted using the statistical software AMOS 25, and the rest of the analyses were performed in Stata SE 14.
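The ROC procedure described above can be sketched as follows. The CES-D-10 cut-off of ≥ 10 is taken from the text; the AUC is computed through its rank interpretation (the probability that a randomly chosen participant with depressive symptoms scores higher on the K6 than one without, with ties counted as half), which may differ from the estimator Stata uses. Function names and scores are hypothetical.

```python
import numpy as np


def roc_points(scores, positive):
    """(cut-off, sensitivity, specificity) for each observed score,
    flagging a participant when score >= cut-off."""
    scores = np.asarray(scores, dtype=float)
    positive = np.asarray(positive, dtype=bool)
    points = []
    for cutoff in np.unique(scores):
        flagged = scores >= cutoff
        sensitivity = (flagged & positive).sum() / positive.sum()
        specificity = (~flagged & ~positive).sum() / (~positive).sum()
        points.append((float(cutoff), float(sensitivity), float(specificity)))
    return points


def auc(scores, positive) -> float:
    """Area under the ROC curve via pairwise comparison of cases and non-cases."""
    scores = np.asarray(scores, dtype=float)
    positive = np.asarray(positive, dtype=bool)
    pos, neg = scores[positive], scores[~positive]
    greater = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return float((greater + 0.5 * ties) / (pos.size * neg.size))


# Hypothetical scores for eight participants (not study data).
k6 = np.array([2, 5, 9, 12, 4, 15, 7, 18])
cesd10 = np.array([3, 8, 12, 14, 11, 20, 6, 9])
depressive = cesd10 >= 10  # reference criterion from the text

print(f"AUC = {auc(k6, depressive):.3f}")
for cutoff, sens, spec in roc_points(k6, depressive):
    print(f"cut-off >= {cutoff:g}: sensitivity = {sens:.2f}, specificity = {spec:.2f}")
```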

Results
Sample characteristics

Of the 941 students approached, a total of 718 students completed the survey at time 1 (week 1; response rate = 76.3%) and 712 completed at time 2 (week 2). Half of the participants (51%) were school students, aged 13–17 years, and the rest (49%) were university students, aged 18–24 years. The mean age of the study participants was 18.39 years (SD = 3.52) and 45% were female.

Internal consistency and factor structure

High values of Cronbach’s alpha demonstrated good internal consistency of the Bangla K6 with alpha = 0.87 at time 1 and alpha = 0.88 at time 2. An initial PCA of Bangla K6 revealed that the six items of the scale represented a single factor solution at time 1 with 51% of the total variance explained and the first eigenvalue being 3.04 (second eigenvalue = 0.75). A single factor was also extracted for the Bangla K6 at time 2 with 52% of total variance explained and the first eigenvalue being 3.11 (second eigenvalue = 0.74). The PCA results provided evidence of uni-dimensionality in the Bangla K6 items.

The CFA provided further evidence that the Bangla K6 scale consisted of one factor at both time 1 and time 2, after adjusting for relatively high MIs. The goodness-of-fit indices of the Bangla K6 at time 1 were: χ2(7) = 7.92, p = 0.34; CFI = 0.996; TLI = 0.997; RMSEA = 0.014; and SRMR = 0.012. The fit indices at time 2 were: χ2(6) = 9.97, p = 0.13; CFI = 0.997; TLI = 0.992; RMSEA = 0.03; and SRMR = 0.014. Non-significant Chi-square values suggested good fit of the CFA models. Both RMSEA and SRMR were below the cut-off point of 0.05, suggesting satisfactory model fit. The values of CFI and TLI (> 0.95) represented an excellent fit. The item loadings ranged from 0.53 (item 6) to 0.68 (item 4) at time 1 (Figure 1), and from 0.57 (item 6) to 0.70 (item 4) at time 2 (Figure 2).

Figure 1

The best fitted model, with standardized estimates, based on the results of confirmatory factor analysis of the items of Bangla K6 at time 1 (n = 718)

Note: Oval represents the latent construct or factor; rectangle represents the items; and small circle represents the relevant error terms. Item loadings are interpreted as correlation between the items and the construct, and ranged from 0.53 (item 6) to 0.68 (item 4).

Goodness-of-fit indices of CFA on the Bangla K6 items at time 1 were:

- Chi-square test: Chi-sq (7) = 7.92, p = 0.34 suggests a good fit;

- Tucker Lewis Index (TLI) = 0.997 > 0.95 suggests an excellent fit;

- Comparative Fit Index (CFI) = 0.996 > 0.95 suggests an excellent fit;

- Root Mean Square Error of Approximation (RMSEA) = 0.014 < 0.05 suggests an excellent fit;

- Standardized Root Mean Square Residual (SRMR) = 0.012 < 0.05 suggests a good fit.

Figure 2

The best fitted model, with standardized estimates, based on the results of confirmatory factor analysis of the items of Bangla K6 at time 2 (n = 715)

Note: Oval represents the latent construct or factor; rectangle represents the items; and small circle represents the relevant error terms. Item loadings are interpreted as correlation between the items and the construct, and ranged from 0.57 (item 6) to 0.70 (item 4).

Goodness-of-fit indices of CFA on the Bangla K6 items at time 2 were:

- Chi-square test: Chi-sq (6) = 9.97, p = 0.13 suggests a good fit;

- Tucker Lewis Index (TLI) = 0.992 > 0.95 suggests an excellent fit;

- Comparative Fit Index (CFI) = 0.997 > 0.95 suggests an excellent fit;

- Root Mean Square Error of Approximation (RMSEA) = 0.03 < 0.05 suggests an excellent fit;

- Standardized Root Mean Square Residual (SRMR) = 0.014 < 0.05 suggests a good fit.

Further analyses showed only a minor DIF for item 3 (restless or fidgety) across the student groups at time 1, which was graphically inspected. In case of similar levels of non-specific psychological distress, the university students had a slightly higher probability of affirming item 3 than the school students. Given that the DIF for item 3 was minor at time 1 with no DIF at time 2, the DIF findings were unlikely to influence the current analyses.

Test-retest reliability

The BA plot for test-retest reliability of the Bangla K6 demonstrated small differences on repeated measurements without any evidence of heteroscedasticity (Figure 3). The narrow range of the LOA and the small number of outliers in the Bangla K6 indicated high stability and small systematic differences in the scale over time. The ICC for the scale was 0.80 (0.77–0.82), which suggested acceptable reliability.

Figure 3

Bland-Altman plot of the Bangla K6 between test and retest sessions

Note: Differences between test and retest sessions were plotted against the average of the two sessions for each participant. It plots average measures of the two test sessions (x-axis) against difference between test and retest measures (y-axis). The centre line represents mean of differences, while the upper and lower lines indicate 95% limits of agreement (LOA).

- Narrow LOAs suggest that the test-retest measures are essentially equivalent, while wide LOAs suggest that the measures are ambiguous.

- If the variability in measurements is consistent across the plot without any particular pattern or trend, then it is an indication of homoscedasticity. If the difference gets larger as the average gets larger, it suggests the presence of heteroscedasticity.

Predictive validity

The Bangla K6 provided good predictions of depressive symptoms, measured by the CES-D-10 scale, with AUC = 0.82 (0.79–0.85) at time 1 and 0.80 (0.76–0.83) at time 2. ROC curves for predicting depressive symptoms from the Bangla K6 scores for school and university students are presented in Figure 4 for time 1 and in Figure 5 for time 2. These figures show that the predictive validity of the Bangla K6 scores for depressive symptoms was comparable across student groups at both time points. The Pearson’s correlation coefficients between the Bangla K6 and the CES-D-10 scores were considerably high (Pearson’s r = 0.68 at time 1; 0.67 at time 2).

Figure 4

ROC-curves for the Bangla K6 predicting depressive symptoms for students of schools (#1) and universities (#2) at time 1 (CES-D-10 as the reference criterion)

Note: The area under the ROC curve (AUC) shows how well a test can distinguish between two diagnostic groups (positive/negative).

- The closer the curve follows the left-hand border and then the top border of the ROC space, the more accurate the test.

- An AUC of 0.50 suggests that the Bangla K6 is no better than chance at predicting depressive symptoms, whereas an AUC of 1.0 would indicate that the Bangla K6 predicts depressive symptoms perfectly.

- AUC = 0.82 for school students and AUC = 0.80 for university students represent good prediction of depressive symptoms at time 1.

Figure 5

ROC-curves for the Bangla K6 predicting depressive symptoms for students of schools (#1) and universities (#2) at time 2 (CES-D-10 as the reference criterion)

Note: The area under the ROC curve (AUC) shows how well a test can distinguish between two diagnostic groups (positive/negative).

- The closer the curve follows the left-hand border and then the top border of the ROC space, the more accurate the test.

- An AUC of 0.50 suggests that the Bangla K6 is no better than chance at predicting depressive symptoms, whereas an AUC of 1.0 would indicate that the Bangla K6 predicts depressive symptoms perfectly.

- AUC = 0.85 for school students and AUC = 0.80 for university students represent good prediction of depressive symptoms at time 2.

Discussion

To our knowledge, the current study is the first psychometric evaluation of the translated version of the 6-item Kessler Psychological Distress Scale (K6) in Bangla, the sixth most spoken native language in the world by population (Ethnologue 2018), in a relatively large sample of young people in Bangladesh. Our study showed high internal consistency (alpha ≥ 0.87) of the Bangla K6 at two time points, which is consistent with previous research on translated versions of the K6 (Bu et al., 2017; Easton et al., 2017). This finding suggests that the translated items assess the same overarching construct of psychological distress. The results also showed the existence of a single factor structure, as found elsewhere (Kessler et al., 2010), with no substantial item bias (only a minor DIF for one item at time 1), suggesting robustness of the scale in assessing non-specific psychological distress among the young people examined in this study. The AUCs ≥ 0.80 at both time points suggest good validity of the scale in predicting depressive symptoms. The findings also showed high test-retest reliability of the Bangla K6, with an ICC of 0.80. Thus, our study demonstrated acceptable psychometric properties of the Bangla K6 in a Bangladeshi sample of young people.

An earlier study examined the psychometrics of the Bangla version of the 10-item Kessler Psychological Distress Scale, K10 (Uddin et al., 2018); however, that evaluation was based on a sample of the adult population in rural Bangladesh. Our study provides support for the Bangla version of the 6-item scale (K6) in young people. Although the PCA in our study found a single factor solution with > 50% variability explained, there was some evidence of a possible second factor demonstrated by high eigenvalues, similar to what has been reported in U.S. and Australian adolescents (Mewton et al., 2016, Peiper et al., 2015). In contrast to a single factor solution, some earlier research has identified a two-factor structure in the K6 (depression and anxiety) (Easton et al., 2017; Bessaha, 2017; Lee et al., 2012). Furthermore, the CFA in our study supported a one-factor model; however, optimal fit of the models was only achieved when correlated residuals were taken into account through inclusion of MIs, a similar approach to that adopted earlier to optimize model fit (Mewton et al., 2016; Peiper et al., 2015). This perhaps reflects the fact that Bangla K6 items may represent interrelated aspects of psychological distress. The items may tap different constructs when they are used with young people and adults, or the study participants might have interpreted the translated items differently from their intended meaning. Hence, we cannot rule out the possibility that culturally appropriate alternative items may better capture the underlying spectrum of psychological distress in the study sample. Our study also found a minor DIF for item 3 (restless or fidgety) at time 1 (but not at time 2), which suggests that item 3 might be interpreted differently by adolescents and young adults. Further research is therefore needed to generate a better understanding of culturally adapted items that may be developed for optimal use of the K6 in young people, as suggested elsewhere (Mewton et al., 2016).

Although the results of the current study seem promising, several limitations need to be taken into account when interpreting the findings. Firstly, the study was based on a nonrandom sample of students from four purposively selected educational institutions based in a metropolitan city. This limits the generalization of the study findings to other settings, including regional or rural areas. Secondly, only interested students of the participating schools and universities completed the survey, which may have resulted in a volunteer bias. Thirdly, although the translation process was thorough (Williamson et al., 2000), it is difficult to assess how the study participants in Bangladesh perceived the translated scale items, especially as the survey was self-administered. Hence, the possibility of misinterpretation of the items by the study participants cannot be ruled out. Finally, although the CES-D-10 was previously used in Bangladeshi adolescents (Khan et al., 2017), the cut-off of ≥ 10 to represent depressive symptoms has not been validated in Bangladesh. This may have affected the estimates of predictive validity of the Bangla K6 scale.

In conclusion, the findings of our study suggest that the translated single-factor Bangla K6 scale has good psychometric properties and appears to be an acceptable instrument to assess psychological distress among Bangladeshi adolescents and young adults. Given its brevity and good psychometric properties, the Bangla K6 scale can be used in epidemiological surveys among young people to provide a quick and inexpensive screen for psychological distress, which could be followed up by clinical assessments. This approach can be particularly useful in resource-poor settings, such as Bangladesh and the Indian states of Tripura and West Bengal, where mental health services are poor or sometimes non-existent and the native language is Bangla. Future research is, however, needed to better understand our ability to detect and identify vulnerable individuals whose native language is Bangla and who are in need of mental health support.