- Research
- Open access
The psychometric properties of instruments measuring ethical sensitivity in nursing: a systematic review
Systematic Reviews volume 13, Article number: 87 (2024)
Abstract
Background
Recognizing and appropriately responding to ethical considerations is a crucial element of ethical nursing practice. To mitigate instances of ethical incongruity in healthcare and to promote nurses’ comprehension of their professional ethical responsibilities, it is imperative for researchers to accurately evaluate ethical sensitivity. Conducting a systematic review of the available instruments would enable practitioners to determine the most suitable instrument for implementation in the field of nursing.
Aim
This review aims to systematically assess the measurement properties of instruments used to measure ethical sensitivity in nursing.
Methods
A systematic literature search was conducted in July 2022 in the following electronic databases: Scopus, CINAHL, APA PsycINFO, Embase, Web of Science, and PubMed. Two reviewers independently screened and assessed the studies in accordance with the COnsensus-based Standards for the selection of health Measurement INstruments (COSMIN) checklist. The updated criteria for good measurement properties were used to rate the results for each measurement property, and the modified Grading of Recommendations Assessment, Development and Evaluation (GRADE) approach was used to grade the quality of the summarized evidence.
Results
This review encompasses a total of 29 studies that describe 11 different instruments. Neither cross-cultural validity nor responsiveness was examined in any of the included studies. Whereas the majority of the instruments underwent at least some type of validity assessment, nearly all of the reliability results were rated as indeterminate. Two instruments were recommended: the Ethical Sensitivity Questionnaire for Nursing Students (ESQ-NS) and the Ethical Awareness Scale for nurses in intensive care units. It is recommended that new self-administration instruments for special nursing settings be developed in accordance with the item response theory (IRT)/Rasch model.
Conclusion
The selection of ethical sensitivity measurement instruments in nursing, and further research on the development, psychometric, and cross-cultural adaptation of these instruments, could be conducted in accordance with the findings and suggestions of this systematic review.
Strengths and limitations
• This review was conducted to assess 11 instruments that were used to measure ethical sensitivity in nursing in 29 studies.
• The Ethical Sensitivity Questionnaire for Nursing Students (ESQ-NS) and the Ethical Awareness Scale for nurses in intensive care units can be recommended, but further reliability and cross-cultural validity testing are needed.
• The IRT/Rasch model is also recommended to measure ethical sensitivity in nursing.
• The potential limitation of utilizing the COSMIN checklist for assessing methodological quality is worth considering.
• Test–retest was considered inappropriate; thus, the reliability testing of ethical sensitivity measurement instruments still needs to be explored.
Introduction
Healthcare professionals increasingly encounter complex ethical dilemmas arising from advancements in medicine and the growing demands of the healthcare industry [1, 2]. They are required to possess greater ethical awareness and responsibility; hence, ethics is playing an increasingly essential role in health professional education [3, 4]. Ethical competence has been considered one of the professional components [5]. In addition, due to the unique culture of the healthcare profession, which encompasses its values [6], nurses who are responsible for coordinating the healthcare team and patients often encounter conflicting values, particularly ethical dilemmas [7,8,9]. In this process, nurses must improve their ethical sensitivity to make ethically sound judgments in practice [10].
Ethical sensitivity has also been termed “moral sensitivity” in many studies of the concept [11]. Ethical sensitivity has been identified as a foundational component of ethical action according to Rest in 1976 [11, 12]. Subsequently, scholars have proposed various theories and models pertaining to ethical sensitivity. The prevalent model employed to cultivate and execute ethical sensitivity in nursing settings is the concept of moral sensitivity, which was developed by Lützén in 1993 [13]. Lützén defined moral sensitivity as “the ability to recognize a moral conflict” and “have insight into the ethical consequences made on behalf of the person.” In 2001, Ersoy and Goz [14] defined it as “the capacity or ability to recognize an ethical problem (or an ethical dimension when an ethical conflict is not present).” Weaver et al. [15] likewise conceptualized ethical sensitivity in 2008 as that which enables professionals to recognize, interpret, and respond appropriately to the concerns of those receiving professional services. In their recently published concept analysis, Milliken et al. integrated the concept of ethical sensitivity by proposing that ethical awareness is a component of ethical sensitivity [11], which is the first step in the process of ethical action. Ethical sensitivity, in turn, is an important component of moral reasoning [16].
Ethical sensitivity is imperative for good (ethical) patient care [11] because it promotes ethical behavior [17]. However, evidence suggests the ethical import of everyday issues may go unnoticed by nurses in practice, putting patients at risk for harm, and nurses may, at times, feel underprepared to recognize and address ethical issues as they arise in practice [18, 19]. Moreover, diminished or absent ethical sensitivity can result in ethically incongruent care, which is inconsistent with the professional obligations of nursing. Social and technological developments, professional conflicts, and individuals' unawareness of their rights are known to create ethical dilemmas [20], and continuous exposure to ethical dilemmas in practice is associated with negative consequences for nurses, such as moral distress [21], which can also have adverse effects on patients, such as decreased quality of patient care, decreased confidence in nursing services, and prolonged hospital stay [22].
As such, identifying the most appropriate way to assess ethical sensitivity in nursing is imperative to design interventions to facilitate ethical practice and to ensure nurses recognize the nature and extent of professional ethical obligations. Although many instruments have been internationally developed to assess ethical sensitivity in nursing, it is still not easy to choose the most appropriate instrument for a specific purpose. In part, this is because a comprehensive summary of existing instruments and their measurement properties does not exist. A systematic review of measurement properties would effectively reveal which instruments have been tested and help researchers select an appropriate instrument [23]. Systematic reviews can provide a comprehensive overview of the measurement properties and support evidence-based recommendations in the selection of the most suitable instrument for a given purpose. These kinds of systematic reviews can also be conducted to identify gaps in knowledge regarding measurement properties, which can, in turn, be used to design new studies on measurement properties [24].
However, there are no studies that have conducted systematic reviews of measurement instruments of ethical sensitivity targeted to nursing groups. Although previous integrative reviews exist [11], these did not solely focus on the evaluation of psychometric properties. In-depth evaluations of all available reliability, validity, and responsiveness data for existing instruments used to measure ethical sensitivity in nursing have not been conducted, including assessing the methodological quality of studies and the psychometric properties and quality of evidence for measurement instruments.
Hence, this review aims to systematically identify and critically assess the psychometric properties of instruments used to measure ethical sensitivity in nursing [25].
Methods
Protocol and registration
This systematic review follows the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) reporting guidelines [26] and has been registered in the PROSPERO database [27] (CRD42022325433).
Literature search
A search was conducted in the following electronic databases: Scopus, the Cumulative Index of Nursing and Allied Health Literature (CINAHL), APA PsycINFO, Embase, Web of Science, and PubMed. The search was limited to publications in the English, Chinese, Korean, Japanese, and Turkish languages (coverage: inception to June 2022). In addition, the references of the included studies and identified reviews were examined, and a manual search was performed with the Google Scholar web search engine. The search strategy used Medical Subject Headings (MeSH) and keywords and included a combination of the following five aspects in reference to the search construct developed by Terwee et al. [28]:
- #1 construct search: "Ethical sensitivity"[tiab] OR "Moral sensitivity"[tiab] OR "Awareness"
- #2 population search: "Nurses"[Mesh] OR "Students, Nursing"[Mesh] OR "Nursing"[Mesh]
- #3 instruments search: Instrument[tiab] OR instruments[tiab] OR measure[tiab] OR measures[tiab] OR questionnaire[tiab] OR questionnaires[tiab] OR scale[tiab] OR scales[tiab] OR tool[tiab] OR tools[tiab] OR survey[tiab] OR test[tiab]
- #4 filter for measurement properties
- #5 exclusion filter
These filters were adapted for use when searching all the other databases and are detailed in Supplementary material file 1 [28].
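As a rough illustration of how the five blocks might be combined into one PubMed-style query, the sketch below joins blocks #1 to #4 with AND and subtracts block #5 with NOT. The Boolean structure is an assumption, `build_query` is a hypothetical helper, and the two filter strings are placeholders; the actual filter text is given in the supplementary material and in Terwee et al. [28].

```python
# Sketch only: the AND/NOT structure and the two placeholder filters are
# assumptions, not the review's documented query syntax.

CONSTRUCT = '"Ethical sensitivity"[tiab] OR "Moral sensitivity"[tiab] OR "Awareness"'
POPULATION = '"Nurses"[Mesh] OR "Students, Nursing"[Mesh] OR "Nursing"[Mesh]'
INSTRUMENT_TERMS = [
    "instrument", "instruments", "measure", "measures",
    "questionnaire", "questionnaires", "scale", "scales",
    "tool", "tools", "survey", "test",
]
# Placeholders: the real filter strings are not reproduced in the text.
PROPERTIES_FILTER = "<measurement properties filter>"
EXCLUSION_FILTER = "<exclusion filter>"


def build_query(construct, population, instrument_terms,
                properties_filter, exclusion_filter):
    """Join blocks #1-#4 with AND, then subtract block #5 with NOT."""
    instruments = " OR ".join(f"{term}[tiab]" for term in instrument_terms)
    blocks = (construct, population, instruments, properties_filter)
    included = " AND ".join(f"({block})" for block in blocks)
    return f"({included}) NOT ({exclusion_filter})"


query = build_query(CONSTRUCT, POPULATION, INSTRUMENT_TERMS,
                    PROPERTIES_FILTER, EXCLUSION_FILTER)
```

Wrapping each block in parentheses keeps the OR terms inside a block from binding to the neighbouring AND.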
References identified by the search strategy were entered into EndNote bibliographic software to screen the selected articles [29].
Eligibility criteria
Inclusion criteria are as follows: (a) original research; (b) pertains to an instrument designed to measure ethical sensitivity in nursing, with a focus on the development and evaluation of psychometric properties; and (c) published in English, Chinese, Korean, Japanese, or Turkish.
Exclusion criteria are as follows: (a) research that employed the instruments solely to validate other instruments measuring similar or related constructs and (b) research in which ethical sensitivity was included only as part of a broader measure.
Data selection
The search hits were imported into EndNote, and duplicates were removed. Based on the established eligibility criteria for article selection, one author ran the reviewed search strategy across all the databases, while two others independently screened the titles and abstracts. The remaining records were then screened by full-text review. Disagreements regarding the inclusion of an article were resolved through discussion; if disagreement persisted, a third researcher decided whether to include the article.
The methodological quality and result rating of each single study
The COSMIN risk-of-bias checklist [25] was used to assess the methodological quality and the psychometric properties of each single study. The checklist is composed of 10 boxes: instrument development, content validity, structural validity, internal consistency, cross-cultural validity, reliability (test–retest, inter-rater, intra-rater), measurement error, criterion validity, hypothesis testing, and responsiveness [30]. Every box contains items rated on a 4-point scale (very good, adequate, doubtful, inadequate), and the final evaluation of each property follows a “worst score counts” principle, i.e., the lowest item rating in a box determines the rating of that box.
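The “worst score counts” principle can be sketched in a few lines. The function name and the list-of-strings input are illustrative assumptions; the sketch also ignores any “not applicable” item ratings, which would need to be filtered out beforehand.

```python
# Ratings ordered from best to worst, as listed in the text.
RATING_ORDER = ["very good", "adequate", "doubtful", "inadequate"]


def worst_score_counts(item_ratings):
    """Return the lowest (worst) rating among the items of one checklist box."""
    if not item_ratings:
        raise ValueError("at least one item rating is required")
    # The rating with the highest index in RATING_ORDER is the worst one.
    return max(item_ratings, key=RATING_ORDER.index)


# Hypothetical box with three item ratings.
box_rating = worst_score_counts(["very good", "adequate", "doubtful"])
```

A single doubtful item therefore pulls the whole box down to doubtful, regardless of how well the other items were rated.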
The updated criteria for good measurement properties [24] were employed to rate the result of each single study on a measurement property. Here, the term “quality” pertains to the actual result obtained for the measured property. Each result is rated as sufficient (+), insufficient (-), or indeterminate (?).
Synthesis
The evidence for each measurement property was summarized, and the quality of evidence for each property was graded as “high,” “moderate,” “low,” or “very low” [24], in accordance with the modified Grading of Recommendations Assessment, Development and Evaluation (GRADE) quality of evidence method [24, 31]. At this stage, the focus is on each instrument rather than each study. The results of all available studies on a measurement property are quantitatively pooled or qualitatively summarized and compared against the criteria for good measurement properties to determine whether overall the measurement property of the instrument is sufficient (+), insufficient (-), inconsistent (±), or indeterminate (?) [24, 32].
The included instruments are categorized into three categories of recommendations: (A) instruments that have the potential to be recommended as the most suitable instrument for the construct and population of interest (i.e., instruments with evidence for sufficient content validity (any level) and at least low evidence for sufficient internal consistency); (B) instruments that may have the potential to be recommended, but further validation studies are needed (i.e., instruments categorized not in A or C); and (C) instruments that should not be recommended (i.e., instruments with high-quality evidence for an insufficient measurement property) [24, 33]. The step is formulated concerning the quality of the evidence, construct of interest, and study population.
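The three recommendation categories can be read as a small decision rule, sketched below. The data layout (a dict mapping each measurement property to a rating and an evidence grade) and the function name are assumptions made for illustration, as is the choice to check category C before category A.

```python
QUALITY_RANK = {"very low": 0, "low": 1, "moderate": 2, "high": 3}


def recommend(evidence):
    """Map summarized evidence to category A, B, or C.

    evidence: {property name: (rating, quality)}, with rating in
    {"+", "-", "±", "?"} and quality a GRADE level or None if ungraded.
    """
    # Category C: high-quality evidence for any insufficient property.
    # Checking C first is an assumption about precedence.
    for rating, quality in evidence.values():
        if rating == "-" and quality == "high":
            return "C"
    # Category A: sufficient content validity (any level) and at least
    # low-quality evidence for sufficient internal consistency.
    cv_rating, _ = evidence.get("content validity", ("?", None))
    ic_rating, ic_quality = evidence.get("internal consistency", ("?", None))
    if (cv_rating == "+" and ic_rating == "+" and ic_quality is not None
            and QUALITY_RANK[ic_quality] >= QUALITY_RANK["low"]):
        return "A"
    # Category B: everything else; further validation is needed.
    return "B"


# Hypothetical evidence profile, not taken from any included instrument.
example = {
    "content validity": ("+", "moderate"),
    "internal consistency": ("+", "high"),
    "structural validity": ("?", None),
}
category = recommend(example)
```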
The two authors conducted the aforementioned assessments autonomously, and the ultimate outcomes were attained via consensus.
Results
Study selection
A total of 2505 articles were retrieved; after 1525 duplicate articles were removed, 980 articles were screened. Two reviewers evaluated the titles and abstracts based on the inclusion criteria, and 68 articles were included for full-text review. Among them, 29 studies met the inclusion criteria, as shown in the PRISMA flowchart [26] (Fig. 1).
Studies and instruments characteristics
Most studies aimed at instrument development were conducted in the USA [34,35,36,37]. Sweden [38, 39], Spain [40, 41], and Japan [42, 43] have each developed two instruments. Fourteen studies aimed at instrument development and adaptation in 30 studies, while the remaining 16 studies focused on cross-cultural revisions and adaptations. A cross-sectional design with convenience sampling was employed in all 28 studies, while a few studies used purposive sampling [34, 36, 37, 40, 44] and random sampling [35, 45, 46]. The target population for each study was registered nurses or nursing students, with sample sizes ranging from 6 to 1465. Most of the study settings were hospitals or universities, but a few studies were conducted in primary healthcare centers [40, 41, 47]. Among the 12 instruments, 6 instruments were used to measure nurses’ ethical sensitivity across clinical settings, 4 instruments were targeted mainly at nursing students [34, 42, 43, 48], and the remaining 2 instruments were targeted at registered nurses in intensive care units [36] and primary health care professionals [41]. Two instruments were unidimensional models in accordance with the item response theory (IRT)/Rasch [36, 40]. There is one structured qualitative instrument [14]. The number of items in the measurement instruments ranged from 9 to 35. The most common instrument was the Moral Sensitivity Questionnaire (MSQ) [13]. The characteristics of the included studies and instruments are detailed in Table 1.
The methodological quality of each single study
Instrument development was rated as doubtful or inadequate for 10 instruments in 12 studies due to a lack of evidence of data collection saturation, cognitive interviews, or pilot tests; only 2 studies were rated as adequate for methodological quality of instrument development [40, 41]. The content validity was assessed for 19 instruments in 29 studies, and doubtful data collection and analysis methods were the main reason for poor COSMIN scores, so that only 7 studies had adequate or very good content validity [36, 41, 46, 57, 59,60,61].
Structural validity was assessed using EFA in 19/26 classical test models, of which only 5 studies performed both EFA and CFA [43, 46, 56, 58, 59], but 1 study incorrectly selected the Rasch model [47]. The structural validity of the included classical test models was rated as inadequate or doubtful due to inappropriate rotation methods and unclear CFA estimators (robust maximum likelihood (MLR)/diagonally weighted least squares (DWLS)). The methodological quality of the three IRT/Rasch models in terms of structural validity was rated adequate or very good [36, 37, 40]. The structural validity of the structured qualitative instrument was not assessed.
Internal consistency was the most frequently assessed psychometric characteristic among the included studies and was of the best overall methodological quality, as most studies calculated internal consistency statistics for each unidimensional scale or subscale. Only 11 studies assessed reliability using test–retest methods [34, 35, 37, 39, 42, 45, 46, 55, 57, 58, 61], and all had time intervals greater than or equal to 2 weeks. Two studies did not use the retest method to measure reliability due to reflexivity, but did not mention other approaches to validation [34, 39].
The measurement invariance of the model was examined in 5 of the 13 instrument development studies, but the methodological quality was rated as inadequate or doubtful due to insufficient sample size in the subgroup analysis [36, 37, 40, 42, 47] or inappropriate analysis methods (a linear regression analysis) [42]. None of the translation or cross-cultural revision studies validated the cross-cultural invariance of the model.
Measurement error was assessed in 1/29 studies [61], but COSMIN scores were rated as inadequate because it is unclear if SEM was calculated based on Cronbach’s alpha.
Criterion validity was assessed in 5/29 studies [35, 42, 58, 60, 61], using the MSQ and ESSCN as the gold standard.
Two studies that assessed convergent validity [43, 61] and 4/12 studies that assessed discriminative validity [40, 42, 48, 61] were rated as doubtful because they either presented only a comparison with an instrument that measures another construct or applied a statistical method that was not optimal.
The responsiveness properties were not evaluated in any studies.
The results of the ratings for assessing measurement properties using the COSMIN risk-of-bias checklist are detailed in Table 2.
The rating of results for instruments in each single study
Results of content validity were rated as sufficient (+) for only three instruments [34, 41, 57]; for the remainder, a clear description of how the instrument was reviewed was lacking, only the target population was involved, or the design or method was doubtful.
The structural validity of 3/26 classical test model studies [46, 54, 58] was rated as sufficient (+), whereas the rest were mostly rated as insufficient (-) or indeterminate (?) because they did not meet model fit criteria or the factors explained < 50% of the variance. One of the four IRT/Rasch studies [47] was rated as insufficient (-) due to violation of unidimensionality.
Thirteen instruments with Cronbach’s alpha(s) < 0.70 for each unidimensional scale or subscale were therefore rated as insufficient (-).
Measurement invariance was rated as indeterminate (?) in 1 of 5 studies [42] because the minimal important change (MIC) was not defined.
Reliability was assessed for 11 instruments, but the results of only 1 instrument were rated as sufficient (+) [61], because it reported an intraclass correlation coefficient (ICC) and the ICC was > 0.70.
Measurement error results for ESSCN were rated indeterminate (?) because MIC and smallest detectable change (SDC) were not defined.
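For orientation, the standard error of measurement (SEM) and the SDC are commonly derived from a reliability coefficient as sketched below. These are widely used textbook formulas rather than calculations reported by the included studies; the function names and the example numbers are hypothetical. An SEM computed from Cronbach's alpha does not capture variation between measurement occasions, which is presumably why that approach was rated down.

```python
import math


def standard_error_of_measurement(sd, reliability):
    """SEM = SD * sqrt(1 - reliability), e.g. with a test-retest ICC."""
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must lie between 0 and 1")
    return sd * math.sqrt(1.0 - reliability)


def smallest_detectable_change(sem, z=1.96):
    """SDC for an individual = z * sqrt(2) * SEM (95% level by default)."""
    return z * math.sqrt(2.0) * sem


# Hypothetical scale: score SD of 10 points and an ICC of 0.75.
sem = standard_error_of_measurement(sd=10.0, reliability=0.75)
sdc = smallest_detectable_change(sem)
```

The result is interpretable only when the SDC is compared with the MIC: if the SDC exceeds the MIC, changes that matter cannot be distinguished from measurement error.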
The correlation results of all 5 instruments [35, 42, 58, 60, 61] with the gold standard were < 0.70.
Only three classical test theory (CTT) models [43, 47, 49] and two Rasch models [36, 37, 40] defined specific hypotheses for construct validity, and the results were in accordance with the hypotheses, so the construct validity of the five instruments is sufficient (+).
The result of each single study on a measurement property, rated using the updated criteria based on Terwee et al. [23] and Prinsen et al. [62], is detailed in Table 3.
Synthesis
This systematic review has identified that six instruments were subjected to testing in multiple studies, while the remaining instruments were tested in only one study. The modified GRADE approach was applied to grade the quality of the evidence while summarizing and evaluating the measurement properties of these instruments in accordance with the criteria for good measurement properties. Additionally, categories of recommendations were formulated concerning the quality of the evidence, construct of interest, and study population. The results of the evidence synthesis are presented in Table 4.
- Moral Sensitivity Questionnaire (MSQ) (1994): The overall ratings for content validity and reliability were indeterminate, so the quality of the evidence was not graded [24]. Moderate-quality evidence indicated inconsistent internal consistency, owing to unexplained inconsistency of results across studies. Low-quality evidence showed inconsistent structural validity. However, sufficient construct validity was confirmed by high-quality evidence. Thus, MSQ-1994 may have the potential to be recommended for various clinical settings, but further validation studies are needed.
- Moral Sensitivity Questionnaire (2006): Moderate- and low-quality evidence showed inconsistency and insufficiency of content validity, structural validity, and internal consistency. Compared to MSQ-1994, it is shorter, so the recommended category is also B.
- Modified Moral Sensitivity Questionnaire for Student Nurses (MMSQ-SN): Except for internal consistency, which has been demonstrated to be inconsistent by moderate-quality evidence, the remaining psychometric properties are uncertain and require further testing before the instrument can be recommended for measurement in nursing students.
- Ethical Sensitivity Questionnaire for Nursing Students (ESQ-NS): Structural validity and cross-cultural invariance are questionable, which also reduces reliability. Fortunately, there is no high-quality evidence for these insufficient measurement properties. In addition, the criterion validity and construct validity of the ESQ-NS need to be validated by further studies. However, the ESQ-NS is currently the most preferred and most common instrument targeted at nursing students, with evidence for sufficient content validity and high-quality evidence for sufficient internal consistency. Thus, the recommended category is A.
- Ethical Awareness Scale: This instrument also has the potential to be recommended as the most suitable instrument for the unidimensional model and registered nurses in intensive care units, as there is evidence of moderate quality or above for sufficient and relatively comprehensive measurement properties, especially content validity and internal consistency. Furthermore, the Rasch model [63] is ideally suited to address some of the limitations that have been problematic in attempting to measure the related construct of ethical sensitivity [36].
- Ethical Sensitivity Scale: High-quality evidence demonstrated sufficient structural validity, measurement invariance, and construct validity. However, due to indeterminate structural validity and insufficient internal consistency of low-quality evidence, the recommended category for this instrument is B. The Rasch model of nurses’ ethical sensitivity may have the potential to be recommended for various clinical settings.
- Ethical attitudes questionnaire for PHC professionals: This instrument needs to be further validated for structural validity and internal consistency. Note that the CTT model violated unidimensionality when structural validity was assessed using the Rasch model.
- Vignettes: No psychometric properties were reported except for content validity, and the comprehensiveness of the instrument may be insufficient.
- Ethical Sensitivity Scale in Undergraduate Nursing Students (ESS-UNS): Low-quality evidence showed insufficient structural validity and internal consistency. Therefore, the ESS-UNS has the potential to be recommended for measuring ethical sensitivity of nursing students only after it has been refined and its measurement properties are shown to be sufficient by high-quality evidence.
Due to high-quality evidence for insufficient structural validity and internal consistency, Moral Sensitivity Questionnaire for Nursing Students (MSQ-ST), Byrd’s Nurse’s Ethical Sensitivity Test (Byrd’s NEST), and Ethical Sensitivity Scale for Clinical Nurses (ESSCN) should not be recommended.
Discussion
This review systematically assesses studies of the psychometric properties of instruments to measure ethical sensitivity in nursing, the methodological quality of the studies, the quality of the measurement properties, and the grading of the available evidence to formulate recommendations. Twenty-nine studies and 12 instruments based on a reflective model were included in this systematic review. No studies were found in which the cross-cultural validity or responsiveness of included instruments was tested. Only two newly developed instruments, the Ethical Awareness Scale and the Ethical Sensitivity Scale, used Rasch analysis to test their psychometric properties. According to the COSMIN scoring system, two instruments were recommended: the Ethical Sensitivity Questionnaire for Nursing Students (ESQ-NS) and the Ethical Awareness Scale for nurses in intensive care units.
Content validity and internal consistency were considered priority measurement properties [24, 64]. In this systematic review, only one of the included studies [41] could be given at least adequate ratings for the methodological quality and sufficient results of content validity (detailed in Tables 2 and 3). The main reason for the insufficient content validity of most instruments is that comprehensiveness was not addressed, i.e., the method of data collection and whether data saturation was achieved were not clarified [24]; a further reason is that relevance, comprehensiveness, and comprehensibility are assigned the same weight when evaluating content validity [24, 30]. Furthermore, in line with Terwee et al. [23], Prinsen et al. [62], and Gignac et al. [65], we suggest that future studies combine Cronbach’s alpha with McDonald’s omega coefficient (ω) when testing the internal consistency of CTT models, as Cronbach’s alpha has been shown to be biased toward underestimating internal consistency.
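A minimal sketch of the two coefficients follows, using only the Python standard library. Cronbach's alpha is computed from raw item scores; McDonald's omega is computed for a single-factor model from loadings and error variances that are assumed to come from a separately fitted factor model. The function names and the toy numbers are illustrative, not data from the included studies.

```python
from statistics import variance


def cronbach_alpha(items):
    """items: one list of scores per item, respondents in the same order."""
    k = len(items)
    if k < 2:
        raise ValueError("alpha needs at least two items")
    totals = [sum(scores) for scores in zip(*items)]  # per-respondent sums
    item_variance_sum = sum(variance(scores) for scores in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))


def mcdonald_omega(loadings, error_variances):
    """Omega (total) for a one-factor model with uncorrelated errors."""
    common = sum(loadings) ** 2
    return common / (common + sum(error_variances))


# Toy data: three items answered by four respondents.
scores = [[1, 2, 3, 4], [2, 2, 4, 5], [1, 3, 3, 5]]
alpha = cronbach_alpha(scores)

# Assumed standardized loadings; error variance = 1 - loading**2.
loadings = [0.8, 0.6, 0.7]
omega = mcdonald_omega(loadings, [1 - l ** 2 for l in loadings])
```

When all loadings are equal the two coefficients coincide; with unequal loadings alpha falls below omega, which is the underestimation referred to above.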
For the structural validity of CTT models, EFA was the preferred method in the included studies, but several studies (8/26) [38, 39, 48, 50, 53, 54, 60, 61] utilized an inappropriate rotation method (i.e., orthogonal instead of oblique rotation). Orthogonal rotation assumes uncorrelated factors, which conflicts with the correlated dimensions expected under the reflective model. Moreover, EFA alone is inadequate [24, 66]. Unfortunately, only a minority of the studies (5/26) [43, 46, 56, 58, 59] performed CFA, which is a shortcoming in the level of evidence for structural validity. In terms of the CFA estimator, one of the included studies [59] used a robust maximum likelihood (MLR) method; none of the rest reported the estimator. However, recent literature suggests that diagonally weighted least squares (DWLS) may perform uniformly better than MLR in factor loading estimates for ordinal observed variables [67], since the use of MLR assumes that the observed indicators follow a continuous and multivariate normal distribution.
Classical test theory remains the most commonly employed approach for assessing the psychometric properties of instruments, providing evidence of scale-level properties [68], and a total of 9 instruments and 25 studies based on the CTT were included in this study. In contrast, Rasch analysis, which compares samples and items on the same equal scale in a log-transformed format, can offer more thorough data and evidence regarding item- and person-level traits [69], thus compensating for two shortcomings of the CTT: (a) the Likert scale is not equidistant, and (b) the same items may have different meanings on the same scores [70]. Moreover, objective measurement is one of the characteristics of the Rasch model [69], which may be more helpful in distinguishing the degree of ethical sensitivity of different nursing groups. Note that while reporting measurement properties of an instrument based on the IRT, the unidimensionality must not be violated, and a sample size of at least 100 per group is required for measuring invariance [25], but none of the four Rasch model studies [36, 37, 40, 47] included in this study fully met the appropriate sample size.
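The idea that persons and items share one log-transformed scale can be illustrated with the dichotomous Rasch model, in which the log-odds of endorsing an item equal person ability minus item difficulty. This is a simplification: the included instruments use Likert-type items, which would call for a polytomous extension such as a rating scale or partial credit model. Function and variable names are illustrative.

```python
import math


def rasch_probability(theta, difficulty):
    """P(endorse) = exp(theta - b) / (1 + exp(theta - b)), both in logits."""
    return 1.0 / (1.0 + math.exp(-(theta - difficulty)))


def log_odds(p):
    """Convert a probability back to the logit scale."""
    return math.log(p / (1.0 - p))


# A person whose ability equals the item difficulty has a 50% chance.
p_equal = rasch_probability(theta=0.0, difficulty=0.0)

# One logit above the item difficulty: the log-odds recover the difference.
p_above = rasch_probability(theta=1.5, difficulty=0.5)
recovered_gap = log_odds(p_above)
```

Because only the difference between ability and difficulty enters the model, a one-logit gap means the same thing anywhere on the scale, which is the equal-interval property that raw Likert sums lack.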
Given that ethical sensitivity is a latent variable [11], it is challenging to develop a gold standard for measuring ethical sensitivity. None of the five studies that evaluated criterion validity had measurements that met the criterion. Therefore, criterion validity is insufficient.
The Resilience Measurement Scale, the Behavioral Control targeted at Preventing Harm (BCPH) scale, the Ethics Advocacy Scale (EAS), the Moral Disengagement Scale (MDS), and the level of moral reasoning (DIT-N2) were used to test the convergent validity of the included instruments [49]. Authors need to describe the comparator instruments in detail and demonstrate their reliability and validity in the study population, as failure to do so affects both internal and external validity [25, 71]. However, 3/3 studies that tested convergent validity did not report the specific psychometric properties of the comparator instruments, and one of the three referred to psychometric data for the comparator from another population, which again is questionable [43, 71]. Further, without demonstrating the psychometric robustness of the comparator instrument and making specific hypotheses, it is not possible to judge construct validity [71]. None of the studies formulated specific hypotheses about differences between subgroups to test discriminative validity.
Future directions, recommendations and advice
While the aim of several instruments included in this review suggests measuring only one specific property, in-depth profiling of the defined dimensions being measured and reporting comprehensive psychometric properties are essential for instrument selection [24]. Further, cross-cultural validity testing is needed for the recommended models [36, 42] as well as for potentially recommended models with good content validity and internal consistency. It is also important to continuously update statistical data analysis methods for measurement characteristics to reduce bias (e.g. [24, 67]).
The included instruments differ in terms of the complexity and integrity of their items and dimensions. It is necessary to give a clear and operational definition of the construct when selecting instruments [64]. No standardized definition of ethical sensitivity exists, and the included instruments measure the general concept of ethical sensitivity or its different components. Seven of the included instruments were developed in accordance with the construct of ethical sensitivity proposed by Lützén et al. [13]; one, the Ethical Awareness Scale [36], was developed in accordance with a component of ethical sensitivity; one was developed in accordance with the concept proposed by Ersoy and Goz [14]; and another two instruments [40, 41] were constructed based on the concepts defined by Weaver [15]. Overall, all included instruments reported conceptual models, except for the ESS-UNS [48]. Future research is needed to continue integrating and developing an operational definition of ethical sensitivity in nursing and its different components.
New technologies and medical progress increasingly affect nursing globally [72]. Artificial intelligence (AI), robotic medical systems, precision healthcare, and increasing dependence on telehealth and other virtual models of nursing may bring new ethical problems [73], including algorithmic bias and fairness, equity in access to technology, and the ethical implications of robotics and automation. The development of instruments therefore requires a meticulous focus on ethics: integrating established and evolving ethical frameworks, fostering interdisciplinary collaboration, and implementing robust education programs for nurses. Attention to changes in ethical norms, as well as to global standards and regulations, is also essential. These provide the foundational principles and guidelines for the development of instruments.
Importantly, once the construct of interest has been clearly defined, instrument selection should also consider whether the instrument development study was conducted in the target population and whether other studies have tested the psychometric properties of the instrument in that population [24]. The instruments recommended by this systematic review target nursing students (ESQ-NS [42]) and nurses in intensive care units (Ethical Awareness Scale [36]); additional testing of the cross-cultural validity of both instruments is warranted. Further validation of available instruments is needed for nurses in general clinical settings. Given the adequate availability of instruments that can be used with the abovementioned groups, the development of new self-administered instruments for them is not suggested. In particular, the Ethical Sensitivity Scale [40], if its internal consistency and content validity can be further validated in future studies, may address the measurement limitations of the MSQ [13, 39] that stem from insufficient structural validity. Finally, where items need to be based on nursing groups in other special settings (e.g., pediatrics, obstetrics and gynecology, operating room, infectious disease, geriatrics, psychiatry, hospice), the development of new self-administered instruments in accordance with IRT/Rasch models is suggested.
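For readers less familiar with the Rasch model referred to above, a minimal sketch of its dichotomous form may help: the probability of endorsing an item depends only on the difference between the respondent's trait level and the item's difficulty, both expressed on the same logit scale. The function names and values below are hypothetical. Questionnaires with Likert-type responses would typically call for a polytomous extension (e.g., a rating scale or partial credit model), which is not shown here.

```python
import math


def rasch_probability(theta, difficulty):
    """Probability of endorsing an item under the dichotomous Rasch model.

    theta: respondent's trait level (logits)
    difficulty: item difficulty (logits)
    """
    return 1.0 / (1.0 + math.exp(-(theta - difficulty)))


def expected_score(theta, difficulties):
    """Expected raw score: sum of endorsement probabilities over all items."""
    return sum(rasch_probability(theta, b) for b in difficulties)


# Hypothetical item difficulties for a short five-item scale.
item_difficulties = [-1.5, -0.5, 0.0, 0.8, 1.6]

for theta in (-1.0, 0.0, 1.0):
    print(theta, round(expected_score(theta, item_difficulties), 2))
```

When trait level equals item difficulty, the endorsement probability is exactly 0.5, which is what places persons and items on a common scale and makes item difficulty comparable across settings.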
Overall, ethical sensitivity is a multidimensional construct [74], which poses a formidable challenge for assessing construct validity, primarily because there is no universally acknowledged gold standard. Therefore, whether developing novel instruments or undertaking cross-cultural adaptation research, explicit delineation of an ethical sensitivity framework, precise formulation of the initial structural definition, and clear articulation of the various dimensions play vital roles in the initial phases of instrument development. Further related research needs to cover a more comprehensive range of ethical literature, policies, and regulations, and to conduct qualitative research, such as grounded theory [75], to ensure that the emergent themes and the operational definitions are comprehensive and accurate.
Limitations
A potential limitation is the use of the COSMIN checklist to assess methodological quality [32]: the checklist came into effect in 2011 and was updated in 2018, whereas some of the instrument development and translation studies were performed before its publication.
Previous work suggested that instruments measuring this type of psychological construct have reliability limitations. Test–retest reliability was considered inappropriate because the type of questions used may stimulate respondents to reflect on the topic, which in turn may lead to new perspectives or attitudes and cause inconsistent responses across repeated testing. Furthermore, the correlations found in this type of study are not considered stable over time. Thus, reliability testing of ethical sensitivity measurement instruments still needs to be explored.
Conclusions
The present study systematically reviewed instruments measuring ethical sensitivity in nursing in a transparent and standardized way. The findings can contribute to the development and cross-cultural adaptation of ethical sensitivity instruments in nursing, as well as to the evidence-based selection of these instruments.
Availability of data and materials
Supplementary material associated with this article can be found in the online version.
References
Sperling D. Ethical dilemmas, perceived risk, and motivation among nurses during the COVID-19 pandemic. Nurs Ethics. 2021;28(1):9–22.
Goethals S, Gastmans C, de Casterlé BD. Nurses’ ethical reasoning and behaviour: a literature review. Int J Nurs Stud. 2010;47(5):635–50.
Carrese JA, Malek J, Watson K, et al. The essential role of medical ethics education in achieving professionalism: the Romanell Report. Acad Med. 2015;90(6):744–52.
Kollemorten I, Strandberg C, Thomsen B, et al. Ethical aspects of clinical decision-making. J Med Ethics. 1981;7(2):67–9.
Murphy F. International Council of Nurses Ethics in Nursing Practice: a guide to ethical decision making by ST Fry & MJ Johnstone. J Renal Care. 2008;34(4):218–218.
Schroeder R, Morrison E, Cavanaugh C, West M, Montgomery J. Improving communication among health professionals through education: a pilot study. J Health Adm Educ. 1999;17(3):175–98.
Corley MC. Moral distress of critical care nurses. Am J Crit Care. 1995;4(4):280–5.
Gutierrez KM. Critical care nurses’ perceptions of and responses to moral distress. Dimensions Crit Care Nurs. 2005;24(5):229–41.
Morley G, Field R, Horsburgh CC, Burchill C. Interventions to mitigate moral distress: a systematic review of the literature. Int J Nurs Stud. 2021;121:103984.
Chen Q, Su X, Liu S, Miao K, Fang H. The relationship between moral sensitivity and professional values and ethical decision-making in nursing students. Nurse Educ Today. 2021;105:105056.
Milliken A. Nurse ethical sensitivity: an integrative review. Nurs Ethics. 2018;25(3):278–303.
Rest J. New approaches in the assessment of moral judgment. Moral development and behavior. NY: Holt Rinehart & Winston; 1976.
Lützén K, Nordin C. Structuring moral meaning in psychiatric nursing practice. Scand J Caring Sci. 1993;7(3):175–80.
Ersoy N, Göz F. The ethical sensitivity of nurses in Turkey. Nurs Ethics. 2001;8(4):299–312.
Weaver K, Morse J, Mitcham C. Ethical sensitivity in professional practice: concept analysis. J Adv Nurs. 2008;62(5):607–18.
Jaeger SM. Teaching health care ethics: the importance of moral sensitivity for moral reasoning. Nurs Philos. 2001;2(2):131–42.
Jia Y, Chen O, Xiao Z, Xiao J, Bian J, Jia H. Nurses’ ethical challenges caring for people with COVID-19: a qualitative study. Nurs Ethics. 2021;28(1):33–45.
Truog RD, Brown SD, Browning D, et al. Microethics: the ethics of everyday clinical practice. Hastings Cent Rep. 2015;45(1):11–7.
Milliken A. Ethical awareness: what it is and why it matters. Online J Issues Nurs. 2018;23(1):2. https://doi.org/10.3912/OJIN.Vol23No01Man01.
Palazoğlu CA, Koç Z. Ethical sensitivity, burnout, and job satisfaction in emergency nurses. Nurs Ethics. 2019;26(3):809–22.
Rathert C, May DR, Chung HS. Nurse moral distress: a survey identifying predictors and potential interventions. Int J Nurs Stud. 2016;53:39–49.
Schluter J, Winch S, Holzhauser K, Henderson A. Nurses’ moral sensitivity and hospital ethical climate: a literature review. Nurs Ethics. 2008;15(3):304–21.
Terwee CB, Bot SD, de Boer MR, et al. Quality criteria were proposed for measurement properties of health status questionnaires. J Clin Epidemiol. 2007;60(1):34–42.
Prinsen CA, Mokkink LB, Bouter LM, et al. COSMIN guideline for systematic reviews of patient-reported outcome measures. Qual Life Res. 2018;27(5):1147–57.
Mokkink LB, De Vet HC, Prinsen CA, et al. COSMIN Risk of Bias checklist for systematic reviews of patient-reported outcome measures. Qual Life Res. 2018;27(5):1171–9.
Moher D, Altman DG, Liberati A, Tetzlaff J. PRISMA statement. Epidemiology. 2011;22(1):128.
Booth A, Clarke M, Dooley G, et al. The nuts and bolts of PROSPERO: an international prospective register of systematic reviews. Syst Rev. 2012;1(1):1–9.
Terwee CB, Jansma EP, Riphagen II, de Vet HC. Development of a methodological PubMed search filter for finding studies on measurement properties of measurement instruments. Qual Life Res. 2009;18(8):1115–23.
Gutiérrez-Sánchez D, Gómez-García R, Cuesta-Vargas AI, Pérez-Cruzado D. The suffering measurement instruments in palliative care: a systematic review of psychometric properties. Int J Nurs Stud. 2020;110:103704.
Osmancevic S, Schoberer D, Lohrmann C, Großschädl F. Psychometric properties of instruments used to measure the cultural competence of nurses: a systematic review. Int J Nurs Stud. 2021;113:103789.
Guyatt GH, Oxman AD, Vist GE, et al. GRADE: an emerging consensus on rating quality of evidence and strength of recommendations. BMJ. 2008;336(7650):924–6.
Belayneh T, Gebeyehu A, Adefris M, Rortveit G. A systematic review of the psychometric properties of the cross-cultural adaptations and translations of the Prolapse Quality of Life (P-QoL) questionnaire. Int Urogynecol J. 2019;30:1989–2000.
de Freitas GR, Abou L, de Lima A, Rice L, Ilha J. Measurement properties of clinical instruments for assessing manual wheelchair mobility in individuals with spinal cord injury: a systematic review. Arch Phys Med Rehabil. 2022;104(4):656–72.
Comrie RW. Identifying and measuring baccalaureate and graduate nursing students’ moral sensitivity: Southern Illinois University at Carbondale; 2005.
Byrd LM. Development of an instrument to identify the virtues of expert nursing practice: Byrd’s Nurses Ethical Sensitivity Test (Byrd’s NEST): the University of Southern Mississippi; 2006.
Milliken A, Ludlow L, DeSanto-Madeya S, Grace P. The development and psychometric validation of the Ethical Awareness Scale. J Adv Nurs. 2018;74(8):2005–16.
Milliken A, Ludlow L, Grace P. Ethical awareness scale: replication testing, invariance analysis, and implications. AJOB Empirical Bioethics. 2019;10(4):231–40.
Lützén K, Nordström G, Evertzon M. Moral sensitivity in nursing practice. Scand J Caring Sci. 1995;9(3):131–8.
Lützén K, Dahlqvist V, Eriksson S, Norberg A. Developing the concept of moral sensitivity in health care practice. Nurs Ethics. 2006;13(2):187–96.
González-de Paz L, Kostov B, Sisó-Almirall A, Zabalegui-Yárnoz A. A Rasch analysis of nurses’ ethical sensitivity to the norms of the code of conduct. J Clin Nurs. 2012;21(19–20):2747–60.
González-de Paz L, Devant-Altimir M, Kostov B, Mitjavila-López J, Navarro-Rubio MD, Sisó-Almirall A. A new questionnaire to assess endorsement of normative ethics in primary health care: development, reliability and validity study. Fam Pr. 2013;30(6):724–33.
Muramatsu T, Nakamura M, Okada E, Katayama H, Ojima T. The development and validation of the Ethical Sensitivity Questionnaire for Nursing Students. BMC Med Educ. 2019;19(1):1–8.
Takizawa M, Ota K, Maeda J. Development of a questionnaire to measure the moral sensitivity of nursing students. Nagoya J Med Sci. 2021;83(3):477.
Nora CRD, Zoboli E, Vieira MM. Validation by experts: importance in translation and adaptation of instruments. Revista Gaúcha de Enfermagem. 2017;38(3):e64851.
Tosun H. Moral Sensitivity Questionnaire (MSQ): Turkish adaptation of the validity and reliability. J Contemp Med. 2018;8(4):316–21.
Huang FF, Yang Q, Zhang J, Zhang QH, Khoshnood K, Zhang JP. Cross-cultural validation of the Moral Sensitivity Questionnaire-Revised Chinese Version. Nurs Ethics. 2016;23(7):784–93.
González-de Paz L, Kostov B, López-Pina JA, Zabalegui-Yárnoz A, Navarro-Rubio MD, Sisó-Almirall A. Ethical behaviour in clinical practice: a multidimensional Rasch analysis from a survey of primary health care professionals of Barcelona (Catalonia, Spain). Qual Life Res. 2014;23(10):2681–91.
Macale L, Scialò G, Masi P, et al. Development of the Ethical Sensitivity Scale in undergraduate nursing students. Prof Inferm. 2015;68(4):244–50.
Kuilman L, Jansen GJ, Mulder LB, Middel B, Roodbol PF. Re-assessing the validity of the Moral Sensitivity Questionnaire (MSQ): two new scales for moral deliberation and paternalism. J Eval Clin Pract. 2020;26(2):659–69.
Dalla Nora CR, Zoboli EL, Vieira MM. Validation of a Brazilian version of the moral sensitivity questionnaire. Nurs Ethics. 2019;26(3):823–32.
Nakamura M, Ishikawa M, Hiejima S. Examination of the reliability and the validity of Moral Sensitivity Test (Japanese version)(the 1st). Univ Yamanashi Acad Repository. 2000;17:52–7.
Nakamura M, Nishida F, Hiejima YS, Ishikawa M, Date K, Nishida Y. Examination of the reliability and the validity of the Moral Sensitivity Test (Japanese Version)(2nd Report). Niigata Seiryo University. 2001;18:41.
Han S-S, Kim J, Kim Y-S, Ahn S. Validation of a Korean version of the Moral Sensitivity Questionnaire. Nurs Ethics. 2010;17(1):99–105.
Bayoumy HMM, Halabi JO, Esheaba OM. Translation, cultural adaptation, validity and reliability of the moral sensitivity questionnaire for use in Arab countries. Saudi J Health Sci. 2017;6(3):151.
Maeda J, Konishi E. Development and validation of a Japanese version of the revised moral sensitivity questionnaire: a preliminary study. J Jpn Nurs Ethics. 2012;4(1):32–7.
Jiménez-Herrera MF, Font-Jimenez I, Bazo-Hernández L, Roldán-Merino J, Biurrun-Garrido A, Hurtado-Pardos B. Moral sensitivity of nursing students. Adaptation and validation of the moral sensitivity questionnaire in Spain. Plos one. 2022;17(6):e0270049.
Yilmaz Sahin S, Iyigun E, Acikel C. Validity and reliability of a Turkish version of the modified Moral Sensitivity Questionnaire for Student Nurses. Ethics Behav. 2015;25(4):351–9.
Yu H, Tong T, Gao Y, Zhang H, Tong H, Liang C. Reliability and validity evaluation of the chinese version of the Ethical Sensitivity Questionnaire for Nursing Students. BMC Nurs. 2021;20(1):1–10.
Shengnan W, Zhaobin J, Pingping D, Xiumu Y. Chinesization and validity test of Ethical Sensitivity Questionnaire for Nursing Students. J Bengbu Med College. 2022;47(5):692–5.
Min HY, Kim YJ, Lee JM. Validity and reliability of the Korean version of the Ethical Sensitivity Questionnaire for Nursing Students. J Kor Acad Comm Health Nurs. 2020;31(4):503–13.
Joung M-Y, Seo JM. Development of an Ethical Sensitivity Scale for Clinical Nurses. J Korean Acad Fundamentals Nurs. 2020;27(4):375–86.
Prinsen CA, Vohra S, Rose MR, et al. How to select outcome measurement instruments for outcomes included in a “core outcome set”–a practical guideline. Trials. 2016;17(1):1–10.
Rasch G. Probabilistic models for some intelligence and attainment tests: ERIC. 1993.
Terwee CB, Prinsen C, Chiarotto A, et al. COSMIN methodology for assessing the content validity of PROMs–user manual. Amsterdam: VU University Medical Center; 2018.
Gignac GE, Reynolds MR, Kovacs K. Digit Span subscale scores may be insufficiently reliable for clinical interpretation: distinguishing between stratified coefficient alpha and omega hierarchical. Assessment. 2019;26(8):1554–63.
Terwee CB, Mokkink LB, Knol DL, Ostelo RW, Bouter LM, de Vet HC. Rating the methodological quality in systematic reviews of studies on measurement properties: a scoring system for the COSMIN checklist. Qual Life Res. 2012;21(4):651–7.
Li C-H. Confirmatory factor analysis with ordinal data: comparing robust maximum likelihood and diagonally weighted least squares. Behav Res Methods. 2016;48(3):936–49.
Stolt M, Kottorp A, Suhonen R. The use and quality of reporting of Rasch analysis in nursing research: a methodological scoping review. Int J Nurs Stud. 2022:104244.
Bond TG, Fox CM. Applying the Rasch model: fundamental measurement in the human sciences. 2nd ed. Psychology Press; 2007. https://doi.org/10.4324/9781410614575.
Bortolotti SLV, Tezza R, de Andrade DF, Bornia AC, de Sousa Júnior AF. Relevance and advantages of using the item response theory. Qual Quant. 2013;47(4):2341–60.
Dambi JM, Corten L, Chiwaridzo M, Jack H, Mlambo T, Jelsma J. A systematic review of the psychometric properties of the cross-cultural translations and adaptations of the Multidimensional Perceived Social Support Scale (MSPSS). Health Qual Life Outcomes. 2018;16(1):1–19.
Booth RG, Strudwick G, McBride S, O’Connor S, López Solano AL. How the nursing profession should adapt for a digital future. BMJ. 2021;373:n1190. https://doi.org/10.1136/bmj.n1190.
Fernández Fernández JL. Ethical considerations regarding biases in algorithms. 2022.
Kraaijeveld MI, Schilderman J, van Leeuwen E. Moral sensitivity revisited. Nurs Ethics. 2021;28(2):179–89.
Mohajan D, Mohajan HK. Constructivist grounded theory: a new research approach in social science. Res Adv Educ. 2022;1(4):8–16.
Acknowledgements
The authors thank the research librarian for reviewing the search strategies.
Funding
The study was supported by the Yunnan University of Chinese Medicine Research Program (BE22047).
Contributions
ELC, LZ, and YMW conceived the study idea, and LZ was responsible for developing and writing the first draft of the systematic review protocol and manuscript. GL and LW contributed to data curation. ELC, LZ, YMW, LXB, GL, and LW provided critical insights at all stages. All authors approved and contributed to the final manuscript.
Ethics declarations
Ethics approval and consent to participate
Not applicable.
Competing interests
The authors declare that they have no competing interests.
Supplementary Information
Additional file 1: Supplementary material file 1: Table 1. Search strategy.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
About this article
Cite this article
Zhou, L., Bi, L., Wu, Y. et al. The psychometric properties of instruments measuring ethical sensitivity in nursing: a systematic review. Syst Rev 13, 87 (2024). https://doi.org/10.1186/s13643-024-02473-9
DOI: https://doi.org/10.1186/s13643-024-02473-9