Journal of Nursing Education


Using Feedback to Reduce Students' Judgment Bias on Test Questions

Laura T Flannelly, PhD, RN

Abstract


Judgment bias represents a common tendency of people to inaccurately gauge the extent of their own knowledge. While research has shown that people tend to overestimate their knowledge on hard questions and underestimate their knowledge on easy questions, overconfidence poses a more pernicious problem from an educational perspective since it can undermine students' ability to monitor their own learning effectively and interfere with their test performance. The findings of the present study show that students who perform poorly on a test are more overconfident about their answers to hard test questions, especially those they answer wrongly, than are students who perform better. The results also show that judgment bias can be reduced by providing feedback to students about their prior performance and confidence on specific test questions. This intervention was found to be effective in decreasing both underconfidence on easy questions and overconfidence on hard questions regardless of students' performance level.


A number of educators and researchers have discussed the role of confidence in nursing practice, and the importance of critical thinking in promoting confidence in clinical reasoning and decision-making (Copeland, 1990; Grundy, 1993; Haffer & Raingruber, 1998; Seldomridge, 1997). Complementary research has revealed a tendency among physicians, nurses, and other health professionals to be overconfident in their clinical judgment, which may be caused by a lack of critical thinking (Arkes, 1981; Bauman, Deber, & Thompson, 1991; Dawes, Faust, & Meehl, 1989).

Overconfidence in clinical judgment seems to reflect a general tendency of people to be overconfident about the accuracy of their knowledge and judgments (Brenner, Koehler, Liberman, & Tversky, 1996; Shanteau, 1992). Before one can reduce overconfidence in the clinical setting, it may be necessary to reduce it, first, in the educational setting, where future clinicians receive their training. The latter is the focus of the present paper.

Of particular concern from an educational perspective are findings that students overestimate their knowledge of materials they have read or studied (Glenberg, Wilkinson, & Epstein, 1982; Maki, Foley, Kajer, Thompson, & Willert, 1990; Morris, 1990). These findings are disturbing for at least two reasons. The first is that overconfidence is related to poor test performance (Flannelly, 1998; Zakay & Glicksohn, 1992). Students' overconfidence about their knowledge of an area of study has consistently been observed regardless of the type of test or testing methods used (Flannelly, 1998; Glenberg, Sanocki, Epstein, & Morris, 1987; Pressley & Ghatala, 1988; Pressley, Ghatala, Woloshyn, & Pirie, 1990; Zakay & Glicksohn, 1992). As Fischhoff, Slovic, and Lichtenstein (1977) put it, people are "wrong too often when they are certain they are right" (p. 552).

The second, broader concern raised by these findings is that they undermine the common assumption that students are able to monitor their own learning accurately (Pressley & Ghatala, 1990). If students do not monitor their own learning accurately, they do not know what they do not know. Consequently, they will not know whether to spend more time on one study strategy rather than another, or even realize that they should read topical materials again (Glenberg et al., 1982; Pressley & Ghatala, 1988, 1990). Glenberg et al. (1982, p. 82) expressed the concern that overconfidence represents an "illusion of knowing" that undermines this process. As Maki, Jonas, and Kallod (1994) reported, more effective learners are more accurate in their estimates of how much they learned from the materials they read.

MEASURING JUDGMENT BIAS

Proponents of confidence as a meaningful measure of subjective certainty have used various scales to measure it. In all of their studies, Adams and Adams (1961) set the top of the scale at 100% confidence, but the lower limit of the scale was usually dictated by the probability of getting an item correct by chance. If a test item contained only two options, participants were not permitted to give a confidence rating below 50%, the probability they would get it correct even if they did not know the answer. If the item contained four options, the lowest permissible confidence score was 25%. As a general rule, then, the lower limit of a scale was determined by the formula 100/K, where K is the number of options from which a participant can choose. Later researchers tended to follow this rule for measuring confidence on multiple-choice items (Fischhoff et al., 1977; Lichtenstein & Fischhoff, 1977, 1980), although Fischhoff et al. (1977) used a 0%-100% confidence scale for open-ended questions (e.g., sentence completion/fill-in).
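Expressed in code, the rule makes the scale floors concrete. The following minimal sketch (the function name is illustrative, not drawn from the cited studies) computes the lower limit for an item with K options:

```python
def confidence_floor(k: int) -> float:
    """Lowest meaningful confidence rating (%) for an item with k options.

    Blind guessing among k options succeeds with probability 1/k, so a
    rating below 100/k would understate even pure chance.
    """
    return 100.0 / k

# Two-option (e.g., true/false) items floor at 50%; four-option items at 25%.
print(confidence_floor(2))  # 50.0
print(confidence_floor(4))  # 25.0
```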

Some researchers have used discrete rather than continuous scales to measure confidence. For example, Pressley and Ghatala (1988) used a 9-point rating scale of confidence; whereas Pressley and Ghatala (1990) and Pressley, Ghatala, Woloshyn, and Pirie (1990) used a 7-point scale; and Zeleznik, Hojat, Goepp, Amadio, Kowlessar, and Borenstein (1988) used a 5-point scale (0%, 25%, 50%, 75%, 100%).

Regardless of the number of points on the scale, in these studies, each point represented a percentage measure of certainty or confidence. More recent research has favored the use of continuous 0%-100% scales to measure confidence regardless of the number of answer options (e.g., Pulford & Colman, 1997; Schraw & Roedel, 1994; Zakay & Glicksohn, 1992).

Although measures of confidence have varied widely, the standard of performance in all these studies was the same (i.e., the percentage of correct answers given on a test). Likewise, the measure of calibration or judgment bias in all these studies was the same: the difference between confidence and performance. In general, then, the differences among the measures have more to do with their measurement of confidence than with their measurement of performance, which was scored dichotomously, as right or wrong answers.

In all these studies, the difference between participants' subjective statements of confidence and their objective performance on each question was used as a measure of judgment bias or calibration. A person was described as overconfident to the degree that his average confidence was higher than his performance. Conversely, a person was described as underconfident to the degree that his average confidence was lower than his performance (Lichtenstein & Fischhoff, 1977, 1980; Schraw & Roedel, 1994; Soll, 1996). Despite the methodological differences among the studies, their findings are very consistent.

Overconfidence and Underconfidence

An extensive body of research has demonstrated that people are often overconfident about what they know when answering questions (see review by McClelland & Bolger, 1994). However, people may exhibit underconfidence, rather than overconfidence, depending on the difficulty of the questions asked (Griffin & Tversky, 1992). Research has consistently shown that people are underconfident about the accuracy of their answers to easy questions and overconfident about the accuracy of their answers to hard questions. While this effect was first reported by Lichtenstein and Fischhoff (1977), only recently has it received considerable attention (Brenner et al., 1996; Pulford & Colman, 1997; Schraw & Roedel, 1994; Soll, 1996).

The prevalence of overconfidence appears to pose a serious obstacle to effective learning, problem-solving, and decision-making. It is surprising, therefore, how few researchers have examined ways to reduce it.

Reducing Judgment Bias

Several studies have attempted to reduce judgment bias, with varying degrees of success, by providing feedback to subjects about the accuracy of their confidence ratings. Paese and Sniezek (1991) and Pulford and Colman (1997) found no effect of feedback on bias on general knowledge questions. Glenberg et al. (1987) and Lichtenstein and Fischhoff (1980) provided limited evidence that feedback reduced overconfidence on a subsequent test, although this occurred only when the specific items on the two tests were nearly identical. Lichtenstein and Fischhoff (1980) provided only nonspecific feedback, in that subjects were told how their overall confidence compared to their overall performance, but not which questions they had answered rightly or wrongly. Arkes, Christensen, Lai, and Blumer (1987) reported that overconfidence on general knowledge questions could be decreased significantly when subjects were told how they performed on specific questions.

The present study examined judgment bias about the accuracy of students' answers to test questions in relation to their overall test performance. Specifically, the study was designed to measure the effectiveness of feedback from a practice test (pretest) in reducing bias on hard and easy questions on a subsequent test (post-test) by students who performed well or poorly on the post-test (high and low performers). It was hypothesized that: (1) feedback would decrease overconfidence on hard questions and underconfidence on easy questions; and (2) these decreases in bias would be more pronounced for students who performed well on the test than for those who performed poorly.

METHODS

Participants

The participants were 66 senior-year, baccalaureate nursing students enrolled in an undergraduate course in psychiatric-mental health nursing: 57 (86.4%) female and 9 (13.6%) male. The ethnic composition of the sample was 33 students of European ancestry (50.0%), 32 students of Asian/Pacific Islands ancestry (48.5%), and 1 student of African ancestry (1.5%). All the students who participated in the study gave their informed consent. No student in the course declined to participate in the study.

Materials

Both the pre- and post-test instruments consisted of 28 multiple-choice questions in the field of psychiatric-mental health nursing. Each test question was presented with a choice of four possible answers from which the students were instructed to select the one they thought was correct. Beneath each question was a box for students to indicate their choice of answer, and a second box in which to indicate their confidence that the answer they chose was correct.

The instructions were similar to those used by Zakay and Glicksohn (1992) and Pulford and Colman (1997). Students were given verbal and written instructions to choose the correct answer to each question and rate their confidence that the answer they chose was correct. They were instructed to rate their confidence on a scale of 0% to 100%, where 0% meant they were "Not at all certain that this answer is the correct answer" to the question, and 100% meant they were "Certain that this is the correct answer" to the question.

The questions used on both of the tests were selected from a pool of test items that were developed for the course over the years. The questions on both tests covered the same content areas. Hence, the questions were similar but not identical. The instructions for both tests were identical.

The content validity of the practice and actual tests was established by three instructors in psychiatric-mental health nursing who helped to develop the tests. The tests were similar to others used in the course in preceding years. Although no attempt was made to measure the criterion validity of these specific tests, correlations of .55 to .65 between students' performance on comparable tests in the same course and their performance on the NLN Achievement Examination on Nursing Care in Mental Health and Mental Illness indicate the criterion validity of the tests is fairly good (Kubiszyn & Borich, 1984). The internal consistency of the test was assessed by applying the Kuder-Richardson formula (KR-20) to the raw-score performance data (Ferguson & Takane, 1989), yielding a reliability coefficient of .62.
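For readers who wish to reproduce this reliability check, the sketch below shows how KR-20 can be computed from a students-by-items matrix of dichotomous (0/1) scores. The data are simulated and the names are illustrative; this is a sketch of the standard formula, not the study's own analysis code:

```python
import numpy as np

def kuder_richardson_20(scores: np.ndarray) -> float:
    """KR-20 internal consistency for a (students x items) matrix of 0/1 scores."""
    n_items = scores.shape[1]
    p = scores.mean(axis=0)                          # proportion correct per item
    q = 1.0 - p                                      # proportion incorrect per item
    total_variance = scores.sum(axis=1).var(ddof=1)  # variance of students' total scores
    return (n_items / (n_items - 1)) * (1.0 - (p * q).sum() / total_variance)

# Simulated data shaped like the study: 66 students x 28 items.
rng = np.random.default_rng(0)
demo_scores = (rng.random((66, 28)) < 0.78).astype(int)
print(round(kuder_richardson_20(demo_scores), 2))
```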

Procedures

The students in both conditions participated in a 2-hour course review one week prior to taking the post-test. Students in the experimental condition (n = 36) were given a practice test as part of this review session. After taking the practice test, the students in the experimental group were given the answer key to provide them with feedback about their confidence and performance. Performance on the practice test was not scored. Students in the control condition (n = 30) reviewed the same course content, but did not take the practice test and, hence, were not given feedback.

Three measures were directly taken or calculated for the actual test: performance, confidence, and bias. Performance is, simply, the percent of questions answered correctly. Students' confidence in their answers was measured by the likelihood rating they assigned to their choice of answer on each test question. The calculation of judgment bias is explained below.

Item analysis was used to group the 28 questions of the actual test into two categories on the basis of their difficulty level. "Difficulty level is usually defined as the proportion of students responding correctly to an item. The higher this proportion is, the easier the item is" (Sax, 1974, p. 239). All 28 questions were first ranked by difficulty and their rankings listed from high (easy) to low (hard). The list was then split in half to produce equal numbers of questions in each category: 14 hard and 14 easy items. Students were similarly divided into high and low performance on the basis of their scores on the actual test.
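A minimal sketch of this item analysis, again assuming a hypothetical students-by-items matrix of 0/1 scores (the function and data are ours, not the study's):

```python
import numpy as np

def split_by_difficulty(scores: np.ndarray):
    """Median-split items into easy and hard halves.

    Difficulty level is the proportion of students answering an item
    correctly; the higher the proportion, the easier the item (Sax, 1974).
    """
    difficulty = scores.mean(axis=0)       # proportion correct per item
    order = np.argsort(difficulty)[::-1]   # item indices, easiest first
    half = scores.shape[1] // 2
    return order[:half], order[half:]      # (easy item indices, hard item indices)

# e.g.: easy_items, hard_items = split_by_difficulty(demo_scores)
```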

The test was only moderately difficult: difficulty levels ranged from 87.9% to 98.5% for easy items and from 34.9% to 86.4% for hard items. Mean performance was 93.3% (SD = 7.5) on easy questions and 70.7% (SD = 16.1) on hard questions. The mean performance of students in the low group was 74.6% (SD = 7.5), whereas that of students in the high group was 89.6% (SD = 4.9).

Bias was calculated separately for hard and easy questions. For each participant, mean confidence (expressed as a percent) and performance (percent correct) were first computed on easy items and on hard items. Judgment bias on easy items was then calculated by subtracting performance on easy items from mean confidence on easy items; judgment bias on hard items was calculated in the same way from the hard-item scores. A positive difference between confidence and performance indicated positive bias, or overconfidence, whereas a negative difference indicated negative bias, or underconfidence (Lichtenstein & Fischhoff, 1977, 1980; Schraw & Roedel, 1994).

This method of measuring judgment bias is the same as that used by Schraw and Roedel (1994) and Zakay and Glicksohn (1992). It is comparable to the methods used by Glenberg et al. (1987), Lichtenstein and Fischhoff (1977, 1980), Paese and Sniezek (1991), and other researchers.
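To make the calculation concrete, here is a short sketch under the definitions above; the arrays and names are hypothetical and continue the earlier examples:

```python
import numpy as np

def judgment_bias(scores: np.ndarray, confidence: np.ndarray, items) -> np.ndarray:
    """Per-student judgment bias (percentage points) on a subset of items.

    scores:     (students x items) 0/1 correctness matrix
    confidence: (students x items) ratings on the 0%-100% scale
    items:      indices of the item subset (e.g., the easy or the hard half)

    Bias = mean confidence - percent correct; positive values indicate
    overconfidence, negative values underconfidence.
    """
    performance = scores[:, items].mean(axis=1) * 100.0   # percent correct
    mean_confidence = confidence[:, items].mean(axis=1)   # mean confidence (%)
    return mean_confidence - performance

# e.g.: bias_easy = judgment_bias(scores, confidence, easy_items)
#       bias_hard = judgment_bias(scores, confidence, hard_items)
```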

Design and Analyses

Mean confidence and bias were analyzed separately in a 2 (experimental vs. control) × 2 (high vs. low performance) × 2 (hard vs. easy questions) repeated measures analysis of variance (ANOVA), with the first two factors between subjects and the third within subjects (Edwards, 1985; Keppel, 1973). A second, similar analysis was conducted to examine students' confidence on the questions they answered correctly and incorrectly, using a 2 (experimental vs. control) × 2 (high vs. low performance) × 2 (right vs. wrong answers) ANOVA design.
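The exact ANOVA can be run in any standard statistics package. As a rough open-source analogue (not the procedure the study used), a mixed linear model with a random intercept per student accommodates the two between-subjects factors and the repeated difficulty factor. A sketch on simulated data:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Simulated long-format data: one row per student per difficulty level
# (the within-subject factor); all values are made up for illustration.
rng = np.random.default_rng(1)
n_students = 66
student = np.repeat(np.arange(n_students), 2)
difficulty = np.tile(["easy", "hard"], n_students)
feedback = np.repeat(rng.choice(["experimental", "control"], n_students), 2)
performance = np.repeat(rng.choice(["high", "low"], n_students), 2)
bias = rng.normal(0.0, 8.0, 2 * n_students) + np.where(difficulty == "hard", 10.0, -5.0)
df = pd.DataFrame({"student": student, "difficulty": difficulty,
                   "feedback": feedback, "performance": performance, "bias": bias})

# A random intercept per student accounts for the repeated measurements;
# the crossed fixed effects mirror the 2 x 2 x 2 factorial structure.
model = smf.mixedlm("bias ~ feedback * performance * difficulty",
                    data=df, groups=df["student"])
print(model.fit().summary())
```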

RESULTS

Table 1 presents the mean judgment bias and confidence scores of students in each feedback condition on hard and easy items. Since no significant interaction of performance and feedback was found for either measure (Tables 2 and 3), performance effects are presented separately in Table 4.

As hypothesized, a significant interaction of experimental (i.e., feedback) condition and item difficulty was found for bias, F(1,62) = 11.28, p < .01. Students who received feedback exhibited less overconfidence on hard questions and less underconfidence on easy questions (Table 1). This result indicates that feedback helped students to make more realistic judgments about their knowledge when answering the questions on the test. It appears that feedback decreased students' confidence that they were right on hard questions and increased their confidence on easy questions, since a significant interaction of treatment and item difficulty was also found for confidence, F(1,62) = 11.37, p < .01. This decreased the disparity between confidence and actual performance on each item. As expected, given previous research, a significant main effect of item difficulty was found for both judgment bias, F(1,62) = 73.42, p < .001, and confidence, F(1,62) = 75.60, p < .001.

Feedback did not differentially affect the judgment bias of the high and low performance groups, as had been hypothesized. However, a significant interaction of item difficulty and performance was found for bias, F(1,62) = 29.26, p < .001. Regardless of experimental condition, students who performed lower on the test were overconfident that their answers to hard questions were correct, whereas students who performed better were underconfident about their answers to hard questions (Table 4). The two groups showed similar levels of judgment bias on easy questions. Overall, confidence was significantly lower among students with low performance, F(1,62) = 4.10, p < .05, whereas their bias was significantly higher, F(1,62) = 8.61, p < .01.

Table 5 presents the ANOVA summary table of the effects of feedback, performance, and right and wrong answers (correctness) on confidence. The results indicate that the difference in confidence on hard questions evident in Tables 1 and 4 arises partly because low-performing students were more confident they were correct when, in fact, they were wrong (mean confidence = 60.2%, SD = 19.6), than were better students (mean confidence = 55.0%, SD = 24.0).

TABLE 1

Mean Bias and Confidence of Experimental and Control Groups on Hard and Easy Items

Conversely, low-performing students exhibited somewhat lower mean confidence on questions they answered correctly (81.2%, SD = 12.7) than did high-performing students (84.6%, SD = 9.2). Analysis of variance revealed a significant main effect of correctness of answers (right vs. wrong), F(1,62) = 99.87, p < .001, and a significant interaction of performance and correctness, F(1,62) = 4.14, p < .05. The interaction reflects that low-performing students were more confident of their wrong answers and less confident of their right answers than students who performed better.

DISCUSSION

The results of the present study confirm previous research that providing students with feedback about the accuracy of their confidence and performance helps them to be more realistic in their assessments of their own knowledge when taking a test (Arkes et al., 1987; Glenberg et al., 1987; Lichtenstein & Fischhoff, 1980). However, the nature of the feedback used in the present study may limit the generalization of the findings. Since feedback was indirect, in that it relied on the self-initiative of students to use it as they saw fit, more direct feedback might have had a more pronounced effect (Arkes et al., 1987). Nevertheless, all students appeared to benefit to some degree from the feedback they did receive.

TABLE 2

ANOVA Summary Table of the Effects of Feedback, Performance, and Item Difficulty on Bias

TABLE 3

ANOVA Summary Table of the Effects of Feedback, Performance, and Item Difficulty on Confidence

Several researchers have proposed that judgment bias largely results from a failure to properly weigh evidence and consider alternatives before making a decision (Griffin & Tversky, 1992; Koriat, Lichtenstein, & Fischhoff, 1980; McKenzie, 1997). In this sense, judgment bias appears to be related to, and may result from, a phenomenon called the feeling-of-knowing (Koriat, 1993). According to Koriat, people have a feeling of knowing the answer to a question "when some answer (any answer) comes to mind" (Koriat, 1993, p. 614). A feeling of knowing can occur even when incomplete information is retrieved from memory, and it does not depend on the correctness of the answer.

Since the feeling of knowing is based on the accessibility of memory rather than its accuracy, judgment bias may be accentuated in multiple-choice tests, where the options for each question provide recall cues for wrong, as well as right, answers. The feeling of knowing accounts, in part, for students' belief that their first answer to a question is usually correct (Gaskins, Dunn, Forte, Wood, & Riley, 1996; Ramsey, Ramsey, & Barnes, 1987). Unfortunately, it provides students with a false sense of confidence that can keep them from changing their answer, even though changing answers has been repeatedly shown to improve performance (Gaskins et al., 1996), especially when confidence in the new answer is high (Ramsey et al., 1987). Since judgment bias may be enhanced on multiple-choice questions, the use of such items may be viewed as a limitation of the present study. If other kinds of questions are used, judgment bias may not be as pronounced.

TABLE 4

Mean Bias and Confidence on Hard and Easy Items by Students with High and Low Test Performance

The nature of multiple-choice tests may further contribute to overconfidence because of a phenomenon called anchoring (Block & Harper, 1991; Tversky & Kahneman, 1974). Anchoring, which is a tendency to become fixated on one of a number of listed alternatives, has been found to increase overconfidence (Block & Harper, 1991). If a student has a feeling of knowing an answer because they remember reading something like it in a book or hearing something like it in class, they may become anchored to that answer. A limited or superficial knowledge of the subject area might make such hasty judgments more likely. Since one of the alternative answers to a multiple-choice item has to be correct, students may be more overconfident that their answers are correct on this kind of test than they would be if they had to provide the answer themselves (i.e., fill-in answers). Again, the use of multiple-choice questions in the present study may be viewed as a limitation, since multiple-choice questions tend to increase anchoring and, therefore, enhance judgment bias.

The present results agree with those of Zakay and Glicksohn (1992) that students who perform more poorly on tests of domain knowledge tend to be more overconfident about their performance. Zakay and Glicksohn (1992) interpreted their findings to mean that overconfident students made more mistakes because they did not look as critically at their answers as better students did. Indeed, compared to the students in the present study who performed better, the students who performed poorly were more certain their answers were right, when, in fact, they were wrong. Mistakenly believing they were right on a given question, they presumably did not give further attention to the question or further consideration to other possible answers to the question.

TABLE 5

ANOVA Summary Table of the Effects of Feedback, Performance, and Correctness of Answers on Confidence

Gaskins et al. (1996) reported that most answers are changed when students reconsider a question, and that most answer changes benefit the student. However, students usually reconsider an answer only when they feel uncertain about it.

Clearly, overconfidence poses a serious problem for students, since the ability to recognize that an answer may be incorrect is a prerequisite for reprocessing the question (Gaskins et al., 1996; Pressley & Ghatala, 1988; Ramsey et al., 1987). Zakay and Glicksohn (1992) recommended that students' performance should be enhanced through training to optimize their decision-making strategies (Zakay, 1985) and make them more "testwise." This is certainly a useful recommendation, but the implications of judgment bias are broader than test-taking, per se.

Students must be able to determine their degree of knowledge accurately while studying, so they can decide what actions are needed to remedy gaps in their knowledge and to correct misunderstandings. Hence, continuous and accurate monitoring of one's knowledge is essential for choosing effective study strategies. "Delusions about performance" (Pressley & Ghatala, 1988, p. 454) encompassed in overconfidence undermine this process (Glenberg et al., 1982; Pressley & Ghatala, 1990; Pressley et al., 1990).

Judgment bias reflects a lack of critical thinking to the extent that it arises from inadequate self-reflection, a failure to recognize and examine assumptions, and an unwillingness to adequately evaluate and weigh evidence (Chubinski, 1996; Flannelly & Inouye, 1998; Kataoka-Yahiro & Saylor, 1994).

Nurse researchers and educators have made a number of recommendations for fostering critical thinking in clinical (Conger & Mezza, 1996) and classroom settings (Elliott, 1996), including different kinds of teaching strategies (Inouye & Flannelly, 1998; Rossignol, 1997), and various classroom and study exercises (Abegglen & O'Neill Conger, 1997; Chubinski, 1996; Miller, 1996; Neill, Lachat, & Taylor-Panek, 1997). Extensive research on judgment bias illustrates the need to foster critical thinking in test situations, as well.

REFERENCES

  • Abegglen, J., & O'Neill Conger, C. (1997). Critical thinking in nursing: Classroom tactics that work. Journal of Nursing Education, 36(10), 452-458.
  • Adams, J.K., & Adams, P.A. (1961). Realism of confidence judgments. Psychological Review, 68(1), 33-45.
  • Arkes, H.R. (1981). Impediments to accurate clinical judgment and possible ways to minimize their impact. Journal of Consulting and Clinical Psychology, 49(3), 323-330.
  • Arkes, H.R., Christensen, C., Lai, C., & Blumer, C. (1987). Two methods of reducing overconfidence. Organizational Behavior and Human Decision Processes, 39, 133-144.
  • Bauman, A.O., Deber, R.B., & Thompson, G.G. (1991). Overconfidence among physicians and nurses: The "micro-certainty, macro-certainty" phenomenon. Social Science and Medicine, 32(2), 167-174.
  • Block, R.A., & Harper, D.R. (1991). Overconfidence in estimation: Testing the anchoring-and-adjustment hypothesis. Organizational Behavior and Human Decision Processes, 49, 188-207.
  • Brenner, L.A., Koehler, D.J., Liberman, V., & Tversky, A. (1996). Overconfidence in probability and frequency judgments: A critical examination. Organizational Behavior and Human Decision Processes, 65(3), 212-219.
  • Chubinski, S. (1996). Creative critical-thinking strategies. Nurse Educator, 21(6), 23-27.
  • Conger, M.M., & Mezza, I. (1996). Fostering critical thinking in nursing students in the clinical setting. Nurse Educator, 22(3), 11-15.
  • Copeland, L.G. (1990). Developing student confidence: The post clinical conference. Nurse Educator, 15(1), 7.
  • Dawes, R.M., Faust, D., & Meehl, P.E. (1989). Clinical versus actuarial judgment. Science, 243, 1668-1674.
  • Edwards, A.L. (1985). Experimental design in psychology and education. (6th ed.). New York: McGraw-Hill.
  • Elliott, D.D. (1996). Promoting critical thinking in the classroom. Nurse Educator, 21(2), 49-52.
  • Ferguson, G.A., & Takane, Y. (1989). Statistical analysis in psychology and education (6th ed.). New York: McGraw-Hill.
  • Fischhoff, B., Slovic, P., & Lichtenstein, S. (1977). Knowing with certainty: The appropriateness of extreme confidence. Journal of Experimental Psychology: Human Perception and Performance, 3(4), 552-564.
  • Flannelly, L. (1998, April). Do nursing students think they know more than they really do when they answer test questions? Paper presented at the Annual Nursing Research Conference: Unleashing the Power of Diversity Through Nursing Research, Honolulu, Hawaii.
  • Flannelly, L., & Inouye, J. (1998). Inquiry-based learning and critical thinking in an advanced practice psychiatric nursing program. Archives of Psychiatric Nursing, 12(3), 169-175.
  • Gaskins, S., Dunn, L., Forte, L., Wood, F., & Riley, R. (1996). Student perceptions of changing answers on multiple choice questions. Journal of Nursing Education, 35(2), 88-90.
  • Glenberg, A.M., Sanocki, T., Epstein, W., & Morris, C. (1987). Enhancing calibration of comprehension. Journal of Experimental Psychology: General, 116(2), 119-136.
  • Glenberg, A.M., Wilkinson, A.C., & Epstein, W. (1982). The illusion of knowing: Failure in the self-assessment of comprehension. Memory and Cognition, 10(6), 597-602.
  • Griffin, D., & Tversky, A. (1992). The weighing of evidence and the determinants of confidence. Cognitive Psychology, 24, 411-435.
  • Grundy, S.E. (1993). The confidence scale: Development and psychometric characteristics. Nurse Educator, 18(1), 6-9.
  • Haffer, A.G., & Raingruber, B.J. (1998). Discovering confidence in clinical reasoning and critical thinking development in baccalaureate nursing students. Journal of Nursing Education, 37(2), 61-70.
  • Inouye, J., & Flannelly, L. (1998). Inquiry-based learning as a teaching strategy for critical thinking. Clinical Nurse Specialist, 12(2), 67-72.
  • Kataoka-Yahiro, M., & Saylor, C. (1994). A critical thinking model for nursing judgment. Journal of Nursing Education, 33, 351-356.
  • Keppel, G. (1973). Design and analysis: A researcher's handbook. Englewood Cliffs, NJ: Prentice-Hall.
  • Koriat, A. (1993). How do we know what we know? The accessibility model of the feeling of knowing. Psychological Review, 100(4), 609-638.
  • Koriat, A., Lichtenstein, S., & Fischhoff, B. (1980). Reasons for confidence. Journal of Experimental Psychology: Human Learning and Memory, 6(2), 107-118.
  • Kubiszyn, T., & Borich, G. (1984). Educational testing and measurement: Classroom application and practice. Glenview, Illinois: Scott, Foresman & Company.
  • Lichtenstein, S., & Fischhoff, B. (1977). Do those who know more also know more about how much they know? Organizational Behavior and Human Performance, 20, 159-183.
  • Lichtenstein, S., & Fischhoff, B. (1980). Training for calibration. Organizational Behavior and Human Performance, 26, 149-171.
  • Maki, R.H., Foley, J.M., Kajer, W.K., Thompson, R.C., & Willert, M.G. (1990). Increased processing enhances calibration of comprehension. Journal of Experimental Psychology: Learning, Memory, and Cognition, 16(4), 609-616.
  • Maki, R.H., Jonas, D., & Kallod, M. (1994). The relationship between comprehension and metacomprehension ability. Psychonomic Bulletin & Review, 1(1), 126-129.
  • McClelland, A.G.R., & Bolger, F. (1994). The calibration of subjective probabilities: Theories and models 1980-1994. In G. Wright & P. Ayton (Eds.), Subjective probability (pp. 453-482). New York: Wiley.
  • McKenzie, C.R.M. (1997). Underweighting alternatives and overconfidence. Organizational Behavior and Human Decision Processes, 71(2), 141-160.
  • Miller, L.H. (1996). Critical-thinking teaching strategy: Self-tutorial analysis. Nurse Educator, 21(6), 12 & 17.
  • Morris, C.C. (1990). Retrieval processes underlying confidence in comprehension judgments. Journal of Experimental Psychology: Learning, Memory, and Cognition, 16, 223-232.
  • Neill, K.M., Lachat, M.F., & Taylor-Panek, S. (1997). Enhancing critical thinking with case studies and nursing process. Nurse Educator, 22(2), 30-32.
  • Paese, P.W., & Sniezek, J.A. (1991). Influences on the appropriateness of confidence in judgment: Practice, effort, information, and decision-making. Organizational Behavior and Human Decision Processes, 48, 100-130.
  • Pressley, M., & Ghatala, E.S. (1988). Delusions about performance on multiple-choice comprehension tests. Reading Research Quarterly, 23(4), 454-464.
  • Pressley, M., & Ghatala, E. S. (1990). Self-regulated learning: Monitoring learning from text. Educational Psychologist, 25(1), 19-33.
  • Pressley, M., Ghatala, E.S., Woloshyn, V., & Pirie, J. (1990). Sometimes adults miss the main ideas and do not realize it: Confidence in responses to short-answer and multiple choice comprehension questions. Reading Research Quarterly, 25(3), 232-249.
  • Pulford, B.D., & Colman, A.M. (1997). Overconfidence: Feedback and item difficulty effects. Personality and Individual Differences, 23(1), 125-133.
  • Ramsey, P.H., Ramsey, P.P., & Barnes, M.J. (1987). Effects of student confidence and item difficulty on test score gains due to answer changing. Teaching of Psychology, 14(4), 206-209.
  • Rossignol, M. (1997). Relationship between selected discourse strategies and student critical thinking. Journal of Nursing Education, 36(10), 467-475.
  • Sax, G. (1974). Principles of educational measurement and evaluation. Belmont, CA: Wadsworth.
  • Schraw, G., & Roedel, T.D. (1994). Test difficulty and judgment bias. Memory and Cognition, 22(1), 63-69.
  • Seldomridge, E.A. (1997). Faculty and student confidence in their clinical judgment. Nurse Educator, 22(5), 6-8.
  • Shanteau, J. (1992). Competence in experts: The role of task characteristics. Organizational Behavior and Human Decision Processes, 51, 252-266.
  • Soll, J.B. (1996). Determinants of overconfidence and miscalibration: The roles of random error and ecological structure. Organizational Behavior and Human Decision Processes, 65(2), 117-137.
  • Tversky, A., & Kahneman, D. (1974). Judgment under uncertainty: Heuristics and biases. Science, 185, 1124-1131.
  • Zakay, D. (1985). Post-decisional confidence and conflict experienced in a choice process. Acta Psychologica, 58, 75-80.
  • Zakay, D., & Glicksohn, J. (1992). Overconfidence in a multiple-choice test and its relationship to achievement. Psychological Record, 42, 519-524.
  • Zeleznik, C., Hojat, M., Goepp, C.E., Amadio, P., Kowlessar, O.D., & Borenstein, B. (1988). Students' certainty during course test-taking and performance on clerkships and board exams. Journal of Medical Education, 63, 881-891.


10.3928/0148-4834-20010101-05
