Mr. Heglund is Instructor, Undergraduate Department, and Dr. Wink is Professor, Graduate Department, College of Nursing, University of Central Florida, Orlando, Florida.
This manuscript was presented as a poster at the Baccalaureate Education Conference of the American Association of Colleges of Nursing, New Orleans, Louisiana, November 2007.
The authors have no financial or proprietary interest in the materials presented herein.
Address correspondence to Diane Wink, EdD, FNP-BC, ARNP, FAANP, Professor, NP Tracks Coordinator, UCF College of Nursing, 12201 Research Parkway, Suite 300, Orlando, FL 32826; e-mail: firstname.lastname@example.org.
Collaborative testing, in which students work together to complete an examination, is used in many disciplines (Wink, 2004). This study examined one outcome of collaborative testing: student learning. Learning gains were determined from responses to items on a cumulative final examination that had previously appeared on an examination students completed either collaboratively or only as individuals. Learning was defined as “knowledge acquired by systematic study in any field of scholarly application” (Dictionary.com, 2010).
The exact structure of this testing approach varies widely. Some faculty have students submit an examination completed with one or more class peers; others have the students complete the test independently and then submit a second attempt of all or part of the examination after collaborating with other class members (Wink, 2004). Group size for collaborative tests varies, generally from two to six students. Group composition is determined by the teacher, by the students, randomly, based on an existing student relationship (e.g., a clinical group), or by a unique student factor such as student grade on prior examinations or quizzes (Sandahl, 2009).
There is also variation in how students’ grades are affected by their score on the test completed collaboratively. Extra points may be given on a student’s individually completed examination if the collaborative examination received a higher score. The collaborative examination score may constitute some percentage (e.g., 20%) of their grade for the examination. As in this study, the examination grade may be an average of the individual and collaborative test scores (Sandahl, 2009; Wink, 2004).
Research on the outcomes of collaborative testing has primarily examined two factors: student and faculty perception of the experience and the impact of such testing on course grades. Wilder, Hamner, and Ellison (2007) examined student response to collaborative testing and found that 59% of the students thought it enhanced learning, 85% thought it was worth the extra time, 60% thought it developed critical thinking, and 83% thought it should be continued. They reported that faculty indicated the approach increased understanding, allowed the students to go beyond competition, and allowed more timely feedback on student learning needs.
Hickey (2006) used surveys to obtain student and faculty feedback about a collaborative testing experience in which students could have a small increase (0.25 to 1 point) in each course grade based on the letter grade of the collaborative examination. Approximately 90% (n = 78) reported they liked collaborative testing, 85% (n = 75) thought collaboration is an important skill for nurses, 76% (n = 67) thought that collaborative testing helped them understand the material better, and 70% (n = 62) indicated they felt collaborative testing allowed them to feel confident in the knowledge that they had about a certain subject so that they could explain it to peers. One student responded that knowing they would have a chance to complete the test collaboratively allowed them to study less. Conversely, 18% said it encouraged them to study more.
Faculty responses to a survey about their perceptions of collaborative testing were similar. They identified benefits such as reduced arguments during test reviews and the advantages of immediate discussion after the individual test as tools for increased understanding. Of note, when compared with course grades the previous year, student performance as indicated by examination average was slightly lower when collaborative testing was used.
Studies on the effects of collaborative testing on course grades are more limited. Wink (2004) documented that the effect of collaborative testing on examination and course grades varied but resulted in higher grades for most students. In that study, 54% of students earned the same grade they would have earned if collaborative testing had not been used in the course (26% an A, 28% a B). However, 40% moved from a B to an A, and 6% moved from a C to a B. No student had a C or lower as a final course grade. Additional data and discussion of issues related to collaborative testing are reviewed in the articles by Wink (2004) and Sandahl (2009), as well as in the studies cited in the extensive bibliographies of these publications.
What has not been reported is the effect of collaborative testing on knowledge. Do students who use collaborative testing learn from the experience? This study was conducted to help answer that question.
This study was conducted on three campuses of a large state university in the southeast United States. After the study received expedited human subject review and approval from the university’s institutional review board, potential participants were given a letter of introduction about the study instructing the students on the procedure to opt out of the research. No students opted out.
Study participants totaled 166 students (100% of each class) enrolled in three face-to-face sections of the same Health Care Issues, Policy and Economics course. Class enrollment varied, with 112 in the main campus section, 33 in one regional campus section, and 21 in a second regional campus section. All students in all sections were prelicensure students, with the exception of three licensed RNs in the University’s RN-to-BSN transition program. These students were enrolled in the same regional campus section.
Each course section was presented using a face-to-face modality. Classes met on the respective campus twice per week throughout an 8-week summer semester with course enhancement via identical Web-based learning modules. Students in all sections worked in groups to present debates, discuss scenarios about selection of health insurance policies, or develop a flyer advocating a specific change in health policy. Students also wrote a letter calling for action from a policy maker and wrote an analysis of an article on a health topic that had been published in the lay press. Course section instructors shared lecture notes, but actual presentation (e.g., lecture content and selection of issues for debate) was modified to meet the needs of the setting, account for the different class sizes, and reflect individual faculty presentation style.
All students took three identical examinations. The first two examinations tested content in preceding course units, and the final examination tested content delivered since the previous examination, as well as cumulative course content. All students had access to routine test review following the first two examinations.
For examinations completed only as an individual, the individual score was recorded as the student's grade. For examinations completed under the collaboration scheme, three outcomes were possible. If a student did not earn a passing grade as an individual, that individual grade was recorded. If a student earned a passing individual grade that was lower than the collaborative score, the average of the two scores was recorded as the grade. If a student scored higher on the individual attempt than the score earned by that student’s collaboration group, the individual score was recorded as the grade. This method eliminated any penalty or excessive reward.
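The grading rule described above can be sketched in a few lines of Python. This is only an illustration of the three outcomes, not the instrument used in the study: the function name, the 0 to 100 score scale, and the passing cutoff of 70 are assumptions, because the passing score is not reported.

```python
def recorded_grade(individual, collaborative=None, passing=70.0):
    """Return the examination grade recorded under the scheme described above.

    individual    -- score on the individual attempt
    collaborative -- score of the student's group, or None if the student
                     completed this examination only as an individual
    passing       -- hypothetical passing cutoff (not reported in the study)
    """
    # Individual-only examination: the individual score is the grade.
    if collaborative is None:
        return individual
    # Outcome 1: no passing individual grade, so no collaborative credit.
    if individual < passing:
        return individual
    # Outcome 2: passing individual grade below the group score, so average.
    if individual < collaborative:
        return (individual + collaborative) / 2
    # Outcome 3: individual score at or above the group score is kept.
    return individual


print(recorded_grade(65, 90))  # failing individual score is kept: 65
print(recorded_grade(80, 90))  # passing but below group score: 85.0
print(recorded_grade(92, 88))  # individual above group score: 92
```

Under these assumptions a student can never lose points by collaborating, and a student who does not pass individually gains nothing from the group score, which is the "no penalty or excessive reward" property noted above.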
On examination one, randomly selected students composing half of each course section completed the examination as individuals. The remaining students completed the examination as individuals and then collaboratively completed the same examination as part of a randomly assigned group of 5 to 6 students. The 3 licensed RN students were purposefully placed into separate groups to dilute the advantage of their experience as professional nurses.
On examination two, the previously randomly selected groupings were reversed so that each student functioned as his or her own control. Thus, each student completed one examination as an individual only and the other examination individually and collaboratively. All participants collaboratively tested on the final examination.
Questions from examinations one and two were selected for inclusion on the final examination based on a low rate of individual correct responses combined with a high rate of correct group responses on the examination on which they initially appeared. Item stems for these repeated questions were identical, but the order of the responses was scrambled from the original versions to control for the effect of item recognition. A total of 35 of 100 items on the final examination had been previously used on either examination one (n = 18) or two (n = 17).
Scores on the collaborative final examinations were higher by an average of 5 to 10 points, depending on the course section. In addition, 20% of the students realized a letter grade improvement from B to A, and 32% of the students improved from C to B. However, this was noted in only one of the course sections. The others showed no letter grade improvement. As previously stated, students who did not pass an examination as an individual were not awarded the increased score for the collaborative effort.
Mean scores were determined for the repeated questions with separate scores calculated for questions on a prior test when students had an opportunity to collaboratively test on those items and for questions on a prior test when students did not have an opportunity to collaboratively test on those items. The mean scores for collaboratively tested items increased 3.7 points on the third examination. The mean score for items that had not been double tested increased only 2.7 points on the third examination.
A comparison of the group scores for collaboratively tested and not collaboratively tested items was conducted using a paired t test. The resulting t scores ranged from −5.976 to −17.262 (p < 0.001) when all groups were compared. However, group results for collaboratively tested and not collaboratively tested items readministered on examination three did not differ statistically, although there was a small improvement in overall score (an increase of 3.7 versus 2.7 points). This single point difference on a hypothetical 35-item examination would translate to a grade approximately 3% higher, given equally weighted test items. Although this difference did not meet the criteria for statistical significance, we believe that it would be difficult to find a student unwilling to earn a 3% higher grade.
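The arithmetic behind the "approximately 3%" figure can be checked with a short Python sketch. It assumes, as the text does, that the 35 repeated items form a hypothetical examination with equally weighted items, so that one point corresponds to one item; the variable names are illustrative only.

```python
# Mean gains reported for the repeated items (in points).
gain_collaborative = 3.7    # items previously tested collaboratively
gain_individual_only = 2.7  # items previously tested individually only

# Hypothetical examination made up of the 35 repeated items,
# assuming one point per equally weighted item.
repeated_items = 35

difference_points = gain_collaborative - gain_individual_only
percent_of_exam = difference_points / repeated_items * 100

print(f"Difference: {difference_points:.1f} point")
print(f"Share of a {repeated_items}-item examination: {percent_of_exam:.1f}%")
```

The result is about 2.9%, which the text rounds to 3%.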
Anecdotal response of the students to the collaborative testing experience was similar to that reported in prior studies. Students expressed satisfaction with this approach to testing because it allowed for immediate review of answers and rationales with peers. Students also liked that it improved their grade and taught them negotiation skills. The value of these skills became especially clear to students when they realized that they needed to be more effective in convincing their peers of a correct answer.
Collaborative testing has been well received by students and results in higher course grades. However, evidence that this approach will increase student knowledge based on evaluation of performance on future examinations remains lacking. This study demonstrated that student learning did occur. Students were more likely to answer a test item correctly if they had been tested on that item using the collaborative testing approach on the examination on which it previously appeared. They were less likely to choose the correct response to items that they had not collaboratively answered previously when those items appeared on a later examination. However, the increase in scores did not reach statistical significance.
In the sample of the population studied here, other factors such as the short duration of the class (8 weeks), the ability of the students to review prior examinations via usual test review procedures, and uncontrollable factors, such as student discussion of examination content as part of their personal interactions, could have also influenced results.
Suggestion for Further Research
Replication of this study in additional settings is needed. In addition, stratifying the individual test items may increase understanding of how learning could be augmented with this technique. For instance, items could be designed to range from simple memorization of terms to the more complicated measure of a student’s ability to synthesize data. If differences associated with the type of test item were discovered, such an investigation would allow nurse educators to offer collaborative assessment for the material that shows the greatest affinity for this approach. Study of additional measures of learning and comparison research (e.g., usual test review versus review only via the discussions at the time of the collaborative testing) would also extend understanding of the outcome of collaborative testing. Studies that address the impact of collaborative testing on student collaboration and teamwork would also be of interest.
- Hickey, B.L. (2006). Lessons learned from collaborative testing. Nurse Educator, 31, 88–91. doi:10.1097/00006223-200603000-00012
- Learning. (2010). In Dictionary.com. Retrieved from http://dictionary.reference.com/browse/learning
- Sandahl, S. (2009). Collaborative testing as a learning strategy in nursing education: A review of the literature. Nursing Education Perspectives, 30, 171–175.
- Wilder, B., Hamner, J., & Ellison, K.J. (2007). Student perceptions of the impact of double testing. Nurse Educator, 32, 6–7. doi:10.1097/00006223-200701000-00003
- Wink, D. (2004). Effect of double testing on course grades in an undergraduate nursing course. Journal of Nursing Education, 43, 138–143.