Journal of Nursing Education

Continuing Education Program Evaluation for Course Improvement, Participant Effect and Utilization in Clinical Practice

Lynne G Faulk, RN, PhD

Abstract

An evaluation of a single continuing education (CE) program was conducted by the presenter to assess the impact of the offering and gather information for course improvement. The purpose of this evaluation was to document whether, and to what extent, the learners attained the program's objectives and also to systematically examine the program to see how it might be improved. The third purpose was to document whether participants' interest was stimulated by the program and whether they actually used the information in clinical practice or at least found it helpful. Fifty-five nurses participated by completing a pretest and posttest, a reactionnaire at the conclusion of the program, and a follow-up questionnaire.

Participants scored significantly (p < .05) higher on the posttest than on the pretest. On the evaluation form, 98% of the nurses responded that they had learned new facts, and 75% indicated that the information was moderately new. In addition, a slight majority (58%) responded that they had changed their beliefs about assertiveness. Of those returning follow-up questionnaires, 95% had told others about the program, and 59% had read articles. All nurses indicated they had found the information useful, and 87% had actually used the information in clinical practice.

This study documents a method for CE administrators and educators to evaluate the impact of CE and provide information for course improvement. The evaluation supports the program's worth. Participants benefited in terms of knowledge and interest. In addition, program strengths and weaknesses were identified.

Introduction

An evaluation of a single continuing education program for nurses was conducted to measure the impact of continuing education (CE) on nurses' achievement and implementation in clinical practice. Furthermore, data were gathered on participants' reactions to the program, and evidence was sought to assist in improving the course. To assess these aspects, program participants completed pretests and posttests related to the program objectives, evaluation forms, and follow-up questionnaires.

A review of the literature precedes presentation of the study's questions and the methods used to answer them. Next, the data are presented with quantitative and qualitative analysis. The implications are then discussed, followed by a summary of the study.

Literature Review

Historically, continuing education has moved from postgraduate hospital-based courses to programs offered by universities. These have included workshops, conferences, refresher courses for inactive nurses, inservice education, short-term courses, and off-campus credit courses. It has only been since 1968 that faculty responsible for CE have been meeting together (Cooper, 1973).

Now, with the advent of legislation requiring CE for license renewal in certain states comes the mandate to scrutinize CE offerings. Consequently, program administrators are being held responsible for evaluation of these programs. Their focus, generally, is on the "offering as a whole (rather) than on its specific elements" (Mitsunaga & Shores, 1977, p. 9).

Bolte (1979) identifies five trends which have increased the need for evaluation in continuing education. These are 1) an increased number of CE providers, 2) an increased variety of formats, 3) increased diversity of the nursing population, 4) a lack of definitive CE goals, and 5) an increased emphasis on accountability in CE.

Administrators realize the need for planned evaluation. They don't want to "leave evaluation of the effectiveness of their continuing nursing education offerings to chance" (Bolte, 1979, p. 50).

For the most part, evaluations by program administrators or coordinators have been aimed at learner satisfaction, obtained using standard forms completed at the end of a program. Generally, those who hire presenters do not expect them to evaluate the program, and most of the educators do not formally participate in this. However, perceiving evaluation as the administrator's or coordinator's task does not relieve the program presenters of the responsibility for evaluation.

Stufflebeam's definition of evaluation, as cited by Worthen and Sanders (1973), is "the process of delineating, obtaining and providing useful information for judging decision alternatives" (p. 129). Program evaluation is making a decision about the program's effectiveness and efficiency based on organized collection and analysis of information. This information is useful for program management, external accountability and future planning (Attkisson, Hargreaves, Horowitz, & Sorensen, 1978).

A major concern for nurse educators who present CE is deciding what facilitated or interfered with learning in order to improve the program. Evaluation provides information for the faculty-instructor who is concerned with planning learning experiences to assist learners' achievement of objectives. The following are components which may be included:

1. Format

2. Setting

3. Relevancy of content

4. Effective instructional strategies

5. Appropriate time and emphasis on objectives (Mitsunaga & Shores, 1977).

As pointed out by Arney (1978), whether or not a program works, and for what reasons, may be determined (in part) by the evaluator's perspective. The evaluator is influenced by what the individual values as important to assess; thus, the evaluation is affected by what the evaluator selects to measure.

For an evaluation to be successful, it must have influence. The determinants of influence are whether the evaluation is needed and whether someone cares about it (Alkin, Daillak, & White, 1979).

In light of the Arney (1978) and Alkin et al. (1979) statements, it would seem important for both administrators and presenters of CE programs to be involved in evaluation. They each would conduct the evaluation to provide information relevant to their contributions and concerns.

In evaluating the effects of multiple CE programs, Arney, Little, and Philip (1979) report that participants who attended several programs attained higher test scores than those who did not. They assumed that this information is "worth knowing" and concluded that it is more desirable for participants to attend multiple programs rather than a one-time program. However, those involved in continuing education recognize the ideal versus the practical aspects of this recommendation.

Staropoli and Waltz (1978) support conducting both formative and summative evaluation of nursing educational programs. This means evaluation which occurs during and following a program. They posit that an optimal evaluation occurs at three points in time: before the program, during the program, and following the program. This is similar to Stufflebeam's (1973) widely recognized CIPP (Context, Input, Process, Product) model for evaluation.

Billie (1976) writes that formative exams during a program help the learner recognize differences and provide an opportunity to correct them. The program's goals are communicated and the participants given direction. Therefore, participants' learning is promoted.

All who are involved in continuing education need to be involved in evaluation. There are a number of models for evaluation to follow. Which one an evaluator chooses may be a function of why the evaluator is conducting the study and what she/he wants to find out (Worthen & Sanders, 1973).

Purpose

The purpose of this evaluation was to document whether, and to what extent, the learners attained the program's objectives, and also to systematically examine the program to see how it might be improved (weaknesses corrected) for the benefit of future offerings. The third purpose was to document whether participants' interest was stimulated by the program, and whether they actually used the information in clinical practice or, at least, found it helpful. These purposes are consistent with Cronbach's (1973) premise of studying the effects of the course and Stufflebeam's (1973) idea of enlightened decision making.

Problem

An evaluation of a one-time CE offering on assertiveness was conducted by the program's presenter. The evaluation was summative (at the completion) rather than formative (during the program), since summative evaluation is more applicable to a one-time offering. The questions posed were:

1. Do participants meet objectives of the program as a result of the learning experience?

2. Do the participants make higher test scores on a posttest as compared to their individual pretest scores?

3. Do the participants score equally well on each of the three content areas of the program?

4. Do participants score higher on the lower level cognitive test items than on the higher level items?

5. How do the participants judge the program's impact?

6. Do the participants use the information received from the program when they go back into their clinical agency?

7. Do the participants have an increased interest in the subject as a result of the program?

Design

To answer these questions, the evaluation had three overall components.

1. Attainment of the program's objectives was measured by an objective test given before and after the program. Items were taken from a test blueprint, which was a matrix of higher and lower cognitive level objectives by three content area objectives. Data from the test were intended to provide information to answer the first four questions.

2. Demographic information and a participant reactionnaire, collected on an evaluation form at the conclusion of the program, were used to assess participants' judgment of program impact (question five).

3. The benefits of clinical application and increased subject interest were documented on a questionnaire sent to participants one month after the program. Interest was operationalized as nurses telling others about the program and further reading about the topic. These are related to the last two questions.

One hypothesis relative to the second question was tested. The null hypothesis was that participants would score similarly on the posttest and the pretest. A correlated t-test for mean differences was used to test the hypothesis. The other data were handled using descriptive statistical analysis, correlation, and qualitative content analysis. The analysis was carried out using a computerized test scoring and analysis program and SPSS. The test analysis program was developed by Cole (1974) and is available through the University of Kentucky.
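
The correlated (paired) t-test named above can be reproduced with standard statistical software. The following is a minimal sketch in Python, assuming the SciPy library is available; the score vectors are hypothetical, and the study itself used Cole's (1974) test analysis program and SPSS, not this code.

# Minimal sketch of a correlated (paired) t-test comparing pretest and
# posttest scores. The values below are hypothetical; the study analyzed
# 44 matched pretest/posttest pairs on a 25-item test.
from scipy import stats

pretest  = [14, 16, 12, 18, 15, 17, 13, 19, 16, 14]
posttest = [19, 20, 17, 23, 19, 24, 18, 22, 21, 18]

t_statistic, p_value = stats.ttest_rel(posttest, pretest)

# Reject the null hypothesis of equal means if p < .05
# (the study reports p < .001).
print(f"t = {t_statistic:.2f}, p = {p_value:.4f}")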

TABLE 1

DEMOGRAPHIC INFORMATION ON PROGRAM PARTICIPANTS

Program Description

Broad behavioral objectives for the program were developed and distributed along with an agenda before the program. The instructional methods employed included lecture using an overhead projector, group discussion, and role-playing in large group and small group exercises. A film on assertiveness was also shown.

Content for the conference was selected from a literature review, colleague suggestions, and a request by the CE coordinator.

Sample

The same CE program on assertiveness was presented twice, one month apart. Instruments used were pretested by the first group of 16 nurses attending the program. This program was a hospital inservice conference for supervisory personnel. These nurses had similar characteristics to those who participated in the actual study.

The second program, and the one used for the evaluation, was sponsored by a district of the Kentucky Nurses' Association. The group attending was comprised of 55 nurses. The study was conducted with their informed consent.

Characteristics of the research group are presented in Table 1. The majority (70%) are 36 years of age or older. Most are employed on a full-time basis, and the mean length of nursing practice is 15.8 years. Notably, 32 of the nurses (70%) did not indicate their sex, although everyone in the pretest group responded to this item. The form will be modified to correct for this.

TABLE 2

COMPARISON OF 25-ITEM PRETEST AND POSTTEST RESULTS: MEANS, STANDARD DEVIATIONS AND ERROR, RELIABILITY, AND ITEM DISCRIMINATION

Interestingly, most of the nurses are in managerial positions and a majority hold less than a bachelor's degree (these include LPN, ADN, and diploma).

Data Collection

Participants attending the program were requested to complete the pretest at the beginning and a posttest at the completion of the program. In addition, they were asked to respond to an evaluation form after the program. Participants completed all instruments during program time. The purpose of the testing was explained, and participants were informed that the test, keyed for the correct responses, would be reviewed after the program was over. The group was asked to fill out a card with their name and address for the purpose of receiving a follow-up questionnaire, which was mailed one month after the program. Of the 55 participants, some did not attend the entire program: 44 answered both the pretest and posttest, and 46 completed the evaluation form. Forty-four were sent follow-up questionnaires.

Objective Test

A 25-item cognitive test was completed by participants before and after the program. Tests were coded for data analysis and comparison of individual scores. The questions were developed from the behavioral objectives of the program. The mean percent score on the pretest was, as seen in Table 2, lower than the mean posttest score. A correlated t-test for the difference between pretest and posttest mean scores was significant (p < .001). Participants scored higher on the posttest than on the pretest.

Results of the Kuder-Richardson Formula 20 for the test were low. The Kuder-Richardson Formula 20 (KR-20) is an estimate of internal consistency or degree to which all items measure a common characteristic of the person (Thorndike & Hagen, 1977). However, the aim of the program was to assist everyone to attain the objectives, as much as possible, so that if everyone made 100%, test reliability would be zero.

In light of the high mean scores and the low variability, the low reliability estimate is explainable. The KR-20 estimate is meaningful in norm-referenced testing and must be considered differently in criterion-referenced evaluation such as that used in this study.
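
For readers who wish to compute the same internal-consistency estimate, KR-20 can be obtained directly from a matrix of dichotomously scored (0/1) item responses; when scores cluster near the maximum, as in a mastery-oriented program, total-score variance shrinks and the coefficient drops, which is the situation described above. The sketch below is illustrative only; the data are invented and this is not the program used in the study.

import numpy as np

def kr20(item_scores):
    # Kuder-Richardson Formula 20 for dichotomously scored (0/1) items.
    # item_scores: rows = examinees, columns = items.
    X = np.asarray(item_scores, dtype=float)
    k = X.shape[1]                               # number of items
    p = X.mean(axis=0)                           # proportion correct per item
    q = 1.0 - p
    total_variance = X.sum(axis=1).var(ddof=1)   # variance of total scores
    return (k / (k - 1)) * (1.0 - (p * q).sum() / total_variance)

# Hypothetical scored responses (rows = examinees, columns = items).
responses = np.array([[1, 1, 1, 1, 1],
                      [1, 1, 1, 1, 1],
                      [1, 1, 1, 0, 1],
                      [1, 1, 0, 1, 0],
                      [0, 1, 1, 0, 0],
                      [1, 0, 0, 0, 0]])
print(round(kr20(responses), 2))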

Content validity was obtained by matching test items with course objectives (Martuza, 1977).

The pretest and posttest scores were moderately correlated (Pearson r = .592). This indicates that individuals ranked similarly on pretest and posttest performance; in other words, nurses who scored high on the pretest tended to score high on the posttest, and vice versa.
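
A minimal sketch of that computation, again with hypothetical paired scores and assuming SciPy is available:

# Pearson correlation between matched pretest and posttest scores.
# The values are hypothetical; the study reports r = .592 for 44 pairs.
from scipy import stats

pretest  = [14, 16, 12, 18, 15, 17, 13, 19, 16, 14]
posttest = [19, 20, 17, 23, 19, 24, 18, 22, 21, 18]

r, p = stats.pearsonr(pretest, posttest)
print(f"r = {r:.3f}")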

Test questions were developed according to the first four levels of Bloom's (1973) taxonomy of cognitive educational objectives. Knowledge and comprehension questions (N = 15, 67%) were designated as lower level items, and application and analysis questions (N = 10, 33%) as higher level items. As seen in the bar graph in Figure 1, those attending the program did better overall on both higher and lower level questions on the posttest as compared to the pretest. However, on both the pretest and posttest, they answered more lower level questions correctly than higher level ones. This supports the assumption that the items designated as lower and higher level were probably functioning that way; one expects people to do better on easier (lower level) questions than on harder (higher level) questions. During the program, the same amount of time was spent on each of the three content areas, and the content was partitioned into three approximately equal amounts of material. Higher and lower level questions were fairly evenly distributed across these content areas.

Looking at the content areas presented in Figure 2, participants scored similarly on areas I and III but lower on area II; content area II was not mastered as well by the nurses. The mean item discrimination for this area was .28 on the posttest and .26 on the pretest, supporting a type of reliability: test items discriminated in the direction of the participants who did better on the test overall.
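
The article does not state which discrimination index Cole's (1974) program reports; a common choice is the corrected item-total (point-biserial) correlation, sketched below with invented responses.

import numpy as np

def item_discrimination(item_matrix):
    # Corrected item-total correlation for each 0/1 item: the correlation
    # between the item score and the total score on the remaining items.
    # item_matrix: rows = examinees, columns = items.
    X = np.asarray(item_matrix, dtype=float)
    totals = X.sum(axis=1)
    values = []
    for j in range(X.shape[1]):
        rest = totals - X[:, j]                  # total score excluding item j
        values.append(np.corrcoef(X[:, j], rest)[0, 1])
    return np.array(values)

# Hypothetical scored responses (rows = examinees, columns = items).
responses = np.array([[1, 1, 1, 0, 1],
                      [1, 1, 1, 1, 1],
                      [1, 0, 0, 1, 0],
                      [0, 1, 0, 0, 1],
                      [1, 1, 1, 1, 1],
                      [0, 0, 1, 0, 0]])
print(item_discrimination(responses).round(2))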

FIGURE 1

PARTICIPANTS' SCORES FOR COGNITIVE LEVEL TEST ITEMS

FIGURE 2

PARTICIPANTS' TEST SCORES FOR THREE CONTENT AREAS

TABLE 3

COMPARISON OF PRETEST AND POSTTEST SCORES IN THREE CONTENT AREAS

Participants scored significantly higher on the posttest in each content area (Table 3). This supports the conclusion that all three content areas contributed to the overall difference between pretest and posttest scores. An educator would be concerned if only portions of the program contributed to what learners gained from the entire course.

Evaluation Form

Questions on the evaluation form were forced-choice. They covered participants' perceptions of what they gained from the program and their reactions to the program content and the speaker. There were also open-ended questions asking participants to identify strengths and weaknesses of the program. Selected questions of interest are presented in Table 4. Even though the majority (98%) learned new facts from the program, most nurses indicated that the information presented was moderately new to them. In addition, a slight majority (58%) responded that they had changed their beliefs about assertiveness.

Seventy percent of those completing the evaluation form wrote comments on strengths and/or weaknesses of the program. The strengths most frequently cited were the lecture, the film, the organization, and the relevancy of the topic. The small group exercises were the only weakness mentioned more than once. This corresponds to responses indicating that the small group exercises were moderately helpful (Table 4).

TABLE 4

PARTICIPANTS' ANSWERS TO SELECTED QUESTIONS CONCERNING REACTION TO PROGRAM

Follow-up Questionnaire

In all, 44 follow-up questionnaires were mailed to the research group. The questionnaire format was forced-choice with some open-ended questions. Overall, 62% (N = 27) of the nurses returned the form. Responses to the follow-up are seen in Table 5.

Responses indicated that the nurses had told others about the workshop, with colleagues being the group most often told. Less than half of the respondents had read articles about the subject since the workshop, and the majority had actually used information from the program in clinical practice (one person was not engaged in clinical practice). Specifically, the information on assertiveness had helped them in interactions with colleagues, patients, and subordinates more than with superiors.

On the follow-up, participants identified the most helpful parts of the program as the bill of rights and the lecture. As on the earlier evaluation form, they stated that the role playing was the least helpful part of the program.

Another unanticipated aspect of the follow-up came in response to the question, "What aspect of the program would you like more information about?" Participants listed areas in which, after returning to work and trying out new behaviors, they realized they needed more help. These will be incorporated into the next program.

Cost

One more component which certainly concerns evaluation is cost. As documented in Table 6, conducting this study did involve cost. This confirms what Mitsunaga and Shores (1977) said about the expense of evaluation: it is too expensive to evaluate every single offering which a center conducts. Therefore, they suggest selecting a few CE programs during the year to evaluate beyond the standard participant reactionnaire.

Most of the projected costs were absorbed by the College of Nursing; the Office of Educational Research; and time given by university professors who are experts in evaluation research. A stamped, self-addressed envelope was sent with the follow-up questionnaire. A great deal of time was spent reviewing literature, analyzing data, meeting with people, conferring with secretaries, and typing. Consultants' reactions to the projected budget were that it was either "too conservative" or "too inflated"; no one responded that the cost seemed appropriate.

Summary/Implications

The nurses attending the program left having mastered more of the subject matter than when they came. The data support that the participants met the program's objectives. They did not achieve in one content area as well as in the other two, even though the pretest/posttest difference was significant. The reason for this was further investigated. After examining the high item discriminations for content area II, the test items seem reliable. The objectives and content presented were reexamined for inconsistencies. The notes and transparencies for this portion of the content have been flagged. In future presentations, the material will be revised, and possibly more time spent on the content of concern.

TABLE 5

RESPONSES TO FOLLOW-UP QUESTIONNAIRE

TABLE 6

PROJECTED COST OF EVALUATION

Another avenue to explore would be looking at the reasons nurses are not entering the course with as much information in content area II as in the other two content areas.

A different type of cost is the loss of instructional time during completion of the instruments. It took a little less than one hour to fill out the tests and forms; for a five-hour program, this is a deterrent. A reliability analysis of the test items was conducted using an SPSS subprogram. The correlation matrices from this analysis can be used to reduce the number of test items and, therefore, the amount of testing time, without losing an accurate index of how well participants achieve the program's objectives. This is done by retaining only one item from a group of items that correlate highly with each other, indicating that they measure similar content. Of course, the instructor should also apply judgment and knowledge of the subject area in doing this.
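
The item-reduction strategy described above, retaining only one item from each cluster of highly intercorrelated items, can be sketched as follows. The .70 threshold and the responses are illustrative assumptions, not values from the study.

import numpy as np

def prune_redundant_items(item_matrix, threshold=0.70):
    # Keep one item from each group of items whose inter-item correlations
    # exceed the threshold. item_matrix: rows = examinees, columns = items.
    X = np.asarray(item_matrix, dtype=float)
    corr = np.corrcoef(X, rowvar=False)          # inter-item correlation matrix
    keep = []
    for j in range(X.shape[1]):
        # Drop item j if it correlates highly with an item already kept.
        if all(abs(corr[j, k]) < threshold for k in keep):
            keep.append(j)
    return keep

# Hypothetical 0/1 responses; items 0 and 1 are answered identically, so only
# one of them is retained. In practice the matrix would come from the 25-item
# test, and subject-matter judgment would temper the statistics.
responses = np.array([[1, 1, 0, 1],
                      [1, 1, 1, 0],
                      [0, 0, 1, 1],
                      [1, 1, 0, 0],
                      [0, 0, 1, 1]])
print(prune_redundant_items(responses))          # indices of items to retain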

When the group reviewed the test at the end of the day, almost all of the nurses stayed over to find out the answers and clarify misunderstandings. The review prompted questions from the group and ensuing discussions. Perhaps this warrants further consideration as a method to facilitate learning in continuing education.

On the evaluation form, nurses indicated they had learned new facts, which supports the significant difference between pretest and posttest scores. Also on the evaluation form, nurses indicated that the information was only moderately new. If all the subject matter had been new to the nurses, learning might have decreased, since they would have had little to which to relate the new information. Ideally, instructors review material participants already know and then introduce new information related to it. However, too much redundancy provides no challenge, and interest may be lost.

Another component of the program which might be improved is the group exercises. Some of the nurses did not like participating in the small group exercises. Instead of having everyone involved in the same type of group exercise, scripts could next time provide more structure for those who need it. With the pretest group, this part of the program came out as a definite strength, which might be attributable to the smaller size of that group, which allowed the presenter more time with each small group. However, even though some participants do not like this activity, it may still be desirable.

A reliability measure appropriate to criterion-referenced, rather than norm-referenced, testing needs further investigation. As Martuza (1977) relates, such measures are available and would be a contribution to the study. Establishing test-retest reliability would, perhaps, be more appropriate for the purposes of this testing.
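
One criterion-referenced option, consistent in spirit with Martuza's (1977) discussion though chosen here as this sketch's own assumption, is a decision-consistency index: the proportion of examinees classified the same way (master versus nonmaster against a cut score) on two administrations. The scores and the cut score of 20 below are hypothetical.

def decision_consistency(scores_form1, scores_form2, cut_score):
    # Proportion of examinees receiving the same mastery classification
    # (score >= cut_score) on two test administrations.
    agreements = sum(
        (a >= cut_score) == (b >= cut_score)
        for a, b in zip(scores_form1, scores_form2)
    )
    return agreements / len(scores_form1)

# Hypothetical scores on a 25-item test given twice, with mastery set at 20.
first_administration  = [22, 18, 24, 19, 21, 23, 17, 20]
second_administration = [23, 17, 25, 21, 22, 24, 18, 19]
print(decision_consistency(first_administration, second_administration, cut_score=20))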

This evaluation documents what the CE offering gave participants and how it helped the educator. The response rate to the follow-up questionnaire is encouraging. Continued interest and implementation in clinical practice as a result of CE are documented, and the value of this particular program is substantiated. Nurses benefited from the program: they did better on the test and told their colleagues and others. Furthermore, nurses responded that the information was useful and that they had used it in clinical practice. The evaluation also gave direction for improving the program for future presentations.

Even though the results of this study cannot be generalized to other CE programs, the study documents a method for CE administrators and educators to evaluate the impact of CE and provides information for course improvement. It can therefore assist CE providers in keeping up with what nurses need and want to learn. Through this type of evaluation, we can keep CE programs current and relevant.

One of four roles of the university in CE is to assist in developing evaluation methods and in actually conducting evaluations. Evaluation is needed to improve practice and determine the degree to which a program is effective (McKenna, 1978).

Since nurses have such diverse backgrounds, education, and exposure to information, making CE relevant and a contribution to clinical practice is a challenge. The systematic information provided by evaluation can assist with this and is a worthwhile endeavor.

References

  • Alkin, M.C., Daillak, R., & White, P. (1979). Using evaluation: Does evaluation make a difference? Beverly Hills: Sage Publications.
  • Arney, W.R. (1978). Evaluation of a continuing nursing education program and its implications. Journal of Continuing Education in Nursing, 9(1), 45-51.
  • Arney, W.R., Little, G.A., & Philip, A. (1979). Effects of multiple continuing education programs in perinatal nursing. Evaluation and the Health Professions, 2(3), 365-372.
  • Attkisson, C., Hargreaves, W.A., Horowitz, M.J., & Sorensen, J.E. (1978). Evaluation of human service programs. New York: Academic Press.
  • Billie, D.A. (1976). An experience with formative evaluation. Journal of Continuing Education in Nursing, 7(4), 25-30.
  • Bloom, B. (1973). The taxonomy of educational objectives: Use of cognitive and affective domains. In B. Worthen & J.R. Sanders (Eds.), Educational evaluation: Theory and practice (pp. 246-268). Belmont, CA: Wadsworth Publishing Co.
  • Bolte, I. (1979, June). CEU participant evaluation and quality assurance. In Proceedings: The first annual conference of the Council on the Continuing Education Unit (pp. 50-55). Memphis, TN.
  • Cole, C. (1974). Flexible test grading and item analysis system. Journal of Dental Education, 38(12), 691-696.
  • Cooper, S.S. (1973). A brief history of continuing education in nursing in the United States. Journal of Continuing Education in Nursing, 4(3), 5-13.
  • Cronbach, L.J. (1973). Course improvement through evaluation. In B. Worthen & J.R. Sanders (Eds.), Educational evaluation: Theory and practice (pp. 43-58). Belmont, CA: Wadsworth Publishing Co.
  • Martuza, V. (1977). Applying norm-referenced and criterion-referenced measurement in education. Boston: Allyn and Bacon, Inc.
  • McKenna, M. (1978). A perspective on the impact of mandatory continuing education on public supported colleges and universities. Journal of Continuing Education in Nursing, 9(3), 15-20.
  • Mitsunaga, B., & Shores, L. (1977). Evaluation in continuing education: Is it practical? Journal of Continuing Education in Nursing, 8(6), 7-14.
  • Staropoli, C.J., & Waltz, C.F. (1978). Developing and evaluating educational programs for health care providers. Philadelphia: F.A. Davis Co.
  • Stufflebeam, D.L. (1973). Educational evaluation and decision making. In B. Worthen & J.R. Sanders (Eds.), Educational evaluation: Theory and practice (pp. 128-142). Belmont, CA: Wadsworth Publishing Co.
  • Thorndike, R.L., & Hagen, E.P. (1977). Measurement and evaluation in psychology and education. New York: John Wiley and Sons.
  • Worthen, B.R., & Sanders, J.R. (1973). Educational evaluation: Theory and practice. Belmont, CA: Wadsworth Publishing Co.

DOI: 10.3928/0148-4834-19840401-04
