
Educator Blog

Meeting the Challenges of Exam Development with Artificial Intelligence

Apr 9, 2025 | 10-minute read

AI can now help nurse educators meet the challenges of test development. Find out how a new resource from ATI can produce effective assessments while saving you time and effort.

Struggling to write entire exams? Help is on the way


Developing meaningful assessments is a widespread and significant challenge for nurse educators, who must balance this responsibility with numerous other teaching and administrative demands. The pressure to produce exams that accurately gauge student knowledge can contribute to the stress many faculty experience in today’s academic programs — many of which are understaffed and struggling with issues including student retention.

This article provides an overview of the difficulties associated with test development and shares information on how an innovative application of artificial intelligence reduces the time and effort required to create reliable and valid exams. This new resource allows educators to focus more on teaching and less on the burdens associated with exam creation.

Research Documents the Burden of Item & Test Development

For years, nursing faculty have voiced struggles and concerns about developing test items and exams for nursing students. The stress associated with these tasks can increase considerably when added to the many demands on today’s faculty, who are experiencing high rates of burnout and attrition.

Research shows that during every semester or term, the typical nurse educator writes 3 to 5 examinations for each didactic course they teach.1 If an educator teaches 3 courses, they could be tasked with writing as many as 15 unique tests each term or semester. Despite these significant requirements, many faculty have little to no formal training in item writing and test construction.2-4


Looking for help writing test items? Meet Claire AI™️


The research examining the impact of this lack of training is limited, but the literature reports that some nursing examinations contain linguistic errors and technical flaws that affect their ability to accurately evaluate student knowledge.5-7

The linguistic errors identified in this research involve the use of unnecessarily complex language, culturally biased words, or grammatical errors — all of which affect the ability of students who speak English as an additional language to understand question content and purpose.6 These errors may also increase the time needed to complete an examination and lead to lower test scores, the authors wrote.6

Researchers found that technical flaws in summative exams generally fall into two categories:5,6

  • the unnecessary introduction of difficulty in test items, which can lead to questionable student failures
  • cues that make it easier for students to guess, which can lead to passing grades for students who have not actually mastered course content.

Much of the research about test writing focuses on multiple-choice questions, which were the predominant question type for decades. The implementation of the Next Generation NCLEX by the National Council of State Boards of Nursing (NCSBN) in April 2023 is changing the way faculty develop tests. To best prepare students for today’s NCLEX, assessments should now include other question types — which means faculty also need to acquire the skill of writing them.

At larger academic programs, faculty may have access to exam-writing software or similar tools that help them with assessment creation. In a study of exam item creation in nursing programs, De Haan8 found that faculty who don’t have access to exam software expressed a greater need for faculty development funds and recommended that programs invest in resources to equip educators to write assessments.

The Time Demands of Assessment Writing Are Significant

In addition to the skill development needed to write effective exams and test questions, the time required for this task is a stressor on faculty. Educators choose to enter academia because they want to connect with students and help them become great nurses, yet their time with students can be cut short by mounting demands including test writing.

“A big factor in test creation is the time required,” said Gene Leutzinger, DNP, MSN-Ed, RN, an experienced nurse educator who is a lead integration specialist for ATI Nursing Education. “It takes work to come up with a good item in which all the distractors are legitimate.”

The time it takes to write a single test item can vary significantly depending on the complexity of the question, the level of detail required, and the educator's familiarity with the content. Published reports state that it takes an average of 30 minutes to an hour to write a single well-constructed test item. This includes time for researching the content, ensuring the question aligns with learning objectives, and validating the accuracy and clarity of the question and its answer choices.

Developing a complete exam for a nursing course involves not only writing individual test items but also organizing them into an appropriate assessment of student knowledge. Depending on the length and complexity of the exam, a faculty member may spend several hours, sometimes spread over a few days, to write a complete exam. Factors influencing this time include the number of questions, the diversity of question types, and the need for peer review or validation.

“Writing a single question can approach an hour by the time you create, revise, administer then revise the item,” Dr. Leutzinger said. “Once you create an item, you have to give it to enough students to gather item validity … but the longer it’s used, the less secure it becomes.”

As a full-time educator, Dr. Leutzinger typically wrote 10-item quizzes and 50-item unit exams, and his final exams usually contained 50 or 100 questions. At 30 minutes to an hour per item, a single 50-item exam can represent 25 to 50 hours of writing time — clear evidence that the time requirements of exam writing are significant.

Artificial Intelligence Can Simplify & Support Exam Creation

Given the importance of course assessments and the challenges faculty experience in developing them, providing faculty with high-quality resources for question and exam writing is a pressing need. Research and experience show that AI has the capability to streamline these tasks and achieve quality results.2,9 


Why should nurse educators embrace AI? Find out in this article.


Rachel L. Cox Simms, DNP, RN, FNP-BC, an assistant professor in the School of Nursing at MGH Institute of Health Professions, has published three papers on the use of AI in nursing education.9-11 The first of these publications was a study that determined AI is capable of efficiently generating exam items.9 (Dr. Simms shared details on this research in a 2024 article on the ATI Educator Blog.) Her subsequent papers focus on how to incorporate specific teaching techniques using AI10 and the need for faculty to embrace AI in general.11

“I think it’s an incredible challenge to write test questions,” said Dr. Simms, whose PhD dissertation focuses on item writing.

When asked whether she thinks AI can write good questions and tests for nursing students, she said, “100% absolutely. I think it can write a really great test question with almost no editing. But of course, it still requires faculty oversight. Our expertise is still that really crucial component.”

In addition to efficiency and utility, Dr. Simms said that incorporating AI into question and test writing offers test security advantages. “Writing nursing exams requires constant change,” she said. “We want our test banks to be changing all the time because we don’t want students to get a hold of old tests and be able to cheat. AI can assist in that.”

Leading the Way in Evidence-Based AI Resources for Faculty

No matter how effective AI is now or becomes over time, nursing faculty oversight of AI-generated content will always be necessary. This core truth is at the center of ATI’s approach to developing AI resources that support program outcomes, faculty effectiveness, and faculty well-being.

ATI Nursing Education took the first step in this mission in 2024, when it built Claire AI™️, the first AI-enabled technology specifically for nurse educators. ATI introduced Claire in Custom Assessment Builder (CAB) to help faculty generate individual test items.

Claire AI develops questions using content from ATI resources, which are updated regularly to reflect current practice and evidence. Claire AI in CAB also provides the option for faculty to draft questions from web-based content, using a simple toggle.

In the 12 months since Claire AI became available in CAB in April 2024:

  • Nearly 2,000 institutions started using Claire
  • Faculty generated more than 866,000 items using Claire
  • Claire reduced the time spent writing questions by as much as 50%.

Nurse educators who are writing test items using Claire AI in CAB praise the technology for its speed and accuracy.

“Building new tests is fast,” said David Everhart, MSN, RN, CEN, a nursing instructor at Caldwell Community College and Technical Institute who regularly uses Claire. “The information provided by Claire is accurate, and I rarely need to edit the questions. I know that ATI content is constantly updated, which is essential because textbooks can't keep up with the pace of current practice.”

Everhart added that he also views Claire AI as a deterrent to cheating. “I don't have to worry too much about test security because there's a vast pool of questions to choose from,” he said.

Other faculty users report that using Claire AI in CAB makes question writing more efficient. Their comments in focus groups include:

  • “Claire simplifies the process so that I don’t have to think of the stem and then the answers.”
  • “When I’m generating questions, I start with Claire almost 100% of the time.”
  • “I think it’s important to say that Claire is almost always accurate; ChatGPT can hallucinate.”

Enhancing the Efficiency & Quality of Nursing Assessments Using AI

The next application of Claire AI will take faculty beyond item writing to complete test generation. This new resource, called Assessment Generator, is scheduled for release this spring to ATI Complete customers. The Claire AI-powered solution can write entire assessments for faculty to vet and edit. When faculty remain within Claire AI, the questions on the generated assessment are based on content from evidence-based ATI resources. Faculty can also use a toggle to draw on web-based information that is not part of ATI resources.

Assessment Generator will be seamlessly integrated into CAB, enabling faculty to rapidly create up to 50 questions at a time. Educators can create assessments containing a total of 250 items.

Assessment Generator delivers the essential test features educators need, including the ability to:

  • specify subject areas
  • select outcome categories
  • select from available item types.

Another significant benefit is that Assessment Generator also writes the most common question types on the NCLEX, including complex item types:

  • multiple response
  • multiple choice
  • bowtie (including scenario)
  • matrix (including scenario).

ATI will introduce more NCLEX question types over time.

In addition to increased efficiency, Assessment Generator with Claire AI can help improve the quality of nursing assessments. When educators draw from evidence-based ATI Nursing Education content, Assessment Generator helps generate questions that are relevant and aligned with learning objectives identified in ATI course materials.

Faculty Can Now Generate Assessments Quickly & Seamlessly

Because Assessment Generator is integrated into CAB, faculty can access robust AI resources seamlessly, with a single login and within a single workflow. This educator-focused design can reduce the administrative load of writing items and tests, paving the way for faculty to spend more time focusing on why they entered education in the first place: shaping new nurses.

Everhart is one of many educators who are excited by the possibilities Assessment Generator presents. 

“Can you imagine sitting down in your office at 8 am and being able to generate a quiz for a class at 11 am?” Everhart said. “This is a game changer.”

Dr. Leutzinger also emphasized the benefits of efficiency. “Assessment Generator will be a tremendous time saver,” he said. “By creating assessments complete with stems, key distractors and tagged outcomes, an instructor merely needs to validate the items and make minor edits to ensure exams are aligned to student learning outcomes. This will absolutely improve faculty workloads.”


References

  1. Bristol TJ, Nelson JW, Sherrill KJ, Wangerin VS. Current state of test development, administration, and analysis: a study of faculty practices. Nurse Educator. 2018;43(2):68-72. doi:10.1097/NNE.0000000000000425
  2. Hensel D, Moorman M, Stuffle ME, Holtel EA. Faculty development needs and approaches to support course examination development in nursing programs: an integrative review. Nurse Educator. 2024;49(6):315-320. doi:10.1097/NNE.0000000000001706
  3. Moran V, Wade H, Moore L, Israel H, Bultas M. Preparedness to write items for nursing education examinations: a national survey of nurse educators. Nurse Educator. 2022;47(2):63-68. doi:10.1097/NNE.0000000000001102
  4. Moore WL. Does faculty experience count? A quantitative analysis of evidence-based testing practices in baccalaureate nursing education. Nursing Education Perspectives. 2021;42(1):17-21. doi:10.1097/01.NEP.0000000000000754
  5. Cox CW. Best practice tips for the assessment of learning of undergraduate nursing students via multiple-choice questions. Nursing Education Perspectives. 2019;40(4):228-230. doi:10.1097/01.NEP.0000000000000456
  6. Moore B, Waters A. The effect of linguistic modification on English as a second language (ESL) nursing student retention. International Journal of Nursing Education Scholarship. 2020. doi:10.1515/ijnes-2019-0116
  7. Tarrant M, Ware J. A framework for improving the quality of multiple-choice assessments. Nurse Educator. 2012;37(3):98-104. doi:10.1097/NNE.0b013e31825041d0
  8. De Haan JA. Use of Best Practices in Exam Item Creation, Analysis, and Revision: Nursing Faculty’s Knowledge, Use, and Implementation. Doctoral dissertation. Bethel University Spark Repository; 2021. https://spark.bethel.edu/cgi/viewcontent.cgi?article=1712&context=etd
  9. Cox RL, Hunt KL, Hill RR. Comparative analysis of NCLEX-RN questions: a duel between ChatGPT and human expertise. Journal of Nursing Education. 2023;62(12):679-687. https://pubmed.ncbi.nlm.nih.gov/38049305/
  10. Simms RC. Work with ChatGPT, not against: 3 teaching strategies that harness the power of artificial intelligence. Nurse Educator. 2024. doi:10.1097/NNE.0000000000001634
  11. Simms RC. Generative artificial intelligence literacy in nursing education: a crucial call to action. Nurse Education Today. 2025. https://doi.org/10.1016/j.nedt.2024.106544

Find out how AI can save you time