Section outline

    • We're really pleased with these initial examples - demonstrating how the OpenAI API can be used in a controlled and sensible way for some genuinely dynamic teaching & learning uses.

      We need to do more work to quantify our confidence in the responses and to identify the most impactful approaches. The whole aim, from a teaching & learning perspective, is to augment the teacher and provide assistance for the teacher, not replace them!

      A lot more training and experimentation needs to take place before we can say how confident we are in the AI responses and which approaches work best. Nevertheless, there are huge opportunities here, and importantly they extend significantly beyond mere marking of exam questions - that should be treated as an assumed capability, especially with GPT-4.

      Formative feedback opportunities?

        • AI teacher-style responses making use of the Cambridge mark scheme, providing high-quality formative feedback framed as 'what works well' and 'even better if' (see the sketch after this list).
        • AI-generated responses to Cambridge questions - learners review these and share their comments using the Cambridge mark scheme, and the AI then provides feedback and suggestions on their comments.
        • Interlinking with other materials - using the feedback responses to trigger suggestions to explore other Cambridge products in the pathway.
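
      A minimal sketch of how the first of these opportunities might look in code, assuming the current OpenAI Python SDK (v1+) and an API key in the environment. The model name, prompt wording and function name are illustrative assumptions rather than our actual setup, and passing the mark scheme to the API would need to respect the IP point under 'Further considerations' below.

      ```python
      # Hedged sketch: 'what works well' / 'even better if' formative feedback.
      # Assumes the openai Python SDK (v1+) and OPENAI_API_KEY in the environment.
      from openai import OpenAI

      client = OpenAI()  # reads OPENAI_API_KEY from the environment

      def formative_feedback(question: str, mark_scheme: str, learner_answer: str) -> str:
          """Return teacher-style formative feedback on a learner's answer."""
          system_prompt = (
              "You are an experienced teacher giving formative feedback. "
              "Use the mark scheme provided to judge the answer, but do not "
              "reveal or quote the mark scheme itself. Reply in two short "
              "sections: 'What works well' and 'Even better if'."
          )
          user_prompt = (
              f"Question:\n{question}\n\n"
              f"Mark scheme (for your reference only):\n{mark_scheme}\n\n"
              f"Learner's answer:\n{learner_answer}"
          )
          response = client.chat.completions.create(
              model="gpt-4",    # illustrative; any suitable chat model
              temperature=0.2,  # keep feedback consistent and grounded
              messages=[
                  {"role": "system", "content": system_prompt},
                  {"role": "user", "content": user_prompt},
              ],
          )
          return response.choices[0].message.content
      ```

      In keeping with the 'augment the teacher' aim above, the output would still go to the teacher for review before reaching the learner.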

    • Further considerations:

      • There's a direct link to prior work on the iECRs we produced for Resource Plus. There we had Example Candidate Responses where we could show/hide answers and commentary; this work shows how AI can offer something much more powerful.
        • However - will Assessment be happy with AI-suggested answers to past paper questions (re. the ECR process)?

      • There's lots of potential for Professional Development / Assessment Specialist / Moderator training - we know about the ongoing issue of not having enough scripts for PD attendees to train with. Might we reach a confidence level where automatically generated responses could be used for review and marking practice (see the sketch at the end of this section)?

      • There are clear shortcomings - we're working with a language model that deals with text. Many of our questions include diagrams, notation and the like, which don't play well with LLMs. We also have to work out how best to interpret our mark schemes - their codes and notation present their own challenges.

      • All of this work cements the need for a teacher in the loop - we still need human input/review as it stands. Answers are not robust, and our approach with the API and current guidance has been not to share Cambridge IP. We can improve the prompts, but without sharing mark schemes there is a limit to what we can expect.
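
      As a sketch of the PD / moderator training idea above: generating practice 'scripts' at a requested performance level. Again, the model name, prompt wording and function name are assumptions for illustration, and any generated scripts would need human review before being used in training.

      ```python
      # Hedged sketch: synthetic candidate-style responses for marker training.
      # Assumes the openai Python SDK (v1+) and OPENAI_API_KEY in the environment.
      from openai import OpenAI

      client = OpenAI()

      def synthetic_script(question: str, target_level: str) -> str:
          """Return a plausible candidate-style answer at roughly the given level."""
          system_prompt = (
              "You write plausible exam candidate responses used to train markers. "
              "Write in the voice of a student, including the kinds of errors and "
              "omissions typical of the requested performance level. Do not add "
              "commentary or reveal that the response is synthetic."
          )
          user_prompt = (
              f"Question:\n{question}\n\n"
              f"Write one candidate response at a '{target_level}' level."
          )
          response = client.chat.completions.create(
              model="gpt-4",    # illustrative choice
              temperature=0.8,  # more variety across generated scripts
              messages=[
                  {"role": "system", "content": system_prompt},
                  {"role": "user", "content": user_prompt},
              ],
          )
          return response.choices[0].message.content

      # e.g. synthetic_script(past_paper_question, "middle-of-the-range")
      ```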