26: Should You Use ChatGPT in Class? A Guide for Teachers and Students
When ChatGPT dropped in November 2022, it had teachers and students scrambling to figure out what to do with this new service. Seth outlines four main ways to handle ChatGPT in the classroom and notes some pitfalls of using ChatGPT uncritically.
Potentially Helpful Links:
Vectoring Words: https://www.youtube.com/watch?v=gQddtTdmG_8
AI Models & Language Models: https://www.youtube.com/watch?v=rURRYI66E54
GPT-4: https://www.youtube.com/watch?v=2zW33LfffPc&list=PL1v8zpldgH3pQwRz1FORZdChMaNZaR3pu&index=82
Natural Language Processing: https://monkeylearn.com/natural-language-processing/
Credits:
Music: “Dreams” by Benjamin Tissot from Bensound.com
Episode/Podcast Art: Nicole Smith
Episode Transcript
In many ways, automated services have been a part of our lives for a long time. Many companies already have chatbots and robot phone representatives, and I'm sure there are innumerable horror stories of trying to get help from a fully automated service. But when ChatGPT was released in November of 2022, it really seemed like magic. Here was a chatbot that could seemingly understand you in conversational English. Not only that, but ChatGPT can answer a remarkable range of questions, everything from summarizing books to writing code to creating fictional conversations between famous people. People were so amazed by ChatGPT that it became the fastest-growing consumer service in history, reaching an estimated one hundred million users by January 2023, just two months after release.
Welcome to Digethix. My name is Seth Villegas. Here on the podcast, we use religion and philosophy to tackle ethical problems in technology. Today, we'll be talking about ChatGPT and the challenges that it poses to both students and teachers. To start, I want to touch on what ChatGPT is and the language models behind it, GPT-3 and GPT-4. I will explain briefly what these are and why they represent such a tremendous technical achievement.
Natural language processing is a branch of computer science dedicated to finding ways to allow computers to read and interpret text. If you've ever learned another language, you know a little bit about just how hard it can be to understand natural speech. People use words rather fluidly. Sentences and phrases have some rules, but many of these rules are regularly broken. Not only that, but people speak in idioms and common expressions that do not make sense if you interpret them literally. It can also be the case that some languages cannot be translated literally, word for word, into another. Just try translating a single sentence into other languages in Google Translate and then translating it back into English. The results are usually quite funny.
For a long time, the goal was simply to get computers to understand basic speech. You can do this by getting the computer to label the different parts of a sentence: noun, verb, adjective, article, and that sort of thing. Then you might try to group words mathematically based on how closely related they are. For example, the word king should be closer in context and meaning to the word queen than to the word ballerina. However, since queen and ballerina are both gendered words, they might still have a relationship to one another. If you are interested in learning about how words can be represented as vectors, I highly recommend looking at Computerphile's and Yannic Kilcher's channels on YouTube. I'll post a couple of videos in the description.
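For readers who want to see the idea concretely, here is a minimal sketch in Python. The four-dimensional vectors are invented purely for this illustration; real embeddings such as word2vec or GloVe have hundreds of dimensions learned from large text corpora. The point is only that closeness in meaning becomes closeness in geometry.

import numpy as np

# Toy vectors, made up for illustration only. Imagine the third
# dimension as loosely encoding "femininity," which queen and
# ballerina share even though their meanings otherwise differ.
vectors = {
    "king":      np.array([0.9, 0.8, 0.1, 0.3]),
    "queen":     np.array([0.9, 0.7, 0.9, 0.3]),
    "ballerina": np.array([0.1, 0.2, 0.9, 0.8]),
}

def cosine_similarity(a, b):
    # Cosine similarity: closer to 1.0 means the vectors point in
    # nearly the same direction, i.e., the words are more related.
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

print(cosine_similarity(vectors["king"], vectors["queen"]))      # ~0.84
print(cosine_similarity(vectors["king"], vectors["ballerina"]))  # ~0.38

In this toy example, king scores much higher with queen than with ballerina, while queen and ballerina still score moderately high because of the shared "gendered" dimension, which is exactly the kind of relationship described above.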
The 'GPT' in 'ChatGPT' stands for Generative Pre-trained Transformer. What this means is that the researchers at OpenAI were attempting to create an AI that writes out full sentences. In that way, it is generative. It is pre-trained in the sense that a huge number of examples were fed into it. When GPT-3 originally came out, it incorporated 175 billion parameters. It turns out that scaling up the model and its training data helped it perform dramatically better. GPT-4, and by extension ChatGPT, is rumored to have over a trillion parameters, though OpenAI has not published a firm number. Finally, GPT is a transformer, a neural network architecture that is especially good at predicting the next word in a sequence. Google's autocomplete is one example of this kind of guessing process. When you ask ChatGPT a question, it guesses the word that comes next after your question and then all the words that follow, one at a time. If you allow the model to interact with people who can correct it, say by allowing millions of people to use the service, it stands only to get better at creating these responses.
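Here is a minimal sketch, again in Python, of what generating text one word at a time looks like. The probability table is entirely made up for illustration; a real model like GPT computes next-word probabilities with a neural network conditioned on all of the preceding text, not a fixed lookup table.

import random

# A toy "language model": for each word, a hand-written table of
# possible next words with invented probabilities.
next_word_probs = {
    "the":   {"cat": 0.5, "dog": 0.4, "queen": 0.1},
    "cat":   {"sat": 0.7, "ran": 0.3},
    "dog":   {"sat": 0.4, "ran": 0.6},
    "queen": {"sat": 0.2, "ran": 0.8},
    "sat":   {"down": 1.0},
    "ran":   {"away": 1.0},
}

def generate(prompt, max_words=4):
    words = prompt.split()
    for _ in range(max_words):
        options = next_word_probs.get(words[-1])
        if options is None:
            break  # no known continuation for this word
        # Sample the next word in proportion to its probability,
        # one word at a time, just as described above.
        choices, weights = zip(*options.items())
        words.append(random.choices(choices, weights=weights)[0])
    return " ".join(words)

print(generate("the"))  # e.g. "the dog ran away"

ChatGPT does essentially this, only its "table" is implicit in billions of learned parameters and the context it conditions on is your entire prompt, which is why its continuations feel like answers rather than autocomplete.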
When NLP models began to improve, researchers were interested to see what kinds of questions they might be capable of answering. A test came out, called Massive Multitask Language Understanding (MMLU), that featured a variety of questions about math, geography, history, art, and the like. Early models did little better than chance, and even GPT-3 struggled. However, when GPT-4 came out, it turned out to be very good at these questions. The researchers had it do everything from taking the bar exam to medical exams and AP tests. The model did extraordinarily well, placing in the top percentiles of scorers. It is hard to imagine that it was only a few years ago that language models had a difficult time answering most of these questions. This is partly why talking to ChatGPT is so astounding: it seemingly knows everything about everything. Need to fix your code? It can help you. Need help understanding quantum physics? Here's an explanation. Want an original poem written in the voice of Elon Musk? Here you go. The instantaneous results seemed to go well beyond anyone's expectations.
But this same power raised a number of difficult questions. What if the model is biased in some way? What if people use the model to commit crimes? What if students are tempted to cheat on their assignments?
The researchers tried to respond to many of these concerns by preventing ChatGPT from answering certain questions. But even now, people are hunting for specific prompts that can get ChatGPT to answer questions it is not supposed to. These "jailbreak" prompts are likely to remain a significant problem for the foreseeable future.
If you're a teacher, it's easy to be scared of students cheating on their assignments. After all, ChatGPT creates original text, not something that is already sitting out there on the internet. How will you know if a student is trying to submit ChatGPT text as their homework? While there are services like GPTZero that can detect AI-generated text, these technological solutions can still be fooled by transforming the text multiple times.
In a way, I think that ChatGPT just heightens our awareness of all the ways there are for students to cheat today, everything from AI tools to hiring someone to write an essay for you. If all we value is grades, then we shouldn't be surprised that people do whatever it takes to get good grades. Given the competitive environment we live in, the teacher's role in stressing the importance of knowledge itself matters more than ever.
If you're a teacher, I think that there are four main options for how ChatGPT might be handled in the classroom. First, you might decide not to allow ChatGPT at all. The main downside of this approach is that it assumes students cannot benefit from using ChatGPT. Teaching responsible usage of the service appears to be better than trying to weed out its usage altogether. One thing to consider is that, at least among the students I teach, there is a certain cynicism: even if a particular student chooses not to use ChatGPT, that does nothing to stop other students from using it to get ahead. Students will not be happy with this policy unless they feel the playing field is fair for everyone.
The second option is to allow students to use ChatGPT as a way to brainstorm ideas and to get information, but not to use any explicitly AI-generated text. This option is a step toward integrating ChatGPT into the classroom. Some teachers have done interesting work having students evaluate AI-generated text for assignments. Does the output look right? Teaching students to use ChatGPT in this way may help them avoid some of the pitfalls of its use, which I will explore later.
The third option is to allow students to use ChatGPT to edit text that the students have written themselves. In this way, ChatGPT acts much like Grammarly or other similar tools that can help students improve their writing. This more inclusive approach also clarifies what is and what is not out of bounds, since Grammarly and spellcheck function similarly to ChatGPT when it is used this way.
The fourth and final option is to allow students to use ChatGPT in whatever way they want, so long as they document the prompt they used to generate a particular bit of text and cite the specific text as generated. This is the option deployed here at Boston University. It was created in part through the efforts of my adviser, Dr. Wesley Wildman, who worked with his data ethics class on defining a fair policy. They decided that ChatGPT could be used responsibly so long as it was suitably cited. I personally like this approach because it doesn't infringe on the freedom of the student, and it doesn't leave students in the awkward situation of seeing others use the service irresponsibly, that is, without citing anything, while being unsure where to draw the line themselves. Part of this new set of guidelines also means that teachers have to grade students who use ChatGPT and those who do not slightly differently, with different baseline expectations of quality. The expectation is that students will produce an improved product if they use ChatGPT, so if their own effort is not sufficiently transformative, the tool may not help them as much as they hope.
Part of my day job is teaching ethics to data science students. At the beginning of the semester, I polled my students on whether they felt that ChatGPT should be allowed. A little over half were for it, a quarter against, and the rest were somewhere in the middle, neither completely for nor against. But when I presented them with a situation in which ChatGPT would be allowed so long as it was cited, student opinions became more contentious. This time, over half felt that ChatGPT should not be allowed. Even the ones who were still okay with it expressed a kind of resignation that other students would be using it anyway.
This feeling is understandable: students are already under a lot of pressure to get good grades, and every student can understand wanting an assignment to just disappear in a few seconds, in the way that ChatGPT seems to offer. However, I was struck by how many of them said that they were in school because they needed to learn something that would be important to their careers. They even expressed a worry that AI might come to replace them as coders. If computer science and data science students are feeling this way, how is everyone else feeling about it?
Truth be told, ChatGPT is just another in a long line of tools and services offering a quick way out. As I already mentioned, there are so many ways for students to cheat these days that teachers need to articulate why studying actually matters. If you are a student and find yourself tempted to take one of these shortcuts, say by using ChatGPT, here are three reasons why I think you shouldn't:
First, there are real benefits to learning to do something yourself. If you are planning to use ChatGPT in the future, you'll need to know whether its output is correct or not. If you have no capacity to judge its output, it may be dragging you down more than you think. For example, ChatGPT sometimes falls victim to something called hallucination, in which it makes up descriptions and sources as if they were true. If you ask ChatGPT to describe the plot of a TV episode that just came out but is not actually in its training data, it will create a description out of thin air based on the title. This occurs partly because ChatGPT has been trained to give answers that a human will accept, not to give out truthful information. Just like Wikipedia before it, ChatGPT is a tool that requires some additional research to use well. If you are not in a position to evaluate ChatGPT, you may unwittingly accept whatever it tells you.
Second, ChatGPT's accuracy depends on the past work of real people, which it mimics, replicates, and builds upon. Without people building and generating that knowledge, ChatGPT wouldn't work in the first place. This fact also implies that there are texts and topics the model just won't know anything about. ChatGPT knows the most about things that a variety of sources already cover. It is a distillation of the work of others, and so it may be incredibly off base if there are no such sources for it to use.
Third, I think that services like ChatGPT will eventually work best as a way to augment what we know. Often, when I am doing research, there are things I need to refresh my memory on or facts I need to look up. ChatGPT could simplify this process greatly. In fact, I see students using ChatGPT like this in class all the time. It will work best when it is treated as another resource rather than as the definitive source of information. By positioning ourselves to use ChatGPT thoughtfully, we may find useful support in our everyday tasks, like writing emails, digesting news, and the like. As these services improve, they will likely change our work in ways we cannot imagine at the moment, and it may be dangerous to be left behind without ever having tried them. As an educator, I hope I can prepare my students for a changing world.
To conclude, I don't think that banning ChatGPT outright can ever really solve the core problems behind cheating. Even in academic circles, you can pay someone to write an original essay for you. Why you would want a degree earned through this kind of fraud is puzzling, because you won't be able to do the work that your degree was supposed to prepare you for. But this also speaks to the need for educators to be responsive to the needs of today's students. If people don't feel prepared for the chaos of the real world and the changes that technology presents to us, then perhaps we need to rethink how best to make education useful to future generations of students. Only if we can demonstrate that there is a real payoff will people bother to go through the trouble because, let's face it, most people still see school as an obstacle to learning things that are actually interesting and useful.
But what about you? Have you had a chance to ask ChatGPT your question? Only time will tell how else ChatGPT has been trained to answer specific questions and what biases it might have, beyond some of the political biases that have already been found. At the same time, we have to acknowledge the tremendous technical achievements that have gone into ChatGPT, GPT-3, and the new GPT-4. An AI model that can pass the bar exam and do very well on so many standardized tests is simply amazing. With Microsoft's multibillion-dollar investment in OpenAI, the company behind ChatGPT, our direct contact with AI is only going to become greater over time.
If you'd like to respond to this conversation on ChatGPT, you can email us at digethix@mindandculture.org. You can find more information about Digethix at digethix.org. You can find us on Facebook and Twitter, @digethix, and on Instagram, @digethixfuture. The intro and outro song, "Dreams," was composed by Benjamin Tissot. I hope you'll join this important conversation on digital ethics and AI.
This is Seth, signing off.