The Inversion

AI has flipped Bloom's Taxonomy upside down—the work that matters now is what students struggle with, not what they produce

Apr 21, 2026

I sat at my desk with a student paper in front of me and a knot in my stomach.

I was reading a reflection essay on our final unit about sweeping environmental changes after World War II. I hadn’t asked for a thesis-driven argument, just a reflective piece asking students to think through what they’d learned after spending the last several weeks reading and discussing. I’d designed it carefully, stood in front of the class and given what I thought was an impassioned speech about how I cared about their learning, not the product—that the prompt existed to help them engage, not to test them, and that the whole point was to spend time reflecting on the material. They knew that I graded things credit/no credit, with credit for B+ or higher work. The whole assignment was low stakes and high trust.

But the writing in the paper in front of me was slick, the insights were empty and strangely disconnected from the material, and the whole thing just made my Spidey senses tingle. This was a couple of years ago, and by that point I’d played around enough with AI to suspect that I already knew why things felt off. I pasted my assignment prompt into ChatGPT, and although what came back wasn’t identical to the paper in front of me, they shared the same DNA. The structural adherence to the prompt was similar, the language was similar, the level of “reflection” was similar. I knew in my bones that the student had plugged everything into a chatbot and relied on the machine to do, in thirty seconds, what I’d asked them to spend an evening doing.

I’d done everything I thought I was supposed to do, but it didn’t matter.

The knot turned out to mean something I didn’t expect—and it wasn’t really about the student.

My first reaction was anger, followed very quickly by grief—and then, more slowly, a sneaking suspicion about my own grief. I’ve spent my entire career caring about writing. Not just assigning it—caring about it, as a craft, as a vehicle for self-expression and clarity of thought. I’ve watched students arrive in college unable to build an argument and graduate able to write with precision and force. Watching a student treat that process as optional by using AI felt like watching something I’d devoted my life to get dismantled. That explained the grief.

But I kept having doubts. Was I just gatekeeping—defending a rare and valuable skill because I’d mastered it and had invested an inordinate part of my identity in being someone who could pass it along? Was my attachment to essays really about student learning?

I had to think it through. And what I found, when I was honest about it, is that my attachment to essays wasn’t sentimental. It was structural.

To understand what I mean, think about what a well-designed essay actually requires. To write a truly good one, students must read an entire unit’s worth of material, come to class regularly, synthesize complicated and often challenging ideas in conversation with other students, and somehow get a grasp of it all—both as a whole and as the sum of all the individual parts. And then they must wrestle everything they’ve learned into a very particular format: supporting a clearly articulated and insightful thesis with a logically structured interpretation of the available evidence.

Nothing about that is easy.

To do it well, they must organize everything they know into a disciplined argument. That’s what makes it hard. It’s also what makes it educational.

But here's the part I'd never had to articulate before: essays are extraordinarily efficient to evaluate. I can usually tell before I've finished the second paragraph what grade a paper will earn, because by this point in my career I am a highly skilled reader of thesis statements. An A-quality thesis, delivered: that's an A paper. An A thesis with evidence or analysis that doesn't quite hold: A-. A B+ thesis with B+ insights? Flawless execution won't earn better than a B+. The quality of the thinking and the mastery of the material set the ceiling, not the polish of the prose, though the two are often related.

The asymmetry between what an essay requires of students to write and of professors to grade is what made the essay the most powerful instrument in my toolkit for assessing students. Students must invest enormous cognitive effort just to complete it, but I can evaluate it quickly and with confidence that the grade reflects what they’ve actually learned and are able to do. Maximum student work, efficient evaluation, and a signal I could trust—what’s not to like?

There’s a name for the logic I’m describing, though I didn’t encounter it formally until well into my career. In the 1950s, educational psychologist Benjamin Bloom led a team that classified cognitive tasks into a hierarchy. In the revised form most of us know today, it runs from “remembering” at the base, up through “understanding,” “applying,” “analyzing,” and “evaluating,” to “creating” at the top. Someone later turned it into a pyramid, and the pyramid took on a life of its own. (If you’ve ever been to a teaching center workshop on assignment design or learning outcomes, you’ve almost certainly seen this. You likely even got a handout with various verbs grouped by which level of the taxonomy they reflect.) It’s less a scientific model than a shared mental image, but it captures something real about how most of us learned to teach.

*Bloom's Taxonomy—the version most of us learned to teach with. (Image: UMBC DoIT.)*

Most professors I know, whether or not they’ve ever heard of Bloom’s Taxonomy, build their approach to teaching on something like this logic. We build from the ground up and scaffold the assignments: readings and lectures for foundational knowledge, discussions for analysis, and then—the capstone—we ask students to produce something that requires them to put it all together. Creating things—whether essays, test answers, or solutions to problem sets—sits at the top of the pyramid because traditionally it has been the hardest thing to do.

Using Bloom's Taxonomy has helped me sort out my own complicated feelings about what AI is doing to student essays because it captures this logic so cleanly: you start by building understanding, and once you’ve spent a sufficient amount of time reading and discussing and thinking together as a class, the student who can produce high-quality work has demonstrated that they can think.

And that logic actually worked! Asking students to write was the most reliable way to make them think—not because writing is magic, but because writing is hard. And the assignments we built on this logic did two things at once, which were so tightly coupled we rarely had to tell them apart. They forced students to do the difficult cognitive work that produces learning, and at the same time they gave us a window into whether that work had happened. The product proved the process. Creation wasn’t just the goal—it was the test.

AI is disruptive in this context because it has severed the symbiotic relationship between learning and making things, and between learning and its assessment.

What generative AI does better than anything else is produce. It summarizes, it synthesizes, it generates first drafts and competent reflections and plausible arguments. And it performs, with apparent effortlessness, exactly the tasks we spent decades placing at the top of our educational pyramid (so long as we don’t count the copious quantities of fossil fuels and the other requirements of data centers as “effort”). Regardless, one thing is now very clear: we can no longer assume that any out-of-class task that AI can do is a reliable proxy for understanding.

In other words, AI has flipped Bloom's Taxonomy upside down, leaving everything in a jumbled pile in front of us. The operations we always treated as foundational—remembering facts, understanding concepts, applying knowledge—now require deliberate human effort to preserve. And the operations we treated as advanced—creating, producing, generating—are the easiest things in the world now that intelligence is available on demand, like water from a tap. A student can create before they understand. They can produce a polished reflection without having done a single reading.

The pyramid is inverted.

I don’t claim to have invented this framing—I’ve encountered versions of it in multiple places—but it’s the single most clarifying lens I’ve found. It’s a way of seeing.

And what it lets you see is that the assignments didn’t break because students got lazier or less honest. They broke because AI collapsed the connection between the product and the process. My reflection assignment, run through AI, still produces a paper. With a sophisticated prompt, sufficient context, and some iteration it might even be good. But such a paper no longer proves anything at all about what a student is thinking—because the effort required to write a paper has always been the thinking, and for a student using AI the effort is gone.

So what are we left with? The form of the assignment is intact, but the function is compromised. That’s why it feels like something shattered but you can’t quite find the crack.

The cruelest part is that AI-assisted work often looks better than unaided work. The prose is cleaner, the structure is tighter, and (unlike students) the AI never seems to miss crucial portions of the instructions. The ideas possess a confidence and fluency that most undergraduates haven't yet developed on their own.

Learning scientists have long distinguished between looking like you’ve learned something and actually having learned it. The distinction is between performance, meaning what a student can demonstrate right now, and learning, the durable change in knowledge and skill that persists over time and transfers to new situations. The two are not the same, and can actually be inversely related: conditions that produce impressive immediate performance often produce weak long-term learning, and conditions that feel difficult and frustrating often produce the strongest retention. (This is why so many college teachers have moved from polished lectures, which can create the feeling of learning because they are entertaining and easy to follow, to active learning, which is often messier but produces more durable understanding.)

AI is a performance amplifier in the sense that it makes the output look like learning happened. Yet fluent performance and durable learning are not the same thing—and when we can’t tell them apart, we’ve lost more than the ability to detect AI use. We’ve lost the ability to see whether learning happened at all.

Think of the student who submits a beautifully structured reflection and then, in office hours, can’t explain a single idea in it. That’s both a cheating problem and a visibility problem. The window we used to look through has gone opaque.

And it’s going to get worse. There’s a logical trap that applies here called the toupee fallacy: people who insist they can always spot a toupee are confident only because their evidence consists of the bad toupees they have spotted. But by definition a good toupee is one that goes undetected. That’s what makes it good.

The same logic applies to AI-generated student work. Professors who are confident they can tell are drawing on a biased sample: they’ve caught the clumsy attempts, the slick but hollow prose, the strangely disconnected insights. They’ve had their own Spidey-sense moments that were later borne out. But as AI gets better and as students get more skilled in its use, there will be far fewer bad toupees for us to spot. Sniff tests, already unreliable, are going to become less and less useful, and AI detectors, which were never reliable to begin with, won’t save us either.

Here’s the good news, though—and it genuinely is good news: how humans actually learn hasn’t changed.

Three lines of research, developed independently over decades, converge on the same fundamental insight. Robert and Elizabeth Bjork’s work on desirable difficulties shows that conditions making learning feel harder in the moment—spacing practice over time, mixing up problem types, testing yourself instead of re-reading, trying to generate an answer before being shown one—produce dramatically better long-term retention and transfer than conditions that feel smooth and fluent. The key finding is that when retrieval feels effortful, when you have to work to pull something from memory, the act of retrieving it strengthens the memory far more than easy review ever could.

Manu Kapur, a learning scientist at ETH Zurich, has built an entire research program around productive failure: the finding that students who attempt to solve problems they can’t yet solve, before receiving instruction, learn significantly more than students who receive instruction first. His meta-analysis of fifty-three studies found that the effect roughly doubled the learning gains of conventional teaching, and that with high-fidelity implementation the gains were up to three times as large. The failure is the mechanism: the struggle creates what Kapur calls a “cognitive need,” the desire to understand, that makes subsequent instruction land.

K. Anders Ericsson’s decades of research on expert performance point the same way: expertise isn’t primarily a product of talent, but of sustained, effortful practice that pushes past current ability.

All three reach the same conclusion. The effort is the learning.

For college professors, the essential point is that effort isn’t incidental to the assignments that we’ve leaned on for so long, and when AI removes the effort, it removes the active ingredient.

“Many people assume they are bad at writing because it is hard,” writes James Clear, putting it as well as anyone. “This is like assuming you are bad at weightlifting because the weight is heavy. Writing is useful because it is hard. It’s the effort that goes into writing a clear sentence that leads to better thinking.”

So there’s still lots of solid ground to stand on. Earlier this year, on a faculty development trip, I heard Lisa Andrew, CEO of the Silicon Valley Education Foundation, put it simply: AI does not change what we know about how students learn, including how neural pathways form and are myelinated and how understanding is built through effortful engagement. AI hasn’t somehow repealed the science of learning. Desirable difficulties and productive failure still work. Effortful practice still translates into meaningful learning. If AI is breaking anything, it’s not learning itself. It’s the delivery mechanism: the assignments that used to guarantee the effort would happen.

We need new ones. And the inversion, with Bloom’s Taxonomy upended into a sprawling mess, tells us exactly where to look: not at what students produce, but at what they struggle with.

I think about that student paper—the reflective essay that I’m fairly certain ChatGPT wrote. What I felt in that moment was grief for an assignment I’d trusted—one I’d built with care, revised with purpose, and believed in.

But the inversion has changed what I grieve. I no longer mourn the assignment. Now I mourn the process that the product used to guarantee—the evening that students spent re-reading, the halting attempt to articulate the depth and breadth of what we’d discussed, the small frustrations of putting genuine thought into words. That’s what was valuable, and that’s what AI has made optional.

Thinking about our current situation in terms of an upended Bloom’s Taxonomy doesn’t just give us a clearer way to think about what broke. It tells us what we should protect: the difficulty, the struggle, and the conditions under which students have to do their own thinking, even—especially—when it’s uncomfortable.

As for my go-to assignment that now seems under threat, I'm trying to remind myself that the thousands of essays that I've accumulated over the years, including the most insightful and well-crafted of the lot, were always essentially disposable objects. Their purposes were to provoke and then measure student learning, and they served both purposes admirably. The assignment was never intended to produce pieces of writing with lasting value; its goal was always to produce critical thinkers and writers who could make sense of a complicated world.

I’m still working this out, and I’d love to know: In this moment of AI suddenly everywhere all at once, what are the assignments that you have stopped trusting?

Hit reply. I'm listening.

David Gibson

May 7

We've lost the ability to project classwork into students' time at home, and mourn that loss -- and are reluctant to accept that and struggle to internalize it. We can keep doing what we've been doing, to the benefit of students still willing to do that work, or give up entirely and rely on in-class assessment during the time we have each week, or radically expand class time. That last one will meet tremendous resistance as it'll mean cloistering students for long stretches, but will only seem like an overreaction to anyone who hasn't come to terms with the magnitude of the problem. (Call me an alarmist but we alarmists are having our day.)

Elizabeth MacBride

Apr 30

I see the breakdown happening earlier. Building sentences and choosing words to imperfectly match meaning is fundamental to developing critical thinking skills. When my students realized how much they were missing — which they did when I taught “how to write a sentence” to undergrads, they mourned and celebrated at the same time. Why we are robbing them of this experience and exercise is truly beyond me.

Teaching Upside Down
