They Said AI Can Run a Business… But It Failed to Beat This Old 8-Bit Game


I’ve always had a deep passion for old-school games. My first computer was an MSX when I was just eight years old, and I can still remember the magic of those early text-based adventure games. There were no graphics, just words on a screen, and yet, through pure imagination, those games came to life in my mind. I had to picture every room, every item, and every dark cavern. That was the beauty of those games. They weren’t just about solving puzzles; they were about creating entire worlds inside our heads.

Fast-forward to today, and we live in an era where Generative AI is transforming industries. I kept seeing articles about how people were using ChatGPT to build profitable businesses in record time, like this one, where someone gave ChatGPT $100 and let it start a business. These stories were everywhere, claiming that AI could run businesses, generate content, and automate decision-making. If AI was really that capable, then surely it should be able to play and win a simple text adventure game, right?

That’s where my obsession began…


What is a Text Adventure Game?

Text adventure games, also known as interactive fiction, were among the earliest forms of computer entertainment, rising to popularity in the late 1970s and 1980s. Unlike modern games with detailed graphics and sound, these games relied entirely on text-based input and output, immersing players in a world created solely through words. Players would type commands like “go north,” “take key,” or “open door” to explore virtual environments, solve puzzles, and progress through the game’s storyline. The game, in turn, would respond with descriptive text, narrating the environment, the outcomes of their actions, and any obstacles encountered along the way. Every move required careful thought, as success depended on understanding subtle clues embedded in the text.

The opening screen of Zork I: The Great Underground Empire

These games challenged players to use imagination and problem-solving skills in ways that graphical games rarely do. Without visuals, the player’s mind had to paint the scene described on the screen, from eerie forests to hidden treasure rooms. Despite their apparent simplicity, text adventures offered surprisingly deep and engaging experiences, thanks to their intricate puzzles, branching paths, and immersive narratives. Classics like Zork I and Zork II turned a blank screen and a blinking cursor into expansive worlds full of mystery, danger, and humor, captivating an entire generation of gamers.


The Big Idea: AI as a Text Adventure Solver

The idea seemed straightforward. If AI could read and generate text, then surely it could play a game that relied solely on text-based commands. The challenge was turning this concept into a fully automated system that could interpret the game state, make logical decisions, and play in real time. I designed the AI Adventure Solver to function in a continuous loop: first, it would take a screenshot of the game screen, capturing the exact text as displayed. Then, it would use Optical Character Recognition (OCR) to extract that text and convert it into a readable format. Once the text was processed, it would be fed into ChatGPT, which would analyze the situation and determine the most appropriate action—whether that meant moving to another location, picking up an item, or solving a puzzle.

Basic flow of the AI Adventure Solver

With the AI’s decision made, the next step was to simulate real keystrokes to enter the command into the game, just like a human player would. After executing the command, the process would repeat: a new screenshot would be taken, the text would be analyzed again, and ChatGPT would decide the next move based on the updated game state. My vision was to make this system general-purpose, meaning it wouldn’t just solve Zork but could theoretically play any text adventure game with the same structure. The AI wouldn’t be “cheating” by reading source code or hacking game files; it would play just like a human would, using only the text presented on screen and making decisions based on what it had previously encountered.
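To make this loop concrete, here is a minimal sketch of how the cycle could be wired together in JavaScript. All of the helper names below (captureGameScreen, extractText, askChatGPT, typeIntoEmulator) are illustrative placeholders, not the project’s actual functions; the real tool’s internals may look quite different.

```javascript
// Minimal sketch of the screenshot -> OCR -> ChatGPT -> keystrokes loop.
// Every helper here is a hypothetical placeholder, not the project's real API.

async function playLoop(maxTurns = 200) {
  const history = []; // commands issued and game text seen so far

  for (let turn = 0; turn < maxTurns; turn++) {
    // 1. Capture the emulator screen as an image.
    const screenshot = await captureGameScreen();

    // 2. Extract the visible text with OCR (the post does not name a library;
    //    something like Tesseract.js would be one browser-friendly option).
    const gameText = await extractText(screenshot);

    // 3. Ask the model for the next command, given the current text and
    //    everything encountered so far.
    const command = await askChatGPT(gameText, history);

    // 4. Type the command into the emulator, just like a human player would.
    await typeIntoEmulator(command);

    history.push({ gameText, command });
  }
}
```

The important design choice is that the model only ever sees what a human player would see on screen: the loop never inspects the game’s memory or source files.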

At first, the simplicity of the approach felt promising. It was exciting to imagine an AI navigating these classic games the way we once did: interpreting descriptions, solving puzzles, and exploring unknown worlds with nothing but text as guidance. But while the logic was sound, the execution would soon prove far more complicated than I had anticipated. The AI could read the game, but could it truly understand it? The real challenge wasn’t just extracting text and generating responses—it was teaching the AI to think like a human player.


The first outputs: from excitement to frustration

The first version came together quickly. I used WebMSX, a browser-based MSX emulator, to run Zork I and built a simple JavaScript interface around it. With the help of GPT-4, I had the AI reading the game text, processing decisions, and sending keystrokes back to the emulator.
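The post does not detail how keystrokes are injected into WebMSX; one plausible browser-side approach, sketched below, is to dispatch synthetic KeyboardEvent objects so the emulator receives them as if a human were typing. The target element, key timing, and character mapping here are assumptions for illustration, not the project’s actual input code.

```javascript
// Hypothetical sketch: "type" a command into a browser-based emulator by
// dispatching synthetic keyboard events. WebMSX's real input handling, and the
// element it listens on, may differ.

function pressKey(key, code) {
  for (const type of ['keydown', 'keyup']) {
    document.dispatchEvent(new KeyboardEvent(type, { key, code, bubbles: true }));
  }
}

async function typeCommand(command) {
  for (const ch of command.toUpperCase()) {
    const code = ch === ' ' ? 'Space' : `Key${ch}`; // letters only; punctuation would need extra mapping
    pressKey(ch, code);
    await new Promise(resolve => setTimeout(resolve, 80)); // small pause so the emulator keeps up
  }
  pressKey('Enter', 'Enter'); // submit the command
}
```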

At first, I was impressed. The AI was exploring rooms, picking up items, and even solving some puzzles. It felt like magic—watching an AI play a game that once required so much human thought and creativity. But what really took me by surprise was the eerie sensation of seeing the AI type commands back into the emulator. It reminded me of that famous scene in Ghost, where the computer starts typing by itself as if possessed by an unseen force.

The screen flickered, letters appeared one by one, and I wasn’t the one typing. The AI was running the show. It was an incredible moment, both thrilling and unsettling.

AI Adventure Solver in action

But as I kept testing, something became painfully clear:


AI Was Dumb in Ways I Didn’t Expect

When I started this project, I knew that AI might make mistakes, but I assumed it would gradually improve over time. I thought that with enough context, clear instructions, and the ability to reference walkthroughs, it would be able to rationally play through a text adventure game. But what followed was a series of frustrating, often hilarious failures that took me on a rollercoaster of hope, confusion, and stubborn determination.


Attempt 1: The Pray Loop That Wouldn’t End

That evening, I sat down to run my first serious attempt with GPT-4o, eager to see how well the AI could handle Zork I. I watched as it confidently moved through the first few rooms, picking up items and making logical decisions. For a while, everything seemed promising—until it reached a part of the game where a ghost blocked its path.

At first, it hesitated. It tried a few things: looking around, moving in different directions, checking its inventory. Then, as if it had an epiphany, the AI typed:

“Pray.”

The game responded:

“If you pray enough, your prayers might be heard.”

That response must have triggered something in the AI’s logic. It took the phrase literally. It assumed that the game was telling it to pray more. I watched in disbelief as the AI locked itself into an infinite loop, praying endlessly, trapped by its own over-literal interpretation of the text.

At first, I laughed. This was exactly the kind of thing I had expected AI to be smart enough to avoid. I let it run for a while, fascinated by its sheer stubbornness. But eventually, I had to force a manual stop. That was attempt one, and it had ended in AI-induced divine desperation.

That night, I couldn’t stop thinking about it. How could I prevent this from happening again? Maybe I needed to modify the prompt—explicitly tell the AI not to repeat actions indefinitely. Or maybe it needed some kind of self-review mechanism?
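For reference, a prompt tweak along these lines can be as simple as appending one extra rule to the system prompt. The snippet below is purely illustrative; it is not the prompt the tool actually uses, and basePrompt is a hypothetical variable standing in for the rest of the instructions.

```javascript
// Illustrative only: an anti-repetition rule appended to a hypothetical base
// system prompt. Not the actual prompt used by the AI Adventure Solver.

const ANTI_LOOP_RULE = `
If you have issued the same command several times in a row and the game's
response has not changed, do NOT repeat it again. Choose a different action,
or type LOOK to re-read the room description and reconsider your plan.
`;

const systemPrompt = basePrompt + ANTI_LOOP_RULE; // basePrompt: assumed to exist elsewhere
```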


Attempt 2: The AI That Gave Up on Life

The next morning, I decided to give it another shot. I tweaked the prompt, reinforcing the idea that it should avoid repeating actions mindlessly. If nothing else, I was determined to at least get past the ghost. The AI started strong again. It went through the familiar motions: exploring the house, collecting items, entering the underground passages. But then, something unexpected happened:

It died.

That’s normal in Zork. The game explicitly gives players another chance after death. It’s not the end of the journey; it’s just a setback. But instead of acknowledging that it was still in play, the AI just stopped! It was as if, at that moment, the AI believed there was no way forward. It didn’t recognize that the game was giving it another opportunity. It simply quit.

I sat there staring at the screen, frustrated. Why would it do this? It had access to the game’s text. It could read that the game was letting it continue. Yet, it had made an irrational decision.

I took a break, grabbed some coffee, and thought about what this meant. Maybe the AI didn’t fully understand Zork’s mechanics. Maybe it was getting confused by how the game described death and resurrection. Or maybe, on some level, it just didn’t want to keep going.

I had no choice but to restart.


Attempt 3: Even With a Walkthrough, AI Gets Lost

I decided to give the AI an advantage. I fed it a full walkthrough of Zork I, a step-by-step guide that, if followed correctly, would lead it straight to victory. This should have been the breakthrough. With a predefined sequence of correct actions, there was no reason for it to fail.

But instead of following the walkthrough, the AI… wandered aimlessly. At first, it seemed to be sticking to the guide. It moved through rooms efficiently, picked up key items, and followed the recommended path. But then, for no apparent reason, it started backtracking.

It would go south, then north. Enter a room, then leave. Visit the same location three times in a row. It wasn’t making random moves: it was acting like someone who was lost despite holding a map. It was as if the AI wasn’t registering the significance of past decisions.

At this point, I was starting to lose patience. If it can’t follow a walkthrough, what hope is there? I let it run for a while, hoping it would correct itself. It didn’t. Eventually, I had to stop the session.

That night, I didn’t even want to think about it. Maybe AI just isn’t good at games.


Attempt 4: A Completely Nonsensical Failure From GPT-4o-mini

I woke up the next day, feeling slightly more determined. Let’s give it one last try. This time, I wanted to test a different AI model: GPT-4o-mini. The OCR was incredible: the AI read the game screen perfectly, without the minor text extraction errors I had seen before. But then, as I watched, it started doing something bizarre.

It kept typing “COLOR”.

Not “go north” or “take sword”, just “COLOR” over and over again. COLOR is an MSX-BASIC command that the F1 function key types by default, but it has absolutely nothing to do with Zork. For some reason, the model kept returning the F1 hotkey, even when it was obvious that nothing was happening.

Where was this coming from? At this point, I had seen enough. I shut the whole thing down.


Download the full attempt logs

You can download the full logs of the attempts above, and others, here:


Lessons from AI’s Struggles

After all these tests, I had more questions than answers. AI was supposed to be smart. It was being used to run businesses, write content, and make complex decisions. Yet, here it was, completely failing at a 40-year-old text adventure game.

Why?

  • It had no sense of long-term strategy.
  • It didn’t recognize when it was stuck in a loop.
  • It made irrational choices, ignoring clear game mechanics.
  • Even with a walkthrough, it got lost.

This wasn’t just about teaching AI to play a game. It was about understanding the limits of AI itself. Could it be improved? Absolutely. That’s where Version 2 of this experiment will come in, introducing smarter self-review mechanics, better command execution logic, and improved state tracking.

But for now, I had to admit: AI still had a long way to go before it could beat Zork.


Where AI Went Wrong (and How I Tried to Fix It)

The problem wasn’t AI’s intelligence but its lack of strategic memory. It wasn’t thinking like a human; it was thinking one step at a time, without an overarching plan. The game world was unfolding in front of it, but it wasn’t tracking the bigger picture.

I made some improvements:

  • Added walkthrough support, allowing the AI to reference guides when stuck.
  • Allowed dynamic prompt adjustments, so users could tweak instructions mid-game.
  • Introduced basic loop detection, warning the AI if it started repeating actions (see the sketch after this list).
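As a rough illustration of that loop-detection idea (not the project’s actual implementation), the check can be as simple as counting repeated commands in the last few turns and feeding a warning back into the next prompt; the window size and threshold below are arbitrary choices.

```javascript
// Illustrative loop detection: return a warning string when the same command
// keeps recurring in recent history. Thresholds are arbitrary, not the tool's
// real values.

function detectLoop(history, windowSize = 6, threshold = 3) {
  const recent = history.slice(-windowSize).map(turn => turn.command);
  const counts = {};
  for (const cmd of recent) {
    counts[cmd] = (counts[cmd] || 0) + 1;
    if (counts[cmd] >= threshold) {
      return `Warning: you have tried "${cmd}" ${counts[cmd]} times recently with no progress. Try something different.`;
    }
  }
  return null; // no loop detected
}
```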

These helped, but they weren’t enough. AI still made reckless decisions, like attacking enemies without a weapon or issuing long command chains that became invalid when the first command failed.

At this point, I had to ask myself: Can AI solve these games at all?


The Road Ahead: Smarter AI and V2 Ideas

The more I tested, the more I realized that this was just the beginning. The AI needed something more: a way to review its decisions and self-correct in real time.

For future versions, I’m considering:

  • Adaptive AI Reasoning – Every X turns, the AI should step back, review past moves, and detect patterns like loops, dead-ends, or inefficient routes.
  • Smarter Command Execution – Instead of sending long chains of actions, the AI should validate each move before proceeding.
  • Summarization Instead of Raw Logs – Right now, the AI is overwhelmed with too much context. Summarizing old moves could help it focus on what matters (a rough sketch follows this list).
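As one possible shape for that summarization idea (a sketch only, not something the current tool does), older turns could periodically be folded into a running summary so the prompt stays small; summarizeWithModel below is a placeholder for another call to the language model.

```javascript
// Hypothetical sketch: once the history grows past a limit, compress the
// oldest turns into a running summary and keep only the latest turns verbatim.
// summarizeWithModel stands in for another language-model call.

async function compactHistory(history, summary, maxTurns = 30, keepRecent = 10) {
  if (history.length < maxTurns) {
    return { history, summary }; // nothing to compact yet
  }
  const old = history.slice(0, -keepRecent);  // turns to fold into the summary
  const recent = history.slice(-keepRecent);  // keep the latest turns as-is

  const newSummary = await summarizeWithModel(
    `Previous summary:\n${summary}\n\nNew events:\n` +
    old.map(turn => `> ${turn.command}\n${turn.gameText}`).join('\n')
  );

  return { history: recent, summary: newSummary };
}
```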

These are big changes, and I don’t have all the answers yet. That’s why I’m putting this tool out there, so others can experiment, tweak it, and help push AI gaming forward.


Try It Yourself

If you want to see AI struggle (and occasionally succeed) at playing a classic adventure game, you can try it yourself here: AI Adventure Solver. To understand the interface and learn how to configure and play, please visit the AI Adventure Solver project on GitHub.

And I’m looking for your feedback! Maybe together, we can figure out how to get AI to finally beat Zork. Maybe we’ll learn something unexpected along the way. And maybe, just maybe, we’ll get closer to making AI truly think in the way we imagined it would.

Because if AI can run businesses, it should be able to beat a 40-year-old text game. Right?

Important Links