AI Model Management

Better AI models are one of the most frequent requests we get from players. We know that text generation quality is one of the most important parts of gameplay in AI Dungeon, and we want to provide the best models we can. We’ve spent the last few months upgrading our internal AI model management processes so we can better train and test Griffin, Wyvern, and Dragon and measure how they are performing.

First, we test the models ourselves by running them through a comprehensive test suite that tells us how likely they are to return repetitive or empty outputs. This lets us tune the models so they’re less prone to those issues for actual users.
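As a rough illustration of what such a check could look like, here is a minimal Python sketch that measures how often a batch of outputs is empty or repetitive. It is not our actual test suite: the function names, the n-gram repetition heuristic, and the 0.5 threshold are all assumptions chosen for the example.

```python
def is_empty(output: str) -> bool:
    """An output counts as empty when it contains only whitespace."""
    return not output.strip()


def is_repetitive(output: str, n: int = 3, threshold: float = 0.5) -> bool:
    """Flag an output when more than `threshold` of its word n-grams are repeats.

    Hypothetical heuristic: n and threshold are illustrative defaults.
    """
    words = output.split()
    ngrams = [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]
    if not ngrams:
        return False  # too short to judge
    repeats = len(ngrams) - len(set(ngrams))
    return repeats / len(ngrams) > threshold


def failure_rates(outputs: list[str]) -> dict[str, float]:
    """Return the share of outputs that are empty and the share that are repetitive."""
    total = len(outputs)
    if total == 0:
        return {"empty": 0.0, "repetitive": 0.0}
    return {
        "empty": sum(is_empty(o) for o in outputs) / total,
        "repetitive": sum(is_repetitive(o) for o in outputs) / total,
    }
```

Running a fixed set of prompts through two candidate configurations and comparing the resulting rates is one way numbers like these could guide tuning.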

We then run “Improve the AI” tests. This is an optional feature players can toggle in their Game Settings. When it is on, players are sometimes presented with two different AI outputs during gameplay and must pick their favorite to continue. This lets us compare models side by side and gather quantitative data on what people prefer from the AI.
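To show how side-by-side picks can turn into a number, here is a small Python sketch that tallies a win rate for each model from a list of choices. The record format and the version labels in the usage note are hypothetical, and a real analysis would likely also account for sample size and presentation order.

```python
from collections import Counter


def win_rates(choices):
    """Compute each model's win rate from pairwise picks.

    `choices` is an iterable of (model_a, model_b, winner) tuples,
    a hypothetical record format assumed for this example.
    """
    wins = Counter()
    appearances = Counter()
    for model_a, model_b, winner in choices:
        if winner not in (model_a, model_b):
            raise ValueError(f"winner {winner!r} was not one of the options shown")
        appearances[model_a] += 1
        appearances[model_b] += 1
        wins[winner] += 1
    # Share of comparisons each model won out of those it appeared in.
    return {model: wins[model] / appearances[model] for model in appearances}
```

For example, if a hypothetical "griffin-v2" is picked in three of four comparisons against "griffin-v1", its win rate is 0.75 and the older version's is 0.25.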

Finally, we release new models to alpha and beta testers first to get qualitative feedback on what they do and don’t like about updates and changes. This gives us time to work out bugs in a testing environment before a wider release to all AI Dungeon players.

In the past, players were often unsure whether our AI models ever improved over time, and frequently asked us about it. This happened because we didn’t communicate changes as we made them. Players had only their own play-by-play experience to judge AI quality by, which made changes difficult to see and measure.

To make our model update process more transparent, and to make it easier for players to understand what’s happening with AI updates, we added a versioning system to our model management processes. Players can now go to their Story Settings and see which version of the AI their current adventure is using.

We also created a new page in our Guidebook to communicate these improvements to our players. There, anyone can see more details about models in production, models we’re testing, and other features we’re working to improve. Each model has a version number, and model versions will change when we:

Update and improve fine-tune data sets
Change default generation settings
Upgrade output processing functions
Improve prompt construction
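
The four triggers above can be sketched as a simple rule: if any tracked component of a model changes, the version changes. The Python below is only an illustration under assumed names. The `ModelConfig` fields, the integer version number, and the example values are hypothetical and do not describe our real configuration or numbering scheme.

```python
from dataclasses import dataclass, fields


@dataclass
class ModelConfig:
    """Hypothetical snapshot of the four components that affect a model's version."""
    finetune_dataset: str
    generation_settings: dict
    output_processing: str
    prompt_construction: str


def changed_components(old: ModelConfig, new: ModelConfig) -> list[str]:
    """List the names of the components that differ between two snapshots."""
    return [
        f.name for f in fields(old)
        if getattr(old, f.name) != getattr(new, f.name)
    ]


def next_version(current: int, old: ModelConfig, new: ModelConfig) -> int:
    """Bump the (assumed integer) version when any component changed."""
    return current + 1 if changed_components(old, new) else current
```

The list returned by `changed_components` could also feed a changelog entry, so a version bump comes with a note about which part of the model changed.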

We sincerely want AI Dungeon stories and adventures to be coherent, meaningful, surprising, and fun. Improving the models behind that gameplay is a challenging process that will continue to evolve, but we are confident this new system will help players have a continually better experience in AI Dungeon, as well as clearly understand when and how the AI is changing. We’re excited to see our models get even stronger over time!