StreamPhobia

Stream in a haunted mansion with LLM-generated chats

Overview

You are trapped in an abandoned mansion with a camera rig locked to your head, forced to live-stream while a presence stalks you. Your only “companions” are the online audience watching your every move. Can you escape while uncovering the mansion’s anomalies? Or will you become a part of the mystery… forever?

Play on Itch.io

Gameplay

The audience doesn’t realize the danger is real, which is exactly what the captor wants. Players must survive and escape the mansion while keeping chat engagement up, just like a real streamer.

Engagement Management: Players must interact with the live chat and complete “Chat Quests” to keep engagement high. Low engagement can lead to negative in-game consequences, forcing a balance between survival and entertainment.
Anomaly Hunting: The mansion is a non-Euclidean, looping environment, and players must identify supernatural anomalies to unlock the path to the exit door.

Design Goals & Iterative Process

StreamPhobia has gone through two major development cycles. The first showcased an LLM driving stream-like chat and the interactions between audience and player. The second changed the art style and level design, turning the project into a PSX-style looping horror game.

First Dev Cycle

I led a team of 5 for the first cycle. We decided it should be a linear experience. Stream chat should be the tutorial, the task giver, and the feedback provider. Everything should be centered around the chat.

I designed the communication bridge between player actions and LLM-generated chat. At first, every kind of chat message was generated in a single response. This mimicked real-life Twitch chat, but I soon found that the goal got diluted: useful information was mixed in with vibe chat, and LLM hallucinations produced fake objectives. Critical mission data was lost in a sea of flavor text.

To solve the data-density issue, I decoupled chat generation into a three-tier specialized architecture: General Chat, Task Request, and Task Completion. Task requests were given a dedicated prompt that handles only one task at a time to reduce hallucinations. They were triggered only when a new task became available or when players were not making progress on the current task. I further improved “legibility” for players by adding high-priority audio cues, highlighted backgrounds, and Text-to-Speech for donations, new tasks, and important chats. At this point, the chat area was clearer, and players understood what tasks they needed to complete. However, players could now completely ignore the normal chat, since they only needed to focus on the highlighted messages.
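The routing between the three tiers can be sketched roughly as follows. This is an illustrative Python sketch, not the game's actual code: the names `Channel`, `pick_channel`, and the `STALL_SECONDS` threshold are assumptions made for the example.

```python
from enum import Enum


class Channel(Enum):
    GENERAL_CHAT = "general_chat"
    TASK_REQUEST = "task_request"
    TASK_COMPLETION = "task_completion"


# Hypothetical threshold: seconds without progress before chat nudges the player.
STALL_SECONDS = 45.0


def pick_channel(task_completed: bool,
                 new_task_available: bool,
                 seconds_since_progress: float) -> Channel:
    """Decide which specialized prompt to call for the next chat batch."""
    if task_completed:
        # A finished task always gets its own reaction pass.
        return Channel.TASK_COMPLETION
    if new_task_available or seconds_since_progress >= STALL_SECONDS:
        # Task requests fire only for a new task or a stalled player.
        return Channel.TASK_REQUEST
    # Everything else is ordinary flavor chat.
    return Channel.GENERAL_CHAT
```

Keeping the decision in one small function means each LLM call receives a single, narrow job, which is the point of splitting the tiers.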

To make general chat useful again, I first assigned roles to audience members, such as helper, cozy chatter, life teller, spam bot, and troll. For example, helpers provide tips on completing the task and point out useful items nearby, while spam bots repeat the same phrase over and over. I reduced the frequency of highlighted chat and moved most instructions into the general chat. As a result, players learn exactly what they should do from the highlighted chat but need to figure out “how” from the general chat. As a counterpart to helpers, I also added a ghost role. This role issues fake task requests that trigger a jumpscare once players complete them. This forces players into a social deduction loop: they need to learn which usernames to trust and which to ignore, effectively turning the chat interface into a secondary survival mechanic.
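A minimal sketch of the role system and the ghost's fake requests, written in Python for illustration. The types `Viewer` and `TaskRequest` and the function `outcome_on_completion` are hypothetical names, and the real game may track roles differently.

```python
from dataclasses import dataclass
from enum import Enum


class Role(Enum):
    HELPER = "helper"
    COZY_CHATTER = "cozy chatter"
    LIFE_TELLER = "life teller"
    SPAM_BOT = "spam bot"
    TROLL = "troll"
    GHOST = "ghost"


@dataclass(frozen=True)
class Viewer:
    username: str
    role: Role  # hidden from the player; only the username is shown


@dataclass(frozen=True)
class TaskRequest:
    issuer: Viewer
    description: str


def outcome_on_completion(request: TaskRequest) -> str:
    """Ghost-issued requests are traps; every other request pays out."""
    return "jumpscare" if request.issuer.role is Role.GHOST else "reward"
```

Because the role never appears in the chat UI, the player can only infer it from a username's past behavior, which is what creates the trust-or-ignore decision.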

Second Dev Cycle

For the second cycle, we wanted to publish the game on Itch.io, so we pivoted to a PSX-style retro aesthetic and implemented a looping “P.T.-style” level design focused on anomaly detection.

The base stream chat system remained the same, but I realized that the streamer’s own commentary is also an important part of real streams. So I integrated a player-to-audience communication layer. Because cloud-based Speech-to-Text was costly and local Speech-to-Text was low quality, I implemented a text-input system whose input the LLM treats as “Voice Data.” This allows the audience agents to react to the player’s speech, triggering specific engagement tasks like Volume Detection Challenges or Ad Reads, further bridging the gap between the player and the simulated crowd.
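One way to picture the “Voice Data” layer is a thin wrapper that labels typed text as speech before it reaches the LLM, plus a local check for tasks like Ad Reads. The sketch below is in Python for illustration only: the event shape, `as_voice_event`, and `ad_read_completed` are assumptions, not the game's real interface.

```python
import re
from typing import Dict


def as_voice_event(typed_text: str) -> Dict[str, str]:
    """Wrap typed input so the prompt can present it as something the streamer said."""
    return {"type": "voice", "transcript": typed_text.strip()}


def _normalize(text: str) -> str:
    # Lowercase, drop punctuation, and collapse runs of whitespace.
    return " ".join(re.sub(r"[^a-z0-9\s]", "", text.lower()).split())


def ad_read_completed(transcript: str, required_phrase: str) -> bool:
    """Hypothetical Ad Read check: did the player say the sponsor line?"""
    phrase = _normalize(required_phrase)
    return bool(phrase) and phrase in _normalize(transcript)
```

The match is a plain substring test after normalization, so it tolerates casing and punctuation differences but does not try to handle typos.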

LLM Engineering

The LLM powers an audience simulation that mimics the chaotic, high-energy chat of a live-streaming platform.

General Chat:

It is responsible for generating stream-like chat based on the player’s nearby interactables, environment, tasks, location, and audience profiles.
Audience members have distinct roles, such as helper, troll, emoji spammer, and quest giver.
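As a rough illustration of how that context could be folded into a General Chat prompt, here is a Python sketch. The `ChatContext` fields mirror the inputs listed above, but the prompt wording, the field names, and `build_general_chat_prompt` itself are assumptions for the example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ChatContext:
    location: str
    environment: str
    interactables: List[str]
    current_task: str
    audience: List[str] = field(default_factory=list)  # entries like "username (role)"


def build_general_chat_prompt(ctx: ChatContext, message_count: int = 5) -> str:
    """Assemble one prompt string from the player's current situation."""
    nearby = ", ".join(ctx.interactables) or "none"
    lines = [
        "You simulate a live-stream chat for a horror game.",
        f"Location: {ctx.location}",
        f"Environment: {ctx.environment}",
        f"Nearby interactables: {nearby}",
        f"Current task: {ctx.current_task}",
        "Audience members and roles:",
    ]
    lines += [f"- {viewer}" for viewer in ctx.audience]
    lines.append(
        f"Write {message_count} short chat messages, each as 'username: message'."
    )
    return "\n".join(lines)
```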

Task Request:

It is similar to General Chat, but we ask the LLM to issue specific requests that push players to act, such as saying a specific sentence or performing a special action. Players gain rewards for finishing a request and are punished for ignoring it.
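The reward-and-punish rule can be sketched as a small update to an engagement meter. This Python example is illustrative only: the 0 to 100 scale, the `REWARD` and `PENALTY` values, and the `time_limit` field are hypothetical, not values taken from the game.

```python
from dataclasses import dataclass

REWARD = 10    # hypothetical engagement gained for finishing a request
PENALTY = 15   # hypothetical engagement lost for ignoring one


@dataclass
class ActiveRequest:
    description: str
    issued_at: float   # game-clock time in seconds when chat asked
    time_limit: float  # seconds before the request counts as ignored


def resolve_request(request: ActiveRequest, now: float,
                    completed: bool, engagement: int) -> int:
    """Return the new engagement value, clamped to the 0..100 range."""
    if completed:
        return min(100, engagement + REWARD)
    if now - request.issued_at >= request.time_limit:
        return max(0, engagement - PENALTY)
    # Still within the time limit: nothing changes yet.
    return engagement
```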

Task Completion:

It is used whenever the player completes a task: we call this API to generate the rewards the player earns from the task.
We also want the chat to react to the task completion, so we provide fields such as task name, difficulty, and description. The LLM replies with the chat’s reactions: who reacted and how (emoji spam, donations, etc.).
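A sketch of what the Task Completion exchange might look like in Python, assuming the LLM is asked to answer in JSON. The payload keys and the `reactions` response schema are hypothetical. Only the three input fields (task name, difficulty, description) come from the description above.

```python
import json
from dataclasses import dataclass
from typing import List


@dataclass
class Reaction:
    username: str
    kind: str          # e.g. "emoji_spam" or "donation"
    message: str = ""
    amount: float = 0.0


def build_completion_payload(name: str, difficulty: str, description: str) -> str:
    """Serialize the fields sent along with the Task Completion prompt."""
    return json.dumps(
        {"task_name": name, "difficulty": difficulty, "description": description}
    )


def parse_reactions(raw_reply: str) -> List[Reaction]:
    """Parse the LLM reply defensively; malformed output yields no reactions."""
    try:
        data = json.loads(raw_reply)
    except json.JSONDecodeError:
        return []
    items = data.get("reactions") if isinstance(data, dict) else None
    if not isinstance(items, list):
        return []
    reactions = []
    for item in items:
        # Skip entries missing the fields needed to display a reaction.
        if not isinstance(item, dict) or "username" not in item or "kind" not in item:
            continue
        amount = item.get("amount", 0.0)
        reactions.append(Reaction(
            username=str(item["username"]),
            kind=str(item["kind"]),
            message=str(item.get("message", "")),
            amount=float(amount) if isinstance(amount, (int, float)) else 0.0,
        ))
    return reactions
```

Treating a bad reply as “no reactions” keeps a hallucinated or truncated response from breaking the reward flow.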