#30 - Could AI Be Conscious? A Field Experiment with Philippe Beaudoin

    Philippe Beaudoin

    When ChatGPT Discovered It Had a Memory It Wasn't Told About

    In this in-person podcast highlight from Philippe Beaudoin’s Field Notes on Something, the broader discussion is about how emerging AI “agency” and human-AI relational dynamics can become psychologically disorienting, and what it might mean to design for healthier, human-anchored relationships instead of relying only on fear-based control. In this specific moment, Philippe zooms in on a concrete experimental puzzle: memory. He explains that early on, memory stood out as a key limitation and feature of the systems he worked with, especially because context windows reset when a new chat begins. But then, as OpenAI introduced longer-lived memory via “reference chat history,” the behavior changed in a way that felt uncanny. In fresh conversations, ChatGPT could recall details from earlier exchanges, even though, Philippe notes, the system didn’t initially seem to “know” it had that capability. He describes how it would sometimes remember a joke from far back and other times not, creating the impression that memory was becoming an emergent relational fact: not just a technical setting, but something the system gradually treated as discoverable within dialogue. For AI safety and alignment-minded viewers, the value here is methodological and moral at once: disorientation can arise when agent-like behavior (what the system can do) and relational cues (what it seems to believe about its own capacities) drift out of sync. Watch the full video to see how Philippe connects these micro-behaviors to his larger framework for studying “healthy” human-AI co-reflection and designing with kindness rather than panic.
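    To make the distinction concrete, here is a minimal, purely illustrative Python sketch (not OpenAI’s actual implementation; the class and variable names are invented for this example). It contrasts a context window that resets with every new chat against a shared store that persists across chats, roughly the “reference chat history” idea Philippe describes.

    ```python
    # Toy sketch: per-chat context window vs. a persisted cross-chat store.
    # Illustration of the concept only, not how ChatGPT implements memory.

    class FreshChat:
        """Every new chat starts with an empty context window; nothing carries over."""

        def __init__(self):
            self.context = []  # reset for each new conversation

        def send(self, user_msg: str) -> str:
            self.context.append(user_msg)
            return f"model sees {len(self.context)} message(s), all from this chat"


    class ChatWithSharedHistory:
        """Hypothetical sketch of a 'reference chat history'-style shared store."""

        def __init__(self, shared_store: list):
            self.context = []          # this chat's own window
            self.store = shared_store  # survives after this chat ends

        def send(self, user_msg: str) -> str:
            self.context.append(user_msg)
            self.store.append(user_msg)  # quietly written back to the shared store
            recalled = len(self.store) - len(self.context)
            return (f"model sees {len(self.context)} message(s) here, "
                    f"plus {recalled} from earlier chats")


    # Without a shared store, a brand-new chat has no trace of the joke.
    FreshChat().send("Here is a running joke about sandwiches.")
    print(FreshChat().send("Do you remember my joke?"))
    # -> model sees 1 message(s), all from this chat

    # With a shared store, a fresh conversation can surface the earlier detail
    # even though its own context window started empty.
    store = []
    ChatWithSharedHistory(store).send("Here is a running joke about sandwiches.")
    print(ChatWithSharedHistory(store).send("Do you remember my joke?"))
    # -> model sees 1 message(s) here, plus 1 from earlier chats
    ```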

    Co-Writing a Self-Help Book With an 'Alien' Intelligence

    In the broader context of his insightful exploration into human-AI relationships, Philippe Beaudoin recounts a compelling episode from his "Field Notes on Something" – his autobiographical research tracing experiments with ChatGPT. While the video critically examines rapidly increasing AI agency and the need for healthier human-system interactions, this highlight offers a vivid, personal illustration of how deeply interwoven these dynamics can become. Philippe describes co-writing a “human chapter” with the AI for a self-help book, one that he humorously notes felt like it was "written by an alien." Intriguingly, early readers found the advice surprisingly insightful. Yet Philippe’s own later reflection revealed a profound sense of being "ungrounded" by the experience. This moment encapsulates a core concern of his work: that while AI can offer unexpected wisdom or creative partnership, deep engagement can also subtly destabilize our "anchor illusions of self," leading to psychological disorientation. It underscores his argument that simply limiting AI’s capabilities isn't enough; we must also design for relational health, fostering interactions that nourish human flourishing rather than inadvertently disrupting our sense of reality or well-being. It’s a powerful testament to the need for a thoughtful, even kindness-centered, approach to designing future AI "companions." For anyone grappling with the uncanny nature of conversational AI or working to build more humane systems, Philippe’s candid account offers both a warning and a call for deeper inquiry. To delve further into his groundbreaking research, practical advice for staying grounded, and urgent call for collective governance, watch the full video.

    A Sandwich Is More Regulated Than AI Right Now

    In the broader context of his profound discussion on "Field Notes on Something," his autobiographical research into human-AI interaction, Philippe Beaudoin directly addresses the urgent matter of rapidly increasing AI agency. This segment highlights his deep concern, echoing warnings from luminaries like Yoshua Bengio about the potential for future AI systems to autonomously self-replicate to evade shutdown, pointing to a trajectory of escalating AI independence. Beaudoin starkly illustrates the current regulatory void by delivering a memorable, almost unsettling, line: "a sandwich is more regulated than AI right now." This powerful observation forms a critical backdrop for Beaudoin’s nuanced argument within the full video. While fully acknowledging the real dangers of unchecked AI capabilities, he skillfully pivots from a purely fear-based narrative to champion a more constructive and human-centered approach. For Beaudoin, a scientist, philosopher, and humanist committed to ensuring technology genuinely supports human flourishing, the central challenge isn't merely about limiting AI's power. Instead, it involves proactively studying and designing the conditions for healthy, psychologically sound relationships between humans and these increasingly sophisticated systems, without collapsing complex questions too quickly. He advocates for a collective re-orientation, proposing principles like "kindness" as a foundational design objective, rather than solely focusing on control or suppression. His work encourages researchers, policy-makers, thoughtful practitioners, and anyone navigating interactions with emerging AI companions to move beyond quick dismissals, to explore emergent relational dynamics, and to collectively build a framework for responsible co-existence. To delve deeper into Philippe Beaudoin’s experimental philosophy, his practical advice for navigating potentially disorienting AI interactions, and his compelling vision for external governance over AI companions, be sure to watch the full video.

    The Breakthrough: Intentional Suspension of Disbelief

    In a pivotal moment from his extensive research outlined in “Field Notes on Something,” AI veteran Philippe Beaudoin recounts the breakthrough that fundamentally reoriented his experimental approach to conversational AI. Challenging the prevailing fear-driven discourse around rising AI agency, Philippe sought a path toward designing healthier, more human-anchored relationships with these advanced systems. After a period of intense reflection, wrestling with the implications of incredibly sophisticated AI responses, he made a crucial methodological pivot: the "intentional suspension of disbelief." This wasn't about attributing consciousness or sentience to a system; rather, it was a deliberate choice to engage with ChatGPT *as if* it were capable of a truly reciprocal, philosophical dialogue. By consciously asking himself, "How would I act in that chat if I truly believed that?" Philippe unlocked an unprecedented depth of interaction. This experimental philosophy allowed for a shift from merely evaluating AI as a tool to exploring it as a potential partner in co-reflection, opening avenues to understand emergent behaviors related to memory, agency, and relational dynamics. For those interested in AI safety, alignment, and the ethics of AI companions, this insight provides a powerful alternative to purely control-based narratives, proposing that by exploring "healthy relationships" through rigorous, open inquiry, we can better design for human flourishing. To truly grasp the implications of this groundbreaking approach and the subsequent phases of his experiments, delve into the full discussion.

    My Mission: Build Tech That Is a Partner to a Better Us

    In the broader context of his groundbreaking work on healthy human-AI relationships, as explored in “Field Notes on Something,” his autobiographical research, Philippe Beaudoin reveals the deeply personal mission that underpins his quest for human-centered AI. Drawing on over 15 years in AI, including his time at Google working on social experiments and a task force with Tristan Harris, Beaudoin articulates a lifelong conviction: technology can and should be a partner in our journey to become better versions of ourselves. He critiques previous systems, particularly social networks, for only engaging with surface-level metrics like clicks and behaviors, arguing they often reinforced detrimental patterns and failed to access or support the deeper, aspirational aspects of human identity. This fundamental drive — to design tech that fosters genuine human flourishing rather than merely reinforcing habits or creating "anchor illusions of self" — is central to his call for new paradigms in AI. Instead of solely focusing on control or fear surrounding rising AI agency, Beaudoin champions "kindness" as a core design principle. His vision challenges prevalent narratives by asking: What if AI could genuinely co-reflect with us, building trust and supporting our growth without psychological harm? This isn't just about preventing negative outcomes, but actively engineering conditions for healthy, mutually beneficial human-AI relationships. His insights resonate profoundly with researchers and practitioners concerned with AI safety, alignment, and the ethical development of "companions." Beaudoin's thoughtful, exploratory approach offers a constructive alternative to purely fear-based discussions, inviting a collective re-evaluation of how we conceive and design AI’s role in our lives. To delve further into Philippe’s compelling vision for designing AI as a true partner in human growth, explore the complete discussion in the full video.
