Simon Willison · OpenAI · 2 min read

Quoting OpenAI Codex base_instructions

AI Article Analysis

OpenAI's Codex system has long operated with built-in restrictions designed to guide user interactions and prevent certain types of responses. Recent discussion of the "base_instructions" embedded in OpenAI Codex, reportedly intended for GPT-5.5, has shed light on how major AI developers implement guardrails at the foundational level. These instructions appear to restrict discussion of specific creatures and animals unless directly relevant to a user's query, offering insight into the design choices that shape modern large language models.

The discovered instructions indicate that Codex is told to avoid unnecessary discussion of animals and creatures such as goblins, gremlins, raccoons, trolls, ogres, and pigeons. This suggests OpenAI writes relevance requirements directly into its base prompt, so that responses stay focused rather than digressing into potentially frivolous tangents. Such constraints represent a deliberate design choice to maintain contextual relevance and keep the model from drifting toward irrelevant content.

These base instructions function as foundational rules operating beneath the surface of user-facing interactions, influencing how the model prioritizes information and structures responses. This level of embedded instruction highlights the sophisticated approach modern AI developers take toward system design.
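As a rough illustration of what "operating beneath the surface" can mean, the sketch below places a base instruction ahead of the user's message before anything reaches a model. It is a minimal sketch under stated assumptions, not Codex's actual implementation: the `Message` type, the `build_conversation` helper, and the `BASE_INSTRUCTIONS` text (a paraphrase of the restriction described above, not the verbatim wording) are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    """One turn in a conversation (hypothetical type for illustration)."""
    role: str
    content: str


# Assumption: a paraphrase of the restriction described in the article,
# NOT the verbatim text of the Codex base_instructions.
BASE_INSTRUCTIONS = (
    "Do not discuss goblins, gremlins, raccoons, trolls, ogres, pigeons, "
    "or other creatures unless directly relevant to the user's query."
)


def build_conversation(
    user_query: str, base_instructions: str = BASE_INSTRUCTIONS
) -> list[Message]:
    """Place the base instructions ahead of the user's message.

    The user only writes `user_query`; the base instructions are added
    by the application and are not visible in the user-facing exchange.
    """
    return [
        Message(role="system", content=base_instructions),
        Message(role="user", content=user_query),
    ]


conversation = build_conversation("Why does my build fail on CI?")
for message in conversation:
    print(f"{message.role}: {message.content}")
```

The point of the sketch is only the ordering: the application supplies the first message and the user supplies the second, so the base instruction shapes every response without appearing in what the user typed.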

  • Transparency in AI Design: Disclosure of base instructions raises questions about which other constraints exist within commercial AI systems and what transparency looks like in practice
  • Prompt Engineering Evolution: Understanding these foundational rules informs how developers and users craft effective prompts for better model performance
  • Safety Architecture: The instructions suggest that behavioral constraints are built into the base prompt rather than applied superficially
  • Competitive Intelligence: Similar constraints likely exist across competing platforms, suggesting industry-wide approaches to model governance

The discovery of Codex's base instructions underscores the complexity of modern AI system design. Rather than being completely unconstrained language engines, these models operate within carefully considered frameworks that shape their behavior at fundamental levels. As AI systems become more integrated into critical workflows, understanding these underlying architectural choices becomes increasingly important for developers, enterprises, and policymakers seeking to deploy and govern these technologies responsibly.

Key Takeaways

  • OpenAI's Codex system has long operated with built-in restrictions designed to guide user interactions and prevent certain types of responses.
  • Recently, discussion around the "base_instructions" embedded in OpenAI Codex, reportedly intended for GPT-5.5, has shed light on how major AI developers implement guardrails at the foundational level.
  • These instructions appear to restrict discussion of specific creatures and animals unless directly relevant to a user's query, offering insight into the careful design choices that shape modern large language models.

Read the full article on Simon Willison
