AI model data handling guide

How ROST cloud agents, local MCP sessions, runners, BYOK, and connected tools handle model-bound data.

Use this guide when a human, CLI session, MCP client, or in-app agent needs to understand what data can reach an AI model, what ROST controls, and what remains governed by the customer's provider account, local client, or connected tools.

This is product guidance, not a legal warranty. Provider terms, enterprise contracts, data-processing addenda, retention controls, regional settings, and endpoint-specific behavior can change. Verify the provider account and contract before relying on a zero-data-retention, HIPAA, regional, or no-training posture.

For ROST's own product-improvement posture, read the public Privacy Policy and Terms of Service at /privacy and /terms. They state that ROST may analyze aggregated or de-identified data about usage, setup patterns, agent outcomes, and operating metrics to improve product features, defaults, templates, recommendations, safety checks, and reliability. They also state that ROST does not sell customer data, share one customer's workspace data with another customer, or use identifiable customer content to train generalized AI models. Google user data and information derived from Google APIs stay under the Privacy Policy's Google user data limits and are not included in cross-customer product-improvement analytics. These statements cover ROST only; do not extend them into a claim about how a customer's local client, BYOK provider account, or third-party tool handles data.

The three agent lanes

ROST names the execution lane on runs and work orders. Use these exact terms:

| Lane | Where model-bound data is assembled | Who controls provider handling |
| --- | --- | --- |
| `cloud` | The ROST cloud runtime builds bounded prompt context from the Seat Charter, signed permission manifest, task or work order, relevant operating facts, selected Skill files, and tool results. | The current managed cloud path uses Anthropic-backed model configuration, either through ROST's platform-managed provider account or the tenant's approved BYOK Anthropic credential when that tenant has saved one. |
| `mcp_session` | A local agent client such as Claude Code, Codex, OpenCode, Cursor, or another MCP-capable client asks ROST for scoped context, resources, and command results. | The customer's local client, model provider account, proxy, and provider settings. ROST does not control that local provider's retention, training, telemetry, or local transcript storage. |
| `runner` | A paired runner claims work from ROST, receives bounded work-order context, and may invoke a local or customer-controlled model client to complete the work. | The runner host, local client, and provider account used by that runner. ROST controls the work-order boundary and audit path, not every downstream local model setting. |

When answering "where did my data go?", name the lane first. Do not collapse cloud, mcp_session, and runner into one generic "agent" path.
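
As a rough illustration of naming the lane first, the sketch below models the three lane terms as a Python enum and rejects anything else, including a generic "agent". The `Lane` enum, the `PROVIDER_CONTROL` mapping, and the `describe_lane` helper are hypothetical names invented for this guide, not part of any ROST API; the descriptions paraphrase the lane table above.

```python
from enum import Enum


class Lane(str, Enum):
    """The three execution lane terms used in this guide."""
    CLOUD = "cloud"
    MCP_SESSION = "mcp_session"
    RUNNER = "runner"


# Hypothetical summary of who controls provider-side handling per lane.
PROVIDER_CONTROL = {
    Lane.CLOUD: "ROST platform-managed Anthropic account or the tenant's BYOK Anthropic credential",
    Lane.MCP_SESSION: "the customer's local client, provider account, proxy, and provider settings",
    Lane.RUNNER: "the runner host, local client, and provider account used by that runner",
}


def describe_lane(lane_name: str) -> str:
    """Return a one-line answer that names the lane before anything else.

    Raises ValueError for any term that is not one of the three lanes,
    so a generic "agent" path cannot slip through.
    """
    lane = Lane(lane_name)
    return f"{lane.value}: provider handling is controlled by {PROVIDER_CONTROL[lane]}"


if __name__ == "__main__":
    print(describe_lane("mcp_session"))
```

Keeping the lane as a closed set makes an unnamed or merged path an explicit error rather than a silent default.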

Cloud agents

For cloud runs, ROST prepares the minimum context needed for the Seat to act:

Charter purpose, responsibilities, autonomous scope, approval scope, escalation rules, Signals, and signed permission manifest.
The task, work order, Sync/Friction/Signal context, or human instruction for that run.
Assigned Skill summaries and the bounded contents of selected approved Skill files.
Tool results that the server-side guard allowed and that are relevant to the next model turn.
Model catalog choices and run metadata needed to attribute cost and diagnostics.

Cloud prompts should not include raw credentials, long exports, unrelated customer records, or secret values. A tool result can become model context, so keep tool scopes narrow and connect only the tools the Seat's Charter actually needs. The server-side guard decides whether a tool call is allowed, escalated, or denied; model output and Skill text never grant authority.
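
One way to picture the guard's three outcomes is a pure function over the permission manifest. This is a minimal sketch under stated assumptions: the `Manifest` fields, the example tool names, and the `guard` function are invented for illustration and do not describe how ROST's server-side guard is actually implemented.

```python
from dataclasses import dataclass
from enum import Enum


class Decision(Enum):
    ALLOW = "allow"
    ESCALATE = "escalate"
    DENY = "deny"


@dataclass(frozen=True)
class Manifest:
    """Hypothetical stand-in for a signed permission manifest."""
    autonomous_tools: frozenset  # tools the Seat may call on its own
    approval_tools: frozenset    # tools that need a human gate


def guard(tool: str, manifest: Manifest) -> Decision:
    """Decide from the manifest alone.

    The function takes no model output and no Skill text as input,
    so neither can grant authority.
    """
    if tool in manifest.autonomous_tools:
        return Decision.ALLOW
    if tool in manifest.approval_tools:
        return Decision.ESCALATE
    return Decision.DENY


if __name__ == "__main__":
    manifest = Manifest(
        autonomous_tools=frozenset({"crm.read"}),   # hypothetical tool name
        approval_tools=frozenset({"email.send"}),   # hypothetical tool name
    )
    for name in ("crm.read", "email.send", "billing.refund"):
        print(name, guard(name, manifest).value)
```

The default branch is deny: a tool that the manifest does not mention is never allowed just because the model asked for it.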

BYOK Anthropic

Tenant BYOK changes which provider account is used for eligible cloud model calls. It does not loosen tool permissions, human gates, tenant isolation, or audit requirements.

Store a tenant Anthropic key only through tenant.anthropic_key.save / rost_save_tenant_anthropic_key or the generic credential ingress path. The key is handled as a vault-backed credential: ROST stores a vault reference and safe metadata, and the raw key does not appear in prompts, config, logs, or command output. A human approves the credential flow.
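
The vault-reference idea can be sketched as follows. Everything here is an assumption made for illustration: the `CredentialRecord` shape, the `vault://` reference format, the in-memory `vault` dict, and the `save_key` helper are hypothetical and do not model tenant.anthropic_key.save or ROST's real vault.

```python
import uuid
from dataclasses import dataclass


@dataclass(frozen=True)
class CredentialRecord:
    """What the application keeps: a reference and safe metadata only."""
    vault_ref: str
    provider: str
    label: str


def save_key(raw_key: str, vault: dict, label: str) -> CredentialRecord:
    """Put the raw key in the vault and return a record that omits it.

    The reference is random, not derived from the key, so the record
    reveals nothing about the secret value.
    """
    ref = f"vault://{uuid.uuid4().hex}"  # hypothetical reference format
    vault[ref] = raw_key
    return CredentialRecord(vault_ref=ref, provider="anthropic", label=label)


if __name__ == "__main__":
    vault = {}  # stand-in for a real secret store
    record = save_key("sk-test-not-a-real-key", vault, label="tenant BYOK")
    # Safe to log or return in command output: no raw key inside.
    print(record)
```

Only code that resolves the reference inside the tool or provider handler ever sees the raw value; prompts, logs, and command output carry the record.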

After BYOK is saved, provider-side retention, training, regional, HIPAA, or zero-data-retention posture depends on that Anthropic organization, workspace, model, feature, and contract. Do not tell a customer "BYOK means zero retention" or "BYOK means no training" unless their provider arrangement actually says that.

Local MCP sessions

In an mcp_session, ROST is the governed operating-system surface. The local agent client is still its own model client.

ROST can enforce token scope, tenant isolation, command schemas, human confirmations, tool guards, and redacted command output. It cannot decide whether the local client stores transcripts, sends telemetry, opts into model improvement, uses a consumer account, uses a commercial account, routes through a proxy, or applies zero-data-retention settings. Those are local client and provider settings.

ROST command and resource outputs are returned to the local client. Seat context, Charter text, task and Signal facts, reference guide bodies, command JSON, and filtered tool results may become part of that client's provider request unless the operator narrows scope, redacts the input, or uses a provider setting that prevents it. Treat every MCP read as data that can enter the local model transcript.

Before connecting a local agent to sensitive work, verify:

Which model provider and account the local client is using.
Whether the account is consumer, team, enterprise, API, or routed through another platform.
Whether model-improvement sharing, telemetry, transcript retention, and local logs are enabled.
Whether the client stores session files locally, and where those files live.
Whether screenshots, files, terminal output, browser data, or tool results can be included in the client transcript.

For example, official Claude Code docs distinguish consumer and commercial handling, describe local plaintext session transcripts, and state that local clients send prompts and outputs to the configured provider. Official OpenAI API docs distinguish API data controls from consumer products, while official Codex help says ChatGPT training data controls apply to content processed through Codex when Codex is used with a ChatGPT plan. Treat each local agent's provider settings as part of the customer's environment review.

Runners

A runner is paired to ROST and claims work orders. ROST records the work-order state, lane, Seat, run diagnostics, and audited tool-call outcomes. The runner host controls the local execution environment.

Keep runner prompts bounded to the work order and Seat context. Do not paste runner bearer secrets, API keys, .env contents, SSH keys, OAuth tokens, database URLs, or private package credentials into prompts, tool args, screenshots, logs, or status updates. If a runner needs a credential, use the governed credential ingress and vault reference path or the customer's own secure local secret mechanism; do not turn the credential into model context.

If a runner invokes Claude Code, Codex, OpenCode, Cursor, or another local model client, provider data handling follows that local client and account, not the ROST cloud provider path.

Connected tools and tool results

Connected tools are often the highest-risk data path because a valid tool result may be summarized into the next model turn.

Use these rules:

Connect the narrowest tool scopes needed for the Seat's Charter.
Prefer filtered reads over full exports.
Keep credentials in vault refs and tool handlers; do not send raw secret values to the model.
Treat third-party tool data handling separately from model-provider data handling.
Escalate when a tool would expose regulated data, unusually broad customer records, confidential legal or HR material, or data outside the Seat's approved scope.

Tool selection is never authorization. Every live tool call still passes the server-side guard and writes an audit record.
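
To make "prefer filtered reads over full exports" concrete, the sketch below trims a tool result to an allow-list of fields and a row cap before it could become model context. The `filter_tool_result` helper, its parameters, and the sample field names are hypothetical; this is an illustration of the rule, not a ROST tool handler.

```python
from typing import Any


def filter_tool_result(
    rows: list,
    allowed_fields: set,
    max_rows: int = 20,
) -> list:
    """Keep only allow-listed fields and at most max_rows rows.

    Fields not named in allowed_fields are dropped, so a new column in
    the upstream tool does not silently reach the model.
    """
    if max_rows < 0:
        raise ValueError("max_rows must be non-negative")
    trimmed: list = []
    for row in rows[:max_rows]:
        trimmed.append({k: v for k, v in row.items() if k in allowed_fields})
    return trimmed


if __name__ == "__main__":
    export: list[dict[str, Any]] = [
        {"id": 1, "status": "open", "notes": "private detail"},
        {"id": 2, "status": "closed", "notes": "private detail"},
    ]
    print(filter_tool_result(export, {"id", "status"}, max_rows=1))
```

An allow-list is used instead of a deny-list so that the safe behavior is the default when the tool's schema changes.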

What not to put in prompts or tool args

Do not intentionally put these into prompts, MCP tool arguments, runner work orders, Skill files, status updates, or screenshots unless the customer has a documented need and a matching provider/tool contract:

API keys, OAuth tokens, passwords, session cookies, SSH keys, signing keys, or recovery codes.
Raw .env files, private database URLs, credential vault payloads, or unredacted logs containing secrets.
Full customer exports when a filtered subset would answer the question.
Payment card data, PHI, government identifiers, or other regulated records without the required contract and workflow.
Private source URLs with embedded tokens or signed URLs.
Data from another tenant, workspace, account, or seat.

If sensitive data appears by accident, stop, do not repeat it, and route cleanup through the customer's incident process.
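
A pre-flight check along these lines can catch some of the items above before text is sent. This is a minimal sketch, assuming a few simple regular expressions; the `PATTERNS` table and `scan` function are hypothetical, the patterns are far from exhaustive, and a clean result is not proof that the text is safe. The function returns category names only, so it never repeats the sensitive value.

```python
import re

# Hypothetical, deliberately small pattern set for illustration.
PATTERNS = {
    "key_value_secret": re.compile(
        r"(?i)[\w-]*(?:api[_-]?key|token|password|secret)[\w-]*\s*[=:]\s*\S+"
    ),
    "private_key_block": re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
    "url_with_credentials": re.compile(
        r"\b[a-z][a-z0-9+.-]*://[^\s:@/]+:[^\s@/]+@"
    ),
}


def scan(text: str) -> list:
    """Return the sorted names of the categories found in text.

    Only category names are returned, never the matched substring.
    """
    return sorted(name for name, pattern in PATTERNS.items() if pattern.search(text))


if __name__ == "__main__":
    findings = scan("DATABASE_URL=postgres://app:example@db.internal/app")
    if findings:
        print("stop: possible sensitive data:", ", ".join(findings))
```

A caller would stop and route to review on any finding rather than attempt to clean the text automatically.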

Provider policy checkpoints

For platform-managed cloud runs and BYOK Anthropic cloud runs, check the Anthropic account, workspace, model, and contract. As of 2026-06-22, official Anthropic Claude API documentation says retained API data is not used for model training without express permission, that some API features are ZDR-eligible while others are not, and that ZDR, HIPAA readiness, model-specific retention requirements, consumer products, Claude Code, and third-party integrations each have their own scope.

For local OpenAI API-key clients running through mcp_session or runner, check the customer's OpenAI organization, project, endpoint, and API data controls. As of 2026-06-22, official OpenAI API documentation says API data is not used to train or improve OpenAI models unless the customer explicitly opts in, and default abuse-monitoring logs may retain customer content for up to 30 days unless approved controls apply. OpenAI says zero data retention is available only for eligible customers, endpoints, and use cases.

For Codex specifically, first identify whether the operator is using Codex through a ChatGPT Free/Plus/Pro, Business, Enterprise, Edu, API, or other workspace path. Official OpenAI Codex help says ChatGPT training data controls apply to whether content processed through Codex may be used to improve OpenAI models when Codex is used with a ChatGPT plan, and that Pro and Plus conversations may be used unless training is disabled. Do not give a Codex user the OpenAI API answer unless they are actually using an API-governed Codex path.

For local Claude Code clients running through mcp_session or runner, check whether the customer is using a consumer, team, enterprise, API, or third-party-platform path. Official Claude Code documentation separately describes consumer versus commercial handling, local transcript storage, telemetry, feedback, and provider-specific defaults.

Use those statements only as current provider-doc summaries. Do not convert them into a blanket ROST claim. The correct customer-facing answer is: "Here is the lane, here is what ROST sends, here is which provider account or local client handles it, and here is what to verify in that account before treating the workflow as no-training, zero-retention, HIPAA-ready, or region-bound." If a customer or prospect asks for a DPA, BAA, SOC 2 report, regional processing commitment, customer-managed retention window, staff-access approval workflow, or customer-visible staff-access log, escalate instead of answering from this guide.

Agent response checklist

When asked about AI data handling:

1. Identify the lane: cloud, mcp_session, or runner.
2. Name the specific data classes that can enter the prompt or tool result.
3. Name whether the provider path is platform-managed, tenant BYOK Anthropic, local client/account, or runner-controlled.
4. Warn if the answer depends on provider settings, contract terms, ZDR approval, HIPAA readiness, model choice, endpoint, or third-party tool policy.
5. Escalate contractual, compliance, regional-processing, customer-managed retention, or staff-access-commitment questions instead of answering from this guide.
6. Recommend the smallest safe next step: narrow the tool scope, use a vault ref, redact the input, switch to the approved provider account, or escalate for human/legal review.
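
The checklist can be sketched as a small structured answer. The `DataHandlingAnswer` fields, the keyword pattern, and the `draft_answer` helper are hypothetical names for illustration; the keyword match is a crude stand-in for judgment and will miss escalation requests that are phrased differently.

```python
import re
from dataclasses import dataclass, field

LANES = ("cloud", "mcp_session", "runner")

# Crude, hypothetical keyword match for topics this guide says to escalate.
_ESCALATE = re.compile(
    r"\b(dpa|baa|soc 2|regional processing|retention window|staff[- ]access)\b",
    re.IGNORECASE,
)


@dataclass
class DataHandlingAnswer:
    lane: str
    data_classes: list
    provider_path: str
    caveats: list = field(default_factory=list)
    escalate: bool = False
    next_step: str = ""


def draft_answer(question: str, lane: str, data_classes: list, provider_path: str) -> DataHandlingAnswer:
    """Fill in the checklist fields; the lane must be one of the three terms."""
    if lane not in LANES:
        raise ValueError(f"unknown lane: {lane!r}")
    answer = DataHandlingAnswer(lane=lane, data_classes=data_classes, provider_path=provider_path)
    answer.caveats.append("Verify provider settings and contract terms before relying on this.")
    if _ESCALATE.search(question):
        answer.escalate = True
        answer.next_step = "escalate for human/legal review"
    return answer


if __name__ == "__main__":
    print(draft_answer("Can you sign a BAA?", "cloud", ["Charter text"], "platform-managed"))
```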