What Google's Open Knowledge Format (OKF) Means for Marketers

Google Cloud shipped the Open Knowledge Format: portable markdown files that let AI agents read curated knowledge. Here is what it means for marketers.

AI Marketing / Hendry Soong · Updated 23 Jun 2026 · 6 min read

The Open Knowledge Format: one portable file format your brand's knowledge can live in.

TLDR

On 12 June 2026, Google Cloud published the Open Knowledge Format (OKF), an open spec for sharing curated knowledge with AI agents as plain markdown files. It was built for data teams. Read closely, it is also a blueprint for marketers: your brand's knowledge belongs in portable, agent-readable files the team controls, so any model can read it. This is how to read a data-engineering standard as a marketing playbook.

01 What the Open Knowledge Format is

The Open Knowledge Format is an open spec, published by Google Cloud on 12 June 2026, for representing curated knowledge as a directory of markdown files with YAML frontmatter. Every concept is one file. The only required field is its type.

Sam McVeety and Amir Hormati, two engineering leads on Google's Data Cloud team, designed it to fix a data-warehouse problem. Tables and datasets carry no context, so the agents querying them guess. OKF gives each concept a plain-language file an agent can read, published as an open standard.

The format is deliberately small. A file holds YAML frontmatter and a markdown body, and concepts link to each other with ordinary markdown links, which turns a folder into a graph. Per the v0.1 spec, only type is required. Five more fields (title, description, resource, tags, and timestamp) are recommended but optional, and producers can add any others they want.

Field	Status	What it carries
type	Required	What kind of thing this is, e.g. a table, a metric, or a brand claim.
title	Recommended	Human-readable name.
description	Recommended	One-line summary.
resource	Recommended	A link to the thing itself.
tags	Recommended	Grouping labels.
timestamp	Recommended	When it last changed.
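To make the field rules concrete, here is a minimal Python sketch of a consumer that reads one concept file and checks it against the table above. It is an illustration, not a reference implementation: the function names (`parse_concept`, `check_concept`), the sample file, and the extra `owner` field are all hypothetical, and the parser only handles flat `key: value` lines. A real consumer would use a proper YAML library.

```python
REQUIRED = {"type"}
RECOMMENDED = {"title", "description", "resource", "tags", "timestamp"}


def parse_concept(text):
    """Split a concept file into (frontmatter dict, markdown body).

    Simplification: only flat `key: value` lines are understood.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("missing frontmatter opening '---'")
    try:
        end = lines.index("---", 1)  # closing delimiter of the frontmatter
    except ValueError:
        raise ValueError("missing frontmatter closing '---'") from None
    meta = {}
    for line in lines[1:end]:
        if not line.strip() or line.lstrip().startswith("#"):
            continue  # skip blank lines and YAML comments
        key, _, value = line.partition(":")
        meta[key.strip()] = value.strip().strip('"')
    body = "\n".join(lines[end + 1:]).strip()
    return meta, body


def check_concept(meta):
    """Return (errors, warnings). Unknown fields are tolerated, not flagged."""
    errors = [f"missing required field: {f}" for f in sorted(REQUIRED) if not meta.get(f)]
    warnings = [f"missing recommended field: {f}" for f in sorted(RECOMMENDED) if f not in meta]
    return errors, warnings


# Hypothetical concept file; `owner` is a producer-added field, not part of the spec.
SAMPLE = """---
type: Brand Claim
title: 40% faster onboarding
description: approved proof point
owner: product-marketing
---
State the claim, its evidence, and where it may be used.
"""

meta, body = parse_concept(SAMPLE)
errors, warnings = check_concept(meta)
print(meta["type"], errors, warnings)
```

The design choice worth noting is that a missing recommended field produces a warning, while an unknown field such as `owner` produces nothing at all. Only a missing type is an error.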

One of Google's three design principles is "format, not platform." OKF is not tied to any cloud, database, model provider, or agent framework. In Google's words, it "will never require a proprietary account or SDK to read, write, or serve." The team shipped reference tools to prove it: an agent that documents a BigQuery dataset, a single-file HTML graph viewer, and three sample bundles built from GA4 e-commerce, Stack Overflow, and Bitcoin public datasets.

02 Marketing has the same problem data teams do

A marketing team owns the same kind of asset a data team does: curated knowledge that other systems need to read correctly. Your brand voice, your ICP, your positioning, your approved claims, and your product facts are all concepts in the OKF sense. Most of them live in slide decks, wikis, and the heads of three senior people.

OKF formalizes a pattern the researcher Andrej Karpathy named the LLM wiki. Karpathy, quoted in Google's announcement, made the case in one line: "LLMs don't get bored, don't forget to update a cross-reference, and can touch 15 files in one pass." A model can read a hundred small files faster than a person can open one. What was missing was the source material: for years, that knowledge was never written down in a form a model could read.

Reframe your brand guidelines as a directory of typed markdown files and the parallel is exact. One file per claim, one file per persona, one file per product. Each carries a type, a one-line description, and a link to its source of truth. This is the same move I described in agent-addressable content and context engineering: brand knowledge an agent can address directly, not prose buried in a PDF.

Frontmatter	A data team's concept	A marketing team's concept
type	BigQuery Table	Brand Claim
title	Orders	40% faster onboarding
description	one row per completed order	approved proof point
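As a rough sketch of the marketing column, the snippet below renders a brand claim as a concept file. The helper name `render_concept` and the body text are assumptions made for illustration; the field values come from the table above. Values are double-quoted so that characters such as `%` and `:` stay literal in the YAML.

```python
def render_concept(fields, body):
    """Render a concept as YAML frontmatter plus a markdown body.

    `type` is always written first; every value is emitted as a
    double-quoted YAML string with backslashes and quotes escaped.
    """
    if not fields.get("type"):
        raise ValueError("every concept needs a type")
    ordered = ["type"] + [key for key in fields if key != "type"]
    lines = ["---"]
    for key in ordered:
        value = str(fields[key]).replace("\\", "\\\\").replace('"', '\\"')
        lines.append(f'{key}: "{value}"')
    lines.append("---")
    return "\n".join(lines) + "\n\n" + body.strip() + "\n"


# Field values taken from the comparison table; the body is placeholder text.
claim = render_concept(
    {
        "title": "40% faster onboarding",
        "type": "Brand Claim",
        "description": "approved proof point",
    },
    "State the claim, its evidence, and where it may be used.",
)
print(claim)
```

Saving the result as something like `claim-onboarding.md` (a hypothetical file name) gives you one file per claim, in the same shape a data team would use for a table.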

03 Your own marketing agents need one source to read

Start with the internal case, because it pays off first. If you run AI agents for research, drafting, or social, each one needs your brand context to produce on-brand work. When that context is scattered across tools, every agent reads a slightly different version, and the output drifts.

This is the discipline Anthropic calls context engineering: "curating and maintaining the optimal set of tokens" an agent sees at inference time. An agent acts on whatever context it is given.

A single OKF-style directory fixes the drift at the root. One folder holds the canonical voice file, the ICP files, and the claim files, and every agent reads the same source. When positioning changes, you edit one file, and the next run of every agent reflects it. This is what my AI marketing framework means by keeping the context portable. The file is the system of record, and the agents are interchangeable.
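One way to picture the single-source idea is a small loader that every agent calls before it runs. The sketch below is an assumption about how you might wire this up, not part of OKF itself: `build_context`, the file names, and the file contents are hypothetical, and a temporary directory stands in for your real folder.

```python
from pathlib import Path
import tempfile


def build_context(folder):
    """Concatenate every .md file under `folder` into one context string.

    Files are sorted by path so every agent sees the same text in the same order.
    """
    parts = []
    for path in sorted(Path(folder).rglob("*.md")):
        rel = path.relative_to(folder).as_posix()
        text = path.read_text(encoding="utf-8").strip()
        parts.append(f"<!-- {rel} -->\n{text}")
    return "\n\n".join(parts)


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    # Hypothetical canonical files.
    (root / "voice.md").write_text(
        "---\ntype: Brand Voice\n---\nPlain and direct.\n", encoding="utf-8"
    )
    (root / "icp.md").write_text(
        "---\ntype: ICP\n---\nOperations leads.\n", encoding="utf-8"
    )
    before = build_context(root)

    # Edit the canonical file once; the next call picks up the change.
    (root / "voice.md").write_text(
        "---\ntype: Brand Voice\n---\nPlain, direct, and warm.\n", encoding="utf-8"
    )
    after = build_context(root)

print(after)
```

Because each agent rebuilds its context from the folder rather than holding its own copy, there is no second version of the voice file to fall out of date.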

fig. the cost of scattered context

Scattered context makes the brand drift. One shared folder holds it together.

04 The external payoff: getting cited correctly

The external case is bigger, and you control less of it. AI assistants now answer questions about your brand directly, and at real scale. ChatGPT alone reached an estimated 800 to 900 million weekly users in 2025, by Andreessen Horowitz's count. When someone asks an assistant what your product does, the model answers from whatever it can read.

Often it reads wrong. In Skyword's June 2026 buyer research, when AI-generated information conflicted with a brand's own messaging, only 12% of people trusted the AI answer, and 54% went looking for outside sources to compare. Nearly one in five had already avoided a purchase based on what an assistant told them about a brand.

Publishing agent-readable brand knowledge is how you shape that answer. The academic version is Generative Engine Optimization, defined in a 2024 paper that showed structured, well-sourced content can lift a source's visibility in AI answers by up to 40%. The web already has lighter versions of the idea: schema.org structured data, and the proposed llms.txt file that hands LLMs a clean markdown summary of a site. Of the three, OKF goes furthest: it represents a whole graph of linked concepts, not just one page's markup or a site summary. Measuring whether it works is a share-of-model problem.

Standard	What it is	Scope
OKF	A directory of linked markdown concept files	A whole knowledge graph
schema.org	Structured-data vocabulary in page markup	Tags on individual pages
llms.txt	One markdown file at /llms.txt	A site summary for LLMs

fig. from your files to the answer

01 Publish: your brand knowledge as agent-readable files.
02 Engines read: ChatGPT, Perplexity, AI Overviews, and Claude pull from them.
03 Cited correctly: your brand shows up as you wrote it.
04 Measure: share of model tracks how often that happens.

Publish once, read everywhere: brand files shape what AI engines say about you.

05 Portability is the moat

The strategic reason to adopt an open format is ownership. When your brand's knowledge lives inside one vendor's platform, you do not really own it. You use it on their terms, and only while you pay. If the contract ends or the tool drops the model you depend on, the knowledge is stranded. Files you control move with you.

Enterprises already behave this way about models. By Andreessen Horowitz's January 2026 CIO survey, 81% now run three or more model families in testing or production, up from 68% a year earlier. They will not depend on a single provider, and your brand knowledge needs the same hedge.

The value of a knowledge format comes from how many parties speak it, not from who owns it. (Google Cloud, on releasing OKF as an open standard)

That is why the consumption side is standardizing too. The Model Context Protocol, an open standard for connecting agents to the systems where data lives, gives any compliant agent a common way to reach your files. Open format on the supply side, open protocol on the demand side. The brands that win the AI-native shift will be the ones whose knowledge is portable enough to feed any model, as I argued in model-agnostic AI marketing. A pile of tools you do not control is still a pile of parts.

Question	Rented	Owned
Where it lives	Inside one vendor's platform	Portable files you control: markdown and YAML, in your own repo
Who can read it	That product's features	Any model, via open protocols: open format in, open protocol out (MCP)
If the tool changes	Knowledge is stranded	The files move with you: no migration, no lock-in

fig. the moat

Rented versus owned: portable files feed any model; a platform locks you in.

06 How to start this quarter

You do not need Google's tools or a data warehouse to use any of this. The format is plain markdown and YAML, and the full v0.1 spec fits on a page. Start by writing your brand's Context layer as a small directory of files, one concept per file, and point your agents at it.

Pick the five concepts agents get wrong most often: your voice, your ICP, your top three claims, your category definition, and your pricing logic.
Write one markdown file per concept. Give each a type and a one-line description, and keep the body short and sourced.
Link concepts to each other with plain markdown links, so the folder becomes a graph an agent can walk.
Point your marketing agents at the folder as their first context, ahead of any prompt.
Publish a public subset for external engines. A clean llms.txt file or schema markup is the entry-level version of the same idea OKF formalizes.
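The linking step above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions: the file names and contents are hypothetical, the folder is flat, and the regular expression only catches inline markdown links of the form `[text](target)`. It shows how plain links turn a folder into a graph, and the order in which an agent walking that graph would reach each concept.

```python
import re
from collections import deque

# Inline markdown links only: [text](target)
LINK = re.compile(r"\[[^\]]*\]\(([^)\s]+)\)")


def link_graph(files):
    """Map each file name to the local .md files it links to."""
    graph = {}
    for name, text in files.items():
        targets = []
        for target in LINK.findall(text):
            target = target.split("#", 1)[0]  # drop any #anchor
            is_local = target.endswith(".md") and "://" not in target
            if is_local and target in files and target not in targets:
                targets.append(target)
        graph[name] = targets
    return graph


def walk(graph, start):
    """Breadth-first order in which concepts are reached from `start`."""
    seen, queue = [start], deque([start])
    while queue:
        for nxt in graph.get(queue.popleft(), []):
            if nxt not in seen:
                seen.append(nxt)
                queue.append(nxt)
    return seen


# Hypothetical folder contents, keyed by file name.
files = {
    "positioning.md": "See our [ICP](icp.md) and [top claim](claim-onboarding.md).",
    "icp.md": "Buyers respond to [faster onboarding](claim-onboarding.md#evidence).",
    "claim-onboarding.md": "Source: [study](https://example.com/study).",
}
graph = link_graph(files)
print(graph)
print(walk(graph, "positioning.md"))
```

External URLs are ignored when building the graph, so the walk stays inside the folder you control.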

None of this requires permission from a platform. The files are plain text you own, readable by whatever tool you point at them next.

Frequently Asked Questions

What is the Open Knowledge Format?

The Open Knowledge Format (OKF) is an open spec that Google Cloud released on 12 June 2026. It represents curated knowledge as a directory of markdown files with YAML frontmatter so AI agents can read it directly. Each concept is one file, and the only required field is its type.

Why should marketers care about a data-engineering standard?

Because brand knowledge is the same kind of asset OKF was built to share: curated context that other systems need to read correctly. Voice, ICP, positioning, and approved claims are all concepts an AI agent has to consume. OKF supplies a file format that lets an agent do so.

Do I need Google Cloud to use OKF?

No. OKF is a format, not a platform. It is plain markdown and YAML, and Google states it will never require a proprietary account or SDK to read, write, or serve. You can write OKF-style files in any text editor and point your own agents at them.

How is OKF different from schema.org or llms.txt?

Schema.org structured data and the proposed llms.txt file are lighter versions of the same idea: publishing machine-readable information for engines to read. OKF goes further because it represents a whole graph of linked concepts rather than a single file or a set of tags. The three can coexist.

What fields does an OKF file need?

Only one is required: type, a short string describing what the concept is. The spec recommends five more (title, description, resource, tags, and timestamp) but treats them as optional. Producers can add any other fields they want, and consumers are told to tolerate unknown ones.

How does OKF help my brand show up correctly in AI answers?

AI assistants answer from whatever they can read about you. Publishing accurate, structured, agent-readable brand knowledge gives them a clean source, which is the goal of Generative Engine Optimization. A 2024 GEO study found structured, well-sourced content can raise a source's visibility in AI answers by up to 40%.

Is OKF a finished standard?

No. OKF is at version 0.1, published as a draft in June 2026. The spec is small and stable enough to use now, but expect it to grow. Because it is just markdown and YAML, files you write today stay readable even as the spec evolves.

Built by AI Marketing Operator · Published 23 Jun 2026
