Build Log #1: Building AI-Native Marketing Functions
Build log documenting my journey from AI marketing framework to 7 engines, 109 curated iterations, and a "headless" website built from zero in 30 days.
The typical content execution team runs around $216,000 a year. That is a writer and designer in-house, plus SEO, translation, and web maintenance handled by agencies. A website build adds another $5,000 to $50,000 one-time. This build log documents what replacing that with AI agents actually looks like: seven engines at under $12,000 a year, translation at under $1 per article via DeepL, and a headless website built with coding agents. No agency build fee. No maintenance retainer. Hosting costs scale from free tiers to $500 to $2,000 a year as the site grows. Around $200,000 in annual execution costs, replaced. No added headcount. If you are a business leader evaluating whether AI-native marketing execution is real or a thought experiment, this is the field report.
01 The Messy Middle: Content Production, Headless Infrastructure, Shared Context Layers
The typical content execution team runs around $216,000 a year. Writer and designer in-house, with SEO, translation, and web maintenance handled by agencies.
Add a website. Agency builds run $5,000 to $50,000 one-time. Ongoing maintenance is a retainer or a shared resource.
I explored what it would take to replace that execution layer. Everything on this website was produced by the system I built to find out.
The model is one operator directing AI engines. The operator brings strategic judgment: content strategy, brand positioning, SEO direction, brief writing, and final approval. The engines produce the output.
I run this as a single operator. In a team, this becomes one or two marketing generalists replacing four to six specialists. The hire profile changes. The team size drops. Output capacity goes up.
| Role | What it covers | Engine | Status |
|---|---|---|---|
| Content writer | Articles, extensible to landing pages, email copy, social posts, case studies | Create-Articles | Live |
| Designer | Hero visuals, diagrams, brand assets | Create-Images | SVG live. Raster in progress. |
| SEO specialist | Technical SEO, schema, structured data, AI search visibility | CMS + schema layer | Live. Off-page and authority building stays human. |
| Content editor | Quality validation, voice compliance, field checks | Create-Compiler | Live |
| Translator | Multi-language content, locale routing, cultural adaptation | DeepL + i18n | Live. Human review for idiom and nuance. |
| Website developer | Headless CMS, infrastructure, deployment | Coding agent + headless stack | One-time build. Hosting scales from free tiers to $500 to $2,000 a year. |
| Content strategist | Brief writing, SEO strategy, brand positioning, backlink outreach | Operator | Human. Feeds all engines. |
This is Issue #1: the content production chain. What is live, what is partial, and where the gaps are. The demand generation layer comes next.
02 The Constraints That Shaped Everything
AI agents have no persistent memory. Every session starts blank. The agent that wrote your brand guidelines yesterday has no record of them today unless you explicitly load them back in.
Context windows have grown fast. Early models held around 8,000 tokens. Current flagship models hold significantly more:
| Model | Context window |
|---|---|
| Claude Opus 4.7 (Anthropic) | 1 million tokens |
| Gemini 2.5 Pro (Google) | 1 to 2 million tokens |
| GPT-4o (OpenAI) | 128,000 tokens |
The Engine Split documents what happened at scale: the original monolithic engine hit 84,000 tokens, rules dropped, output drifted off-brand, and the architecture had to be rebuilt. These numbers change with every model release, but the design principle holds: keep engines focused and token budgets explicit.
The second constraint is model churn. The best model today is not the best model in six months. A system built around one model gets rebuilt when it gets replaced.
The tool landscape has the same problem. There are over 15,000 martech solutions available today. Most marketing teams use less than half of what they buy. The architecture connecting them closes the gap.
These three constraints shaped every decision in this system: no persistent memory, model churn, and tool fragmentation. The engines are model-agnostic by design: swap Claude for Gemini or GPT without rebuilding the workflow. The context files load explicitly every session. The architecture is the constant.
This also means avoiding tools built on top of a single model. Many AI marketing tools today are a thin interface over one provider's API. When that model changes, gets deprecated, or a better alternative arrives, the tool breaks with it. Building on open infrastructure instead of AI-native SaaS removes that dependency entirely. The system runs on whichever model performs best at the time, not whichever one a vendor chose when they built the product.
Architecture is the constant. Tools and models are interchangeable.
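The explicit context load is mechanical enough to sketch. A minimal version, assuming hypothetical file names rather than the system's actual layout: the baseline files are concatenated into one block the model reads before anything else, with a crude token budget check.

```python
from pathlib import Path

# Hypothetical baseline context files loaded at the start of every session.
CONTEXT_FILES = ["voice.md", "icp.md", "messaging.md", "company.md"]

def load_context(context_dir: str) -> str:
    """Concatenate baseline context files into one block the model reads first."""
    sections = []
    for name in CONTEXT_FILES:
        path = Path(context_dir) / name
        if not path.exists():
            # Fail loudly: a silently missing file is how output drifts off-brand.
            raise FileNotFoundError(f"Missing context file: {name}")
        sections.append(f"## {name}\n{path.read_text()}")
    return "\n\n".join(sections)

def rough_token_count(text: str) -> int:
    """Crude budget check: roughly 4 characters per token."""
    return len(text) // 4
```

The point of the sketch is the failure mode: the load is explicit and it fails loudly, because an agent that quietly runs without its voice file produces generic output with no error anywhere.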
03 The Rules I Built On
In mid-2025, I published three articles that formed a thesis. The AI Marketing Strategy Gap diagnosed the problem: 88% of organisations had adopted AI, but martech usage sat at 49%. The tools existed without the architecture to connect them.
Could AI Replace an Entire Marketing Team? broke marketing work into atomic and composite jobs. AI can replace the execution layer, but only with the right architecture and a skilled operator connecting the parts.
Then the AI Marketing Framework laid out that architecture: 3 layers, 11 engines, autonomy levels from L1 to L5.
By late 2025, I was building bespoke AI marketing tools with Maciej at growthsetting. Content engines, data pipelines, AI search tracking.
growthsetting is a services business, helping companies and marketing leaders build AI systems. hendry.ai is a different question: what does the future of marketing look like in the agentic age, from a marketing leader's perspective. The two are separate.
I have always understood some code. Knowing frontend languages like HTML, CSS, and JavaScript helped me measure marketing performance more effectively. I read documentation and worked alongside engineers. That background matters here. When a coding agent suggests installing a package from the command line or running a destructive Git operation, I can read the command and assess the risk before executing it.
Coding agents have changed this. A non-technical operator can now build with code through an agent. But the risk assessment layer is still human. When the agent suggests a destructive Git operation or installs an unknown package, someone needs to recognise what that means. Technical literacy does not mean writing code. It means reading what the agent is doing and knowing when to say no.
Coding agents write the code. The operator reads the output, understands the architecture, and catches what is wrong. That is a strategic skill.
Before building anything, I set five rules that would govern the entire system:
Model-agnostic. No vendor lock-in. The pipeline spec has zero Claude-specific instructions. Any LLM can run it. When a better model arrives, I swap it without rewriting the system.
Brand-agnostic. Same engines, different voice files. Change the ICP, messaging, and company context. The engine code stays identical.
Infrastructure-agnostic. Payload CMS and Neon Postgres today, Supabase tomorrow. No enterprise contracts. Open-source, serverless, portable.
Modular. Each engine handles one function. Articles, images, validation, competitive intelligence. All separate, all connected through contracts. The same architecture extends to social, presentations, pitch decks, and emails. Different output formats, same engines.
Governable. Every engine has its own version history, changelog, and backlog. When something breaks, I can trace which engine, which version, and which rule changed. Isolation means one failure does not cascade downstream. It is tempting to let the AI run freely. But without versioning and observability, you cannot tell what shifted, and you cannot fix what you cannot trace.
Several things converged in late 2025 that made this possible. Coding agents matured enough to hold an entire project in context, write correct code across files, and iterate on errors without losing track of the architecture. A technical marketer with the right agent could now build what previously required a dedicated engineering team.
MCP (Model Context Protocol) arrived. Agents could finally talk to databases, APIs, and external services without custom glue code. The coordination problem that blocked a complete system started to dissolve.
Infrastructure costs dropped to near zero. Neon Postgres, Vercel hosting, Payload CMS. All free tier or open source. Production-grade infrastructure with no enterprise contracts and no DevOps team.
04 Building the Content Engines
The first workflow was written on Christmas Day 2025. A baseline pipeline was running six weeks later. The iteration has never stopped. Version 8 is current.
The first failure came six days later. Version 3.8 added an 89-line validation checklist. Output quality dropped immediately. The model got confused about what to prioritise and started skipping rules at random. The fix, version 5, was obvious in retrospect: one completed example works better than 89 lines of instructions. That became Principle 3.
Version 7 introduced evidence-based validation. The engine had been checking its own work, asking itself whether it had followed the rules. It would confirm it had, even when it had not. The new system used deterministic checks: pattern matching, structural analysis, field presence. LLMs lie about validation. Evidence defeats hallucination.
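Deterministic checks are ordinary string and structure tests, not model self-reports. A minimal sketch of the idea, with illustrative rule names rather than the engine's actual rule set:

```python
import re

# Illustrative rules only, not the engine's real checklist.
BANNED_PATTERNS = [r"\bdelve\b", r"\bgame.changer\b"]   # AI hallmark phrases
REQUIRED_FIELDS = ["title", "meta_description", "hero_alt"]

def validate(article: dict) -> list[str]:
    """Return evidence of violations. An empty list means the draft passed."""
    evidence = []
    body = article.get("body", "")
    for pattern in BANNED_PATTERNS:
        for match in re.finditer(pattern, body, flags=re.IGNORECASE):
            evidence.append(f"banned phrase {match.group(0)!r} at offset {match.start()}")
    for field in REQUIRED_FIELDS:
        if not article.get(field):
            evidence.append(f"missing field: {field}")
    return evidence
```

The output is evidence, not a verdict. A regex match at a byte offset cannot claim it followed the rules; it either found the violation or it did not.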
By 1 February, the engine had grown to 84,312 tokens per session. Sessions were stalling before generation finished. The context window was filling with SVG templates before the article content was written. That day, 14 SVG templates moved to a new engine: Create-Images. Create-Articles dropped to 68,994 tokens. Four days later, Anthropic shipped a multi-agent architecture using the same separation principle.
Create-Images evolved separately from there. Testing across four tools established what each handled best: Claude SVG for exact brand colour diagrams, Gemini for conceptual work, Grok for human figures. DALL-E was dropped for inconsistent instruction-following.
Create-Compiler came last. Post-assembly validation revealed a category of problems neither upstream engine could catch: SVGs left-aligned in compiled pages, cross-boundary ID mismatches, source attribution only crediting one engine. Version 1.3 added a reverse manifest, closing the feedback loop back to source engines. By version 2, the role had changed from assembler to validator. 22 checks.
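The reverse manifest reduces to a mapping from each compiled asset back to the engine that produced it, so a failed check routes upstream instead of dying in the compiler. The asset IDs and engine names below are illustrative:

```python
# Illustrative manifest: compiled asset ID -> producing engine.
REVERSE_MANIFEST = {
    "article-body": "Create-Articles",
    "hero-svg": "Create-Images",
    "diagram-1": "Create-Images",
}

def route_failures(failed_asset_ids: list[str]) -> dict[str, list[str]]:
    """Group failed checks by source engine so feedback closes the loop upstream."""
    routed: dict[str, list[str]] = {}
    for asset_id in failed_asset_ids:
        engine = REVERSE_MANIFEST.get(asset_id, "unknown")
        routed.setdefault(engine, []).append(asset_id)
    return routed
```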
In April, all three engines migrated from WordPress HTML to Payload CMS Lexical JSON AST. The pipeline now runs natively against the headless stack. A brief goes in. A validated draft appears in the CMS.
05 The Replicate Test
The question every system has to answer: does this only work for one brand?
I took the entire content engine to a client brand. Changed the voice files, the ICP, the messaging, the company context. The engine code, the infrastructure, and the workflow all stayed the same.
Approximately 80% of the system was universal. The 20% that changed was brand-specific configuration: who you are talking to, what you sound like, and what you are selling.
The 20% maps directly to the context layer. Voice rules, ICP definition, messaging framework, and company positioning. These are the files that make content sound like your brand instead of generic AI output. Every new brand deployment is a new context layer, not a new system.
The same infrastructure now runs two live brand sites. Same CMS, same deployment pipeline, same engine architecture. Different voice files, different content, different audiences.
An asset deploys once and replicates across a portfolio. A bespoke project starts from scratch every time.
06 Headless CMS Is Not Optional Soon
The engines produced content, but had a publishing problem. WordPress has no programmatic API that agents can use reliably.
But the problem was broader than the website. Content marketing does not end at blog posts. The same structured content needs to become social posts, emails, pitch decks, battlecards, and presentations. A website is one output format. You need a content warehouse: structured content in a database with an API that any system can read from and write to.
That is what headless means. Backend separated from frontend. Content stored as structured data, not page layouts. A website renders articles from it. An agent publishes a draft to it. A social engine pulls from it. An email engine reads from it. All through the same API.
WordPress bundles content and presentation into one system. That works when humans click publish on blog posts. It becomes the bottleneck when agents need to create, validate, and distribute content across multiple formats programmatically.
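What an engine actually hands the warehouse is a structured payload, not a rendered page. A sketch of that shape; the field names here are assumptions for illustration, not the CMS's real schema:

```python
import json

def build_draft_payload(title: str, blocks: list[dict], locale: str = "en") -> dict:
    """Assemble a structured draft. Any consumer (site, social, email) reads the same shape."""
    return {
        "title": title,
        "locale": locale,
        "_status": "draft",             # agents create drafts; humans publish
        "content": {"blocks": blocks},  # structured nodes, not HTML page layout
    }

payload = build_draft_payload(
    "Build Log #1",
    [{"type": "paragraph", "text": "The execution layer collapses into one operator."}],
)
body = json.dumps(payload)  # what an HTTP client would POST to the warehouse API
```

Because the content is nodes rather than markup, the website renders it one way, a social engine extracts pull quotes another way, and an email engine reads the same record a third way.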
On 27 February 2026, I started building hendry.ai from scratch. The stack: Payload CMS, Neon Postgres, GitHub, Vercel. Open-source, serverless, no enterprise contracts.
54 sessions and 30 days later, the site was live on a custom domain. Dual-theme support, reading progress, structured data, draft preview, and on-demand content updates.
The migration from WordPress took roughly 20 sessions. Every article had to be translated from HTML into structured JSON.
Internal links remapped to new URL paths. 41 redirects created for old URLs. Source categories normalised. Meta descriptions trimmed to fit CMS validation. SVG colours tokenised for dark mode. Every left-border callout box redesigned. It was the highest-effort, lowest-glamour work in the entire build.
Along the way, the CMS fought back. Edits made in the admin panel did not appear on the live site. The default behaviour was to cache everything with no revalidation.
Middleware for URL redirects crashed the serverless edge runtime with circular requests. That entire approach had to be abandoned. A single invisible newline character in an environment variable blocked draft preview for two days. Database type migrations required four manual steps because the CMS could not alter column types directly.
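The newline bug is worth a permanent defensive habit: strip environment variables on read, because a value with a trailing newline never equals the same value without one, and the failure gives no visible clue. A minimal illustration:

```python
import os

def read_env(name: str) -> str:
    """Read an env var and strip invisible whitespace that breaks exact comparisons."""
    value = os.environ.get(name, "")
    return value.strip()

# Simulate the failure: a trailing newline pasted in with the secret.
os.environ["PREVIEW_SECRET"] = "s3cret\n"
assert os.environ["PREVIEW_SECRET"] != "s3cret"   # raw comparison fails, silently
assert read_env("PREVIEW_SECRET") == "s3cret"     # stripped comparison succeeds
```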
I tried self-hosting fonts for better performance. Mobile load time went from 2.9 seconds to 4.9 seconds. The "best practice" made it worse. Reverted.
After every problem was solved, another surfaced at a different layer. That is what the messy middle looks like.
But the outcome justified the cost. The system now runs a fully agentic flow: an article brief goes in, engines generate content and visuals, the compiler validates 22 fields, and a draft appears in the CMS. Ready for a human to review and publish at the click of a button.
The site is also multilingual. The internationalisation infrastructure (three languages, field-level content localisation, and a DeepL translation pipeline) was built in a single session.
Professional human translation runs $0.10 to $0.30 per word (Slator, 2026). DeepL produces the first draft for under $1 per article. A human proofreader reviews for tone and idioms at a fraction of the from-scratch rate. Multi-language content costs 70 to 80% less than from-scratch translation.
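The per-article arithmetic is easy to verify. Assuming a 1,500-word article and the rates quoted above:

```python
WORDS = 1500
HUMAN_RATE_LOW, HUMAN_RATE_HIGH = 0.10, 0.30   # $/word (Slator, 2026)

human_low = WORDS * HUMAN_RATE_LOW    # roughly $150 per article
human_high = WORDS * HUMAN_RATE_HIGH  # roughly $450 per article
machine_draft = 1.00                  # DeepL first draft, under $1/article

# Draft-stage saving before the human proofreading pass, which is what
# brings the all-in figure to the 70 to 80% range rather than 99%.
saving_low = 1 - machine_draft / human_low
```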
If your organisation is investing in AI-generated content but publishing through a legacy CMS, the CMS is probably your biggest hidden blocker. The infrastructure determines whether AI can publish.
07 The Substrate Experiment
Building one engine is like hiring one specialist. Building five that work together is like running a cross-functional team. A content manager talks to a marketing manager talks to a product manager. Each owns their domain. Each has people executing underneath.
The mental model for working with agents at scale is the same. One agent per engine acts as the project lead: strategy, cross-engine coordination, decisions. Each engine has an executor, a coding agent that builds and delivers. The project leads discuss what they need from each other. The executors build to spec.
That was the starting workflow. Claude Projects held the strategy and instructions for each engine. Claude Code was the executor. Simple, isolated, contained. Moving the engine builds to GitHub came early: any coding agent can clone the repo and work. Executor portability. That became the Governable rule: version history, changelogs, backlogs, traceable failures.
Solving the project lead layer took longer.
Claude Projects uses locked context. Each project lead knew everything about its own engine and nothing about the others. When I started building Signal, its project lead had no visibility into what content assets Create-Articles was capable of producing: what formats were live, what topics had been covered, what was staged. Five engines. Five isolated sessions. No shared awareness. I had built exactly the disconnected architecture the framework was designed to prevent.
The fix was a shared context layer across all engines. An ORCHESTRATOR.md became the constitution: a specification any agent reads, independent of interface or provider. Baseline context files (ICP, voice, messaging, company) sit in a shared location that all project leads load before each session. Integration contracts define what each engine expects from the others. A source registry persists across articles.
When I work on Signal today, it reads the integration contracts and knows exactly what content assets Create-Articles has staged. When voice rules change, every engine picks up the update. A cross-repo audit confirmed it: zero stale copies of any context file across the entire system.
Different engines need different levels of coupling. Create-Images works from the visual placeholders in the article output. It needs a clear handoff contract, not the full brand context. Create-Articles needs the complete context layer: voice, ICP, messaging, company positioning. Without it, output is generic.
Create-Compiler spans both, validating against field structure and voice rules simultaneously. Knowing which engine needs a contract, which needs full context, and which needs both is an operator decision. An inexperienced operator gives every engine full context and wonders why the token budget explodes. An experienced operator scopes each connection to what the engine requires.
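That scoping decision can live in configuration: each engine declares which context files it loads, and the session's token budget follows from the declaration. The file names and token counts below are illustrative, not the system's actual figures:

```python
# Illustrative scoping: contract-only engines load a handoff spec,
# full-context engines load the complete brand layer.
ENGINE_CONTEXT = {
    "Create-Articles": ["voice.md", "icp.md", "messaging.md", "company.md"],
    "Create-Images":   ["handoff-contract.md"],
    "Create-Compiler": ["handoff-contract.md", "voice.md"],
}

FILE_TOKENS = {  # assumed sizes for the sketch
    "voice.md": 3000, "icp.md": 2000, "messaging.md": 2500,
    "company.md": 1500, "handoff-contract.md": 800,
}

def session_budget(engine: str) -> int:
    """Tokens spent on context before the engine generates anything."""
    return sum(FILE_TOKENS[f] for f in ENGINE_CONTEXT[engine])
```

Giving every engine the full layer would triple some of these budgets for no output gain, which is exactly the token explosion an experienced operator scopes away.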
Governance works both ways. The system catches the agent when it produces content that violates voice rules. But it also catches the operator. A voice audit on this article found 8 violations in my own draft: opposite-line patterns, dramatic thesis sentences, an AI hallmark word. Six were fixed. Two were kept intentionally because the evidence supported them. The system does not care who wrote the content. It enforces the rules.
This is the same problem every marketing organisation has. Channels do not talk. Content, social, email, and paid all operate from different briefs with different assumptions. If your AI engines do not share a single source of truth, your marketing will sound like it came from ten different companies.
I ran the full cross-engine flow end to end: Create engines to headless CMS to cross-repo orchestration. The agents navigated across repositories, read the shared context, and executed the handoffs. That run validated the architecture at the system level, not just the engine level.
That produced Principle 12: the agent is disposable. The orchestration layer is permanent.
08 What's Being Built Now
The content engines are stable. Now I am building the layers that feed them.
Content was first for a reason. Most marketing teams adopted AI in content first. Getting the output right takes architecture: brand voice that holds at scale, visuals consistent and on-brand, generative engine optimisation built in by default, one article repurposed into social posts, presentations, emails, and battlecards. That is what the content production layer has to solve before anything downstream can run.
Brand-consistent quality at scale is the actual challenge. If your GTM outbound sounds like AI slop, you are alienating prospects at scale. Content quality had to be solved first.
The GTM layers come second because they depend on knowing who you are going after. Without a defined ICP, tools like Clay sit there and leak cash. You are enriching accounts you have not validated. Spraying and praying with better-looking spreadsheets is still spraying and praying.
Before turning on ABM, you need to spend time defining your target. That definition should be a continuous iteration of first-party data: customer feedback loops, win/loss analysis, support ticket patterns, product usage signals, and churn reasons. Static ICP documents are a 2024 artifact. The ICP is a living dataset that improves with every closed deal and every lost one. The risk is ICP drift: accounts gradually slide toward broader, less-fit profiles as the team optimises for volume over fit. A mature ICP model tracks acquisition fit and expansion fit separately, because most B2B revenue compounds from existing customers, not new logos.
Three new engines are in development. Listen detects composite signals relevant to your ICP: first-party triggers (pricing page activity, product usage patterns, email engagement) layered with third-party signals (competitor research, job change triggers, topic surges). A single intent score is a 2023 tool. What matters in 2026 is signal stacking and how fast you act on it. Signal prioritises accounts from that composite picture and routes them to the team. Data handles identity resolution and enrichment within GDPR and CCPA constraints: compliant enrichment is a baseline requirement, not a configuration option.
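Signal stacking, as opposed to a single vendor intent score, can be sketched as a weighted composite over distinct signal types. The weights and signal names here are illustrative, not the engine's actual model:

```python
# Illustrative weights: first-party signals outweigh third-party ones.
WEIGHTS = {
    "pricing_page_visit": 0.30,   # first-party
    "product_usage_spike": 0.25,  # first-party
    "email_engagement": 0.15,     # first-party
    "competitor_research": 0.15,  # third-party
    "job_change_trigger": 0.10,   # third-party
    "topic_surge": 0.05,          # third-party
}

def composite_score(signals: dict[str, bool]) -> float:
    """Stack observed signals into one 0..1 routing score,
    rather than trusting a single opaque intent number."""
    return round(sum(w for name, w in WEIGHTS.items() if signals.get(name)), 2)
```

A pricing page visit plus competitor research scores differently from either signal alone, which is the point: the composite picture, not any single trigger, decides what gets routed.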
Some will say just use Clay. Before Clay it was ZoomInfo. Before ZoomInfo it was Clearbit. The tool changes every cycle. The capability stays the same: data enrichment, contact discovery, account scoring. The Listen and Signal engines exist to solve the upstream question first: who, and why.
Underneath all of this sits the data infrastructure layer. Garbage data in, garbage output out. I did not start with Data because a fresh website has no data to unify. But most companies should start there, or run it in parallel with content builds.
Meta reports in UTC. Google uses account timezone. LinkedIn defines a click differently than Meta. Attribution windows vary from 1 to 28 days depending on settings someone chose 18 months ago. Anchor on first-party sources: your CRM, website behaviour, and product telemetry. Those are the sources of truth. Paid platform data supplements them. Stitching ad platform walled gardens together is the trap: the platforms actively fight unification, the project never ends, and every engine downstream still reasons against inconsistent inputs.
The intelligence layer is only useful if it triggers something. Once Signal prioritises an account, a Slack alert routes to the right rep with a generated battlecard, an email sequence fires, or an ABM campaign activates. The activation layer connects scored intelligence to execution. That is the next build after the intelligence engines are stable.
The flow runs in both directions. When Listen detects a competitor move or surfaces a pricing objection pattern in churned accounts, that signal does not just go to the sales team. It briefs the content engine on what to write next. Market intelligence shapes content strategy. The same substrate pattern that connects Create-Articles to Create-Images can connect Listen to Signal to Data to Create: each engine reads from the shared context layer, processes its domain, and passes structured output forward and backward through the chain.
09 What It Actually Costs
Here is the cost breakdown function by function, at US salary averages from Glassdoor (April 2026) and agency rates from Clutch (April 2026). The system handles execution work. Strategy, brand, and oversight stay human.
| Function | Traditional cost | AI-native tool | What stays human |
|---|---|---|---|
| Content production | $85K/yr (writer) | Claude, Gemini, OpenAI, or Grok ($20 to $200/month per tool) | Content strategy, topic selection, editorial judgment |
| Visual production | $63K/yr (designer) | Claude, Gemini, OpenAI, or Grok ($20 to $200/month per tool) plus programmatic SVG generator | Brand decisions, design system, hero composition |
| SEO implementation | $42K/yr (agency retainer) | Built into content engine | Keyword strategy, content architecture, SEO audits |
| Web development | $5K to $50K one-time (web agency) | Claude or Gemini ($20 to $200/month per tool) | Architecture decisions, security review |
| Website build (one-time) | $5K to $50K (agency) | Claude or Gemini (subscription during build window) | Stack selection, brand system setup |
| Website maintenance (recurring) | Agency retainer | Hosting from free tiers to $420 to $2,000 a year (Vercel + Neon) | Upgrades, security patches |
| Translation | $0.10 to $0.30/word | DeepL (usage-based, under $1/article) | Tone review, idiom checks |
| Operator time | The marketing manager is already in the team. No added headcount. | 1 senior marketing operator. Replaces all 4 execution roles. Or the founder's own time. | Strategy, architecture, judgment, risk |
Total recurring AI stack is under $1,000 per month: one primary subscription plus supplementary tools for specific gaps. Claude Max, Google AI Pro, ChatGPT Pro, and SuperGrok sit at equivalent tiers. On this site: Claude Max with Gemini Pro and Grok as supplements, and DeepL for translation. Web maintenance sits on free tiers (Neon Postgres, Vercel, self-hosted fonts) plus a separate small budget line for domain renewals and monitoring.
The traditional execution stack runs around $216,000 a year: writer ($85K), designer ($63K), SEO agency ($42K), translation ($14K), and web maintenance ($12K). The AI stack runs under $12,000 a year in tools. BLS median for marketing managers is $161K (May 2024). One operator replaces the full execution layer. On this site, that is one person with 15+ years of marketing strategy experience. In a larger organisation, the model is the same.
The system currently runs 2 live sites plus 1 parked on the same infrastructure. hendry.ai went from first commit to production in 30 days, 92 build sessions over roughly 50 calendar days, 109 logged entries across 7 engines, and 75 operating principles extracted.
The "Stays Human" column is the point. The context layers, the system architecture, the deep domain expertise of how a performant marketing team operates. That is the operator's job. Years of experience in marketing strategy, content architecture, and go-to-market is what separates a sub-$1,000-per-month system that produces professional output from one that produces noise.
The execution layer collapses into one operator running AI agents. The traditional stack (writer and designer in-house; SEO, translation, and web outsourced to agencies) becomes one person. The operator provides strategy, judgment, and architecture. The agents handle execution.
The CMS has three roles built in: admin (the operator), editor (for future team members), and agent (AI engines, restricted to creating drafts only). Engines create. Humans approve. That is the org chart.
The architecture already scales beyond one operator. The RBAC roles, the modular engines, the shared context layer, and the integration contracts are designed for a team. Adding a second operator or a specialist editor does not require rebuilding the system. It requires a new user with the right role.
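The agent role reduces to one access rule: agents can create, only humans can flip a draft to published. A minimal sketch of that gate, independent of any specific CMS:

```python
# Illustrative role model mirroring the admin/editor/agent split.
PERMISSIONS = {
    "admin":  {"create", "publish", "delete"},
    "editor": {"create", "publish"},
    "agent":  {"create"},   # engines create drafts only
}

def can(role: str, action: str) -> bool:
    return action in PERMISSIONS.get(role, set())

def publish(role: str, draft: dict) -> dict:
    """Only a human role can move a draft to published."""
    if not can(role, "publish"):
        raise PermissionError(f"role {role!r} cannot publish")
    return {**draft, "_status": "published"}
```

Adding a second operator or a specialist editor is a new row in the permissions map, not a new system.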
Honest trade-offs: token costs scale with content volume. Hallucination in edge cases still requires human review. Governance adds overhead. The image engine is still in development. The operator's time has real cost. This system works because the operator has 15+ years of marketing strategy experience. Without that foundation, the same tools produce noise.
The operator governs. The engines execute.
Sources: Glassdoor Content Marketing Specialist, Glassdoor Graphic Designer, Clutch SEO Agency Pricing, Verbolabs Translation Rates, Clutch Web Development Pricing.
10 What's Next
The system at completion runs a closed loop. Content engines produce assets. Demand gen engines detect signals, score accounts, and surface them for activation. The activation layer routes intelligence to execution: the right rep, the right message, the right moment. The signals that come back from the market (what is resonating, what objections are surfacing, what competitors are doing) brief the content engine on what to write next. The operator governs a system that generates content based on what the market is telling it.
The operator logs capture every entry in real time. If you want the raw data with timestamps and per-engine breakdowns, the full operator logs are public. This build log is the curated version.
- What is a build log?
- A monthly publication documenting what was shipped, what broke, and what was learned while building an AI marketing system. Each issue covers one month of development with specific entries, version numbers, and extracted principles.
- How is this different from the operator logs?
- The operator logs are raw, timestamped, per-engine reference entries with extracted principles. The build log is the narrative version that connects the entries into a story with context and analysis.
- What is the AI Marketing Framework?
- An architecture for building AI marketing systems. It organises marketing functions into 11 engines across 3 layers (Foundation, Execution, Optimisation) with autonomy levels from L1 to L5. Published in July 2025.
- How many engines are in production?
- Seven as of April 2026: Create-Articles (content generation), Create-Images (SVG visual generation), Create-Compiler (field validation), Create-Social (social content), Listen-Competitors (competitive intelligence), Create-Articles-Replicate (portable content engine), and Listen-Competitors-Replicate (portable competitive intel). Three more are planned.
- What is a coding agent?
- An AI tool that can write, edit, and debug code across an entire project while maintaining architectural context. Coding agents enabled technical marketers to build production systems that previously required dedicated engineering teams.
- What does Stage 2 mean?
- The system currently has documented contracts between all engines, feedback loops, and quality gates. Stage 1 was independent engines with manual routing. Stage 3 will add semi-automated routing. Stage 4 is full agent team orchestration.