NotebookLM is now an agentic research workstation — on June 8, 2026, Google overhauled the tool from a source-grounded document assistant into a system that plans multi-step research, writes and runs code in a sandboxed cloud computer, and exports finished artifacts. It is the clearest sign yet that the "chat with your documents" era is giving way to "hand a researcher a brief."

The shift matters because it changes what a marketing team can credibly delegate. A tool that only chats produces talking points; a tool that runs code, normalizes a dataset, and generates a formatted deck produces drafts. That is a different category of help — and a different category of risk, because a confident draft full of unverified numbers is more dangerous than a hedged answer in a chat window.

This guide covers exactly what shipped, the cloud-computer-plus-skills architecture underneath it, how to read Google's vendor-stated win-rate numbers honestly, which of the 11 output formats matter for agency work, and the governance line every agency principal should draw before pointing this at client material. Everything below is sourced from Google's June 8 announcement and corroborating independent coverage.

Key takeaways

01. Document assistant became an agentic workstation. The June 8, 2026 overhaul moves NotebookLM from source-grounded chat to a tool that plans tasks, writes and runs code, and generates finished outputs, running on Gemini 3.5 Flash with Antigravity as the orchestration backbone.
02. Each notebook now has a secure cloud computer. An isolated virtual machine lets NotebookLM execute Python-level data work (calculations, dataset normalization, statistical analysis) in the background, instead of guessing numeric trends through token prediction. Vendor-described.
03. 100+ skills and 11 export formats, auto-selected. Over 100 curated software skills are bundled in; the model picks them without manual configuration. Outputs now span 11 formats including PDF, DOCX, CSV, JSON, XLSX, and PPTX, so first-draft decks and analyzed spreadsheets are in scope.
04. The win-rate numbers are Google's own internal eval. Google reports a 65% average win rate versus the prior system, with 69.9% on large-document analysis and 78.2% on advanced web research. These are vendor-stated, from a side-by-side internal evaluation, not independently replicated.
05. The agency line: automate the gathering, keep the judgment. NotebookLM can automate competitive data collection, format conversion, and first-draft artifacts. It cannot decide what an insight means for a client's positioning, make brand claims, or replace the senior interpretation layer. Drafts, not deliverables.

01. What Shipped: From source-grounded chat to an agentic research partner.

On June 8, 2026, Google announced a major NotebookLM overhaul in a post authored by Trond Wuellner (Director of Product Management) and Usama Bin Shafqat (Software Engineer, Google Labs). The headline: NotebookLM now runs on Gemini 3.5 Flash and Antigravity, replacing the prior Gemini 3-based stack, and Google says the upgrade delivers "even more accurate and reliable information along with better visibility into the thinking process."

Two points of precision matter before going further. First, the model is Gemini 3.5 Flash (API ID gemini-3.5-flash), which became generally available June 4, 2026 — not a "Gemini 3.5 Pro," which does not exist, and not Gemini 3.1 Pro. Second, Antigravity in this context is the reasoning and orchestration backbone inside NotebookLM that plans multi-step tasks and invokes skills — distinct from Antigravity the standalone, VS Code-based agentic IDE. Throughout this guide, Antigravity means the orchestration layer, not the coding environment.

The functional changes group into four shifts. You can now start from a vague question and have NotebookLM help build the source repository in chat — suggesting sources via Google Search, locating primary sources in other languages, and surfacing related authors, with the human accepting or rejecting each one. Reasoning steps are now visible in chat, giving a verification path where the model previously operated as a black box. You can provide detailed formatting instructions before generating, and edit outputs after. And underneath all of it, each notebook gets a cloud computer.

Before: Source-grounded chat (Gemini 3 · passive RAG + Deep Research)
Answered questions from uploaded sources and, since November 2025's Deep Research, searched the web autonomously. Numeric reasoning was token prediction, prone to subtle, confident errors on data. Output: talking points.

After: Agentic workstation (Gemini 3.5 Flash · Antigravity · cloud computer)
Plans multi-step tasks, writes and runs code in a sandboxed VM, selects from 100+ skills automatically, shows its reasoning, and exports across 11 formats. Recognizes a data request and hands it to the Antigravity engine. Output: drafts you can edit.

Release snapshot

NotebookLM's agentic overhaul was announced June 8, 2026, running on Gemini 3.5 Flash and Antigravity. It builds on the November 2025 Deep Research feature — the tool's first move from passive retrieval to autonomous web search — by adding code execution, the cloud computer, and full-format output generation. At launch the update is web-only; the Android and iOS apps were not confirmed to receive the new agentic features in the announcement.

02. The Engine: A secure cloud computer and 100+ curated skills.

The real story is not a model swap — it is that NotebookLM became a lightweight agent platform with pre-provisioned tools. Two pieces make that true. Each notebook is now equipped with a secure cloud computer: an isolated virtual machine that lets the tool write and run code in the background during a research session. That is what moves data work from guessing to computing — calculations, dataset normalization, and statistical analysis happen in code rather than by token prediction.
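To make "computing rather than guessing" concrete, here is a minimal sketch of the kind of deterministic data work described above: min-max normalization and basic summary statistics over a small spend series. It is illustrative only. Google has not published the code its sandbox runs, and the function names and sample figures below are hypothetical.

```python
from statistics import mean, median, pstdev


def min_max_normalize(values):
    """Rescale values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant series has no spread; map every value to 0.0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def summarize(values):
    """Return descriptive statistics that are calculated, not predicted."""
    return {
        "count": len(values),
        "mean": mean(values),
        "median": median(values),
        "population_stdev": pstdev(values),
    }


if __name__ == "__main__":
    # Hypothetical weekly ad spend figures, for illustration only.
    weekly_spend = [1200.0, 950.0, 1800.0, 1500.0, 1050.0]
    print(min_max_normalize(weekly_spend))
    print(summarize(weekly_spend))
```

The point of the sketch is that every figure it prints can be reproduced by rerunning the same code on the same inputs, which is what makes a computed number checkable in a way a predicted one is not.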

Layered on top are more than 100 curated software skills. Each skill is a predefined technical blueprint for a class of task — data transformation, statistical analysis, document generation. Crucially, users do not configure or invoke skills manually; the model selects them automatically based on the request. The cloud computer plus the skills library is a software layer, not just a smarter chatbot — and that framing is what puts NotebookLM in the same conversation as dedicated agent platforms.

Curated skills: 100+ auto-selected blueprints
Predefined technical recipes for data transformation, statistical analysis, and document generation. The model picks the right skill for the task with no manual configuration. (Vendor-stated.) No setup required.

Context window: Gemini 3.5 Flash input
1,048,576 input tokens and 65,536 (65K) max output. Generally available June 4, 2026. Google describes it as its most intelligent model for sustained agentic and coding performance.

Reasoning effort: new default level, Medium
Gemini 3.5 Flash defaults to medium effort, with a new minimal level tuned for speed and an improved low level for code and agentic tasks. Intermediate reasoning carries across multi-turn chats automatically.
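The token limits above translate into a simple budget check before loading a large source set. The sketch below is a rough estimate only: the four-characters-per-token ratio is a common heuristic for English text and an assumption here, not a published figure for this model, and the source does not say whether NotebookLM exposes the full model window to notebook sources. The function names and the prompt reserve are hypothetical.

```python
MAX_INPUT_TOKENS = 1_048_576   # input limit quoted above
MAX_OUTPUT_TOKENS = 65_536     # output limit quoted above
CHARS_PER_TOKEN = 4            # ASSUMPTION: rough heuristic for English text


def estimate_tokens(text: str) -> int:
    """Rough token estimate from character count (ceiling division)."""
    return -(-len(text) // CHARS_PER_TOKEN)


def fits_in_context(documents, reserve_for_prompt=2_000):
    """Return (fits, estimated_total_tokens) for a list of source texts.

    reserve_for_prompt is a hypothetical allowance for instructions and chat.
    """
    total = sum(estimate_tokens(doc) for doc in documents) + reserve_for_prompt
    return total <= MAX_INPUT_TOKENS, total


if __name__ == "__main__":
    sources = ["competitor landing page text " * 500, "annual report text " * 2000]
    print(fits_in_context(sources))
```

For anything near the limit, an exact count from the provider's own token-counting tooling is the safer check.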

The shift in one sentence

The change isn't that the model got smarter — it's that NotebookLM stopped guessing at data and started computing it. When the model recognizes a data request, it hands the task to the Antigravity engine and the cloud computer runs real code, instead of predicting a plausible-looking number.

"Each notebook is now equipped with a secure cloud computer, enabling NotebookLM to write and run code useful for helping you perform deeper research and more complex analysis." (Google, NotebookLM announcement, June 8, 2026)

This is why the "workstation" framing earns its keep. Passive RAG could only reorganize what you uploaded. An agent with a sandbox and a skills library can take raw inputs and produce something new — a normalized dataset, a chart, an analyzed spreadsheet — which is precisely the kind of work that used to sit between a junior analyst and a finished deliverable. For agencies already building agentic competitive intelligence workflows, this collapses several manual steps into one prompt.

03. The Numbers: The win rates are real, and vendor-stated.

Google ran a side-by-side internal evaluation of the new system against the prior one and reported a 65% average win rate across five core evaluation dimensions — a 15-percentage-point margin above parity. Two dimensions stood out: a 69.9% win rate on large-document analysis and a 78.2% win rate on advanced web research and source discovery. Independent outlets including PPC Land, Chrome Unboxed, and 9to5Google all report the same figures.

Read those numbers as what they are. They come from Google's own internal side-by-side evaluation; they have not been independently replicated in academic or third-party benchmarks, and a "win rate versus the prior system" measures relative preference, not absolute accuracy. They are a credible directional signal that the agentic system is preferred to its predecessor on Google's own tasks — not a claim of independently verified superiority over any other vendor's tool.
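The arithmetic behind those figures is easy to check. The sketch below computes the margin over parity and, as a sanity check, the mean the three unreported dimensions would need in order to produce the stated average. That second calculation assumes the 65% figure is a simple unweighted mean of five dimensions, which Google has not stated, so treat the result as an illustration rather than a reported number.

```python
PARITY = 50.0             # a tie with the prior system
AVERAGE_WIN_RATE = 65.0   # reported average across five dimensions
DIMENSIONS = 5
KNOWN = {
    "large_document_analysis": 69.9,
    "web_research_and_source_discovery": 78.2,
}


def margin_over_parity(win_rate):
    """Percentage points by which a win rate exceeds a 50/50 tie."""
    return win_rate - PARITY


def implied_mean_of_unreported(average, known, dimensions):
    """Mean the unreported dimensions would need to hit the stated average.

    ASSUMPTION: the average is an unweighted mean over all dimensions.
    """
    remaining = dimensions - len(known)
    return (average * dimensions - sum(known.values())) / remaining


if __name__ == "__main__":
    print(margin_over_parity(AVERAGE_WIN_RATE))
    print(round(implied_mean_of_unreported(AVERAGE_WIN_RATE, KNOWN, DIMENSIONS), 1))
```

Under that assumption the three unreported dimensions would average roughly 59%, which suggests the two headline figures are the strongest results rather than typical ones.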

Google's reported win rates · new system vs prior NotebookLM
Source: Google internal side-by-side evaluation, June 8, 2026 (vendor-stated, not independently replicated)
Measure	What it covers	Win rate
Parity baseline	A tie with the prior NotebookLM system	50%
Average win rate	Across five core evaluation dimensions	65%
Large-document analysis	Win rate on long-source comprehension	69.9%
Web research & source discovery	Win rate on advanced research tasks	78.2%

How to cite this honestly

If you put these figures in a client deck, label them as Google's internal evaluation and frame them as a preference rate versus the previous version — not as an independent benchmark and not as a head-to-head against ChatGPT or Claude. We have deliberately omitted secondary-source claims that Gemini 3.5 Flash is "four times faster than competing LLMs" or "outperforms" a rival flagship: those did not appear in Google's primary announcement and we could not confirm them against a primary source.

04. Output Formats: Eleven output formats, and where the agency value sits.

NotebookLM now supports 11 downloadable output formats, up from a limited prior export set: data visualizations (PNG, SVG); documents (PDF, DOCX, Markdown, plain text); images via Nano Banana (PNG, JPG, GIF); structured data (CSV, JSON); Microsoft Excel (XLSX); and Microsoft PowerPoint (PPTX). The format count comes from Google's own post and is also reported by TechCrunch.

For agencies, the formats that change the workflow are XLSX, PPTX, and PDF. An analyzed Excel from raw spend data, a first-draft client deck from research documents, and a charted PDF report are the artifacts that used to consume junior hours. The table below maps each format to its agency use case. Note one important constraint: the agentic features — code execution, the cloud computer, and full output generation — went to premium tiers at launch, not the free tier.

NotebookLM's 11 downloadable output formats mapped to agentic availability and a typical marketing-agency use case. Agentic output generation shipped to premium tiers (Google AI Ultra and eligible Workspace business plans) at the June 8, 2026 launch, not the free tier.
11 formats · agentic generation gated to premium tiers at launch
Output format	Agentic availability	Marketing-agency use case
PDF report	Premium tiers	Client-ready campaign or competitive findings with embedded charts.
DOCX	Premium tiers	Editable briefs and strategy memos that drop into a team's doc stack.
Markdown	Premium tiers	Content drafts and notes that paste cleanly into a CMS or repo.
SVG chart	Premium tiers	Scalable vector visuals for decks and landing pages.
PNG chart	Premium tiers	Quick-share data visualizations for Slack and email.
Nano Banana image (PNG / JPG / GIF)	Premium tiers	Generated illustrative assets for reports and social drafts.
CSV	Premium tiers	Structured data extracts for import into a spreadsheet or BI tool.
JSON	Premium tiers	Machine-readable output that feeds a downstream automation.
XLSX (Excel)	Premium tiers	Analyzed spend or performance data ready for finance review.
PPTX (PowerPoint)	Premium tiers	First-draft client decks generated from research documents.
Plain text	Premium tiers	Lightweight transcript or summary for archival and search.

Nano Banana is Google's image-generation model integrated into NotebookLM for the PNG, JPG, and GIF outputs; its Pro variant was introduced in November 2025 for the Infographics and Slide Deck features. For most agency reporting, the generated-image formats are a nice-to-have — the structured-data and document formats are where the time savings live, because those feed directly into AI-powered content strategy for agencies and existing reporting pipelines.
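Because the structured-data formats feed other systems, it is worth gating them before anything downstream consumes them. The sketch below validates a JSON export against a small required-field list. The schema (competitor, metric, value, source_url) is entirely hypothetical: the source does not describe the shape of NotebookLM's JSON output, so adapt the field names to whatever an actual export contains.

```python
import json

# HYPOTHETICAL schema for illustration; a real export may look different.
REQUIRED_FIELDS = {"competitor", "metric", "value", "source_url"}


def validate_export(raw_json: str):
    """Split records into (valid, problems) before any automation runs."""
    records = json.loads(raw_json)
    if not isinstance(records, list):
        raise ValueError("expected a JSON array of records")
    valid, problems = [], []
    for index, record in enumerate(records):
        if not isinstance(record, dict):
            problems.append((index, "not an object"))
            continue
        missing = REQUIRED_FIELDS - set(record)
        value = record.get("value")
        if missing:
            problems.append((index, f"missing fields: {sorted(missing)}"))
        elif isinstance(value, bool) or not isinstance(value, (int, float)):
            problems.append((index, "value is not numeric"))
        elif not str(record["source_url"]).strip():
            problems.append((index, "empty source_url"))
        else:
            valid.append(record)
    return valid, problems


if __name__ == "__main__":
    sample = json.dumps([
        {"competitor": "A", "metric": "price", "value": 49,
         "source_url": "https://example.com/pricing"},
        {"competitor": "B", "metric": "price", "value": "n/a",
         "source_url": "https://example.com/pricing"},
    ])
    print(validate_export(sample))
```

Rejecting a record with no source URL is a deliberate choice here: it keeps the sourcing discipline intact once the data leaves the notebook.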

05. The Decision: What to automate, and what to keep human.

This is the section no mainstream coverage has built, and the one agency principals actually need. NotebookLM's agentic outputs are drafts, not final deliverables. The tool cannot exercise strategic judgment on whether a competitive insight matters for a specific client's positioning, cannot make qualitative brand claims without sourced data, and cannot replace the senior analyst's interpretation layer. The verification responsibility stays with the practitioner.

The matrix below splits ten common agency tasks into automate, partial, and keep-human, with the reason and the required human review step for each. It is author analysis, drawn from the feature documentation and marketing-specific use-case coverage — treat it as a governance starting point, not a fixed rule set.

Decision matrix for marketing agencies: which research and content tasks to automate with NotebookLM, which to run partially, and which to keep human, with the reason and the required human-review step for each. Author analysis, June 2026.
Ten tasks · drafts, not deliverables
Agency task	Verdict	Why	Required human-review step
Competitive teardown from public pages	Automate	Structured gathering from sources you supply or approve.	Confirm every cited claim against the competitor's own page.
Format conversion (CSV to XLSX, doc to deck)	Automate	Deterministic transformation, low interpretation risk.	Spot-check totals and labels before sharing.
Campaign performance report	Partial	Pattern surfacing is fast; causal reads are not.	Analyst writes the so-what and the recommended action.
Raw spend and sales data analysis	Partial	Code execution handles math the model used to hallucinate.	Validate the formula logic and source-data integrity.
Multi-source trend synthesis	Partial	Good at collation, weak at ranking what actually matters.	Strategist decides which trend is signal for this client.
First-draft client deck (PPTX)	Partial	Saves the blank-page hours, not the judgment hours.	Rework the narrative, claims, and recommendations by hand.
Content brief from research	Partial	Solid skeleton from synthesized sources.	Editor adds angle, voice, and search intent.
Strategic interpretation of findings	Keep human	Requires judgment on what the insight means for positioning.	Senior owns the read; the tool only supplies inputs.
Client-facing claims and advice	Keep human	Liability and accuracy sit with the practitioner.	Never ship an unverified generated claim to a client.
Brand voice editing	Keep human	Voice is a qualitative call no eval can stand in for.	Human edit pass on every customer-facing word.

The pattern is consistent: NotebookLM is strong at gathering, converting, and drafting, and weak at deciding what matters and standing behind a claim. The tasks safe to fully automate are the deterministic ones — format conversion, structured data collection — where a wrong output is obvious and cheap to catch. Everything that requires a judgment about a specific client sits in the partial or keep-human columns, because that judgment is the service you sell.
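The "spot-check totals" review step in the matrix can be made mechanical. As a minimal sketch, assuming the source data is available as CSV and the generated artifact states a total, the code below recomputes the total independently and compares the two. The column name, tolerance, and sample figures are hypothetical.

```python
import csv
import io
from decimal import Decimal


def column_total(csv_text: str, column: str) -> Decimal:
    """Sum one numeric column of a CSV, using Decimal to avoid float drift."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return sum((Decimal(row[column]) for row in reader), Decimal("0"))


def spot_check(source_csv: str, column: str, reported_total: str,
               tolerance: str = "0.01"):
    """Compare a total stated in a generated artifact with the source data.

    Returns (within_tolerance, recomputed_total, absolute_difference).
    """
    expected = column_total(source_csv, column)
    difference = abs(expected - Decimal(reported_total))
    return difference <= Decimal(tolerance), expected, difference


if __name__ == "__main__":
    # Hypothetical source data and a total copied from a generated report.
    source = "channel,spend\nsearch,1200.50\nsocial,799.50\n"
    print(spot_check(source, "spend", "2000.00"))
```

A mismatch does not say which side is wrong, only that a human needs to look before the artifact is shared.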

"Instead of asking an AI to figure it out from the internet, NotebookLM lets you upload your own sources, and then works exclusively from that material." (Marketing AI Institute, on NotebookLM's core differentiator)

06. Data Governance: What you should never upload, and why the tier matters.

The capability story has a compliance counterpart most coverage skips. On the free Standard tier, human reviewers may review queries, uploads, and model responses when a user submits feedback. Free-tier notebooks have no audit trail and no data-residency control, and they can be shared publicly via link — including every source document inside. That combination makes the free tier inappropriate for confidential client materials, PII, or commercially sensitive documents.

Paid Workspace and Enterprise tiers change the calculus. Google contractually commits that documents and outputs in paid tiers are never used to train foundational models, and enterprise tiers integrate with Google Cloud VPC Service Controls and Identity and Access Management. For any agency handling client data under an NDA, the tier is not a budget decision — it is a contractual one. Confirm the current terms before pointing a notebook at a client's confidential material.

The rule for agencies

Treat the free tier as a public bulletin board: fine for public web sources and your own marketing collateral, never for client PII, unreleased financials, or anything under NDA. Reserve confidential client work for a paid Workspace or Enterprise tier with the train-on-your-data exclusion and access controls in writing — and verify those terms currently apply to your plan.
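One way to make that rule enforceable is a pre-upload check inside an agency's own intake process. The sketch below is a hypothetical internal policy helper, not a NotebookLM feature or API: the tier names, sensitivity labels, and the terms_verified flag are all assumptions chosen to mirror the rule above.

```python
from enum import Enum


class Tier(Enum):
    FREE = "free"
    PAID_WORKSPACE = "paid_workspace"
    ENTERPRISE = "enterprise"


# HYPOTHETICAL sensitivity labels an agency might attach to a document.
RESTRICTED_LABELS = {"client_pii", "unreleased_financials", "nda"}
PUBLIC_LABELS = {"public_web", "own_marketing"}


def may_upload(labels, tier, terms_verified=False):
    """Return (allowed, reason) for a document with the given labels."""
    labels = set(labels)
    unknown = labels - RESTRICTED_LABELS - PUBLIC_LABELS
    if unknown or not labels:
        return False, "unlabeled or unrecognized material: classify first"
    if labels & RESTRICTED_LABELS:
        if tier is Tier.FREE:
            return False, "restricted material is never allowed on the free tier"
        if not terms_verified:
            return False, "verify the plan's no-train and access terms first"
    return True, "allowed"


if __name__ == "__main__":
    print(may_upload({"public_web"}, Tier.FREE))
    print(may_upload({"nda"}, Tier.ENTERPRISE, terms_verified=True))
```

The check fails closed: anything unlabeled is blocked until someone classifies it, which matches treating the tier as a contractual decision rather than a convenience.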

07. Access: Which tier unlocks the agentic features.

NotebookLM cannot be purchased standalone — it is bundled with Google AI subscription plans. At the June 8 launch, the agentic update went to Google AI Ultra subscribers and Workspace business customers with AI Ultra Access and AI Expanded Access. Free, Plus, and Pro tier users were not included at launch, and Google said the update will come to other tiers "over time." The stated reason for the gate is infrastructure cost: running sandboxed cloud-computer instances for millions of simultaneous data-analysis queries carries compute overhead that makes free-tier delivery economically unviable at launch.

On pricing, treat specifics with care. Per-tier limits and monthly prices reported by third-party aggregators do not always match Google's official plans page, so the dependable move is to read current pricing directly from Google's AI plans page before committing a client budget. What is well established: Google AI Ultra became substantially more accessible after a price cut at Google I/O 2026 (May 19, 2026), and the June 8 NotebookLM update was the first major feature wave to land on that tier afterward — Google made the agentic tier cheaper, then immediately gave it powerful new capabilities.

Exploring the tool: Free Standard tier
Good for kicking the tires on public sources and your own marketing collateral. The agentic code-execution and full-output features are not available here at launch, and there is no audit trail, so never upload confidential client material. Public sources only.

Agentic features: Premium AI Ultra tier
The cloud computer, 100+ skills, and 11-format output generation shipped to Google AI Ultra and eligible Workspace business plans at launch. This is the tier that turns NotebookLM into the workstation described here.

Confidential client work: Paid Workspace / Enterprise
Contractual no-train commitment plus VPC Service Controls and IAM integration. Required before any NDA-bound or PII-containing client material enters a notebook. Confirm current terms apply to your specific plan.

Pricing diligence: Verify before you buy
Third-party aggregator prices and per-tier limits can lag Google's official page. Read current pricing directly from Google's AI plans page before committing a client budget, and do not quote a number you have not verified at the source.

For agencies, the practical sequence is to pilot the core tool on the free tier with public data, trial the agentic outputs on a premium plan against your real workflows to confirm they are worth the subscription, then settle on a paid plan sized to your client-confidentiality needs, not just your usage volume. The governance tier almost always dictates the choice before the feature tier does.

08. Implications: What it means for agencies this quarter.

Step back and the trend is clearer than any single feature. NotebookLM is the latest mainstream tool to cross from "answers your questions" to "does the task," and it is doing so inside a source-grounded box rather than the open web. That source-groundedness is the differentiator worth understanding: because NotebookLM works from files you upload plus sources it discovers with your permission, it reduces hallucination risk in citation-dependent work, at the cost of a narrower knowledge surface. That trade is exactly right for competitive intelligence and client reporting, where a sourced answer beats a comprehensive one. We are careful not to call it the first or only agentic research tool — ChatGPT and Claude both ship Deep Research modes; the difference is the grounding discipline, not the novelty.

Projecting forward, the second-order effect is what to watch. Once first-draft decks, analyzed spreadsheets, and competitive teardowns are a prompt away, the scarce skill stops being production and becomes judgment — knowing which insight matters, which claim is defensible, and which draft to throw away. Agencies that treat tools like this as a junior-analyst multiplier, with senior review as the non-negotiable last mile, will compound the time savings. Agencies that ship the drafts unedited will discover that a confident wrong number costs more than the hours it saved. The winning posture is to wire the gathering and drafting into your agentic SEO and research workflows while keeping a human firmly on the interpretation layer.

Where it fits in a stack

NotebookLM is not a replacement for a measurement stack or a campaign platform — it is a research-and-drafting accelerator that sits upstream of them. Use it to compress the gathering and first-draft work, then route the analyzed outputs into your analytics and reporting practice and your content production engine where the human judgment and brand voice get applied. Read its data outputs the same way you would read any AI-powered analytics for marketing decisions — as inputs to verify, not conclusions to ship.

09. Conclusion: A workstation, not a replacement.

The shape of agentic research, June 2026

NotebookLM now drafts the work — your team still owns the judgment.

Google's June 8 overhaul is the most consequential NotebookLM update since Audio Overviews. By giving every notebook a secure cloud computer, a library of 100+ auto-selected skills, and 11 output formats, it crossed the line from a tool that talks about your sources to one that works on them. For a marketing team, that means competitive teardowns, analyzed spreadsheets, and first-draft decks move from junior-hours to a prompt.

The honest framing keeps the win-rate numbers in their lane: 65% average, 69.9% on documents, 78.2% on web research are Google's own internal preference rates versus the prior system, not independent benchmarks. They are a real directional signal, not a head-to-head verdict against any other vendor. Treat them that way in anything you put in front of a client.

The durable lesson is the one the decision matrix encodes: automate the gathering, the conversion, and the drafting; keep the interpretation, the claims, and the brand voice human. The agencies that thrive with tools like this are not the ones that delegate the most — they are the ones that draw the automate-versus-keep-human line cleanly, and put their senior judgment exactly where the tool cannot reach. NotebookLM just made that line more important, not less.
