Deep Research
Create high-fidelity research reports with strict format control, evidence mapping, source governance, and multi-pass synthesis.
Architecture: Lead Agent + Subagents
Lead Agent (coordinator, minimizes raw search context)
|
P0: Environment + source policy setup
|
P1: Research Task Board (roles, queries, parallel groups)
|
Dispatch --+-- Subagent A -- writes task-a.md --+
           +-- Subagent B -- writes task-b.md --+ (parallel)
           +-- Subagent C -- writes task-c.md --+
|                                               |
|   research-notes/ <---------------------------+
|
P2: Build citation registry with source_type + as_of + authority
P3: Evidence-mapped outline with counter-claim flags
P4: Draft from notes (never from raw search results)
P5: Counter-review (claims, confidence, alternatives)
P6: Verify (every [n] in registry, traceability check)
P7: Polish -> final report with confidence markers
Context efficiency: Subagents' raw search results stay in their context and are discarded. Lead agent sees only distilled notes (~60-70% context reduction).
Mode Selection
Determine the research mode before starting:
| Dimension | Options |
|---|---|
| Topic Mode | Enterprise Research (company/corporation) OR General Research (industry/policy/tech) |
| Depth Mode | Standard (5-6 tasks, 3000-8000 words) OR Lightweight (3-4 tasks, 2000-4000 words) |
- Enterprise Research Mode: Six-dimension data collection with structured analysis frameworks (SWOT, risk matrix, competitive barrier quantification)
- General Research Mode: Standard P0-P7 research pipeline with source governance
- Depth Selection: Lightweight for a single entity or concept described in under 30 words; Standard for multi-entity comparisons or requests that ask for "深入" ("in-depth") or "comprehensive" coverage
Source Governance (V6)
Source Accessibility Classification
CRITICAL RULE: Every source must be classified by accessibility:
| Accessibility | Definition | Examples | Usage Rule |
|---|---|---|---|
| public | Available to any external researcher without authentication | Public websites, news articles, WHOIS (without privacy), academic papers | ✅ Always allowed |
| semi-public | Requires registration or limited access | LinkedIn profiles, Crunchbase basic, industry reports (free tier) | ✅ Allowed with disclosure |
| exclusive-user-provided | User's paid subscriptions, private APIs, proprietary databases | Crunchbase Pro, PitchBook, private data feeds, internal databases | ✅ ALLOWED for third-party research |
| private-user-owned | User's own accounts when researching themselves | User's registrar for user's own company, user's bank for user's own finances | ❌ FORBIDDEN - circular verification |
⚠️ CIRCULAR VERIFICATION BAN: You must NOT:
- Use user's private data to "discover" what they already know about themselves
- Research user's own company by accessing user's private accounts
- Present user's private knowledge as "research findings"
✅ EXCLUSIVE INFORMATION ADVANTAGE: You SHOULD:
- Use user's Crunchbase Pro to research competitors
- Use user's proprietary databases for market research
- Use user's private APIs for investment analysis
- Leverage any exclusive source user provides for third-party research
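The classification and the two rules above can be read as a small lookup. The following is a minimal sketch, not part of the skill itself: the names `Accessibility`, `USAGE_RULES`, `classify_user_source`, and the `subject_is_user` flag are assumptions introduced for illustration, and treating every user-credentialed source as private-user-owned when the subject is the user is one possible reading of the table.

```python
from enum import Enum


class Accessibility(Enum):
    PUBLIC = "public"
    SEMI_PUBLIC = "semi-public"
    EXCLUSIVE_USER_PROVIDED = "exclusive-user-provided"
    PRIVATE_USER_OWNED = "private-user-owned"


# Usage rule per class, paraphrased from the table above.
USAGE_RULES = {
    Accessibility.PUBLIC: "allow",
    Accessibility.SEMI_PUBLIC: "allow-with-disclosure",
    Accessibility.EXCLUSIVE_USER_PROVIDED: "allow-third-party-only",
    Accessibility.PRIVATE_USER_OWNED: "reject-circular",
}


def classify_user_source(subject_is_user: bool) -> Accessibility:
    """Classify a source that is reachable only through the user's credentials.

    subject_is_user is a hypothetical flag: True when the research target is
    the user or the user's own company/assets (the circular case).
    """
    if subject_is_user:
        return Accessibility.PRIVATE_USER_OWNED
    return Accessibility.EXCLUSIVE_USER_PROVIDED


def usage_rule(accessibility: Accessibility) -> str:
    """Look up the usage rule for one accessibility class."""
    return USAGE_RULES[accessibility]
```

For example, `usage_rule(classify_user_source(subject_is_user=False))` yields the third-party-only rule, which corresponds to using the user's Crunchbase Pro to research competitors.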
Source Type Labels
Every source MUST also be tagged with:
| Label | Definition | Examples |
|---|---|---|
| official | Primary source, official documentation | Company SEC filings, government reports, official blog |
| academic | Peer-reviewed research | Journal articles, conference papers, dissertations |
| secondary-industry | Professional analysis | Industry reports, analyst coverage, trade publications |
| journalism | News reporting | Reputable media outlets, investigative journalism |
| community | User-generated content | Forums, reviews, social media, Q&A sites |
| other | Uncategorized or mixed | Aggregators, unverified sources |
Quality Gates:
- Standard mode: ≥30% official sources in final approved set
- Lightweight mode: ≥20% official sources
- Maximum single-source share: ≤25% (Standard), ≤30% (Lightweight)
- Minimum unique domains: 5 (Standard), 3 (Lightweight)
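The gates above are simple ratios over the approved source set. Here is a hedged sketch of how they could be checked. The `Source` dataclass, the `check_quality_gates` function, and the reading of "single-source share" as the share held by the most frequent domain are assumptions, not definitions from this document.

```python
from collections import Counter
from dataclasses import dataclass
from urllib.parse import urlparse

# Thresholds copied from the Quality Gates list above.
GATES = {
    "standard": {"min_official": 0.30, "max_single": 0.25, "min_domains": 5},
    "lightweight": {"min_official": 0.20, "max_single": 0.30, "min_domains": 3},
}


@dataclass
class Source:
    url: str
    source_type: str  # one of the Source Type Labels


def check_quality_gates(sources, mode="standard"):
    """Return a list of gate failures; an empty list means all gates pass."""
    if not sources:
        return ["no approved sources"]
    gate = GATES[mode]
    total = len(sources)
    official_share = sum(s.source_type == "official" for s in sources) / total
    # Assumption: "single source" is approximated by the URL's domain.
    domains = Counter(urlparse(s.url).netloc.lower() for s in sources)
    top_share = max(domains.values()) / total

    failures = []
    if official_share < gate["min_official"]:
        failures.append(f"official share {official_share:.0%} below minimum")
    if top_share > gate["max_single"]:
        failures.append(f"single-source share {top_share:.0%} above maximum")
    if len(domains) < gate["min_domains"]:
        failures.append(f"only {len(domains)} unique domains")
    return failures
```

Returning the failures as a list rather than a boolean keeps the reason for each rejection available for the registry report.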
AS_OF Date Policy
Set AS_OF date explicitly at P0. For all time-sensitive claims:
- Include source publication date with every citation
- Downgrade confidence if source is older than relevant horizon
- Flag stale sources in registry (studies >3 years, news >6 months for fast-moving topics)
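The staleness rule in the last bullet can be sketched as a date comparison against AS_OF. This is an illustration only: the `is_stale` function and the `"study"` / `"news"` keys are hypothetical, and the day counts approximate "3 years" and "6 months".

```python
from datetime import date

# Horizons from the policy above, approximated in days (assumption).
STALE_AFTER_DAYS = {
    "study": 3 * 365,  # studies older than 3 years
    "news": 183,       # news older than 6 months, for fast-moving topics
}


def is_stale(published: date, as_of: date, kind: str) -> bool:
    """True when the source is older than the horizon for its kind."""
    age_days = (as_of - published).days
    return age_days > STALE_AFTER_DAYS[kind]
```

A stale source is not necessarily dropped; per the policy it is flagged in the registry and the confidence of claims that depend on it is downgraded.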
P0: Environment & Policy Setup
Check capabilities before starting:
| Check | Requirement | Impact if Missing |
|---|---|---|
| web_search available | Required | Stop - cannot proceed |
| web_fetch available | Required for DEEP tasks | SCAN-only mode |
| Subagent dispatch | Preferred | Degrade to sequential |
| Filesystem writable | Required | In-memory notes only |
Set policy variables:
AS_OF: Today's date (YYYY-MM-DD) - mandatory for timed topics
MODE: Standard (default) or Lightweight
SOURCE_TYPE_POLICY: Enforce official/academic/secondary-industry/journalism/community/other labels
COUNTER_REVIEW_PLAN: What opposing interpretation to test
Report: [P0 complete] Subagent: {yes/no}. Mode: {standard/lightweight}. AS_OF: {YYYY-MM-DD}.
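The capability table and the report line above could be combined roughly as follows. This is a sketch under stated assumptions: `p0_setup`, its boolean parameters, and the keys of the returned dict are hypothetical names, and how capabilities are actually detected is left to the host environment.

```python
from datetime import date


def p0_setup(web_search, web_fetch, subagents, fs_writable,
             mode="standard", as_of=None):
    """Resolve degraded modes from capability flags and build the P0 report."""
    if not web_search:
        # web_search is the only hard stop in the capability table.
        raise RuntimeError("web_search unavailable: cannot proceed")
    as_of = as_of or date.today()
    plan = {
        "depth": "DEEP+SCAN" if web_fetch else "SCAN-only",
        "dispatch": "parallel" if subagents else "sequential",
        "notes": "files" if fs_writable else "in-memory",
        "mode": mode,
        "as_of": as_of.isoformat(),
    }
    plan["report"] = (
        f"[P0 complete] Subagent: {'yes' if subagents else 'no'}. "
        f"Mode: {mode}. AS_OF: {plan['as_of']}."
    )
    return plan
```

Passing `as_of` explicitly keeps the run reproducible; defaulting to today's date matches the policy variable above.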
When researching a specific company/enterprise, follow this specialized workflow that ensures six-dimension coverage, quantified analysis frameworks, and three-level quality control.
Enterprise Workflow Overview
Enterprise Research Progress:
- [ ] E1: Intake - confirm company entity, research depth, format contract
- [ ] E2: Six-dimension data collection (parallel where possible)
- [ ] D1: Company fundamentals (entity, founding, funding, ownership)
- [ ] D2: Business & products (segments, products, revenue structure)
- [ ] D3: Competitive position (industry rank, competitors, barriers)
- [ ] D4: Financial & operations (3-year financials, efficiency metrics)
- [ ] D5: Recent developments (6-month events, strategic signals)
- [ ] D6: Internal/proprietary sources (or note limitation)
- [ ] E3: Structured analysis frameworks
- [ ] SWOT analysis (evidence-backed, 4 quadrants × 3-5 entries)
- [ ] Competitive barrier quantification (7 dimensions, weighted score)
- [ ] Risk matrix (8 categories, probability × impact)
- [ ] Comprehensive scorecard (6 dimensions, weighted total)
- [ ] E4: L1/L2/L3 quality checks at each stage transition
- [ ] E5: Draft report using 7-chapter enterprise template
- [ ] E6: Multi-pass drafting + UNION merge (same as general Step 6-7)
- [ ] E7: Present draft for human review and iterate
P1: Research Task Board
Decompose the research question into 4-6 investigation tasks (Standard) or 3-4 tasks (Lightweight).
Each task assignment includes:
- Expert Role: Specialist persona (e.g., "Policy Historian", "Ecosystem Mapper")
- Objective: One-sentence investigation goal
- Queries: 2-3 pre-planned search queries
- Depth: DEEP (fetch 2-3 full articles) or SCAN (snippets sufficient)
- Output: Path to research notes file
- Parallel Group: Group A (independent) or Group B (depends on Group A)
Task Decomposition Rules
- Each task covers one coherent sub-topic a specialist would own
- Group A tasks must be independent and source-diverse
- Max 3 tasks per parallel group (concurrency limit)
- Every task must flag time-sensitive claims and expected citation aging risk
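One way to represent a task assignment and check a board against the rules above is sketched below. The `Task` dataclass, the `validate_board` function, and the default notes path under `research-notes/` are assumptions made for illustration.

```python
from dataclasses import dataclass

MAX_TASKS_PER_GROUP = 3  # concurrency limit from the rules above
TASK_RANGE = {"standard": (4, 6), "lightweight": (3, 4)}


@dataclass
class Task:
    task_id: str
    role: str        # expert persona, e.g. "Policy Historian"
    objective: str   # one-sentence investigation goal
    queries: list    # 2-3 pre-planned search queries
    depth: str       # "DEEP" or "SCAN"
    group: str       # "A" (independent) or "B" (depends on Group A)
    output: str = ""

    def __post_init__(self):
        # Hypothetical default location for the notes file.
        self.output = self.output or f"research-notes/task-{self.task_id}.md"


def validate_board(tasks, mode="standard"):
    """Return a list of rule violations; an empty list means the board is valid."""
    problems = []
    low, high = TASK_RANGE[mode]
    if not low <= len(tasks) <= high:
        problems.append(f"{len(tasks)} tasks, expected {low}-{high}")
    for task in tasks:
        if not 2 <= len(task.queries) <= 3:
            problems.append(f"task {task.task_id}: needs 2-3 queries")
        if task.depth not in ("DEEP", "SCAN"):
            problems.append(f"task {task.task_id}: unknown depth {task.depth}")
    for group in ("A", "B"):
        size = sum(task.group == group for task in tasks)
        if size > MAX_TASKS_PER_GROUP:
            problems.append(f"group {group}: {size} tasks exceeds limit")
    return problems
```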
Enterprise Research Integration
When in Enterprise Research Mode, task board maps to six dimensions:
- Task A: Company fundamentals (entity, founding, funding, ownership)
- Task B: Business & products (segments, products, revenue structure)
- Task C: Competitive position (industry rank, competitors, barriers)
- Task D: Financial & operations (3-year financials, efficiency metrics)
- Task E: Recent developments (6-month events, strategic signals)
- Task F: Internal/proprietary sources (or document limitation)
Report: [P1 complete] {N} tasks in {M} groups. Dispatching Group A.
Enterprise Research Mode (Specialized Pipeline)
E1: Intake
Same as P0/P1 above, plus:
- Confirm the exact legal entity being researched (parent vs subsidiary)
- Select research depth: Quick scan (3-5 pages) / Standard (10-20 pages) / Deep (20-40 pages)
- Identify any specific comparison targets (benchmark companies)
P2: Dispatch + Investigate
Subagents execute tasks using the prompt in references/subagent_prompt.md and write their output in the format defined in references/research_notes_format.md.
With Subagents (Claude Code / Cowork / DeerFlow)
- Dispatch Group A tasks in parallel (max 3 concurrent)
- Each subagent searches, fetches, and tags source types
- Every source line includes Source-Type and As Of
- Wait for Group A completion
- Dispatch Group B (can read Group A notes)
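The dispatch order above (Group A in parallel, then Group B with access to Group A notes) can be sketched with a thread pool. This is illustrative only: `dispatch` and the `run_task` callback are hypothetical stand-ins for whatever subagent mechanism the host provides.

```python
from concurrent.futures import ThreadPoolExecutor

MAX_CONCURRENT = 3  # max concurrent subagents per the list above


def dispatch(groups, run_task):
    """Run Group A, wait, then run Group B with Group A's notes available.

    groups:   {"A": [task_id, ...], "B": [task_id, ...]}
    run_task: hypothetical callable (task_id, prior_notes) -> notes
    """
    notes = {}
    for group in ("A", "B"):
        task_ids = groups.get(group, [])
        prior = dict(notes)  # snapshot of notes from earlier groups
        with ThreadPoolExecutor(max_workers=MAX_CONCURRENT) as pool:
            # map() preserves input order, so results line up with task_ids.
            results = list(pool.map(lambda t: run_task(t, prior), task_ids))
        notes.update(zip(task_ids, results))
    return notes
```

Leaving the `with` block is what implements "wait for Group A completion": the executor shuts down only after all submitted tasks have finished.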
Subagent Output Requirements
Each task-{id}.md must contain:
- Sources section: URLs from actual search results with Source-Type, As Of, Authority (1-10)
- Findings section: Max 10 one-sentence facts with source numbers
- Deep Read Notes (DEEP tasks): 2-3 sources read in full with key data/insights
- Gaps section: What was searched but NOT found, alternative interpretations
Without Subagents (Degraded Mode)
Lead agent executes tasks sequentially, acting as each specialist. Raw search results are discarded after writing notes.
Enterprise Research: Six-Dimension Collection
Follow references/enterprise_research_methodology.md for:
- Detailed collection workflow per dimension (query strategies, data fields, validation)
- Data source priority matrix (P0-P3 ranking)
- Cross-validation rules (min sources, max deviation thresholds)
Key principles:
- Evidence-driven: every conclusion must trace to a citable source
- Multi-source validation: key data requires ≥2 independent sources
- Restrained judgment: mark speculation explicitly, avoid unsubstantiated claims
- Structured presentation: complex information via tables, lists, hierarchies
Run L1 quality check after completing each dimension (see enterprise_quality_checklist.md).
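The multi-source validation principle could be checked along these lines. The actual minimum-source and deviation thresholds live in references/enterprise_research_methodology.md and are not reproduced here, so the 10% `max_deviation` default, the `cross_validate` name, and the use of distinct domains as a proxy for "independent" are all assumptions.

```python
from urllib.parse import urlparse


def cross_validate(values_by_url, max_deviation=0.10):
    """Compare one numeric data point as reported by several sources.

    values_by_url: {source_url: reported_value}
    Returns "insufficient-sources", "consistent", or "conflict".
    """
    # Assumption: sources on different domains count as independent.
    domains = {urlparse(url).netloc.lower() for url in values_by_url}
    if len(domains) < 2:
        return "insufficient-sources"
    values = list(values_by_url.values())
    scale = max(abs(v) for v in values)
    # Relative spread between the lowest and highest reported value.
    spread = (max(values) - min(values)) / scale if scale else 0.0
    return "consistent" if spread <= max_deviation else "conflict"
```

A "conflict" result does not decide which source is right; it marks the data point for the counter-claim and confidence handling described elsewhere in the pipeline.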
Status per task: [P2 task-{id} complete] {N} sources, {M} findings.
Status all: [P2 complete] {N} tasks done, {M} total sources. Building registry.
E3: Structured Analysis Frameworks
Apply frameworks from references/enterprise_analysis_frameworks.md in order:
- SWOT analysis: each entry with evidence + source + impact assessment
- Competitive barrier quantification: 7 dimensions with weighted scoring -> A+/A/B+/B/C+/C rating
- Risk matrix: 8 mandatory categories, probability × impact -> Red/Yellow/Green
- Comprehensive scorecard: 6-dimension weighted total -> X/10
Run L2 quality check after analysis is complete.
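The arithmetic behind the risk matrix and the weighted scorecard is sketched below. The real scales, cut-offs, and weights are defined in references/enterprise_analysis_frameworks.md; the 1-5 scales, the 15 and 6 thresholds, and the function names here are placeholder assumptions.

```python
def risk_level(probability: int, impact: int) -> str:
    """Map probability x impact (each assumed to be 1-5) to a color band."""
    score = probability * impact
    if score >= 15:   # assumed Red threshold
        return "Red"
    if score >= 6:    # assumed Yellow threshold
        return "Yellow"
    return "Green"


def weighted_total(scores: dict, weights: dict) -> float:
    """Weighted average of per-dimension scores (0-10), rounded to one decimal."""
    total_weight = sum(weights.values())
    weighted_sum = sum(scores[name] * weights[name] for name in weights)
    return round(weighted_sum / total_weight, 1)
```

Dividing by the sum of the weights means the weights do not have to add up to exactly 1 for the result to stay on the X/10 scale.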
E4: Quality Control
Three-level checks from references/enterprise_quality_checklist.md:
- L1 (Data): Source count, attribution, cross-validation, timeliness
- L2 (Analysis): SWOT completeness, risk coverage, barrier scoring, conclusion support
- L3 (Document): Structure compliance, format consistency, readability, appendices
E5: Draft Using Enterprise Template
Use the 7-chapter enterprise report template from enterprise_quality_checklist.md:
- Company Overview
- Business & Product Structure
- Market & Competitive Position
- Financial & Operations Analysis
- Risks & Concerns
- Recent Developments
- Comprehensive Assessment & Conclusion
Plus appendices: Data Source Index, Glossary, Disclaimer.
E3-E7: Enterprise Analysis, Drafting, and Review
P3: Citation Registry + Source Governance
Lead agent reads all task notes and builds unified registry.
Registry Process
- Read every task file's ## Sources section
- Merge all sources, deduplicate by URL
- Assign sequential [n] numbers by first appearance
- Tag: source_type, as_of date, authority score (1-10), task id
- Apply quality gates:
- Standard: ≥12 approved sources, ≥5 unique domains, ≥30% official
- Lightweight: ≥6 approved sources, ≥3 unique domains, ≥20% official
- Max single-source share: ≤25% (Standard), ≤30% (Lightweight)
- Drop sources below threshold and list them explicitly
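The merge, deduplicate, and number steps of the registry process can be sketched as follows. The `build_registry` function, the dict keys, and the URL normalization (trimming whitespace and a trailing slash) are assumptions; the quality-gate and drop steps are left out of this sketch.

```python
def build_registry(task_sources):
    """Merge per-task source lists into one numbered registry.

    task_sources: {task_id: [{"url": ..., "source_type": ..., ...}, ...]},
    given in task order so that numbering follows first appearance.
    """
    registry = {}
    for task_id, sources in task_sources.items():
        for source in sources:
            # Assumed normalization for the "deduplicate by URL" step.
            key = source["url"].strip().rstrip("/")
            if key in registry:
                continue  # keep the first appearance only
            registry[key] = {**source, "n": len(registry) + 1, "task": task_id}
    return list(registry.values())
```

Because Python dicts preserve insertion order, the returned list is already sorted by citation number.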
Registry Output Format
CITATION REGISTRY
Approved:
[1] Author/Org - Title | URL | Source-Type: official | Accessibility: public | Date: 2026-03-01 | Auth: 8 | task-a
[2] ...
Dropped:
x Source | URL | Source-Type: community | Accessibility: private-user-owned | Auth: 3 | Reason: CIRCULAR - USER ALREADY KNOWS
Stats: {approved}/{total}, {N} domains, official_share {xx}%
Privileged sources rejected: {N}
Critical rule: These [n] are FINAL. P5 may only cite from Approved list. Dropped sources never reappear.
Circular verification handling: When researching the user's own company/assets, if you discover data in user's private accounts (e.g., user's domain registrar showing they own domains), you MUST:
- Reject it from the registry (user already knows this)
- Note it as "CIRCULAR - USER ALREADY KNOWS" in Dropped
- Search for equivalent PUBLIC sources (e.g., public WHOIS, news articles)
- If no public equivalent exists, mark the claim as [unverified] or omit it
- Report from external investigator perspective only
Exclusive source handling: When the user EXPLICITLY PROVIDES their paid subscriptions or private APIs for third-party research (e.g., "Use my Crunchbase Pro to research competitors"), you SHOULD:
- Accept it as "exclusive-user-provided" accessibility
- Use it as a competitive advantage
- Cite it properly in the registry
Report: [P3 complete] {approved}/{total} sources. {N} domains. Official share: {xx}%. Privileged rejected: {N}.
Handling Information Black Box
When researching entities with no public footprint (like the "ByteDance subsidiary" (字节跳动子公司) example):
What an external researcher would find:
- WHOIS: Privacy protected -> No owner info
- Web search: No news, no press releases
- Social media: No company pages
- Business registries: No public API or requires local access
- Result: Complete information black box
Correct response:
Findings: NO PUBLIC INFORMATION AVAILABLE
Sources checked:
- WHOIS (public): Privacy protected [failed]
- Company registry (public): Access denied/No API [failed]
- News media: No coverage [failed]
- Corporate website: Placeholder only [minimal]
Verdict: UNABLE TO VERIFY COMPANY EXISTENCE from external perspective
Sources found: 0 (or minimal, e.g., only WHOIS showing domain exists)
Confidence: N/A - Insufficient evidence
DO NOT:
- ❌ Use user's own credentials to "fill in the gaps"
- ❌ Assume the company exists based on domain registration alone
- ❌ Fill missing data with speculation
- ❌ Claim to have "verified" information you accessed through privileged means
DO:
- ✅ Clearly state what an external researcher can/cannot verify
- ✅ Document all failed search attempts
- ✅ Mark claims as [unverified] or omit entirely
- ✅ Downgrade mode to Lightweight or stop if insufficient public sources
- ✅ Recommend direct contact for due diligence
P4: Evidence-Mapped Outline
Lead agent reads notes + registry to build outline.
- Identify cross-task patterns
- Design sections topic-first, not task-order-first
- Map each section to specific findings with source numbers
- Flag sections needing counter-review
- Mark recency-sensitive claims with AS_OF checks
Outline format:
## N. {Section Title}
Sources: [1][3][7] from tasks a, b
Claims: {claim from task-a finding 3}, {claim from task-b finding 1}
Counter-claim candidates: {alternative explanations}
Recency checks: {source dates + AS_OF}
Gaps: {limited official evidence}
P5: Draft from Notes
Write section by section using references/report_template_v6.md.
Rules:
- Every factual claim needs citation [n]
- Numbers/percentages must have source
- Add confidence marker per section: High/Medium/Low with rationale
- Add counter-claim sentence when evidence conflicts
- No new sources may be introduced
- Use [unverified] for unsupported statements
Anti-hallucination:
- Lead agent never invents URLs β only from subagent notes
- Lead agent never fabricates data β mark [unverified] if number not in notes
Status: [P5 in progress] {N}/{M} sections, ~{words} words.
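Two of the drafting rules above lend themselves to a mechanical check: every [n] must exist in the approved registry, and sentences containing numbers need a citation or an [unverified] marker. The sketch below is a rough heuristic under stated assumptions; the function names are hypothetical and the sentence splitting is deliberately naive.

```python
import re

CITATION = re.compile(r"\[(\d+)\]")


def unknown_citations(draft: str, approved: set) -> list:
    """Citation numbers used in the draft that are not in the approved set."""
    cited = {int(n) for n in CITATION.findall(draft)}
    return sorted(cited - approved)


def uncited_numeric_sentences(draft: str) -> list:
    """Sentences that contain a digit but neither a [n] nor [unverified]."""
    # Naive split: sentence-ending punctuation followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", draft.strip())
    return [
        s for s in sentences
        if re.search(r"\d", s)
        and not CITATION.search(s)
        and "[unverified]" not in s
    ]
```

A non-empty result from either function is a signal to go back to the notes, not to add a new source, since no new sources may be introduced at this stage.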
P6: Counter-Review (Mandatory)
For each major conclusion, perform opposite-view checks:
- Could the conclusion be wrong?
- Which high-impact claims depend on a single source?
- Which claims lack official/academic support?