Research Pipeline
Build an autonomous research system that takes a topic, searches across the web, academic papers, and your internal knowledge base, and delivers a structured report with source citations, reviewed by an editor agent for accuracy. Turn hours of research into minutes.
What you'll have
- A research agent that searches multiple sources in parallel
- An editor agent that validates claims and flags unsupported statements
- A pipeline that chains research → editing in one API call
- Knowledge graph entries that accumulate across research sessions
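Before diving into configuration, the core idea, chaining a research stage into an editing stage, can be pictured with a minimal sketch. The function names and stand-in agents below are hypothetical illustrations, not astra's internal code:

```python
# Conceptual sketch of the research -> edit chain (hypothetical names,
# not astra's implementation). Each stage consumes the previous stage's output.
from typing import Callable

Stage = Callable[[str], str]

def run_pipeline(stages: list[Stage], topic: str) -> str:
    """Feed the topic through each stage in order."""
    output = topic
    for stage in stages:
        output = stage(output)
    return output

# Stand-in agents for illustration only.
def research_agent(topic: str) -> str:
    return f"DRAFT REPORT on: {topic}"

def editor_agent(draft: str) -> str:
    return f"REVIEWED: {draft}"

report = run_pipeline([research_agent, editor_agent], "edge AI inference")
print(report)  # REVIEWED: DRAFT REPORT on: edge AI inference
```

The real pipeline does the same thing server-side: the editor agent never sees the original topic in isolation, only the researcher's draft.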
Step 1: Configure agents
```yaml
# astra.yml — research pipeline
agents:
  - id: research-agent
    systemPromptTemplate: |
      You are a senior research analyst. When given a topic, conduct
      thorough multi-source research using web search, academic papers,
      and existing knowledge. Cite all sources. Produce structured reports.
    providers: [claude, gemini]
    tools:
      allow: [brave_search, arxiv_search, web_scrape, summarize, knowledge_retrieve]
    budget:
      maxCostCents: 300
      maxToolCalls: 50
      maxDuration: 300000  # 5 minutes
  - id: editor-agent
    systemPromptTemplate: |
      You are a technical editor. Review research reports for accuracy,
      clarity, and completeness. Flag unsupported claims.
    providers: [claude]
    tools:
      allow: [brave_search, summarize]
```

Step 2: Run the pipeline
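If you prefer scripting over curl, the same request can be built in Python with only the standard library. The endpoint, headers, and payload mirror the curl command below; the host, token source, and response shape are assumptions about your deployment:

```python
# Sketch: the Step 2 request built in Python instead of curl.
# Endpoint and payload mirror the curl call in this guide.
import json
import os
import urllib.request

def run_pipeline_request(topic: str) -> urllib.request.Request:
    payload = {
        "input": topic,
        "stages": [
            {"agentId": "research-agent"},
            {"agentId": "editor-agent",
             "systemPromptSuffix": "Focus on factual accuracy and source quality"},
        ],
    }
    return urllib.request.Request(
        "http://localhost:3000/pipelines/run",  # assumed local deployment
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": f"Bearer {os.environ.get('JWT_TOKEN', '')}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

req = run_pipeline_request("Analyze the current state of edge AI inference")
# urllib.request.urlopen(req) sends it; the response schema depends on your astra version.
```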
```shell
# Run a research pipeline: research → edit → publish
curl -X POST http://localhost:3000/pipelines/run \
  -H "Authorization: Bearer ${JWT_TOKEN}" \
  -H "Content-Type: application/json" \
  -d '{
    "input": "Analyze the current state of edge AI inference: hardware trends, latency benchmarks, and deployment patterns for 2026",
    "stages": [
      { "agentId": "research-agent" },
      { "agentId": "editor-agent", "systemPromptSuffix": "Focus on factual accuracy and source quality" }
    ]
  }'
```

Step 3: Seed with domain knowledge
Drop a context file into ./workspace/ to guide research focus and source preferences. The file watcher picks it up immediately.
```markdown
# Drop domain knowledge into workspace files
# workspace/RESEARCH-CONTEXT.md
Research focus areas:
- Edge computing and IoT
- LLM inference optimization
- Semiconductor supply chain
Preferred sources:
- ArXiv, IEEE, ACM
- Company technical blogs (not marketing)
- Benchmark datasets from MLPerf
```

Going deeper
For multi-day research projects, use Deep Research mode, which spawns sub-agent swarms for parallel investigation and delivers structured reports with progress tracking.
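The fan-out pattern behind sub-agent swarms looks roughly like this. The sketch below is conceptual; the function names are hypothetical and astra's actual swarm scheduler is not shown here:

```python
# Conceptual sketch of sub-agent fan-out for parallel investigation
# (hypothetical names; not astra's scheduler).
from concurrent.futures import ThreadPoolExecutor

def investigate(subtopic: str) -> str:
    # A real sub-agent would search and summarize; we just tag the subtopic.
    return f"findings[{subtopic}]"

def deep_research(topic: str, subtopics: list[str]) -> str:
    # Sub-topics are investigated concurrently; results come back in order.
    with ThreadPoolExecutor(max_workers=4) as pool:
        findings = list(pool.map(investigate, subtopics))
    return f"{topic}: " + "; ".join(findings)

report = deep_research("edge AI", ["hardware", "latency", "deployment"])
print(report)  # edge AI: findings[hardware]; findings[latency]; findings[deployment]
```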
Research findings automatically flow into the knowledge graph, so each research session builds on the last. After a few runs, the agent starts connecting dots across topics that a human would miss.
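Cross-session accumulation can be pictured as a graph where each run links topics to the entities it found, so later runs can traverse connections earlier runs created. The structure below is a hypothetical illustration; astra's knowledge graph schema is not documented here:

```python
# Sketch of cross-session knowledge accumulation (hypothetical schema).
from collections import defaultdict

class KnowledgeGraph:
    def __init__(self) -> None:
        # Undirected edges between topics and the entities they mention.
        self.edges: dict[str, set[str]] = defaultdict(set)

    def add_finding(self, topic: str, entity: str) -> None:
        self.edges[topic].add(entity)
        self.edges[entity].add(topic)

    def connected_topics(self, entity: str) -> set[str]:
        return self.edges[entity]

kg = KnowledgeGraph()
# Session 1: edge AI research surfaces an entity.
kg.add_finding("edge AI inference", "NVIDIA Jetson")
# Session 2: supply-chain research surfaces the same entity,
# so the two topics are now linked through it.
kg.add_finding("semiconductor supply chain", "NVIDIA Jetson")
print(kg.connected_topics("NVIDIA Jetson"))
```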