Web Research Agent

Performs guided web research on a user query and returns a structured report with references. You can restrict sources to specific URLs, include local documents, or both. You also select an LLM for each of three roles (fast, smart, strategic) and an embedding model, which together influence research quality and speed.

Usage

Use this node when you need an AI-generated research summary or report grounded in web sources and optional local files. Provide a clear query, pick a report type and format, optionally add task context, restrict or augment sources via URLs and local documents, then connect the outputs to downstream nodes that consume the report and its reference list.

Inputs

Field	Required	Type	Description	Example
query	True	STRING	The research question or topic to investigate. Supports multiline input.	What are the latest techniques for RAG evaluation in production systems?
report_type	True	SELECT	Select the research style or template (e.g., deep-dive, overview, competitor analysis). Determines how the agent plans and composes the report.	deep_dive
report_format	True	SELECT	Select the output format (e.g., markdown, bullet_points, executive_summary). Controls the structure and presentation of the final report.	markdown
task_context	True	STRING	Optional additional context (such as form data or constraints) to steer the research. Appended to the query internally.	Audience: CTO; Focus on actionable recommendations and recent benchmarks.
embed_model	True	SELECT	Embedding model used for retrieving and ranking relevant context.	text-embedding-3-small
fast_llm_model	True	SELECT	LLM used for fast operations such as quick summaries.	gpt-4o-mini
smart_llm_model	True	SELECT	LLM used for higher-quality reasoning and report generation.	gpt-4o-2024-08-06
strategic_llm_model	True	SELECT	LLM used for strategic planning steps (e.g., research plans and strategies).	o1-preview
source_urls	True	STRING	Comma-separated list of specific URLs to search within. When provided, research is restricted to these sources unless combine_source_urls_with_web is enabled.	https://arxiv.org, https://openai.com/research
combine_source_urls_with_web	True	BOOLEAN	If true, also performs general web search in addition to the provided source URLs.	true
local_documents_path	True	STRING	Path to a folder of local documents to include in the research corpus.	/workspace/research_docs/
combine_local_documents_with_web	True	BOOLEAN	If true and local documents are provided, also include general web sources; otherwise limit to provided sources.	false
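
The interaction between the four source-related inputs can be sketched as a small decision function. This is only an illustration of the behavior described in the table, not the node's implementation: the function name, the returned labels ("urls", "local", "web"), the fallback to general web search when no sources are supplied, and the rule that either enabled flag adds web search are all assumptions.

```python
from __future__ import annotations


def resolve_source_scope(
    source_urls: str,
    combine_source_urls_with_web: bool,
    local_documents_path: str,
    combine_local_documents_with_web: bool,
) -> set[str]:
    """Return the kinds of sources a run would draw on (hypothetical labels)."""
    has_urls = bool(source_urls.strip())
    has_local = bool(local_documents_path.strip())

    scope: set[str] = set()
    if has_urls:
        scope.add("urls")
    if has_local:
        scope.add("local")

    # Assumption: with no restricting input, the agent uses general web search.
    if not scope:
        return {"web"}

    # Assumption: a combine flag only matters when its matching input is set,
    # and either applicable flag is enough to add general web search.
    if (has_urls and combine_source_urls_with_web) or (
        has_local and combine_local_documents_with_web
    ):
        scope.add("web")
    return scope
```

For example, supplying only source_urls with combine_source_urls_with_web set to false yields just the "urls" scope in this sketch, which matches the restriction described in the table.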

Outputs

Field	Type	Description	Example
report	STRING	The generated research report in the selected format.	# RAG Evaluation in Production 1. Overview... 2. Methods...
references	STRING	A stringified collection of references/citations used in the report.	[{"title":"Paper A","url":"https://arxiv.org/abs/xxx"},{"title":"Blog B","url":"https://example.com"}]
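
Because references arrives as a STRING, a downstream step usually has to parse it before use. The sketch below assumes the string is a JSON array of objects with title and url keys, as in the example row. The documentation only calls it a "stringified collection", so the actual encoding may differ, and parse_references is a hypothetical helper name.

```python
import json


def parse_references(references: str) -> list:
    """Parse the references output, assuming a JSON array of objects."""
    if not references or not references.strip():
        return []
    try:
        parsed = json.loads(references)
    except json.JSONDecodeError:
        # Not valid JSON (for example, a different stringification was used).
        return []
    if not isinstance(parsed, list):
        return []
    # Keep only entries that are objects with a non-empty "url" value.
    return [item for item in parsed if isinstance(item, dict) and item.get("url")]


example = (
    '[{"title":"Paper A","url":"https://arxiv.org/abs/xxx"},'
    '{"title":"Blog B","url":"https://example.com"}]'
)
for ref in parse_references(example):
    print(ref.get("title", "(untitled)"), "-", ref["url"])
```

Returning an empty list on malformed input keeps downstream steps simple, at the cost of hiding the parse error; raising instead may be preferable while debugging.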

Important Notes

Usage limits may apply: This node is categorized under Web Research and may be subject to workspace limits.
Source control: Providing source_urls restricts the research to those sites unless you enable combining with general web search.
Local documents: Supplying a local_documents_path includes your local files in the research; optionally combine with web sources.
Model selection matters: Fast/Smart/Strategic LLM choices and the embedding model affect latency, quality, and cost.
Input formatting: Separate multiple URLs with commas, and strip any whitespace around each URL.
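
The input formatting note can be applied before the value reaches the node. The following is a minimal sketch, assuming the URLs are absolute http or https addresses; normalize_source_urls is a hypothetical helper, and whether the node performs similar trimming itself is not stated here.

```python
from urllib.parse import urlparse


def normalize_source_urls(raw: str) -> str:
    """Trim whitespace, drop empty and duplicate entries, and re-join with commas."""
    urls = []
    for part in raw.split(","):
        url = part.strip()
        if not url:
            continue  # skip empty entries such as a trailing comma
        parsed = urlparse(url)
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            raise ValueError(f"Not an absolute http(s) URL: {url!r}")
        if url not in urls:
            urls.append(url)
    return ",".join(urls)


print(normalize_source_urls("https://arxiv.org, https://openai.com/research"))
```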

Troubleshooting

Timeouts or failing requests: Ensure network connectivity and try again; large or complex queries may require simplifying the prompt or adjusting model selections.
Service error received: Verify your user permissions/limits and that the selected models are available; retry later if the research service is temporarily unavailable.
Invalid or empty references: Confirm that the query is specific enough to match the available sources; if the source list is too narrow, broaden it or enable combine_source_urls_with_web.
No results from local documents: Check that local_documents_path is correct and accessible, and that files are supported/readable.
Unexpected output format: Re-run with a different report_format and confirm downstream nodes expect a STRING output.
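
When local documents return no results, a quick pre-flight check on the folder can rule out path and permission problems. This is a sketch only: list_candidate_documents is a hypothetical helper, and the default extension list is an assumption, since the file types the node supports are not listed here.

```python
import os
from pathlib import Path


def list_candidate_documents(folder, extensions=(".pdf", ".txt", ".md", ".docx")):
    """Return readable files under folder whose suffix is in extensions.

    The default extensions are an assumption, not the node's supported list.
    """
    root = Path(folder)
    if not root.is_dir():
        raise FileNotFoundError(f"Not a directory: {folder}")
    return sorted(
        str(p)
        for p in root.rglob("*")
        if p.is_file()
        and p.suffix.lower() in extensions
        and os.access(p, os.R_OK)  # readable by the current user
    )
```

An empty result means the folder exists but holds no readable files with the listed extensions, which points to a file-type or permission issue. A FileNotFoundError points to a wrong local_documents_path.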