Web Research Agent

Performs guided web research on a user query and returns a structured report with references. You can constrain sources to specific URLs, include local documents, or both. You also choose a model for each LLM role (fast, smart, strategic) and an embedding model; these choices influence research quality and speed.
Usage

Use this node when you need an AI-generated research summary or report grounded in web sources and optional local files. Provide a clear query, pick a report type and format, optionally add task context, restrict or augment sources via URLs and local documents, then connect the outputs to downstream nodes that consume the report and its reference list.

Inputs

Field | Required | Type | Description | Example
query | True | STRING | The research question or topic to investigate. Supports multiline input. | What are the latest techniques for RAG evaluation in production systems?
report_type | True | SELECT | Select the research style or template (e.g., deep-dive, overview, competitor analysis). Determines how the agent plans and composes the report. | deep_dive
report_format | True | SELECT | Select the output format (e.g., markdown, bullet_points, executive_summary). Controls the structure and presentation of the final report. | markdown
task_context | True | STRING | Optional additional context (such as form data or constraints) to steer the research. Appended to the query internally. | Audience: CTO; Focus on actionable recommendations and recent benchmarks.
embed_model | True | SELECT | Embedding model used for retrieving and ranking relevant context. | text-embedding-3-small
fast_llm_model | True | SELECT | LLM used for fast operations such as quick summaries. | gpt-4o-mini
smart_llm_model | True | SELECT | LLM used for higher-quality reasoning and report generation. | gpt-4o-2024-08-06
strategic_llm_model | True | SELECT | LLM used for strategic planning steps (e.g., research plans and strategies). | o1-preview
source_urls | True | STRING | Comma-separated list of specific URLs to search within. When provided, the research will prioritize or limit to these sources. | https://arxiv.org, https://openai.com/research
combine_source_urls_with_web | True | BOOLEAN | If true, also performs general web search in addition to the provided source URLs. | true
local_documents_path | True | STRING | Path to a folder of local documents to include in the research corpus. | /workspace/research_docs/
combine_local_documents_with_web | True | BOOLEAN | If true and local documents are provided, also include general web sources; otherwise limit to provided sources. | false
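
As a rough illustration of how the input fields fit together, the sketch below collects them in a plain Python dict, using the values from the Example column as defaults. The dict layout and the build_inputs helper are hypothetical conveniences for this example; the page does not describe a programmatic API for the node.

```python
from typing import Any

# Defaults taken from the Example column of the Inputs table.
# The dict layout is a hypothetical convenience, not a documented API.
EXAMPLE_INPUTS: dict[str, Any] = {
    "query": "",
    "report_type": "deep_dive",
    "report_format": "markdown",
    "task_context": "",
    "embed_model": "text-embedding-3-small",
    "fast_llm_model": "gpt-4o-mini",
    "smart_llm_model": "gpt-4o-2024-08-06",
    "strategic_llm_model": "o1-preview",
    "source_urls": "",
    "combine_source_urls_with_web": True,
    "local_documents_path": "",
    "combine_local_documents_with_web": False,
}


def build_inputs(query: str, **overrides: Any) -> dict[str, Any]:
    """Return a full input mapping, rejecting unknown fields and wrong types."""
    if not query.strip():
        raise ValueError("query must not be empty")
    unknown = set(overrides) - set(EXAMPLE_INPUTS)
    if unknown:
        raise KeyError(f"unknown input field(s): {sorted(unknown)}")

    inputs = dict(EXAMPLE_INPUTS)
    inputs.update(overrides)
    inputs["query"] = query

    # STRING and SELECT fields are modelled as str, BOOLEAN fields as bool.
    for name, value in inputs.items():
        expected = type(EXAMPLE_INPUTS[name])
        if not isinstance(value, expected):
            raise TypeError(f"{name} must be {expected.__name__}")
    return inputs
```

For example, build_inputs("What are the latest techniques for RAG evaluation?", source_urls="https://arxiv.org") yields a complete mapping with every other field left at its example value.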

Outputs

Field | Type | Description | Example
report | STRING | The generated research report in the selected format. | # RAG Evaluation in Production 1. Overview... 2. Methods...
references | STRING | A stringified collection of references/citations used in the report. | [{"title":"Paper A","url":"https://arxiv.org/abs/xxx"},{"title":"Blog B","url":"https://example.com"}]
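
Because references arrives as a STRING, a downstream step usually has to parse it before use. The sketch below assumes the string is a JSON array of objects with "title" and "url" keys, as the example value suggests; the actual serialization may differ, so the parser falls back to an empty list instead of raising. The parse_references name is hypothetical.

```python
import json


def parse_references(references: str) -> list[dict]:
    """Parse the stringified references output into a list of dicts.

    Assumes a JSON array of objects, as in the example value above.
    Returns an empty list when the string is empty or not valid JSON.
    """
    if not references or not references.strip():
        return []
    try:
        parsed = json.loads(references)
    except json.JSONDecodeError:
        return []
    if not isinstance(parsed, list):
        return []
    # Keep only entries that look like references with a usable URL.
    return [r for r in parsed if isinstance(r, dict) and r.get("url")]
```

An empty result from this helper is a reasonable trigger for the "Invalid or empty references" check described under Troubleshooting.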

Important Notes

  • Usage limits may apply: This node is categorized under Web Research and may be subject to workspace limits.
  • Source control: Providing source_urls restricts the research to those sites unless combine_source_urls_with_web is true, in which case general web search is used as well.
  • Local documents: Supplying a local_documents_path adds your local files to the research corpus; set combine_local_documents_with_web to true to include general web sources as well.
  • Model selection matters: Fast/Smart/Strategic LLM choices and the embedding model affect latency, quality, and cost.
  • Input formatting: Provide multiple URLs as a single comma-separated list. Either omit spaces entirely or make sure any spaces around the commas are trimmed.
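
The source-control and input-formatting notes above can be sketched as two small helpers. Both function names are hypothetical. The scope logic only restates the notes; how the node behaves when both source_urls and local_documents_path are set with different combine flags is not documented, so the rule used here (web is included if either applicable flag is true) is an assumption.

```python
def normalize_source_urls(raw: str) -> str:
    """Trim whitespace around each URL, drop empty entries and duplicates."""
    seen: list[str] = []
    for part in raw.split(","):
        url = part.strip()
        if url and url not in seen:
            seen.append(url)
    return ",".join(seen)


def source_scope(
    source_urls: str,
    combine_source_urls_with_web: bool,
    local_documents_path: str,
    combine_local_documents_with_web: bool,
) -> set[str]:
    """Return which source kinds a run would draw on, per the notes above."""
    has_urls = bool(normalize_source_urls(source_urls))
    has_local = bool(local_documents_path.strip())

    scope: set[str] = set()
    if has_urls:
        scope.add("source_urls")
    if has_local:
        scope.add("local_documents")

    # With no restriction at all, research falls back to general web search.
    if not has_urls and not has_local:
        scope.add("web")
    # Assumption: either applicable combine flag is enough to add the web.
    if has_urls and combine_source_urls_with_web:
        scope.add("web")
    if has_local and combine_local_documents_with_web:
        scope.add("web")
    return scope
```

For instance, passing only "https://arxiv.org" with combine_source_urls_with_web set to false gives a scope limited to the listed URLs.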

Troubleshooting

  • Timeouts or failing requests: Ensure network connectivity and try again; large or complex queries may require simplifying the prompt or adjusting model selections.
  • Service error received: Verify your user permissions/limits and that the selected models are available; retry later if the research service is temporarily unavailable.
  • Invalid or empty references: Confirm that the query and sources are specific enough; broaden sources or enable combining with web search.
  • No results from local documents: Check that local_documents_path is correct and accessible, and that files are supported/readable.
  • Unexpected output format: Re-run with a different report_format and confirm downstream nodes expect a STRING output.
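
Two of the checks above lend themselves to a small pre-flight script: confirming that local_documents_path contains readable files, and retrying a run after a temporary failure. The sketch below is illustrative only. Both helper names are hypothetical, the call argument stands in for however you trigger the node, and whether the node itself scans subfolders or restricts file types is not documented.

```python
import os
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def readable_files(local_documents_path: str) -> list[str]:
    """List readable files under the folder, or [] if it is not a directory."""
    if not os.path.isdir(local_documents_path):
        return []
    found: list[str] = []
    # Assumption: subfolders are scanned too; the node may behave differently.
    for root, _dirs, files in os.walk(local_documents_path):
        for name in files:
            path = os.path.join(root, name)
            if os.access(path, os.R_OK):
                found.append(path)
    return sorted(found)


def retry_with_backoff(
    call: Callable[[], T],
    attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run call(), retrying on any exception with exponentially growing waits."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return call()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of retries: surface the last error
            sleep(base_delay * 2 ** attempt)
    raise RuntimeError("unreachable")
```

If readable_files returns an empty list, fix the path or file permissions before running the node. Retrying helps only with transient failures; permission or model-availability errors need to be resolved first.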