
Web Research Agent

Runs an end-to-end web research workflow for a given query and returns a structured report plus references. You can constrain research to specific URLs and/or include local documents, and select different LLM/embedding models for speed, quality, and strategy. The node composes a report request from your inputs and calls a backend research service to generate results.

Usage

Use this node when you need a concise or in-depth research report with cited sources. Typical flow: provide your research query, choose a report type and format, optionally add task context, and decide whether to target specific URLs, include local documents, or allow broader web search. Adjust model choices for fast summaries versus deep reasoning. The output includes the final report and a references list.

Inputs

| Field | Required | Type | Description | Example |
| --- | --- | --- | --- | --- |
| query | True | STRING | Your research question or topic. This is the main prompt that the agent investigates. | What are the latest trends and risks in using RAG for enterprise search? |
| report_type | True | CHOICE | The style/purpose of the report. Options are the available keys from report_types (e.g., overview, deep_dive, competitive_analysis; exact options depend on environment). | deep_dive |
| report_format | True | CHOICE | The output format for the report. Options are the available keys from report_formats (e.g., markdown, bullet_summary; exact options depend on environment). | markdown |
| task_context | False | STRING | Additional context to tailor the research (e.g., form inputs, audience, constraints). Appended to the query to guide the agent. | Audience: CTOs in fintech; Emphasize compliance and cost trade-offs. |
| embed_model | True | CHOICE | Embedding model used to retrieve relevant context. Provided as a choice from available embedding models. | text-embedding-3-small |
| fast_llm_model | True | CHOICE | LLM for fast operations like quick summaries. | gpt-4o-mini |
| smart_llm_model | True | CHOICE | LLM for higher-quality reasoning and composing the main report. | gpt-4o-2024-08-06 |
| strategic_llm_model | True | CHOICE | LLM for strategy-heavy tasks (planning the approach, complex reasoning). | o1-preview |
| source_urls | False | STRING | Limit research to specific sources (no page traversal). Provide a single URL or a comma-separated list. | https://openai.com/research, https://arxiv.org |
| combine_source_urls_with_web | False | BOOLEAN | If true, also perform a general web search in addition to the provided source_urls. | true |
| local_documents_path | False | STRING | Filesystem path to a folder of local documents to include in the research corpus. | /workspace/docs/rag-evaluations |
| combine_local_documents_with_web | False | BOOLEAN | If true, also perform web search in addition to local documents. | false |
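
As a rough illustration of how these inputs fit together, the sketch below assembles the example values from the table into a plain Python dictionary and checks that every field marked Required is filled in. The dictionary shape and the `missing_required` helper are assumptions made for illustration; the node's actual internal representation is not documented here.

```python
# Field names come from the Inputs table. The dict layout and the helper
# below are hypothetical and only illustrate a pre-flight check.
REQUIRED_FIELDS = (
    "query",
    "report_type",
    "report_format",
    "embed_model",
    "fast_llm_model",
    "smart_llm_model",
    "strategic_llm_model",
)


def missing_required(inputs: dict) -> list[str]:
    """Return the required field names that are absent or blank."""
    missing = []
    for name in REQUIRED_FIELDS:
        value = inputs.get(name)
        if value is None or (isinstance(value, str) and not value.strip()):
            missing.append(name)
    return missing


example_inputs = {
    "query": "What are the latest trends and risks in using RAG for enterprise search?",
    "report_type": "deep_dive",
    "report_format": "markdown",
    "task_context": "Audience: CTOs in fintech; Emphasize compliance and cost trade-offs.",
    "embed_model": "text-embedding-3-small",
    "fast_llm_model": "gpt-4o-mini",
    "smart_llm_model": "gpt-4o-2024-08-06",
    "strategic_llm_model": "o1-preview",
    "source_urls": "https://openai.com/research, https://arxiv.org",
    "combine_source_urls_with_web": True,
    "local_documents_path": "/workspace/docs/rag-evaluations",
    "combine_local_documents_with_web": False,
}

print(missing_required(example_inputs))
```

Running a check like this before execution can catch a blank required field early, rather than waiting for the research service to reject the request.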

Outputs

| Field | Type | Description | Example |
| --- | --- | --- | --- |
| report | STRING | The generated research report in the selected format. | # RAG in Enterprise Search: 2025 Outlook 1. Summary... |
| references | STRING | A serialized list or text block of sources cited in the report. | ["https://arxiv.org/abs/2401.12345", "https://openai.com/blog/..." ] |
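
Because references is returned as a STRING that may hold either a serialized list or a plain text block, downstream code may need to handle both shapes. The following is a minimal sketch under two assumptions that the source does not confirm: that the serialized form is JSON, and that the text-block form lists one source per line. `parse_references` is a hypothetical helper, not part of the node.

```python
import json


def parse_references(references: str) -> list[str]:
    """Turn the references output into a list of source strings.

    Assumption: a serialized list is JSON; anything else is treated
    as a text block with one source per non-empty line.
    """
    text = references.strip()
    if not text:
        return []
    try:
        parsed = json.loads(text)
    except json.JSONDecodeError:
        parsed = None
    if isinstance(parsed, list):
        return [str(item).strip() for item in parsed if str(item).strip()]
    # Fallback: not a JSON list, so split the text block by line.
    return [line.strip() for line in text.splitlines() if line.strip()]


as_list = parse_references('["https://arxiv.org/abs/2401.12345", "https://example.com/a"]')
as_block = parse_references("https://example.com/a\n\nhttps://example.com/b\n")
print(as_list)
print(as_block)
```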

Important Notes

  • Report scope: If only source_urls are provided and web search is not combined, research is constrained to those URLs; adding local_documents_path enables a hybrid mode that includes local files.
  • Input formatting: provide source_urls as a single URL or a comma-separated list. Each URL is used as given; a base URL is not crawled and its pagination is not followed.
  • Context handling: task_context is appended to the query to guide the agent’s focus and tone.
  • Model selection: The node prefixes models internally for the backend (e.g., openai:model-name). Ensure chosen models are available in your environment.
  • Service dependency: This node calls an external research service and may take up to ~180 seconds to complete; failures from the service propagate as errors.
  • Access limits: This node may be subject to workspace or plan limits; exceeding limits can block execution.
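
The notes above on context handling, model prefixing, and report scope can be sketched in a few lines. This is an illustrative reading of the documented behavior, not the node's implementation. The blank-line separator in `build_query`, the rule that an already-prefixed model name is left alone, and the assumption that an unconstrained request falls back to web search are all guesses; every function name is hypothetical.

```python
def build_query(query: str, task_context: str = "") -> str:
    """Append task_context to the query (the separator is an assumption)."""
    context = task_context.strip()
    return f"{query.strip()}\n\n{context}" if context else query.strip()


def prefix_model(model: str, provider: str = "openai") -> str:
    """Prefix a model name in the documented provider:model-name style."""
    # Assumption: a name that already contains ':' is treated as prefixed.
    return model if ":" in model else f"{provider}:{model}"


def research_scope(
    source_urls: str = "",
    combine_source_urls_with_web: bool = False,
    local_documents_path: str = "",
    combine_local_documents_with_web: bool = False,
) -> set[str]:
    """Return which source kinds a request would draw on."""
    has_urls = bool(source_urls.strip())
    has_local = bool(local_documents_path.strip())
    scope = set()
    if has_urls:
        scope.add("urls")
    if has_local:
        scope.add("local")
    # Assumption: with no URLs and no local documents, web search is used.
    if (
        (not has_urls and not has_local)
        or (has_urls and combine_source_urls_with_web)
        or (has_local and combine_local_documents_with_web)
    ):
        scope.add("web")
    return scope


print(build_query("RAG risks in enterprise search", "Audience: CTOs in fintech"))
print(prefix_model("gpt-4o-mini"))
print(sorted(research_scope("https://arxiv.org", False, "/workspace/docs/rag-evaluations")))
```

In this reading, supplying both source_urls and local_documents_path with both combine flags off yields the hybrid URL-plus-local mode with no general web search.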

Troubleshooting

  • Non-200 error from research service: Verify your workspace has access and that the backend endpoint is configured. Try again later or simplify the query.
  • Timeouts or long waits: Reduce scope (fewer source_urls, shorter query), choose faster models, or try again during lower load.
  • Empty or low-quality results: Provide clearer task_context, pick a more capable smart_llm_model, or add targeted source_urls or local documents.
  • Invalid source_urls parsing: Ensure URLs are comma-separated with no trailing commas and include a scheme (https://).
  • Local documents not included: Confirm local_documents_path exists and is accessible; set combine_local_documents_with_web to true if you also expect web sources.
  • Unexpected format: Check the selected report_format and report_type; choose a different option better aligned with your needs.
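
For the source_urls parsing issue above, a small pre-flight check can flag stray commas and missing schemes before the node runs. The sketch below uses Python's standard `urllib.parse.urlparse`; the `check_source_urls` helper and its messages are hypothetical and do not reflect how the node itself validates input.

```python
from urllib.parse import urlparse


def check_source_urls(raw: str) -> tuple[list[str], list[str]]:
    """Split a comma-separated source_urls value and report problems.

    Returns (valid_urls, problems). An empty input is accepted,
    since source_urls is an optional field.
    """
    if not raw.strip():
        return [], []
    urls, problems = [], []
    for position, part in enumerate(raw.split(","), start=1):
        candidate = part.strip()
        if not candidate:
            problems.append(f"entry {position} is empty (stray or trailing comma?)")
            continue
        parsed = urlparse(candidate)
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            problems.append(f"entry {position} lacks an http(s) scheme or host: {candidate!r}")
            continue
        urls.append(candidate)
    return urls, problems


good = check_source_urls("https://openai.com/research, https://arxiv.org")
bad = check_source_urls("arxiv.org, https://openai.com/research,")
print(good)
print(bad)
```

Here the second call keeps only https://openai.com/research and reports two problems: the first entry has no scheme, and the trailing comma produces an empty third entry.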