Web Research Agent
Runs an automated research workflow over the web, local documents, or both, and returns a formatted report plus references. It builds a research query from your input and task context, optionally restricts sources to given URLs or local files, and uses the selected LLM and embedding models to generate results in the chosen format.

Usage
Use this node when you need structured, citation-backed research on a topic. Provide a clear query, select the desired report type and output format, and optionally constrain sources (URLs or local documents). Integrate it into workflows that require research summaries, comparative analyses, or data-oriented outputs (CSV/JSON) for downstream processing.
Inputs
| Field | Required | Type | Description | Example |
|---|---|---|---|---|
| query | True | STRING | The main research question or topic to investigate. Can be multi-line. | Assess the market outlook for residential solar in the EU over the next 5 years. |
| report_type | True | CHOICE | Specifies the style of output. Options: Direct Answer, Concise Summary, Detailed Report, Comparative Analysis, News Article, How-To Guide, Creative Writing. | Detailed Report |
| report_format | True | CHOICE | Specifies the output format. Options: Plain Text, Markdown, CSV, JSON, HTML, LaTeX. | Markdown |
| task_context | True | STRING | Additional context to guide the research (e.g., form inputs, target schema, constraints). Leave empty if not needed. | Audience: non-technical executives. Focus on installation cost trends and policy incentives. Include a short risks section. |
| embed_model | True | CHOICE | Embedding model used to gather and rank relevant context. | text-embedding-3-small |
| fast_llm_model | True | CHOICE | LLM for fast operations like brief summaries or lightweight steps. | gpt-4o-mini |
| smart_llm_model | True | CHOICE | LLM for more complex reasoning and full report generation. | gpt-4o-2024-08-06 |
| strategic_llm_model | True | CHOICE | LLM for strategic planning tasks (research plans and strategies) within the workflow. | o1-preview |
| source_urls | True | STRING | Optional comma-separated list of URLs to restrict the research to. Links on those pages are not followed. Leave empty to use general web search. | https://www.iea.org/reports/solar-pv, https://ec.europa.eu/energy_topics |
| combine_source_urls_with_web | True | BOOLEAN | If true, combines general web search with the provided source URLs. If false, research will be limited to the specified URLs when provided. | true |
| local_documents_path | True | STRING | Optional filesystem path to a folder of local documents to include in the research. If set, the node will incorporate these documents into the evidence. | /data/research/solar_policy_docs |
| combine_local_documents_with_web | True | BOOLEAN | If true, includes default web search in addition to local documents. If false and only local documents are provided, research emphasizes the local materials. | false |
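
The fields in the table above can be collected into a plain mapping before they are handed to the node. The following is a minimal sketch only: `build_inputs` is a hypothetical helper, not part of the node, and how the mapping is actually submitted is not described here. Model names come from the Example column; the empty strings and `False` values are illustrative placeholders, not documented defaults.

```python
# Choice lists copied from the Inputs table.
REPORT_TYPES = (
    "Direct Answer", "Concise Summary", "Detailed Report",
    "Comparative Analysis", "News Article", "How-To Guide", "Creative Writing",
)
REPORT_FORMATS = ("Plain Text", "Markdown", "CSV", "JSON", "HTML", "LaTeX")


def build_inputs(query, report_type="Detailed Report",
                 report_format="Markdown", **overrides):
    """Hypothetical helper: assemble a dict keyed by the documented field names."""
    if not query.strip():
        raise ValueError("query must not be empty")
    if report_type not in REPORT_TYPES:
        raise ValueError(f"unknown report_type: {report_type!r}")
    if report_format not in REPORT_FORMATS:
        raise ValueError(f"unknown report_format: {report_format!r}")

    inputs = {
        "query": query,
        "report_type": report_type,
        "report_format": report_format,
        "task_context": "",                       # placeholder: no extra context
        "embed_model": "text-embedding-3-small",  # from the Example column
        "fast_llm_model": "gpt-4o-mini",
        "smart_llm_model": "gpt-4o-2024-08-06",
        "strategic_llm_model": "o1-preview",
        "source_urls": "",                        # placeholder: general web search
        "combine_source_urls_with_web": False,    # placeholder value
        "local_documents_path": "",               # placeholder: no local documents
        "combine_local_documents_with_web": False,
    }
    # Reject field names that are not in the Inputs table.
    unknown = set(overrides) - set(inputs)
    if unknown:
        raise ValueError(f"unknown input fields: {sorted(unknown)}")
    inputs.update(overrides)
    return inputs


# Example: restrict the research to a single URL.
example = build_inputs(
    "Assess the market outlook for residential solar in the EU over the next 5 years.",
    source_urls="https://www.iea.org/reports/solar-pv",
)
```

Validating the two choice fields up front catches typos such as `"PDF"` before a long-running research request is sent.
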
Outputs
| Field | Type | Description | Example |
|---|---|---|---|
| report | STRING | The generated research report in the selected report format and style. | ## EU Residential Solar Outlook (2025–2030) ... (Markdown content) |
| references | STRING | Reference list or citations used to generate the report. Typically includes URLs and/or document identifiers. | - https://www.iea.org/... - https://ec.europa.eu/... |
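
Because `references` is returned as a single string, downstream steps may want the individual URLs. The exact layout of that string is not specified here, so the sketch below simply scans for anything that looks like an HTTP(S) URL. `extract_urls` is a hypothetical helper, and document identifiers that are not URLs are ignored.

```python
import re

# Matches http/https URLs up to the next whitespace, quote, or closing bracket.
URL_PATTERN = re.compile(r"https?://[^\s<>\"')\]]+")


def extract_urls(references: str) -> list[str]:
    """Return the unique URLs found in a references string, in first-seen order."""
    seen = {}
    for match in URL_PATTERN.findall(references):
        url = match.rstrip(".,;:")  # drop trailing sentence punctuation
        seen.setdefault(url, None)  # dict keys preserve insertion order
    return list(seen)


urls = extract_urls("- https://a.example/x, see https://b.example/y.\n- https://a.example/x")
```
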
Important Notes
- Report formatting is strict: The output is crafted to match the selected report_format (e.g., CSV/JSON/HTML). Downstream steps should expect that exact format.
- Source control: Providing source_urls limits research to those URLs unless you set combine_source_urls_with_web to true.
- Local documents: Setting local_documents_path includes local files; ensure the path is accessible to the backend environment executing the research.
- Context injection: task_context is appended to the research prompt; include schemas for CSV/JSON outputs to get structured data.
- Model selection: Different LLMs are used for fast, smart, and strategic steps; choose models appropriate for your cost/quality needs.
- Timeouts: Requests generally time out after several minutes; very broad or complex queries may require narrowing the scope.
- Usage limits: Access to this node may be subject to organizational limits for web research usage.
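
The source-control note above can be read as a small decision rule. The sketch below models only the URL case as documented (empty `source_urls` means general web search; non-empty means URL-only unless `combine_source_urls_with_web` is true). `plan_url_sources` is a hypothetical helper for reasoning about configurations, not the node's internal logic, and it does not model local documents.

```python
def plan_url_sources(source_urls: str, combine_source_urls_with_web: bool) -> dict:
    """Illustrate which sources a given URL configuration would select."""
    # source_urls is a comma-separated string; skip blank entries.
    urls = [u.strip() for u in source_urls.split(",") if u.strip()]
    # Web search applies when no URLs are given, or when explicitly combined.
    use_web_search = not urls or combine_source_urls_with_web
    return {"urls": urls, "use_web_search": use_web_search}


restricted = plan_url_sources(
    "https://www.iea.org/reports/solar-pv, https://ec.europa.eu/energy_topics",
    combine_source_urls_with_web=False,
)
```

Note that splitting on commas assumes the URLs themselves contain no commas.
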
Troubleshooting
- Service error or timeout: If you get an error or no response, try simplifying the query, reducing scope, or verifying connectivity. Consider using fewer or more specific sources.
- Empty or low-quality references: Provide source_urls to enforce credible sources, or add task_context specifying citation requirements.
- Wrong output format: If the report content doesn’t conform (e.g., invalid CSV/JSON), clarify the desired schema in task_context and retry.
- Local documents not found: Verify local_documents_path exists and is reachable by the backend. Use absolute paths and correct permissions.
- Model not available: If a selected model fails, switch to another supported model or revert to the default selection.
- Results too generic: Add detailed task_context (audience, required sections, metrics, comparison criteria) and select a more capable smart_llm_model.
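
For the "wrong output format" case, a downstream step can check the report before using it and, on failure, retry with a more explicit schema in `task_context`. The following is a rough sketch under stated assumptions: `is_valid_report` and `csv_schema_hint` are hypothetical helpers, only JSON and CSV are checked, and the wording of the hint is illustrative rather than a required phrasing.

```python
import csv
import io
import json


def is_valid_report(report: str, report_format: str) -> bool:
    """Check that a report parses as JSON, or as CSV with a consistent column count."""
    if report_format == "JSON":
        try:
            json.loads(report)
        except json.JSONDecodeError:
            return False
        return True
    if report_format == "CSV":
        try:
            rows = [row for row in csv.reader(io.StringIO(report)) if row]
        except csv.Error:
            return False
        if not rows:
            return False
        width = len(rows[0])
        return all(len(row) == width for row in rows)
    return True  # other formats are not checked in this sketch


def csv_schema_hint(columns: list[str]) -> str:
    """Build a task_context snippet that spells out the expected CSV columns."""
    return (
        "Return CSV with a header row and exactly these columns, in order: "
        + ", ".join(columns)
        + "."
    )


hint = csv_schema_hint(["country", "year", "installed_capacity_gw"])
```

A report wrapped in extra prose or code fences would fail the JSON check here, which is usually the desired signal to retry with a stricter `task_context`.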