DataFrame Columns to List

This node pulls values from selected columns in a DATAFRAME and formats them as a single string that represents a list. It can combine multiple columns, optionally split delimited cell contents, remove missing values, deduplicate entries, quote values, and output either JSON-array-style or tuple-style formatting.

Usage

Use this node when you need to turn table column values into a compact string list for prompts, filters, API parameters, SQL-like clauses, or downstream text-processing workflows. It is commonly placed after data-loading nodes such as CSV to DataFrame, Excel to DataFrame, JSON to DataFrame, or Parquet to DataFrame, and can also follow transformation nodes like Join DataFrames or Concatenate DataFrames. Typical use cases include extracting gene symbols, customer IDs, product SKUs, tags, or category labels from a table and passing the resulting string to downstream text, query, or export nodes. For best results, confirm the column names first with a table preview or conversion node, keep `dropna` enabled so that missing values do not appear as text in the output, and enable `split_cell_values` only when cells contain multiple values separated by a known delimiter.

Inputs

Field	Required	Type	Description	Example
dataframe	True	DATAFRAME	The input table to extract values from. It must contain the columns named in the `columns` field.	A DATAFRAME loaded from a clinical CSV with columns `DRUG1_GENE`, `DRUG2_GENE`, `PATIENT_ID`, and `RESPONSE_STATUS`.
columns	True	STRING	Comma-separated column names to extract. Column names are trimmed for surrounding whitespace and must exactly match columns in the DataFrame after trimming.	DRUG1_GENE,DRUG2_GENE
output_format	True	COMBO: json_array | tuple	Controls the wrapper used around the formatted values. `json_array` returns values inside square brackets, while `tuple` returns values inside parentheses.	json_array
quote_char	True	COMBO: double | single | none	Controls how individual values are quoted. `double` wraps values in double quotes, `single` wraps values in single quotes, and `none` leaves values unquoted. JSON array output always uses double quotes to remain JSON-compatible.	double
split_cell_values	False	BOOLEAN	When enabled, each extracted cell value is converted to text and split using `cell_delimiter`. Useful for cells containing multiple values such as comma-separated tags or gene lists.	True
cell_delimiter	False	STRING	Delimiter used when `split_cell_values` is enabled. Empty split results are discarded after trimming whitespace.	,
dropna	False	BOOLEAN	Drops missing values from each selected column before formatting. Disable this only if missing values should appear in the output as text.	True
unique	False	BOOLEAN	When enabled, duplicate values are removed while keeping the first occurrence order. When disabled, repeated values are preserved.	True
escape_quotes	False	BOOLEAN	Escapes backslashes and matching quote characters inside values so the resulting string is safer to pass into JSON-like or tuple-like text contexts.	True
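
The way `dropna`, `split_cell_values`, `cell_delimiter`, and `unique` interact can be sketched in plain Python. This is only an illustration under stated assumptions, not the node's actual implementation: a dict of lists stands in for the DATAFRAME, the helper names `is_missing` and `collect_values` are hypothetical, and values are assumed to be gathered column by column in the order the columns are listed.

```python
import math


def is_missing(value):
    """Treat None and float NaN as missing (an assumption for this sketch)."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def collect_values(table, columns, split_cell_values=False, cell_delimiter=",",
                   dropna=True, unique=True):
    """Gather cell values column by column and return them as strings."""
    values = []
    for name in columns:
        for cell in table[name]:
            if dropna and is_missing(cell):
                continue
            text = str(cell)
            if split_cell_values:
                # Trim each piece and discard empty split results.
                parts = [part.strip() for part in text.split(cell_delimiter)]
                values.extend(part for part in parts if part)
            else:
                values.append(text)
    if unique:
        # dict keys keep insertion order, so the first occurrence wins.
        values = list(dict.fromkeys(values))
    return values


# Hypothetical table: a dict of lists standing in for the DATAFRAME.
table = {
    "DRUG1_GENE": ["BRCA1", "EGFR, TP53", None],
    "DRUG2_GENE": ["TP53", "ALK", "BRCA1"],
}
genes = collect_values(table, ["DRUG1_GENE", "DRUG2_GENE"], split_cell_values=True)
print(genes)  # ['BRCA1', 'EGFR', 'TP53', 'ALK']
```

With `dropna` disabled in this sketch, the missing cell would be converted to text and appear in the result, which is why the option is best left enabled unless that text is wanted.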

Outputs

Field	Type	Description	Example
formatted_list	STRING	A single formatted string containing the extracted values. Depending on settings, this may look like a JSON array or a tuple-style list and can be passed to downstream prompt, query, or text nodes.	["BRCA1","EGFR","TP53","ALK"]
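
The formatting step that produces `formatted_list` might look like the sketch below. It is an assumption-laden illustration rather than the node's source: `format_list` and `QUOTES` are hypothetical names, the comma separator without spaces is taken from the example output above and assumed to apply to tuple output as well, and unquoted values are assumed to be left unescaped.

```python
QUOTES = {"double": '"', "single": "'", "none": ""}


def format_list(values, output_format="json_array", quote_char="double",
                escape_quotes=True):
    """Quote, escape, and wrap values as a JSON-array-style or tuple-style string."""
    # json_array always uses double quotes, whatever quote_char says.
    quote = '"' if output_format == "json_array" else QUOTES[quote_char]
    items = []
    for value in values:
        text = str(value)
        if escape_quotes and quote:
            # Escape backslashes first so the added escapes are not doubled.
            text = text.replace("\\", "\\\\").replace(quote, "\\" + quote)
        items.append(f"{quote}{text}{quote}")
    opening, closing = ("[", "]") if output_format == "json_array" else ("(", ")")
    return opening + ",".join(items) + closing


genes = ["BRCA1", "EGFR", "TP53", "ALK"]
print(format_list(genes))
# ["BRCA1","EGFR","TP53","ALK"]
print(format_list(genes, output_format="tuple", quote_char="single"))
# ('BRCA1','EGFR','TP53','ALK')
```

Escaping backslashes before quote characters matters: doing it in the opposite order would double the backslashes that were just added in front of each quote.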

Important Notes

Column matching: Every requested column must exist in the DataFrame. If any column is missing, the node stops with an error listing available columns.
JSON behavior: When `output_format` is `json_array`, the node forces double quotes even if `quote_char` is set to `single` or `none`, because JSON requires string values to be double-quoted.
Value conversion: Extracted values are formatted as strings. Numeric IDs, dates, booleans, and other cell types will be represented as text in the output.
Deduplication: With unique enabled, duplicates across all selected columns are removed while preserving the first time each value appears.
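
The column-matching rule can be illustrated with a small validation sketch. The function name `parse_columns`, the use of `ValueError`, the handling of empty entries between commas, and the exact wording after the quoted error messages are assumptions made for illustration; only the two messages "columns parameter must be provided" and "Column 'X' not found in DataFrame" come from this page.

```python
def parse_columns(columns, available):
    """Split a comma-separated column string and check that each name exists."""
    # Trim surrounding whitespace; empty entries are dropped in this sketch.
    names = [name.strip() for name in columns.split(",") if name.strip()]
    if not names:
        raise ValueError("columns parameter must be provided")
    for name in names:
        if name not in available:
            # Matching is exact and case-sensitive after trimming.
            raise ValueError(
                f"Column '{name}' not found in DataFrame. "
                f"Available columns: {list(available)}"
            )
    return names


available = ["DRUG1_GENE", "DRUG2_GENE", "PATIENT_ID", "RESPONSE_STATUS"]
print(parse_columns(" DRUG1_GENE , DRUG2_GENE ", available))
# ['DRUG1_GENE', 'DRUG2_GENE']

try:
    parse_columns("drug1_gene", available)  # wrong capitalization
except ValueError as err:
    print(err)
```

The second call fails because `drug1_gene` differs from `DRUG1_GENE` in capitalization, which is the most common cause of a missing-column error.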

Troubleshooting

columns parameter must be provided: The columns input is empty or only whitespace. Enter one or more exact DataFrame column names, separated by commas.
Column 'X' not found in DataFrame: The column name does not exactly match the table. Check capitalization, spaces, and spelling, or inspect the DataFrame with a preview/export node before extraction.
Unexpected quoted output: If output_format is json_array, values will always use double quotes. Choose tuple if you need single quotes or unquoted values.
Output contains combined values instead of separate entries: Enable `split_cell_values` and set `cell_delimiter` to the delimiter used inside cells, such as `,`, `;`, or `|`.