DataFrame Columns to List

This node pulls values from selected columns in a DATAFRAME and formats them as a single string list. It can combine multiple columns, optionally split delimited cell contents, remove missing values, deduplicate entries, quote values, and output either JSON-array-style or tuple-style formatting.

Usage

Use this node when you need to turn table column values into a compact string list for prompts, filters, API parameters, SQL-like clauses, or downstream text-processing workflows. It is commonly placed after data-loading nodes such as CSV to DataFrame, Excel to DataFrame, JSON to DataFrame, or Parquet to DataFrame, and can also follow transformation nodes like Join DataFrames or Concatenate DataFrames. Typical use cases include extracting gene symbols, customer IDs, product SKUs, tags, or category labels from a table and passing the resulting string to downstream text, query, or export nodes. For best results, confirm column names first with a table preview or conversion node, use dropna to avoid unwanted missing-value strings, and enable split_cell_values only when cells contain multiple values separated by a known delimiter.

Inputs

| Field | Required | Type | Description | Example |
| --- | --- | --- | --- | --- |
| dataframe | True | DATAFRAME | The input table to extract values from. It must contain the columns named in the `columns` field. | A DATAFRAME loaded from a clinical CSV with columns `DRUG1_GENE`, `DRUG2_GENE`, `PATIENT_ID`, and `RESPONSE_STATUS`. |
| columns | True | STRING | Comma-separated column names to extract. Column names are trimmed for surrounding whitespace and must exactly match columns in the DataFrame after trimming. | `DRUG1_GENE,DRUG2_GENE` |
| output_format | True | COMBO: json_array \| tuple | Controls the wrapper used around the formatted values. `json_array` returns values inside square brackets, while `tuple` returns values inside parentheses. | `json_array` |
| quote_char | True | COMBO: double \| single \| none | Controls how individual values are quoted. `double` wraps values in double quotes, `single` wraps values in single quotes, and `none` leaves values unquoted. JSON array output always uses double quotes to remain JSON-compatible. | `double` |
| split_cell_values | False | BOOLEAN | When enabled, each extracted cell value is converted to text and split using `cell_delimiter`. Useful for cells containing multiple values such as comma-separated tags or gene lists. | True |
| cell_delimiter | False | STRING | Delimiter used when `split_cell_values` is enabled. Empty split results are discarded after trimming whitespace. | `,` |
| dropna | False | BOOLEAN | Drops missing values from each selected column before formatting. Disable this only if missing values should appear in the output as text. | True |
| unique | False | BOOLEAN | When enabled, duplicate values are removed while keeping the first occurrence order. When disabled, repeated values are preserved. | True |
| escape_quotes | False | BOOLEAN | Escapes backslashes and matching quote characters inside values so the resulting string is safer to pass into JSON-like or tuple-like text contexts. | True |
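
The interaction between `output_format`, `quote_char`, and `escape_quotes` can be sketched in plain Python. This is an illustration of the documented behavior, not the node's actual source: the function name `format_values`, the `QUOTES` lookup, and the comma separator without spaces (chosen to match the example output on this page) are assumptions.

```python
# Hypothetical helper; the names and the "," separator are assumptions.
QUOTES = {"double": '"', "single": "'", "none": ""}


def format_values(values, output_format="json_array",
                  quote_char="double", escape_quotes=True):
    """Format already-extracted values as a single list-like string."""
    # JSON arrays require double-quoted strings, so quote_char is overridden.
    if output_format == "json_array":
        quote_char = "double"
    quote = QUOTES[quote_char]

    items = []
    for value in values:
        text = str(value)  # every cell type is rendered as text
        if escape_quotes:
            # Escape backslashes first so the escapes added next stay intact.
            text = text.replace("\\", "\\\\")
            if quote:
                text = text.replace(quote, "\\" + quote)
        items.append(f"{quote}{text}{quote}")

    opener, closer = ("[", "]") if output_format == "json_array" else ("(", ")")
    return opener + ",".join(items) + closer


print(format_values(["BRCA1", "EGFR"]))                    # ["BRCA1","EGFR"]
print(format_values(["BRCA1", "EGFR"], "tuple", "single")) # ('BRCA1','EGFR')
print(format_values([101, 102], "tuple", "none"))          # (101,102)
```

Note that requesting `single` or `none` together with `json_array` still yields double-quoted values in this sketch, mirroring the JSON behavior described for `quote_char`.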

Outputs

| Field | Type | Description | Example |
| --- | --- | --- | --- |
| formatted_list | STRING | A single formatted string containing the extracted values. Depending on settings, this may look like a JSON array or a tuple-style list and can be passed to downstream prompt, query, or text nodes. | `["BRCA1","EGFR","TP53","ALK"]` |
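
Because `json_array` output is JSON-compatible, a downstream script can usually parse it back into a list with a standard JSON parser. The snippet below is a minimal sketch using the example value from the Outputs table; the variable names and the prompt text are illustrative only.

```python
import json

# Example value taken from the Outputs table.
formatted_list = '["BRCA1","EGFR","TP53","ALK"]'

# A json_array result is a valid JSON document, so it parses to a Python list.
genes = json.loads(formatted_list)
print(genes)
print(len(genes))

# The parsed values can then be reused, for example inside a prompt string.
prompt = "Summarize known drug interactions for: " + ", ".join(genes)
print(prompt)
```

Tuple-style output is not JSON and should be treated as plain text by downstream nodes.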

Important Notes

  • Column matching: Every requested column must exist in the DataFrame. If any column is missing, the node stops with an error listing available columns.
  • JSON behavior: When output_format is json_array, the node forces double quotes even if quote_char is set to single or none, because JSON arrays require double-quoted strings.
  • Value conversion: Extracted values are formatted as strings. Numeric IDs, dates, booleans, and other cell types will be represented as text in the output.
  • Deduplication: With unique enabled, duplicates across all selected columns are removed while preserving the first time each value appears.
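
The missing-value and deduplication notes above can be sketched as follows. This is an assumption-laden illustration rather than the node's implementation: a dict of lists stands in for the DATAFRAME, `None` stands in for a missing value, values are gathered column by column (the real traversal order is not documented here), and `collect_values` is a hypothetical name.

```python
def collect_values(table, columns, dropna=True, unique=True):
    """Gather values from several columns into one flat list of strings."""
    values = []
    for name in columns:               # assumed order: column by column
        for cell in table[name]:
            if dropna and cell is None:
                continue               # skip missing values before formatting
            values.append(str(cell))   # all cell types become text
    if unique:
        # dict.fromkeys keeps the first occurrence of each value, in order.
        values = list(dict.fromkeys(values))
    return values


# Stand-in for a DATAFRAME with two gene columns.
table = {
    "DRUG1_GENE": ["BRCA1", "EGFR", None],
    "DRUG2_GENE": ["EGFR", "TP53", "ALK"],
}

print(collect_values(table, ["DRUG1_GENE", "DRUG2_GENE"]))
# ['BRCA1', 'EGFR', 'TP53', 'ALK']
print(collect_values(table, ["DRUG1_GENE", "DRUG2_GENE"], unique=False))
# ['BRCA1', 'EGFR', 'EGFR', 'TP53', 'ALK']
```

With `dropna` disabled, the missing cell in this sketch would appear as the text `None`; the exact placeholder text the node emits may differ.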

Troubleshooting

  • columns parameter must be provided: The columns input is empty or only whitespace. Enter one or more exact DataFrame column names, separated by commas.
  • Column 'X' not found in DataFrame: The column name does not exactly match the table. Check capitalization, spaces, and spelling, or inspect the DataFrame with a preview/export node before extraction.
  • Unexpected quoted output: If output_format is json_array, values will always use double quotes. Choose tuple if you need single quotes or unquoted values.
  • Output contains combined values instead of separate entries: Enable split_cell_values and set cell_delimiter to the delimiter used inside cells, such as a comma (,), semicolon (;), or pipe (|).
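
The checks behind these messages can be approximated in a few lines of Python. The sketch below is illustrative only: `parse_columns` and `split_cell` are hypothetical helpers, the error text after the documented message prefix is assumed, and silently dropping empty entries between commas in `columns` is also an assumption.

```python
def parse_columns(columns, available):
    """Split the comma-separated `columns` string and validate each name."""
    names = [part.strip() for part in columns.split(",") if part.strip()]
    if not names:
        raise ValueError("columns parameter must be provided")
    for name in names:
        if name not in available:   # exact, case-sensitive match
            raise ValueError(
                f"Column '{name}' not found in DataFrame. "
                f"Available columns: {list(available)}"
            )
    return names


def split_cell(cell, delimiter=","):
    """Split one cell into trimmed parts, discarding empty results."""
    return [part.strip() for part in str(cell).split(delimiter) if part.strip()]


available = ["DRUG1_GENE", "DRUG2_GENE", "PATIENT_ID", "RESPONSE_STATUS"]
print(parse_columns(" DRUG1_GENE , DRUG2_GENE ", available))
# ['DRUG1_GENE', 'DRUG2_GENE']
print(split_cell("BRCA1; EGFR;;TP53 ", ";"))
# ['BRCA1', 'EGFR', 'TP53']
```

A name such as `drug1_gene` would fail the lookup in this sketch because matching is case-sensitive, which is the most common cause of the "not found" message.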