Each of these nodes can produce its own errors. Our catalog of errors is still in development, so we can’t provide a complete list yet. What we can do is provide a logic to follow when you face one or several errors at a time.
One common error relates to the cloud provider account connection (Google Drive, Dropbox): verify workflows create a unique folder inside your connected account when they launch. The folder name is set in the node itself, where you type it manually.
Folder Creation Best Practices - This is the most common error, which is why it has a standalone section.
Folder names must be unique across all enabled workflows, so be careful not to reuse the same name in separate workflows. As a tip, use the workflow name as your folder name.
The system automatically creates folders for Google Drive and Dropbox to ensure correct placement. As long as you use a unique name, you won’t have trouble creating the folder.
With this covered, we can look at the overall node logic. Every workflow needs some kind of ingest (input), some kind of extraction (processing), and a result (output), unless you only want to store the data in the verify API account (very unlikely).
Each node can fail on its own, and the first time you create a connection you should expect to see a lot of errors. To avoid this overwhelming experience, start by dragging in the nodes you want to use without connecting them.
It should look similar to this:
Once you know which nodes you wish to connect, you can start dragging the connectors to build the workflow.
Once you have started making connections, errors are listed one node at a time: the system “walks” through each step and validates the missing configuration on each. Solve them one node at a time.
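The node-by-node walk can be pictured with a short Python sketch. It is only an illustration, not the product's actual validator: the REQUIRED_FIELDS table, the node type names, the sheet_name setting, and the first_node_error helper are all hypothetical, and the workflow is assumed to be a simple ordered list of nodes.

```python
# Hypothetical required settings per node type (illustrative only).
REQUIRED_FIELDS = {
    "google_drive_input": ["folder_name"],
    "extract_data": ["document_type"],
    "google_sheets_upload": ["sheet_name"],
}


def first_node_error(nodes):
    """Walk the nodes in workflow order and report only the first one
    that is missing required configuration (one node at a time)."""
    for node in nodes:
        required = REQUIRED_FIELDS.get(node["type"], [])
        config = node.get("config", {})
        missing = [field for field in required if not config.get(field)]
        if missing:
            return {"node": node["id"], "missing": missing}
    return None  # every node is fully configured


workflow = [
    {"id": "n1", "type": "google_drive_input", "config": {"folder_name": "Invoices"}},
    {"id": "n2", "type": "extract_data", "config": {}},
    {"id": "n3", "type": "google_sheets_upload", "config": {}},
]

print(first_node_error(workflow))  # {'node': 'n2', 'missing': ['document_type']}
```

Fixing n2 and running the check again would surface n3 next, which matches the advice to solve one node at a time.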
Node Functionality and Connections
API Import Node - The core node.
Triggers when the API processes a document of a selected type or when a document is dropped into the account UI inbox.
Outputs already-extracted data. This is OCR behavior, the same result an API call would produce.
Connecting it to a data extraction node results in two extractions, which is inefficient. You can simply place an output node right after this one.
Business Insights Node
Uses natural language questions to retrieve information.
Can be scheduled to run queries regularly (daily or weekly) and email results.
This is LLM behavior that helps automate reports. You could place it at the end of an extraction to provide details or information.
Process Steps section
We have several “in-between steps” that can be divided into categories depending on the extraction output (API import node).
Extract Data: These are the “Extract data” and “Extract data set” nodes.
Extract data: This behaves like an API call to the endpoint. When choosing it, make sure you understand which use case and document you want to process, because we have different model behaviors for each.
Extract data set: Available for invoice bulk uploads and bank statements. It is designed to handle multi-page documents and large volumes.
Splits and Decisions:
PDF splitter:
Decision:
Document Classifier:
Split collector:
Output nodes:
Dropbox and Gdrive output: Returns the extraction as a JSON file in the connected Dropbox or Google Drive account.
Email file: Emails the JSON file.
Upload to Gsheets: Places the JSON keys as columns in a Google Sheet.
Xero export: Integrates the output directly into the Xero ecosystem.
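To make the Gsheets behavior concrete, the sketch below turns extracted JSON documents into a header row plus data rows, one column per top-level key. It is a rough illustration under assumptions: the json_to_rows helper and the sample field names are made up, and how the real node treats nested values or missing keys is not covered here.

```python
import json


def json_to_rows(documents):
    """Build a header row from the JSON keys, then one row per document.
    Keys keep first-seen order; missing values become empty strings."""
    header = []
    for doc in documents:
        for key in doc:
            if key not in header:
                header.append(key)
    rows = [[doc.get(key, "") for key in header] for doc in documents]
    return [header] + rows


# Sample extraction output; the field names are invented for illustration.
raw = (
    '[{"vendor": "Acme", "total": 12.5},'
    ' {"vendor": "Globex", "total": 7, "currency": "USD"}]'
)
table = json_to_rows(json.loads(raw))
for row in table:
    print(row)
```

The first printed row is the column header, and each following row lines up with it, which is the shape a spreadsheet upload would need.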
Considerations
Data Flow and Compatibility
Google Drive, Gmail, Email Collector, and Dropbox inputs provide plain files.
Document Aggregator provides a list of extracted documents.
Business Insights provides a file with the insights.
The Email Approval node sends an email for manual workflow continuation and accepts both plain files and extracted data, outputting whatever its input was.
PDF Splitter nodes typically precede extraction nodes.
Xero Export requires extracted data.
Directly connecting a file input (e.g., Google Drive) to Google Sheets upload or Convert to CSV will result in an error, as these nodes require extracted data.
The current categorization of nodes on the left side of the UI is not functional and needs improvement to indicate accepted input types and output types.
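The compatibility rules above boil down to two data types, plain files and extracted data. The Python sketch below encodes them as lookup tables and checks a linear chain of nodes. Treat it as a simplified model: the node identifiers and helper names are hypothetical, and the assumption that Extract Data accepts plain files and emits extracted data is inferred from the text rather than stated by it.

```python
FILE, EXTRACTED = "plain_file", "extracted_data"

# What each node emits (Extract Data output is an inference).
OUTPUTS = {
    "google_drive_input": FILE,
    "gmail_input": FILE,
    "email_collector": FILE,
    "dropbox_input": FILE,
    "extract_data": EXTRACTED,
}

# What each node accepts as input.
ACCEPTS = {
    "extract_data": {FILE},
    "email_approval": {FILE, EXTRACTED},
    "xero_export": {EXTRACTED},
    "google_sheets_upload": {EXTRACTED},
    "convert_to_csv": {EXTRACTED},
}


def output_type(node, upstream_type=None):
    """Email Approval outputs whatever its input was; others are fixed."""
    if node == "email_approval":
        return upstream_type
    return OUTPUTS.get(node)


def check_chain(chain):
    """Return a message for the first invalid connection, or None."""
    current = None
    for source, target in zip(chain, chain[1:]):
        current = output_type(source, current)
        if current not in ACCEPTS.get(target, set()):
            return f"{source} -> {target}: {target} does not accept {current}"
    return None


print(check_chain(["google_drive_input", "google_sheets_upload"]))
print(check_chain(["google_drive_input", "email_approval", "extract_data", "xero_export"]))
```

The first chain reports an invalid connection because a plain file reaches a node that needs extracted data; the second passes because an extraction step sits in between.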
Workflow Error Types
Configuration Errors
Occur when required values are missing in a node's configuration (e.g., folder name, document type, blueprint).
Invalid Connection Errors
Arise when a node's output type is incompatible with the subsequent node's input requirements (e.g., connecting a plain file to a node expecting extracted data).
Error messages for invalid connections need improvement for clarity.
Duplicate Input Node Errors
Google Drive and Dropbox: Cannot have identical folder triggers across different workflows.
Email Collector: Only one Email Collector input is allowed across all workflows for a given account.
Gmail Input: Users must carefully configure search filters to prevent a single email from matching multiple workflows, as the system cannot automatically guard against this.
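A check for the first two rules could look roughly like the Python sketch below. The data layout and the find_duplicate_inputs helper are hypothetical, and comparing folders per provider is an assumption. Gmail filters are deliberately left out, since the text notes that overlapping searches cannot be detected automatically.

```python
from collections import defaultdict


def find_duplicate_inputs(workflows):
    """Return readable conflicts between input nodes of different workflows."""
    folder_users = defaultdict(list)  # (provider, folder) -> workflow names
    collector_users = []              # workflows using an Email Collector

    for wf in workflows:
        for node in wf["inputs"]:
            if node["type"] in ("google_drive", "dropbox"):
                folder_users[(node["type"], node["folder"])].append(wf["name"])
            elif node["type"] == "email_collector":
                collector_users.append(wf["name"])

    conflicts = [
        f"{provider} folder '{folder}' is used by: {', '.join(names)}"
        for (provider, folder), names in folder_users.items()
        if len(names) > 1
    ]
    if len(collector_users) > 1:
        conflicts.append(
            f"Email Collector is used by: {', '.join(collector_users)}"
        )
    return conflicts


workflows = [
    {"name": "Invoices", "inputs": [{"type": "google_drive", "folder": "Invoices"}]},
    {"name": "Receipts", "inputs": [
        {"type": "google_drive", "folder": "Invoices"},
        {"type": "email_collector"},
    ]},
    {"name": "Statements", "inputs": [{"type": "email_collector"}]},
]

for conflict in find_duplicate_inputs(workflows):
    print(conflict)
```

Naming each folder after its workflow, as suggested earlier, would avoid the first conflict by construction.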
Structural Errors
Cycle Detected: Workflows cannot contain loops.
Missing Enabled Handles: Occurs when an output node is not connected to anything and is not explicitly disabled, indicating an incomplete workflow.
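Both structural checks are standard graph operations. As a minimal Python sketch, assuming a workflow is represented as a list of (source, target) connections and that each node carries a hypothetical output_enabled flag, they might look like this. It uses a depth-first search for cycles, which is one common technique and not necessarily what the product does internally.

```python
def has_cycle(edges):
    """Depth-first search over a directed graph of (source, target) pairs."""
    graph = {}
    for source, target in edges:
        graph.setdefault(source, []).append(target)
        graph.setdefault(target, [])

    VISITING, DONE = 1, 2
    state = {}

    def visit(node):
        state[node] = VISITING
        for nxt in graph[node]:
            if state.get(nxt) == VISITING:
                return True  # back edge: reached a node on the current path
            if nxt not in state and visit(nxt):
                return True
        state[node] = DONE
        return False

    return any(node not in state and visit(node) for node in graph)


def dangling_outputs(nodes, edges):
    """Nodes whose output is enabled but not connected to anything."""
    connected = {source for source, _ in edges}
    return [n["id"] for n in nodes if n["output_enabled"] and n["id"] not in connected]


edges = [("input", "extract"), ("extract", "sheets")]
print(has_cycle(edges))                          # False
print(has_cycle(edges + [("sheets", "input")]))  # True

nodes = [
    {"id": "input", "output_enabled": True},
    {"id": "splitter", "output_enabled": True},
    {"id": "archive", "output_enabled": False},
]
print(dangling_outputs(nodes, [("input", "splitter")]))  # ['splitter']
```

In the last example, the splitter is flagged because its output is enabled but leads nowhere, while the disabled archive output is ignored.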
Runtime Execution Errors
Unexpected external service failures (e.g., Google services downtime) can cause workflows to be disabled.
Google Sheets reaching its upper size limit during execution can disable the workflow.
These errors typically indicate a system bug or an external issue and may resolve upon refreshing or re-enabling the workflow.
Workflow Status
Workflows cannot be enabled until all configuration and connection errors are resolved.
The workflows feature has been under development for approximately one year, with new features added intermittently based on user needs.




