PDF Splitter API

In: a PDF with multiple invoices | Out: a list of separate documents

Written by Helen Birulia
Updated over a week ago

PDF Splitter endpoint

PDF splitting breaks a single PDF file into multiple separate PDF files.

Use case

In: a PDF with multiple invoices/receipts

Out: a list of separate documents

A PDF often contains pages with different expenses or invoices, and each page may need to be treated as a separate transaction in accounting software. A PDF is a convenient way to bundle receipts (scanning, for example, often produces one), but separating the pages afterwards and entering them into accounting software is a significant bookkeeping chore.

Veryfi APIs now automate this separation. With a Veryfi account (app.veryfi.com), you can use the APIs to process PDFs of varying sizes. Splitting is fully automated, so results come back quickly and without manual work.

Let Veryfi handle the mundane part so you don't have to split PDFs by hand.

We strongly recommend referring to the API Documentation for the PDF Splitter specification.

PDF Splitter endpoint:

Response body

"id": " 12345"

What is expected

Veryfi splits the original PDF and processes each resulting item separately.

When processing finishes, a notification is sent to the client's webhook, as with async processing.

Webhook notification

If your use case doesn't use async processing and you don't need a webhook notification, you can still use https://app.veryfi.com/api/docs/api-docs-process-asynchronous/ without adding "async": true / false and without setting up a webhook.

{
  "event": "document.created",
  "data": {
    "id": [1031, 1032, 1033],
    "created": "2022-06-01 20:00:00"
  }
}

"id" will contain a list of separate child documents created from the original parent, which can be queried with GET documents:
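As a rough illustration, the sketch below reads a webhook notification shaped like the one above and builds one GET URL per child document. The function name child_document_urls is hypothetical. The URL pattern is copied from the v7 example in this article, and the v8 path is assumed by analogy. Authentication headers are left out on purpose; take them from the API Documentation.

```python
import json

# Assumption: URL pattern taken from the v7 example in this article;
# the v8 path is inferred by analogy and is not confirmed here.
BASE_URL = "https://api.veryfi.com/api/{version}/partner/documents/{doc_id}"


def child_document_urls(raw_body, version="v8"):
    """Return one GET URL per child document listed in a webhook notification."""
    payload = json.loads(raw_body)
    # Ignore any event other than the one shown in this article.
    if payload.get("event") != "document.created":
        return []
    ids = payload.get("data", {}).get("id", [])
    # Defensive assumption: accept a single id as well as a list of ids.
    if not isinstance(ids, list):
        ids = [ids]
    return [BASE_URL.format(version=version, doc_id=i) for i in ids]


# Example payload with the same shape as the notification above.
example = (
    '{"event": "document.created", '
    '"data": {"id": [1031, 1032, 1033], "created": "2022-06-01 20:00:00"}}'
)

for url in child_document_urls(example, version="v7"):
    print(url)
```

Each URL can then be requested with the HTTP client of your choice, using the credentials from your Veryfi account.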

Known limitations

Splitting is based on:

  • Page numbers (present on all pages)

  • Invoice number (if the invoice number is different -> new document)

Other limitations:

  • File size: 50 MB max

  • Page count: 100 pages max

  • The endpoint uses the v8 model. If you use v7, you can still process with v8 and then retrieve the result with a v7 GET request, e.g. https://api.veryfi.com/api/v7/partner/documents/1032

  • The document JSON fields "is_child" or "parent_of" are not yet displayed, so the webhook notification is the only place to see the child documents.
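Before uploading, it can help to check a file against the size and page limits listed above. The following is a minimal sketch and not part of the Veryfi API. The function name splitter_limit_problems is hypothetical, and reading the 50 Mb limit as 50 * 1024 * 1024 bytes is an assumption.

```python
# Limits from the list above. Assumption: "50 Mb" means 50 * 1024 * 1024 bytes.
MAX_FILE_BYTES = 50 * 1024 * 1024
MAX_PAGES = 100


def splitter_limit_problems(size_bytes, page_count):
    """Return a list of human-readable problems; an empty list means within limits."""
    problems = []
    if size_bytes > MAX_FILE_BYTES:
        problems.append(
            f"file is {size_bytes} bytes, limit is {MAX_FILE_BYTES} bytes"
        )
    if page_count > MAX_PAGES:
        problems.append(f"file has {page_count} pages, limit is {MAX_PAGES} pages")
    return problems


# Example: a 10 MiB scan with 120 pages exceeds only the page limit.
print(splitter_limit_problems(10 * 1024 * 1024, 120))
```

The file size can come from os.path.getsize, and the page count from whichever PDF library you already use.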

This is the very beginning of the splitting functionality. We appreciate feedback sent to support@veryfi.com that includes the "id" and states what worked and what didn't. Your feedback helps us improve.
