Veryfi Bounding Boxes and Bounding Regions

Veryfi's OCR API extracts structured data from financial documents and, for each extracted value, can also return coordinates that pinpoint where that value appears on the document.

Overlay bounding boxes on the document image to show the exact boundaries of each extracted value, so your users can see where every piece of data came from.

What are Bounding Boxes and Bounding Regions

As mentioned above, along with the extracted values and confidence details, Veryfi returns both bounding boxes and bounding regions.

Bounding Box:

A bounding_box is a rectangular region that tightly encloses an object within an image. A Veryfi bounding box has the following format: [page_number, Xmin, Ymin, Xmax, Ymax]. Each coordinate is a decimal value between 0 and 1 that gives the position relative to the image dimensions.
Once you have the bounding boxes, you can draw them on the image of the document that Veryfi processed.

Bounding Region:

A bounding region is a more general representation of the spatial extent of an object or region of interest. In general, it can take any shape, such as a rectangle, a polygon, or an irregular shape. The bounding region that Veryfi provides is rectangular.
A Veryfi bounding region has the following format: [x1, y1, x2, y2, x3, y3, x4, y4]. For example:
"bounding_region": [0.2369, 0.3643, 0.3418, 0.3643, 0.3418, 0.3772, 0.2369, 0.3772]
In contrast to bounding boxes, which are defined by two coordinate points and can only describe horizontal rectangles, bounding regions use four coordinate points. A bounding region is still a rectangle, but it can be angled. This is particularly useful when working with images that are skewed or that contain handwritten notes, such as signatures, which may not align perfectly horizontally.
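As a minimal sketch of how the eight-value list maps to corner points, the Python helpers below pair the values up and check whether the rectangle is angled. The function names `region_to_points` and `is_axis_aligned` are illustrative and not part of the Veryfi API; the sketch assumes the x1, y1, ..., x4, y4 ordering shown above.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def region_to_points(bounding_region: List[float]) -> List[Point]:
    """Pair a flat [x1, y1, ..., x4, y4] list into four (x, y) corner points."""
    if len(bounding_region) != 8:
        raise ValueError("expected 8 values: x1, y1, x2, y2, x3, y3, x4, y4")
    return list(zip(bounding_region[0::2], bounding_region[1::2]))


def is_axis_aligned(points: List[Point], tol: float = 1e-6) -> bool:
    """Return True when the four corners use only two x values and two y values."""
    xs = sorted(x for x, _ in points)
    ys = sorted(y for _, y in points)
    return (
        abs(xs[0] - xs[1]) <= tol and abs(xs[2] - xs[3]) <= tol
        and abs(ys[0] - ys[1]) <= tol and abs(ys[2] - ys[3]) <= tol
    )


# The sample region from this article is a horizontal rectangle.
sample = [0.2369, 0.3643, 0.3418, 0.3643, 0.3418, 0.3772, 0.2369, 0.3772]
corners = region_to_points(sample)
print(corners[0], is_axis_aligned(corners))
```

An angled region, such as one around a slanted signature, would make `is_axis_aligned` return False, which is a signal to draw a polygon rather than a plain rectangle.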

Veryfi has only recently started working on bounding_region functionality. It currently has only four coordinate points that form a rectangle; in the future, we might add more coordinates to support more shapes and polygons, for a better experience and more accurate object localization.

In some cases, bounding_box and bounding_region can be used together for a more comprehensive analysis of an image. For example, you can detect objects using bounding boxes and refine the localization or shape estimation using bounding regions.

🔑 What to use: `bounding_box` or `bounding_region`?

Depending on your use case, you can make an informed decision about whether to use bounding_box or bounding_region.

We recommend using bounding_region to capture skewed fields and precise regions only for one-page PDFs or JPGs. Because the current implementation does not return page_number with bounding_region, it may be difficult to visualize bounding_region for multi-page documents. As an alternative, you can take page_number from the bounding_box of the same field.
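The workaround of borrowing page_number from bounding_box could look like the sketch below. The helper name `region_with_page` is hypothetical, and the sketch assumes each field is a dictionary that carries both keys, as in the JSON response sample in this article.

```python
from typing import Any, Dict, List, Optional, Tuple


def region_with_page(field: Dict[str, Any]) -> Optional[Tuple[int, List[float]]]:
    """Return (page_number, bounding_region), taking the page from bounding_box.

    Returns None when either key is missing or empty.
    """
    box = field.get("bounding_box")
    region = field.get("bounding_region")
    if not box or not region:
        return None
    page_number = int(box[0])  # first element of bounding_box is page_number
    return page_number, list(region)


field = {
    "bounding_box": [0, 0.4746, 0.8984, 0.5972, 0.9258],
    "bounding_region": [0.4746, 0.8984, 0.5972, 0.8984,
                        0.5972, 0.9258, 0.4746, 0.9258],
    "value": 1234568,
}
print(region_with_page(field))
```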

The choice between bounding_box and bounding_region depends on the specific needs of the application. If a rough estimate of the object's location and size is sufficient, bounding boxes are often used due to their simplicity. However, if precise shape information is required, bounding regions, such as polygons or contours, are more suitable. While bounding_box is simpler and quicker to compute, bounding_region provides more accurate shape information but may require more computational resources for processing.

How to retrieve object localization

In the Veryfi API, bounding_boxes is a request parameter that, when added to a POST call, returns both the bounding_box and bounding_region fields for each extracted value in the JSON response.

 
curl --location --request POST 'https://api.veryfi.com/api/v8/partner/documents/' \
--header 'CLIENT-ID: id' \
--header 'AUTHORIZATION: apikey name:value' \
--form 'file=@"/receipt_public.jpg"' \
--form 'bounding_boxes="true"'

JSON Response sample

"Invoice_number": {
  "bounding_box": [0, 0.4746, 0.8984, 0.5972, 0.9258],
  "bounding_region": [0.4746, 0.8984, 0.5972, 0.8984, 0.5972, 0.9258, 0.4746, 0.9258],
  "value": 1234568
},
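To collect every located field from a parsed response, one option is a small recursive walk like the sketch below. The function name `iter_located_fields` and the `total` field in the example are illustrative assumptions; the only structure taken from this article is a field object with bounding_box, bounding_region, and value keys.

```python
from typing import Any, Iterator, List, Tuple


def iter_located_fields(node: Any, path: str = "") -> Iterator[Tuple[str, List[float]]]:
    """Yield (path, bounding_box) for every dict that has a non-empty bounding_box."""
    if isinstance(node, dict):
        box = node.get("bounding_box")
        if box:
            yield path, box
        for key, child in node.items():
            if key in ("bounding_box", "bounding_region"):
                continue  # coordinate lists themselves are not fields
            child_path = f"{path}.{key}" if path else key
            yield from iter_located_fields(child, child_path)
    elif isinstance(node, list):
        for index, child in enumerate(node):
            yield from iter_located_fields(child, f"{path}[{index}]")


response = {
    "Invoice_number": {
        "bounding_box": [0, 0.4746, 0.8984, 0.5972, 0.9258],
        "bounding_region": [0.4746, 0.8984, 0.5972, 0.8984,
                            0.5972, 0.9258, 0.4746, 0.9258],
        "value": 1234568,
    },
    "total": {"value": 10.0},  # hypothetical field without coordinates
}
for name, box in iter_located_fields(response):
    print(name, box)
```

Fields without coordinates are skipped rather than treated as errors.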

❗️Since the majority of Veryfi customers are using bounding_box, samples below will be provided for bounding_box.

Given the Xmin, Ymin, Xmax, and Ymax values from bounding_box, the four normalized corner points are: (Xmin, Ymin), (Xmin, Ymax), (Xmax, Ymin), (Xmax, Ymax). Because those coordinates are normalized, to get the rectangle that surrounds the field on the actual image you need to multiply them by the image width and height:

X1 = image width * Xmin
X2 = image width * Xmax
Y1 = image height * Ymin
Y2 = image height * Ymax

Then, the points would be the following: (X1, Y1), (X1, Y2), (X2, Y1), (X2, Y2).
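The same calculation as a short Python sketch. The function name `box_to_pixels` is illustrative, the 1000 x 2000 image size is made up for the example, and rounding to whole pixels is a choice of this sketch rather than something the API prescribes.

```python
from typing import Sequence, Tuple


def box_to_pixels(
    bounding_box: Sequence[float], image_width: int, image_height: int
) -> Tuple[int, int, int, int]:
    """Convert [page_number, Xmin, Ymin, Xmax, Ymax] to pixel (X1, Y1, X2, Y2)."""
    _page_number, xmin, ymin, xmax, ymax = bounding_box
    x1 = round(image_width * xmin)
    y1 = round(image_height * ymin)
    x2 = round(image_width * xmax)
    y2 = round(image_height * ymax)
    return x1, y1, x2, y2


# Sample bounding_box from this article on a hypothetical 1000 x 2000 image.
print(box_to_pixels([0, 0.4746, 0.8984, 0.5972, 0.9258], 1000, 2000))
# (475, 1797, 597, 1852)
```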

Options to retrieve `bounding_box` and `bounding_region`

Add the parameter bounding_boxes: true to the payload of your POST request.
Add bounding_boxes: true to your GET request (this works only if bounding_boxes: true was also added to the POST request).
Add detailed: true to your GET request (this returns field coordinates even if the bounding_boxes parameter was not added to your POST request).
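For the two GET options, a sketch of how the request URL might be assembled is shown below. Two things here are assumptions, not confirmed by this article: that a single document is addressed by appending its ID to the documents endpoint, and that the parameters are sent as query-string values. The helper name `build_get_url` and the document ID 123 are hypothetical.

```python
from urllib.parse import urlencode

# Endpoint taken from the POST example in this article.
BASE_URL = "https://api.veryfi.com/api/v8/partner/documents/"


def build_get_url(document_id: int, *, bounding_boxes: bool = False,
                  detailed: bool = False) -> str:
    """Build a GET URL with the optional coordinate-related parameters.

    Assumes (not confirmed here) that the document ID is appended to the
    endpoint and that parameters are passed in the query string.
    """
    params = {}
    if bounding_boxes:
        params["bounding_boxes"] = "true"
    if detailed:
        params["detailed"] = "true"
    url = f"{BASE_URL}{document_id}"
    return f"{url}?{urlencode(params)}" if params else url


print(build_get_url(123, detailed=True))
```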

Assumptions

Some values in the JSON response may be missing bounding_box and bounding_region. The most common reasons are that the value was enriched on the Veryfi side from the document context or online sources, or that it was not found. The logic is the same as for confidence details.
bounding_box and bounding_region cannot be updated in the current implementation.
If an extracted value is updated, the corresponding bounding_box and bounding_region are removed. We understand the disadvantage of this approach, and it will be improved soon. Given this limitation, you can still let end users update the extracted values in your application, but do not send those edits to Veryfi as PUT API calls.
If you do not store the data with Veryfi, it may not be possible to retrieve object location details for already processed documents via a GET call.
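One way to let users correct values without losing coordinates is to keep the corrections on your side and merge them over the Veryfi response when displaying it. The sketch below illustrates that idea; the function name `apply_local_edits` and the `edited_locally` marker are hypothetical and exist only in your application, not in the Veryfi API.

```python
from typing import Any, Dict


def apply_local_edits(
    fields: Dict[str, Dict[str, Any]], edits: Dict[str, Any]
) -> Dict[str, Dict[str, Any]]:
    """Return a copy of fields with user edits applied and coordinates kept."""
    merged: Dict[str, Dict[str, Any]] = {}
    for name, field in fields.items():
        updated = dict(field)  # shallow copy; the original response is untouched
        if name in edits:
            updated["value"] = edits[name]
            updated["edited_locally"] = True  # hypothetical app-side marker
        merged[name] = updated
    return merged


fields = {
    "Invoice_number": {
        "bounding_box": [0, 0.4746, 0.8984, 0.5972, 0.9258],
        "value": 1234568,
    }
}
shown = apply_local_edits(fields, {"Invoice_number": 1234569})
print(shown["Invoice_number"]["value"], shown["Invoice_number"]["bounding_box"])
```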

In the v9 API, bounding_box will be deprecated and bounding_region will take over. The v9 rollout is scheduled for 2024.

Visualizing Bounding Boxes

To visualize bounding boxes effectively, follow these steps:

Retrieve the extracted value and bounding_box data from the JSON response.
Load the original document image, or obtain it via img_url, for visualization purposes.
Calculate coordinates using the image height and width.
Overlay the bounding boxes on the document image.
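The steps above can be sketched without any imaging library by generating an SVG overlay that a browser can place on top of the document image. This is one possible approach, not the method used in the Veryfi notebook; the function name `boxes_to_svg`, the red outline styling, and the 1000 x 2000 image size are assumptions of this sketch.

```python
from typing import Dict, Sequence
from xml.sax.saxutils import escape


def boxes_to_svg(boxes: Dict[str, Sequence[float]], image_width: int,
                 image_height: int, page_number: int = 0) -> str:
    """Build an SVG overlay with one outlined rectangle per field on a page."""
    rects = []
    for name, (page, xmin, ymin, xmax, ymax) in boxes.items():
        if int(page) != page_number:
            continue  # only draw boxes that belong to the requested page
        x, y = image_width * xmin, image_height * ymin
        w, h = image_width * (xmax - xmin), image_height * (ymax - ymin)
        rects.append(
            f'<rect x="{x:.1f}" y="{y:.1f}" width="{w:.1f}" height="{h:.1f}" '
            f'fill="none" stroke="red" stroke-width="2">'
            f"<title>{escape(name)}</title></rect>"
        )
    body = "\n".join(rects)
    return (
        f'<svg xmlns="http://www.w3.org/2000/svg" '
        f'viewBox="0 0 {image_width} {image_height}">\n{body}\n</svg>'
    )


svg = boxes_to_svg({"Invoice_number": [0, 0.4746, 0.8984, 0.5972, 0.9258]},
                   image_width=1000, image_height=2000)
print(svg)
```

Because the viewBox matches the image size in pixels, the overlay scales together with the image when both are rendered at the same dimensions.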

This will allow your end-users to visually verify the extraction's accuracy and understand the data's context within the document.

You can employ various visualization techniques based on your specific requirements and preferences. The Jupyter Notebook code sample below will help you draw bounding boxes.

bounding_box.ipynb
