Blur detection is part of the Veryfi Fraud Detection and Prevention Framework and an important signal Veryfi returns about document quality.
Because the accuracy of document extraction relies heavily on image quality, controlling the quality of the images your users submit is crucial to reliable extraction results.
In the Veryfi JSON response, the 'is_blurry' field indicates the quality of an image, distinguishing between clear and blurry images. It returns a boolean value (true or false) for each page.
The 'is_blurry' field is not activated by default for user accounts. If you wish to include it in your JSON response, please get in touch with our support team. Once it is enabled for your account, the field is available in your JSON response immediately, giving you more control over image quality assessment.
Why should I pay attention to image quality and blur detection?
Image Quality Matters: The quality of the receipt/invoice images directly affects the accuracy of data extraction. Blurriness makes it challenging to recognize and extract text accurately. Distorted characters or blurred shapes can lead to OCR errors or misinterpretation of the text, affecting the integrity of the extracted data.
Enhance User Experience: is_blurry returned with true in your JSON response helps you sort and flag documents with potentially poor data extraction results. If Veryfi powers your Expense Management, CPG loyalty, or other product, the is_blurry field lets you choose whether to pass a submission on to your product or give end users a friendly warning that the extraction results might need to be verified or reviewed. As your trusted partner, Veryfi guarantees that is_blurry can help you enable smooth automation, improve the accuracy of the data you pass to your users, and ultimately enhance the overall experience by managing expectations for the data extraction results.
While the is_blurry flag is something we return after processing the submission, you might be interested in preventing your users from submitting blurry or low-quality images using Veryfi Lens for mobile.
How to Interpret & Assumptions
When you submit a single document for processing, the response will contain a list consisting of one flag.
However, if you submit multiple receipts' URLs, multi-page documents, or a zip file within the same request, the response will include a list of multiple flags. Each flag corresponds to a specific page, allowing you to assess the status individually for each page in the response.
Response examples:
For one image:
"is_blurry": [false]
Meaning that we think this image is not blurry.
For a zip that has 3 images:
"is_blurry": [
true,
true,
false
]
Meaning that the first two pages are blurry and the third one is OK.
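The per-page interpretation above can be sketched in a few lines of Python. This is a minimal illustration, not an official client: the function name blurry_pages is hypothetical, and the response is assumed to be the parsed JSON as a dict.

```python
def blurry_pages(response):
    """Return the 1-based page numbers whose is_blurry flag is true.

    Returns None when the field is missing, which is the case when
    is_blurry has not been enabled for the account.
    """
    flags = response.get("is_blurry")
    if flags is None:
        return None
    # Flags are listed in page order, one boolean per page.
    return [page for page, blurry in enumerate(flags, start=1) if blurry]


# The two examples from the text above.
single_image = {"is_blurry": [False]}
zip_of_three = {"is_blurry": [True, True, False]}

print(blurry_pages(single_image))
print(blurry_pages(zip_of_three))
```

An empty list means no page was flagged, while None means the field was not present at all, so the two cases can be handled differently.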
Beta feature: meta ocr_score and image quality score
meta.ocr_score (for api/v8/partner/documents) is a default field in the meta object.
API Docs:
There are at least six indirect causes of poor data extraction related to image quality (blur, bleed-through, crumples, wrinkles, skew, etc.). meta.ocr_score can serve as a signal for an image quality trust score, though with some important considerations:
High OCR Score might indicate:
Clear, legible text
Good image resolution
Minimal noise or distortion
Proper document orientation
Good contrast between text and background
Low OCR Score (<0.92) might indicate:
Blurry images
Low-resolution scans
Poor lighting/contrast
Document skew or warping
Potential tampering or manipulation
Poor-quality scans/photos
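As a rough sketch, the low-score check could look like the following in Python. The 0.92 threshold is the one quoted above; the function name has_low_ocr_score is hypothetical, and the response is assumed to be the parsed JSON as a dict with ocr_score nested under meta.

```python
LOW_OCR_SCORE = 0.92  # threshold quoted in the text above


def has_low_ocr_score(response, threshold=LOW_OCR_SCORE):
    """Return True when meta.ocr_score is below the threshold.

    Returns None when the score is missing, so the caller can tell
    "no signal" apart from "good score".
    """
    score = (response.get("meta") or {}).get("ocr_score")
    if score is None:
        return None
    return score < threshold


print(has_low_ocr_score({"meta": {"ocr_score": 0.85}}))
print(has_low_ocr_score({"meta": {"ocr_score": 0.97}}))
```

A low score only says that something may be wrong with the image; it does not say which of the causes listed above applies.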
What is behind meta.ocr_score?
This is a composite score that combines two aspects of OCR text quality:
First component - average (ocr_score of all extracted fields):
This looks at the ocr_score specifically for the text in extracted fields
These are fields that have been identified as containing specific information. For example, fields like "Invoice Number", "Date", "Total Amount", etc.
Second component - average (ocr_score of all OCR text):
This considers the ocr_score for ALL text detected in the document
Includes both extracted fields and any other text
Gives a general measure of overall text recognition quality
The final ocr_score is a number less than or equal to 1.
Build your own document trust score logic
Confidence Details
a) ocr_score (per field), which is part of the confidence details for important fields. If the ocr_score is lower than 0.8, it's a signal that the extraction results may not be as stable. This lower score indicates that you should carefully review and validate the extracted data for accuracy.
b) score (per field), a confidence score that represents the confidence of mapping an extracted value to a particular field in the JSON.
Image Size and Resolution
Consider the image size in terms of width and height. If either the width or height is equal to or less than 500 pixels, it's a factor to take into account. See meta.pages_height and meta.pages_width.
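One possible way to combine these signals into your own trust logic is sketched below in Python. It is an illustration under stated assumptions, not Veryfi's method. The function review_reasons and its field_ocr_scores argument (a dict of field name to per-field ocr_score that you extract from the confidence details yourself) are hypothetical, and meta.pages_width and meta.pages_height are assumed to hold either one number or a list of per-page numbers. The thresholds (0.92, 0.8, 500 pixels) are the ones mentioned in this article.

```python
def _as_list(value):
    """Normalise a missing, scalar, or list value to a list."""
    if value is None:
        return []
    if isinstance(value, (list, tuple)):
        return list(value)
    return [value]


def review_reasons(response, field_ocr_scores=None):
    """Collect reasons why a document may need manual review.

    field_ocr_scores is a hypothetical mapping of field name to the
    per-field ocr_score, built by the caller from the response.
    """
    reasons = []
    meta = response.get("meta") or {}

    # Any page flagged as blurry.
    if any(response.get("is_blurry") or []):
        reasons.append("blurry page")

    # Document-level OCR score below 0.92.
    score = meta.get("ocr_score")
    if score is not None and score < 0.92:
        reasons.append("low meta.ocr_score")

    # Per-field OCR score below 0.8.
    for name, field_score in (field_ocr_scores or {}).items():
        if field_score < 0.8:
            reasons.append(f"low ocr_score for field {name}")

    # Width or height of 500 pixels or less (assumed shape, see lead-in).
    dims = _as_list(meta.get("pages_width")) + _as_list(meta.get("pages_height"))
    if any(d <= 500 for d in dims):
        reasons.append("small image")

    return reasons


example = {
    "is_blurry": [False, True],
    "meta": {"ocr_score": 0.9, "pages_width": [1200, 480], "pages_height": [1600, 640]},
}
print(review_reasons(example, {"total": 0.75, "date": 0.99}))
```

An empty list means none of the checks fired. Returning reasons rather than a single number keeps the decision (reject, warn the user, or send to manual review) in your own product logic.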
Read More about Confidence Details
Refer to API Docs for JSON structure and schema
Have questions? Please contact support@veryfi.com.