Overview

The Blueprint Generator allows users to create custom document processing templates by uploading documents and defining data extraction fields in an intuitive visual interface. Users draw boxes around specific areas of a document to identify and extract key information using Veryfi's OCR technology.

👨‍🏫 Learn more about ADocs

Accessing the Blueprint Generator

Navigate to Inbox → Any Docs
Click on Blueprints to view the blueprint management interface

So what are these Blueprints?

Think of blueprints as templates that teach ADocs exactly what to look for in your document.
You've got two options:

Prebuilt Blueprints - We've already created templates for common documents like insurance forms, work orders, passports, etc.
Custom Blueprints - You can create your own templates for literally any document type.

Creating a New Blueprint

The process involves several straightforward steps:

Initialize creation - Navigate to the AnyDocs inbox and select "Create Blueprint."
Upload document - Provide a sample document in PDF, JPEG, or PNG format
Define extraction areas - Draw boxes around text regions you want to extract
Configure fields - Assign names, data types, and descriptions to each extracted field
Organize with groups (optional) - Group related fields together for better data structure
Save blueprint - Finalize the template for future document processing

🔌 Step by Step Guide

Step 1️⃣ : Initialize Blueprint Creation

Click Add Blueprint to open the blueprint creator
The system will display a file uploader interface

Step 2️⃣ : Upload Document

Upload your document by selecting the appropriate file type and choosing your file. Supported file formats: JPEG, PNG, PDF (including multi-page documents)

Step 3️⃣ : Configure Blueprint Settings

Name: Provide a descriptive name for your blueprint
Type: Select the appropriate blueprint type from the available options

Field Creation

Draw Selection Box: Click and drag to draw a box around the text area you want to extract
Automatic Text Recognition: The system uses OCR to automatically detect and populate text from the selected region
Configure Field Properties:
- Text: Review and edit the automatically detected text
- JSON Field Name: Enter a unique key name for this field
  - Must be lowercase alphanumeric characters
  - Spaces are automatically converted to underscores
  - Uppercase letters are automatically converted to lowercase
- Type: Select from available field types
- Description: Add optional description for the field
Click Save to add the field to your blueprint
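
The naming rules above can be mirrored in client code, for example to predict the key a human-readable label will turn into. This is only a sketch: the function name `normalize_field_name` is hypothetical, and dropping characters outside a-z, 0-9, and underscore is an assumption, not documented behavior.

```python
import re


def normalize_field_name(label: str) -> str:
    """Approximate the JSON Field Name rules: lowercase, spaces to underscores."""
    key = label.strip().lower()            # uppercase letters become lowercase
    key = re.sub(r"\s+", "_", key)         # spaces become underscores
    key = re.sub(r"[^a-z0-9_]", "", key)   # assumption: other characters are dropped
    return key


print(normalize_field_name("Invoice Number"))
```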

Step 4️⃣ : Field Management

Editing Fields

Visual Editing: Click on any drawn box to edit field properties
Table Editing: Click directly into table cells to modify values inline
Field List: View all fields in the fields table at the bottom of the interface

Field Requirements

JSON field names must be unique across all fields
All key names follow lowercase alphanumeric format with underscores
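
If you keep a list of planned field names outside the UI, both requirements can be checked before you start drawing boxes. The sketch below is illustrative only; `check_field_names` is a hypothetical helper, and the pattern is simply a literal reading of "lowercase alphanumeric with underscores".

```python
import re
from collections import Counter

# Literal reading of the rule: lowercase letters, digits, and underscores only
KEY_PATTERN = re.compile(r"[a-z0-9_]+")


def check_field_names(names: list[str]) -> list[str]:
    """Return a list of human-readable problems; an empty list means all good."""
    problems = []
    for name, count in Counter(names).items():
        if count > 1:
            problems.append(f"duplicate key: {name}")
    for name in names:
        if not KEY_PATTERN.fullmatch(name):
            problems.append(f"invalid key: {name}")
    return problems


print(check_field_names(["total", "total", "Issue Date"]))
```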

Working with Groups

Groups allow you to organize related fields into logical collections, which affects the JSON output structure.

Creating Groups

Click Add Group
Configure group properties:
- JSON Key: Unique identifier for the group
- Type: Choose between:
  - Object: Creates a single object containing grouped fields
  - List of Objects: Creates an array of objects

Group Requirements

Each group must contain at least one field
Groups without fields will display a warning icon
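
To see how the group type shapes the output, and which groups would trigger the warning icon, here is a small sketch. The `groups` dictionary and the type strings `"object"` and `"list_of_objects"` are hypothetical stand-ins for illustration, not the actual blueprint file format.

```python
# Hypothetical in-memory view of a blueprint's groups (not the real file format)
groups = {
    "patient": {"type": "object", "fields": ["name", "address"]},
    "medication_list": {"type": "list_of_objects", "fields": []},
}


def empty_groups(groups: dict) -> list[str]:
    """Groups with no fields, i.e. the ones that would show a warning."""
    return [key for key, group in groups.items() if not group["fields"]]


def output_skeleton(groups: dict) -> dict:
    """Sketch of the JSON shape each group type produces."""
    skeleton = {}
    for key, group in groups.items():
        record = {field: None for field in group["fields"]}
        # An Object yields one record; a List of Objects yields an array of records
        skeleton[key] = [record] if group["type"] == "list_of_objects" else record
    return skeleton


print(empty_groups(groups))
print(output_skeleton(groups))
```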

How to Assign Fields to Groups?

Method 1: Direct assignment

Click on a field box
Select group from the dropdown menu (appears when groups exist)

Method 2: Move the existing field

Click the menu icon for any field
Select Move to Group
Choose the target group from the modal

Group Management

Expand/Collapse: Click the folder icon to show/hide group contents
Visual Indicators: Alert icons indicate groups needing fields, folder icons show properly configured groups

Best Practices on Object vs List of Objects

Simple rule of thumb

Ask: “Can there be several of these in a single response?”
- No → Object
- Yes / maybe → List of Objects

Apply that to every top‑level key when you’re choosing between the two options in the blueprint UI.

Use Object when…

Exactly one entity is expected per response.
In the example below, patient and pharmacy are Objects because each prescription has one patient and one pharmacy.

"patient": {
  "name": "John, Doe",
  "address": null
},
"pharmacy": {
  "name": "KAISER PERMANENTE",
  "address": "...",
  "phone_number": "..."
}

Use List of Objects when…

There can be zero, one, or many of the same kind of thing, each with the same structure.

"medication_list": [
  { ... one medication ... },
  { ... another medication ... }
]
If you need to create a List of Objects or an Object:

Open the blueprint
Add a new group by selecting Add Group
Type: choose Object or List of Objects
Give the group a descriptive name

To add a new field to an existing Object or List of Objects:

Add a new field
Select the group to add it to

If you need to change an existing field's location:

Open the blueprint
Navigate to the field and open the More option
Select Move Group to move the field from object A to object B

🧑🏻‍🏫 How does multi-page PDF support work?

For multi-page PDFs, the Blueprint Generator provides page navigation controls. Users can move between pages and define different extraction fields for each page as needed. Field assignments are page-specific and remain associated with their designated pages.

Navigation

Page Footer: Use the navigation controls at the bottom right
Page Jumping: Enter a specific page number and press Enter or click away to jump directly
Page-Specific Fields: Fields are associated with specific pages where they were created

Page Management

Each page maintains its own set of field boxes
Navigate between pages to see relevant field overlays
Fields created on one page won't appear on others
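
The page-specific behavior can be pictured as grouping fields by the page they were created on. This sketch assumes each field definition carries a `page_number` entry (as the JSON snippet in the date format section of this guide does) and that it is zero-based; the `fields` dictionary itself is hypothetical.

```python
from collections import defaultdict

# Hypothetical field definitions; only "page_number" matters for this sketch
fields = {
    "issue_date": {"page_number": 0},
    "total_amount": {"page_number": 1},
    "account_number": {"page_number": 0},
}


def fields_by_page(fields: dict) -> dict:
    """Map each page number to the field keys drawn on that page."""
    pages = defaultdict(list)
    for key, definition in fields.items():
        pages[definition["page_number"]].append(key)
    return dict(pages)


print(fields_by_page(fields))
```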

📕 Saving and Managing Blueprints

Saving Your Blueprint

Enter a descriptive name for your blueprint, something straightforward and easy to remember.
Click Save to create the blueprint
The blueprint will appear in the blueprints list with your specified name and type

Viewing Saved Blueprints

Scroll down to the blueprints list
Click View on any blueprint to open it
The system will automatically:
- Display the original document
- Draw all saved field boxes
- Show group organization
- Expand grouped fields for easy viewing

Can blueprints be modified after creation?

Yes. Existing blueprints can be edited by selecting the "View" option, making the necessary changes, and saving the updated version. Changes take effect immediately, so be careful if anything in production could be affected by a sudden change; edits can also affect response times and field mapping.

Open a blueprint using the View option
Make necessary changes to fields or groups
Click Save to update the blueprint
Changes are immediately reflected in the blueprint list

☝🏼 Duplicate / Export & Import - new!

We’ve introduced new capabilities to make managing and iterating on blueprints easier across environments and use cases.

Duplicate Blueprint
- What it does: Creates an exact copy of an existing blueprint within the same profile.
- When to use it:
  - To create versions of a blueprint for A/B testing or experimentation (e.g., …_V1, …_V2).
  - To safely try changes to fields, schema, or extraction logic without impacting the original/production blueprint.
Export Blueprint
- What it does: Downloads a blueprint definition (schema and configuration) into a portable file format.
- When to use it:
  - To back up a blueprint externally.
  - To prepare moving a blueprint from one profile/environment (e.g., Dev) to another (e.g., Prod).
Import Blueprint
- What it does: Uploads a previously exported blueprint file into a selected profile/environment, creating a new blueprint there.
- When to use it:
  - To promote a blueprint from development to staging/production.
  - To keep blueprint definitions in sync across multiple profiles or accounts.

Together, these features support:

Easy versioning and testing (using Duplicate in the same environment), and
Smooth migration and reuse of blueprints across different environments (using Export/Import).
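
If you adopt the `_V1`, `_V2` naming pattern for duplicates, a tiny helper can keep the suffixes consistent. This is a sketch of one possible convention only: `next_version_name` is a hypothetical helper, and treating an unsuffixed name as version 1 is an assumption.

```python
import re


def next_version_name(name: str) -> str:
    """Return the blueprint name with its _V<n> suffix incremented."""
    match = re.fullmatch(r"(.*)_V(\d+)", name)
    if match:
        base, version = match.group(1), int(match.group(2))
        return f"{base}_V{version + 1}"
    # Assumption: a name without a suffix is treated as version 1
    return f"{name}_V2"


print(next_version_name("invoice_V1"))
```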

💡 Best practices for creating effective blueprints

Core Principles

Design for generality and reuse: favor semantic descriptions over positional cues or visual hints. The model does not understand coordinates or layout because it processes plain text.
Keep field descriptions concise but informative; optimize for clarity and LLM reasoning speed.
Treat the description as prompt engineering: remove redundancy already implied by the document type.
Rely on verifiable data: the anti-hallucination layer rejects outputs not supported by the source document.

Blueprint Naming & Document Types

Use descriptive document names that express purpose and context.

Prefer explicit names: “bank account verification letter” instead of “verification letter”.
Select an accurate document type; it provides the LLM stronger contextual priors.

Field Design Best Practices

Descriptions guide the model and function as prompts. Write them to be model-friendly and reusable.

Be descriptive and concise. Avoid long narratives that distract the model and increase latency.
Avoid positional language such as “date at the bottom/right/green box”; it won’t provide the context you expect, because the LLM receives the OCR output as plain, single-line text.
Describe the field’s semantic role broadly so it works across variants (letters, forms, statements).
Expand abbreviations in parentheses to teach the model: e.g., “GM (General Motors)”.
Field names matter, but descriptions matter more. Keep names short.
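
These guidelines can be turned into a quick review aid that runs over your descriptions before you save a blueprint. The sketch below is not part of the product: the word list, the 40-word limit, and the name `lint_description` are all arbitrary assumptions to tune for your own documents.

```python
import re

# Assumed word list and limit; adjust both to your own documents
POSITIONAL_TERMS = {"top", "bottom", "left", "right", "corner", "box", "above", "below"}
MAX_WORDS = 40


def lint_description(description: str) -> list[str]:
    """Flag positional language and overly long field descriptions."""
    warnings = []
    words = re.findall(r"[a-z']+", description.lower())
    hits = sorted(set(words) & POSITIONAL_TERMS)
    if hits:
        warnings.append("positional language: " + ", ".join(hits))
    if len(words) > MAX_WORDS:
        warnings.append(f"too long: {len(words)} words")
    return warnings


print(lint_description("Date at the bottom right in the green box"))
print(lint_description("Official document issuance date"))
```

A check like this only catches wording; it cannot tell whether a description is semantically clear, so treat it as a first pass.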

Choosing Field Types

String: free-form text; use for labels, IDs that may include letters/numbers/symbols.
Float: use for monetary totals/amounts, rates, taxes. Avoid integer for currency.
Date: normalized date output (see “Dates & Localization”).
The format can be defined by editing the blueprint's JSON:
- The default date format is US. If the model isn’t inferring the format from context, set "format" to "ddmm" or "mmdd" in the JSON file.
```
"date": {
  "description": "Data de emissão",
  "page_number": 0,
  "bounding_region": [],
  "type": "date",
  "example": " 14 / AGO  / 2025",
  "format": "ddmm"
}
```
List of strings: multiple similar values where order may matter (e.g., line item notes).
Parsed address: prefer this over a single string when downstream needs structured address parts. Consider adding a parallel “raw_address” string when you must preserve exact OCR text.
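
For the date format override described above, the edit to the blueprint JSON can also be scripted. The sketch below assumes the field definitions are available as a dictionary keyed by field name, shaped like the `"date"` snippet in this section; the surrounding file layout and the helper name `set_date_format` are hypothetical.

```python
import json

ALLOWED_DATE_FORMATS = {"ddmm", "mmdd"}  # the two values named in this guide


def set_date_format(blueprint_fields: dict, key: str, date_format: str) -> None:
    """Set the "format" entry on a date field, in place."""
    if date_format not in ALLOWED_DATE_FORMATS:
        raise ValueError(f"unsupported date format: {date_format}")
    field = blueprint_fields[key]
    if field.get("type") != "date":
        raise ValueError(f"{key} is not a date field")
    field["format"] = date_format


# Field definition copied from the example above, without the "format" entry
fields = {
    "date": {
        "description": "Data de emissão",
        "page_number": 0,
        "bounding_region": [],
        "type": "date",
        "example": " 14 / AGO  / 2025",
    }
}
set_date_format(fields, "date", "ddmm")
print(json.dumps(fields, ensure_ascii=False, indent=2))
```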

Bounding Boxes

Boxes aid your authoring workflow and help visualize OCR regions but are not passed to the LLM as contextual anchors. Do not rely on coordinates or shapes in prompts.

Prompt Engineering & Model Interaction

Field descriptions act as targeted prompts; keep them minimal, explicit, and domain-focused.
Do not specify exact punctuation requirements (quotes, commas). Natural variation is okay.
Avoid visual terms (color, location). LLMs do not “see” layout; they rely on text and semantic insights.
Limit in-prompt conversions (e.g., asking the model to turn “twice” into “two”); prefer explicit field types and post-processing rules. The anti-hallucination layer blocks unverifiable transformations.

Prompt Limits, Debugging, and Versions

Prompt size: keep descriptions short; excessively long prompts increase latency and may alter context depending on model limits.
Prompt debugging: direct inspection tools may be limited. Favor small, isolated edits and regression tests to confirm impact.

Accuracy & Regression Testing

Assemble a gold set of diverse documents per blueprint (locales, layouts, qualities). Include edge cases (noise, stamps), and aim for 30 documents or more to get a good sample.
Freeze expected outputs and re-run after any blueprint change. Track pass/fail per field and per document.
Compare outputs on every field, not only the ones modified in the blueprint: the model receives an entirely new context input, which can change the output of other fields. Add guardrails as needed.
Prioritize high-value fields (totals, dates, account numbers) and fields historically prone to locale issues (decimals, separators).

Never modify a live blueprint without a rollout plan. Use versioning, stage changes, and validate with regression tests before promoting.
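
A regression check along these lines can be as simple as comparing frozen expected outputs with fresh ones, per document and per field. The sketch below assumes you have already collected both sets as plain dictionaries keyed by your own document IDs; how you obtain the actual outputs is out of scope, and `compare_outputs` is a hypothetical helper.

```python
def compare_outputs(expected: dict, actual: dict) -> dict:
    """Return {doc_id: {field: (expected_value, actual_value)}} for every mismatch."""
    failures = {}
    for doc_id, expected_fields in expected.items():
        actual_fields = actual.get(doc_id, {})
        diffs = {
            field: (value, actual_fields.get(field))
            for field, value in expected_fields.items()
            if actual_fields.get(field) != value
        }
        if diffs:
            failures[doc_id] = diffs
    return failures


# Made-up gold set and re-run results, for illustration only
expected = {
    "doc1": {"total_amount": 10.5, "issue_date": "2025-08-14"},
    "doc2": {"total_amount": 3.0},
}
actual = {
    "doc1": {"total_amount": 10.5, "issue_date": "2025-08-15"},
    "doc2": {"total_amount": 3.0},
}
print(compare_outputs(expected, actual))
```

Because every expected field is compared, a change to one description that shifts another field's output shows up as a failure too.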

Field Description Templates (Examples)

currency_code (string): “Three-letter ISO 4217 currency code (e.g., CAD, USD) found on the document; use as context to format monetary values and decimals.”
total_amount (float): “Grand total payable on the document; prefer labeled totals over subtotals; interpret separators per document’s locale and currency_code.”
issue_date (date): “Official document issuance date; choose the primary issuance label over received/printed dates.”
signature_present (bool): “True if the document includes a handwritten or electronic signature indicator (e.g., signature line signed, ‘Digitally signed by’).”

Troubleshooting & Common Pitfalls

Symptoms often trace back to ambiguous or overly long descriptions, or to locale/currency assumptions. Iterate with small, testable changes.

Amounts off by factor of 100 or 1000: check separators and ensure currency_code is present and described; confirm float type.
Wrong date chosen: clarify primary semantic (issue vs. due vs. received) and avoid positional hints.
Missing line items: model your group as a list of objects and ensure each required child field has clear, short descriptions.
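
The "off by a factor of 100 or 1000" symptom usually comes from reading a thousands separator as a decimal separator, or the reverse. The sketch below shows the effect in a client-side sanity check; `parse_amount` is a hypothetical helper, not part of Veryfi's API, and it uses `Decimal` only to avoid float rounding noise in the comparison.

```python
from decimal import Decimal


def parse_amount(text: str, decimal_separator: str = ".") -> Decimal:
    """Parse an amount string given the document's decimal separator."""
    thousands = "," if decimal_separator == "." else "."
    cleaned = text.strip().replace(" ", "").replace(thousands, "")
    cleaned = cleaned.replace(decimal_separator, ".")
    return Decimal(cleaned)


# The same text read under two locale assumptions differs by a factor of 1000
print(parse_amount("1.234", decimal_separator=","))  # "." treated as thousands separator
print(parse_amount("1.234", decimal_separator="."))  # "." treated as decimal separator
```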

🛡 What security measures are in place?

Veryfi maintains GDPR, HIPAA, and SOC 2 Type 2 compliance with bank-level security protocols. Documents and extracted data remain within Veryfi's infrastructure and are not shared with external AI providers.
