
How to Create a Blueprint

The Blueprint Generator allows users to create custom document processing templates by uploading documents and defining data extraction fields through an intuitive visual interface.

Updated over a week ago

Overview

In the visual interface, users draw boxes around specific areas of a document to identify and extract key information using Veryfi's OCR technology.

👨‍🏫 Learn more about ADocs

Accessing the Blueprint Generator

  1. Navigate to Inbox > Any Docs

  2. Click on Blueprints to view the blueprint management interface

So what are these Blueprints?

Think of blueprints as templates that teach ADocs exactly what to look for in your document.
You've got two options:

  • Prebuilt Blueprints - We've already created templates for common documents like insurance forms, work orders, passports, etc.

  • Custom Blueprints - You can create your own templates for literally any document type.

Creating a New Blueprint

The process involves several straightforward steps:

  1. Initialize creation - Navigate to the AnyDocs inbox and select "Create Blueprint."

  2. Upload document - Provide a sample document in PDF, JPEG, or PNG format

  3. Define extraction areas - Draw boxes around text regions you want to extract

  4. Configure fields - Assign names, data types, and descriptions to each extracted field

  5. Organize with groups (optional) - Group related fields together for better data structure

  6. Save blueprint - Finalize the template for future document processing


🔌 Step by Step Guide

Step 1️⃣ : Initialize Blueprint Creation

  1. Click Add Blueprint to open the blueprint creator

  2. The system will display a file uploader interface

Step 2️⃣ : Upload Document

Upload your document by selecting the appropriate file type and choosing your file. Supported file formats: JPEG, PNG, PDF (including multi-page documents)

Step 3️⃣ : Configure Blueprint Settings

  • Name: Provide a descriptive name for your blueprint

  • Type: Select the appropriate blueprint type from the available options

Field Creation

  1. Draw Selection Box: Click and drag to draw a box around the text area you want to extract

  2. Automatic Text Recognition: The system uses OCR to automatically detect and populate text from the selected region

  3. Configure Field Properties:

    • Text: Review and edit the automatically detected text

    • JSON Field Name: Enter a unique key name for this field

      • Must use only lowercase letters, digits, and underscores

      • Spaces are automatically converted to underscores

      • Uppercase letters are automatically converted to lowercase

    • Type: Select from available field types

    • Description: Add optional description for the field

  4. Click Save to add the field to your blueprint
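The JSON Field Name rules above can be illustrated with a small Python sketch. The function name `normalize_field_name` is hypothetical, and rejecting other characters with an error is an assumption: the article only states that spaces become underscores and uppercase becomes lowercase.

```python
import re


def normalize_field_name(raw: str) -> str:
    """Apply the documented conversions: spaces -> underscores, uppercase -> lowercase."""
    # Trimming surrounding whitespace is an assumption made for this sketch.
    name = raw.strip().replace(" ", "_").lower()
    # How the UI handles other characters is not documented; this sketch rejects them.
    if not re.fullmatch(r"[a-z0-9_]+", name):
        raise ValueError(f"unsupported characters in field name: {raw!r}")
    return name


print(normalize_field_name("Invoice Number"))
```

Running a candidate name through a check like this before typing it into the UI can help keep keys predictable for downstream code.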

Step 4️⃣ : Field Management

Editing Fields

  • Visual Editing: Click on any drawn box to edit field properties

  • Table Editing: Click directly into table cells to modify values inline

  • Field List: View all fields in the fields table at the bottom of the interface

Field Requirements

  • JSON field names must be unique across all fields

  • All key names follow lowercase alphanumeric format with underscores
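As a rough illustration of the uniqueness requirement, the following Python sketch finds repeated keys in a flat list of JSON field names. The helper `find_duplicate_keys` and the sample names are hypothetical, and whether uniqueness is also enforced across groups is not stated here.

```python
from collections import Counter


def find_duplicate_keys(field_names: list[str]) -> list[str]:
    """Return every field name that appears more than once, sorted alphabetically."""
    counts = Counter(field_names)
    return sorted(name for name, count in counts.items() if count > 1)


# Hypothetical field names for a prescription blueprint.
names = ["patient_name", "pharmacy_name", "patient_name"]
print(find_duplicate_keys(names))
```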

Working with Groups

Groups allow you to organize related fields into logical collections, which affects the JSON output structure.

Creating Groups

  1. Click Add Group

  2. Configure group properties:

    • JSON Key: Unique identifier for the group

    • Type: Choose between:

      • Object: Creates a single object containing grouped fields

      • List of Objects: Creates an array of objects

Group Requirements

  • Each group must contain at least one field

  • Groups without fields will display a warning icon

How to Assign Fields to Groups?

Method 1: Direct assignment

  1. Click on a field box

  2. Select group from the dropdown menu (appears when groups exist)

Method 2: Move the existing field

  1. Click the menu icon for any field

  2. Select Move to Group

  3. Choose the target group from the modal

Group Management

  • Expand/Collapse: Click the folder icon to show/hide group contents

  • Visual Indicators: Alert icons indicate groups needing fields, folder icons show properly configured groups


Best Practices on Object vs List of Objects

Simple rule of thumb

  • Ask: “Can there be several of these in a single response?”

    • No → Object

    • Yes / maybe → List of Objects

Apply that to every top‑level key when you’re choosing between the two options in the blueprint UI.


Use Object when…

  • A single entity is expected per response

  • Example: on a prescription, patient and pharmacy are Objects because each prescription has one patient and one pharmacy.

"patient": {
  "name": "John, Doe",
  "address": null
},
"pharmacy": {
  "name": "KAISER PERMANENTE",
  "address": "...",
  "phone_number": "..."
}

Use List of Objects when…

  • There can be zero, one, or many of the same kind of thing

  • You expect multiple repeating items of the same structure

"medication_list": [
  { ... one medication ... },
  { ... another medication ... }
]
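To show how the two group types differ for code that consumes the output, here is a minimal Python sketch. The payload is hypothetical: it mirrors the prescription example, and the `name` key inside each medication is an assumption made for illustration.

```python
import json

# Hypothetical extraction result shaped like the prescription example.
response_text = """
{
  "patient": {"name": "John, Doe", "address": null},
  "pharmacy": {"name": "KAISER PERMANENTE", "address": "...", "phone_number": "..."},
  "medication_list": [
    {"name": "medication A"},
    {"name": "medication B"}
  ]
}
"""

result = json.loads(response_text)

# Object group: a single dictionary, so keys are read directly.
patient_name = result["patient"]["name"]

# List of Objects group: iterate, and tolerate an empty or missing list.
medication_names = [item["name"] for item in result.get("medication_list", [])]

print(patient_name)
print(medication_names)
```

An Object is read once, while a List of Objects always needs a loop, even when it happens to contain a single item.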

If you need to create a List of Objects or an Object:

  1. Open the blueprint

  2. Add a new group by selecting Add Group

  3. Set Type to Object or List of Objects

  4. Give it a clear, descriptive name

If you need to add a new field to an existing Object or List of Objects:

  1. Add a new field

  2. Select the group to add it to

If you need to change an existing field's location:

  1. Open the blueprint

  2. Navigate to the field and open its More option

  3. Select Move to Group to move the field from object A to object B


🧑🏻‍🏫 How does multi-page PDF support work?

For PDF documents with multiple pages, the Blueprint Generator provides navigation controls. Users can move between pages and define different extraction fields for each page as needed. Field assignments are page-specific and remain associated with their designated pages.

Navigation

  • Page Footer: Use the navigation controls at the bottom right

  • Page Jumping: Enter a specific page number and press Enter or click away to jump directly

  • Page-Specific Fields: Fields are associated with specific pages where they were created

Page Management

  • Each page maintains its own set of field boxes

  • Navigate between pages to see relevant field overlays

  • Fields created on one page won't appear on others
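The page-specific behavior can be pictured as each field carrying a page index. The Python sketch below groups hypothetical field definitions by a `page_number` key; treating that key as zero-based is an assumption based on the date example in this article, and the field names are made up.

```python
from collections import defaultdict

# Hypothetical field definitions; "page_number" is assumed to be zero-based.
fields = {
    "issue_date": {"type": "date", "page_number": 0},
    "total_amount": {"type": "float", "page_number": 1},
    "account_number": {"type": "string", "page_number": 1},
}

# Collect the field keys that belong to each page.
fields_by_page = defaultdict(list)
for key, definition in fields.items():
    fields_by_page[definition["page_number"]].append(key)

for page in sorted(fields_by_page):
    print(page, fields_by_page[page])
```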



📕 Saving and Managing Blueprints

Saving Your Blueprint

  1. Enter a descriptive name for your blueprint, something straightforward and easy to remember.

  2. Click Save to create the blueprint

  3. The blueprint will appear in the blueprints list with your specified name and type

Viewing Saved Blueprints

  1. Scroll down to the blueprints list

  2. Click View on any blueprint to open it

  3. The system will automatically:

    • Display the original document

    • Draw all saved field boxes

    • Show group organization

    • Expand grouped fields for easy viewing

Can blueprints be modified after creation?

Yes, existing blueprints can be edited by selecting the "View" option, making the necessary changes, and saving the updated version. Changes take effect immediately, so be careful if anything in production could be affected by a sudden change (for example, response times and field mapping).

  1. Open a blueprint using the View option

  2. Make necessary changes to fields or groups

  3. Click Save to update the blueprint

  4. Changes are immediately reflected in the blueprint list

☝🏼 Duplicate / Export & Import - new!


We’ve introduced new capabilities to make managing and iterating on blueprints easier across environments and use cases.

  1. Duplicate Blueprint

    • What it does: Creates an exact copy of an existing blueprint within the same profile.

    • When to use it:

      • To create versions of a blueprint for A/B testing or experimentation (e.g., …_V1, …_V2).

      • To safely try changes to fields, schema, or extraction logic without impacting the original/production blueprint.

  2. Export Blueprint

    • What it does: Downloads a blueprint definition (schema and configuration) into a portable file format.

    • When to use it:

      • To back up a blueprint externally.

      • To prepare moving a blueprint from one profile/environment (e.g., Dev) to another (e.g., Prod).

  3. Import Blueprint

    • What it does: Uploads a previously exported blueprint file into a selected profile/environment, creating a new blueprint there.

    • When to use it:

      • To promote a blueprint from development to staging/production.

      • To keep blueprint definitions in sync across multiple profiles or accounts.

Together, these features support:

  • Easy versioning and testing (using Duplicate in the same environment), and

  • Smooth migration and reuse of blueprints across different environments (using Export/Import).
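When comparing a duplicated or re-imported blueprint against its original, a field-level diff can help. The export file format is not specified here, so this Python sketch assumes both versions have already been loaded into dictionaries keyed by JSON field name; `diff_fields` and the sample definitions are hypothetical.

```python
def diff_fields(old: dict, new: dict) -> dict:
    """Summarize which field keys were added, removed, or changed between two versions."""
    old_keys, new_keys = set(old), set(new)
    return {
        "added": sorted(new_keys - old_keys),
        "removed": sorted(old_keys - new_keys),
        "changed": sorted(k for k in old_keys & new_keys if old[k] != new[k]),
    }


# Hypothetical field definitions for two versions of the same blueprint.
v1 = {
    "issue_date": {"type": "date", "description": "Document date"},
    "total_amount": {"type": "float", "description": "Grand total payable"},
}
v2 = {
    "issue_date": {"type": "date", "description": "Official document issuance date"},
    "total_amount": {"type": "float", "description": "Grand total payable"},
    "currency_code": {"type": "string", "description": "Three-letter ISO 4217 currency code"},
}

print(diff_fields(v1, v2))
```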


💡 Best practices for creating effective blueprints

Core Principles

  • Design for generality and reuse: favor semantic descriptions over positional cues or visual hints. The model does not understand coordinates or layout because it processes plain text.

  • Keep field descriptions concise but informative; optimize for clarity and LLM reasoning speed.

  • Treat the description as prompt engineering: remove redundancy already implied by the document type.

  • Rely on verifiable data: the anti-hallucination layer rejects outputs not supported by the source document.

Blueprint Naming & Document Types

Use descriptive document names that express purpose and context.

  • Prefer explicit names: “bank account verification letter” instead of “verification letter”.

  • Select an accurate document type; it gives the LLM stronger contextual priors.

Field Design Best Practices

Descriptions guide the model and function as prompts. Write them to be model-friendly and reusable.

  • Be descriptive and concise. Avoid long narratives that distract the model and increase latency.

  • Avoid positional language such as “date at the bottom/right/green box”. It won’t provide context the way you might expect, because the OCR output reaches the LLM as plain, single-line text.

  • Describe the field’s semantic role broadly so it works across variants (letters, forms, statements).

  • Expand abbreviations in parentheses to teach the model: e.g., “GM (General Motors)”.

  • Field names matter, but descriptions matter more. Keep names short.
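A rough way to catch descriptions that break these guidelines is a simple lint pass. The Python sketch below is a heuristic only: the term list and the 40-word limit are arbitrary assumptions, not product rules, and words such as "box" can trigger false positives (for example, in address descriptions).

```python
import re

# Arbitrary, illustrative list of positional and visual words to flag.
VISUAL_TERMS = {"top", "bottom", "left", "right", "corner", "above", "below",
                "box", "green", "red", "blue"}
MAX_WORDS = 40  # arbitrary threshold chosen for this sketch


def lint_description(description: str) -> list[str]:
    """Return warnings for positional or visual wording and for overly long text."""
    warnings = []
    words = set(re.findall(r"[a-z]+", description.lower()))
    hits = sorted(words & VISUAL_TERMS)
    if hits:
        warnings.append("positional or visual wording: " + ", ".join(hits))
    if len(description.split()) > MAX_WORDS:
        warnings.append(f"longer than {MAX_WORDS} words")
    return warnings


print(lint_description("Date at the bottom right in the green box"))
print(lint_description("Official document issuance date"))
```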

Choosing Field Types

  • String: free-form text; use for labels, IDs that may include letters/numbers/symbols.

  • Float: use for monetary totals/amounts, rates, taxes. Avoid integer for currency.

  • Date: normalized date output (see “Dates & Localization”).
    The format can be defined by editing the blueprint's JSON:

    • The default date format is US (month first). If the model isn’t inferring the format from context, set "format" to "ddmm" or "mmdd" in the JSON file.

      "date": {
        "description": "Data de emissão",
        "page_number": 0,
        "bounding_region": [],
        "type": "date",
        "example": " 14 / AGO  / 2025",
        "format": "ddmm"
      }

  • List of strings: multiple similar values where order may matter (e.g., line item notes).

  • Parsed address: prefer this over a single string when downstream needs structured address parts. Consider adding a parallel “raw_address” string when you must preserve exact OCR text.
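To see why the "ddmm" / "mmdd" setting matters for the Date type, the Python sketch below parses the same numeric date under both readings. It only illustrates the ambiguity with the standard library and is not a description of how the service parses dates; `parse_numeric_date` is a hypothetical helper.

```python
from datetime import datetime


def parse_numeric_date(text: str, fmt: str) -> datetime:
    """Parse 'NN/NN/YYYY' under a 'ddmm' (day first) or 'mmdd' (month first) reading."""
    patterns = {"ddmm": "%d/%m/%Y", "mmdd": "%m/%d/%Y"}
    return datetime.strptime(text, patterns[fmt])


ambiguous = "03/04/2025"
print(parse_numeric_date(ambiguous, "ddmm").date())  # 2025-04-03 (3 April)
print(parse_numeric_date(ambiguous, "mmdd").date())  # 2025-03-04 (4 March)
```

A value such as 14/08/2025 only parses under the day-first reading, which is one reason context is often enough; dates where both numbers are 12 or lower are the ones that benefit from an explicit format.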

Bounding Boxes

Boxes aid your authoring workflow and help visualize OCR regions but are not passed to the LLM as contextual anchors. Do not rely on coordinates or shapes in prompts.

Prompt Engineering & Model Interaction

  • Field descriptions act as targeted prompts; keep them minimal, explicit, and domain-focused.

  • Do not specify exact punctuation requirements (quotes, commas). Natural variation is okay.

  • Avoid visual terms (color, location). LLMs do not “see” layout; they rely on text and semantic insights.

  • Limit in-prompt conversions (e.g., asking the model to turn “twice” into “two”); prefer explicit field types and post-processing rules. The anti-hallucination layer blocks unverifiable transformations.

Prompt Limits, Debugging, and Versions

  • Prompt size: keep descriptions short; excessively long prompts increase latency and may alter context depending on model limits.

  • Prompt debugging: direct inspection tools may be limited. Favor small, isolated edits and regression tests to confirm impact.

Accuracy & Regression Testing

  • Assemble a gold set of diverse documents per blueprint (locales, layouts, qualities). Include edge cases (noise, stamps), and aim for 30 or more documents to get a good sample.

  • Freeze expected outputs and re-run after any blueprint change. Track pass/fail per field and per document.

  • Compare outputs on every field, not only the ones modified in the blueprint. After a change, the model receives an entirely new context, which can alter the output of other fields. Add guardrails as needed.

  • Prioritize high-value fields (totals, dates, account numbers) and fields historically prone to locale issues (decimals, separators).
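The regression workflow above can be sketched in Python as a comparison between frozen expected outputs and a new run. The document IDs, field names, and values are hypothetical, and how the new outputs are obtained is left out on purpose.

```python
def compare_outputs(expected: dict, actual: dict) -> dict:
    """Return pass/fail per field for one document (exact equality)."""
    return {field: actual.get(field) == value for field, value in expected.items()}


# Hypothetical gold set (frozen expected outputs) and outputs from a new run.
gold = {
    "doc_001": {"total_amount": 120.5, "issue_date": "2025-08-14"},
    "doc_002": {"total_amount": 89.0, "issue_date": "2025-07-01"},
}
new_run = {
    "doc_001": {"total_amount": 120.5, "issue_date": "2025-08-14"},
    "doc_002": {"total_amount": 8900.0, "issue_date": "2025-07-01"},
}

# Track pass/fail per document and per field, then list the failures.
report = {doc_id: compare_outputs(expected, new_run.get(doc_id, {}))
          for doc_id, expected in gold.items()}
failures = [(doc_id, field)
            for doc_id, fields in report.items()
            for field, ok in fields.items() if not ok]

print(failures)
```

Because every field in the gold set is compared, a change to one description that shifts another field's output still shows up in the failure list.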

Never modify a live blueprint without a rollout plan. Use versioning, stage changes, and validate with regression tests before promoting.

Field Description Templates (Examples)

  • currency_code (string): “Three-letter ISO 4217 currency code (e.g., CAD, USD) found on the document; use as context to format monetary values and decimals.”

  • total_amount (float): “Grand total payable on the document; prefer labeled totals over subtotals; interpret separators per document’s locale and currency_code.”

  • issue_date (date): “Official document issuance date; choose the primary issuance label over received/printed dates.”

  • signature_present (bool): “True if the document includes a handwritten or electronic signature indicator (e.g., signature line signed, ‘Digitally signed by’).”

Troubleshooting & Common Pitfalls

Symptoms often trace back to ambiguous or overly long descriptions, or to locale/currency assumptions. Iterate with small, testable changes.

  • Amounts off by factor of 100 or 1000: check separators and ensure currency_code is present and described; confirm float type.

  • Wrong date chosen: clarify the primary semantic role (issue vs. due vs. received) and avoid positional hints.

  • Missing line items: model your group as a list of objects and ensure each required child field has clear, short descriptions.
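The "off by a factor of 100 or 1000" symptom usually comes from reading a thousands separator as a decimal separator, or the reverse. The Python sketch below shows the effect with a hypothetical post-processing helper, `parse_amount`; it handles only "." and "," and is not part of the product.

```python
def parse_amount(text: str, decimal_sep: str) -> float:
    """Parse an amount string given the locale's decimal separator ('.' or ',')."""
    thousands_sep = "," if decimal_sep == "." else "."
    # Drop the thousands separator, then normalize the decimal separator to '.'.
    cleaned = text.replace(thousands_sep, "").replace(decimal_sep, ".")
    return float(cleaned)


raw = "1.234"
print(parse_amount(raw, decimal_sep="."))  # 1.234
print(parse_amount(raw, decimal_sep=","))  # 1234.0
print(parse_amount("1.234,56", decimal_sep=","))  # 1234.56
```

The same text yields values that differ by a factor of 1000 depending on the assumed locale, which is why a well-described currency_code field helps the model pick the right reading.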


🛡 What security measures are in place?

Veryfi maintains GDPR, HIPAA, and SOC 2 Type 2 compliance with bank-level security protocols. Documents and extracted data remain within Veryfi's infrastructure and are not shared with external AI providers.
