Below you will find more information about the JSON fields category and line_item.category: how they are extracted, what problems they solve, and what customization is available.

What are the various use cases served by Veryfi smart categorization?

Categories are used to classify and track different types of expenses to understand spending patterns and analyze financial data.

Veryfi data extraction caters to multiple use cases, each with its unique demands and priorities. Different use cases may prioritize different aspects of data extraction. For instance, while one use case might require the extraction of the manufacturer or lot numbers, another might place greater importance on determining the type of expense and the category to which it belongs.

Our experience has shown that categories are particularly relevant in the CPG loyalty vertical and expense management domains. Categories are crucial in these industries, facilitating efficient expense tracking, personalized marketing, and strategic decision-making.

Let's take a closer look at the demand:

CPG loyalty

Categories are important in the CPG loyalty space as they enable targeted marketing, personalized experiences, data analysis, and effective inventory management. By leveraging category insights, loyalty programs can enhance customer engagement, loyalty, and drive business growth.

Expense management / Bill pay

For spend management use cases, categories enable expense tracking, budgeting, vendor management, compliance, cost optimization, strategic decision-making, and effective reporting. They provide structure and clarity to financial data, allowing organizations to gain insights, control costs, and make informed decisions related to their spending.

Smart categorization at Veryfi

Veryfi OCR API utilizes smart categorization at both the document and line item levels.

Smart categorization streamlines the data extraction process, ensures accurate categorization of expenses, and offers flexibility for users to customize categories according to their specific needs.

There are two types of categories at Veryfi:

  • category - a field in your JSON that represents the document-level category

  • line_item.category - a field on each line item that represents the line-item-level category

All Veryfi user accounts are created with a standard list of Expense Categories (COA) that users can customize.

You can find a list of standard categories under Data Transformations > Categories.

How does Veryfi smart categorization work?

Smart categorization is part of the data enrichment process. At the document level, the system first examines the categories associated with the user's account. If no match is found within the standard list, the model applies the category it deems most suitable based on previous selections and patterns across all Veryfi customers, taking into account the vendor domain and type. This ensures that the system can still provide relevant categorization even without explicit category assignments.

Currently, we are actively working on enhancing line item categorization to match the advanced categorization capabilities available at the document level. Our aim is to ensure that once the document is processed and the document level category is assigned, the line items are categorized based on the product or service information and previous categorizations made manually in the Web portal or via PUT API to update line item categories.

If your use case requires advanced smart categorization on a line item level, please get in touch with us.

What if you have your custom categories?

In addition, users have the flexibility to send a personalized list of their own categories. When users provide their custom categories, the model attempts to match expenses with the specified categories, supporting multiple languages. However, it is important to note that the model relies on common sense assumptions when matching expenses. For instance, if an expense is from Burger King, it is expected that Veryfi will categorize it as "food" or related custom categories, but it is not expected to categorize it as "dinner with a colleague" or similar specific descriptions. The emphasis is on aligning the expense with broader, more general categories to ensure accurate and practical categorization within the Veryfi smart categorization system.

If you want to use your own categories, you can provide the list along with a request using the categories request parameter. Please check the API Schema provided by Veryfi to determine the correct parameter type and format.

Please note that boost_mode turns off categorization along with some other enrichments if used in the same API request.
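As a rough illustration of the two points above, the sketch below builds a request body that carries custom categories and refuses to combine them with boost_mode. The helper name build_process_payload, the file_url key, and the assumption that categories is a list of strings are illustrative only; the API Schema is the authority on the actual parameter type and format.

```python
import json


def build_process_payload(file_url, categories=None, boost_mode=False):
    """Build a JSON body for a document-processing request (illustrative sketch)."""
    # Assumption: the document is referenced by a "file_url" key.
    payload = {"file_url": file_url}
    if categories:
        if boost_mode:
            # boost_mode turns off categorization, so a custom list sent in
            # the same request would have no effect.
            raise ValueError("boost_mode disables categorization; send only one of the two")
        # Assumption: custom categories are sent as a list of strings.
        payload["categories"] = list(categories)
    if boost_mode:
        payload["boost_mode"] = True
    return payload


body = build_process_payload(
    "https://example.com/receipt.jpg",
    categories=["Food", "Travel", "Office Supplies"],
)
print(json.dumps(body, indent=2))
```

Keeping the list broad (for example "Food" rather than "dinner with a colleague") matches the common-sense matching described above.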

If you want even more control, take a look at the default_category field: a category predicted by the Veryfi model from the Veryfi list of best-known categories.

Turn documents into customer insights

We understand the importance of the data classification Veryfi does, so we recommend you look at two other fields that Veryfi returns: line_item.type and vendor_type. If you are in the CPG loyalty or expense management field, these might be important for you as they unlock insights into customers and their purchase behavior. Together with the line item and document level categories, they give you a great tool to validate purchases and unlock cross-basket insights to fuel your loyalty marketing programs, or to empower expense management solutions by quickly flagging items not covered by the company travel reimbursement policy. For example, a hotel folio with a line item of type "alcohol" can be flagged instantly.

  • line_item.type - a line item type predicted from the Veryfi list, e.g., food

  • vendor_type - a vendor type predicted from the Veryfi list:

    other┃Grocery┃Taxi┃Fuel┃Hardware┃Online Shopping┃Restaurant┃Utilities┃Hotel┃Fast Food┃Department Store┃Convenience┃General Contractor┃Food┃Car Repair┃Coffee┃Parking┃Drugstore / Pharmacy┃Airlines┃Nurseries & Gardening┃Auto Parts┃Bakery┃Transportation┃Health┃Building Supplies┃Office Equipment

    Please find the full list of types in the API Schema.
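The hotel folio example can be sketched as a small post-processing step. The shape of the sample document (a flat vendor_type key, a line_items list with type and description keys) and the "room" line item type are assumptions made for illustration; check the JSON returned for your account before relying on these names.

```python
def flag_line_items(document, blocked_types):
    """Return descriptions of line items whose type is in blocked_types."""
    blocked = {t.lower() for t in blocked_types}
    flagged = []
    for item in document.get("line_items", []):
        # A missing or null type is treated as "no type" and never flagged.
        item_type = (item.get("type") or "").lower()
        if item_type in blocked:
            flagged.append(item.get("description", ""))
    return flagged


# Hypothetical, hand-written sample; not real API output.
sample = {
    "vendor_type": "Hotel",
    "category": "Travel",
    "line_items": [
        {"description": "Room charge", "type": "room", "total": 180.0},
        {"description": "Minibar wine", "type": "alcohol", "total": 24.0},
    ],
}

if sample["vendor_type"] == "Hotel":
    print(flag_line_items(sample, {"alcohol"}))  # prints ['Minibar wine']
```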

Automation and data transformation

If you want more control over the categorization process and need to define precise guidelines, you can create automation using rules. Rules assign specific categories at both the document and line item levels based on predetermined criteria or conditions, so items are categorized consistently and according to your own guidelines.

🧑‍🍳 Rules can be set up in your Veryfi Portal under Data Transformation > Rules

📚 Learn more about Data transformation on Line items FAQ
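Rules themselves are configured in the Veryfi Portal, not in code. Purely to illustrate the idea of "condition, then category", the sketch below applies first-match-wins rules to line items on the client side. It is not Veryfi's rule engine; the function name, rule format, and category names are all hypothetical.

```python
def apply_rules(line_item, rules, fallback):
    """Return the category of the first rule whose condition matches, else fallback."""
    for condition, category in rules:
        if condition(line_item):
            return category
    return fallback


# Each rule is a (condition, category) pair; order decides priority.
rules = [
    (lambda item: "coffee" in item["description"].lower(), "Meals"),
    (lambda item: item.get("type") == "alcohol", "Not reimbursable"),
]

items = [
    {"description": "Large Coffee", "type": "food"},
    {"description": "House red", "type": "alcohol"},
    {"description": "Notebook", "type": "product"},
]

categories = [apply_rules(item, rules, fallback="Uncategorized") for item in items]
print(categories)  # prints ['Meals', 'Not reimbursable', 'Uncategorized']
```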

Smart Categorization and model training

In the current implementation, only document-level fields participate in model training. All changes made via PUT API calls are used in the model training process, which allows the system to learn from updated category assignments and continuously improve its accuracy over time.
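A correction to a document-level category could be prepared roughly as follows. The base URL, the path, the header names, and the {"category": ...} body are assumptions for illustration; confirm them against the API Schema. The request is only built here, not sent.

```python
import json
import urllib.request

# Assumption: base URL and API version; check the API Schema for your account.
API_BASE = "https://api.veryfi.com/api/v8/partner"


def build_category_update(document_id, category, client_id, username, api_key):
    """Prepare (but do not send) a PUT request that updates a document's category."""
    url = f"{API_BASE}/documents/{document_id}"
    body = json.dumps({"category": category}).encode("utf-8")
    headers = {
        "Content-Type": "application/json",
        "Accept": "application/json",
        # Assumption: authentication header names and format.
        "CLIENT-ID": client_id,
        "AUTHORIZATION": f"apikey {username}:{api_key}",
    }
    return urllib.request.Request(url, data=body, headers=headers, method="PUT")


request = build_category_update(12345, "Meals", "my-client-id", "my-username", "my-api-key")
print(request.get_method(), request.full_url)
# To actually send it: urllib.request.urlopen(request)
```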

If your use case depends on the line item level category and you want to participate in model training to improve the assignment, please reach out to us.

📚 Read more about Model training at Veryfi

Let's look at all the fields mentioned in this article and their descriptions one more time:

  • category - a category predicted from sent categories, user categories, or default ones

  • categories - a request parameter that users can use to provide the list of custom categories

  • default_category - a category predicted from the Veryfi list if custom categories are used

  • line_item.type - a line type predicted from the Veryfi list, e.g., food, product, service

  • line_item.category - a category taken from a reviewed line item with the same SKU and/or description; otherwise taken from the document category

  • vendor_type - a vendor type predicted from the Veryfi list, e.g., Taxi

If your JSON is missing any of the fields mentioned above, please contact Veryfi support.