What is AI Builder (document extraction / OCR)?

AI Builder is the AI component of the Microsoft Power Platform that lets you use prebuilt and custom AI models without deep data science skills. A central use case is document extraction via OCR: AI Builder reads fields out of documents such as invoices or forms, supplied as PDFs or images, and makes them available for automated processing.

Also known as: document processing · OCR · text recognition · document extraction · AI Builder

01

Where AI Builder is used

OCR (optical character recognition) converts image or PDF content into machine-readable text. AI Builder goes further: with document processing models it recognizes not only text but also structure — that is, which field is the invoice amount, the order number or the date. This structured output can be passed directly into downstream processes.
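
To make the difference concrete, the following Python sketch contrasts the two output shapes. The sample text and the field names (order_number, date, invoice_amount) are invented for illustration and are not AI Builder's actual response schema.

```python
import re

# What plain OCR yields: one undifferentiated block of text (invented sample).
ocr_text = "Order no. 4711\nDate 2024-03-01\nTotal 1,250.00 EUR"

# With plain OCR the consumer has to locate each value itself,
# for example with a regular expression written for one specific layout.
match = re.search(r"Order no\.\s*(\d+)", ocr_text)
order_number_from_ocr = match.group(1) if match else None

# What a document processing model yields: named fields (hypothetical shape).
structured = {
    "order_number": "4711",
    "date": "2024-03-01",
    "invoice_amount": "1,250.00 EUR",
}
order_number_from_model = structured["order_number"]
```

The regular expression stops working as soon as a supplier labels the field differently, while a lookup by field name does not depend on the layout.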

AI Builder is tightly integrated with Power Automate: a flow passes a document to the model, receives the extracted fields back and processes them further. This replaces manual data entry steps, which are a typical source of media breaks and typing errors.
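
The three steps of such a flow (pass in the document, receive the fields, process them) can be sketched in Python. In practice this logic lives in Power Automate rather than in code. Every name here (ExtractedField, run_flow, fake_extract) is hypothetical, and the stub stands in for the trained model.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class ExtractedField:
    value: str
    confidence: float  # assumed range 0.0 to 1.0


# Hypothetical signature: bytes of a document in, named fields out.
Extractor = Callable[[bytes], Dict[str, ExtractedField]]


def run_flow(document: bytes, extract: Extractor, sink: list) -> dict:
    """Pass a document to the model, take the fields back, hand them on."""
    fields = extract(document)
    record = {name: field.value for name, field in fields.items()}
    sink.append(record)  # stands in for the write to a target system
    return record


def fake_extract(document: bytes) -> Dict[str, ExtractedField]:
    # Stub in place of the trained model; the values are invented.
    return {
        "sender": ExtractedField("Example Sender Ltd", 0.97),
        "order_number": ExtractedField("4711", 0.97),
    }


target_system: list = []
record = run_flow(b"%PDF-...", fake_extract, target_system)
```

Passing the extractor in as a parameter keeps the sketch independent of any concrete model and makes the handoff between the steps explicit.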

02

A practical example

A logistics company receives orders as PDFs in different layouts. A trained AI Builder document extraction model reliably reads the relevant fields — such as sender, line items and quantities — even when the formats differ. Power Automate takes the extracted data and transfers it into the target system without manual input.

03

How it relates & how smiit uses it

Pure OCR delivers only text, whereas AI Builder delivers structured, field-based data; that is the decisive difference for automation. AI Builder is not a standalone workflow engine: it is orchestrated through Power Automate and often feeds into systems whose master data was consolidated beforehand. For G&B Logistics GmbH, smiit automated PDF order capture with AI Builder and Power Automate, saving around 140 working hours per month.

Common mistakes & misconceptions

  • AI Builder is often seen as a replacement for custom data-science projects; in fact it offers prebuilt and easily trainable models for well-defined tasks, not arbitrary custom AI.
  • Many expect document extraction to be error-free; especially with poor scan quality or unusual layouts, the recognized data still needs validation.
  • People overlook that AI Builder consumes paid credits and is tied to Power Platform licensing.
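
The validation mentioned in the second point can start with simple plausibility checks on the extracted values. The following Python sketch assumes hypothetical field names (order_number, date, quantity) and an ISO date format; real rules depend on the documents and the target system.

```python
from datetime import date


def validate_order(fields: dict) -> list:
    """Return a list of human-readable problems; an empty list means plausible."""
    problems = []
    if not fields.get("order_number", "").strip():
        problems.append("order_number is missing")
    try:
        date.fromisoformat(fields.get("date", ""))
    except ValueError:
        problems.append("date is not a valid ISO date")
    quantity = fields.get("quantity", "")
    if not quantity.isdecimal() or int(quantity) == 0:
        problems.append("quantity is not a positive integer")
    return problems


good = validate_order({"order_number": "4711", "date": "2024-03-01", "quantity": "12"})
# A misread month and a lowercase "l" in place of the digit "1".
bad = validate_order({"order_number": "4711", "date": "2024-13-01", "quantity": "l2"})
```

Here good is an empty list, while bad reports both the impossible month and the non-numeric quantity.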

Frequently asked questions

Does AI Builder also work with different document layouts?

Yes. Document processing models can be trained on different layouts and recognize fields even when the formats vary. Accuracy improves with suitable training examples.

Do I need a data science team for AI Builder?

No. AI Builder is designed to be usable without deep data science skills. For integrating it into processes and training it on your own documents, experience with the Power Platform is nonetheless helpful.

What is the difference between AI Builder and pure OCR?

Pure OCR merely converts a document into text without understanding its meaning. AI Builder additionally recognizes the structure and returns named fields such as invoice amount or order number that can be processed directly.

How many sample documents are needed for training?

That depends on the complexity and the variety of layouts. For clearly structured documents a few examples are often enough, whereas widely varying formats require more training examples for field recognition to become reliable.

What happens if AI Builder reads a field incorrectly or with low certainty?

AI Builder typically provides confidence scores for the fields it recognizes. These can be used to automatically route uncertain cases for manual review instead of passing faulty data unchecked into downstream processes.
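
A minimal Python sketch of such routing, assuming each field arrives as a (value, confidence) pair with confidence between 0 and 1. The 0.80 threshold and the queue names are illustrative assumptions, not values prescribed by AI Builder.

```python
REVIEW_THRESHOLD = 0.80  # assumed cut-off; tune it against real documents


def route(fields: dict, threshold: float = REVIEW_THRESHOLD) -> tuple:
    """Return the target queue and the names of the fields below the threshold."""
    uncertain = [name for name, (_, conf) in fields.items() if conf < threshold]
    queue = "manual_review" if uncertain else "auto_process"
    return queue, uncertain


confident = {"order_number": ("4711", 0.98), "quantity": ("12", 0.91)}
doubtful = {"order_number": ("4711", 0.98), "quantity": ("12", 0.55)}

confident_result = route(confident)
doubtful_result = route(doubtful)
```

Returning the names of the uncertain fields along with the queue lets a reviewer go straight to the values that need checking.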
