Mistral Document AI Understanding & OCR – Magic xpi

Goal 1 : Use Mistral AI Document Understanding and OCR REST API with your Magic xpi Plaform

Goal 2 : Enable flexible, full document lifecycle workflows, and make your archives instantly accessible.

Prerequisites :

Mistral AI API Key
Poppler utility installed (https://poppler.freedesktop.org/)

1/ Transform your document (Invoice PDF file or any kind of document) to PNG image.

Assume that you have this invoice below in PDF format (testfact.pdf) and want to extract information from it in json format

Use poppler utility in a command line in the first Magic xpi step to convert your pdf to png file. (use FileManagement connector)

Define 2 Environment variables :

cmd_processor : *C:\windows\system32\cmd

Poppler_Bin_Path : C:\poppler-24.08.0\Library\bin\

2/ Upload your image file using the Mistal IA REST API (https://api.mistral.ai/v1/files)

You can create your API KEY on your Mistral console (La Plateforme – Mistral AI) like below

Define your Mistral AI REST Client resource with your Authorization header (API KEY) and 3 paths

Add 2 parameters (purpose, file) to Request (/v1/files)

/Configure your REST Client connector like below

Retrieve the Uploaded Id File from the DataMapper using JSON Source in a Flow variable (F.FileId)

3/ Retrieve your uploaded Image URL by passing the id (https://api.mistral.ai/v1/files/:file_id/url)

Click OK and pass your F.FileId in the mapper

Retrieve the URL from the JSON URL with a Datamapper

4/ Populate the prompt body and post it to the URL : https://api.mistral.ai/v1/chat/completions

Build your prompt JSON Body using Magic xpi Template (use the <!$MG_xxx> Tags in the template file) or a JSON schema

Important : to get a generic JSON response, you can use the tag response_format with type : json_schema

than you need to define the json schema that you want inside this tag

Use the Datamapper to populate the template tags with values or your json file

The text prompt could be like below ( we use the model (mistral-small-latest) with Mistral Vision capabilities)

‘From this json, extract the Date, Address To, Ship To and Model name and color and associated product code, unitprice and total price and return it as a string in a Json object‘

Your prompt Body should be like below

(*) if your first step is generating several png files, you need to iterate on the « image_url » node.

In this case, you have use the <!$MGREPEAT>and <!$MGENDREPEAT>tags or Multiple JSON node

(**) you can add a role « system » node with any content to enrich your prompt with the user text

(***) you can change the response_format type to json_schema and add inside this tag, your generic json schema

5/ Extract the JSON Content part from the result and parse it to send data to any backend system

You can use Strtoken function to extract the Content part : (StrToken (StrToken (C.UserBlob,2,' »content »: »‘),1,' »}, »finish_reason »: »stop »}]’))

And then replace the \n with empty space and \ » with » (RepStr (RepStr (C.UserBlob,’\n’, »),’\ »‘,' »‘))

(*) OCR and Document Understanding | Mistral AI Large Language Models

Laisser un commentaire Annuler la réponse