In the business world, automating the reading of documents such as invoices can save a lot of time and prevent many errors. In this workshop, we'll see how easy it is to use Azure Form Recognizer to parse PDF invoices and save their structured data to a JSON file.
What is Azure Form Recognizer?
Form Recognizer is a Microsoft Azure artificial intelligence service that extracts structured information from unstructured documents such as invoices, receipts, forms, etc. We will use the pre-trained model for invoices, ideal for getting started without having to train your own model.
The code step by step
1. Imports and credentials
We import the necessary Azure libraries and the standard json module.
from azure.ai.formrecognizer import DocumentAnalysisClient
from azure.core.credentials import AzureKeyCredential
import json
2. Function to extract information through the Azure service
This function receives a PDF file, sends it to the Azure service, and returns the extraction result.
- DocumentAnalysisClient: the Azure client used to communicate with the service.
- begin_analyze_document: starts the analysis using the "prebuilt-invoice" model, which is specialized in invoices.
- cls=...: defines how the response should be processed. We use json.loads(...) to convert the HTTP response into a Python dictionary.
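The pieces above could be put together roughly as follows. This is a minimal sketch under stated assumptions, not necessarily the workshop's exact code: the helper names build_client and analyze_invoice are hypothetical, and instead of the cls=... callback it calls to_dict() on the AnalyzeResult returned by the poller to obtain a Python dictionary.

```python
def build_client(endpoint: str, key: str):
    """Create the Azure client (hypothetical helper name).

    The SDK imports are kept local so this module can be loaded and the
    analysis helper tested even where the SDK is not installed.
    """
    from azure.ai.formrecognizer import DocumentAnalysisClient
    from azure.core.credentials import AzureKeyCredential

    return DocumentAnalysisClient(endpoint=endpoint, credential=AzureKeyCredential(key))


def analyze_invoice(client, pdf_stream) -> dict:
    """Send a PDF (bytes or a binary stream) to the prebuilt invoice model.

    Assumption: the result is converted with AnalyzeResult.to_dict()
    rather than with a custom cls=... callback.
    """
    # begin_analyze_document returns a poller for the long-running operation.
    poller = client.begin_analyze_document("prebuilt-invoice", pdf_stream)
    # result() blocks until the service has finished analyzing the document.
    return poller.result().to_dict()
```

Passing the client in as a parameter keeps the function easy to exercise with a stand-in object before spending calls against the real service.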

3. Run the analysis and save the result
- The PDF file to be read is opened with the first with open. This ensures that the stream is closed as soon as the block ends. The "rb" mode is used because the file is read in binary mode.
- We then assign to result the value returned by the function described above after it analyzes the PDF.
- A second with open, this time in write mode "w", saves the result using the dump method of the json module, which serializes the Python dictionary to JSON.
- The output is a JSON file containing the raw result of the call.
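The steps above can be sketched as a small helper. The name save_invoice_json and its analyze parameter (any callable that takes a binary stream and returns a dictionary) are hypothetical. The default=str argument is an added precaution, assuming the result may contain values such as dates that json cannot serialize natively.

```python
import json


def save_invoice_json(analyze, pdf_path, json_path):
    """Analyze the PDF at pdf_path and write the result to json_path.

    `analyze` is a hypothetical callable: binary stream in, dict out.
    """
    # "rb": the PDF must be read as raw bytes, not decoded as text.
    with open(pdf_path, "rb") as pdf_file:
        result = analyze(pdf_file)

    # "w": write the serialized dictionary as text.
    with open(json_path, "w", encoding="utf-8") as out_file:
        # default=str is an assumption: it turns non-JSON values
        # (for example date objects) into strings instead of failing.
        json.dump(result, out_file, indent=2, ensure_ascii=False, default=str)

    return result
```

Both files are closed automatically when their with blocks end, even if the analysis raises an exception.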

✅ What does Azure Form Recognizer return?
In the case of invoices, the pre-trained model can return:
- Invoice number
- Issue date
- Total
- Subtotal, taxes, discounts
- Supplier and customer name
- Product or service lines
- Payment methods
All of this becomes structured JSON, ready to integrate with other systems.
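As an illustration of how such fields might be read back from the saved result, here is a hedged sketch. The function name summarize_invoice is hypothetical. It assumes each extracted field is a dictionary with a "content" entry holding the text found on the page, and that the documents list sits either at the top level or under an "analyzeResult" key, depending on how the response was captured. The field names used (InvoiceId, InvoiceDate, VendorName, CustomerName, InvoiceTotal) are those of the prebuilt invoice model.

```python
def summarize_invoice(
    result: dict,
    names=("InvoiceId", "InvoiceDate", "VendorName", "CustomerName", "InvoiceTotal"),
) -> dict:
    """Return the text content of selected fields from the first invoice.

    Assumptions: the result is shaped like {"documents": [{"fields": {...}}]},
    optionally wrapped in an "analyzeResult" key.
    """
    root = result.get("analyzeResult", result)
    documents = root.get("documents") or []
    if not documents:
        return {}  # nothing was recognized as an invoice

    fields = documents[0].get("fields") or {}
    # Keep only the requested fields that the model actually returned.
    return {
        name: fields[name].get("content")
        for name in names
        if isinstance(fields.get(name), dict)
    }
```

Fields the model did not find are simply left out of the summary, so callers should not assume every key is present.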
Requirements to run this code
Before using this script, make sure you:
- Have an Azure account and have created a Form Recognizer resource.
- Have obtained your API key and endpoint from the Azure portal.
- Install the necessary packages:
pip install azure-ai-formrecognizer
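One common way to keep the API key out of the script itself is to read it from environment variables. The sketch below assumes two variable names, FORM_RECOGNIZER_ENDPOINT and FORM_RECOGNIZER_KEY. These are hypothetical choices for this example, not names required by Azure.

```python
import os


def load_credentials(env=None):
    """Return (endpoint, key) read from environment variables.

    The variable names are assumptions made for this example.
    """
    env = os.environ if env is None else env
    endpoint = env.get("FORM_RECOGNIZER_ENDPOINT")
    key = env.get("FORM_RECOGNIZER_KEY")

    # Fail early with a clear message instead of a confusing HTTP error later.
    missing = [
        name
        for name, value in (
            ("FORM_RECOGNIZER_ENDPOINT", endpoint),
            ("FORM_RECOGNIZER_KEY", key),
        )
        if not value
    ]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return endpoint, key
```

The returned pair can then be passed to AzureKeyCredential and DocumentAnalysisClient when creating the client.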
Conclusion
This small script automates PDF invoice reading using AI with just a few lines of code. With Azure Form Recognizer, you can easily scale to thousands of documents, saving time and avoiding human error.
And you? Are you ready to stop reading invoices by hand? Contact us if you found this interesting.




