File formats for invoice automation: does a PDF need to have OCR applied to it?
Many solutions allow a whole range of file formats to be imported. Document restrictions for invoice automation are not new; however, solutions are becoming more flexible in allowing additional formats. This may not be the correct approach, for several reasons.
Firstly, the file format affects the ability to extract the data, or to OCR the document to read the characters. Secondly, you must consider legal admissibility: should you be receiving documents that can be edited at all? Editable formats carry technical implications too, but the primary concern should be admissibility.
The ability to OCR a document
With more suppliers now able to send PDF invoices directly to your mailbox, an increasing share of the PDFs being processed have an embedded text layer. Most OCR solutions will utilise this text layer for indexing purposes. However, don’t confuse OCR with the ability to auto-index and to understand the document’s context, layout and the information being validated. With an embedded text layer all the characters are already known, so it becomes easier for the indexing process to use this information logically.
Common image file formats, such as image-only PDF and TIFF files, are usually the result of a document being scanned. Whenever a document is scanned, image quality is critical for the OCR process, which benefits from a scanner configured for black and white at 300 dpi. Anything above this is still usable; however, the file size grows quickly and the extra resolution is unnecessary. Anything below this is likely to be too poor in quality for character recognition to achieve a high success rate.
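To illustrate how quickly file size grows with scanner settings, here is a rough Python sketch that estimates the uncompressed size of a scanned A4 page. The function name `raw_scan_bytes` and the chosen settings are illustrative assumptions. Real TIFF and PDF files are compressed, so actual sizes will be smaller, but the proportions are a useful guide.

```python
MM_PER_INCH = 25.4


def raw_scan_bytes(width_mm, height_mm, dpi, bits_per_pixel):
    """Estimate the uncompressed size in bytes of a scanned page."""
    width_px = round(width_mm / MM_PER_INCH * dpi)
    height_px = round(height_mm / MM_PER_INCH * dpi)
    total_bits = width_px * height_px * bits_per_pixel
    return (total_bits + 7) // 8  # round up to whole bytes


A4_MM = (210, 297)

# Illustrative settings: black and white is 1 bit per pixel, colour is 24.
for label, dpi, bits in [
    ("300 dpi black and white", 300, 1),
    ("600 dpi black and white", 600, 1),
    ("300 dpi colour", 300, 24),
]:
    size = raw_scan_bytes(*A4_MM, dpi, bits)
    print(f"{label}: {size / 1_000_000:.1f} MB uncompressed")
```

Doubling the resolution roughly quadruples the pixel count, and scanning in colour multiplies the raw data by 24 compared with black and white.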
Word and Excel files both have text layers; however, they can cause technical issues. For example, if a file is created with a US date format and then opened on a server configured for UK dates, the day and month can be swapped when the file is processed.
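The date problem can be shown with a short Python sketch. The sample value and the helper names `parse_both` and `is_ambiguous` are hypothetical and not part of any particular product; the sketch simply reads the same text under both conventions.

```python
from datetime import date, datetime

# Month-first (US) and day-first (UK) interpretations of the same text.
FORMATS = {"US": "%m/%d/%Y", "UK": "%d/%m/%Y"}


def parse_both(raw):
    """Return every valid reading of a date string, keyed by convention."""
    readings = {}
    for label, fmt in FORMATS.items():
        try:
            readings[label] = datetime.strptime(raw, fmt).date()
        except ValueError:
            pass  # not a valid date under this convention
    return readings


def is_ambiguous(raw):
    """True when the two conventions give different valid dates."""
    return len(set(parse_both(raw).values())) > 1


for value in ("03/04/2024", "25/04/2024"):
    print(value, parse_both(value), "ambiguous:", is_ambiguous(value))
```

A value such as 03/04/2024 is valid under both conventions, so nothing in the file itself reveals which date the supplier meant. A value such as 25/04/2024 can only be day-first.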
Critically, the main implication of these file formats is their editability, which matters for admissibility and governance. For PNG and JPG image file formats, it comes down solely to the quality of the image they provide. They are normally far smaller in file size, and their dots per inch (DPI) are typically lower than TIFF. Additionally, they are commonly used in email signatures, so if you allow that file type, signature images will be imported into your invoice processing solution. This will lead to a higher number of documents needing to be manually reviewed and rejected.
The Extensible Markup Language (XML) is a structured data format. It doesn’t require OCR, as all the characters are known, and it is typically used as part of an EDI process.
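As a minimal sketch of why XML needs no OCR, the Python below reads invoice values directly from a structured document. The element names and sample values are invented for illustration; real EDI and e-invoicing schemas define their own structure.

```python
import xml.etree.ElementTree as ET
from decimal import Decimal

# Hypothetical invoice layout, for illustration only.
SAMPLE = """<Invoice>
  <InvoiceNumber>INV-1001</InvoiceNumber>
  <Supplier>Example Supplies Ltd</Supplier>
  <IssueDate>2024-04-03</IssueDate>
  <Total currency="GBP">1250.00</Total>
</Invoice>"""


def read_invoice(xml_text):
    """Extract header values from the hypothetical invoice layout above."""
    root = ET.fromstring(xml_text)
    total = root.find("Total")
    return {
        "number": root.findtext("InvoiceNumber"),
        "supplier": root.findtext("Supplier"),
        "issue_date": root.findtext("IssueDate"),
        "total": Decimal(total.text),  # Decimal avoids float rounding on money
        "currency": total.get("currency"),
    }


invoice = read_invoice(SAMPLE)
print(invoice)
```

Every value is read exactly as the sender wrote it, so there is no character recognition step that could misread a digit.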
The following file types are not recommended;
Legal Admissibility and Governance
When undertaking digital transformation, one of the fundamental requirements of compliance and audit is that the captured document must not be edited. Additionally, you must be able to prove it has not been tampered with. With MS Word and Excel documents, this is not possible.
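One common way to support a claim that a captured file has not changed is to record a cryptographic hash at the point of capture and compare it later. The Python sketch below is a generic illustration of that idea, not a description of how any specific invoice solution works. The function names are hypothetical, and it assumes the recorded digest is stored somewhere that cannot itself be altered.

```python
import hashlib
import hmac


def fingerprint(data: bytes) -> str:
    """Return the SHA-256 digest of the file contents as hex text."""
    return hashlib.sha256(data).hexdigest()


def is_unchanged(data: bytes, recorded_digest: str) -> bool:
    """Compare the current contents against the digest recorded at capture."""
    return hmac.compare_digest(fingerprint(data), recorded_digest)


# Placeholder bytes standing in for a received invoice file.
original = b"example invoice file contents"
recorded = fingerprint(original)  # stored at the point of capture

print(is_unchanged(original, recorded))
print(is_unchanged(original + b" edited", recorded))
```

A hash only shows that the file matches what was recorded at capture. It cannot show whether an editable document was changed before it reached you, which is why the format received still matters.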
From a tax perspective, the danger of a supplier sending an editable document is that the information on the invoice may no longer reflect what the supplier is submitting to the tax authority. The document could be edited with good intentions, but invoice documents should be final by the time they are managed.
There is also a concern that someone could edit the document internally. For example, if someone came across an invoice in Word format, they could replace the supplier’s bank details with their own bank account information. You should have certainty that the document you received is the document you intend to pay.
If you need help with your invoice automation processes, please get in touch or register for your own tailored demonstration of Mi Invoices.