Deep Learning for Invoice Information Extraction

I am very new to the field of deep learning. Can you please help me with an idea for extracting information from invoices using deep learning?
I would like to use unsupervised learning with unlabeled data. For image/PDF-to-text extraction
I have used the Amazon Textract API.

Regards,
Santosh H

Tesseract OCR is one of the best OCR engines available; however, it does give some funky outputs depending on the input image.
A few basic OpenCV operations applied to the input images before OCR can help, such as negative thresholding (i.e. turning the background black and the text white), image straightening, etc.
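
To make the negative thresholding step concrete, here is a minimal pure-Python sketch of what the operation does to a grayscale image. The function name, the cutoff of 128, and the tiny sample "scan" are all made up for illustration. In OpenCV the same effect would normally come from cv2.threshold with the cv2.THRESH_BINARY_INV flag rather than a hand-written loop.

```python
def invert_threshold(gray, thresh=128):
    """Inverse binary threshold on a grayscale image given as rows of 0-255 ints.

    Pixels at or below `thresh` (dark ink) become 255 (white);
    brighter pixels (paper background) become 0 (black).
    The default cutoff of 128 is an assumption, not a tuned value.
    """
    return [[255 if px <= thresh else 0 for px in row] for row in gray]


# Hypothetical 3x4 "scan": light background (220) with dark text strokes (30).
page = [
    [220, 220, 220, 220],
    [220, 30, 30, 220],
    [220, 220, 30, 220],
]

binary = invert_threshold(page)
for row in binary:
    print(row)
```

On a real scan the cutoff usually has to be tuned per image, so treat the fixed value here as a starting point only.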

Hey buddy!
From what I have understood, I think you only need to extract information about a product (or service) whose name appears on the invoice, plus the details relating to it, maybe price and quantity.

I just want to say, you don't need deep learning for that. You can use a library (PDFMiner, in my opinion) to extract the text from the PDF and then pull the relevant information out of it. Doing this with deep learning would be a lot more complex and wouldn't be worth the effort and time.
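
As a rough sketch of the "extract text, then use it" idea, the snippet below pulls line items out of plain invoice text with a regular expression. It assumes the text has already been extracted from the PDF (for example with extract_text from pdfminer.high_level in pdfminer.six) and that every item line looks like "description quantity price". That layout, the sample text, and the names LINE_ITEM and parse_line_items are hypothetical, so a real invoice will need its own pattern.

```python
import re

# Assumed line layout (hypothetical): "<description> <quantity> <unit price>"
LINE_ITEM = re.compile(r"^(?P<name>.+?)\s+(?P<qty>\d+)\s+(?P<price>\d+\.\d{2})$")


def parse_line_items(text):
    """Return one dict per line that matches the assumed item layout."""
    items = []
    for line in text.splitlines():
        match = LINE_ITEM.match(line.strip())
        if match:
            items.append({
                "name": match.group("name"),
                "quantity": int(match.group("qty")),
                "unit_price": float(match.group("price")),
            })
    return items


# Made-up sample standing in for text extracted from a PDF invoice.
sample_text = """Invoice No: 1001
Description Qty Price
USB cable 2 4.50
Wireless mouse 1 19.99
Total 28.99
"""

items = parse_line_items(sample_text)
for item in items:
    print(item)
```

The header and total lines are skipped only because they do not fit the pattern, so this approach works best when all invoices come from the same template.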

If you need any help, do reply and I'll try my level best to get things moving for you.
All the best!

© Copyright 2013-2019 Analytics Vidhya