Posted to user@hadoop.apache.org by "N. Ramasubramanian" <ra...@gmail.com> on 2015/02/16 12:33:36 UTC

Help needed on choosing OCR software

> 
> 
> Hi All,
> 
> Can you please help in choosing the right OCR software?
> 
> Requirements are:
> The scanned documents may contain anything, handwritten or typed, in the form of PDF, TIFF, or JPG/image files.
> These have to be converted to structured data. Most of these files will be scanned copies of doctors' prescriptions.
> 
> Please let me know if any more details are required to suggest the software.
> 
> They are looking to use some big data technologies to store these and convert them into structured data.
> 
> I have attached 3 sample images.
> 
> Thanks and Regards,
> Rams
> 
> 


Re: Help needed on choosing OCR software

Posted by anil gupta <an...@gmail.com>.
Hi Rams,

I don't think the HBase mailing list is the appropriate place to look for
OCR software. Please use an appropriate mailing list.

~Anil

On Mon, Feb 16, 2015 at 5:27 AM, hongbin ma <ma...@apache.org> wrote:

> I once came across this: https://code.google.com/p/tesseract-ocr/
> AFAIK, OCR requires training if you want to get high-quality
> recognition, and it's not easy to have a model that suits all styles
> of handwriting.


-- 
Thanks & Regards,
Anil Gupta

Re: Help needed on choosing OCR software

Posted by hongbin ma <ma...@apache.org>.
I once came across this: https://code.google.com/p/tesseract-ocr/
AFAIK, OCR requires training if you want to get high-quality recognition,
and it's not easy to have a model that suits all styles of handwriting.
