ocr #1

Open
aidinkrmz wants to merge 67 commits into theraysmith:master from 0xqq:master

Conversation

@aidinkrmz

Hello,
I'm a software engineering student, and I use the Tesseract OCR engine in a university project. For Persian, the traineddata file produced by training Tesseract 4.00 with the LSTM method gives good results for text in Arial, but poor results for some specific Persian fonts. So my questions are:
1- Did you use specific fonts such as B Nazanin or B Roya when training Tesseract 4.00 with LSTM?
2- If these fonts were not used, how can we use them to get better results?
I prepared a text in which every letter form is repeated 10, 15, or more times. I also ran the Tesseract 3.05 training process on this text, but the output did not improve.
To achieve good results for Persian with the Tesseract OCR engine, we need your experience and your help.
Thank you for your attention.
Sincerely.

Ce Ge and others added 30 commits May 15, 2016 15:13
Corrected "One this" to "Once this" and added comma for proper punctuation on line 702.
Revert "Use open() instead of tf.gfile.FastGFile()"
Fix broken link in inception readme
added python3 support to read_label_file
For English News Corpus,
[Ling et al. (2015)](http://www.cs.cmu.edu/~lingwang/papers/emnlp2015.pdf)'s score is 
97.78 -> 97.44 (lower than SyntaxNet and Parsey Mcparseface)
according to [Andor et al. (2016)](http://arxiv.org/abs/1603.06042).
Fix POS tagging score of Ling et al.(2005)
"threads" declared twice, so delete one
Add Inception-ResNet-v2 pre-trained model
Fix comment of parameter "output_codes"
Fix end point collection to return a dict
panyx0718 and others added 30 commits October 27, 2016 21:37
Explicitly set state_is_tuple=False.
Differential privacy analysis for the privacy model tutorial
Added STREET model for FSNS dataset
Consolidate privacy/ and differential_privacy/.
Now differential_privacy and privacy are
under the same project.
Remove privacy/ after consolidation.
val_captions_file -> captions_val2014.json
Remove comment that TensorFlow must be built from source.
Update compression model README with results for comparison.
Adding list of maintainers
Changing model links to point to tensorflow/models repository.