A Computational Approach for Recognizing Text in Digital and Natural Frames
by Mithun Dutta 1, Dhonita Tripura 1, and Jugal Krishna Das 2
1 Department of Computer Science and Engineering, Rangamati Science and Technology University, Rangamati-4500, Bangladesh
2 Department of Computer Science and Engineering, Jahangirnagar University, Savar, Dhaka-1342, Bangladesh
* Author to whom correspondence should be addressed.
Journal of Engineering Research and Sciences, Volume 3, Issue 7, Pages 53-58, 2024; DOI: 10.55708/js0307005
Keywords: Iterative, Component, Recognition, Bilingual
Received: 16 May, 2024, Revised: 02 July, 2024, Accepted: 09 July, 2024, Published Online: 23 July, 2024
APA Style
Dutta, M., Tripura, D., & Das, J. K. (2024). A flourished approach for recognizing text in digital and natural frames. Journal of Engineering Research and Sciences, 3(7), 53-58. https://doi.org/10.55708/js0307005
Chicago/Turabian Style
Dutta, Mithun, Dhonita Tripura, and Jugal Krishna Das. “A Flourished Approach for Recognizing Text in Digital and Natural Frames.” Journal of Engineering Research and Sciences 3, no. 7 (2024): 53-58. https://doi.org/10.55708/js0307005.
IEEE Style
M. Dutta, D. Tripura, and J. K. Das, “A Flourished Approach for Recognizing Text in Digital and Natural Frames,” Journal of Engineering Research and Sciences, vol. 3, no. 7, pp. 53-58, 2024, doi: 10.55708/js0307005.
Obtaining reliable text detection and recognition results for natural scene images and digital frames is a challenging task. Although text recognition for English has advanced significantly, applying these methods to languages such as Bengali remains difficult because of differences in script and morphology. In this work, text detection and recognition are carried out in several distinct steps. First, an image is captured by a device, and Connected Component Analysis (CCA) together with a Conditional Random Field (CRF) model is applied to localize text components. Second, a merged model combining a Mask Region-based Convolutional Neural Network (Mask R-CNN) with a Feature Pyramid Network (FPN) is used to detect text in the image and classify it into machine-readable form. Next, a combined method based on a Convolutional Recurrent Neural Network (CRNN) with Connectionist Temporal Classification (CTC) and the K-Nearest Neighbors (KNN) algorithm is introduced to extract text from images and frames. Since the goal of this research is to detect and recognize text with a machine learning-based model, a new Fast Iterative Nearest Neighbor (Fast INN) algorithm is also proposed, built on the patterns and shapes of text components. The work addresses a bilingual setting (Bengali and English) and yields satisfactory experimental results, achieving around 98% accuracy for the proposed text recognition methods, which improves on previous studies.
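To make the pipeline above concrete, the following is a minimal, illustrative sketch (not the authors' implementation, and not the proposed Fast INN algorithm) of two of the steps it describes: connected-component analysis to localize candidate text regions, and greedy CTC decoding of the per-timestep character probabilities a CRNN would emit. The size thresholds, the toy alphabet, and the synthetic inputs are assumptions for demonstration only.

```python
# Illustrative sketch only: CCA-based localization plus greedy CTC decoding.
# Thresholds, alphabet, and toy inputs are assumed values, not the paper's.
import numpy as np
import cv2


def localize_components(gray, min_area=20, max_area=5000):
    """Return bounding boxes of connected components likely to be text strokes."""
    # Binarize with Otsu's threshold, then label connected components.
    _, binary = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)
    n, _, stats, _ = cv2.connectedComponentsWithStats(binary, connectivity=8)
    boxes = []
    for i in range(1, n):  # label 0 is the background
        x, y, w, h, area = stats[i]
        if min_area <= area <= max_area:  # crude size filter for text-like blobs
            boxes.append((int(x), int(y), int(w), int(h)))
    return boxes


def ctc_greedy_decode(probs, alphabet, blank=0):
    """Collapse per-timestep class probabilities into a string (greedy CTC)."""
    best = probs.argmax(axis=1)           # most likely class at each timestep
    chars, prev = [], blank
    for idx in best:
        if idx != blank and idx != prev:  # drop blanks and repeated labels
            chars.append(alphabet[idx - 1])
        prev = idx
    return "".join(chars)


if __name__ == "__main__":
    # Toy image: white background with a dark rectangle standing in for a glyph.
    img = np.full((60, 120), 255, dtype=np.uint8)
    cv2.rectangle(img, (20, 15), (45, 45), 0, -1)
    print("candidate boxes:", localize_components(img))

    # Toy CRNN output: 6 timesteps over blank + 3 characters ("c", "a", "t").
    alphabet = "cat"
    probs = np.array([
        [0.1, 0.8, 0.05, 0.05],   # 'c'
        [0.1, 0.8, 0.05, 0.05],   # 'c' repeated, collapsed by CTC
        [0.9, 0.03, 0.03, 0.04],  # blank
        [0.1, 0.05, 0.8, 0.05],   # 'a'
        [0.9, 0.03, 0.03, 0.04],  # blank
        [0.1, 0.05, 0.05, 0.8],   # 't'
    ])
    print("decoded text:", ctc_greedy_decode(probs, alphabet))  # -> "cat"
```

In a full system of the kind described above, the localized components would be passed to the detection and recognition networks, and the decoded strings from the CRNN/CTC stage would feed the KNN or Fast INN classification of bilingual (Bengali and English) text components.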
- H. Li, D. Doermann, and O. Kia, “Automatic Text Detection and Tracking in Digital Video,” IEEE Transactions on Image Processing, vol. 9, no. 1, pp. 147-156, Jan. 2000, doi: 10.1109/83.817607.
- D. Chen, J.-M. Odobez, and H. Bourlard, “Text detection and recognition in images and video frames,” Pattern Recognition, vol. 37, pp. 595-608, 2004, doi: 10.1016/j.patcog.2003.06.001.
- P. Shivakumara, T. Q. Phan, and C. L. Tan, “A Laplacian Approach to Multi-Oriented Text Detection in Video,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 33, no. 2, pp. 412-420, Feb. 2011, doi: 10.1109/TPAMI.2010.166.
- M. R. Lyu, J. Song, and M. Cai, “A Comprehensive Method for Multilingual Video Text Detection, Localization, and Extraction,” IEEE Transactions on Circuits and Systems for Video Technology, vol. 15, no. 2, pp. 243-255, Feb. 2005, doi: 10.1109/TCSVT.2004.841653.
- X.-C. Yin, Z.-Y. Zuo, S. Tian, and C.-L. Liu, “Text Detection, Tracking and Recognition in Video: A Comprehensive Survey,” IEEE Transactions on Image Processing, pp. 1-24, 2015, doi: 10.1109/TIP.2016.2554321.
- Q. Ye, Q. Huang, W. Gao, and D. Zhao, “Fast and robust text detection in images and video frames,” Image and Vision Computing, vol. 23, pp. 565-576, 2005, doi: 10.1016/j.imavis.2005.01.004.
- K. S. Raghunandan, P. Shivakumara, S. Roy, G. H. Kumar, U. Pal, and T. Lu, “Multi-Script-Oriented Text Detection and Recognition in Video/Scene/Born Digital Images,” IEEE Transactions on Circuits and Systems for Video Technology, 2018, doi: 10.1109/TCSVT.2018.2817642.
- M. Cai, J. Song, and M. R. Lyu, “A New Approach for Video Text Detection,” in Proceedings of the International Conference on Image Processing, 2002, pp. 117-120, doi: 10.1109/ICIP.2002.1037973.
- C. S. Shin, K. I. Kim, M. H. Park, and H. J. Kim, “Support Vector Machine-Based Text Detection in Digital Video,” in Proceedings of IEEE ICIP, 2000, pp. 634-641.
- H. Wang, S. Huang, and L. Jin, “Focus On Scene Text Using Deep Reinforcement Learning,” in Proceedings of the 24th International Conference on Pattern Recognition (ICPR), Beijing, China, Aug. 2018, pp. 3759-3766, doi: 10.1109/ICPR.2018.8545022.
- Wang, “Extraction Algorithm of English Text Information from Color Images Based on Radial Wavelet Transform,” IEEE Access (Special Section on Gigapixel Panoramic Video with Virtual Reality), Aug. 2020, doi: 10.1109/ACCESS.2020.3020621.
- Y. Ling, L. B. Theng, A. Chai, and C. McCarthy, “A Model for Automatic Recognition of Vertical Text in Natural Scene Images,” in 2018 8th IEEE International Conference on Control System, Computing and Engineering (ICCSCE), Penang, Malaysia, 2018, doi: 10.1109/ICCSCE.2018.8685019.
- M. Dutta, A. Mohajon, S. Dev, D. S. Bappi, and J. K. Das, “Text Recognition of Bangla and English Scripts in Natural Scene Images,” International Journal of Advanced Research in Science and Technology, vol. 12, no. 10, pp. 1137-1142, 2023.