DataXLab.org

DoT-Net

[Figure: DoT-Net architecture]

DoT-Net (Document Layout Classification Using Texture-based CNN) classifies document blocks into multiple classes simultaneously. Our main contributions are: (1) replacing all convolutional layers with dilated convolutional layers for texture-based analysis, (2) extracting features automatically with a deep learning model rather than relying on explicitly predefined features, and (3) extending the task to multiclass classification, whereas previous methods have typically focused on binary classification of text vs. non-text.
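To make the first contribution more concrete, the following is a minimal pure-Python sketch of what a dilated convolution computes. It is not the DoT-Net implementation: the function names, the 3x3 all-ones kernel, the 7x7 input patch, and the dilation rate of 2 are all assumptions chosen for illustration. A dilated kernel samples the input at spaced-out positions, so it covers a wider area of the block with the same number of weights.

```python
def effective_kernel_size(kernel_size, dilation):
    """Span covered by a dilated kernel along one axis."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def dilated_conv2d(image, kernel, dilation=1):
    """Valid-mode 2D dilated convolution (cross-correlation, as used in CNNs).

    image and kernel are lists of lists of numbers; stride 1, no padding.
    """
    kh, kw = len(kernel), len(kernel[0])
    span_h = effective_kernel_size(kh, dilation)
    span_w = effective_kernel_size(kw, dilation)
    out_h = len(image) - span_h + 1
    out_w = len(image[0]) - span_w + 1
    if out_h <= 0 or out_w <= 0:
        raise ValueError("image is smaller than the dilated kernel span")

    output = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            total = 0
            for a in range(kh):
                for b in range(kw):
                    # Kernel taps are spaced `dilation` pixels apart.
                    total += image[i + a * dilation][j + b * dilation] * kernel[a][b]
            row.append(total)
        output.append(row)
    return output


# Hypothetical 7x7 patch with pixel value r * 7 + c (illustration only).
patch = [[r * 7 + c for c in range(7)] for r in range(7)]
ones = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]

dense = dilated_conv2d(patch, ones, dilation=1)    # 3x3 span, 5x5 output
dilated = dilated_conv2d(patch, ones, dilation=2)  # 5x5 span, 3x3 output
```

With a dilation of 2, the same nine weights span a 5x5 region instead of 3x3. This wider view at no extra parameter cost is the usual motivation for dilated layers in texture analysis.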

This study was published in ICDAR 2019:
S. Kosaraju, M. Masum, N. Tsaku, P. Patel, T. Bayramoglu, G. Modgil, and M. Kang, "DoT-Net: Document Layout Classification Using Texture-based CNN", The 15th International Conference on Document Analysis and Recognition (ICDAR), 2019.


Citation

@INPROCEEDINGS{8977986,
  author    = {Kosaraju, Sai Chandra and Masum, Mohammed and Tsaku, Nelson Zange and Patel, Pritesh
               and Bayramoglu, Tanju and Modgil, Girish and Kang, Mingon},
  booktitle = {2019 International Conference on Document Analysis and Recognition (ICDAR)},
  title     = {DoT-Net: Document Layout Classification Using Texture-Based CNN},
  year      = {2019},
  pages     = {1029--1034},
}

Source code: