Adopting attention and cross-layer features for fine-grained representation


Saved in:
Bibliographic Details
Main Authors: Sun, Fayou, Ngo, Hea Choon, Sek, Yong Wee
Format: Article
Language: en
Published: Institute of Electrical and Electronics Engineers Inc. 2022
Online Access:http://eprints.utem.edu.my/id/eprint/26209/2/ADOPTING_ATTENTION_AND_CROSS-LAYER_FEATURES_FOR_FINE-GRAINED_REPRESENTATION.PDF
http://eprints.utem.edu.my/id/eprint/26209/
https://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=9847252
Description
Summary: Fine-grained visual classification (FGVC) is a challenging task because it requires discriminative feature representations. Attention-based methods show great potential for FGVC, but they neglect that deeply mining inter-layer feature relations can help refine feature learning. Similarly, methods that associate cross-layer features achieve significant feature enhancement, but they lose the long-distance dependencies between elements. Most previous research treats these two approaches as independent of each other, neglecting that they are mutually complementary in reinforcing feature learning. Thus, we combine the respective advantages of the two methods to promote fine-grained feature representations. In this paper, we propose a novel network, CLNET, which effectively applies an attention mechanism and cross-layer features to obtain feature representations. Specifically, CLNET consists of 1) self-attention to capture long-range dependencies for each element, 2) cross-layer feature association to reinforce feature learning, and 3) attention-based operations between output and input to cover more feature regions. Experiments verify that CLNET yields new state-of-the-art performance on three widely used fine-grained benchmarks: CUB-200-2011, Stanford Cars, and FGVC-Aircraft. Our code is available at https://github.com/dlearing/CLNET.git.
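The abstract's first component, self-attention for capturing long-range dependencies, can be illustrated with a minimal sketch of scaled dot-product self-attention in plain Python. This is a generic illustration of the mechanism, not the authors' CLNET implementation; for simplicity the queries, keys, and values are all taken to be the input features themselves, whereas a real layer would use learned projections.

```python
import math

def softmax(xs):
    # numerically stable softmax over a list of scores
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def self_attention(X):
    """Scaled dot-product self-attention over the rows of X (n x d).

    Every output element is a weighted sum over ALL input elements,
    which is how self-attention captures long-range dependencies.
    Here Q = K = V = X; learned projections are omitted for clarity.
    """
    n, d = len(X), len(X[0])
    scale = math.sqrt(d)
    out = []
    for i in range(n):
        # similarity of element i to every element j, scaled by sqrt(d)
        scores = [sum(X[i][k] * X[j][k] for k in range(d)) / scale
                  for j in range(n)]
        weights = softmax(scores)
        # convex combination of all elements, weighted by attention
        out.append([sum(w * X[j][k] for j, w in enumerate(weights))
                    for k in range(d)])
    return out

# three 2-dimensional feature vectors (toy values)
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
attended = self_attention(features)
```

Because each output row is a softmax-weighted average of every input row, even distant elements contribute to each refined feature, which is the property the abstract contrasts with purely local cross-layer fusion.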