Optica Publishing Group

Multispectral image fusion based pedestrian detection using a multilayer fused deconvolutional single-shot detector


Abstract

Recent research has demonstrated that effective fusion of multispectral images (visible and thermal images) enables robust pedestrian detection under various illumination conditions (e.g., daytime and nighttime). However, open problems remain, such as poor performance on small-sized pedestrians and the high computational cost of multispectral information fusion. This paper proposes a multilayer fused deconvolutional single-shot detector that contains a two-stream convolutional module (TCM) and a multilayer fused deconvolutional module (MFDM). The TCM extracts convolutional features from the multispectral input images. Fusion blocks are then incorporated into the MFDM to combine high-level features carrying rich semantic information with low-level features carrying detailed information, producing features with a strong representational power for small pedestrian instances. In addition, we fuse multispectral information at multiple deconvolutional layers in the MFDM via fusion blocks. This multilayer fusion strategy adaptively makes the most of visible and thermal information. Moreover, using fusion blocks for multilayer fusion reduces the extra computational cost and redundant parameters. Experiments show that the proposed approach achieves an 81.82% average precision (AP) on a new small-sized multispectral pedestrian dataset. The proposed method achieves the best performance on two well-known public multispectral datasets. On the KAIST multispectral pedestrian benchmark, for example, our method achieves a 97.36% AP and a 20 fps detection speed, outperforming the best previously published method by 6.82% in AP and running three times faster.
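To illustrate the idea of a fusion block described above, here is a minimal NumPy sketch. It is not the paper's implementation; it only assumes the common pattern of concatenating the visible and thermal feature maps along the channel axis and fusing them with a 1x1 convolution (a per-pixel linear projection), which keeps the extra parameter count low. The function name `fusion_block` and the shapes are illustrative assumptions.

```python
import numpy as np

def fusion_block(feat_visible, feat_thermal, weights):
    """Hypothetical fusion block sketch.

    Concatenates the visible and thermal feature maps along the channel
    axis, then applies a 1x1 convolution (per-pixel linear projection)
    to fuse them back down to C output channels.

    feat_visible, feat_thermal: arrays of shape (C, H, W)
    weights: array of shape (C, 2*C), the 1x1 conv kernel
    """
    stacked = np.concatenate([feat_visible, feat_thermal], axis=0)  # (2C, H, W)
    # A 1x1 convolution is a matrix multiply over channels at each pixel.
    fused = np.einsum('oc,chw->ohw', weights, stacked)              # (C, H, W)
    return fused

# Toy example: C = 4 channels, 8x8 spatial feature maps.
rng = np.random.default_rng(0)
C, H, W = 4, 8, 8
vis = rng.standard_normal((C, H, W))
thr = rng.standard_normal((C, H, W))
w = rng.standard_normal((C, 2 * C)) / np.sqrt(2 * C)

out = fusion_block(vis, thr, w)
print(out.shape)  # (4, 8, 8)
```

Because the projection weights are learned, such a block can adaptively weight the two modalities, e.g., leaning on thermal features at night and visible features in daylight, which is the behavior the multilayer fusion strategy aims for.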

© 2020 Optical Society of America

More Like This
Exploiting fusion architectures for multispectral pedestrian detection and segmentation

Dayan Guan, Yanpeng Cao, Jiangxin Yang, Yanlong Cao, and Christel-Loic Tisse
Appl. Opt. 57(18) D108-D116 (2018)

Multiscale feature pyramid network based on activity level weight selection for infrared and visible image fusion

Rui Xu, Gang Liu, Yuning Xie, Bavirisetti Durga Prasad, Yao Qian, and Mengliang Xing
J. Opt. Soc. Am. A 39(12) 2193-2204 (2022)

Accurate stacked-sheet counting method based on deep learning

Dieuthuy Pham, Minhtuan Ha, Cao San, and Changyan Xiao
J. Opt. Soc. Am. A 37(7) 1206-1218 (2020)

Supplementary Material (1)

Visualization 1: This video shows a result demo of our method.


Figures (12)


Tables (4)


Equations (6)

