Joint Detection and Removal of Specular Highlights using Vision Transformer with Multi-scale Patch Attention

Levent Karacan

doi:10.35377/saucis...1517723

Research Article

Year 2025, Volume: 8 Issue: 1, 47 - 57, 28.03.2025




Abstract

Specular highlights play a pivotal role in understanding scenes within a visual environment. Nevertheless, their presence can degrade the performance of solutions to various computer vision tasks. Current methods typically use Convolutional Neural Network (CNN)-based U-Net architectures for specular highlight detection. However, although CNNs excel at local context analysis, they are limited in capturing global contextual information. To exploit global context, this paper proposes a novel network architecture that leverages Vision Transformers (ViTs) to jointly detect and remove specular highlights in a given image. The proposed model incorporates a multi-scale patch-based self-attention mechanism to capture global context effectively, alongside a CNN-based feed-forward network for local contextual cues. Quantitative and qualitative evaluations demonstrate that the proposed approach achieves state-of-the-art performance.
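
The idea of patch-based self-attention at several scales can be illustrated with a minimal NumPy sketch. It is not the paper's architecture, whose details are not given on this page. It assumes untrained single-head attention with Q = K = V (no learned projections), the hypothetical patch sizes 2, 4 and 8, and simple averaging across scales followed by a residual connection. The CNN-based feed-forward network is omitted. All function names are illustrative.

```python
import numpy as np

def patchify(feat, p):
    """Split an (H, W, C) feature map into non-overlapping p x p patch tokens."""
    H, W, C = feat.shape
    assert H % p == 0 and W % p == 0, "patch size must divide H and W"
    x = feat.reshape(H // p, p, W // p, p, C).transpose(0, 2, 1, 3, 4)
    return x.reshape((H // p) * (W // p), p * p * C)

def unpatchify(tokens, p, H, W, C):
    """Inverse of patchify: rebuild the (H, W, C) map from patch tokens."""
    x = tokens.reshape(H // p, W // p, p, p, C).transpose(0, 2, 1, 3, 4)
    return x.reshape(H, W, C)

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def patch_self_attention(feat, p):
    """Single-head self-attention in which every token is one p x p patch.
    Assumption: Q = K = V = patch tokens (no learned weights)."""
    H, W, C = feat.shape
    t = patchify(feat, p)                          # (N, D)
    attn = softmax(t @ t.T / np.sqrt(t.shape[1]))  # (N, N), each row sums to 1
    return unpatchify(attn @ t, p, H, W, C)

def multi_scale_patch_attention(feat, patch_sizes=(2, 4, 8)):
    """Average the attention outputs over several patch sizes (assumed fusion rule),
    then add the input back as a residual."""
    out = sum(patch_self_attention(feat, p) for p in patch_sizes) / len(patch_sizes)
    return feat + out

rng = np.random.default_rng(0)
feat = rng.standard_normal((16, 16, 8))
out = multi_scale_patch_attention(feat)
print(out.shape)
```

Because every patch attends to every other patch, each output location can draw on the whole image, which is the global context the abstract refers to. Small patches keep fine detail, while large patches compare coarser regions.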

Keywords

Specular highlight detection, Specular highlight removal, Vision transformers, Convolutional neural networks


Details

Primary Language	English
Subjects	Computer Software
Journal Section	Research Article
Authors	Levent Karacan 0000-0003-2764-5258
Early Pub Date	March 27, 2025
Publication Date	March 28, 2025
Submission Date	July 17, 2024
Acceptance Date	February 22, 2025
Published in Issue	Year 2025 Volume: 8 Issue: 1

Cite

APA	Karacan, L. (2025). Joint Detection and Removal of Specular Highlights using Vision Transformer with Multi-scale Patch Attention. Sakarya University Journal of Computer and Information Sciences, 8(1), 47-57. https://doi.org/10.35377/saucis...1517723
AMA	Karacan L. Joint Detection and Removal of Specular Highlights using Vision Transformer with Multi-scale Patch Attention. SAUCIS. March 2025;8(1):47-57. doi:10.35377/saucis.1517723
Chicago	Karacan, Levent. “Joint Detection and Removal of Specular Highlights Using Vision Transformer With Multi-Scale Patch Attention”. Sakarya University Journal of Computer and Information Sciences 8, no. 1 (March 2025): 47-57. https://doi.org/10.35377/saucis.1517723.
EndNote	Karacan L (March 1, 2025) Joint Detection and Removal of Specular Highlights using Vision Transformer with Multi-scale Patch Attention. Sakarya University Journal of Computer and Information Sciences 8 1 47–57.
IEEE	L. Karacan, “Joint Detection and Removal of Specular Highlights using Vision Transformer with Multi-scale Patch Attention”, SAUCIS, vol. 8, no. 1, pp. 47–57, 2025, doi: 10.35377/saucis...1517723.
ISNAD	Karacan, Levent. “Joint Detection and Removal of Specular Highlights Using Vision Transformer With Multi-Scale Patch Attention”. Sakarya University Journal of Computer and Information Sciences 8/1 (March 2025), 47-57. https://doi.org/10.35377/saucis.1517723.
JAMA	Karacan L. Joint Detection and Removal of Specular Highlights using Vision Transformer with Multi-scale Patch Attention. SAUCIS. 2025;8:47–57.
MLA	Karacan, Levent. “Joint Detection and Removal of Specular Highlights Using Vision Transformer With Multi-Scale Patch Attention”. Sakarya University Journal of Computer and Information Sciences, vol. 8, no. 1, 2025, pp. 47-57, doi:10.35377/saucis.1517723.
Vancouver	Karacan L. Joint Detection and Removal of Specular Highlights using Vision Transformer with Multi-scale Patch Attention. SAUCIS. 2025;8(1):47-57.


The papers in this journal are licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.