Research Article

Joint Detection and Removal of Specular Highlights using Vision Transformer with Multi-scale Patch Attention

Year 2025, Volume: 8 Issue: 1, 47 - 57, 28.03.2025
https://doi.org/10.35377/saucis...1517723

Abstract

Specular highlights play a pivotal role in scene understanding within visual environments. Nevertheless, their presence can degrade the performance of various computer vision tasks. Current methods typically use Convolutional Neural Network (CNN)-based U-Net architectures for specular highlight detection. However, although CNNs excel at local context analysis, they are limited in capturing global contextual information. To exploit global context, a novel network architecture is proposed that leverages Vision Transformers (ViTs) to jointly detect and remove specular highlights in a given image. The developed model incorporates a multi-scale patch-based self-attention mechanism to capture global context, alongside a CNN-based feed-forward network for local contextual cues. Quantitative and qualitative experimental results demonstrate that the proposed approach achieves state-of-the-art performance.
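
The idea of multi-scale patch-based self-attention can be illustrated with a minimal NumPy sketch. This is not the paper's implementation: the function names, the patch sizes, the use of identity query/key/value projections (no learned weights), and the averaging of the per-scale outputs are all assumptions made for illustration only.

```python
import numpy as np

def patchify(x, p):
    """Split an (H, W, C) array into non-overlapping p x p patches, one flat token per patch."""
    h, w, c = x.shape
    assert h % p == 0 and w % p == 0, "H and W must be divisible by the patch size"
    x = x.reshape(h // p, p, w // p, p, c).transpose(0, 2, 1, 3, 4)
    return x.reshape((h // p) * (w // p), p * p * c)

def unpatchify(tokens, p, h, w, c):
    """Inverse of patchify: rebuild the (H, W, C) array from patch tokens."""
    x = tokens.reshape(h // p, w // p, p, p, c).transpose(0, 2, 1, 3, 4)
    return x.reshape(h, w, c)

def softmax(z, axis=-1):
    """Numerically stable softmax."""
    e = np.exp(z - z.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def patch_self_attention(x, p):
    """Single-head self-attention over p x p patch tokens.

    Assumption: queries, keys and values are the raw tokens (identity projections);
    a trained model would use learned linear projections instead.
    """
    h, w, c = x.shape
    t = patchify(x, p)                        # (N, D) patch tokens
    scores = t @ t.T / np.sqrt(t.shape[1])    # (N, N) scaled dot-product similarities
    out = softmax(scores, axis=-1) @ t        # every patch attends to all patches
    return unpatchify(out, p, h, w, c)

def multi_scale_patch_attention(x, patch_sizes=(2, 4, 8)):
    """Run patch attention at several patch sizes and average the results (hypothetical fusion)."""
    outs = [patch_self_attention(x, p) for p in patch_sizes]
    return np.mean(outs, axis=0)

if __name__ == "__main__":
    feat = np.random.default_rng(0).standard_normal((16, 16, 3))
    print(multi_scale_patch_attention(feat).shape)
```

Because every patch token attends to every other token, each output location can draw on the whole image, which is the global-context property the abstract contrasts with the local receptive fields of CNNs. Varying the patch size changes the granularity at which that global comparison is made.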

References

  • S. Jiddi, P. Robert, and E. Marchand, “Detecting specular reflections and cast shadows to estimate reflectance and illumination of dynamic indoor scenes,” IEEE Trans. Vis. Comput. Graph., vol. 28, no. 2, pp. 1249–1260, 2020.
  • S. A. Shafer, “Using color to separate reflection components,” Color Res. Appl., vol. 10, no. 4, pp. 210–218, 1985.
  • L. T. Maloney and B. A. Wandell, “Color constancy: a method for recovering surface spectral reflectance,” in Readings in Computer Vision, Elsevier, 1987, pp. 293–297.
  • Osadchy and Ramamoorthi, “Using specularities for recognition,” in IEEE ICCV, IEEE, 2003, pp. 1512–1519.
  • J. B. Park and A. C. Kak, “A truncated least squares approach to detecting specular highlights in color images,” in IEEE ICRA, IEEE, 2003, pp. 1397–1403.
  • O. El Meslouhi, M. Kardouchi, H. Allali, T. Gadi, and Y. A. Benkaddour, “Automatic detection and inpainting of specular reflections for colposcopic images,” Cent. Eur. J. Comput. Sci., vol. 1, pp. 341–354, 2011.
  • R. Li, J. Pan, Y. Si, B. Yan, Y. Hu, and H. Qin, “Specular reflections removal for endoscopic image sequences with adaptive-RPCA decomposition,” IEEE Trans. Med. Imaging, vol. 39, no. 2, pp. 328–340, 2019.
  • W. Zhang, X. Zhao, J.-M. Morvan, and L. Chen, “Improving shadow suppression for illumination robust face recognition,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 41, no. 3, pp. 611–624, 2018.
  • Q. Yang, S. Wang, and N. Ahuja, “Real-time specular highlight removal using bilateral filtering,” in Computer Vision–ECCV 2010: 11th European Conference on Computer Vision, Heraklion, Crete, Greece, September 5-11, 2010, Proceedings, Part IV 11, Springer, 2010, pp. 87–100.
  • H. Kim, H. Jin, S. Hadap, and I. Kweon, “Specular reflection separation using dark channel prior,” in IEEE CVPR, 2013, pp. 1460–1467.
  • Q. Yang, J. Tang, and N. Ahuja, “Efficient and robust specular highlight removal,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 37, no. 6, pp. 1304–1311, 2014.
  • Y. Liu, Z. Yuan, N. Zheng, and Y. Wu, “Saturation-preserving specular reflection separation,” in Proceedings of the IEEE conference on computer vision and pattern recognition, 2015, pp. 3725–3733.
  • J. Suo, D. An, X. Ji, H. Wang, and Q. Dai, “Fast and high quality highlight removal from a single image,” IEEE Trans. Image Process., vol. 25, no. 11, pp. 5441–5454, 2016.
  • T. Yamamoto, T. Kitajima, and R. Kawauchi, “Efficient improvement method for separation of reflection components based on an energy function,” in 2017 IEEE international conference on image processing (ICIP), IEEE, 2017, pp. 4222–4226.
  • I. Funke, S. Bodenstedt, C. Riediger, J. Weitz, and S. Speidel, “Generative adversarial networks for specular highlight removal in endoscopic images,” in Medical Imaging 2018: Image-Guided Procedures, Robotic Interventions, and Modeling, SPIE, 2018, pp. 8–16.
  • S. Muhammad, M. N. Dailey, M. Farooq, M. F. Majeed, and M. Ekpanyapong, “Spec-Net and Spec-CGAN: Deep learning models for specularity removal from faces,” Image Vis. Comput., vol. 93, p. 103823, 2020.
  • G. Fu, Q. Zhang, Q. Lin, L. Zhu, and C. Xiao, “Learning to Detect Specular Highlights from Real-world Images,” in ACM Multimedia, 2020, pp. 1873–1881.
  • G. Fu, Q. Zhang, L. Zhu, P. Li, and C. Xiao, “A multi-task network for joint specular highlight detection and removal,” in IEEE/CVF CVPR, 2021, pp. 7752–7761.
  • Z. Wu, C. Zhuang, J. Shi, J. Xiao, and J. Guo, “Deep specular highlight removal for single real-world image,” in SIGGRAPH Asia 2020 Posters, 2020, pp. 1–2.
  • G. Fu, Q. Zhang, L. Zhu, C. Xiao, and P. Li, “Towards High-Quality Specular Highlight Removal by Leveraging Large-Scale Synthetic Data,” in IEEE/CVF ICCV, 2023, pp. 12857–12865.
  • Z. Wu, J. Guo, C. Zhuang, J. Xiao, D.-M. Yan, and X. Zhang, “Joint specular highlight detection and removal in single images via Unet-Transformer,” Comput. Vis. Media, vol. 9, no. 1, pp. 141–154, 2023.
  • J. Shi, Y. Dong, H. Su, and S. X. Yu, “Learning non-lambertian object intrinsics across shapenet categories,” in IEEE CVPR, 2017, pp. 1685–1694.
  • Z. Liu et al., “Swin Transformer: Hierarchical Vision Transformer using Shifted Windows,” in 2021 IEEE/CVF International Conference on Computer Vision (ICCV), Montreal, QC, Canada: IEEE, Oct. 2021, pp. 9992–10002. doi: 10.1109/ICCV48922.2021.00986.
  • A. Dosovitskiy et al., “An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale,” in International Conference on Learning Representations, 2020.
  • Y. Li, K. Zhang, J. Cao, R. Timofte, and L. Van Gool, “Localvit: Bringing locality to vision transformers,” ArXiv Prepr. ArXiv210405707, 2021.
  • L. Karacan, “Multi-image transformer for multi-focus image fusion,” Signal Process. Image Commun., vol. 119, p. 117058, 2023.
  • J. Hu, L. Shen, and G. Sun, “Squeeze-and-excitation networks,” in IEEE CVPR, 2018, pp. 7132–7141.
  • C. H. Sudre, W. Li, T. Vercauteren, S. Ourselin, and M. Jorge Cardoso, “Generalised dice overlap as a deep learning loss function for highly unbalanced segmentations,” in Deep Learning in Medical Image Analysis and Multimodal Learning for Clinical Decision Support: Third International Workshop, DLMIA 2017, and 7th International Workshop, ML-CDS 2017, Held in Conjunction with MICCAI 2017, Québec City, QC, Canada, September 14, Proceedings 3, Springer, 2017, pp. 240–248.
  • H.-L. Shen, H.-G. Zhang, S.-J. Shao, and J. H. Xin, “Chromaticity-based separation of reflection components in a single image,” Pattern Recognit., vol. 41, no. 8, pp. 2461–2469, 2008.
  • J. Lin, M. El Amine Seddik, M. Tamaazousti, Y. Tamaazousti, and A. Bartoli, “Deep multi-class adversarial specularity removal,” in Image Analysis: 21st Scandinavian Conference, SCIA 2019, Norrköping, Sweden, June 11–13, 2019, Proceedings 21, Springer, 2019, pp. 3–15.
There are 30 citations in total.

Details

Primary Language English
Subjects Computer Software
Journal Section Research Article
Authors

Levent Karacan 0000-0003-2764-5258

Early Pub Date March 27, 2025
Publication Date March 28, 2025
Submission Date July 17, 2024
Acceptance Date February 22, 2025
Published in Issue Year 2025, Volume: 8 Issue: 1

Cite

APA Karacan, L. (2025). Joint Detection and Removal of Specular Highlights using Vision Transformer with Multi-scale Patch Attention. Sakarya University Journal of Computer and Information Sciences, 8(1), 47-57. https://doi.org/10.35377/saucis...1517723
AMA Karacan L. Joint Detection and Removal of Specular Highlights using Vision Transformer with Multi-scale Patch Attention. SAUCIS. March 2025;8(1):47-57. doi:10.35377/saucis.1517723
Chicago Karacan, Levent. “Joint Detection and Removal of Specular Highlights Using Vision Transformer With Multi-Scale Patch Attention”. Sakarya University Journal of Computer and Information Sciences 8, no. 1 (March 2025): 47-57. https://doi.org/10.35377/saucis. 1517723.
EndNote Karacan L (March 1, 2025) Joint Detection and Removal of Specular Highlights using Vision Transformer with Multi-scale Patch Attention. Sakarya University Journal of Computer and Information Sciences 8 1 47–57.
IEEE L. Karacan, “Joint Detection and Removal of Specular Highlights using Vision Transformer with Multi-scale Patch Attention”, SAUCIS, vol. 8, no. 1, pp. 47–57, 2025, doi: 10.35377/saucis...1517723.
ISNAD Karacan, Levent. “Joint Detection and Removal of Specular Highlights Using Vision Transformer With Multi-Scale Patch Attention”. Sakarya University Journal of Computer and Information Sciences 8/1 (March 2025), 47-57. https://doi.org/10.35377/saucis. 1517723.
JAMA Karacan L. Joint Detection and Removal of Specular Highlights using Vision Transformer with Multi-scale Patch Attention. SAUCIS. 2025;8:47–57.
MLA Karacan, Levent. “Joint Detection and Removal of Specular Highlights Using Vision Transformer With Multi-Scale Patch Attention”. Sakarya University Journal of Computer and Information Sciences, vol. 8, no. 1, 2025, pp. 47-57, doi:10.35377/saucis. 1517723.
Vancouver Karacan L. Joint Detection and Removal of Specular Highlights using Vision Transformer with Multi-scale Patch Attention. SAUCIS. 2025;8(1):47-57.


The papers in this journal are licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.