Thesis Information

Chinese title:

 视频动态目标消除方法研究

Name:

 王大任

Student ID:

 21208223056

Confidentiality level:

 Public

Thesis language:

 Chinese (chi)

Discipline code:

 085400

Discipline name:

 Engineering - Electronic Information

Student type:

 Master's

Degree level:

 Master of Engineering

Degree year:

 2021

Degree-granting institution:

 西安科技大学 (Xi'an University of Science and Technology)

School/College:

 College of Computer Science and Technology

Major:

 Software Engineering

Research direction:

 Computer Vision

First supervisor:

 李占利

First supervisor's institution:

 西安科技大学 (Xi'an University of Science and Technology)

Second supervisor:

 牟琦

Thesis submission date:

 2024-12-12

Thesis defense date:

 2024-12-03

Foreign-language title:

 Research on Video Dynamic Object Elimination Method

Chinese keywords:

 动态目标定位; 视频分割; 视频空洞修复; 共享动态掩码; 多尺度信息聚合

Foreign-language keywords:

 Dynamic target localization; video segmentation; video hole repair; shared dynamic masks; multi-scale information aggregation

Chinese abstract:

The task of video dynamic object elimination first localizes the dynamic objects and then repairs the target regions. Localization requires segmenting the dynamic objects and detecting their motion; region repair fills the target regions with the static background, yielding complete frames that contain only the static background. This technique has broad applications in video editing, augmented reality, 3D reconstruction, dynamic SLAM, and related fields.
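As a minimal, hypothetical illustration of this task (not the method proposed in the thesis): if per-frame dynamic-object masks are available, each masked pixel can be filled from the static background observed at the same location in other frames, for example by a per-pixel temporal median over the unmasked observations.

```python
import numpy as np

def eliminate_dynamic_objects(frames, masks):
    """Fill dynamic-object pixels with the per-pixel median of the
    static background seen in other frames.

    frames: (T, H, W) array of grayscale frames.
    masks:  (T, H, W) bool array, True where a dynamic object covers the pixel.
    Returns (T, H, W) frames containing only static background where possible.
    """
    frames = frames.astype(float)
    # Hide dynamic pixels, then take a per-pixel temporal median
    # over the remaining (background) observations.
    background_obs = np.where(masks, np.nan, frames)
    background = np.nanmedian(background_obs, axis=0)   # (H, W)
    # Replace each frame's masked pixels with the estimated background.
    return np.where(masks, background[None], frames)

# Toy example: a static horizontal ramp with a bright "object" moving along row 1.
T, H, W = 3, 4, 4
frames = np.tile(np.arange(W, dtype=float), (T, H, 1))
masks = np.zeros((T, H, W), dtype=bool)
for t in range(T):
    frames[t, 1, t] = 99.0   # dynamic object at column t in frame t
    masks[t, 1, t] = True
clean = eliminate_dynamic_objects(frames, masks)
```

This simple baseline fails exactly where the thesis methods are needed: when the camera moves, when the object occludes a region in every frame, or when masks must be produced automatically.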

The main research contents are as follows:

(1) Video dynamic object localization method. Existing video inpainting methods usually repair only the regions specified by a mask and cannot automatically localize and repair all dynamic-object regions in a video. This thesis proposes a video dynamic object localization method comprising object segmentation and motion detection, which automatically localizes dynamic objects and lays the foundation for the subsequent repair stage. To improve segmentation accuracy for potential dynamic objects, a feature pyramid and the PSA attention mechanism are introduced into the segmentation algorithm, aggregating long-range contextual information at multiple scales and thereby segmenting potential dynamic objects accurately. A Kalman filter is then used to detect each object's motion state, completing the localization of dynamic objects. Experiments and comparisons on the YouTube VOS 2019 and DAVIS 2017 datasets show that, compared with CompFeat and other methods, the proposed localization method accurately localizes dynamic objects in video, with improvements of 5.5%, 3.7%, and 7.9% in AP, AP50, and AP75 respectively, outperforming mainstream algorithms.
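The Kalman-filter motion check can be sketched as follows. This is a hypothetical simplification, not the thesis implementation: a constant-velocity Kalman filter is run over the centroids of a segmented object, and the track is labeled dynamic when the filtered speed exceeds a threshold (the noise settings and threshold are illustrative).

```python
import numpy as np

def is_dynamic(centroids, dt=1.0, speed_thresh=0.5):
    """Classify an object track as dynamic using a constant-velocity
    Kalman filter over its per-frame centroids (x, y).

    centroids: (T, 2) array of segmented-object centroids.
    Returns True if the filtered speed exceeds speed_thresh.
    """
    # State: [x, y, vx, vy]; constant-velocity transition model.
    F = np.array([[1, 0, dt, 0],
                  [0, 1, 0, dt],
                  [0, 0, 1, 0],
                  [0, 0, 0, 1]], dtype=float)
    H = np.array([[1, 0, 0, 0],
                  [0, 1, 0, 0]], dtype=float)
    Q = 1e-3 * np.eye(4)      # process noise (illustrative tuning)
    R = 1e-1 * np.eye(2)      # measurement noise (illustrative tuning)
    x = np.array([centroids[0, 0], centroids[0, 1], 0.0, 0.0])
    P = np.eye(4)
    for z in centroids[1:]:
        # Predict.
        x = F @ x
        P = F @ P @ F.T + Q
        # Update with the observed centroid.
        S = H @ P @ H.T + R
        K = P @ H.T @ np.linalg.inv(S)
        x = x + K @ (z - H @ x)
        P = (np.eye(4) - K @ H) @ P
    speed = float(np.hypot(x[2], x[3]))
    return speed > speed_thresh

# A track moving 2 px/frame along x, and a perfectly static track.
moving = np.column_stack([np.arange(10.0) * 2.0, np.zeros(10)])
static = np.zeros((10, 2))
```

Filtering the centroids rather than thresholding raw frame-to-frame displacement makes the decision robust to segmentation jitter, which is the usual reason for using a Kalman filter here.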

(2) Multi-scale information aggregation repair method. When handling videos with significant background changes, severe deformation of dynamic objects, or complex motion, current repair methods are prone to blurred textures and structural distortion. This thesis proposes a multi-scale information aggregation repair algorithm. The method adopts a generative adversarial network structure; its generator, built from region normalization and gated convolutions, effectively mitigates the mean and variance shift problem and improves repair accuracy. Using the multi-scale feature-extraction capability of an improved Inception module, convolution and feature aggregation are performed at multiple scales, keeping textures sharp when filling large holes. The decoder uses the Mish and ELU activation functions, reducing information loss and enhancing the network's generalization. Experimental results on the YouTube VOS 2019, DAVIS 2017, and Paris Street View datasets show that, compared with STTN and other algorithms, the proposed method produces sharp textures without obvious structural distortion when repairing large holes, with improvements of up to 14.08% in PSNR and 3.4% in SSIM.
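The mean-and-variance-shift issue addressed by region normalization (Yu et al., AAAI 2020, cited above) can be illustrated in a simplified single-channel form. This is a sketch of the idea, not the thesis generator: corrupted and valid regions are normalized with separate statistics, so hole pixels cannot drag the mean and variance of the valid features.

```python
import numpy as np

def region_normalize(feat, mask, eps=1e-5):
    """Normalize corrupted and valid regions separately.

    feat: (H, W) feature map; mask: (H, W) bool, True inside the hole.
    Full-map normalization lets hole pixels shift the statistics of the
    valid pixels; giving each region its own mean/variance avoids that.
    """
    out = np.empty_like(feat, dtype=float)
    for region in (mask, ~mask):
        if region.any():
            vals = feat[region]
            out[region] = (vals - vals.mean()) / np.sqrt(vals.var() + eps)
    return out

# Example: a feature map whose hole (zeroed) region would otherwise
# drag down the mean of the valid region.
feat = np.arange(16.0).reshape(4, 4)
mask = np.zeros((4, 4), dtype=bool)
mask[:2, :2] = True          # hole region
feat[mask] = 0.0             # corrupted pixels
out = region_normalize(feat, mask)
```

In the actual method this runs per channel inside the generator, paired with gated convolutions that learn a soft validity mask per feature.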

(3) Video dynamic object elimination system. A video dynamic object elimination system is designed and implemented. Combining the two methods proposed in this thesis, the system achieves accurate localization and effective elimination of dynamic objects, and its results can provide basic support for fields such as 3D reconstruction and dynamic SLAM. The system also provides a user-friendly visual interface for convenient operation and analysis.

Foreign-language abstract:

Video dynamic object elimination refers to automatically localizing dynamic objects in a video and repairing the target regions with the static background, thereby obtaining complete frames that contain only the static background. This technology has a wide range of applications in fields such as video editing, augmented reality, 3D reconstruction, and dynamic SLAM.

This paper conducts in-depth research on dynamic object elimination methods. The main research contents are as follows:

(1) Video dynamic object localization method. Existing video inpainting methods usually repair only mask-specified regions and cannot automatically locate all dynamic objects in a video. This paper proposes a video dynamic object localization method that integrates segmentation and motion detection, achieving automatic localization of dynamic objects and laying the foundation for subsequent repair. By introducing a feature pyramid and an attention mechanism, long-range contextual information is aggregated across scales, improving segmentation accuracy for irregular non-rigid objects and allowing potential dynamic objects to be segmented accurately. A Kalman filter is then used to detect the motion state of each object, achieving accurate localization of dynamic objects. Experiments and comparisons were conducted on the YouTube VOS 2019 and DAVIS 2017 datasets. Compared with CompFeat and other methods, the proposed localization method accurately locates dynamic objects in videos, with improvements of 5.5%, 3.7%, and 7.9% in AP, AP50, and AP75, respectively, outperforming mainstream algorithms.

(2) Multi-scale information aggregation repair method. Current repair methods are prone to texture blur and structural distortion when dealing with videos that have significant background changes, severely deformed dynamic objects, or complex motion. This paper proposes a multi-scale information aggregation repair algorithm. The method uses a generative adversarial network structure and constructs the generator from region normalization and gated convolution, effectively solving the mean and variance shift problem and improving repair accuracy. Utilizing the multi-scale feature-extraction capability of an improved Inception module, convolution and feature aggregation are performed at multiple scales, maintaining clear texture while filling large holes. The decoder uses the Mish and ELU activation functions to reduce information loss and enhance the network's generalization performance. Experimental results on the YouTube VOS 2019, DAVIS 2017, and Paris Street View datasets show that, compared with algorithms such as STTN, the proposed method yields clear textures and no obvious structural distortion when repairing large holes, with maximum improvements of 14.08% in PSNR and 3.4% in SSIM.

(3) Video dynamic object elimination system. A video dynamic object elimination system is designed and implemented. The system combines the two methods proposed in this paper and achieves precise localization and effective elimination of dynamic objects. The elimination results can provide fundamental support for fields such as 3D reconstruction and dynamic SLAM. In addition, the system is equipped with a user-friendly visual interface that facilitates operation and analysis.

References:

[1]蔡显奇,王晓松,李玮.一种室内弱纹理环境下的视觉SLAM算法[J].机器人, 2024, 46(3):284.

[2]汪水源,侯志强,李富成,等.自适应权重更新的轻量级视频目标分割算法[J].中国图象图形学报,2023,28(12):3772-3783.

[3]彭进业, 余喆, 屈书毅等.基于深度学习的图像修复方法研究综述[J]. 西北大学学报 (自然科学版), 2024, 53(6): 943-963.

[4]Nguyen T C, Tang T N, Phan N L H, et al. 1st place solution for YouTube-VOS challenge 2021: video instance segmentation[J]. 2021: 649-657.

[5]Bellver M, Ventura C, Silberer C, et al. A closer look at referring expressions for video object segmentation[J]. Multimedia Tools and Applications, 2023, 82(3): 4419-4438.

[6]Zeng Y, Fu J, Chao H. Learning joint spatial-temporal transformations for video inpainting[C]. Computer Vision–ECCV 2020: 16th European Conference, Glasgow, UK, August 23–28, 2020, Proceedings, Part XVI 16. Springer International Publishing, 2020: 528-543.

[7]Chen G, Zhang G, Yang Z, et al. Multi-scale patch-GAN with edge detection for image inpainting[J]. Applied Intelligence, 2023, 53(4): 3917-3932.

[8]Shen X J, Long J W, Chen H P, et al. Otsu thresholding algorithm based on rebuilding and dimension reduction of the 3-dimensional histogram[J]. Acta Electronica Sinica, 2011, 39(5): 1108-1114.

[9]Nevagi U G, Shahapurkar A, Nargundkar S. Edge detection techniques: a survey[J].International journal of innovative research and development, 2016, 5(2): e86174-e86174.

[10]Wen P, Wang X, Wei H. Modified level set method with Canny operator for image noise removal[J]. Chinese Optics Letters, 2010(12): 4.

[11]袁春兰,熊宗龙,周雪花,等.基于Sobel算子的图像边缘检测研究[J].激光与红外, 2009, 39(1):3.

[12]姚智超,楚晓亮,范筠益,等.基于Prewitt算子的X波段雷达有效波高反演研究[J].系统工程与电子技术, 2022, 44(4):1182-1187.

[13]Klintström E, Klintström B, Smedby R, et al. Automated region growing-based segmentation for trabecular bone structure in fresh-frozen human wrist specimens[J]. BMC Medical Imaging, 2024, 24(1): 101.

[14]Guo Y, Liu Y, Georgiou T, et al. A review of semantic segmentation using deep neural networks[J]. International journal of multimedia information retrieval, 2018, 7:87-93.

[15]Gadde R, Jampani V, Gehler P V. Semantic video cnns through representation warping[C]. Proceedings of the IEEE International Conference on Computer Vision. 2017: 4453-4462.

[16]Ding M, Wang Z, Zhou B, et al. Every frame counts: Joint learning of video segmentation and optical flow[C]. Proceedings of the AAAI conference on artificial intelligence. 2020, 34(07): 10713-10720.

[17]Nilsson D, Sminchisescu C. Semantic video segmentation by gated recurrent flow propagation[C]. Proceedings of the IEEE conference on computer vision and pattern recognition. 2018: 6819-6828.

[18]Kundu A, Vineet V, Koltun V. Feature space optimization for semantic video segmentation[C]. Proceedings of the IEEE conference on computer vision and pattern recognition. 2016: 3168-3175.

[19]Chandra S, Couprie C, Kokkinos I. Deep spatio-temporal random fields for efficient video segmentation[C]. Proceedings of the IEEE conference on Computer Vision and Pattern Recognition. 2018: 8915-8924.

[20]Shelhamer E, Rakelly K, Hoffman J, et al. Clockwork convnets for video semantic segmentation[C]. Computer Vision–ECCV 2016 Workshops: Amsterdam, The Netherlands, October 8-10 and 15-16, 2016, Proceedings, Part III 14. Springer International Publishing, 2016: 852-868.

[21]Zhu X, Xiong Y, Dai J, et al. Deep feature flow for video recognition[C]. Proceedings of the IEEE conference on computer vision and pattern recognition. 2017: 2349-2358.

[22]Xu Y S, Fu T J, Yang H K, et al. Dynamic video segmentation network[C]. Proceedings of the IEEE conference on computer vision and pattern recognition. 2018: 6556-6565.

[23]Jain S, Wang X, Gonzalez J E. Accel: A corrective fusion network for efficient semantic segmentation on video[C]. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2019: 8866-8875.

[24]Liu Y, Shen C, Yu C, et al. Efficient semantic video segmentation with per-frame inference[C]. Computer Vision–ECCV 2020: 16th European Conference, Glasgow, UK, August 23–28, 2020, Proceedings, Part X 16. Springer International Publishing, 2020: 352-368.

[25]Hu P, Caba F, Wang O, et al. Temporally distributed networks for fast video semantic segmentation[C]. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2020: 8818-8827.

[26]刘思, 杨程方. 一种融合背景差分和帧间差分的运动目标检测方法[J]. 舰船电子工程, 2024, 44(2): 45-48.

[27]舒兆翰, 李小龙, 吴从辉. 融合两帧差分法的改进视觉背景提取算法[J]. 科学技术与工程, 2024, 24(11): 04618-08.

[28]Urrea C, Agramonte R. Kalman filter: historical overview and review of its use in robotics 60 years after its creation[J]. Journal of Sensors, 2021, 2021: 1-21.

[29]邱道尹, 张文静, 顾波, 等. 帧差法在运动目标实时跟踪中的应用[J]. 华北水利水电学院学报, 2009 (3): 45-46.

[30]刘仲民, 何胜皎, 胡文瑾. 基于 Σ-Δ 背景估计的运动目标检测算法[J]. 计算机工程与设计, 2019, 40(3): 788-794.

[31]Naigong Y U, Yuling Z, Li X U, et al. Optical flow based mobile robot obstacle avoidance method in unstructured environment[J]. 仁和测试, 2017, 43(1): 65-69.

[32]Urrea C, Agramonte R. Kalman filter: historical overview and review of its use in robotics 60 years after its creation[J]. Journal of Sensors, 2021, 2021: 1-21.

[33]Ballester C, Bertalmio M, Caselles V, et al. Filling-in by joint interpolation of vector fields and gray levels[J]. IEEE transactions on image processing, 2001, 10(8): 1200-1211.

[34]Efros A A, Leung T K. Texture synthesis by non-parametric sampling[C]. Proceedings of the seventh IEEE international conference on computer vision. IEEE, 1999, 2: 1033-1038.

[35]Fukunaga K, Hostetler L. The estimation of the gradient of a density function, with applications in pattern recognition[J]. IEEE Transactions on information theory, 1975, 21(1): 32-40.

[36]Iizuka S, Simo-Serra E, Ishikawa H. Globally and locally consistent image completion[J]. ACM Transactions on Graphics (ToG), 2017, 36(4): 1-14.

[37]Liao L, Hu R, Xiao J, et al. Edge-aware context encoder for image inpainting[C]. 2018 IEEE International conference on acoustics, speech and signal processing (ICASSP). IEEE, 2018: 3156-3160.

[38]Vo H V, Duong N Q K, Pérez P. Structural inpainting[C]. Proceedings of the 26th ACM international conference on Multimedia. 2018: 1948-1956.

[39]Yang J, Qi Z, Shi Y. Learning to incorporate structure knowledge for image inpainting[C]. Proceedings of the AAAI conference on artificial intelligence. 2020, 34(07):12605-12612.

[40]Cao C, Fu Y. Learning a sketch tensor space for image inpainting of man-made scenes[C]. Proceedings of the IEEE/CVF international conference on computer vision. 2021: 14509-14518.

[41]Yu J, Lin Z, Yang J, et al. Free-form image inpainting with gated convolution[C]. Proceedings of the IEEE/CVF international conference on computer vision. 2019: 4471-4480.

[42]Wang Y, Tao X, Qi X, et al. Image inpainting via generative multi-column convolutional neural networks[J]. Advances in neural information processing systems, 2018:31-39.

[43]Liu H, Jiang B, Song Y, et al. Rethinking image inpainting via a mutual encoder-decoder with feature equalizations[C]. Computer Vision–ECCV 2020: 16th European Conference, Glasgow, UK, August 23–28, 2020, Proceedings, Part II 16. Springer International Publishing, 2020: 725-741.

[44]Cai J, Li C, Tao X, et al. Image multi-inpainting via progressive generative adversarial networks[C]. Proceedings of the IEEE/CVF conference on computer vision and pattern recognition. 2022: 978-987.

[45]Sagong M, Shin Y, Kim S, et al. Pepsi: Fast image inpainting with parallel decoding network[C]. Proceedings of the IEEE/CVF conference on computer vision and pattern recognition. 2019: 11360-11368.

[46]Yu J, Lin Z, Yang J, et al. Generative image inpainting with contextual attention[C]. Proceedings of the IEEE conference on computer vision and pattern recognition. 2018: 5505-5514.

[47]Albawi S, Bayat O, Al-Azawi S, et al. Social touch gesture recognition using convolutional neural network[J]. Computational Intelligence and Neuroscience, 2018, 2018(1): 6973103.

[48]Yang C, Lu X, Lin Z, et al. High-resolution image inpainting using multi-scale neural patch synthesis[C]. Proceedings of the IEEE conference on computer vision and pattern recognition. 2017: 6721-6729.

[49]Zeng Y, Lin Z, Yang J, et al. High-resolution image inpainting with iterative confidence feedback and guided upsampling[C]. Computer Vision–ECCV 2020: 16th European Conference, Glasgow, UK, August 23-28, 2020, Proceedings, Part XIX 16. Springer International Publishing, 2020: 1-17.

[50]Gulrajani I, Ahmed F, Arjovsky M, et al. Improved training of wasserstein gans[J]. Advances in neural information processing systems, 2017, 30.

[51]Song Y, Yang C, Lin Z, et al. Contextual-based image inpainting: Infer, match, and translate[C]. Proceedings of the European conference on computer vision (ECCV). 2018: 3-19.

[52]Chen G, Zhang G, Yang Z, et al. Multi-scale patch-GAN with edge detection for image inpainting[J]. Applied Intelligence, 2023, 53(4): 3917-3932.

[53]Xiong W, Yu J, Lin Z, et al. Foreground-aware image inpainting[C]. Proceedings of the IEEE/CVF conference on computer vision and pattern recognition. 2019: 5840-5848.

[54]Vaswani A, Shazeer N, Parmar N, et al. Attention is all you need[J]. Advances in neural information processing systems, 2017, 30.

[55]Zhou Y, Barnes C, Shechtman E, et al. Transfill: Reference-guided image inpainting by merging multiple color and spatial transformations[C]. Proceedings of the IEEE/CVF conference on computer vision and pattern recognition. 2021: 2266-2276.

[56]Wang J, Chen S, Wu Z, et al. Ft-tdr: Frequency-guided transformer and top-down refinement network for blind face inpainting[J]. IEEE Transactions on Multimedia, 2022, 25: 2382-2392.

[57]Zheng C, Cham T J, Cai J, et al. Bridging global context interactions for high-fidelity image completion[C]. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2022: 11512-11522.

[58]Dong Q, Cao C, Fu Y. Incremental transformer structure enhanced image inpainting with masking positional encoding[C]. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2022: 11358-11368.

[59]Long J, Shelhamer E, Darrell T. Fully convolutional networks for semantic segmentation[C]. Proceedings of the IEEE conference on computer vision and pattern recognition. 2015: 3431-3440.

[60]Ronneberger O. Invited talk: U-net convolutional networks for biomedical image segmentation[C]. Bildverarbeitung für die Medizin 2017: Algorithmen Systeme Anwendungen. Proceedings des Workshops vom 12. bis 14. März 2017 in Heidelberg. Berlin, Heidelberg: Springer Berlin Heidelberg, 2017: 3-3.

[61]Chen L C, Papandreou G, Kokkinos I, et al. Semantic image segmentation with deep convolutional nets and fully connected CRFs[J]. Computer Science, 2014(4): 357-361.

[62]He K, Gkioxari G, Dollár P, et al. Mask r-cnn[C]. Proceedings of the IEEE international conference on computer vision. 2017: 2961-2969.

[63]Lin T Y, Dollár P, Girshick R, et al. Feature pyramid networks for object detection[C]. Proceedings of the IEEE conference on computer vision and pattern recognition. 2017: 2117-2125.

[64]Zhao H, Zhang Y, Liu S, et al. Psanet: Point-wise spatial attention network for scene parsing[C]. Proceedings of the European conference on computer vision (ECCV). 2018: 267-283.

[65]Boffa A, Ferragina P, Tosoni F, et al. CoCo-trie: data-aware compression and indexing of strings[J]. Information Systems, 2024, 120(Feb.): 1.1-1.16.

[66]Hussein M Y A, Al-Karablieh M, Al-Kfouf S, et al. Machine learning-driven sustainable urban design: transforming Singapore's landscape with vertical greenery[J]. Asian Journal of Civil Engineering, 2024, 25(5): 3851-3863.

[67]Yang S, Fang Y, Wang X, et al. Crossover learning for fast online video instance segmentation[C]. Proceedings of the IEEE/CVF International Conference on Computer Vision. 2021: 8043-8052.

[68]Fu Y, Liu S, Iqbal U, et al. Learning to track instances without video annotations[C]. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2021: 8680-8689.

[69]Fu Y, Yang L, Liu D, et al. CompFeat: comprehensive feature aggregation for video instance segmentation[C]. Proceedings of the AAAI Conference on Artificial Intelligence. 2021: 1361-1369.

[70]Wu J, Cao J, Song L, et al. Track to detect and segment: an online multi-object tracker[C]. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2021: 12352-12361.

[71]Liu D, Cui Y, Tan W, et al. SG-Net: spatial granularity network for one-stage video instance segmentation[C]. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2021: 9816-9825.

[72]Wang Y, Xu Z, Wang X, et al. End-to-end video instance segmentation with transformers[C]. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2021: 8741-8750.

[73]Yu T, Guo Z, Jin X, et al. Region normalization for image inpainting[C]. Proceedings of the AAAI conference on artificial intelligence. 2020, 34(07): 12733-12740.

[74]Yu J, Lin Z, Yang J, et al. Free-form image inpainting with gated convolution[C]. Proceedings of the IEEE/CVF international conference on computer vision. 2019: 4471-4480.

[75]Li Z, Liu W, Yi J, et al. HSVConnect: HSV guided enhanced content generation network for image inpainting[J]. Signal, Image and Video Processing, 2024, 18(3): 2671-2682.

[76]Zhang R, Isola P, Efros A A, et al. The unreasonable effectiveness of deep features as a perceptual metric[C]. Proceedings of the IEEE conference on computer vision and pattern recognition. 2018: 586-595.

[77]Sara M, Hala A G, Hassan E S. Iterative magnitude pruning-based light-version of AlexNet for skin cancer classification[J]. Neural Computing & Applications, 2024, 36(3): 1413-1428.

[78]Krizhevsky A, Sutskever I, Hinton G E. ImageNet classification with deep convolutional neural networks[J]. Communications of the ACM, 2017, 60(6): 84-90.

[79]Liu G, Reda F A, Shih K J, et al. Image inpainting for irregular holes using partial convolutions[C]. Proceedings of the European conference on computer vision (ECCV). 2018: 85-100.

[80]Shahidani F R, Ghasemi A, Haghighat A T, et al. Task scheduling in edge-fog-cloud architecture: a multi-objective load balancing approach using reinforcement learning algorithm[J]. Computing, 2023, 105(6): 1337-1359.

[81]Carreira J, Zisserman A. Quo vadis, action recognition? a new model and the kinetics dataset[C]. proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2017: 6299-6308.

[82]Oh S W, Lee S, Lee J Y, et al. Onion-peel networks for deep video completion[C]. Proceedings of the IEEE/CVF international conference on computer vision. 2019: 4403-4412.

[83]Kim D, Woo S, Lee J Y, et al. Deep blind video decaptioning by temporal aggregation and recurrence[C]. Proceedings of the IEEE/CVF conference on computer vision and pattern recognition. 2019: 4263-4272.

[84]Yu T, Guo Z, Jin X, et al. Region normalization for image inpainting[C]. Proceedings of the AAAI conference on artificial intelligence. 2020, 34(07): 12733-12740.

[85]Yu J, Lin Z, Yang J, et al. Generative image inpainting with contextual attention[C]. Proceedings of the IEEE conference on computer vision and pattern recognition. 2018: 5505-5514.


CLC number:

 TP391.4    

Open-access date:

 2024-12-13    
