[Paper Notes] Fast Online Object Tracking and Segmentation: A Unifying Approach

Fast Online Object Tracking and Segmentation: A Unifying Approach

Abstract

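The paper proposes a single, simple approach that performs both visual object tracking (VOT) and semi-supervised video object segmentation (VOS) in real time.
The method, called SiamMask, improves the offline training procedure of fully-convolutional Siamese trackers by augmenting their loss with a binary segmentation task.
Once trained, SiamMask relies only on a single bounding-box initialisation and operates online, producing class-agnostic object segmentation masks and rotated bounding boxes at 55 frames per second.
This strategy establishes a new state of the art among real-time trackers on VOT-2018. At the same time it shows competitive performance and the best speed for semi-supervised VOS on DAVIS-2016 and DAVIS-2017.
Project page: http://www.robots.ox.ac.uk/~qwang/SiamMask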

1. Introduction

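Tracking is a fundamental task in any video application that requires some degree of reasoning about objects of interest.
It makes it possible to establish object correspondences between frames [34].
It is used in a wide range of scenarios, such as automatic surveillance, vehicle navigation, video labelling, human-computer interaction and activity recognition.
Given the location of an arbitrary target of interest in the first frame of a video, the aim of VOT is to estimate its position in all subsequent frames as accurately as possible [48]. For many applications it is important that tracking can be performed online, while the video is streaming. In other words, the tracker should not use future frames to reason about the current position of the object [26].
This is the scenario portrayed by VOT benchmarks, which represent the target object with a simple axis-aligned (e.g. [56, 52]) or rotated [26, 27] bounding box.
Such a simple annotation keeps the cost of data labelling low. More importantly, it lets a user initialise the target quickly and easily.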

2. Related work

VOT

Semi-supervised VOS

3. Method

3.1. Fully-convolutional Siamese networks

【SiamFC】
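As the basic building block of the tracking system, an offline-trained fully-convolutional Siamese network compares an exemplar image z against a larger search image x to obtain a dense response map.
z is a w×h crop centred on the target object, and x is a larger crop centred on the last estimated position of the target.
The two inputs are processed by the same CNN f_θ, and the two resulting feature maps are then cross-correlated.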
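
The cross-correlation that produces the response map can be sketched in plain Python for the single-channel case. This is only a minimal illustration: the small 2D lists stand in for the feature maps f_θ(z) and f_θ(x), and the names `cross_correlate` and `argmax_2d` as well as the toy values are hypothetical, not taken from the paper or its code. A real tracker works on multi-channel CNN features.

```python
def cross_correlate(exemplar, search):
    """Slide `exemplar` over `search` and return the dense response map.

    Both arguments are 2D lists of numbers (single-channel feature maps).
    """
    eh, ew = len(exemplar), len(exemplar[0])
    sh, sw = len(search), len(search[0])
    response = []
    for i in range(sh - eh + 1):
        row = []
        for j in range(sw - ew + 1):
            # Inner product between the exemplar and the window at (i, j).
            score = sum(
                exemplar[u][v] * search[i + u][j + v]
                for u in range(eh)
                for v in range(ew)
            )
            row.append(score)
        response.append(row)
    return response


def argmax_2d(response):
    """Return (row, col) of the highest score in the response map."""
    best = max(
        (value, i, j)
        for i, row in enumerate(response)
        for j, value in enumerate(row)
    )
    return best[1], best[2]


# Toy feature maps (assumed values): the exemplar pattern sits at
# offset (1, 2) inside the search map.
exemplar_features = [[1, 2], [3, 4]]
search_features = [
    [0, 0, 0, 0],
    [0, 0, 1, 2],
    [0, 0, 3, 4],
]
response_map = cross_correlate(exemplar_features, search_features)
peak = argmax_2d(response_map)
```

Each entry of `response_map` scores one candidate window of the search region, and the peak marks the window that best matches the exemplar.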

【SiamRPN】
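SiamRPN [28] considerably improves the performance of SiamFC by relying on a region proposal network (RPN) [46, 14], which allows the target location to be estimated with a bounding box of variable aspect ratio. In particular, in SiamRPN each response of a candidate window (RoW) encodes a set of k anchor box proposals and the corresponding object/background scores. SiamRPN therefore outputs box predictions in parallel with classification scores. The two output branches are trained with the smooth L1 and cross-entropy losses [28, Section 3.2].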
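
The two losses named above can be sketched for a single anchor as follows. This is a hedged illustration using the common textbook forms of both losses; the function names, the `beta` threshold, the averaging over coordinates and the sample values are all assumptions, and SiamRPN's own offset normalisation and loss weighting are not reproduced here.

```python
import math


def smooth_l1(prediction, target, beta=1.0):
    """Smooth L1 loss averaged over the box coordinates (assumed form)."""
    total = 0.0
    for p, t in zip(prediction, target):
        diff = abs(p - t)
        if diff < beta:
            # Quadratic near zero, so small errors give small gradients.
            total += 0.5 * diff * diff / beta
        else:
            # Linear for large errors, so outliers do not dominate.
            total += diff - 0.5 * beta
    return total / len(prediction)


def binary_cross_entropy(object_probability, is_object):
    """Cross-entropy for one anchor's object/background score."""
    eps = 1e-12  # Clamp to avoid log(0).
    p = min(max(object_probability, eps), 1.0 - eps)
    return -math.log(p) if is_object else -math.log(1.0 - p)


# One hypothetical anchor: regression offsets (dx, dy, dw, dh) and an
# object probability produced by the classification branch.
predicted_offsets = [0.5, -2.0, 0.0, 0.1]
target_offsets = [0.0, 0.0, 0.0, 0.1]
box_loss = smooth_l1(predicted_offsets, target_offsets)
score_loss = binary_cross_entropy(0.5, True)
```

In training, the box term would typically be computed only for anchors labelled as positive, while the classification term covers both positive and negative anchors.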

3.2. SiamMask

Loss function

Mask representation

Two variants

Box generation

3.3. Implementation details

Network architecture

Training

Inference

4. Experiments

4.1. VOT evaluation

Datasets and settings.

How much does the object representation matter?

Results on VOT-2018 and VOT-2016.

4.2. Semi-supervised VOS evaluation

Datasets and settings.

Results on DAVIS and YouTube-VOS.

4.3. Further analysis

Network architecture

Multi-task training

Timing.

Failure cases.

SiamMask fails in two scenarios: motion blur and "non-object" instances.

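Conclusion

The paper introduces SiamMask, which enables fully-convolutional Siamese trackers to produce class-agnostic binary segmentation masks of the target object.
It shows how the method can be applied successfully to both VOT and semi-supervised VOS.
It improves on the accuracy of state-of-the-art real-time trackers and is at the same time the fastest among semi-supervised VOS methods.
The two proposed variants of SiamMask are initialised with a simple bounding box, operate online, run in real time and require no adaptation to the test sequence.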

Acknowledgements

References

[1] L. Bao, B. Wu, and W. Liu. Cnn in mrf: Video object segmentation via inference in a cnn-based higher-order spatiotemporal mrf. In IEEE Conference on Computer Vision and Pattern Recognition, 2018.
[2] L. Bertinetto, J. F. Henriques, J. Valmadre, P. H. S. Torr, and A. Vedaldi. Learning feed-forward one-shot learners. In Advances in Neural Information Processing Systems, 2016.
[3] L. Bertinetto, J. Valmadre, J. F. Henriques, A. Vedaldi, and P. H. Torr. Fully-convolutional siamese networks for object tracking. In European Conference on Computer Vision workshops, 2016.
[4] D. S. Bolme, J. R. Beveridge, B. A. Draper, and Y. M. Lui. Visual object tracking using adaptive correlation filters. In IEEE Conference on Computer Vision and Pattern Recognition, 2010.
[5] S. Caelles, K.-K. Maninis, J. Pont-Tuset, L. Leal-Taixé, D. Cremers, and L. Van Gool. One-shot video object segmentation. In IEEE Conference on Computer Vision and Pattern Recognition, 2017.
[6] L.-C. Chen, G. Papandreou, I. Kokkinos, K. Murphy, and A. L. Yuille. Deeplab: Semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected crfs. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2018.
[7] Y. Chen, J. Pont-Tuset, A. Montes, and L. Van Gool. Blazingly fast video object segmentation with pixel-wise metric learning. In IEEE Conference on Computer Vision and Pattern Recognition, 2018.
[8] J. Cheng, Y.-H. Tsai, W.-C. Hung, S. Wang, and M.-H. Yang. Fast and accurate online video object segmentation via tracking parts. In IEEE Conference on Computer Vision and Pattern Recognition, 2018.
[9] J. Cheng, Y.-H. Tsai, S. Wang, and M.-H. Yang. Segflow: Joint learning for video object segmentation and optical flow. In IEEE International Conference on Computer Vision, 2017.
[10] H. Ci, C. Wang, and Y. Wang. Video object segmentation by learning location-sensitive embeddings. In European Conference on Computer Vision, 2018.
[11] D. Comaniciu, V. Ramesh, and P. Meer. Real-time tracking of non-rigid objects using mean shift. In IEEE Conference on Computer Vision and Pattern Recognition, 2000.
[12] M. Danelljan, G. Bhat, F. S. Khan, and M. Felsberg. Eco: Efficient convolution operators for tracking. In IEEE Conference on Computer Vision and Pattern Recognition, 2017.
[13] M. Danelljan, G. Häger, F. S. Khan, and M. Felsberg. Learning spatially regularized correlation filters for visual tracking. In IEEE International Conference on Computer Vision, 2015.
[14] C. Feichtenhofer, A. Pinz, and A. Zisserman. Detect to track and track to detect. In IEEE International Conference on Computer Vision, 2017.
[15] A. He, C. Luo, X. Tian, and W. Zeng. Towards a better match in siamese network based visual object tracker. In European Conference on Computer Vision workshops, 2018.
[16] A. He, C. Luo, X. Tian, and W. Zeng. A twofold siamese network for real-time object tracking. In IEEE Conference on Computer Vision and Pattern Recognition, 2018.
[17] K. He, G. Gkioxari, P. Dollár, and R. Girshick. Mask r-cnn. In IEEE International Conference on Computer Vision, 2017.
[18] K. He, X. Zhang, S. Ren, and J. Sun. Deep residual learning for image recognition. In IEEE Conference on Computer Vision and Pattern Recognition, 2016.
[19] D. Held, S. Thrun, and S. Savarese. Learning to track at 100 fps with deep regression networks. In European Conference on Computer Vision, 2016.
[20] J. F. Henriques, R. Caseiro, P. Martins, and J. Batista. High-speed tracking with kernelized correlation filters. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2015.
[21] Y.-T. Hu, J.-B. Huang, and A. G. Schwing. Videomatch: Matching based video object segmentation. In European Conference on Computer Vision, 2018.
[22] V. Jampani, R. Gadde, and P. V. Gehler. Video propagation networks. In IEEE Conference on Computer Vision and Pattern Recognition, 2017.
[23] A. Khoreva, R. Benenson, E. Ilg, T. Brox, and B. Schiele. Lucid data dreaming for object tracking. In IEEE Conference on Computer Vision and Pattern Recognition workshops, 2017.
[24] H. Kiani Galoogahi, T. Sim, and S. Lucey. Multi-channel correlation filters. In IEEE International Conference on Computer Vision, 2013.
[25] H. Kiani Galoogahi, T. Sim, and S. Lucey. Correlation filters with limited boundaries. In IEEE Conference on Computer Vision and Pattern Recognition, 2015.
[26] M. Kristan, A. Leonardis, J. Matas, M. Felsberg, R. Pflugfelder, L. Čehovin, T. Vojíř, G. Häger, A. Lukežič, G. Fernández, et al. The visual object tracking vot2016 challenge results. In European Conference on Computer Vision, 2016.
[27] M. Kristan, A. Leonardis, J. Matas, M. Felsberg, R. Pflugfelder, L. C. Zajc, T. Vojir, G. Bhat, A. Lukezic, A. Eldesokey, G. Fernandez, et al. The sixth visual object tracking vot-2018 challenge results. In European Conference on Computer Vision workshops, 2018.
[28] B. Li, J. Yan, W. Wu, Z. Zhu, and X. Hu. High performance visual tracking with siamese region proposal network. In IEEE Conference on Computer Vision and Pattern Recognition, 2018.
[29] F. Li, C. Tian, W. Zuo, L. Zhang, and M.-H. Yang. Learning spatial-temporal regularized correlation filters for visual tracking. In IEEE Conference on Computer Vision and Pattern Recognition, 2018.
[30] X. Li and C. C. Loy. Video object segmentation with joint re-identification and attention-aware mask propagation. In European Conference on Computer Vision, 2018.
[31] T.-Y. Lin, M. Maire, S. Belongie, J. Hays, P. Perona, D. Ramanan, P. Dollár, and C. L. Zitnick. Microsoft coco: Common objects in context. In European Conference on Computer Vision, 2014.
[32] J. Long, E. Shelhamer, and T. Darrell. Fully convolutional networks for semantic segmentation. In IEEE Conference on Computer Vision and Pattern Recognition, 2015.
[33] A. Lukezic, T. Vojir, L. C. Zajc, J. Matas, and M. Kristan. Discriminative correlation filter with channel and spatial reliability. In IEEE Conference on Computer Vision and Pattern Recognition, 2017.
[34] T. Makovski, G. A. Vazquez, and Y. V. Jiang. Visual learning in multiple-object tracking. PLoS One, 2008.
[35] K.-K. Maninis, S. Caelles, Y. Chen, J. Pont-Tuset, L. Leal-Taixé, D. Cremers, and L. Van Gool. Video object segmentation without temporal information. In IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017.
[36] N. Märki, F. Perazzi, O. Wang, and A. Sorkine-Hornung. Bilateral space video segmentation. In IEEE Conference on Computer Vision and Pattern Recognition, 2016.
[37] O. Miksik, J.-M. Pérez-Rúa, P. H. Torr, and P. Pérez. Roam: a rich object appearance model with application to rotoscoping. In IEEE Conference on Computer Vision and Pattern Recognition, 2017.
[38] F. Perazzi. Video Object Segmentation. PhD thesis, ETH Zurich, 2017.
[39] F. Perazzi, A. Khoreva, R. Benenson, B. Schiele, and A. Sorkine-Hornung. Learning video object segmentation from static images. In IEEE Conference on Computer Vision and Pattern Recognition, 2017.
[40] F. Perazzi, J. Pont-Tuset, B. McWilliams, L. Van Gool, M. Gross, and A. Sorkine-Hornung. A benchmark dataset and evaluation methodology for video object segmentation. In IEEE Conference on Computer Vision and Pattern Recognition, 2017.
[41] F. Perazzi, O. Wang, M. Gross, and A. Sorkine-Hornung. Fully connected object proposals for video segmentation. In IEEE International Conference on Computer Vision, 2015.
[42] P. Pérez, C. Hue, J. Vermaak, and M. Gangnet. Color-Based Probabilistic Tracking. In European Conference on Computer Vision, 2002.
[43] P. O. Pinheiro, R. Collobert, and P. Dollár. Learning to segment object candidates. In Advances in Neural Information Processing Systems, 2015.
[44] P. O. Pinheiro, T.-Y. Lin, R. Collobert, and P. Dollár. Learning to refine object segments. In European Conference on Computer Vision, 2016.
[45] J. Pont-Tuset, F. Perazzi, S. Caelles, P. Arbeláez, A. Sorkine-Hornung, and L. Van Gool. The 2017 davis challenge on video object segmentation. arXiv preprint arXiv:1704.00675, 2017.
[46] S. Ren, K. He, R. Girshick, and J. Sun. Faster r-cnn: Towards real-time object detection with region proposal networks. In Advances in Neural Information Processing Systems, 2015.
[47] O. Russakovsky, J. Deng, H. Su, J. Krause, S. Satheesh, S. Ma, Z. Huang, A. Karpathy, A. Khosla, M. Bernstein, et al. Imagenet large scale visual recognition challenge. International Journal of Computer Vision, 2015.
[48] A. W. Smeulders, D. M. Chu, R. Cucchiara, S. Calderara, A. Dehghan, and M. Shah. Visual tracking: An experimental survey. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2014.
[49] R. Tao, E. Gavves, and A. W. Smeulders. Siamese instance search for tracking. In IEEE Conference on Computer Vision and Pattern Recognition, 2016.
[50] Y.-H. Tsai, M.-H. Yang, and M. J. Black. Video segmentation via object flow. In IEEE Conference on Computer Vision and Pattern Recognition, 2016.
[51] J. Valmadre, L. Bertinetto, J. Henriques, A. Vedaldi, and P. H. S. Torr. End-to-end representation learning for correlation filter based tracking. In IEEE Conference on Computer Vision and Pattern Recognition, 2017.
[52] J. Valmadre, L. Bertinetto, J. F. Henriques, R. Tao, A. Vedaldi, A. Smeulders, P. H. S. Torr, and E. Gavves. Long-term tracking in the wild: A benchmark. In European Conference on Computer Vision, 2018.
[53] P. Voigtlaender and B. Leibe. Online adaptation of convolutional neural networks for video object segmentation. In British Machine Vision Conference, 2017.
[54] T. Vojir and J. Matas. Pixel-wise object segmentations for the vot 2016 dataset. Research Report CTU-CMP-2017–01, Center for Machine Perception, Czech Technical University, Prague, Czech Republic, 2017.
[55] L. Wen, D. Du, Z. Lei, S. Z. Li, and M.-H. Yang. Jots: Joint online tracking and segmentation. In IEEE Conference on Computer Vision and Pattern Recognition, 2015.
[56] Y. Wu, J. Lim, and M.-H. Yang. Online object tracking: A benchmark. In IEEE Conference on Computer Vision and Pattern Recognition, 2013.
[57] S. Wug Oh, J.-Y. Lee, K. Sunkavalli, and S. Joo Kim. Fast video object segmentation by reference-guided mask propagation. In IEEE Conference on Computer Vision and Pattern Recognition, 2018.
[58] N. Xu, L. Yang, Y. Fan, J. Yang, D. Yue, Y. Liang, B. Price, S. Cohen, and T. Huang. Youtube-vos: Sequence-to-sequence video object segmentation. In European Conference on Computer Vision, 2018.
[59] L. Yang, Y. Wang, X. Xiong, J. Yang, and A. K. Katsaggelos. Efficient video object segmentation via network modulation. In IEEE Conference on Computer Vision and Pattern Recognition, June 2018.
[60] T. Yang and A. B. Chan. Learning dynamic memory networks for object tracking. In European Conference on Computer Vision, 2018.
[61] D. Yeo, J. Son, B. Han, and J. H. Han. Superpixel-based tracking-by-segmentation using markov chains. In IEEE Conference on Computer Vision and Pattern Recognition, 2017.
[62] J. S. Yoon, F. Rameau, J. Kim, S. Lee, S. Shin, and I. S. Kweon. Pixel-level matching for video object segmentation using convolutional neural networks. In IEEE International Conference on Computer Vision, 2017.
[63] Z. Zhu, Q. Wang, B. Li, W. Wu, J. Yan, and W. Hu. Distractor-aware siamese networks for visual object tracking. In European Conference on Computer Vision, 2018.

A. Architectural details

Network backbone

Network heads

Mask refinement module

B. Further qualitative results

Different masks at different locations

Benchmark sequences
