PaddleVideo

简体中文 | English

Introduction

PaddleVideo is a toolset for video recognition, action localization, and spatio-temporal action detection, built for both industry and academia. This repository provides examples and best-practice guidelines for applying deep learning algorithms to video. We focus on supporting experiments and utilities that can significantly reduce the "time to deploy". It also serves as a proficiency verification and implementation of the newest PaddlePaddle 2.0 in the video field.

If you find this repo helpful, please give us a star ⭐

Features

Various dataset and models
PaddleVideo supports a wide range of datasets and models, including the Kinetics-400, UCF101, YouTube-8M, and NTU-RGB+D datasets; video recognition models such as TSN, TSM, SlowFast, TimeSformer, AttentionLSTM, and ST-GCN; and action localization models such as BMN.
Higher performance
PaddleVideo has built-in solutions to improve the accuracy of recognition models. PP-TSM, which is based on the standard TSM, achieves the best performance among 2D recognition networks: with the same number of parameters, it improves Top-1 accuracy to 76.16%.
Faster training strategy
PaddleVideo supports faster training strategies, such as AMP training, distributed training, the multigrid method for SlowFast, OP fusion, a faster reader, and so on.
Deployable
PaddleVideo is powered by Paddle Inference. There is no need to convert the model to ONNX format for deployment; everything you need can be found in this repository.
Applications
PaddleVideo provides some interesting and practical projects that are implemented using video recognition and detection techniques, such as FootballAction and VideoTag.

Overview of the performance

Field	Model	Dataset	Metric	Value (%)
action recognition	PP-TSM	Kinetics-400	Top-1	76.16
action recognition	PP-TSN	Kinetics-400	Top-1	75.06
action recognition	AGCN	FSD	Top-1	90.66
action recognition	ST-GCN	FSD	Top-1	86.66
action recognition	TimeSformer	Kinetics-400	Top-1	77.29
action recognition	SlowFast	Kinetics-400	Top-1	75.84
action recognition	TSM	Kinetics-400	Top-1	71.06
action recognition	TSN	Kinetics-400	Top-1	69.81
action recognition	AttentionLSTM	YouTube-8M	Hit@1	89.0
action detection	BMN	ActivityNet	AUC	67.23
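
The Top-1 and Hit@1 columns above can be read as follows: Top-1 counts a sample as correct when the highest-scoring class equals its single label, and Hit@1 (used for multi-label data such as YouTube-8M) counts it as correct when the highest-scoring class is any one of its labels. The sketch below is only an illustration of those definitions in plain Python. The function names `hit_at_1` and `top1_accuracy` are hypothetical and are not part of the PaddleVideo API, and the scores are made-up toy values, not results from this repository.

```python
from typing import Sequence, Set


def hit_at_1(scores: Sequence[Sequence[float]], labels: Sequence[Set[int]]) -> float:
    """Percentage of samples whose highest-scoring class is a ground-truth label.

    `scores[i][c]` is the score of class `c` for sample `i`;
    `labels[i]` is the set of ground-truth class indices for sample `i`.
    """
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if not scores:
        raise ValueError("at least one sample is required")

    hits = 0
    for row, truth in zip(scores, labels):
        # Index of the highest score (first index wins on ties).
        predicted = max(range(len(row)), key=row.__getitem__)
        if predicted in truth:
            hits += 1
    return 100.0 * hits / len(scores)


def top1_accuracy(scores: Sequence[Sequence[float]], labels: Sequence[int]) -> float:
    """Top-1 accuracy is Hit@1 where every sample has exactly one label."""
    return hit_at_1(scores, [{label} for label in labels])


# Toy example: 4 samples, 3 classes (values are illustrative only).
toy_scores = [
    [0.1, 0.7, 0.2],    # predicted class 1
    [0.5, 0.3, 0.2],    # predicted class 0
    [0.2, 0.2, 0.6],    # predicted class 2
    [0.9, 0.05, 0.05],  # predicted class 0
]
single_labels = [1, 1, 2, 0]
multi_labels = [{1, 2}, {0, 1}, {0}, {0}]

print(top1_accuracy(toy_scores, single_labels))
print(hit_at_1(toy_scores, multi_labels))
```

The AUC reported for BMN is a different kind of metric (it summarizes proposal recall rather than per-sample classification) and is not covered by this sketch.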

Changelog

release/2.1 was released on May 20, 2021. Please refer to the release notes for details.

Community

Scan the QR code below with WeChat and reply "video" to join the official technical exchange group. We look forward to your participation.

Applications

VideoTag: 3k Large-Scale video classification model

FootballAction: Football action detection model

Tutorials and Docs

Tutorials and Slides
- 2021.01
- Summary of video understanding
Quick Start
- Install
- Start
Project design
- Modular design
- Configuration design
Model zoo
- Recognition
  - TimeSformer
  - Attention-LSTM
  - TSN
  - TSM
  - PP-TSM
  - PP-TSN
  - SlowFast
- Localization
  - BMN
- Skeleton-based action recognition
  - ST-GCN
  - AGCN
- Spatio temporal action detection
  - Coming Soon!
- ActBERT: Learning Global-Local Video-Text Representations
  - Coming Soon!
Practice
Others
- Benchmark
- Tools

License

PaddleVideo is released under the Apache 2.0 license.

Contributing

This project welcomes contributions and suggestions. Please see our contribution guidelines.

Many thanks to mohui37 for contributing the code for prediction.

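A PaddlePaddle-based solution for the table tennis temporal action localization competition: a single model without TTA that placed first on leaderboard A and second on leaderboard B.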
