The OWLv2 model (short for Open-World Localization) was proposed in Scaling Open-Vocabulary Object Detection by Matthias Minderer, Alexey Gritsenko, and Neil Houlsby. OWLv2, like OWL-ViT, is a zero-shot, text-conditioned object detection model that can query an image with one or multiple text prompts.
The model uses CLIP as its multi-modal backbone, with a ViT-like Transformer to get visual features and a causal language model to get the text features. To use CLIP for detection, OWL-ViT removes the final token pooling layer of the vision model and attaches a lightweight classification and box head to each transformer output token. Open-vocabulary classification is enabled by replacing the fixed classification layer weights with the class-name embeddings obtained from the text model. The authors first train CLIP from scratch and fine-tune it end-to-end with the classification and box heads on standard detection datasets using a bipartite matching loss. One or multiple text queries per image can be used to perform zero-shot text-conditioned object detection.
The model uses a CLIP backbone with a ViT-L/14 Transformer architecture as an image encoder and uses a masked self-attention Transformer as a text encoder. These encoders are trained to maximize the similarity of (image, text) pairs via a contrastive loss. The CLIP backbone is trained from scratch and fine-tuned together with the box and class prediction heads with an object detection objective.
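The zero-shot detection flow described above can be exercised directly through the Hugging Face transformers API, independently of the ServiceBoot wrapper in this repository. The sketch below is illustrative only: the sample image URL and text queries are arbitrary, and it assumes a transformers version that ships Owlv2Processor / Owlv2ForObjectDetection (roughly 4.35 or later).

```python
# Minimal sketch of zero-shot text-conditioned detection with OWLv2.
# Not part of this repository's service code; the checkpoint id matches the
# "Model source" link below.
import requests
import torch
from PIL import Image
from transformers import Owlv2Processor, Owlv2ForObjectDetection

processor = Owlv2Processor.from_pretrained("google/owlv2-large-patch14")
model = Owlv2ForObjectDetection.from_pretrained("google/owlv2-large-patch14")

url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw)
texts = [["a photo of a cat", "a photo of a dog"]]  # one list of queries per image

inputs = processor(text=texts, images=image, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# Convert raw logits/boxes into (score, label, box) triples in pixel coordinates.
target_sizes = torch.tensor([image.size[::-1]])  # (height, width)
results = processor.post_process_object_detection(
    outputs=outputs, threshold=0.1, target_sizes=target_sizes
)
for score, label, box in zip(
    results[0]["scores"], results[0]["labels"], results[0]["boxes"]
):
    print(f"{texts[0][int(label)]}: {score:.2f} at {[round(v, 1) for v in box.tolist()]}")
```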
Model source: https://hf-mirror.com/google/owlv2-large-patch14
This model is wrapped as a microservice using the ServiceBoot microservice engine; see 《CubeAI模型开发指南》 (CubeAI Model Development Guide).
Install the Python dependencies:

$ sh pip-install-reqs.sh

Start the model service locally, either via ServiceBoot:

$ serviceboot start

or directly:

$ python3 run_model_server.py
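Once the server is up, it can be queried over HTTP. The URL path, port, and field names below are purely hypothetical placeholders for illustration; the real interface is defined by the ServiceBoot wrapper under app/ and documented in 《CubeAI模型开发指南》.

```python
# Purely illustrative client call: endpoint, port, and payload fields are
# hypothetical, not this service's actual API.
import requests

with open("test.jpg", "rb") as f:
    resp = requests.post(
        "http://127.0.0.1:8080/detect",  # hypothetical route and port
        files={"image": f},
        data={"queries": "a photo of a cat,a photo of a dog"},
    )
print(resp.json())
```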
For one-click local containerized deployment and running, see 《CubeAI模型独立部署指南》 (CubeAI Standalone Model Deployment Guide) or CubeAI Docker Builder.
This model service can be published to the CubeAI智立方 platform with one click for sharing and deployment; see 《CubeAI模型发布指南》 (CubeAI Model Publishing Guide).