Skip to content

Zhennor/Multimodal-Video-Retrieval-Engine-with-Vision-and-Text

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

265 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Multimodal Video Retrieval Engine with Vision and Text

Demo

Setup

wget -qO - https://artifacts.elastic.co/GPG-KEY-elasticsearch | sudo apt-key add -
sudo apt-get install apt-transport-https
echo "deb https://artifacts.elastic.co/packages/7.x/apt stable main" | sudo tee -a /etc/apt/sources.list.d/elastic-7.x.list
sudo apt-get update
pip install -r requirements.txt
gdown --folder https://drive.google.com/drive/folders/1n5sLuf9YckDArIktTLd4WRwhl1XxUy7B?hl=vi -O data/bin
gdown --folder https://drive.google.com/drive/folders/16GueLfWnK4yQtPbsaBe8QNk-zUga-QDo?dmr=1&ec=wgc-drive-globalnav-goto -O data/bin
                

Run

sudo service elasticsearch start
curl -X GET "localhost:9200/"
uvicorn main:app --reload

Demo Video

Demo Video Thumbnail

Click here to watch the video

About

A video search engine combining OCR, ASR, CLIP, Image Captioning, Object & Color Detection. It enables accurate retrieval based on text, speech, images, objects, and colors in video content.

Topics

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors