Vector Embedding Search Engine

Project Description

A Vector Embedding Search Engine is a database that uses vector embeddings to support semantic search over otherwise unstructured data such as images, videos or text. Machine learning models extract features and semantic information from this unstructured data and encode them in a high-dimensional vector. By indexing these vectors, the search engine can search through millions of vectors in just a few milliseconds and find results that are semantically related to the input query. Input queries can be text, images or even videos; each query is first encoded as a vector and then compared against the vectors in the database.
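The query step described above can be illustrated with a minimal Python sketch. It is not the project's implementation: it assumes cosine similarity as the comparison metric, uses hand-written three-dimensional toy vectors in place of model output, and scans every stored vector instead of using an index. The names cosine_similarity, search and database are hypothetical and chosen only for this illustration.

```python
import math
from typing import Dict, List, Tuple

Vector = List[float]


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Return the cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def search(query: Vector, database: Dict[str, Vector], top_k: int = 3) -> List[Tuple[str, float]]:
    """Brute-force search: score every stored vector and return the top_k best matches."""
    scored = [(item_id, cosine_similarity(query, vec)) for item_id, vec in database.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Toy embeddings (assumption): a real model would produce hundreds of dimensions.
database = {
    "cat.jpg": [0.9, 0.1, 0.0],
    "dog.jpg": [0.8, 0.3, 0.1],
    "car.jpg": [0.0, 0.2, 0.9],
}

# In a real system the query vector would come from the same embedding model.
query = [1.0, 0.0, 0.0]
results = search(query, database, top_k=2)
for item_id, score in results:
    print(item_id, round(score, 3))
```

A linear scan like this grows with the number of stored vectors. A vector index avoids scoring every vector, which is what makes millisecond lookups over millions of entries feasible.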

The goal of this project was to create a web platform on which anyone can upload and manage their images, automatically create embeddings, and index the data with just a few clicks. The data can then be explored in an interactive visualization or queried through an API.

Key Features:

Versatile Data Import: Seamlessly import data from multiple sources such as AWS S3, GCP Buckets, and direct file uploads.
Automated Infrastructure Management: The application provisions all necessary infrastructure, including servers and storage, through various cloud services, keeping the setup process minimal.
User-Friendly Interface: Access and utilize the application’s capabilities easily via the web interface at alpha.markuslaubenthal.de.

Tech Stack

Web Application: Node.js / Express.js REST API, MongoDB, React
Embedding and Query Service: Python, TensorFlow, FastAPI, AWS Batch + EC2, AWS S3
Deployment: Docker, GitLab CI/CD

Current Status:

The web application is currently in its alpha stage. At this point, only a subset of the functionality is available for testing.

Explore the functionality and potential of this project by visiting the web application. For any further questions or inquiries, feel free to contact me.
