Bridging the gap between inspiration and purchase. This AI-powered visual search engine lets customers find products by uploading any image: a photo from social media, a screenshot, or a camera capture. Multimodal CLIP embeddings match visual features across thousands of e-commerce SKUs, creating a discovery experience that text search alone cannot deliver.
Products are fetched from the Shopify Admin API with tag-based filtering (clothing, footwear). Product images are downloaded at 512px width and converted to RGB. Pagination covers the full catalog.
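As an illustration of the tag-based filtering step, here is a minimal sketch. It assumes each product is a dict whose `tags` field is a comma-separated string, as in Shopify's REST product payloads; the helper name `has_any_tag` and the sample products are hypothetical and not taken from the project.

```python
def has_any_tag(product: dict, wanted: set) -> bool:
    """Return True if the product carries at least one wanted tag.

    Assumes `tags` is a comma-separated string and that `wanted`
    contains lowercase tag names.
    """
    raw = product.get("tags", "")
    tags = {t.strip().lower() for t in raw.split(",") if t.strip()}
    return bool(tags & wanted)


WANTED_TAGS = {"clothing", "footwear"}

# Hypothetical sample data standing in for an API response.
products = [
    {"id": 1, "title": "Canvas Sneaker", "tags": "Footwear, Summer"},
    {"id": 2, "title": "Gift Card", "tags": "gift"},
    {"id": 3, "title": "Linen Shirt", "tags": "clothing"},
    {"id": 4, "title": "Untagged Item", "tags": ""},
]

selected = [p["id"] for p in products if has_any_tag(p, WANTED_TAGS)]
print(selected)
```

Matching is case-insensitive here so that "Footwear" and "footwear" are treated as the same tag; whether the project does the same is not stated.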
Each image is processed through Jina CLIP v2, a multimodal model, to produce a 512-dimensional vector. Embeddings are L2-normalized, so cosine similarity reduces to a dot product. CUDA acceleration is used when available.
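To show why L2 normalization matters for cosine comparison, the sketch below normalizes two small vectors in plain Python and checks that their dot product equals their cosine similarity. The three-element vectors are stand-ins for real embeddings, and the function names are illustrative only.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cosine_similarity(a, b):
    """Cosine similarity computed directly from unnormalized vectors."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


# Toy vectors standing in for embeddings.
a = [3.0, 4.0, 0.0]
b = [4.0, 3.0, 0.0]

a_hat, b_hat = l2_normalize(a), l2_normalize(b)

# After normalization, a plain dot product gives the cosine similarity.
print(dot(a_hat, b_hat), cosine_similarity(a, b))
```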
Normalized embeddings are batch-upserted to Pinecone (100 vectors per batch) with rich metadata: product title, vendor, price, tags, handle, image URL, and creation date.
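The batching described above can be sketched as follows. This is a simplified illustration, not the project's code: `chunked` and `to_record` are hypothetical helpers, and the records use the `id` / `values` / `metadata` dict shape that the Pinecone Python client accepts. No network call is made here.

```python
from typing import Iterator, List


def chunked(items: List[dict], size: int = 100) -> Iterator[List[dict]]:
    """Yield consecutive slices of at most `size` items."""
    if size <= 0:
        raise ValueError("size must be positive")
    for start in range(0, len(items), size):
        yield items[start:start + size]


def to_record(product_id, embedding, metadata) -> dict:
    """Build one vector record with its metadata attached."""
    return {"id": str(product_id), "values": embedding, "metadata": metadata}


# 250 dummy records with 2-dim placeholder vectors.
records = [
    to_record(i, [0.0, 1.0], {"title": f"Product {i}", "vendor": "Demo"})
    for i in range(250)
]

batch_sizes = [len(batch) for batch in chunked(records, size=100)]
print(batch_sizes)

# With the Pinecone client, each batch would typically be sent with
# something like index.upsert(vectors=batch).
```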
The user uploads an image via the Gradio interface on HuggingFace Spaces. The same CLIP model generates a query embedding, and Pinecone returns the most visually similar products ranked by cosine similarity.
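The ranking step can be pictured with a brute-force, in-memory stand-in for the vector database. The sketch assumes all vectors are already L2-normalized, so the dot product is the cosine similarity; the product IDs and two-dimensional vectors are made up for illustration, and a managed index would use approximate search rather than a full scan.

```python
import heapq


def top_k(query, catalog, k=3):
    """Return the k (product_id, score) pairs with the highest dot product.

    Assumes `query` and every catalog vector are unit length.
    """
    scored = (
        (sum(q * v for q, v in zip(query, vec)), pid)
        for pid, vec in catalog.items()
    )
    return [(pid, score) for score, pid in heapq.nlargest(k, scored)]


# Hypothetical unit-length vectors keyed by product handle.
catalog = {
    "red-sneaker": [1.0, 0.0],
    "blue-boot": [0.0, 1.0],
    "red-boot": [0.6, 0.8],
}
query = [0.8, 0.6]

results = top_k(query, catalog, k=2)
for pid, score in results:
    print(pid, round(score, 2))
```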
Multimodal embedding model from Jina AI. Encodes both images and text into a shared 512-dimensional vector space. Supports multilingual text. Loaded with trust_remote_code enabled because the model ships custom model classes.
Managed vector database for similarity search at scale. Stores 512-dim embeddings with product metadata. Cosine similarity metric, batch upsert, and sub-second query latency.
Gradio-powered web interface deployed on HuggingFace Spaces. Image upload, real-time embedding generation, and search results display. ZeroGPU support for free-tier GPU inference.
Products fetched via Shopify Admin API (2024-01). Tag-based filtering, multi-image support, pagination at 250 items per request. Rich metadata extraction per product.
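One way pagination can be followed is by reading the `Link` response header, which Shopify's REST Admin API uses for cursor-based paging. The sketch below only parses such a header; it assumes the REST endpoint is in use (the source does not say whether REST or GraphQL is used), and the shop domain and `page_info` values are placeholders.

```python
import re
from typing import Optional

# Matches entries of the form: <url>; rel="name"
LINK_RE = re.compile(r'<([^>]+)>\s*;\s*rel="([^"]+)"')


def next_page_url(link_header: Optional[str]) -> Optional[str]:
    """Return the URL marked rel="next", or None on the last page."""
    if not link_header:
        return None
    for url, rel in LINK_RE.findall(link_header):
        if rel == "next":
            return url
    return None


# Placeholder header with both a previous and a next link.
header = (
    '<https://example.myshopify.com/admin/api/2024-01/products.json'
    '?limit=250&page_info=abc>; rel="previous", '
    '<https://example.myshopify.com/admin/api/2024-01/products.json'
    '?limit=250&page_info=def>; rel="next"'
)

print(next_page_url(header))
```

A fetch loop would request the first page with `limit=250`, then keep requesting the returned URL until `next_page_url` yields `None`.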
Jina CLIP v2 encodes images and text into the same vector space, enabling both visual similarity and text-to-image search capabilities.
Skip-existing mode avoids re-processing already indexed products. Resume capability retries failed items from logged IDs.
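A rough sketch of how skip-existing and resume could work is shown below. The log format (one product ID per line) and the function names `select_work` and `read_failed_ids` are assumptions made for illustration; the project's actual log layout is not described.

```python
import io


def select_work(catalog_ids, indexed_ids, skip_existing=True):
    """Return catalog IDs to process, preserving catalog order."""
    if not skip_existing:
        return list(catalog_ids)
    indexed = set(indexed_ids)
    return [pid for pid in catalog_ids if pid not in indexed]


def read_failed_ids(log_file):
    """Read one ID per line from an open log, ignoring blank lines."""
    return [line.strip() for line in log_file if line.strip()]


catalog_ids = ["101", "102", "103", "104"]
indexed_ids = {"101", "103"}

# Skip-existing: only unindexed products are processed.
todo = select_work(catalog_ids, indexed_ids)
print(todo)

# Resume: retry only the IDs recorded in a failure log.
failed_log = io.StringIO("104\n\n")
retry = read_failed_ids(failed_log)
print(retry)
```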
Audit tool compares the Shopify catalog against the Pinecone index. Detects missing products and extra vectors, and generates re-index lists.
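At its core, such an audit is a pair of set differences. The sketch below assumes product IDs and vector IDs share the same string form, which may not hold if one product has several image vectors; the `audit` function and sample IDs are hypothetical.

```python
def audit(shopify_ids, pinecone_ids):
    """Compare catalog IDs with indexed IDs.

    missing: in the catalog but not in the index (candidates to re-index)
    extra:   in the index but no longer in the catalog
    """
    catalog, indexed = set(shopify_ids), set(pinecone_ids)
    return {
        "missing": sorted(catalog - indexed),
        "extra": sorted(indexed - catalog),
    }


report = audit(
    shopify_ids=["101", "102", "103"],
    pinecone_ids=["102", "103", "999"],
)
print(report)
```

In this sketch the `missing` list doubles as the re-index list.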
Each vector stores product title, vendor, price, tags, handle, image URL, and position. Enables filtered search and direct product links.
Configurable batch sizes for vector upserts (100 per batch). Dry-run mode for testing. Error logging with automatic retry support.
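Dry-run mode and retry can be combined in a small wrapper around whatever function performs the upsert. This is a sketch under assumptions: `upsert_with_retry`, its parameters, and the exponential backoff are illustrative choices, and `flaky_upsert` simulates a backend that fails twice before succeeding.

```python
import time


def upsert_with_retry(batch, upsert_fn, dry_run=False,
                      max_attempts=3, base_delay=0.5):
    """Send one batch, retrying on failure with exponential backoff.

    Returns "skipped" in dry-run mode, "ok" on success, "failed" otherwise.
    """
    if dry_run:
        print(f"dry run: would upsert {len(batch)} vectors")
        return "skipped"
    for attempt in range(1, max_attempts + 1):
        try:
            upsert_fn(batch)
            return "ok"
        except Exception as exc:  # log and retry on any error
            print(f"attempt {attempt} failed: {exc}")
            if attempt < max_attempts:
                time.sleep(base_delay * 2 ** (attempt - 1))
    return "failed"


calls = []


def flaky_upsert(batch):
    """Simulated backend: raises on the first two calls."""
    calls.append(len(batch))
    if len(calls) < 3:
        raise RuntimeError("temporary error")


batch = [{"id": "1", "values": [0.0, 1.0]}]
dry_result = upsert_with_retry(batch, flaky_upsert, dry_run=True)
calls_after_dry_run = len(calls)
live_result = upsert_with_retry(batch, flaky_upsert, base_delay=0.0)
print(dry_result, live_result, len(calls))
```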
Automatic CUDA device detection. HuggingFace ZeroGPU support for free-tier GPU inference on Spaces deployments.
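Device detection is commonly done with PyTorch's `torch.cuda.is_available()`. The sketch below assumes PyTorch is the backend and falls back to the CPU if it is not installed; `pick_device` is a hypothetical helper name, and ZeroGPU-specific setup is not shown.

```python
def pick_device() -> str:
    """Return "cuda" when a CUDA GPU is usable, otherwise "cpu"."""
    try:
        import torch  # imported lazily so the sketch runs without PyTorch
    except ImportError:
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


device = pick_device()
print(f"running on {device}")
```

The returned string can then be passed to the model, for example with `model.to(device)` in PyTorch.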