Multimodal RAG + LLM

Category:

Retrieval Augmented Generation

Client:

Sunim

Duration:

2 weeks

Problem

Turning product ideas into detailed user stories by hand is slow and inefficient. This project automates the creation of user stories from text descriptions and mockup images using multimodal large language models.


Project Overview

This project combines Retrieval-Augmented Generation (RAG) with large language models (LLMs) to turn product ideas into actionable user stories and tickets. It integrates models from OpenAI, Google, and Anthropic and supports multimodal input, accepting both textual descriptions and mockup images. The core stack is FastAPI for the API layer, LangChain for the RAG pipeline, and FAISS for vector indexing and similarity search.
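As a rough illustration of accepting text together with a mockup image, the sketch below packs both into a list of parts that a model client could consume. The ProductIdea class, the to_prompt_parts function, and the part dictionary layout are hypothetical and do not mirror any specific provider's request format.

```python
import base64
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class ProductIdea:
    # Hypothetical container for one submission: text plus an optional PNG mockup.
    description: str
    mockup_png: Optional[bytes] = None


def to_prompt_parts(idea: ProductIdea) -> List[Dict[str, str]]:
    """Build an ordered list of parts: the text first, then the image if present."""
    parts = [{"type": "text", "text": idea.description}]
    if idea.mockup_png is not None:
        # Binary image data is base64-encoded so it can travel inside JSON.
        encoded = base64.b64encode(idea.mockup_png).decode("ascii")
        parts.append({"type": "image", "media_type": "image/png", "data": encoded})
    return parts
```

A text-only idea yields a single part, so the same code path can serve both input modes.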


Key Features

Multi-Modal Input Support: Accepts product ideas in text and mockup (whiteboard) images.

Automated Ticket Generation: Converts inputs into editable tickets/user stories.

Fine-Tuning with User Interactions: Continuously improves the model based on feedback.

AI Model Integration: Leverages multiple AI models for enhanced performance and flexibility.

Parameter Customization: Allows tweaking of local and remote model parameters.

Versatile Output Formats: Generates outputs in text, JSON, CSV, and XLSX formats.
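The output formats listed above can be sketched with the Python standard library for the JSON and CSV cases. The Ticket fields here are assumptions, since the project's actual ticket schema is not given; XLSX export would need a third-party library and is left out.

```python
import csv
import io
import json
from dataclasses import asdict, dataclass
from typing import List


@dataclass
class Ticket:
    # Hypothetical schema, chosen only for illustration.
    title: str
    user_story: str
    acceptance_criteria: List[str]


def to_json(tickets: List[Ticket]) -> str:
    """Serialize tickets as a JSON array of objects."""
    return json.dumps([asdict(t) for t in tickets], indent=2)


def to_csv(tickets: List[Ticket]) -> str:
    """Serialize tickets as CSV, joining the criteria list into one cell."""
    buf = io.StringIO()
    writer = csv.DictWriter(
        buf, fieldnames=["title", "user_story", "acceptance_criteria"]
    )
    writer.writeheader()
    for t in tickets:
        row = asdict(t)
        row["acceptance_criteria"] = "; ".join(t.acceptance_criteria)
        writer.writerow(row)
    return buf.getvalue()
```

Keeping one in-memory ticket structure and deriving every format from it means edits made by the user only have to be applied in one place.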


Technical Stack

FastAPI: For creating API endpoints.

LangChain: For RAG implementation.

FAISS: As the vector store for efficient document indexing and similarity search.
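The retrieval step that FAISS handles in this stack is similarity search over embedding vectors. The sketch below is a pure-Python stand-in, not FAISS itself: it does an exhaustive inner-product search over normalized vectors, which is the same idea as a flat inner-product index. The FlatIndex class and the toy three-dimensional vectors are hypothetical; a real pipeline would use embeddings produced by a model.

```python
import math
from typing import List, Tuple


def normalize(vec: List[float]) -> List[float]:
    """Scale a vector to unit length so inner product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else list(vec)


class FlatIndex:
    """Brute-force index: compares the query against every stored vector."""

    def __init__(self) -> None:
        self.vectors: List[List[float]] = []
        self.documents: List[str] = []

    def add(self, vector: List[float], document: str) -> None:
        self.vectors.append(normalize(vector))
        self.documents.append(document)

    def search(self, query: List[float], k: int = 2) -> List[Tuple[float, str]]:
        """Return the k most similar documents as (score, document) pairs."""
        q = normalize(query)
        scored = [
            (sum(a * b for a, b in zip(q, v)), doc)
            for v, doc in zip(self.vectors, self.documents)
        ]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return scored[:k]
```

The retrieved documents would then be placed into the LLM prompt as context, which is the "augmented" part of RAG.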


Project Requirements

1. RAG and LLM Components: Develop and implement retrieval and generation functionalities.

2. Multi-Modal Input Handling: Process text and image inputs to generate tickets/user stories.

3. Knowledge Base Construction: Build a comprehensive knowledge base from guidelines and principles.

4. Design Template Database: Create a database of design templates for mockup images.

5. Model Documentation: Document strategies and performance metrics for various models.

6. Model Selection and Parameter Tweaking: Support selection and customization of AI models.
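Model selection and parameter tweaking (requirement 6) could be represented as a small validated configuration object. This is a minimal sketch under assumptions: the ModelConfig class, the default values, the temperature bounds, and the "example-model" name are all hypothetical, and real providers each define their own parameter ranges.

```python
from dataclasses import dataclass, replace

# Providers named in the project overview.
SUPPORTED_PROVIDERS = {"openai", "google", "anthropic"}


@dataclass(frozen=True)
class ModelConfig:
    provider: str
    model_name: str
    temperature: float = 0.2  # assumed default, not taken from the project
    max_tokens: int = 1024    # assumed default, not taken from the project

    def __post_init__(self) -> None:
        if self.provider not in SUPPORTED_PROVIDERS:
            raise ValueError(f"unsupported provider: {self.provider}")
        # Assumed bounds for illustration; check each provider's documentation.
        if not 0.0 <= self.temperature <= 2.0:
            raise ValueError("temperature must be between 0.0 and 2.0")
        if self.max_tokens <= 0:
            raise ValueError("max_tokens must be positive")


def with_overrides(config: ModelConfig, **overrides) -> ModelConfig:
    """Return a new config with some fields changed; validation runs again."""
    return replace(config, **overrides)
```

Because the config is immutable, each tweak produces a new object, which makes it straightforward to record which settings produced which generated tickets.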