AI Data Processing Block, its implementation and Cost effective strategies for Clappia.

In the ever-evolving landscape of AI, our team embarked on a journey to enhance our platform by incorporating a practical AI block feature. The possibilities seemed limitless, ranging from facial recognition for attendance monitoring to extracting vital information from images, such as analyzing the quality and detecting diseases in plants through image analysis. After successfully validating these use-cases through a Proof of Concept (POC), we were eager to integrate this AI capability into our platform.

Choosing the right AI API was a critical decision, and after careful consideration, we opted for the OpenAI ChatGPT API. Our exploration led us to this choice, as alternatives like Bard fell short in delivering satisfactory responses, and Gemini had not yet opened up its API to the public. OpenAI's GPT API demonstrated superior performance in handling diverse tasks. With a clear direction, we set our sights on leveraging the power of the OpenAI GPT API to bring our AI block feature to life.

Text Analysis Implementation:

For text analysis, we've adopted the straightforward approach of employing the gpt-3.5-turbo model. Input for text analysis is facilitated through dedicated text blocks, allowing users to provide instructions for the subsequent evaluation. We've strategically implemented constraints, capping the maximum input length at 500 characters, and limiting the output to 100 characters for succinct and manageable results.

Untitled

Image Analysis Implementation:

For our image analysis feature, we've incorporated the powerful 'gpt-4-vision-preview' model, an extension of GPT-4 Turbo with advanced image comprehension capabilities. Currently, it references 'gpt-4-1106-vision-preview.' Users can seamlessly upload images using our existing file block, which is then directed to S3.

Here's an overview of our image analysis workflow:

File Upload and Storage:
- Users utilize the existing file block to upload images, which are then stored on S3, ensuring efficient and reliable file management.
Microservices Architecture:
- Leveraging microservices, the AiAnalysisService takes charge when the Ai block is triggered. This architecture allows for modular and scalable image analysis operations.
Cost Optimization Checks:
- Before processing, our system performs a series of cost optimization checks. These checks, which contribute to efficient resource utilization, will be discussed in detail later.
Image Processing:
- Images are retrieved from S3, converted to base64 format, and then processed by the 'gpt-4-1106-vision-preview' model. It's worth noting that our system adheres to GPT's 20 MB limit for image file sizes, allowing inputs of up to 10 MB in base64 format.
Supported File Types:
- Our image analysis feature supports a variety of file types, including jpeg, jpg, png, webp, and gif.

Optimizations for Multiple Images:

Downloading Unique Images:
- To enhance efficiency, our system downloads only unique images from the S3 bucket, minimizing redundancy and conserving resources.
Parallel Image Downloads:
- It concurrently downloads multiple images leveraging the benefits of concurrency to expedite the overall image analysis process.