In the ever-evolving landscape of AI, our team embarked on a journey to enhance our platform by incorporating a practical AI block feature. The possibilities seemed limitless, ranging from facial recognition for attendance monitoring to extracting vital information from images, such as analyzing the quality and detecting diseases in plants through image analysis. After successfully validating these use-cases through a Proof of Concept (POC), we were eager to integrate this AI capability into our platform.

Choosing the right AI API was a critical decision, and after careful consideration, we opted for the OpenAI ChatGPT API. Our exploration led us to this choice, as alternatives like Bard fell short in delivering satisfactory responses, and Gemini had not yet opened up its API to the public. OpenAI's GPT API demonstrated superior performance in handling diverse tasks. With a clear direction, we set our sights on leveraging the power of the OpenAI GPT API to bring our AI block feature to life.

Text Analysis Implementation:

For text analysis, we've adopted the straightforward approach of employing the gpt-3.5-turbo model. Input for text analysis is facilitated through dedicated text blocks, allowing users to provide instructions for the subsequent evaluation. We've strategically implemented constraints, capping the maximum input length at 500 characters, and limiting the output to 100 characters for succinct and manageable results.

Untitled

Image Analysis Implementation:

For our image analysis feature, we've incorporated the powerful 'gpt-4-vision-preview' model, an extension of GPT-4 Turbo with advanced image comprehension capabilities. Currently, it references 'gpt-4-1106-vision-preview.' Users can seamlessly upload images using our existing file block, which is then directed to S3.

Here's an overview of our image analysis workflow:

  1. File Upload and Storage:
  2. Microservices Architecture:
  3. Cost Optimization Checks:
  4. Image Processing:
  5. Supported File Types:

Optimizations for Multiple Images: