LLaVA stands for Large Language and Vision Assistant, a powerful multimodal model that combines the strengths of language and vision. Built on OpenAI’s CLIP vision encoder and a fine-tuned version of Meta’s Llama 2 7B model, LLaVA uses visual instruction tuning to support image-based natural instruction following and visual reasoning. This allows LLaVA to perform a range of tasks, including:

- Visual question answering: answering questions based on image content
- Caption generation: generating text descriptions of images
- Optical character recognition (OCR): identifying text in images
- Multimodal dialogue: engaging in conversations that involve both text and images
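
As a rough illustration of the first task, here is a minimal sketch of visual question answering with a LLaVA checkpoint through the Hugging Face `transformers` library. The model ID (`llava-hf/llava-1.5-7b-hf`), the image URL, the prompt template, and the generation settings are illustrative assumptions rather than details from the text above.

```python
# Sketch: visual question answering with a LLaVA checkpoint via transformers.
# The checkpoint name, image URL, and prompt format are assumptions for illustration.
import requests
from PIL import Image
from transformers import AutoProcessor, LlavaForConditionalGeneration

model_id = "llava-hf/llava-1.5-7b-hf"  # assumed community checkpoint
processor = AutoProcessor.from_pretrained(model_id)
model = LlavaForConditionalGeneration.from_pretrained(model_id, device_map="auto")

# Load an example image (placeholder URL) and ask a question about it.
image = Image.open(requests.get("https://example.com/cat.jpg", stream=True).raw)
prompt = "USER: <image>\nWhat is shown in this picture? ASSISTANT:"

# The processor combines the CLIP image preprocessing and the text tokenization.
inputs = processor(images=image, text=prompt, return_tensors="pt").to(model.device)
output_ids = model.generate(**inputs, max_new_tokens=100)
print(processor.batch_decode(output_ids, skip_special_tokens=True)[0])
```

The same pattern covers the other tasks in the list: changing the question to “Describe this image” yields caption generation, and keeping a running history of USER/ASSISTANT turns in the prompt gives multimodal dialogue.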