Pixtral 12B is the first-ever multimodal Mistral model.
Pixtral 12B in short:
Natively multimodal, trained with interleaved image and text data
Strong performance on multimodal tasks, excels in instruction following
Maintains state-of-the-art performance on text-only benchmarks
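Native multimodality means a single prompt can interleave text and images freely. A minimal sketch of what such an interleaved chat request could look like, assuming an OpenAI-style message schema with typed content parts; the model identifier, field names, and URLs below are illustrative assumptions, not a documented API contract:

```python
# Hypothetical request payload showing text and images interleaved in one
# user turn. Schema and model name are assumptions for illustration only.
payload = {
    "model": "pixtral-12b",  # assumed model identifier
    "messages": [
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Compare these two charts:"},
                {"type": "image_url", "image_url": "https://example.com/chart_a.png"},
                {"type": "image_url", "image_url": "https://example.com/chart_b.png"},
                {"type": "text", "text": "Which one shows faster growth?"},
            ],
        }
    ],
}

# Count the parts of each type in the single user turn.
parts = payload["messages"][0]["content"]
n_text = sum(1 for p in parts if p["type"] == "text")
n_images = sum(1 for p in parts if p["type"] == "image_url")
```

Because the model was trained on interleaved image and text data, images can appear anywhere in the turn rather than only as a prefix or suffix.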
Architecture:
New 400M parameter vision encoder trained from scratch
12B parameter multimodal decoder based on Mistral Nemo
Supports variable image sizes and aspect ratios
Supports multiple images in its long context window of 128k tokens
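To see how variable image sizes interact with the 128k-token context window, here is a rough back-of-the-envelope sketch. It assumes the encoder tokenizes images at native resolution into 16x16-pixel patches, one token per patch, with one assumed row-break token per patch row; the exact special-token accounting of the real model may differ:

```python
import math

PATCH = 16  # assumed patch edge length in pixels

def image_token_count(width: int, height: int) -> int:
    """Rough token count for one image at native resolution:
    one token per 16x16 patch, plus one assumed break token per row."""
    cols = math.ceil(width / PATCH)
    rows = math.ceil(height / PATCH)
    return rows * cols + rows  # patch tokens + per-row break tokens (assumed)

# Two images of different sizes and aspect ratios:
used = image_token_count(1024, 768) + image_token_count(512, 512)
remaining = 128_000 - used  # budget left for text and further images
```

Under these assumptions a 1024x768 image costs a few thousand tokens, so many images plus substantial text fit comfortably within the 128k-token window.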