I experimented with deriving a data model iteratively (something I am currently calling “data model distillation”): I pass multiple images to a language model one at a time (this could work with text as well) and prompt it to improve the schema using anything new it learns from the current image. Results so far have been unimpressive.
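Roughly, the loop looks like the sketch below. This is a minimal illustration, not the exact code I ran: `distill_schema`, `ask_model`, and the prompt wording are hypothetical names made up for this example, and it assumes the schema is exchanged as JSON. `ask_model` stands in for whatever client actually sends a prompt plus an image to the model.

```python
import json
from typing import Callable, Iterable, Optional

# Hypothetical callable: takes a prompt and one image's bytes,
# returns the model's text reply. Swap in a real client here.
ModelFn = Callable[[str, bytes], str]

# Assumed prompt wording, for illustration only.
PROMPT_TEMPLATE = (
    "Here is the current data model as JSON:\n{schema}\n\n"
    "Look at the attached image and return an improved data model as JSON, "
    "incorporating anything new you learn from it."
)


def distill_schema(
    images: Iterable[bytes],
    ask_model: ModelFn,
    initial_schema: Optional[dict] = None,
) -> dict:
    """Refine a schema by showing the model one image at a time."""
    schema = initial_schema if initial_schema is not None else {}
    for image in images:
        prompt = PROMPT_TEMPLATE.format(schema=json.dumps(schema, indent=2))
        reply = ask_model(prompt, image)
        try:
            candidate = json.loads(reply)
        except json.JSONDecodeError:
            # Reply was not valid JSON: keep the previous schema and move on.
            continue
        if isinstance(candidate, dict):
            schema = candidate
    return schema
```

Each pass feeds the previous schema back in, so the order of the images can affect the final result, and a single bad reply is skipped rather than wiping out what was learned so far.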
I’ve been hearing good things about mistral-large-2.
I’m working on adding it to bots-doing-things but have had a bit of dependency trouble so far.