Set up a Temporal worker in Ruby and got familiar with its ergonomics.
Tried out this gpt-4v demo repo.
Experimented with the OCR capabilities of open-source multi-modal language models.
Tried llava:32b (1.6) and bakllava, but neither seemed to touch gpt-4-vision's performance.
It was cool to see the former run on a MacBook, though.