2024-07-10

Logs

VLMs are Blind showed a number of interesting cases where vision-language models fail at problems that humans solve easily. I spent some time trying to build examples with additional context that could steer the model to correctly complete Task 1 (counting line intersections), but didn’t have much success.
