Image → Text in your browser (Transformers.js + WebGPU)

Probing environment…

Caption an image (file upload)

Preview will appear here

Output

Loading model…

Model: Xenova/vit-gpt2-image-captioning
Backend:
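
The page does not show how the backend is probed or how the model is loaded, so the following is only a minimal sketch of one way it could work. It assumes the `@huggingface/transformers` package (Transformers.js v3, resolved through a bundler or import map) and its `device` pipeline option. The helper names `probeBackend` and `captionImage` are hypothetical, and the fallback to `"wasm"` is an assumption rather than something the page states.

```javascript
// Hypothetical helper: decide which device to request from Transformers.js.
// Returns "webgpu" only if the browser exposes navigator.gpu AND an adapter
// is actually available; otherwise falls back to "wasm" (assumed fallback).
async function probeBackend(nav) {
  if (!nav || !nav.gpu) return "wasm";
  try {
    const adapter = await nav.gpu.requestAdapter();
    return adapter ? "webgpu" : "wasm";
  } catch {
    // requestAdapter can reject on some platforms; treat that as "no WebGPU".
    return "wasm";
  }
}

// Hypothetical helper: caption one image with the model named on the page.
// imageSource may be a URL string, e.g. an object URL for an uploaded file.
async function captionImage(imageSource, nav = globalThis.navigator) {
  const device = await probeBackend(nav);

  // Loaded lazily so the probe above works even before the library is fetched.
  const { pipeline } = await import("@huggingface/transformers");

  const captioner = await pipeline(
    "image-to-text",
    "Xenova/vit-gpt2-image-captioning",
    { device }
  );

  // The image-to-text pipeline returns an array of { generated_text } objects.
  const output = await captioner(imageSource);
  return { device, caption: output[0].generated_text };
}
```

With a file input, one might pass `URL.createObjectURL(file)` as `imageSource`, then write the returned `caption` into the Output area and `device` next to the Backend label.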