ViSketch-GPT: Collaborative Multi-Scale Feature Extraction for Hand-Drawn Sketch Retrieval
Learning Egocentric In-Hand Object Segmentation through Weak Supervision from Human Narrations
One Patch to Caption Them All: A Unified Zero-Shot Captioning Framework