Posts tagged 'ComputerVision'

List: ComputerVision (1)

IT Notes Blog!

[Paper] Generic Attention-model Explainability for Interpreting Bi-Modal and Encoder-Decoder Transformers: Explanation and Summary

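Introduction: Conventional computer vision models were mostly trained in a task-specific way on a fixed number of labels. With a transformer-based image-text model that encodes each modality, however, many downstream tasks can be performed without additional training. The first approach encodes the text with a transformer and the image with a ResNet or a transformer. The second approach concatenates a quantized image representation to the text tokens and feeds the result to a transformer model. There are likely other ways to combine text and images as well, but the two inputs and the prediction m..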

Computer Vision 2021. 7. 20. 19:57
