VisQA: X-raying Vision and Language Reasoning in Transformers

For more information on this tool and how to use it, please refer to our video demo and paper.

Authors

Théo Jaunet

Corentin Kervadec

Romain Vuillemot

Grigory Antipov

Moez Baccouche

Christian Wolf

BibTeX

                    
@article{Jaunet2021VisQA,
    author    = {Theo Jaunet and Corentin Kervadec and Romain Vuillemot and Grigory Antipov and Moez Baccouche and Christian Wolf},
    title     = {VisQA: X-raying Vision and Language Reasoning in Transformers},
    journal   = {IEEE Transactions on Visualization and Computer Graphics (TVCG)},
    note      = {to appear},
    year      = {2021},
    publisher = {IEEE}
}