Research on Visual Question Answering Published on arXiv
Visual Question Answering (VQA) is a recent topic in computer vision and natural language processing that has attracted a great deal of attention from the deep learning, computer vision, and natural language processing communities (Kafle and Kanan, 2017). I have collected and curated some publications from arXiv related to visual question answering, and the results are listed here. Please enjoy!
Last updated: August 14, 2020
Source: arXiv
No. | Year | Title | URL |
---|---|---|---|
1 | 2020 | Visual Question Answering Using Semantic Information from Image Descriptions | View |
2 | 2020 | Understanding Knowledge Gaps in Visual Question Answering: Implications for Gap Identification and Testing | View |
3 | 2020 | Generating Rationales in Visual Question Answering | View |
4 | 2020 | PathVQA: 30000+ Questions for Medical Visual Question Answering | View |
5 | 2020 | RUBi: Reducing Unimodal Biases in Visual Question Answering | View |
6 | 2020 | VQA-LOL: Visual Question Answering under the Lens of Logic | View |
7 | 2020 | Component Analysis for Visual Question Answering Architectures | View |
8 | 2020 | Augmenting Visual Question Answering with Semantic Frame Information in a Multitask Learning Approach | View |
9 | 2020 | Robust Explanations for Visual Question Answering | View |
10 | 2020 | Generating Question Relevant Captions to Aid Visual Question Answering | View |
11 | 2019 | Assessing the Robustness of Visual Question Answering | View |
12 | 2019 | Self-Critical Reasoning for Robust Visual Question Answering | View |
13 | 2019 | Learning Sparse Mixture of Experts for Visual Question Answering | View |
14 | 2019 | Inverse Visual Question Answering with Multi-Level Attentions | View |
15 | 2019 | Decoupled Box Proposal and Featurization with Ultrafine-Grained Semantic Labels Improve Image Captioning and Visual Question Answering | View |
16 | 2019 | VideoNavQA: Bridging the Gap between Visual and Embodied Question Answering | View |
17 | 2019 | Fusion of Detected Objects in Text for Visual Question Answering | View |
18 | 2019 | An Empirical Study on Leveraging Scene Graphs for Visual Question Answering | View |
19 | 2019 | A Comparative Evaluation of Visual and Natural Language Question Answering Over Linked Data | View |
20 | 2019 | Quantifying and Alleviating the Language Prior Problem in Visual Question Answering | View |
21 | 2019 | GQA: A New Dataset for Real-World Visual Reasoning and Compositional Question Answering | View |
22 | 2019 | Generating Natural Language Explanations for Visual Question Answering using Scene Graphs and Visual Attention | View |
23 | 2018 | Textually Enriched Neural Module Networks for Visual Question Answering | View |
24 | 2018 | Faithful Multimodal Explanation for Visual Question Answering | View |
25 | 2018 | Question-Guided Hybrid Convolution for Visual Question Answering | View |
26 | 2018 | Interpretable Visual Question Answering by Visual Grounding from Attention Supervision Mining | View |
27 | 2018 | Learning Visual Question Answering by Bootstrapping Hard Attention | View |
28 | 2018 | Question Relevance in Visual Question Answering | View |
29 | 2018 | Learning Visual Knowledge Memory Networks for Visual Question Answering | View |
30 | 2018 | Think Visually: Question Answering through Virtual Imagery | View |
31 | 2018 | R-VQA: Learning Visual Relation Facts with Semantic Attention for Visual Question Answering | View |
32 | 2018 | Reciprocal Attention Fusion for Visual Question Answering | View |
33 | 2018 | Explicit Reasoning over End-to-End Neural Architectures for Visual Question Answering | View |
34 | 2018 | Attention on Attention: Architectures for Visual Question Answering (VQA) | View |
35 | 2018 | Co-attending Free-form Regions and Detections with Multi-modal Multiplicative Feature Embedding for Visual Question Answering | View |
36 | 2018 | Learning to Count Objects in Natural Images for Visual Question Answering | View |
37 | 2018 | Dual Recurrent Attention Units for Visual Question Answering | View |
38 | 2017 | Interpretable Counting for Visual Question Answering | View |
39 | 2017 | Don’t Just Assume; Look and Answer: Overcoming Priors for Visual Question Answering | View |
40 | 2017 | Tips and Tricks for Visual Question Answering: Learnings from the 2017 Challenge | View |
41 | 2017 | MemexQA: Visual Memex Question Answering | View |
42 | 2017 | Visual Question Answering with Memory-Augmented Networks | View |
43 | 2017 | Learning Convolutional Text Representations for Visual Question Answering | View |
44 | 2017 | Survey of Visual Question Answering: Datasets and Techniques | View |
45 | 2017 | Speech-Based Visual Question Answering | View |
46 | 2017 | The Promise of Premise: Harnessing Question Premises in Visual Question Answering | View |
47 | 2017 | C-VQA: A Compositional Split of the Visual Question Answering (VQA) v1 | View |
48 | 2017 | Being Negative but Constructively: Lessons Learnt from Creating Better Visual Question Answering Datasets | View |
49 | 2017 | An Analysis of Visual Question Answering Algorithms | View |
50 | 2017 | Recurrent and Contextual Models for Visual Question Answering | View |
51 | 2017 | VQABQ: Visual Question Answering by Basic Questions | View |
52 | 2017 | Task-driven Visual Saliency and Attention-based Visual Question Answering | View |
53 | 2016 | VIBIKNet: Visual Bidirectional Kernelized Network for Visual Question Answering | View |
54 | 2016 | Making the V in VQA Matter: Elevating the Role of Image Understanding in Visual Question Answering | View |
55 | 2016 | Zero-Shot Visual Question Answering | View |
56 | 2016 | Hierarchical Question-Image Co-Attention for Visual Question Answering | View |
57 | 2016 | Proposing Plausible Answers for Open-ended Visual Question Answering | View |
58 | 2016 | Visual Question Answering: Datasets, Algorithms, and Future Challenges | View |
59 | 2016 | The Color of the Cat is Gray: 1 Million Full-Sentences Visual Question Answering (FSVQA) | View |
60 | 2016 | Graph-Structured Representations for Visual Question Answering | View |
61 | 2016 | Measuring Machine Intelligence Through Visual Question Answering | View |
62 | 2016 | Interpreting Visual Question Answering Models | View |
63 | 2016 | Analyzing the Behavior of Visual Question Answering Models | View |
64 | 2016 | Human Attention in Visual Question Answering: Do Humans and Deep Networks Look at the Same Regions? | View |
65 | 2016 | Multimodal Compact Bilinear Pooling for Visual Question Answering and Visual Grounding | View |
66 | 2016 | Hierarchical Co-Attention for Visual Question Answering | View |
67 | 2016 | Ask Your Neurons: A Deep Learning Approach to Visual Question Answering | View |
68 | 2016 | A Focused Dynamic Attention Model for Visual Question Answering | View |
69 | 2016 | Ask, Attend and Answer: Exploring Question-Guided Spatial Attention for Visual Question Answering | View |
70 | 2016 | VQA: Visual Question Answering | View |
71 | 2016 | Dynamic Memory Networks for Visual and Textual Question Answering | View |