Research on Visual Question Answering Published on arXiv
Visual Question Answering (VQA) is a recent problem in computer vision and natural language processing that has attracted a great deal of attention from the deep learning, computer vision, and natural language processing communities (Kafle and Kanan, 2017). I have collected and curated a list of arXiv publications related to visual question answering, and the results are listed below. Enjoy!
Last updated: August 14, 2020
Source: arXiv
| No. | Year | Title | URL |
|---|---|---|---|
| 1 | 2020 | Visual Question Answering Using Semantic Information from Image Descriptions | View |
| 2 | 2020 | Understanding Knowledge Gaps in Visual Question Answering: Implications for Gap Identification and Testing | View |
| 3 | 2020 | Generating Rationales in Visual Question Answering | View |
| 4 | 2020 | PathVQA: 30000+ Questions for Medical Visual Question Answering | View |
| 5 | 2020 | RUBi: Reducing Unimodal Biases in Visual Question Answering | View |
| 6 | 2020 | VQA-LOL: Visual Question Answering under the Lens of Logic | View |
| 7 | 2020 | Component Analysis for Visual Question Answering Architectures | View |
| 8 | 2020 | Augmenting Visual Question Answering with Semantic Frame Information in a Multitask Learning Approach | View |
| 9 | 2020 | Robust Explanations for Visual Question Answering | View |
| 10 | 2020 | Generating Question Relevant Captions to Aid Visual Question Answering | View |
| 11 | 2019 | Assessing the Robustness of Visual Question Answering | View |
| 12 | 2019 | Self-Critical Reasoning for Robust Visual Question Answering | View |
| 13 | 2019 | Learning Sparse Mixture of Experts for Visual Question Answering | View |
| 14 | 2019 | Inverse Visual Question Answering with Multi-Level Attentions | View |
| 15 | 2019 | Decoupled Box Proposal and Featurization with Ultrafine-Grained Semantic Labels Improve Image Captioning and Visual Question Answering | View |
| 16 | 2019 | VideoNavQA: Bridging the Gap between Visual and Embodied Question Answering | View |
| 17 | 2019 | Fusion of Detected Objects in Text for Visual Question Answering | View |
| 18 | 2019 | An Empirical Study on Leveraging Scene Graphs for Visual Question Answering | View |
| 19 | 2019 | A Comparative Evaluation of Visual and Natural Language Question Answering Over Linked Data | View |
| 20 | 2019 | Quantifying and Alleviating the Language Prior Problem in Visual Question Answering | View |
| 21 | 2019 | GQA: A New Dataset for Real-World Visual Reasoning and Compositional Question Answering | View |
| 22 | 2019 | Generating Natural Language Explanations for Visual Question Answering using Scene Graphs and Visual Attention | View |
| 23 | 2018 | Textually Enriched Neural Module Networks for Visual Question Answering | View |
| 24 | 2018 | Faithful Multimodal Explanation for Visual Question Answering | View |
| 25 | 2018 | Question-Guided Hybrid Convolution for Visual Question Answering | View |
| 26 | 2018 | Interpretable Visual Question Answering by Visual Grounding from Attention Supervision Mining | View |
| 27 | 2018 | Learning Visual Question Answering by Bootstrapping Hard Attention | View |
| 28 | 2018 | Question Relevance in Visual Question Answering | View |
| 29 | 2018 | Learning Visual Knowledge Memory Networks for Visual Question Answering | View |
| 30 | 2018 | Think Visually: Question Answering through Virtual Imagery | View |
| 31 | 2018 | R-VQA: Learning Visual Relation Facts with Semantic Attention for Visual Question Answering | View |
| 32 | 2018 | Reciprocal Attention Fusion for Visual Question Answering | View |
| 33 | 2018 | Explicit Reasoning over End-to-End Neural Architectures for Visual Question Answering | View |
| 34 | 2018 | Attention on Attention: Architectures for Visual Question Answering (VQA) | View |
| 35 | 2018 | Co-attending Free-form Regions and Detections with Multi-modal Multiplicative Feature Embedding for Visual Question Answering | View |
| 36 | 2018 | Learning to Count Objects in Natural Images for Visual Question Answering | View |
| 37 | 2018 | Dual Recurrent Attention Units for Visual Question Answering | View |
| 38 | 2017 | Interpretable Counting for Visual Question Answering | View |
| 39 | 2017 | Don’t Just Assume; Look and Answer: Overcoming Priors for Visual Question Answering | View |
| 40 | 2017 | Tips and Tricks for Visual Question Answering: Learnings from the 2017 Challenge | View |
| 41 | 2017 | MemexQA: Visual Memex Question Answering | View |
| 42 | 2017 | Visual Question Answering with Memory-Augmented Networks | View |
| 43 | 2017 | Learning Convolutional Text Representations for Visual Question Answering | View |
| 44 | 2017 | Survey of Visual Question Answering: Datasets and Techniques | View |
| 45 | 2017 | Speech-Based Visual Question Answering | View |
| 46 | 2017 | The Promise of Premise: Harnessing Question Premises in Visual Question Answering | View |
| 47 | 2017 | C-VQA: A Compositional Split of the Visual Question Answering (VQA) v1.0 Dataset | View |
| 48 | 2017 | Being Negative but Constructively: Lessons Learnt from Creating Better Visual Question Answering Datasets | View |
| 49 | 2017 | An Analysis of Visual Question Answering Algorithms | View |
| 50 | 2017 | Recurrent and Contextual Models for Visual Question Answering | View |
| 51 | 2017 | VQABQ: Visual Question Answering by Basic Questions | View |
| 52 | 2017 | Task-driven Visual Saliency and Attention-based Visual Question Answering | View |
| 53 | 2016 | VIBIKNet: Visual Bidirectional Kernelized Network for Visual Question Answering | View |
| 54 | 2016 | Making the V in VQA Matter: Elevating the Role of Image Understanding in Visual Question Answering | View |
| 55 | 2016 | Zero-Shot Visual Question Answering | View |
| 56 | 2016 | Hierarchical Question-Image Co-Attention for Visual Question Answering | View |
| 57 | 2016 | Proposing Plausible Answers for Open-ended Visual Question Answering | View |
| 58 | 2016 | Visual Question Answering: Datasets, Algorithms, and Future Challenges | View |
| 59 | 2016 | The Color of the Cat is Gray: 1 Million Full-Sentences Visual Question Answering (FSVQA) | View |
| 60 | 2016 | Graph-Structured Representations for Visual Question Answering | View |
| 61 | 2016 | Measuring Machine Intelligence Through Visual Question Answering | View |
| 62 | 2016 | Interpreting Visual Question Answering Models | View |
| 63 | 2016 | Analyzing the Behavior of Visual Question Answering Models | View |
| 64 | 2016 | Human Attention in Visual Question Answering: Do Humans and Deep Networks Look at the Same Regions? | View |
| 65 | 2016 | Multimodal Compact Bilinear Pooling for Visual Question Answering and Visual Grounding | View |
| 66 | 2016 | Hierarchical Co-Attention for Visual Question Answering | View |
| 67 | 2016 | Ask Your Neurons: A Deep Learning Approach to Visual Question Answering | View |
| 68 | 2016 | A Focused Dynamic Attention Model for Visual Question Answering | View |
| 69 | 2016 | Ask, Attend and Answer: Exploring Question-Guided Spatial Attention for Visual Question Answering | View |
| 70 | 2016 | VQA: Visual Question Answering | View |
| 71 | 2016 | Dynamic Memory Networks for Visual and Textual Question Answering | View |