Universal Image Segmentation is not a new concept. Past attempts to unify image segmentation in the last decades include scene parsing, panoptic segmentation, and, more recently, new panoptic architectures. We propose OneFormer, a universal image …
In this paper, we revisited the core design ideas of state-of-the-art deep inpainting networks. We propose an intuitive and effective inpainting architecture that augments the powerful comodulated StyleGAN2 generator with the high receptiveness …
Embodied Instruction Following (EIF) is a challenging problem requiring an agent to infer a sequence of actions to achieve a goal environment state from complex language and visual inputs. We propose a generalised Language Guided Meta-Controller …
In this paper, we address the problem of offensive language detection on Twitter, while also detecting the type and the target of the offence. We propose a novel approach called SyLSTM, which integrates syntactic features in the form of the …
Pre-trained neural Language Models (PTLMs), such as CodeBERT, have recently been used in software engineering as models pre-trained on large source code corpora. Although adapters are known to facilitate adapting to many downstream tasks compared to …
The visual relationship recognition (VRR) task aims at understanding the pairwise visual relationships between interacting objects in an image. This paper shows that modeling an effective message-passing flow through an attention mechanism can be …
Finetuning a pretrained backbone in the encoder part of an image transformer network has been the traditional approach for the semantic segmentation task. However, such an approach leaves out the semantic context that an image provides during the …
Several approaches have been proposed in recent literature to alleviate the long-tail problem, mainly in object classification tasks. In this paper, we present the first large-scale study concerning the task of Long-Tail Visual Relationship Recognition …
Stock price movement and volatility prediction aim to predict stocks' future trends to help investors make sound investment decisions and model financial risk. Companies' earnings calls are a rich, underexplored source of multimodal information for …
Recent approaches for learning policies to improve caching target just one of the prefetching, admission, and eviction processes. In contrast, we propose an end-to-end pipeline to learn all three policies using machine learning. We also take …