Information for perceiving the world and modeling human behavior can be gathered from different types of sensors and under different environmental and social conditions. These variations produce different data modalities and sources (such as videos, optical, thermal and depth images, body poses, and text), which can provide complementary patterns and cues for the problem of interest. PAVIS has expertise in developing computational pattern recognition techniques that aggregate and combine multi-source data to automatically inspect and retrieve useful information in a multi-modal learning paradigm.
With sensory inputs of different modalities, including visual, audio and range data, PAVIS develops algorithms that digitize and model the physical world, either densely as point clouds or compactly as scene graphs with spatial semantics. We believe that, for complete scene understanding, the scene representation should encode not only geometric attributes but also scene semantics in static or dynamic form. This enables an AI system to answer queries such as ‘Where are things?’, ‘What are their functions?’ and ‘What are the dynamics of the scene?’.
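As a rough illustration of the compact representation mentioned above, the following minimal sketch stores a scene graph as objects (nodes) plus spatial relations (edges) and answers the three example queries. It is not PAVIS's implementation: all class, field and method names (`SceneObject`, `SceneGraph`, `where_is`, `function_of`, `moving_objects`) and the toy scene are hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

Vec3 = Tuple[float, float, float]


@dataclass
class SceneObject:
    # Hypothetical node type: geometry, semantics and simple dynamics.
    name: str
    position: Vec3                      # "where are things?"
    function: str                       # "what are their functions?"
    velocity: Vec3 = (0.0, 0.0, 0.0)    # "what are the dynamics?"


@dataclass
class SceneGraph:
    # Nodes are keyed by name; edges are (subject, predicate, object) triples.
    objects: Dict[str, SceneObject] = field(default_factory=dict)
    relations: List[Tuple[str, str, str]] = field(default_factory=list)

    def add_object(self, obj: SceneObject) -> None:
        self.objects[obj.name] = obj

    def relate(self, subject: str, predicate: str, target: str) -> None:
        # Only allow relations between objects already in the graph.
        if subject not in self.objects or target not in self.objects:
            raise KeyError("both objects must be added before relating them")
        self.relations.append((subject, predicate, target))

    def where_is(self, name: str) -> Vec3:
        return self.objects[name].position

    def function_of(self, name: str) -> str:
        return self.objects[name].function

    def related(self, name: str, predicate: str) -> List[str]:
        # Targets of all edges leaving `name` with the given predicate.
        return [t for s, p, t in self.relations if s == name and p == predicate]

    def moving_objects(self) -> List[str]:
        # Objects with any non-zero velocity component.
        return [o.name for o in self.objects.values() if any(o.velocity)]


# Toy scene (invented values): a cup on a table, a person walking by.
graph = SceneGraph()
graph.add_object(SceneObject("table", (1.0, 0.0, 0.0), "support surface"))
graph.add_object(SceneObject("cup", (1.0, 0.1, 0.8), "holds liquid"))
graph.add_object(SceneObject("person", (3.0, 2.0, 0.0), "agent", (0.5, 0.0, 0.0)))
graph.relate("cup", "on", "table")

print(graph.where_is("cup"))
print(graph.function_of("cup"))
print(graph.related("cup", "on"))
print(graph.moving_objects())
```

A dense point cloud would store many more 3D samples per object. The graph form trades that detail for a small structure that can be queried directly.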
Machine learning models can accomplish complex, high-level tasks (such as classification, detection or retrieval) by discovering the information conveyed in the data itself. PAVIS has experience and know-how in machine learning approaches that take into account geometrical cues, common knowledge and physical principles. The objective is to deploy models that generalize as far as possible beyond the distribution of the training data. This property is pivotal for methods to be applicable “in the wild”, so that models need not be re-trained or fine-tuned when new application scenarios arise.
PAVIS is developing deep learning solutions that obtain data-driven representations built hierarchically at increasing levels of abstraction, while also solving the task of interest (such as classification, regression or ranking) end to end. These learned representations are often more effective than hand-crafted ones and are a convenient replacement for them, since hand-crafted representations require human expertise to design. At PAVIS, we investigate several deep learning methods in order to better suit the selected application and research fields. We also investigate deep learning from a more theoretical perspective, in order to shed light on its principles and to make the models more explainable.