Brain Networks Laboratory (Choe Lab)

[Deepmind] Perceiver IO

Aug 11, 2021

Wow, this looks impressive! A single model that learns from many different types of data (video, audio, point clouds, etc.) and also learns complex tasks like game playing.

The model is based on transformers.

Dan Lovy shared a link to the group: Bio - A.I.


Most deep learning models we build these days are highly optimized for a specific type of dataset. Architectures that are good at processing textual data can't be applied to computer vision or audio analysis. That level of specialization naturally leads to models that are tied to a given task and cannot adapt to others. This constraint contrasts sharply with human cognition, in which many tasks require diverse inputs such as vision and audio. Recently, DeepMind published two papers unveiling general-purpose architectures that can process different types of input datasets.

The first paper, titled “Perceiver: General Perception with Iterative Attention”, introduces Perceiver, a transformer architecture that can process data including images, point clouds, audio, video, and their combinations, but it is limited to simple tasks such as classification. In “Perceiver IO: A General Architecture for Structured Inputs & Outputs”, DeepMind presents Perceiver IO, a more general version of the Perceiver model that can be applied to complex multi-modal tasks such as computer games.
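
As a rough illustration of how one model can accept inputs of different kinds and sizes, the sketch below shows the cross-attention idea described in the Perceiver paper: a small, fixed-size latent array attends over an input array of any length, so the output shape does not depend on the input. This is a toy in plain Python, not DeepMind's implementation. The names (`cross_attend`, `random_array`), the sizes, and the random data are all assumptions made for illustration, and the learned query/key/value projections are left out.

```python
import math
import random

def softmax(row):
    # Numerically stable softmax over a list of scores.
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def cross_attend(latents, inputs):
    """Each latent vector (query) attends over all input vectors (keys/values).

    Learned projections are omitted; this only shows the shape behavior.
    """
    dim = len(latents[0])
    scale = 1.0 / math.sqrt(dim)
    out = []
    for q in latents:
        scores = [scale * sum(qi * ki for qi, ki in zip(q, k)) for k in inputs]
        weights = softmax(scores)
        out.append([sum(w * v[d] for w, v in zip(weights, inputs))
                    for d in range(dim)])
    return out

def random_array(rows, dim, rng):
    # Hypothetical stand-in for real embedded data.
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(rows)]

rng = random.Random(0)
DIM = 8
latents = random_array(4, DIM, rng)        # small, fixed-size latent array
image_like = random_array(196, DIM, rng)   # e.g. flattened image patches
audio_like = random_array(50, DIM, rng)    # e.g. audio frames

from_image = cross_attend(latents, image_like)
from_audio = cross_attend(latents, audio_like)

# Both results have the latent array's shape, whatever the input length.
print(len(from_image), len(from_image[0]))
print(len(from_audio), len(from_audio[0]))
```

Because the result always has the shape of the latent array, the same downstream layers can follow regardless of whether the input was 196 image patches or 50 audio frames.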

PUB.TOWARDSAI.NET

DeepMind’s New Super Model: Perceiver IO is a Transformer that can Handle Any Dataset

https://pub.towardsai.net/deepminds-new-super-model-perceiver-io-is-a-transformer-that-can-handle-any-dataset-dfcffa85fe61

