Deep learning methods are machine learning methods using multiple processing layers or levels of abstraction. Deep learning algorithms are usually further characterized by having a simple and versatile structure. Specifically, deep learning is usually based on
feed forward or recursive multilayer neural networks to learn a particular model. A successful application of deep learning technologies consists in selecting a good architecture for the neural network as well as an effective training procedure to learn the parameters of the network.
In recent years, modeling using neural networks has emerged again very strongly thanks to the results on effective learning algorithms for deep and recursive neural networks. Other important factors of this renaissance are the availability of higher computing power and large databases. Large databases are necessary to train multilayer structures with a large number of parameters and computational resources make this process possible in a reasonable time. Although its widespread use started a few years ago, and
despite the difficulty of analyzing the behavior of deep learning algorithms, the impact of deep learning is already very important in areas as image, speech and text processing in both research and commercial applications. In speech recognition, for example,
we have now systems based on a simple generic deep learning architecture that outperform traditional speech recognition systems based on a complex architecture with many speech-specific processing modules.
This project proposes the development of new deep learning methods for speech and audio processing, exploring new applications and continuing the initial work of the research team and the international community.
The project includes a comprehensive work package dedicated to deep learning and four other work packages dedicated to speech and speaker recognition, acoustic event detection, voice synthesis and voice translation. In the first work package we will derive
new architectures and learning algorithms, taking into account the computational cost and the scalability to large databases, while the next work packages will explore their application in speech and audio processing.
In the framework of this work we plan to continue with the dissemination of the results and the collaboration with other national and international research groups. We will also expect to transfer the results to the multiple companies already interested in the theme of the project and its results. Specifically, the work plan details the cooperation with the Hospital Sant Joan de Déu of Barcelona in the monitorization of the neonatal intensive care unit.
We will put also emphasis on the evaluation of the results