Databases for Speech Recognition
Spanish
Catalan
- EUROM 1
The EUROM 1 Spanish Database was recorded in the framework of the European ESPRIT Project 6819 SAM-A. The database was designed for Automatic Speech Recognition assessment purposes. It contains recordings from sixty speakers made in a true anechoic room. Speakers were selected to obtain good dialectal coverage. The database contains numbers, passages, sentences, CVCV, and CVCV in carrier phrases. The sampling frequency is 20 kHz.
- Albayzin
Sentences. 200 speakers. Silent room. 20 kHz.
The Albayzin Spanish corpus consists of 3 sub-corpora of 16 kHz, 16-bit signals, recorded by 304 Castilian speakers.
- SpeechDat Spanish FDB - 4000 speakers.
In the framework of the SpeechDat (II) project (LE2-4001), which was funded by the EC, a large telephone speech corpus has been collected and processed. Recording was done using an ISDN telephone interface, yielding 8 kHz, 8 bit/sample A-law coded signals. The corpus contains the speech of 4000 speakers (about half male and half female). Equivalent corpora have been collected for other European languages.
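As an illustration of what "8 kHz, 8 bit/sample A-law coded" means in practice, the following is a minimal sketch of expanding A-law bytes to linear PCM, following the standard ITU-T G.711 expansion. It assumes the input is a headerless sequence of A-law bytes; the function names `alaw_to_linear`, `decode_alaw` and `duration_seconds` are hypothetical and are not part of the database distribution or its tools.

```python
SAMPLE_RATE_HZ = 8000  # sampling rate stated for the corpus


def alaw_to_linear(code: int) -> int:
    """Expand one 8-bit A-law code to a signed linear sample (G.711)."""
    code ^= 0x55                      # undo the even-bit inversion
    value = (code & 0x0F) << 4        # 4-bit mantissa
    segment = (code & 0x70) >> 4      # 3-bit segment (exponent)
    if segment == 0:
        value += 8
    else:
        value += 0x108
        value <<= segment - 1
    # In A-law, a set sign bit marks a positive sample.
    return value if (code & 0x80) else -value


def decode_alaw(data: bytes) -> list[int]:
    """Decode a block of raw A-law bytes into linear samples."""
    return [alaw_to_linear(b) for b in data]


def duration_seconds(data: bytes) -> float:
    """Duration of a raw A-law block: one byte per sample at 8 kHz."""
    return len(data) / SAMPLE_RATE_HZ
```

Because each sample occupies exactly one byte, one second of speech takes 8000 bytes, so the size of a headerless signal gives its duration directly.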
The corpora are designed to support the creation of voice-driven teleservices. The callers spoke 40 items, comprising isolated and connected digits, natural numbers, money amounts, spellings, time and date phrases, confirmations/rejections, forenames and surnames, city names, company names, common application words, application words in phrases, and phonetically rich sentences. Most items are read; some are spontaneously spoken. The recordings come with extensive and standardised documentation. All speech is carefully transcribed at the orthographic level; in addition, a number of clearly audible non-speech events are included in the transcription. The age and regional background of the speakers are also provided. A pronunciation dictionary is included, containing all words that occur in the corpus, with a corresponding SAMPA broad-class phonemic transcription. The data files are formatted according to the ESPRIT Project SAM standards.
- SpeechDat Car Spanish
This database comprises recordings from 306 speakers recorded in 600 different sessions. Speech signals were recorded in a car and simultaneously transmitted by GSM and recorded on a fixed platform connected to an ISDN line. The SpeechDat Car Spanish Database was recorded within the scope of the SpeechDat Car project (LE4-8334), which was sponsored by the European Commission and the Spanish Government. Collection was performed at the Department of Signal Theory and Communications of the Technical University of Catalonia (UPC), Spain, with the collaboration of SEAT and Volkswagen. The owner of the database is UPC.
- SALA Venezuela FDB - 1000 speakers.
This database comprises telephone recordings from 1000 speakers recorded directly over the fixed PSTN using two analogue lines. Signals were sampled at 8 kHz and mu-law encoded without automatic gain control. The SALA Spanish Venezuelan Database for Fixed Telephone Network was recorded within the scope of the SALA project and supported by the Spanish Government. The design of the corpus and the collection were performed at the Universidad de los Andes (ULA), Mérida, Venezuela. Transcription and formatting were performed at the Technical University of Catalonia (UPC), Spain. The owner of the database is the Technical University of Catalonia (UPC), Spain.
- European Parliament Transcriptions DB
The EPPS Transcriptions database was produced within the scope of the TC-STAR project (FP6-506738), which was sponsored by the European Commission. The database comprises recordings of members of the European Parliament speaking in the European Parliament Plenary Sessions (EPPS), as well as recordings of interpreters. Recordings of the Spanish Parliament (PARL) are also included to achieve a total of 100 hours of transcribed speech. Transcription was performed by Applied Technologies on Language and Speech, S.L. (ATLAS), from Spain. The owner of the transcriptions is the Universitat Politècnica de Catalunya, from Spain.
- SpeechDat Catalan FDB - 1005 speakers
The SpeechDat Catalan Database for Fixed Telephone Network was collected at the TALP Research Center of the Universitat Politècnica de Catalunya (UPC). The production of this database was partially funded by the Centre de Referència en Enginyeria Lingüística (CREL). The Laboratori de Fonètica of the Universitat Autònoma de Barcelona (UAB) collaborated in the phonetic work.
This database comprises telephone recordings from 1005 speakers recorded directly over the fixed PSTN using an E-1 interface.
This database was designed following the specifications of the SpeechDat II project. Equivalent corpora have been collected for other European languages (e.g. Spanish).
The corpora are designed to support the creation of voice-driven teleservices in Catalan. The callers spoke 40 items, comprising isolated and connected digits, natural numbers, money amounts, spellings, time and date phrases, confirmations/rejections, forenames and surnames, city names, company names, common application words, application words in phrases, and phonetically rich sentences. Most items are read; some are spontaneously spoken. The recordings come with extensive and standardised documentation. All speech is carefully transcribed at the orthographic level; in addition, a number of clearly audible non-speech events are included in the transcription. The age and regional background of the speakers are also provided. A pronunciation dictionary is included, containing all words that occur in the corpus, with a corresponding SAMPA broad-class phonemic transcription. The data files are formatted according to the ESPRIT Project SAM standards.
- SpeechDat Catalan FDB - 2000 speakers
The Universitat Politècnica de Catalunya (UPC) and Applied Technologies on Language and Speech (ATLAS) have recorded and processed a large speech database within this project, funded by the Generalitat de Catalunya. The recordings were made using an ISDN telephone interface with a sampling rate of 8 kHz, 8 bits per sample, and A-law encoding.
The corpus contains the voices of 2000 people, half of them women and the other half men. Equivalent corpora have been collected for other European languages.
The corpus has been designed to support the creation of voice-driven teleservices. The callers (the people whose voices are recorded) spoke 40 items, comprising isolated and connected digits, natural numbers, money amounts, spellings, time and date phrases, confirmations/rejections, forenames and surnames, city names, company names, common application words, application words in phrases, and phonetically rich sentences. Most items are read; some are spontaneously spoken. The recordings come with extensive and standardised documentation. All speech is carefully transcribed at the orthographic level; in addition, a number of clearly audible non-speech events are included in the transcription. The age and regional background of the speakers are also provided. A pronunciation dictionary is included, containing all words that occur in the corpus, with a corresponding SAMPA broad-class phonemic transcription. The data files are formatted according to the ESPRIT Project SAM standards.
- SpeechDat Catalan MDB - 2000 speakers
The Universitat Politècnica de Catalunya (UPC) and Applied Technologies on Language and Speech (ATLAS) have recorded and processed a large speech database within this project, funded by the Generalitat de Catalunya. The recordings were made using an ISDN telephone interface with a sampling rate of 8 kHz, 8 bits per sample, and A-law encoding.
The corpus contains the voice of 2000 persons, half of them women and the other half men. Equivalent corpora have been collected for other European languages.
The corpus has been designed to support the creation of voice-driven teleservices accessed from mobile telephones. The callers (the people whose voices are recorded) spoke 40 items, comprising isolated and connected digits, natural numbers, money amounts, spellings, time and date phrases, confirmations/rejections, forenames and surnames, city names, company names, common application words, application words in phrases, and phonetically rich sentences. Most items are read; some are spontaneously spoken. The recordings come with extensive and standardised documentation. All speech is carefully transcribed at the orthographic level; in addition, a number of clearly audible non-speech events are included in the transcription. The age and regional background of the speakers are also provided. A pronunciation dictionary is included, containing all words that occur in the corpus, with a corresponding SAMPA broad-class phonemic transcription. The data files are formatted according to the ESPRIT Project SAM standards.
- SpeechDat Car Catalan
The Universitat Politècnica de Catalunya (UPC) and Applied Technologies on Language and Speech (ATLAS) have recorded and processed a large speech database within this project, funded by the Generalitat de Catalunya.
This database contains the recordings of 600 different sessions made with 300 informants. Each session consists of 119 read phrases and other spontaneous phrases that have been recorded using 4 microphones installed in cars.
- SpeeCon Catalan
The Universitat Politècnica de Catalunya (UPC) and Applied Technologies on Language and Speech (ATLAS) have recorded and processed a large speech database within this project, funded by the Generalitat de Catalunya.
The corpus contains the voices of 550 people, each recorded in one session, with approximately half of them women and the other half men. One session consists of about 291 read phrases and up to 30 additional items of spontaneous speech, recorded with 4 microphones using a mobile platform.