Publication

IEEE White Paper

Current

Note: Latest version: IEEE White Paper

Existing or new amendments and versions must be purchased separately.


Abstract

Status: Active.

This report summarizes the findings and recommendations of the speech subcommittee of the IEEE prestandardization effort on standards for Indian language resources. Speech processing technology deals with spoken language; it includes technologies such as language detection, speech transcription, and speech synthesis. For the official Indian languages, there is an identified gap in the status of such technologies, and the report focuses exclusively on these languages. It summarizes the available standards and practices; the key use cases driving the deployment of such technology; the priority gaps that need to be addressed to promote the adoption of speech technologies; the resources available and the resources required for the development of speech technologies; and the current metrics and datasets used to evaluate the various technologies. It also identifies gaps that need to be addressed.

Product specifications

  • Publication from IEEE
  • Document type: IS
  • Publisher: IEEE
  • Distributor: IEEE
  • National Committee: IEEE-SASB / Industry Connections Committee

Product Relations

  • Refers: [9] Pronunciation Lexicon Specification (PLS) Version 1.0, W3C Recommendation, 14 October 2008. Available: https://www.w3.org/TR/2008/REC-pronunciation-lexicon-20081014/
  • Refers: [6] Shrivastava, Rajeev, “Architecture for Cognitive Contact Center — Part 1,” Aug 1, 2020. Available: https://medium.com/@rshriv/reference-architecture-for-cognitive-contact-center-part-1-970fabf773a2
  • Refers: [13] Catalogue of Language Resources, ELRA. Available: http://www.elra.info/en/catalogues/catalogue-language-resources/
  • Refers: [23] MUCS 2021: MUltilingual and Code-Switching ASR Challenges for Low Resource Indian Languages, 12-13 August, 2021. Available: https://navana-tech.github.io/MUCS2021/
  • Refers: [4] Human-Centric Interfaces for Ambient Intelligence, Hamid Aghajan, Ramón López-Cózar Delgado and Juan Carlos Augusto, 2010. Available: https://www.sciencedirect.com/topics/computer-science/speaker-identification
  • Refers: [15] Validation of Content and Quality of Existing SLR: Overview and Methodology, ELRA, Ref. ELRA/9901/VAL-1/D1.1, Date: 15-07-02. Available: http://www.elra.info/media/filer_public/2013/12/04/d1-1.pdf
  • Refers: [22] Hindi-Tamil-English ASR Challenge, Speech Processing Lab, IITM. Available: https://sites.google.com/view/indian-language-asrchallenge/home
  • Refers: [20] SamudraVijaya K., “Indian Language Speech Label (ILSL): A de facto National Standard,” in Advances in Speech and Music Technology, Eds. A. Biswas, E. Wennekes, T. P. Hong and A. Wieczorkowska, Advances in Intelligent Systems and Computing (AISC), Springer; Preprint. Available: https://www.iitg.ac.in/clst/visitors/samudravijaya/publ/20FRSM_ILSL.pdf
  • Refers: [7] Speech Recognition Grammar Specification, Version 1.0, W3C Recommendation, 16 March 2004. Available: https://www.w3.org/TR/2004/REC-speech-grammar-20040316/
  • Refers: [10] Speech Synthesis Markup Language (SSML) Version 1.1, W3C Recommendation, 7 September 2010. Available: https://www.w3.org/TR/speech-synthesis11/
  • Refers: [18] Methodology for a Quick Quality Check of SLR and Phonetic Lexicons, D1.2, ELRA/0201/VAL-1, 18 Aug 2004. Available: http://www.elra.info/media/filer_public/2013/12/04/d12v23.doc
  • Refers: [3] Rix, A. W., J. G. Beerends, M. P. Hollier, and A. P. Hekstra, “Perceptual evaluation of speech quality (PESQ)-a new method for speech quality assessment of telephone networks and codecs,” 2001 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No.01CH37221), Salt Lake City, UT, USA, 2001, pp. 749–752 vol.2. Available: https://ieeexplore.ieee.org/document/941023
  • Refers: [14] Standards and Best Practices, ELRA. Available: http://www.elra.info/en/services-around-lrs/validation/standards-best-practices/
  • Refers: [11] CSS Speech Module, W3C Candidate Recommendation, 10 March 2020. Available: https://www.w3.org/TR/2020/CR-css-speech-1-20200310/
  • Refers: [2] Block Scheme of Speech Recognition, Speech Processing Laboratory, Department of Circuit Theory, Technická 2, 160 00 Prague 6, Project: Continuous and spontaneous speech recognition. Available: http://www.fel.cvut.cz/en/research/teams/speechlab/hmm_recog_eng-1.jpg
  • Refers: [5] Wellbourn, Robert, “Avoiding Contact Center IVR Hell with WebRTC.” Available: https://webrtchacks.com/webrtc-contact-center/
  • Refers: [17] Validation criteria, INCO-COP-977017, ED1.4.2, SPEECHDAT(E), Dated 27-Oct-1999. Available: http://lands.let.ru.nl/spex/validationcentre/ed142v13.pdf
  • Refers: [8] Semantic Interpretation for Speech Recognition (SISR) Version 1.0, W3C Recommendation, 5 April 2007. Available: https://www.w3.org/TR/2007/REC-semantic-interpretation-20070405/
  • Refers: [19] European Language Grid, v: Release1.1.1, Copyright 2020, ELG Technical Team. Revision 752bf184. Available: https://european-language-grid.readthedocs.io/en/release1.1.1/all/2_Using/Browse.html
  • Refers: [1] Language matrix: International typography on the Web, World Wide Web Consortium, W3C®, 2017. Available: https://www.w3.org/International/typography/gap-analysis/language-matrix.html
  • Refers: [16] Speech Driven Interfaces for Consumer Applications, SPEECON Deliverable 41, IST-1999-10003, SPEECON. Available: http://lands.let.ru.nl/spex/validationcentre/D41_V2.0_Post-final.pdf
  • Refers: [12] Quality Control of Language Resources at ELRA. Available: https://hdl.handle.net/2066/76444
  • Refers: [21] A Study on Open Voice Data in Indian Languages, Feb 08, 2021. Available: https://indiaai.gov.in/research-reports/a-study-on-open-voice-data-in-indian-languages