Phone-Level Pronunciation Scoring for Spanish Speakers Learning English Using a GOP-DNN System

Authors: Jazmín Vidal, Cyntia Bonomi, Marcelo Sancinetti, Luciana Ferrer

Abstract:

In today’s globalized world, being able to communicate in English is crucial for many people. Computer-assisted pronunciation training (CAPT) systems can help students achieve English proficiency by providing an accessible way to practice and offering personalized feedback. However, phone-level pronunciation scoring is still a very challenging task, with performance far from that of human annotators. In this paper we present and compare results on the Spanish subset of the L2-ARCTIC corpus and the new EpaDB database, both containing non-native English speech by native Spanish speakers and intended for the development of pronunciation scoring systems. We show the most frequent errors in each database and compare the performance of a state-of-the-art goodness of pronunciation (GOP) system on both. Results show that both databases have similar error patterns and that performance is similar for most phones, despite differences in recording conditions. For the EpaDB database we also present an analysis of the errors per target phone. This study validates the EpaDB collection and annotations, providing initial results and contributing to the advancement of a challenging low-resource task.

More information: https://www.isca-speech.org/archive/interspeech_2021/vidal21_interspeech.html

Posted by Andres Juarez, 6 May 2022 | Papers
