Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining
Incremental learning is one paradigm for building and updating models at scale with streaming data. For end-to-end automatic speech recognition (ASR) tasks, the absence of human-annotated labels, combined with the need for privacy-preserving policies for model building, makes this a daunting challenge. Motivated by these challenges, in this paper we use a cloud-based framework for production systems to demonstrate insights from privacy-preserving incremental learning for automatic speech recognition (ILASR). By privacy preserving, we mean the use of ephemeral data that are not human annotated. This system is a step forward for production-level ASR models for incremental/continual learning: it offers a near-real-time test bed for experimentation in the cloud for end-to-end ASR while adhering to privacy-preserving policies. We show that the proposed system can improve the production models significantly (3%) over a new time period of six months, even in the absence of human-annotated labels, with varying levels of weak supervision and large batch sizes in incremental learning. This improvement is 20% over test sets with new words and phrases in the new time period. We demonstrate the effectiveness of model building in a privacy-preserving incremental fashion for ASR, while further exploring the utility of having an effective teacher model and the use of large batch sizes.
CCS CONCEPTS • Computing methodologies → Speech recognition; Neural networks; Semi-supervised learning settings; • Security and privacy → Privacy-preserving protocols.
The energy consumption of consumer electronics has skyrocketed. According to the International Energy Agency (IEA), it will triple over the next two decades, reaching a level equivalent to the present total home electricity consumption of the U.S. and Japan combined. Improving the energy efficiency of these devices has therefore become critical. Our project, called ugreen, seeks to motivate power consumption awareness and behavioral change among laptop computer users via human-computer interaction (HCI) feedback mechanisms, with college students as the initial target group. We chose college students because they are avid laptop users, with 87.7% of them reporting laptop ownership, and because user behavior plays an important role in total laptop system power consumption.
Papers by Pankaj Sitpure