
arxiv (572)

[2022-08-15] Today's Natural Language Processing: Speech Synthesis with Mixed Emotions. Emotional speech synthesis aims to synthesize human voices with various emotional effects. The current studies are mostly focused on imitating an averaged style belonging to a specific emotion type. In this paper, we seek to generate speech with a mixture of emotions at run-time. We propose a novel formulation that measures the relative difference between the.. 2022. 8. 15.
[2022-08-15] Today's Natural Language Processing: RealityTalk: Real-Time Speech-Driven Augmented Presentation for AR Live Storytelling. We present RealityTalk, a system that augments real-time live presentations with speech-driven interactive virtual elements. Augmented presentations leverage embedded visuals and animation for engaging and expressive storytelling. However, existing tools for live presentations often lack interactivity and improv.. 2022. 8. 15.
[2022-08-11] Today's Natural Language Processing: CLEVR-Math: A Dataset for Compositional Language, Visual and Mathematical Reasoning. We introduce CLEVR-Math, a multi-modal math word problems dataset consisting of simple math word problems involving addition/subtraction, represented partly by a textual description and partly by an image illustrating the scenario. The text describes actions performed on the scene that is depicted in the image. S.. 2022. 8. 11.
[2022-08-10] Today's Natural Language Processing: Information Extraction from Scanned Invoice Images using Text Analysis and Layout Features. While storing invoice content as metadata to avoid paper document processing may be the future trend, almost all of daily issued invoices are still printed on paper or generated in digital formats such as PDFs. In this paper, we introduce the OCRMiner system for information extraction from scanned document.. 2022. 8. 10.