All categories (599)

[2022-08-30] Today's NLP: MDIA: A Benchmark for Multilingual Dialogue Generation in 46 Languages. Owing to the lack of corpora for low-resource languages, current works on dialogue generation have mainly focused on English. In this paper, we present mDIA, the first large-scale multilingual benchmark for dialogue generation across low- to high-resource languages. It covers real-life conversations in 46 languages across 19 .. 2022. 8. 30.
[2022-08-30] Today's NLP: Cross-Modality Gated Attention Fusion for Multimodal Sentiment Analysis. Multimodal sentiment analysis is an important research task to predict the sentiment score based on the different modality data from a specific opinion video. Many previous pieces of research have proved the significance of utilizing the shared and unique information across different modalities. However, the high-order combi.. 2022. 8. 30.
[2022-08-29] Today's NLP: Training a T5 Using Lab-sized Resources. Training large neural language models on large datasets is resource- and time-intensive. These requirements create a barrier to entry, where those with fewer resources cannot build competitive models. This paper presents various techniques for making it possible to (a) train a large language model using resources that a modest research lab might have, and .. 2022. 8. 29.
[2022-08-29] Today's NLP: Kencorpus: A Kenyan Language Corpus of Swahili, Dholuo and Luhya for Natural Language Processing Tasks. Indigenous African languages are categorized as under-served in Artificial Intelligence and suffer poor digital inclusivity and information access. The challenge has been how to use machine learning and deep learning models without the requisite data. Kencorpus is a Kenyan Language corpus that .. 2022. 8. 29.