본문 바로가기
반응형

arxiv572

[2022-08-29] 오늘의 자연어처리 Training a T5 Using Lab-sized Resources Training large neural language models on large datasets is resource- and time-intensive. These requirements create a barrier to entry, where those with fewer resources cannot build competitive models. This paper presents various techniques for making it possible to (a) train a large language model using resources that a modest research lab might have, and .. 2022. 8. 29.
[2022-08-29] 오늘의 자연어처리 Kencorpus: A Kenyan Language Corpus of Swahili, Dholuo and Luhya for Natural Language Processing Tasks Indigenous African languages are categorized as under-served in Artificial Intelligence and suffer poor digital inclusivity and information access. The challenge has been how to use machine learning and deep learning models without the requisite data. Kencorpus is a Kenyan Language corpus that .. 2022. 8. 29.
[2022-08-28] 오늘의 자연어처리 DPTDR: Deep Prompt Tuning for Dense Passage Retrieval Deep prompt tuning (DPT) has gained great success in most natural language processing~(NLP) tasks. However, it is not well-investigated in dense retrieval where fine-tuning~(FT) still dominates. When deploying multiple retrieval tasks using the same backbone model~(e.g., RoBERTa), FT-based methods are unfriendly in terms of deployment cost: e.. 2022. 8. 28.
[2022-08-27] 오늘의 자연어처리 Universality and diversity in word patterns Words are fundamental linguistic units that connect thoughts and things through meaning. However, words do not appear independently in a text sequence. The existence of syntactic rules induce correlations among neighboring words. Further, words are not evenly distributed but approximately follow a power law since terms with a pure semantic content appe.. 2022. 8. 27.
반응형