
Papers (572)

[2023-05-24] Today's NLP | GQA: Training Generalized Multi-Query Transformer Models from Multi-Head Checkpoints. Multi-query attention (MQA), which uses only a single key-value head, drastically speeds up decoder inference. However, MQA can lead to quality degradation, and moreover it may not be desirable to train a separate model just for faster inference. We (1) propose a recipe for uptraining existing multi-head languag.. 2023. 5. 24.
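The excerpt above describes MQA and grouped-query attention only at a high level. As a rough illustration, here is a minimal PyTorch sketch of grouped-query attention in which each group of query heads shares one key-value head; setting the number of groups to 1 recovers MQA, and setting it equal to the number of query heads recovers standard multi-head attention. The function name, tensor shapes, and scaling are assumptions for illustration, not the paper's reference implementation.

```python
import torch

def grouped_query_attention(q, k, v, num_groups):
    """Grouped-query attention sketch (hypothetical helper, not the paper's code).

    q:    (batch, num_q_heads, seq, head_dim)
    k, v: (batch, num_groups,  seq, head_dim)  -- one KV head per group
    """
    b, num_q_heads, seq, head_dim = q.shape

    # Each KV head is shared by num_q_heads // num_groups query heads.
    repeat = num_q_heads // num_groups
    k = k.repeat_interleave(repeat, dim=1)  # -> (b, num_q_heads, seq, head_dim)
    v = v.repeat_interleave(repeat, dim=1)

    # Scaled dot-product attention per query head.
    scores = torch.einsum("bhqd,bhkd->bhqk", q, k) / head_dim ** 0.5
    weights = scores.softmax(dim=-1)
    return torch.einsum("bhqk,bhkd->bhqd", weights, v)

# Usage: num_groups=1 is multi-query attention (a single shared KV head),
# num_groups=8 with 8 query heads is ordinary multi-head attention.
q = torch.randn(2, 8, 16, 64)
k = v = torch.randn(2, 1, 16, 64)
out = grouped_query_attention(q, k, v, num_groups=1)  # (2, 8, 16, 64)
```

The memory saving at inference time comes from caching only `num_groups` key-value heads instead of one per query head, which is what makes decoding faster in the MQA/GQA setting.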
[2023-05-23] Today's NLP | HELMA: A Large-Scale Hallucination Evaluation Benchmark for Large Language Models. Large language models (LLMs), such as ChatGPT, are prone to generating hallucinations, i.e., content that conflicts with the source or cannot be verified against factual knowledge. To understand what types of content and to what extent LLMs are apt to hallucinate, we introduce the Hallucination Evaluation for Large Lan.. 2023. 5. 23.
[2023-05-22] Today's NLP | Silver Syntax Pre-training for Cross-Domain Relation Extraction. Relation Extraction (RE) remains a challenging task, especially under realistic out-of-domain evaluations. One of the main reasons for this is the limited training size of current RE datasets: obtaining high-quality (manually annotated) data is extremely expensive and cannot realistically be repeated for each new domain. .. 2023. 5. 22.
[2023-05-21] Today's NLP | Generalized Multiple Intent Conditioned Slot Filling. Natural language understanding includes the tasks of intent detection (identifying a user's objectives) and slot filling (extracting the entities relevant to those objectives). Prior slot filling methods assume that each intent type cannot occur more than once within a message; however, this is often not a valid assumption for real-world settin.. 2023. 5. 21.
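A concrete case makes the single-occurrence assumption easier to see. The toy example below (the message text, intent name, and slot names are all hypothetical, and the output structure is only an assumed illustration of multi-intent conditioned slot filling) shows one message containing two instances of the same intent type, which prior single-occurrence methods cannot represent.

```python
# One user message, two instances of the same intent type.
message = "Book a flight to Paris and book a flight to Rome"

# A generalized multi-intent formulation pairs each intent instance
# with its own slot values (hypothetical schema for illustration).
prediction = [
    {"intent": "book_flight", "slots": {"destination": "Paris"}},
    {"intent": "book_flight", "slots": {"destination": "Rome"}},
]
```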