오늘의 자연어처리 (Today's NLP), 572 posts

[2024-01-03] 오늘의 자연어처리
SDIF-DA: A Shallow-to-Deep Interaction Framework with Data Augmentation for Multi-modal Intent Detection
Abstract: Multi-modal intent detection aims to utilize various modalities to understand the user's intentions, which is essential for the deployment of dialogue systems in real-world scenarios. The two core challenges for multi-modal intent detection are (1) how to effectively align and fuse d..

[2024-01-02] 오늘의 자연어처리
Principled Gradient-based Markov Chain Monte Carlo for Text Generation
Abstract: Recent papers have demonstrated the possibility of energy-based text generation by adapting gradient-based sampling algorithms, a paradigm of MCMC algorithms that promises fast convergence. However, as we show in this paper, previous attempts on this approach to text generation all fail to sample correctly from the t..

[2024-01-01] 오늘의 자연어처리
OmniDialog: An Omnipotent Pre-training Model for Task-Oriented Dialogue System
Abstract: Pre-trained conversation models (PCMs) have demonstrated remarkable results in task-oriented dialogue (TOD) systems. Many PCMs focus predominantly on dialogue management tasks like dialogue state tracking, dialogue generation tasks like response generation, or both. However, the existing PCMs seldom consider ..

[2023-12-31] 오늘의 자연어처리
Spike No More: Stabilizing the Pre-training of Large Language Models
Abstract: The loss spike often occurs during pre-training of a large language model. The spikes degrade the performance of a large language model, and sometimes ruin the pre-training. Since the pre-training needs a vast computational budget, we should avoid such spikes. To investigate a cause of loss spikes, we focus on gradient..
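The 2024-01-03 entry is about aligning and fusing modalities for intent detection. Below is a generic sketch of cross-attention fusion between a text and a video stream; the class name CrossModalFusion, the feature dimensions, and the pooling choice are illustrative assumptions, not the SDIF-DA architecture described in the paper.

```python
# Generic cross-modal fusion sketch (assumed layout, not the SDIF-DA model):
# project both modalities into a shared space, let text tokens attend over
# video frames, then classify the pooled representation into intents.
import torch
import torch.nn as nn

class CrossModalFusion(nn.Module):
    def __init__(self, text_dim=768, video_dim=512, hidden=256, num_classes=20):
        super().__init__()
        # Project both modalities into a shared space so they can be aligned.
        self.text_proj = nn.Linear(text_dim, hidden)
        self.video_proj = nn.Linear(video_dim, hidden)
        # Queries come from text, keys/values from video.
        self.cross_attn = nn.MultiheadAttention(hidden, num_heads=4, batch_first=True)
        self.classifier = nn.Linear(hidden, num_classes)

    def forward(self, text_feats, video_feats):
        q = self.text_proj(text_feats)          # (B, T_text, hidden)
        kv = self.video_proj(video_feats)       # (B, T_video, hidden)
        fused, _ = self.cross_attn(q, kv, kv)   # text enriched with video context
        pooled = fused.mean(dim=1)              # simple mean pooling over tokens
        return self.classifier(pooled)          # intent logits

if __name__ == "__main__":
    model = CrossModalFusion()
    text = torch.randn(2, 24, 768)    # e.g. token embeddings from a text encoder
    video = torch.randn(2, 8, 512)    # e.g. frame features from a video encoder
    print(model(text, video).shape)   # torch.Size([2, 20])
```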
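The 2024-01-02 entry concerns gradient-based MCMC and whether such samplers draw from the target distribution correctly. The sketch below illustrates the general paradigm on a toy 2-D energy using MALA (Langevin proposals plus a Metropolis correction); it is not the paper's algorithm, and a real energy-based text generator would define the energy from a language model over a continuous relaxation of token sequences.

```python
# Minimal MALA sampler on a toy energy, as an illustration of gradient-based
# MCMC. The Metropolis accept/reject step is what keeps the chain unbiased;
# dropping it gives plain unadjusted Langevin, which samples only approximately.
import numpy as np

def energy(x):
    # Toy energy: E(x) = 0.5 * (x1^2 + 10 * x2^2), i.e. a Gaussian with
    # variances (1, 0.1) as the target distribution.
    return 0.5 * (x[0] ** 2 + 10.0 * x[1] ** 2)

def grad_energy(x):
    # Analytic gradient of the toy energy above.
    return np.array([x[0], 10.0 * x[1]])

def mala(n_steps=5000, step=0.05, seed=0):
    rng = np.random.default_rng(seed)
    x = np.zeros(2)
    samples, accepted = [], 0

    def log_q(a, b):
        # Log density (up to a constant) of proposing a from b.
        mean = b - 0.5 * step * grad_energy(b)
        return -np.sum((a - mean) ** 2) / (2.0 * step)

    for _ in range(n_steps):
        # Langevin proposal: a gradient step plus Gaussian noise.
        prop = x - 0.5 * step * grad_energy(x) + np.sqrt(step) * rng.normal(size=2)
        # Metropolis correction for exact sampling from exp(-E).
        log_alpha = (-energy(prop) + energy(x)) + log_q(x, prop) - log_q(prop, x)
        if np.log(rng.uniform()) < log_alpha:
            x, accepted = prop, accepted + 1
        samples.append(x.copy())
    return np.array(samples), accepted / n_steps

if __name__ == "__main__":
    samples, acc_rate = mala()
    print("acceptance rate:", round(acc_rate, 3))
    print("sample mean:", samples[2000:].mean(axis=0))  # should be near [0, 0]
    print("sample var :", samples[2000:].var(axis=0))   # should be near [1, 0.1]
```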
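The 2023-12-31 entry studies loss spikes during LLM pre-training via gradients. As background, the following is a hedged sketch of one common generic heuristic for coping with spikes, namely clipping the global gradient norm and skipping update steps whose loss blows far past a running average; the thresholds (spike_factor, max_grad_norm) are arbitrary assumptions, and this is not the remedy proposed in the paper.

```python
# Generic spike-mitigation heuristic (not the paper's method): track an EMA of
# the loss, skip suspiciously large steps, and clip the gradient norm.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(16, 64), nn.ReLU(), nn.Linear(64, 1))
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

running_loss, beta = None, 0.98          # EMA of the loss, used as a spike detector
max_grad_norm, spike_factor = 1.0, 10.0  # assumed thresholds

for step in range(200):
    x = torch.randn(32, 16)
    y = x.sum(dim=1, keepdim=True)       # synthetic regression target
    loss = loss_fn(model(x), y)

    # Skip the update if the loss is far above its recent average (a "spike").
    if running_loss is not None and loss.item() > spike_factor * running_loss:
        print(f"step {step}: skipped, loss {loss.item():.3f} looks like a spike")
        continue

    optimizer.zero_grad(set_to_none=True)
    loss.backward()
    # Clip the global gradient norm; clip_grad_norm_ returns the pre-clip norm.
    grad_norm = torch.nn.utils.clip_grad_norm_(model.parameters(), max_grad_norm)
    optimizer.step()

    running_loss = loss.item() if running_loss is None else (
        beta * running_loss + (1 - beta) * loss.item())
    if step % 50 == 0:
        print(f"step {step}: loss={loss.item():.3f} grad_norm={grad_norm.item():.3f}")
```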