[2022-09-26] Today's Natural Language Processing

Learning to Write with Coherence From Negative Examples

Coherence is one of the critical factors that determine the quality of writing. We propose a writing relevance (WR) training method for neural encoder-decoder natural language generation (NLG) models that improves the coherence of the continuation by leveraging negative examples. The WR loss regresses the vector representation of the context and generated sentence toward the positive continuation by contrasting it with the negatives. We compare our approach with Unlikelihood (UL) training on a text continuation task over commonsense natural language inference (NLI) corpora to show which method better models coherence by avoiding unlikely continuations. The preference for our approach in human evaluation shows the efficacy of our method in improving coherence.

In short: coherence is a key determinant of writing quality. The paper proposes writing relevance (WR) training for neural encoder-decoder NLG models, which uses negative examples to make continuations more coherent. The WR loss pulls the vector representation of the context and generated sentence toward the positive continuation while contrasting it against the negatives. The method is compared with Unlikelihood (UL) training on text continuation over commonsense NLI corpora, and human evaluators preferred its outputs.
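The abstract does not give the exact form of the WR loss, so the following is only a minimal sketch of the general idea of contrasting a positive continuation against negatives. It assumes a margin-based loss over cosine similarity; the function names, the margin value, and the toy vectors are all hypothetical and not taken from the paper.

```python
import math

def cosine(u, v):
    """Cosine similarity between two non-zero vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def contrastive_margin_loss(anchor, positive, negatives, margin=0.5):
    """Assumed stand-in for a contrastive objective (not the paper's WR loss).

    anchor:    vector for the context plus generated sentence
    positive:  vector for the reference (coherent) continuation
    negatives: vectors for incoherent continuations
    The loss is zero once the positive is more similar to the anchor
    than every negative by at least `margin`.
    """
    pos_sim = cosine(anchor, positive)
    per_negative = [
        max(0.0, margin - pos_sim + cosine(anchor, neg)) for neg in negatives
    ]
    return sum(per_negative) / len(per_negative)

# Toy usage with hand-made 2-d vectors.
anchor = [1.0, 0.0]
good_loss = contrastive_margin_loss(anchor, [1.0, 0.0], [[0.0, 1.0]])
bad_loss = contrastive_margin_loss(anchor, [0.0, 1.0], [[1.0, 0.0]])
print(good_loss, bad_loss)
```

In a real model the vectors would come from the encoder-decoder's hidden states, and this term would be added to the usual likelihood objective.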

Semantically Consistent Data Augmentation for Neural Machine Translation via Conditional Masked Language Model

This paper introduces a new data augmentation method for neural machine translation that can enforce stronger semantic consistency both within and across languages. Our method is based on a Conditional Masked Language Model (CMLM), which is bidirectional and can be conditioned on both left and right context, as well as the label. We demonstrate that CMLM is a good technique for generating context-dependent word distributions. In particular, we show that CMLM is capable of enforcing semantic consistency by conditioning on both source and target during substitution. In addition, to enhance diversity, we incorporate the idea of soft word substitution for data augmentation, which replaces a word with a probability distribution over the vocabulary. Experiments on four translation datasets of different scales show that the overall solution results in more realistic data augmentation and better translation quality. Our approach consistently achieves the best performance in comparison with strong recent work and yields improvements of up to 1.90 BLEU points over the baseline.

In short: the paper introduces a data augmentation method for neural machine translation that enforces stronger semantic consistency within and across languages. It builds on a Conditional Masked Language Model (CMLM), which is bidirectional and conditions on left and right context as well as the label, so substitutions are conditioned on both source and target. To increase diversity, it uses soft word substitution, replacing a word with a probability distribution over the vocabulary. On four translation datasets of different scales, the approach gives more realistic augmented data and better translation quality, improving on the baseline by up to 1.90 BLEU points.
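Soft word substitution, as described in the abstract, replaces a word with a distribution over the vocabulary. A common way to realize this, assumed here and not confirmed by the abstract, is to feed the model the probability-weighted average of the vocabulary embeddings instead of a single word's embedding. In the sketch below the logits stand in for CMLM predictions at a masked position; the tiny embedding table and all names are hypothetical.

```python
import math

def softmax(logits):
    """Turn raw scores into a probability distribution."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def soft_embedding(probs, embedding_table):
    """Expected embedding under `probs`, one probability per vocabulary row."""
    dim = len(embedding_table[0])
    return [
        sum(p * row[d] for p, row in zip(probs, embedding_table))
        for d in range(dim)
    ]

# Toy vocabulary of three words with 2-d embeddings (made-up values).
embedding_table = [
    [1.0, 0.0],  # word 0
    [0.0, 1.0],  # word 1
    [5.0, 5.0],  # word 2
]
# Hypothetical language-model scores for the masked position.
probs = softmax([2.0, 2.0, -1.0])
print(soft_embedding(probs, embedding_table))
```

A hard substitution would pick one row of the table; the soft version blends rows, which is what lets a single augmented example cover several plausible replacements.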

Improving Attention-Based Interpretability of Text Classification Transformers

Transformers are widely used in NLP, where they consistently achieve state-of-the-art performance. This is due to their attention-based architecture, which allows them to model rich linguistic relations between words. However, transformers are difficult to interpret. Being able to provide reasoning for its decisions is an important property for a model in domains where human lives are affected, such as hate speech detection and biomedicine. With transformers finding wide use in these fields, the need for interpretability techniques tailored to them arises. The effectiveness of attention-based interpretability techniques for transformers in text classification is studied in this work. Despite concerns about attention-based interpretations in the literature, we show that, with proper setup, attention may be used in such tasks with results comparable to state-of-the-art techniques, while also being faster and friendlier to the environment. We validate our claims with a series of experiments that employ a new feature importance metric.

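In short: transformers are widely used in NLP and consistently reach state-of-the-art performance thanks to their attention-based architecture, which models rich linguistic relations between words. They are hard to interpret, however, and a model's ability to justify its decisions matters in domains that affect human lives, such as hate speech detection and biomedicine. This work studies attention-based interpretability techniques for transformers in text classification. It shows that, with proper setup, attention gives results comparable to state-of-the-art techniques while being faster and more environmentally friendly, and it validates these claims in experiments that use a new feature importance metric.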
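The abstract does not say how attention is turned into an explanation, so the following is only a rough sketch of one simple, assumed recipe: average, over heads, the attention that a classification token pays to each input token, and treat that as a per-token importance score. The attention values, the token list, and the function names are invented for illustration and are not the paper's setup or metric.

```python
def token_importance(attention, cls_index=0):
    """Average over heads the attention paid by the query at `cls_index`.

    attention: nested list indexed as [head][query position][key position]
    Returns one score per key position (input token).
    """
    num_heads = len(attention)
    seq_len = len(attention[0][cls_index])
    return [
        sum(head[cls_index][k] for head in attention) / num_heads
        for k in range(seq_len)
    ]

def top_tokens(tokens, scores, k=2):
    """Return the k tokens with the highest importance scores."""
    ranked = sorted(zip(tokens, scores), key=lambda pair: pair[1], reverse=True)
    return [token for token, _ in ranked[:k]]

# Made-up attention weights for 2 heads over 3 tokens (each row sums to 1).
tokens = ["[CLS]", "awful", "movie"]
attention = [
    [[0.1, 0.7, 0.2], [0.3, 0.4, 0.3], [0.2, 0.5, 0.3]],  # head 0
    [[0.3, 0.5, 0.2], [0.4, 0.3, 0.3], [0.3, 0.3, 0.4]],  # head 1
]
scores = token_importance(attention)
print(top_tokens(tokens, scores, k=1))
```

With a real model, the weights would be read from the transformer's attention layers, and the choice of layer, head aggregation, and query token is the kind of setup the abstract says has to be chosen carefully.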
