[2022-10-19] Today's Natural Language Processing KPI-EDGAR: A Novel Dataset and Accompanying Metric for Relation Extraction from Financial Documents We introduce KPI-EDGAR, a novel dataset for Joint Named Entity Recognition and Relation Extraction building on financial reports uploaded to the Electronic Data Gathering, Analysis, and Retrieval (EDGAR) system, where the main objective is to extract Key Performance Indicators (KPIs) from financia.. 2022. 10. 19.
[2022-10-18] Today's Natural Language Processing HashFormers: Towards Vocabulary-independent Pre-trained Transformers Transformer-based pre-trained language models are vocabulary-dependent, mapping by default each token to its corresponding embedding. This one-to-one mapping results in embedding matrices that occupy a lot of memory (i.e. millions of parameters) and grow linearly with the size of the vocabulary. Previous work on on-device tra.. 2022. 10. 18.
[2022-10-15] Today's Natural Language Processing CROP: Zero-shot Cross-lingual Named Entity Recognition with Multilingual Labeled Sequence Translation Named entity recognition (NER) suffers from the scarcity of annotated training data, especially for low-resource languages without labeled data. Cross-lingual NER has been proposed to alleviate this issue by transferring knowledge from high-resource languages to low-resource languages via aligne.. 2022. 10. 15.
[2022-10-14] Today's Natural Language Processing GMP*: Well-Tuned Global Magnitude Pruning Can Outperform Most BERT-Pruning Methods We revisit the performance of the classic gradual magnitude pruning (GMP) baseline for large language models, focusing on the classic BERT benchmark on various popular tasks. Despite existing evidence in the literature that GMP performs poorly, we show that a simple and general variant, which we call GMP*, can mat.. 2022. 10. 14.