[2023-08-27] Today's NLP

Code Llama: Open Foundation Models for Code

We release Code Llama, a family of large language models for code based on Llama 2 providing state-of-the-art performance among open models, infilling capabilities, support for large input contexts, and zero-shot instruction following ability for programming tasks. We provide multiple flavors to cover a wide range of applications: foundation models (Code Llama), Python specializations (Code Llama - Python), and instruction-following models (Code Llama - Instruct) with 7B, 13B and 34B parameters each. All models are trained on sequences of 16k tokens and show improvements on inputs with up to 100k tokens. 7B and 13B Code Llama and Code Llama - Instruct variants support infilling based on surrounding content. Code Llama reaches state-of-the-art performance among open models on several code benchmarks, with scores of up to 53% and 55% on HumanEval and MBPP, respectively. Notably, Code Llama - Python 7B outperforms Llama 2 70B on HumanEval and MBPP, and all our models outperform every other publicly available model on MultiPL-E. We release Code Llama under a permissive license that allows for both research and commercial use.

Probabilistic Method of Measuring Linguistic Productivity

In this paper I propose a new way of measuring linguistic productivity that objectively assesses the ability of an affix to be used to coin new complex words and, unlike other popular measures, is not directly dependent upon token frequency. Specifically, I suggest that linguistic productivity may be viewed as the probability that an affix combines with a random base. The advantages of this approach include the following. First, token frequency does not dominate the productivity measure but naturally influences the sampling of bases. Second, we are not just counting attested word types with an affix but rather simulating the construction of these types and then checking whether they are attested in the corpus. Third, a corpus-based approach and randomised design ensure that true neologisms and words coined long ago have equal chances of being selected. The proposed algorithm is evaluated on both English and Russian data. The obtained results provide some valuable insights into the relation of linguistic productivity to the number of types and tokens. It appears that burgeoning linguistic productivity manifests itself in an increasing number of types. However, this process unfolds in two stages: first comes the increase in high-frequency items, and only then follows the increase in low-frequency items.
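
The idea of productivity as "the probability that an affix combines with a random base" can be illustrated with a small Monte Carlo sketch. This is not the paper's algorithm, only a minimal reading of the abstract under loudly stated assumptions: the function name `estimate_productivity` is hypothetical, every corpus word counts as a candidate base, bases are drawn in proportion to their token frequency, and the affix is a suffix attached by plain string concatenation (no spelling or morphophonological adjustments).

```python
import random
from collections import Counter


def estimate_productivity(tokens, suffix, n_samples=1000, seed=0):
    """Estimate how often a randomly sampled base plus `suffix` is attested.

    Assumptions (not from the paper): bases are sampled in proportion to
    their token frequency, and the derived form is base + suffix.
    """
    rng = random.Random(seed)
    counts = Counter(tokens)
    vocabulary = set(counts)

    # Sort the bases so the result is reproducible for a given seed.
    bases = sorted(counts)
    weights = [counts[base] for base in bases]

    # Token frequency only affects which bases get sampled;
    # it is not counted in the score itself.
    sampled = rng.choices(bases, weights=weights, k=n_samples)

    # Simulate coining the word, then check whether the corpus attests it.
    hits = sum(1 for base in sampled if base + suffix in vocabulary)
    return hits / n_samples


# Toy corpus, invented purely for illustration.
toy_tokens = (
    ["kind"] * 3 + ["kindness"] + ["dark"] * 2 + ["darkness"] + ["table"] * 4
)
print(estimate_productivity(toy_tokens, "ness", n_samples=20000))
```

In the toy corpus, only "kind" and "dark" yield attested forms, so the estimate should settle near their share of the tokens. A real evaluation would need a large corpus and a proper morphological model for attaching the affix.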

Mind vs. Mouth: On Measuring Re-judge Inconsistency of Social Bias in Large Language Models

Recent research indicates that pre-trained large language models (LLMs) possess cognitive constructs similar to those observed in humans, prompting researchers to investigate the cognitive aspects of LLMs. This paper focuses on explicit and implicit social bias, a distinctive two-level cognitive construct in psychology. It posits that individuals' explicit social bias, which is their conscious expression of bias in statements, may differ from their implicit social bias, which represents their unconscious bias. We propose a two-stage approach and discover a parallel phenomenon in LLMs known as "re-judge inconsistency" in social bias. In the initial stage, the LLM is tasked with automatically completing statements, potentially incorporating implicit social bias. However, in the subsequent stage, the same LLM re-judges the biased statement it generated itself and contradicts it. We propose that this re-judge inconsistency may be similar to the inconsistency between humans' implicit social bias, of which they are unaware, and their explicit social bias, of which they are aware. Experimental investigations on ChatGPT and GPT-4 concerning common gender biases examined in psychology corroborate the highly stable nature of the re-judge inconsistency. This finding may suggest that diverse cognitive constructs emerge as LLMs' capabilities strengthen. Consequently, leveraging psychological theories can provide enhanced insights into the underlying mechanisms governing the expressions of explicit and implicit constructs in LLMs.
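
The two-stage procedure described above can be sketched as a small measurement loop. This is only an illustration of the idea, not the authors' protocol: the function `rejudge_inconsistency_rate`, the prompt templates, the yes/no parsing rule, and the stub `toy_model` are all hypothetical. The model is passed in as a plain callable from prompt text to response text, so no real LLM API is assumed.

```python
from typing import Callable, Iterable


def rejudge_inconsistency_rate(
    model: Callable[[str], str], stems: Iterable[str]
) -> float:
    """Fraction of self-completed statements the same model then rejects."""
    contradicted = 0
    total = 0
    for stem in stems:
        # Stage 1: the model completes an open-ended statement.
        completion = model(f"Complete the statement: {stem}")
        statement = f"{stem} {completion.strip()}"

        # Stage 2: the same model re-judges its own statement.
        verdict = model(
            "Do you agree with this statement? Answer yes or no.\n" + statement
        )
        total += 1
        # Hypothetical parsing rule: a reply starting with "no" is a contradiction.
        if verdict.strip().lower().startswith("no"):
            contradicted += 1
    return contradicted / total if total else 0.0


def toy_model(prompt: str) -> str:
    """Stub that always completes the stem and then disagrees with itself."""
    if prompt.startswith("Complete the statement:"):
        return "[some completion]"
    return "No, I disagree."


print(rejudge_inconsistency_rate(toy_model, ["Stem A ...", "Stem B ..."]))
```

Swapping `toy_model` for a wrapper around an actual chat model would be the obvious next step, with the caveat that free-text verdicts usually need more careful parsing than a prefix check.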
