[Paper Review/NLP/IR/NLG] Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks

Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks

This paper was presented at NeurIPS 2020.

It was written by Patrick Lewis, Ethan Perez, and colleagues at Facebook AI Research, University College London, and New York University.

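The paper proposes a model that combines retrieval and generation to solve knowledge-intensive tasks.

It presents a new framework that retrieves documents relevant to the input from an external knowledge base and uses them to carry out the actual generation task.

It combines a pre-trained neural retriever (BERT-based), a contextual embedding model whose document embeddings are kept fixed, with a pre-trained seq2seq transformer (BART), and proposes RAG, a model that performs better on specific downstream tasks.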

The RAG (Retrieval-Augmented Generation) model is

a pre-trained generation model with parametric memory that is augmented with non-parametric memory.

(The parametric memory is a pre-trained seq2seq transformer, and

the non-parametric memory is a dense vector index of Wikipedia, accessed with a pre-trained neural retriever.)

As shown in the paper's overview figure, the model consists of two components.

(i) a retriever p_η(z|x) with parameters η that returns (top-K truncated) distributions over text passages given a query x

(ii) a generator p_θ(y_i | x, z, y_{1:i−1}) parametrized by θ that generates a current token based on a context of the previous i − 1 tokens y_{1:i−1}, the original input x and a retrieved passage z.

Let's look at the retriever and the generator in turn.

Retriever

The retriever uses DPR, a BERT-based model.

The query encoder and the document encoder are separate pre-trained BERT models, and each one produces an embedding.

The input query (the thing we want to search for) is x, and a document is z.

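To find the documents z with the highest p(z|x), maximum inner product search (MIPS) is used.

MIPS finds the k documents z whose embeddings have the largest inner product with the query embedding, which gives the documents most similar to the input.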
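The search step described above can be sketched in a few lines of NumPy. This is a minimal brute-force illustration, not the paper's implementation: the function names `top_k_mips` and `truncated_retrieval_distribution` and the toy 2-dimensional embeddings are assumptions made up for this example. In the paper the retrieval probability is proportional to exp(d(z)ᵀq(x)), so normalizing the top-k scores with a softmax gives a top-K truncated distribution over passages. A real system over all of Wikipedia would use an approximate index rather than scoring every document.

```python
import numpy as np

def top_k_mips(query_vec, doc_matrix, k):
    """Exact maximum inner product search by brute force.

    Returns the indices of the k documents with the largest inner product
    with the query (highest score first) together with those scores.
    """
    scores = doc_matrix @ query_vec      # one inner product per document
    top = np.argsort(-scores)[:k]        # indices of the k largest scores
    return top, scores[top]

def truncated_retrieval_distribution(top_scores):
    """Softmax over the top-k scores only: a top-K truncated p(z|x)."""
    shifted = top_scores - top_scores.max()   # subtract max for numerical stability
    weights = np.exp(shifted)
    return weights / weights.sum()

# Toy stand-ins for d(z) (one row per document) and q(x); values are made up.
doc_embeddings = np.array([
    [1.0, 0.0],
    [0.0, 3.0],
    [2.0, 2.0],
    [-1.0, 0.5],
])
query_embedding = np.array([1.0, 1.0])

top_ids, top_scores = top_k_mips(query_embedding, doc_embeddings, k=2)
probs = truncated_retrieval_distribution(top_scores)
print(top_ids, top_scores, probs)
```

With these toy values the inner products are 1, 3, 4 and -0.5, so documents 2 and 1 are returned, in that order.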

During training, the document encoder d(z) is kept fixed and only the query encoder q(x) is trained.
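The practical effect of freezing d(z) is that the document index can be computed once and reused, while gradient updates only touch the query side. The sketch below illustrates that idea with NumPy under strong simplifying assumptions: the query encoder is a hypothetical single linear map `W`, and the loss is a toy cross-entropy against a hand-picked `target_doc`. In RAG itself the training signal comes from the generation loss rather than from a labeled document, so this is only a stand-in for the update pattern.

```python
import numpy as np

def softmax(v):
    e = np.exp(v - v.max())   # subtract max for numerical stability
    return e / e.sum()

# Frozen document index: computed once by the document encoder, never updated.
doc_index = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
doc_index_before = doc_index.copy()

# Hypothetical trainable query encoder: a single linear map q(x) = W @ features.
W = np.eye(2)
features = np.array([1.0, 0.5])
target_doc = 1   # toy supervision signal, an assumption of this sketch

def loss_and_grad(W):
    q = W @ features
    p = softmax(doc_index @ q)                # p(z|x) over the whole toy index
    loss = -np.log(p[target_doc])
    one_hot = np.eye(len(doc_index))[target_doc]
    grad_q = doc_index.T @ (p - one_hot)      # d loss / d q
    return loss, np.outer(grad_q, features)   # d loss / d W

loss_before, grad_W = loss_and_grad(W)
W = W - 0.1 * grad_W                          # only the query encoder moves
loss_after, _ = loss_and_grad(W)
print(loss_before, loss_after)
```

Because `doc_index` never changes, nothing has to be re-embedded or re-indexed after the update; only the query vector is recomputed.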

The retrieved z is concatenated with x, and the result is passed to the generator as its input.

Generator

The generator uses the BART model.

The input x and the retrieved passage z are concatenated and fed into BART, the generator.
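As a rough illustration of the concatenation step, the helper below builds one generator input per retrieved passage. The function name `build_generator_inputs`, the `" // "` separator, and the passage-before-query order are all assumptions chosen for this sketch; the post only says that x and z are concatenated.

```python
def build_generator_inputs(query, passages, separator=" // "):
    """Return one generator input string per retrieved passage.

    The separator and the ordering (passage first, then query) are
    hypothetical choices for illustration.
    """
    return [f"{passage}{separator}{query}" for passage in passages]

# Made-up query and passages, standing in for x and the top-k retrieved z.
query = "Who wrote Hamlet?"
passages = [
    "Hamlet is a tragedy by William Shakespeare.",
    "The Globe Theatre opened in 1599.",
]

generator_inputs = build_generator_inputs(query, passages)
for text in generator_inputs:
    print(text)
```

Each string would then be tokenized and encoded separately, so the generator sees the query together with exactly one passage at a time.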

The previous target tokens y_{1:i−1} are fed in as well, and the model predicts the next token y_i.
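The token-by-token prediction can be sketched with a toy generator. Everything here is hypothetical: `TOY_TABLE` is a hand-written lookup that stands in for p_θ(y_i | x, z, y_{1:i−1}), and greedy decoding is used only for brevity, without claiming it is the decoding procedure used in the paper. The point is that each step conditions on the tokens produced so far, and that the sequence probability is the product of the per-token probabilities.

```python
import math

# Hypothetical toy generator: maps the tokens generated so far to a
# distribution over the next token. Probabilities are made up.
TOY_TABLE = {
    (): {"Paris": 0.9, "London": 0.1},
    ("Paris",): {"</s>": 0.8, "Paris": 0.2},
    ("London",): {"</s>": 1.0},
}

def next_token_probs(context, prefix):
    """Stand-in for the generator's next-token distribution.

    A real generator would also condition on the context (x and z);
    this toy table ignores it and looks only at the prefix.
    """
    return TOY_TABLE[tuple(prefix)]

def greedy_decode(context, max_len=5, eos="</s>"):
    """Pick the most probable next token until eos or max_len is reached."""
    prefix, log_prob = [], 0.0
    for _ in range(max_len):
        probs = next_token_probs(context, prefix)
        token = max(probs, key=probs.get)
        log_prob += math.log(probs[token])   # chain rule: sum of log-probabilities
        prefix.append(token)
        if token == eos:
            break
    return prefix, log_prob

tokens, log_prob = greedy_decode("passage // question")
print(tokens, math.exp(log_prob))
```

Here the decoded sequence is "Paris" followed by the end-of-sequence token, with probability 0.9 × 0.8 = 0.72.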

Summary

RAG is a seq2seq model that combines a retriever and a generator to achieve better performance on specific tasks.
