Recurrent neural network

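1. Overview

A recurrent neural network (RNN) is a class of artificial neural network designed to process temporal sequence data; the architecture emerged in the 1980s. Through recurrent connections that feed the output of one step back in as input to the next, an RNN retains information and maintains context within a sequence. Hopfield networks, LSTM, and GRU are representative examples. LSTM mitigates the vanishing gradient problem and has driven marked performance gains in speech recognition, machine translation, language modeling, and other fields. Bidirectional RNNs and attention mechanisms improved performance further, and the encoder-decoder structure is used for sequence transduction problems such as machine translation. RNNs are trained with methods such as gradient descent, teacher forcing, and CTC, and tasks such as the copying task are used to measure a model's memory capacity. Deep learning libraries such as PyTorch, TensorFlow, and Keras support RNN models.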


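Recurrent neural network
Basic information
Type	Artificial neural network
Structure	Nodes with feedback connections
Key features	Temporal information processing; variable-length input handling; sequential data modeling
History
First proposed	1980s
Developments	Long short-term memory (LSTM); gated recurrent unit (GRU)
How it works
Mechanism	Uses the output of the previous step as input to the current step; stores and processes temporal information in a hidden state; well suited to sequential data
Types
Main types	Simple recurrent network (Simple RNN); long short-term memory (LSTM); gated recurrent unit (GRU)
Applications
Main applications	Natural language processing (machine translation, text generation); speech recognition; time series forecasting; video analysis; handwriting recognition
Advantages
Advantages	Well suited to modeling temporal dependencies; handles variable-length input; learns patterns in sequential data
Disadvantages
Disadvantages	Can be hard to train (vanishing/exploding gradients); struggles with long sequences (long-term dependency problem); hard to parallelize
Key research
Related research	Sequence-to-sequence models; attention mechanism; Transformer models
Additional information
Related fields	Deep learning, artificial neural networks, machine learning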

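2. History

The history of recurrent neural networks (RNNs) begins with the early work of David Rumelhart and John Hopfield in the 1980s and continues through a 1993 system that solved a "Very Deep Learning" task requiring more than 1,000 layers in an RNN unfolded in time. After LSTM was invented in 1997, RNNs achieved breakthroughs in speech recognition, handwriting recognition, machine translation, language modeling, automatic image captioning, and other fields. From the late 2000s in particular, LSTM outperformed earlier models in speech recognition and was deployed by Baidu, in Google's Android, and elsewhere. The bidirectional RNN (BRNN) improved performance further by processing the input in both directions.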

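2. 1. Early research

RNNs build on David Rumelhart's work of 1986.^[253] Earlier, in 1982, John Hopfield had proposed the Hopfield network, laying early groundwork for research on recurrent networks.^[254] In 1993, a neural history compressor system demonstrated the potential of recurrent networks:^[254] it solved a "Very Deep Learning" task that required more than 1,000 subsequent layers in an RNN unfolded in time.^[34]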

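2. 2. The emergence and development of LSTM

In 1997, Hochreiter and Schmidhuber invented long short-term memory (LSTM), which addressed shortcomings of earlier recurrent networks and made it possible to learn long-term dependencies.^[255] LSTM has shown excellent accuracy in many application areas.^[35]^[36]

Around 2007, LSTM began to outperform conventional models in speech recognition.^[256] In 2009, an LSTM trained with connectionist temporal classification (CTC) became the first RNN to win a handwriting recognition competition, demonstrating its strength in pattern recognition.^[257]^[258] In 2014, Baidu set a new speech recognition benchmark using only RNNs trained with CTC.^[259]

LSTM also advanced large-vocabulary speech recognition^[260]^[261] and speech synthesis,^[262] and has been used in Google's Android.^[257]^[263] In 2015, Google's speech recognition improved substantially through CTC-trained LSTM.^[264]

LSTM also set records in machine translation,^[265] language modeling,^[266] and multilingual language processing.^[267] Combined with convolutional neural networks, it brought major progress in automatic image captioning.^[268] Research is also under way on hardware accelerators that reduce the computational cost of running LSTMs.^[269]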

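2. 3. Bidirectional RNNs and the attention mechanism

A bidirectional recurrent neural network (BRNN) uses a finite sequence to learn from the context both before and after each element. It requires two RNNs: one that reads the sequence from left to right and one that reads it from right to left. Their outputs are combined and compared with the supervised target during training. The technique has been shown to work especially well in combination with LSTM.^[284]^[285]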

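3. Structure

An artificial neural network is a network of processing units, each of which transforms its inputs, typically by a weighted sum followed by an activation function. When the network contains a cycle, that is, when a unit's output can reach its own input again along some path, it is called a recurrent neural network (RNN).^[139] This contrasts with a feed-forward network (FFN), which has no cycles.

An RNN can use its internal state (memory) to process arbitrary sequences of inputs, and therefore exhibits dynamic behavior over time.^[140] This makes it applicable to tasks such as handwriting recognition^[141] and speech recognition.^[142]^[143]

The term "recurrent neural network" is used indiscriminately for two broad classes of networks with a similar general structure: finite impulse and infinite impulse. Both classes exhibit temporal dynamic behavior.^[144] A finite impulse recurrent network is a directed acyclic graph, whereas an infinite impulse recurrent network is a directed cyclic graph.

Both finite and infinite impulse recurrent networks can have additional stored state, and that storage can be under the direct control of the network. The storage can also be replaced by another network or graph if it incorporates time delays or feedback loops. Such controlled states are called gated state or gated memory, and they are part of long short-term memory (LSTM) and gated recurrent units (GRU).

RNNs come in many variants. An RNN-based model can be factored into two parts: configuration and architecture. Several RNNs can be combined into a data flow, and that data flow is the configuration. Each RNN may itself have any architecture, including LSTM, GRU, and others.

A stacked RNN, or deep RNN, is composed of several RNNs stacked one above another. Each layer operates as a standalone recurrent network, and the output sequence of each layer serves as the input sequence of the layer above.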

3. 1. Basic structure

A basic RNN is a network of neuron-like nodes organized into successive layers. Each node in a given layer has a directed (one-way) connection to every node in the next layer. Each node (neuron) has a time-varying real-valued activation, and each connection (synapse) has a modifiable real-valued weight. Nodes are either input nodes (receiving data from outside the network), output nodes (yielding results), or hidden nodes (which modify the data on its way from input to output).

For supervised learning in discrete-time settings, a sequence of real-valued input vectors arrives at the input nodes, one vector at a time. At each time step, every non-input unit computes its current activation as a nonlinear function of the weighted sum of the activations of all units connected to it. The training data may supply target activations for some output units at certain time steps. The error is defined as the sum of the deviations between the outputs the network produces and these targets, and training proceeds so as to reduce it.

In reinforcement learning settings, no teacher provides target signals. Instead, a fitness function or reward function is used to evaluate the RNN's performance; the network influences its own input stream through output units connected to actuators that affect the environment.

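3. 2. Elman networks and Jordan networks

Elman networks and Jordan networks are early recurrent network models, also known as simple recurrent networks (SRN).^[270]

An Elman network is a three-layer network with an added set of context units (''u'').^[159] The hidden layer is connected to these context units with weights fixed at one.^[159] At each time step, the input is fed forward and a learning rule is applied. The fixed back-connections store a copy of the previous values of the hidden units in the context units. This lets the network maintain a kind of state, so that it can perform tasks such as sequence prediction.^[270]

A Jordan network is similar to an Elman network, except that the context units are fed from the output layer rather than the hidden layer. The context units in a Jordan network are also called the state layer, and they have a recurrent connection to themselves.^[270]^[159]

The two networks can be written as follows.

Elman network^[160]

$$\begin{aligned}h_t &= \sigma_h(W_{h} x_t + U_{h} h_{t-1} + b_h) \\ y_t &= \sigma_y(W_{y} h_t + b_y)\end{aligned}$$

Jordan network^[161]

$$\begin{aligned}h_t &= \sigma_h(W_{h} x_t + U_{h} y_{t-1} + b_h) \\ y_t &= \sigma_y(W_{y} h_t + b_y)\end{aligned}$$

The symbols in these equations are:

$x_t$ : input vector
$h_t$ : hidden layer vector
$y_t$ : output vector
$W$ , $U$ , $b$ : parameter matrices and vectors
$\sigma_h$ , $\sigma_y$ : activation functions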
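
The two update rules can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: both $\sigma_h$ and $\sigma_y$ are taken to be tanh, vectors and matrices are ordinary lists, and the helper names (`matvec`, `add`, `elman_step`, `jordan_step`) and the toy weights are hypothetical, not taken from the cited sources.

```python
import math

def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def add(*vectors):
    """Element-wise sum of equally sized vectors."""
    return [sum(vals) for vals in zip(*vectors)]

def tanh_vec(v):
    """Apply tanh element-wise (assumed choice for sigma_h and sigma_y)."""
    return [math.tanh(x) for x in v]

def elman_step(x_t, h_prev, W_h, U_h, b_h, W_y, b_y):
    """One Elman step: the previous hidden state h_{t-1} is fed back."""
    h_t = tanh_vec(add(matvec(W_h, x_t), matvec(U_h, h_prev), b_h))
    y_t = tanh_vec(add(matvec(W_y, h_t), b_y))
    return h_t, y_t

def jordan_step(x_t, y_prev, W_h, U_h, b_h, W_y, b_y):
    """One Jordan step: the previous output y_{t-1} is fed back."""
    h_t = tanh_vec(add(matvec(W_h, x_t), matvec(U_h, y_prev), b_h))
    y_t = tanh_vec(add(matvec(W_y, h_t), b_y))
    return h_t, y_t

# Toy sizes: 1 input, 2 hidden units, 1 output (arbitrary illustrative weights).
W_h = [[0.5], [-0.3]]
b_h = [0.0, 0.0]
W_y = [[1.0, -1.0]]
b_y = [0.0]
U_elman = [[0.1, 0.0], [0.0, 0.1]]   # hidden -> hidden, 2 x 2
U_jordan = [[0.2], [0.4]]            # output -> hidden, 2 x 1

sequence = [[1.0], [0.5], [-1.0]]

h = [0.0, 0.0]
for x_t in sequence:
    h, y_elman = elman_step(x_t, h, W_h, U_elman, b_h, W_y, b_y)

y_jordan = [0.0]
for x_t in sequence:
    _, y_jordan = jordan_step(x_t, y_jordan, W_h, U_jordan, b_h, W_y, b_y)
```

The only difference between the two functions is what is carried across time steps, which is also why the shape of $U_h$ differs: hidden-by-hidden for the Elman network, hidden-by-output for the Jordan network.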

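3. 3. LSTM (Long Short-Term Memory)

Long short-term memory (LSTM) is a deep learning architecture devised to avoid the vanishing gradient problem. It is normally augmented with recurrent gates called forget gates, and its design keeps backpropagated errors from vanishing or exploding.^[273]^[274]

Where conventional RNNs struggle to learn from the distant past, LSTM can learn tasks that depend on events that occurred even millions of discrete time steps earlier. It works even when long delays separate significant events and can handle signals that mix low- and high-frequency components, which greatly improved performance.^[275] Many networks with LSTM-like structures have also been published.^[276]

Stacks of LSTM layers trained with connectionist temporal classification (CTC) are widely used in practice.^[277]^[278] CTC achieves good results in both alignment and recognition, and LSTM can learn to recognize context-sensitive languages, which earlier models based on hidden Markov models (HMM) could not.^[279]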

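3. 4. GRU (Gated Recurrent Unit)

Gated recurrent units (GRUs) are a gating mechanism for recurrent neural networks introduced in 2014.^[184]^[185] They have no output gate and therefore fewer parameters than long short-term memory (LSTM).^[187] Their performance on polyphonic music modeling and speech signal modeling has been found to be similar to that of LSTM.^[187]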
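
The parameter saving can be made concrete with a little arithmetic. The sketch below assumes the common textbook formulations, in which every gate or candidate block has one input weight matrix, one recurrent weight matrix, and one bias vector; LSTM then has four such blocks and GRU three. The function names are hypothetical, and real libraries may store extra bias vectors, so their reported counts can differ.

```python
def gate_block_params(input_size, hidden_size):
    """Parameters of one block: input weights + recurrent weights + bias."""
    return hidden_size * input_size + hidden_size * hidden_size + hidden_size

def lstm_params(input_size, hidden_size):
    # Input gate, forget gate, output gate, and candidate cell update.
    return 4 * gate_block_params(input_size, hidden_size)

def gru_params(input_size, hidden_size):
    # Update gate, reset gate, and candidate hidden state (no output gate).
    return 3 * gate_block_params(input_size, hidden_size)

n_in, n_hidden = 10, 20
print(lstm_params(n_in, n_hidden))  # 2480
print(gru_params(n_in, n_hidden))   # 1860
```

Under these assumptions a GRU layer has exactly three quarters of the parameters of an LSTM layer of the same size.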

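3. 5. Bidirectional RNN

A bidirectional RNN (biRNN) is composed of two RNNs, one processing the input sequence in the forward direction and the other in the reverse direction.

Forward RNN: $f_{\theta}(x_0, h_0) = (y_0, h_{1}), f_{\theta}(x_1, h_1) = (y_1, h_{2}), \dots$
Backward RNN: $f'_{\theta'}(x_N, h_N') = (y'_N, h_{N-1}'), f'_{\theta'}(x_{N-1}, h_{N-1}') = (y'_{N-1}, h_{N-2}'), \dots$

The two output sequences are then combined to give the total output:

$$((y_0, y_0'), (y_1, y_1'), \dots, (y_N, y_N'))$$

A bidirectional RNN lets the model process each token in the context of both what precedes it and what follows it. Stacking several bidirectional RNNs lets the model process a token in increasingly rich context. The ELMo model (2018)^[48] is a stacked bidirectional LSTM that takes character-level input and produces word-level embeddings.

A bidirectional RNN uses a finite sequence to predict or label each element of the sequence based on the element's past and future context. It does so by combining the outputs of two RNNs, one processing the sequence from left to right and the other from right to left. The combined outputs are the predictions of the teacher-given target signals. The technique has proved especially useful in combination with LSTM RNNs.^[188]^[189]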
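
The data flow of a bidirectional RNN can be sketched as follows. This is a minimal illustration, not a trained model: each direction is a toy scalar RNN whose output equals its new state, and the names `make_step`, `run`, and `bidirectional` as well as the weights are hypothetical.

```python
import math

def make_step(w, u):
    """Return a toy scalar RNN step f(x, h) -> (y, h_next)."""
    def step(x, h):
        h_next = math.tanh(w * x + u * h)
        return h_next, h_next  # in this toy model the output is the new state
    return step

def run(step, xs, h0=0.0):
    """Apply a step function along a sequence and collect the outputs."""
    h, ys = h0, []
    for x in xs:
        y, h = step(x, h)
        ys.append(y)
    return ys

def bidirectional(forward_step, backward_step, xs):
    """Pair each forward output y_t with the backward output y'_t."""
    ys_fwd = run(forward_step, xs)
    # Process the reversed input, then flip the outputs back to input order.
    ys_bwd = run(backward_step, xs[::-1])[::-1]
    return list(zip(ys_fwd, ys_bwd))

# The two directions have separate parameters (theta and theta').
pairs = bidirectional(make_step(1.0, 0.5), make_step(0.8, 0.3), [1.0, 0.0, -1.0])
```

In each pair, the first component depends only on the inputs up to that position and the second only on the inputs from that position onward, so together they cover the whole sequence.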

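3. 6. Encoder-decoder

An encoder-decoder model is built by connecting two RNNs. The encoder RNN processes the input sequence into a sequence of hidden vectors, and the decoder RNN generates the output sequence from them, optionally using an attention mechanism. This structure was widely used in neural machine translation from 2014 to 2017 and was an important foundation for the development of the Transformer.^[49]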

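4. Training

Modern RNNs are mainly based on two architectures: LSTM and the bidirectional RNN (BRNN).^[32]

With the resurgence of neural network research in the 1980s, recurrent networks attracted renewed attention. Early on, the Jordan network (1986) and the Elman network (1990) applied RNNs to research in cognitive psychology.^[33] In 1993, a neural history compressor system solved a "Very Deep Learning" task that required more than 1,000 layers in an RNN.^[34]

Long short-term memory (LSTM) networks, published by Hochreiter and Schmidhuber in 1997, achieved high accuracy in many application areas and became the default choice of RNN architecture.^[35]^[36]

A bidirectional RNN (BRNN) uses two RNNs that process the same input in opposite directions.^[37] The two ideas are sometimes combined into a bidirectional LSTM architecture.

Around 2006, bidirectional LSTM outperformed conventional models in certain speech applications and began to have a major impact on speech recognition.^[38]^[39] It was also used in Google voice search and in dictation on Android devices,^[41] and it performed well in machine translation,^[42] language modeling,^[43] and multilingual language processing.^[44] Combined with convolutional neural networks (CNNs), LSTM also improved automatic image captioning.^[45]

The idea of encoder-decoder sequence transduction was developed in the early 2010s. Two papers published in 2014^[46]^[47] are most often cited as the origin of the seq2seq architecture. Seq2seq uses two RNNs, typically LSTMs, as an "encoder" and a "decoder" to perform sequence transduction tasks such as machine translation. The architecture became the state of the art in machine translation and strongly influenced the development of attention mechanisms and the Transformer.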

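4. 1. Teacher forcing

Teacher forcing is a technique for training RNNs in which the ground-truth value, rather than the model's own prediction, is used as the input at the next time step.^[51]

For example, suppose a seq2seq model is used for machine translation, producing a French word sequence $(y_1, \dots, y_m)$ from an English word sequence $(x_1, x_2, \dots, x_n)$.

During training, the encoder first reads $(x_1, x_2, \dots, x_n)$ and the decoder then generates a sequence $(\hat y_1, \hat y_2, \dots, \hat y_{l})$. If the model makes a mistake early on, say at $\hat y_2$, the subsequent tokens are likely to be wrong as well. This makes it inefficient to obtain a learning signal, because the model would mostly learn to shift $\hat y_2$ toward $y_2$ and learn little about the other tokens.

Teacher forcing addresses this by letting the decoder use the correct output sequence when generating the next entry. For example, to generate $\hat y_{k+1}$ it conditions on the ground truth $(y_1, \dots, y_{k})$ rather than on its own earlier predictions $(\hat y_1, \dots, \hat y_{k})$.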
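
The effect of an early mistake can be illustrated with a toy decoder. The sketch below does no training at all; it only contrasts which token is fed back at each step. The lookup-table "model", the vocabulary, and the function names are hypothetical stand-ins for a real decoder RNN.

```python
def decode(model, target, teacher_forcing):
    """Run a toy decoder for len(target) steps.

    model: function mapping the previous token to the predicted next token.
    target: the ground-truth output sequence (y_1, ..., y_m).
    teacher_forcing: if True, feed back y_k; otherwise feed back y_hat_k.
    """
    prev = "<s>"
    predictions = []
    for y_true in target:
        y_hat = model(prev)
        predictions.append(y_hat)
        prev = y_true if teacher_forcing else y_hat
    return predictions

# Hypothetical imperfect decoder: it wrongly prefers "chien" after "le".
TABLE = {
    "<s>": "le", "le": "chien", "chat": "dort",
    "chien": "aboie", "dort": "</s>", "aboie": "</s>",
}

def toy_model(prev):
    return TABLE.get(prev, "<unk>")

def count_errors(predictions, target):
    return sum(p != t for p, t in zip(predictions, target))

target = ["le", "chat", "dort", "</s>"]
free_running = decode(toy_model, target, teacher_forcing=False)
forced = decode(toy_model, target, teacher_forcing=True)

print(count_errors(free_running, target))  # 2: the early error propagates
print(count_errors(forced, target))        # 1: only the actual mistake remains
```

With teacher forcing, the single wrong prediction stays isolated, so every other position still yields a useful comparison against the target.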

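4. 2. Gradient descent

Gradient descent is a first-order iterative optimization algorithm for finding the minimum of a function. In neural networks, provided the nonlinear activation functions are differentiable, it can be used to minimize the error term by changing each weight in proportion to the derivative of the error with respect to that weight.^[76]^[77]

The standard method for training RNNs is the "backpropagation through time" (BPTT) algorithm, a special case of the general backpropagation algorithm.^[76]^[77] A more computationally expensive online variant is called "real-time recurrent learning" (RTRL); it is an instance of automatic differentiation in forward accumulation mode with stacked tangent vectors. Unlike BPTT, RTRL is local in time but not local in space.

Local in space means that a unit's weight vector can be updated using only information stored in the connected units and in the unit itself, so that the update complexity of a single unit is linear in the dimensionality of the weight vector. Local in time means that updates take place continually (online) and depend only on the most recent time step rather than on multiple time steps within a given time horizon, as in BPTT. Biological neural networks appear to be local with respect to both time and space.^[78]^[79]

For recursively computing the partial derivatives, RTRL has a time complexity of O(number of hidden units × number of weights) per time step for computing the Jacobian matrices, whereas BPTT takes only O(number of weights) per time step, at the cost of storing all forward activations within the given time horizon.^[80] Online hybrids of BPTT and RTRL with intermediate complexity exist, as do variants for continuous time.^[81]^[82]^[83]

A major problem with gradient descent for standard RNN architectures is that error gradients vanish exponentially quickly as the time lag between important events grows.^[96]^[84] LSTM combined with a BPTT/RTRL hybrid learning method attempts to overcome this problem.^[36] The problem is also addressed in the independently recurrent neural network (IndRNN),^[93] which reduces the context of a neuron to its own past state; cross-neuron information can then be explored in the following layers. Memories of different ranges, including long-term memory, can be learned without the gradient vanishing and exploding problem.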
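
The exponential decay can be seen in the simplest possible case. The sketch below assumes a hypothetical one-unit RNN, $h_t = \tanh(u\,h_{t-1} + x_t)$, and uses the chain rule to accumulate $\partial h_T / \partial h_0 = \prod_t u\,(1 - h_t^2)$; the function name is illustrative only.

```python
import math

def gradient_through_time(u, inputs, h0=0.0):
    """Return d h_T / d h_0 for the scalar RNN h_t = tanh(u * h_{t-1} + x_t)."""
    h, grad = h0, 1.0
    for x in inputs:
        h = math.tanh(u * h + x)
        grad *= u * (1.0 - h * h)  # local derivative d h_t / d h_{t-1}
    return grad

for steps in (10, 50, 100):
    zeros = [0.0] * steps
    print(steps,
          gradient_through_time(0.5, zeros),   # shrinks toward zero
          gradient_through_time(1.5, zeros))   # grows without bound
```

With zero inputs and a zero initial state, the hidden state stays at zero and the product reduces to $u^T$: it vanishes for $|u| < 1$ and explodes for $|u| > 1$. Non-zero inputs only shrink the factors further, since $1 - h_t^2 \le 1$.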

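Causal recursive backpropagation (CRBP) implements and combines the BPTT and RTRL paradigms for locally recurrent networks.^[85] It works with the most general locally recurrent networks and can minimize the global error term. This improves the stability of the algorithm and provides a unifying view of gradient calculation techniques for recurrent networks with local feedback.

One approach to computing gradient information in RNNs with arbitrary architectures is based on the diagrammatic derivation of signal-flow graphs.^[86] It uses the BPTT batch algorithm, based on Lee's theorem for network sensitivity calculations.^[87] It was proposed by Wan and Beaufays, and a fast online version was proposed by Campolucci, Uncini, and Piazza.^[87]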

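4. 3. Connectionist temporal classification (CTC)

Connectionist temporal classification (CTC) is a specialized loss function for training RNNs on sequence modeling problems in which the timing is variable, that is, where the alignment between input frames and output labels is not known in advance.^[88]^[89]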
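
CTC handles variable timing by letting the network emit one symbol per input frame, including a special blank, and then mapping each frame-level path to a label sequence by merging repeated symbols and dropping blanks. The sketch below shows only this collapsing step, not the loss itself, which sums the probabilities of all paths that collapse to the target. The function name and the choice of `-` as the blank symbol are illustrative assumptions.

```python
from itertools import groupby

BLANK = "-"  # assumed blank symbol for this illustration

def ctc_collapse(path, blank=BLANK):
    """Map a frame-level path to its label sequence.

    Step 1: merge runs of identical consecutive symbols.
    Step 2: remove blanks.
    """
    merged = [symbol for symbol, _ in groupby(path)]
    return "".join(s for s in merged if s != blank)

print(ctc_collapse("hh-e-ll-llo--"))  # hello
print(ctc_collapse("-h-ee-l-l-o"))    # hello
print(ctc_collapse("aa-a"))           # aa
```

Many paths of different timing collapse to the same labels, and a blank between two identical symbols is what allows genuinely doubled letters such as the "ll" in "hello" to survive the merge.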

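5. Applications

Recurrent neural networks are used in a wide range of fields, including the following.

Natural language processing: machine translation,^[42] language modeling,^[266] multilingual language processing,^[44] and automatic image captioning.^[45]
Speech recognition and synthesis: speech recognition,^[120]^[39]^[121] speech synthesis,^[122] and Google voice search and dictation on Android devices.^[41]
Time series analysis: time series forecasting^[117]^[118]^[119] and time series anomaly detection.^[124]
Robot control: robot control.^[116]
Other: rhythm learning,^[126] music composition,^[127] grammar learning,^[128]^[129]^[130] handwriting recognition,^[131]^[132] human action recognition,^[133] protein homology detection,^[134] prediction of the subcellular localization of proteins,^[135] several prediction tasks in business process management,^[136] prediction of medical care pathways,^[137] and prediction of fusion plasma disruptions in reactors.^[138]

In particular, LSTM, invented by Hochreiter and Schmidhuber in 1997, has shown excellent performance in speech recognition,^[38]^[39] handwriting recognition,^[257]^[258] machine translation,^[42] and other fields. In 2014, Baidu set a new speech recognition benchmark using RNNs trained with connectionist temporal classification (CTC).^[259]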

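6. Evaluation

The performance of RNN models is evaluated with a variety of tasks and metrics. One example follows.

The copying task evaluates the memory of a sequence-processing model by asking it to "recall at the end the order of the digits presented at the beginning".^[222] The model first receives 10 consecutive inputs sampled at random from $\{1, \dots, 8\}$ (the memorization phase), then L inputs of $0$ (the retention phase), and finally $9$ ten times in a row (the recall phase). The model must memorize the first 10 digits, retain them through the L steps of $0$, and then output them in order in response to the $9$s.^[223]

The copying task requires holding a memory across a long time lag^[224] and is a standard task for directly evaluating long-term memory. It is simple to state but known to be hard: simple RNNs such as the Elman network cannot solve it, and even LSTM is known to learn L=100 only partially.^[225]^[226]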
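
A generator for one example of the task, following the description above, might look like this. The encoding of the expected output during the memorization and retention phases is not specified in the text; the sketch assumes the target is $0$ there, and the function and parameter names are hypothetical.

```python
import random

def copying_example(delay, rng, n_symbols=10):
    """Build one (inputs, targets) pair for the copying task.

    inputs : n_symbols digits from {1..8}, then `delay` zeros, then n_symbols nines.
    targets: zeros until the recall phase (an assumption), then the digits in order.
    """
    memory = [rng.randint(1, 8) for _ in range(n_symbols)]
    inputs = memory + [0] * delay + [9] * n_symbols
    targets = [0] * (n_symbols + delay) + memory
    return inputs, targets

rng = random.Random(0)  # fixed seed so the example is reproducible
inputs, targets = copying_example(delay=100, rng=rng)
print(len(inputs), len(targets))  # 120 120
```

The `delay` argument plays the role of L: increasing it lengthens the retention phase without changing what has to be remembered, which is what makes the task a direct probe of long-term memory.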

7. Related libraries

A variety of deep learning libraries, including PyTorch, TensorFlow, and Keras, support recurrent neural network models.^[115] These libraries support both training and inference of RNNs, and some optimize performance through just-in-time (JIT) compilation.

The main libraries that support recurrent neural networks are listed below.

Library	Description
Apache Singa	Deep learning framework.
Caffe	Developed by the Berkeley Vision and Learning Center (BVLC). Supports CPU and GPU. Written in C++, with Python and MATLAB wrappers.
Chainer	Python-based. Supports CPU, GPU, and distributed training.
Deeplearning4j	Deep learning in Java and Scala on Spark. Supports multiple GPUs.
Flux	Julia-based. Provides interfaces for RNNs, including GRU and LSTM.
Keras	High-level API that provides a wrapper around other deep learning libraries.
Microsoft Cognitive Toolkit	Deep learning framework.
MXNet	Open-source deep learning framework.
PyTorch	Tensors and dynamic neural networks in Python with GPU acceleration.
TensorFlow	Apache 2.0 license. Supports CPU, GPU, and tensor processing units (TPU).
Theano	Python deep learning library with a NumPy-like API.
Torch	Scientific computing framework based on C and Lua. Supports machine learning algorithms.

References

_[1] 논문 Time series forecasting using artificial neural networks methodologies: A systematic review 2018-12-01
_[2] 논문 A Novel Connectionist System for Improved Unconstrained Handwriting Recognition http://www.idsia.ch/[...]
_[3] 웹사이트 Long Short-Term Memory recurrent neural network architectures for large scale acoustic modeling https://research.goo[...] Google Research
_[4] arXiv Constructing Long Short-Term Memory based Deep Recurrent Neural Networks for Large Vocabulary Speech Recognition 2014-10-15
_[5] 논문 A thorough review on the current advance of neural network structures. https://www.scienced[...] 2019
_[6] 논문 State-of-the-art in artificial neural network applications: A survey 2018-11-01
_[7] 논문 The Importance of Cajal's and Lorente de Nó's Neuroscience to the Birth of Cybernetics http://journals.sage[...] 2023-07-05
_[8] 서적 Histologie du système nerveux de l'homme & des vertébrés https://archive.org/[...] Paris : A. Maloine 1909
_[9] 논문 Vestibulo-Ocular Reflex Arc http://archneurpsyc.[...] 1933-08-01
_[10] 논문 Some predictions of Rafael Lorente de Nó 80 years later 2014-12-03
_[11] 웹사이트 reverberating circuit https://www.oxfordre[...] 2024-07-27
_[12] 논문 A logical calculus of the ideas immanent in nervous activity http://link.springer[...] December 1943
_[13] 논문 On the legacy of W.S. McCulloch https://linkinghub.e[...] April 2007
_[14] 논문 Warren McCulloch's Search for the Logic of the Nervous System https://muse.jhu.edu[...] December 2000
_[15] 논문 Central Effects of Centripetal Impulses in Axons of Spinal Ventral Roots https://www.physiolo[...] 1946-05-01
_[16] 논문 Recurrent Neural Networks 2013-02-22
_[17] 서적 Perceptual Generalization over Transformation Groups Pergamon Press
_[18] 서적 DTIC AD0256582: PRINCIPLES OF NEURODYNAMICS. PERCEPTRONS AND THE THEORY OF BRAIN MECHANISMS https://archive.org/[...] Defense Technical Information Center 1961-03-15
_[19] 서적 Pattern Recognition and Machine Learning 1971
_[20] 논문 Associatron-A Model of Associative Memory 1972
_[21] 논문 Learning patterns and pattern sequences by self-organizing nets of threshold elements 1972
_[22] 논문 The Existence of Persistent States in the Brain
_[23] 논문 Beiträge zum Verständnis der magnetischen Eigenschaften in festen Körpern
_[24] 논문 Beitrag zur Theorie des Ferromagnetismus
_[25] 논문 History of the Lenz-Ising Model
_[26] 논문 Roy J. Glauber "Time-Dependent Statistics of the Ising Model" https://aip.scitatio[...] 2021-03-21
_[27] 논문 Solvable Model of a Spin-Glass https://link.aps.org[...] 1975-12-29
_[28] 논문 Neural networks and physical systems with emergent collective computational abilities 1982
_[29] 논문 Neurons with graded response have collective computational properties like those of two-state neurons 1984
_[30] 서적 Statistical mechanics of learning Cambridge University Press 2001
_[31] 논문 Statistical mechanics of learning from examples https://journals.aps[...] 1992-04-01
_[32] 서적 Dive into deep learning Cambridge University Press 2024
_[33] 논문 Learning representations by back-propagating errors https://www.nature.c[...] October 1986
_[34] 서적 Habilitation thesis: System modeling and optimization ftp://ftp.idsia.ch/p[...]
_[35] Wikidata
_[36] 논문 Long Short-Term Memory 1997-11-01
_[37] 논문 Bidirectional recurrent neural networks https://www.research[...] 1997
_[38] 논문 Framewise phoneme classification with bidirectional LSTM and other neural network architectures 2005-07-01
_[39] 학회발표 An Application of Recurrent Neural Networks to Discriminative Keyword Spotting http://dl.acm.org/ci[...] Springer-Verlag 2007
_[40] 학회발표 Photo-Real Talking Head with Deep Bidirectional LSTM 2015
_[41] 웹사이트 Google voice search: faster and more accurate http://googleresearc[...] 2015-09
_[42] 논문 Sequence to Sequence Learning with Neural Networks https://papers.nips.[...] 2014
_[43] arXiv Exploring the Limits of Language Modeling 2016-02-07
_[44] arXiv Multilingual Language Processing From Bytes 2015-11-30
_[45] arXiv Show and Tell: A Neural Image Caption Generator 2014-11-17
_[46] arXiv Learning Phrase Representations using RNN Encoder-Decoder for Statistical Machine Translation 2014-06-03
_[47] arXiv Sequence to sequence learning with neural networks 2014-12-14
_[48] arXiv Deep contextualized word representations 2018
_[49] 논문 Attention is All you Need https://proceedings.[...] Curran Associates, Inc. 2017
_[50] 논문 Pixel Recurrent Neural Networks https://proceedings.[...] PMLR 2016-06-11
_[51] 서적 Neural Networks as Cybernetic Systems http://www.brains-mi[...]
_[52] 논문 Finding Structure in Time 1990
_[53] 학회발표 Neural-Network Models of Cognition — Biobehavioral Foundations 1997-01-01
_[54] 논문 Learning Precise Timing with LSTM Recurrent Networks http://www.jmlr.org/[...] 2002
_[55] 서적 Artificial Neural Networks – ICANN 2009 http://mediatum.ub.t[...] Springer 2009-09-14
_[56] 학회발표 Sequence labelling in structured domains with hierarchical recurrent neural networks https://www.ijcai.or[...] 2007
_[57] arXiv Simplified Minimal Gated Unit Variations for Recurrent Neural Networks 2017-01-12
_[58] arXiv Gate-Variants of Gated Recurrent Unit (GRU) Neural Networks 2017-01-20
_[59] 웹사이트 Recurrent Neural Network Tutorial, Part 4 – Implementing a GRU/LSTM RNN with Python and Theano – WildML http://www.wildml.co[...] 2015-10-27
_[60] arXiv Empirical Evaluation of Gated Recurrent Neural Networks on Sequence Modeling 2014
_[61] 논문 Are GRU cells more specific and LSTM cells more sensitive in motive classification of text? 2020
_[62] 논문 Bidirectional associative memories 1988
_[63] 논문 Exponential stability for markovian jumping stochastic BAM neural networks with mode-dependent probabilistic time-varying delays and impulse control 2015-01-02
_[64] 서적 Neural networks: a systematic introduction https://books.google[...] Springer 1996
_[65] 논문 Harnessing Nonlinearity: Predicting Chaotic Systems and Saving Energy in Wireless Communication 2004-04-02
_[66] 논문 Real-time computing without stable states: a new framework for neural computation based on perturbations https://igi-web.tugr[...] 2002
_[67] 학회발표 Proceedings of International Conference on Neural Networks (ICNN'96) 1996
_[68] MSc The representation of the cumulative rounding error of an algorithm as a Taylor expansion of the local rounding errors University of Helsinki 1970
_[69] 서적 Evaluating Derivatives: Principles and Techniques of Algorithmic Differentiation https://books.google[...] SIAM 2008
_[70] 학회발표 28th International Conference on Machine Learning (ICML 2011) 2011
_[71] 논문 Recursive Deep Models for Semantic Compositionality Over a Sentiment Treebank http://nlp.stanford.[...]
_[72] arXiv Neural Turing Machines
_[73] 논문 Hybrid computing using a neural network with dynamic external memory http://www.nature.co[...] 2016-10-12
_[74] 서적 Adaptive Processing of Sequences and Data Structures Springer
_[75] 논문 Turing machines are recurrent neural networks 1996
_[76] 서적 The Utility Driven Dynamic Error Propagation Network https://books.google[...] Department of Engineering, University of Cambridge
_[77] 서적 Backpropagation: Theory, Architectures, and Applications https://books.google[...] Psychology Press 2013-02-01
_[78] 논문 A Local Learning Algorithm for Dynamic Feedforward and Recurrent Networks 1989-01-01
_[79] 서적 Neural and adaptive systems: fundamentals through simulations https://books.google[...] Wiley
_[80] arXiv Training recurrent networks online without backtracking 2015-07-28
_[81] 논문 A Fixed Size Storage O(n3) Time Complexity Learning Algorithm for Fully Recurrent Continually Running Networks 1992-03-01
_[82] 보고서 Complexity of exact gradient computation algorithms for recurrent neural networks http://citeseerx.ist[...] Northeastern University, College of Computer Science 1989
_[83] 논문 Learning State Space Trajectories in Recurrent Neural Networks http://repository.cm[...] 1989-06-01
_[84] 서적 A Field Guide to Dynamical Recurrent Networks John Wiley & Sons 2001-01-15
_[85] 논문 On-Line Learning Algorithms for Locally Recurrent Neural Networks
_[86] 논문 Diagrammatic derivation of gradient algorithms for neural networks
_[87] 논문 A Signal-Flow-Graph Approach to On-line Gradient Calculation
_[88] 학회논문 Connectionist temporal classification: Labelling unsegmented sequence data with recurrent neural networks https://axon.cs.byu.[...]
_[89] 논문 Sequence Modeling with CTC https://distill.pub/[...] 2017-11-27
_[90] citation IJCAI 99 Morgan Kaufmann
_[91] MSc Applying Genetic Algorithms to Recurrent Neural Networks for Learning Network Parameters and Architecture http://arimaa.com/ar[...] Department of Electrical Engineering, Case Western Reserve University 1995-05
_[92] 논문 Accelerated Neural Evolution Through Cooperatively Coevolved Synapses https://www.jmlr.org[...] 2008-06
_[93] arXiv Independently Recurrent Neural Network (IndRNN): Building a Longer and Deeper RNN
_[94] 논문 Learning complex, extended sequences using the principle of history compression ftp://ftp.idsia.ch/p[...]
_[95] 논문 Deep Learning
_[96] Diploma Untersuchungen zu dynamischen neuronalen Netzen http://people.idsia.[...] Institut f. Informatik, Technische University Munich 1991
_[97] 논문 Learning and Extracting Finite State Automata with Second-Order Recurrent Neural Networks https://clgiles.ist.[...]
_[98] 논문 Constructing Deterministic Finite-State Automata in Recurrent Neural Networks
_[99] 논문 How Hierarchical Control Self-organizes in Artificial Adaptive Systems 2005-09-01
_[100] 웹사이트 A Bergson-Inspired Adaptive Time Constant for the Multiple Timescales Recurrent Neural Network Model. JNNS https://www.research[...]
_[101] 논문 Forecasting CPI inflation components with Hierarchical Recurrent Neural Networks 2023
_[102] 서적 Recurrent Multilayer Perceptrons for Identification and Control: The Road to Applications University of Würzburg Am Hubland 1995-06
_[103] 논문 Emergence of Functional Hierarchy in a Multiple Timescale Neural Network Model: A Humanoid Robot Experiment 2008-11-07
_[104] 논문 The hierarchical and functional connectivity of higher-order cognitive mechanisms: neurorobotic model to investigate the stability and flexibility of working memory
_[105] 웹사이트 Proceedings of the 28th Annual Conference of the Japanese Neural Network Society (October, 2018) http://jnns.org/conf[...]
_[106] 논문 Cortical computing with memristive nanodevices http://www.scidacrev[...] 2019-09-06
_[107] 논문 The complex dynamics of memristive circuits: analytical results and universal slow relaxation
_[108] 간행물 3rd international conference on Simulation of adaptive behavior: from animals to animats 3
_[109] 컨퍼런스 Evolving communication without dedicated communication channels
_[110] 논문 The dynamics of adaptive behavior: A research program
_[111] 컨퍼런스 Deriving the Recurrent Neural Network Definition and RNN Unrolling Using Signal Processing https://www.research[...] 2018-12-07
_[112] 논문 Computational Capabilities of Recurrent NARX Neural Networks https://books.google[...]
_[113] 논문 Comparative analysis of Recurrent and Finite Impulse Response Neural Networks in Time Series Prediction http://www.ijcse.com[...] 2012-02-01
_[114] 논문 Brain inspired neuronal silencing mechanism to enable reliable sequence identification 2022-09-29
_[115] 뉴스 Google Built Its Very Own Chips to Power Its AI Bots https://www.wired.co[...] 2016-05-18
_[116] 컨퍼런스 2006 IEEE/RSJ International Conference on Intelligent Robots and Systems 2006-10-01
_[117] 컨퍼런스 Evolino: Hybrid Neuroevolution/Optimal Linear Search for Sequence Learning https://www.academia[...]
_[118] 논문 Recurrent neural networks for time series forecasting 2019-01-01
_[119] 논문 Recurrent Neural Networks for Time Series Forecasting: Current Status and Future Directions
_[120] 논문 Framewise phoneme classification with bidirectional LSTM and other neural network architectures
_[121] 컨퍼런스 Speech recognition with deep recurrent neural networks
_[122] 논문 Speech synthesis from neural decoding of spoken sentences 2019-04-24
_[123] 논문 Neuroprosthesis for Decoding Speech in a Paralyzed Person with Anarthria 2021-07-15
_[124] 컨퍼런스 Long Short Term Memory Networks for Anomaly Detection in Time Series https://books.google[...] Ciaco 2015-04-01
_[125] 웹사이트 Papers with Code - DeepHS-HDRVideo: Deep High Speed High Dynamic Range Video Reconstruction https://paperswithco[...] 2022-10-13
_[126] 논문 Learning precise timing with LSTM recurrent networks http://www.jmlr.org/[...]
_[127] 서적 Artificial Neural Networks — ICANN 2002 Springer 2002-08-28
_[128] 논문 Learning nonregular languages: A comparison of simple recurrent networks and LSTM
_[129] 논문 LSTM Recurrent Networks Learn Simple Context Free and Context Sensitive Languages ftp://ftp.idsia.ch/p[...] 2017-12-12
_[130] 논문 Kalman filters improve LSTM network performance in problems unsolvable by traditional recurrent nets
_[131] 컨퍼런스 Offline Handwriting Recognition with Multidimensional Recurrent Neural Networks http://papers.neurip[...] MIT Press
_[132] 컨퍼런스 Unconstrained Online Handwriting Recognition with Recurrent Neural Networks http://dl.acm.org/ci[...] Curran Associates
_[133] 서적 Human Behavior Unterstanding Springer
_[134] 논문 Fast model-based protein homology detection without alignment
_[135] 논문 Bidirectional Long Short-Term Memory Networks for Predicting the Subcellular Localization of Eukaryotic Proteins 2007-07-01
_[136] 서적 Advanced Information Systems Engineering
_[137] 논문 Doctor AI: Predicting Clinical Events via Recurrent Neural Networks http://proceedings.m[...]
_[138] 웹사이트 Artificial intelligence helps accelerate progress toward efficient fusion reactions https://www.princeto[...] 2023-06-12
_[139] 서적 Serial order: A parallel distributed processing approach University of California, Institute for Cognitive Science
_[140] 서적 Robust Automatic Speech Recognition Academic Press
_[141] 논문 A Novel Connectionist System for Improved Unconstrained Handwriting Recognition http://www.idsia.ch/[...]
_[142] 웹사이트 Long Short-Term Memory recurrent neural network architectures for large scale acoustic modeling https://static.googl[...] 2014
_[143] arxiv Constructing Long Short-Term Memory based Deep Recurrent Neural Networks for Large Vocabulary Speech Recognition 2014-10-15
_[144] 논문 Comparative analysis of Recurrent and Finite Impulse Response Neural Networks in Time Series Prediction http://www.ijcse.com[...] 2012-02-01
_[145] 서적
_[146] 논문 ニューラルネットワークによる構造学習の発展 http://id.nii.ac.jp/[...]
_[147] 논문 Learning representations by back-propagating errors https://www.nature.c[...] 1986-10-01
_[148] 서적 Habilitation thesis: System modeling and optimization ftp://ftp.idsia.ch/p[...]
_[149] 논문 An Application of Recurrent Neural Networks to Discriminative Keyword Spotting http://dl.acm.org/ci[...] Springer-Verlag 2007
_[150] 논문 Offline Handwriting Recognition with Multidimensional Recurrent Neural Networks https://papers.nips.[...] 2009
_[151] arxiv Deep Speech: Scaling up end-to-end speech recognition 2014-12-17
_[152] 논문 Photo-Real Talking Head with Deep Bidirectional LSTM
_[153] 웹사이트 Unidirectional Long Short-Term Memory Recurrent Neural Network with Recurrent Output Layer for Low-Latency Speech Synthesis https://static.googl[...] ICASSP 2015
_[154] 웹사이트 Google voice search: faster and more accurate http://googleresearc[...] 2015-09-01
_[155] 논문 Sequence to Sequence Learning with Neural Networks https://papers.nips.[...] 2014
_[156] arxiv Exploring the Limits of Language Modeling 2016-02-07
_[157] arxiv Multilingual Language Processing From Bytes 2015-11-30
_[158] arxiv Show and Tell: A Neural Image Caption Generator 2014-11-17
_[159] 서적 Neural Networks as Cybernetic Systems http://www.brains-mi[...]
_[160] 논문 Finding Structure in Time
_[161] 서적 Serial Order: A Parallel Distributed Processing Approach 1997-01-01
_[162] 논문 Bidirectional associative memories 1988
_[163] 논문 Exponential stability for markovian jumping stochastic BAM neural networks with mode-dependent probabilistic time-varying delays and impulse control 2015-01-02
_[164] 서적 Neural networks: a systematic introduction https://books.google[...] Springer
_[165] 논문 Harnessing Nonlinearity: Predicting Chaotic Systems and Saving Energy in Wireless Communication 2004-04-02
_[166] 논문 A fresh look at real-time computation in generic recurrent neural circuits http://www.lsm.tugra[...] TU Graz
_[167] 논문 Independently Recurrent Neural Network (IndRNN): Building A Longer and Deeper RNN
_[168] 논문 Learning task-dependent distributed representations by backpropagation through structure https://pdfs.semanti[...]
_[169] 간행물 The representation of the cumulative rounding error of an algorithm as a Taylor expansion of the local rounding errors
_[170] 서적 Evaluating Derivatives: Principles and Techniques of Algorithmic Differentiation https://books.google[...] SIAM
_[171] 논문 28th International Conference on Machine Learning (ICML 2011)
_[172] 논문 Recursive Deep Models for Semantic Compositionality Over a Sentiment Treebank http://nlp.stanford.[...]
_[173] 논문 Learning complex, extended sequences using the principle of history compression ftp://ftp.idsia.ch/p[...]
_[174] 논문 Deep Learning http://www.scholarpe[...]
_[175] 간행물 Untersuchungen zu dynamischen neuronalen Netzen http://people.idsia.[...]
_[176] 논문 Learning and Extracting Finite State Automata with Second-Order Recurrent Neural Networks https://clgiles.ist.[...]
_[177] 논문 Constructing Deterministic Finite-State Automata in Recurrent Neural Networks http://citeseerx.ist[...]
_[178] 논문 Learning Precise Timing with LSTM Recurrent Networks (PDF Download Available) https://www.research[...] 2019-04-05
_[179] Paper Deep Learning in Neural Networks: An Overview 2015-01
_[180] Paper Evolving Memory Cell Structures for Sequence Learning http://mediatum.ub.t[...] Springer, Berlin, Heidelberg 2009-09-14
_[181] Paper Sequence labelling in structured domains with hierarchical recurrent neural networks
_[182] Paper Connectionist temporal classification: Labelling unsegmented sequence data with recurrent neural networks
_[183] Paper LSTM recurrent networks learn simple context-free and context-sensitive languages http://ieeexplore.ie[...] 2001-11
_[184] Paper Simplified Minimal Gated Unit Variations for Recurrent Neural Networks 2017-01-12
_[185] Paper Gate-Variants of Gated Recurrent Unit (GRU) Neural Networks 2017-01-20
_[186] Paper Empirical Evaluation of Gated Recurrent Neural Networks on Sequence Modeling
_[187] Website Recurrent Neural Network Tutorial, Part 4 – Implementing a GRU/LSTM RNN with Python and Theano – WildML http://www.wildml.co[...] 2019-04-05
_[188] Paper Framewise phoneme classification with bidirectional LSTM and other neural network architectures 2005-07-01
_[189] Paper Bidirectional Long Short-Term Memory Networks for Predicting the Subcellular Localization of Eukaryotic Proteins 2007-07
_[190] Paper 3rd international conference on Simulation of adaptive behavior: from animals to animats 3 https://www.research[...]
_[191] Paper Evolving communication without dedicated communication channels
_[192] Paper The dynamics of adaptive behavior: A research program
_[193] Paper How Hierarchical Control Self-organizes in Artificial Adaptive Systems 2005-09-01
_[194] Book Recurrent Multilayer Perceptrons for Identification and Control: The Road to Applications
_[195] Paper Emergence of Functional Hierarchy in a Multiple Timescale Neural Network Model: A Humanoid Robot Experiment 2008-11-07
_[196] Paper The hierarchical and functional connectivity of higher-order cognitive mechanisms: neurorobotic model to investigate the stability and flexibility of working memory
_[197] Paper Neural Turing Machines
_[198] Paper The Neural Network Pushdown Automaton: Architecture, Dynamics and Training Springer Berlin Heidelberg
_[199] Book Serial order: A parallel distributed processing approach University of California, Institute for Cognitive Science
_[200] Book Combining Recurrent, Convolutional, and Continuous-time Models with Linear State-Space Layers
_[201] Paper HiPPO: Recurrent Memory with Optimal Polynomial Projections https://proceedings.[...] NeurIPS
_[202] Book Combining Recurrent, Convolutional, and Continuous-time Models with Linear State-Space Layers
_[203] Book Efficiently Modeling Long Sequences with Structured State Spaces
_[204] Paper Generalization of backpropagation with application to a recurrent gas market model http://linkinghub.el[...]
_[205] Book Learning Internal Representations by Error Propagation Institute for Cognitive Science, University of California, San Diego
_[206] Book The Utility Driven Dynamic Error Propagation Network. Technical Report CUED/F-INFENG/TR.1 University of Cambridge Department of Engineering
_[207] Book Backpropagation: Theory, Architectures, and Applications Psychology Press 2013-02-01
_[208] Paper A Local Learning Algorithm for Dynamic Feedforward and Recurrent Networks
_[209] Book Neural and adaptive systems: fundamentals through simulations Wiley
_[210] Paper Training recurrent networks online without backtracking 2015-07-28
_[211] Paper A Fixed Size Storage O(n^3) Time Complexity Learning Algorithm for Fully Recurrent Continually Running Networks 1992-03-01
_[212] Paper Complexity of exact gradient computation algorithms for recurrent neural networks. Technical Report NU-CCS-89-27 http://citeseerx.ist[...] Northeastern University, College of Computer Science 1989
_[213] Paper Learning State Space Trajectories in Recurrent Neural Networks https://doi.org/10.1[...] 1989-06-01
_[214] Book A Field Guide to Dynamical Recurrent Networks https://books.google[...] John Wiley & Sons 2001-01-15
_[215] Paper Long Short-Term Memory 1997-11-01
_[216] Paper On-Line Learning Algorithms for Locally Recurrent Neural Networks 1999
_[217] Paper Diagrammatic derivation of gradient algorithms for neural networks https://doi.org/10.1[...] 1996
_[218] Paper A Signal-Flow-Graph Approach to On-line Gradient Calculation 2000
_[219] Publication IJCAI 99 http://www.cs.utexas[...] Morgan Kaufmann 2019-04-05
_[220] Website Applying Genetic Algorithms to Recurrent Neural Networks for Learning Network Parameters and Architecture http://arimaa.com/ar[...] 2019-04-05
_[221] Paper Accelerated Neural Evolution Through Cooperatively Coevolved Synapses http://dl.acm.org/ci[...] 2008-06
_[222] Paper HiPPO: Recurrent Memory with Optimal Polynomial Projections https://arxiv.org/ab[...] 2020
_[223] Paper HiPPO: Recurrent Memory with Optimal Polynomial Projections https://arxiv.org/ab[...] 2020
_[224] Paper Unitary Evolution Recurrent Neural Networks https://arxiv.org/ab[...] 2015
_[225] Paper Unitary Evolution Recurrent Neural Networks https://arxiv.org/ab[...] 2015
_[226] Paper HiPPO: Recurrent Memory with Optimal Polynomial Projections https://arxiv.org/ab[...] 2020
_[227] Book Computational Capabilities of Recurrent NARX Neural Networks https://books.google[...] University of Maryland 1995
_[228] Paper A System for Robotic Heart Surgery that Learns to Tie Knots Using Recurrent Neural Networks http://ieeexplore.ie[...] 2006-10
_[229] Paper Evolino: Hybrid Neuroevolution/Optimal Linear Search for Sequence Learning https://www.academia[...] 2005
_[230] Paper Framewise phoneme classification with bidirectional LSTM and other neural network architectures 2005
_[231] Paper An Application of Recurrent Neural Networks to Discriminative Keyword Spotting http://dl.acm.org/ci[...] Springer-Verlag 2007
_[232] Paper Speech Recognition with Deep Recurrent Neural Networks 2013
_[233] Paper Long Short Term Memory Networks for Anomaly Detection in Time Series https://www.elen.ucl[...] 2015-04
_[234] Paper Learning precise timing with LSTM recurrent networks http://www.jmlr.org/[...] 2002
_[235] Paper Learning the Long-Term Structure of the Blues Springer, Berlin, Heidelberg 2002-08-28
_[236] Paper Learning nonregular languages: A comparison of simple recurrent networks and LSTM 2002
_[237] Paper LSTM Recurrent Networks Learn Simple Context Free and Context Sensitive Languages ftp://ftp.idsia.ch/p[...] 2001
_[238] Paper Kalman filters improve LSTM network performance in problems unsolvable by traditional recurrent nets 2003
_[239] Paper Offline Handwriting Recognition with Multidimensional Recurrent Neural Networks MIT Press 2009
_[240] Paper Unconstrained Online Handwriting Recognition with Recurrent Neural Networks http://dl.acm.org/ci[...] Curran Associates Inc. 2007
_[241] Paper Sequential Deep Learning for Human Action Recognition Springer 2011
_[242] Paper Fast model-based protein homology detection without alignment 2007
_[243] Paper Bidirectional Long Short-Term Memory Networks for predicting the subcellular localization of eukaryotic proteins 2007
_[244] Paper Predictive Business Process Monitoring with LSTM neural networks 2017
_[245] Paper Doctor AI: Predicting Clinical Events via Recurrent Neural Networks http://proceedings.m[...] 2016
_[246] Journal A thorough review on the current advance of neural network structures https://www.scienced[...] 2019
_[247] Journal State-of-the-art in artificial neural network applications: A survey https://www.scienced[...] 2018-11-01
_[248] Journal Time series forecasting using artificial neural networks methodologies: A systematic review https://www.scienced[...] 2018-12-01
_[249] Journal A Novel Connectionist System for Improved Unconstrained Handwriting Recognition http://www.idsia.ch/[...]
_[250] Web citation Long Short-Term Memory recurrent neural network architectures for large scale acoustic modeling https://static.googl[...] 2014
_[251] arXiv
_[252] Journal Comparative analysis of Recurrent and Finite Impulse Response Neural Networks in Time Series Prediction http://www.ijcse.com[...] 2012-02-01 # presumed Feb-Mar
_[253] Journal Learning representations by back-propagating errors 1986-10-01 # presumed October
_[254] Book Habilitation thesis: System modeling and optimization ftp://ftp.idsia.ch/p[...]
_[255] Journal Long Short-Term Memory 1997-11-01
_[256] Book An Application of Recurrent Neural Networks to Discriminative Keyword Spotting http://dl.acm.org/ci[...] Springer-Verlag
_[257] Journal Deep Learning in Neural Networks: An Overview 2015-01-01 # presumed January
_[258] Journal Offline Handwriting Recognition with Multidimensional Recurrent Neural Networks https://papers.nips.[...] Neural Information Processing Systems (NIPS) Foundation
_[259] arXiv Deep Speech: Scaling up end-to-end speech recognition 2014-12-17
_[260] Web citation Long Short-Term Memory recurrent neural network architectures for large scale acoustic modeling https://static.googl[...] 2020-10-08 # access date
_[261] arXiv Constructing Long Short-Term Memory based Deep Recurrent Neural Networks for Large Vocabulary Speech Recognition 2014-10-15
_[262] Publication Photo-Real Talking Head with Deep Bidirectional LSTM
_[263] Web citation Unidirectional Long Short-Term Memory Recurrent Neural Network with Recurrent Output Layer for Low-Latency Speech Synthesis https://static.googl[...] ICASSP
_[264] Web citation Google voice search: faster and more accurate http://googleresearc[...] 2015-09-01 # presumed September
_[265] Journal Sequence to Sequence Learning with Neural Networks https://papers.nips.[...]
_[266] arXiv Exploring the Limits of Language Modeling 2016-02-07
_[267] arXiv Multilingual Language Processing From Bytes 2015-11-30
_[268] arXiv Show and Tell: A Neural Image Caption Generator 2014-11-17
_[269] Publication A Survey on Hardware Accelerators and Optimization Techniques for RNNs https://www.research[...]
_[270] Book Neural Networks as Cybernetic Systems http://www.brains-mi[...]
_[271] Journal Finding Structure in Time
_[272] Book Neural-Network Models of Cognition - Biobehavioral Foundations 1997-01-01
_[273] Journal Learning Precise Timing with LSTM Recurrent Networks http://www.jmlr.org/[...] 2017-06-13 # access date
_[274] Paper Untersuchungen zu dynamischen neuronalen Netzen http://people.idsia.[...]
_[275] Journal Deep Learning in Neural Networks: An Overview 2015-01-01 # presumed January
_[276] Book Evolving Memory Cell Structures for Sequence Learning http://mediatum.ub.t[...] Springer 2009-09-14
_[277] Journal Sequence labelling in structured domains with hierarchical recurrent neural networks
_[278] Journal Connectionist temporal classification: Labelling unsegmented sequence data with recurrent neural networks
_[279] Journal LSTM recurrent networks learn simple context-free and context-sensitive languages https://semanticscho[...] 2001-11-01 # presumed November
_[280] arXiv Simplified Minimal Gated Unit Variations for Recurrent Neural Networks 2017-01-12
_[281] arXiv Gate-Variants of Gated Recurrent Unit (GRU) Neural Networks 2017-01-20
_[282] Website Recurrent Neural Network Tutorial, Part 4 – Implementing a GRU/LSTM RNN with Python and Theano – WildML http://www.wildml.co[...] 2015-10-27
_[283] arXiv Empirical Evaluation of Gated Recurrent Neural Networks on Sequence Modeling 2014
_[284] Journal Framewise phoneme classification with bidirectional LSTM and other neural network architectures 2005-07-01
_[285] Journal Bidirectional Long Short-Term Memory Networks for Predicting the Subcellular Localization of Eukaryotic Proteins 2007-07
_[286] Journal Sequence to Sequence Learning with Neural Networks https://papers.nips.[...] 2014
_[287] Book A System for Robotic Heart Surgery that Learns to Tie Knots Using Recurrent Neural Networks 2006-10
_[288] Journal Evolino: Hybrid Neuroevolution/Optimal Linear Search for Sequence Learning https://www.academia[...] 2005
_[289] arXiv Recurrent neural networks for time series forecasting 2019-01-01
_[290] Journal Recurrent Neural Networks for Time Series Forecasting: Current Status and Future Directions 2020
_[291] Journal Framewise phoneme classification with bidirectional LSTM and other neural network architectures 2005
_[292] Book An Application of Recurrent Neural Networks to Discriminative Keyword Spotting http://dl.acm.org/ci[...] Springer-Verlag 2007
_[293] Journal Speech Recognition with Deep Recurrent Neural Networks 2013
_[294] Journal Speech synthesis from neural decoding of spoken sentences 2019-04-24
_[295] Journal Long Short Term Memory Networks for Anomaly Detection in Time Series https://www.elen.ucl[...] 2015-04
_[296] Journal Learning precise timing with LSTM recurrent networks http://www.jmlr.org/[...] 2002
_[297] Book Learning the Long-Term Structure of the Blues Springer 2002-08-28
_[298] Journal Learning nonregular languages: A comparison of simple recurrent networks and LSTM 2002
_[299] Journal LSTM Recurrent Networks Learn Simple Context Free and Context Sensitive Languages ftp://ftp.idsia.ch/p[...] 2001
_[300] Journal Kalman filters improve LSTM network performance in problems unsolvable by traditional recurrent nets 2003
_[301] Journal Offline Handwriting Recognition with Multidimensional Recurrent Neural Networks MIT Press 2009
_[302] Book Unconstrained Online Handwriting Recognition with Recurrent Neural Networks http://dl.acm.org/ci[...] Curran Associates Inc. 2007
_[303] Journal Sequential Deep Learning for Human Action Recognition Springer 2011
_[304] Journal Fast model-based protein homology detection without alignment 2007
_[305] Journal Bidirectional Long Short-Term Memory Networks for Predicting the Subcellular Localization of Eukaryotic Proteins 2007-07
_[306] Book Predictive Business Process Monitoring with LSTM neural networks 2017