Notes from Week 3 of auditing the course¶
Sequence Models¶
- Moving from sentiment in individual words to sentiment obtained from a sequence of words (e.g. "fun" vs. "not fun")
RNN¶
Outputs from the previous stage are fed into the next stage. See the Deep Learning Specialization for details.
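A minimal sketch (not from the course) of what such a recurrent network could look like in Keras; vocab_size and embedding_dim are hypothetical placeholder values:

import tensorflow as tf

vocab_size = 10000    # hypothetical vocabulary size
embedding_dim = 64    # hypothetical embedding dimension

rnn_model = tf.keras.Sequential([
    tf.keras.layers.Embedding(vocab_size, embedding_dim),
    # SimpleRNN carries its output from one timestep into the next
    tf.keras.layers.SimpleRNN(32),
    tf.keras.layers.Dense(1, activation='sigmoid'),
])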
LSTM (Long Short-Term Memory)¶
The keyword that provides the context can come from much earlier in the sequence: "I grew up in Ireland, and when I was in school I was taught to speak ..." In an LSTM, a pipeline of "contexts" (the cell state) is carried through the network; with a bidirectional layer it is fed in both the forward and reverse directions.
Is an LSTM a type of RNN? Yes: an LSTM is a recurrent network whose cells add gates that control what is kept in or dropped from the carried state.
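For reference, the standard LSTM cell equations (not spelled out in the course) show what that carried "context" is: the cell state $c_t$, updated at each timestep by the forget, input, and output gates:

$$
\begin{aligned}
f_t &= \sigma(W_f x_t + U_f h_{t-1} + b_f) \\
i_t &= \sigma(W_i x_t + U_i h_{t-1} + b_i) \\
o_t &= \sigma(W_o x_t + U_o h_{t-1} + b_o) \\
\tilde{c}_t &= \tanh(W_c x_t + U_c h_{t-1} + b_c) \\
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t \\
h_t &= o_t \odot \tanh(c_t)
\end{aligned}
$$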
LSTMs in code¶
import tensorflow as tf

# tokenizer is assumed to be the IMDB subwords encoder used in the course labs,
# which exposes a vocab_size attribute
model = tf.keras.Sequential([
    tf.keras.layers.Embedding(tokenizer.vocab_size, 64),
    # LSTM(64): 64 is the dimensionality of the LSTM's output (its hidden state)
    # Bidirectional propagates model state in both directions, so the output of the
    # bidirectional layer is 128 units rather than 64
    # return_sequences=True is required when stacking LSTMs: by default an LSTM returns
    # only its final output, but the next LSTM layer expects the full sequence of
    # outputs (one per timestep) as its input
    tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(64, return_sequences=True)),
    tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(32)),
    tf.keras.layers.Dense(64, activation='relu'),
    tf.keras.layers.Dense(1, activation='sigmoid'),
])
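A rough sketch of how this model would then be compiled and trained; padded_train, train_labels, padded_val, and val_labels are hypothetical placeholders for the tokenized and padded data prepared earlier in the labs:

model.compile(loss='binary_crossentropy',
              optimizer='adam',
              metrics=['accuracy'])

# the dataset variables below are hypothetical placeholders, not names from the labs
history = model.fit(padded_train, train_labels,
                    epochs=10,
                    validation_data=(padded_val, val_labels))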
With text/language models, the likelihood of overfitting is greater because the validation set will most likely contain out-of-vocabulary words.
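One quick way to see this overfitting is to plot the training curves against the validation curves from the fit history (a sketch, assuming the history object from the training call above):

import matplotlib.pyplot as plt

def plot_graphs(history, metric):
    # plot the training curve alongside its validation counterpart
    plt.plot(history.history[metric])
    plt.plot(history.history['val_' + metric])
    plt.xlabel('Epochs')
    plt.ylabel(metric)
    plt.legend([metric, 'val_' + metric])
    plt.show()

plot_graphs(history, 'accuracy')
plot_graphs(history, 'loss')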
Jupyter Notebooks¶
- Week 3, Lab 1, Single layer LSTM
- Week 3, Lab 2, Multi layer LSTM
- Week 3, Lab 3, Using Convolutions
- Week 3, Lab 4
- Week 3, Lab 5, Sarcasm with LSTM
- Week 3, Lab 6, Sarcasm with 1D convolution
Links¶
- Sequence Models Course
- LSTM Course