
Notes from Week 3 of auditing the course

Sequence Models

  • Moving from sentiment carried by individual words to sentiment that emerges from a sequence of words (e.g. "fun" vs. "not fun")

RNN

The output of each stage is fed into the next stage. See the Deep Learning Specialization for details.
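A minimal Keras sketch of this idea (the layer sizes and the vocabulary size of 10000 are placeholder assumptions, not values from the course):

import tensorflow as tf

# SimpleRNN feeds each timestep's output back in as part of the next
# timestep's input, so later words are processed with earlier context.
model = tf.keras.Sequential([
    tf.keras.layers.Embedding(10000, 64),   # assumed vocabulary size
    tf.keras.layers.SimpleRNN(64),          # 64 recurrent units; returns the final state
    tf.keras.layers.Dense(1, activation='sigmoid'),
])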

LSTM (Long Short-Term Memory)

Sometimes the keyword that provides the context comes from much earlier in the sequence: "I grew up in Ireland and when I was in school I was taught to speak ..." (the language to predict depends on "Ireland", which appears much earlier). An LSTM carries a pipeline of "contexts" (cell state) along the sequence; in a bidirectional layer this state is propagated through the network in both the forward and reverse directions.

Is an LSTM a type of RNN? (Yes — an LSTM is a recurrent network whose cells add gates and a cell state to help preserve long-range context.)

LSTMs in code

model = tf.keras.Sequential([
    tf.keras.layers.Embedding(tokenizer.vocab_size, 64),
    # LSTM(64) sets the number of outputs (units) we want from the LSTM layer.
    # Bidirectional propagates model state in both directions, so the output of the
    # Bidirectional layer is 128 (2 x 64) rather than 64.
    # return_sequences=True is required when stacking LSTMs: the next LSTM expects a
    # full sequence of outputs (one per timestep), not just the final state.
    tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(64, return_sequences=True)),
    tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(32)),
    tf.keras.layers.Dense(64, activation='relu'),
    tf.keras.layers.Dense(1, activation='sigmoid'),
])
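To see why return_sequences=True matters when stacking, compare the output shapes directly (a quick sketch; the batch size of 1, sequence length of 20, and feature size of 64 are arbitrary):

import tensorflow as tf

x = tf.random.normal((1, 20, 64))  # (batch, timesteps, features)
last_only = tf.keras.layers.LSTM(64)(x)                        # shape (1, 64): final state only
full_seq = tf.keras.layers.LSTM(64, return_sequences=True)(x)  # shape (1, 20, 64): one output per timestep
print(last_only.shape, full_seq.shape)
# A second LSTM expects 3D input (batch, timesteps, features),
# which is why the first LSTM must return the full sequence.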

With text/language models, the likelihood of overfitting is greater because the validation set will most likely contain out-of-vocabulary words.
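A common mitigation is to reserve an explicit out-of-vocabulary token so unseen words map to a known index rather than being dropped (a sketch; the num_words value and the example sentences are made up):

from tensorflow.keras.preprocessing.text import Tokenizer

tokenizer = Tokenizer(num_words=1000, oov_token="<OOV>")
tokenizer.fit_on_texts(["the movie was fun", "the movie was not fun"])
# "film" and "hilarious" were never seen during fitting,
# so both are encoded with the <OOV> index instead of being skipped.
print(tokenizer.texts_to_sequences(["the film was hilarious"]))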

Jupyter Notebooks

  • Week 3, Lab 1: Single-layer LSTM
  • Week 3, Lab 2: Multi-layer LSTM
  • Week 3, Lab 3: Using convolutions (see the Conv1D sketch below)
  • Week 3, Lab 4
  • Week 3, Lab 5: Sarcasm with LSTM
  • Week 3, Lab 6: Sarcasm with 1D convolution
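For the convolution labs, a minimal sketch of the Conv1D variant (the filter count, kernel size, and other numbers are placeholder assumptions, not taken from the lab notebooks):

import tensorflow as tf

# Conv1D slides a window of 5 word embeddings at a time and learns 128 filters;
# global pooling then collapses the sequence dimension before the dense layers.
model = tf.keras.Sequential([
    tf.keras.layers.Embedding(10000, 64),
    tf.keras.layers.Conv1D(128, 5, activation='relu'),
    tf.keras.layers.GlobalAveragePooling1D(),
    tf.keras.layers.Dense(64, activation='relu'),
    tf.keras.layers.Dense(1, activation='sigmoid'),
])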

  • Sequence Models Course
  • LSTM Course

To Do