Bugs & Pitfalls


PyTorch

LSTM

  1. The LSTM takes [Seq_len * Batch_size * Hidden_size] as input, but embedding usually outputs [Batch_size * Seq_len * Hidden_size]

  2. The output of LSTM is output, (h, c), where (h, c) is a tuple. So, we should use it as output, _ = self.word_lstm(x)