Keras: calculating perplexity on Penn Treebank with an LSTM gives infinity
I am very new to Keras. I use the prepared dataset from the RNN Toolkit and try to train a language model with an LSTM, but I have trouble calculating the perplexity. The summed negative log loss always becomes very large, and when I apply the exp function it comes out as infinity, so I am stuck here. Below is my model code, and the GitHub link ( https://github.com/janenie/lstm_issu_keras ) points to my current problematic code. Can someone help me out?
The test_y data is a list of sentences, one sentence per line, where each sentence is a sequence of word indices; test_x has the same format.
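For illustration, a hypothetical toy example of that format (the indices here are made up):

# each row is one sentence, each entry is a word index into the vocabulary
test_x = [[23, 7, 154, 9], [4, 88, 301, 12]]
test_y = [[7, 154, 9, 2], [88, 301, 12, 2]]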
import numpy as np

from keras.models import Sequential
from keras.layers.core import Activation, Dropout, TimeDistributedDense
from keras.layers.embeddings import Embedding
from keras.layers.recurrent import LSTM


class LSTMLM:
    def __init__(self, input_len, hidden_len, output_len, return_sequences=True):
        self.input_len = input_len
        self.hidden_len = hidden_len
        self.output_len = output_len
        self.seq = return_sequences
        self.model = Sequential()

    def build(self, maxlen=50, dropout=0.2):
        self.model.add(Embedding(self.input_len, self.hidden_len, input_length=maxlen))
        self.model.add(LSTM(output_dim=self.hidden_len, return_sequences=True))
        #self.model.add(Dropout(dropout))
        self.model.add(TimeDistributedDense(self.output_len))
        self.model.add(Activation('softmax'))
        self.model.compile(loss='categorical_crossentropy', optimizer='rmsprop')

    def train(self, train_x, train_y, dev_x, dev_y, batchsize=16, epoch=1):
        hist = self.model.fit(train_x, train_y, batch_size=batchsize, nb_epoch=1,
                              show_accuracy=True, validation_data=(dev_x, dev_y))
        print hist.history

    def saveModel(self, save_file):
        self.model.save_weights(save_file)

    def computePPL(self, test_x, test_y, maxLen=100):
        predictions = self.model.predict(test_x)
        test_num = len(test_y)
        maxlen = predictions.shape[1]
        ppl = 0.0
        total_words = 0
        for i in xrange(test_num):
            pred_idx = test_y[i]
            #print np.max(pred_idx)
            sent_len = len(test_y[i])
            padding = maxlen - sent_len  # sequences are left-padded to maxlen
            for j in xrange(sent_len):
                # probability the model assigns to the reference word at this timestep
                prob_ = predictions[i, j + padding, pred_idx[j]]
                ppl -= np.log(prob_)
            total_words += sent_len
        return np.exp(ppl) / float(total_words)
About this issue
- Original URL
- State: closed
- Created 8 years ago
- Comments: 17 (3 by maintainers)
According to the Socher notes presented by @cheetah90, could we calculate perplexity in the following simple way?
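For reference, a minimal sketch of that kind of calculation (my own illustration, not taken from the notes; the helper name and the toy probabilities are made up): perplexity is the exponential of the average per-word negative log-likelihood, so the division by the word count belongs inside the exponential, not outside it.

import numpy as np

# hypothetical helper: word_probs holds the probability the model assigned
# to each reference word in the test set
def perplexity_from_probs(word_probs):
    nll = -np.log(np.asarray(word_probs))   # per-word negative log-likelihood
    return np.exp(np.mean(nll))             # exp of the mean, not exp(sum) / count

# toy usage with made-up probabilities
print(perplexity_from_probs([0.1, 0.25, 0.05, 0.4]))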
Hi @braingineer
Thanks for sharing your code snippets! I went with your implementation and the little trick for 1/log_e(2). It seems to work fine for me. I was wondering how you actually use the mask parameter when you pass it to model.compile(..., metrics=[perplexity])?
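For anyone arriving here later, a minimal sketch of a perplexity metric along those lines (my own illustration, not @braingineer's snippet; it assumes one-hot categorical targets and does not handle the mask, which is exactly the part that still needs care for padded timesteps):

from keras import backend as K

def perplexity(y_true, y_pred):
    # per-timestep cross-entropy against one-hot targets
    ce = -K.sum(y_true * K.log(y_pred + K.epsilon()), axis=-1)
    # perplexity = exp(mean cross-entropy); padded timesteps are not excluded here,
    # which is where the mask would have to come in
    return K.exp(K.mean(ce))

# illustrative usage:
# model.compile(loss='categorical_crossentropy', optimizer='rmsprop', metrics=[perplexity])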