Saving and loading a Keras model with an ELMo embedding layer

Asked by : melany wijngaard 2021-02-03 21:07

I'm training a Keras model for token classification with an ELMo layer, and I need to save it for future use. I tried model.save_weights("model_weights.h5"), but when I load the weights into a newly built model and call model.predict(...), I get results as if the model had never been trained. It looks like the configuration is not being saved properly.

I am new to Keras and TensorFlow 1, and I'm not sure this is the right way to do it. Any help is welcome! I'm obviously missing something here, but I couldn't find sufficient information on saving models with an ELMo layer.

I am defining the model like this:

# Relies on the globals elmo_model, batch_size and max_len defined elsewhere.
def ElmoEmbedding(x):
    return elmo_model(inputs={"tokens": tf.squeeze(tf.cast(x, tf.string)),
                              "sequence_len": tf.constant(batch_size*[max_len])},
                      signature="tokens",
                      as_dict=True)["elmo"]

def build_model(max_len, n_words, n_tags):
    # Two inputs: a numeric (max_len, 40) input and the raw token strings for ELMo.
    word_input_layer = Input(shape=(max_len, 40, ))
    elmo_input_layer = Input(shape=(max_len,), dtype=tf.string)

    word_output_layer = Dense(n_tags, activation='softmax')(word_input_layer)
    elmo_output_layer = Lambda(ElmoEmbedding, output_shape=(1, 1024))(elmo_input_layer)

    # Merge both branches, then tag each token with a BiLSTM + softmax.
    output_layer = Concatenate()([word_output_layer, elmo_output_layer])
    output_layer = BatchNormalization()(output_layer)
    output_layer = Bidirectional(LSTM(units=512, return_sequences=True, recurrent_dropout=0.2, dropout=0.2))(output_layer)
    output_layer = TimeDistributed(Dense(n_tags, activation='softmax'))(output_layer)

    model = Model([elmo_input_layer, word_input_layer], output_layer)

    return model
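
A note on the sequence_len argument in the definition above: tf.constant(batch_size*[max_len]) bakes a fixed batch size into the graph, so every batch passed to fit or predict presumably has to contain exactly batch_size rows. A minimal sketch, in plain Python, of padding a dataset to a multiple of the batch size. The helper name pad_to_multiple and the "__PAD__" filler token are hypothetical and not part of Keras or TensorFlow Hub:

```python
def pad_to_multiple(samples, batch_size, pad_sample):
    """Return (padded_samples, original_count).

    Appends copies of pad_sample until len(samples) is a multiple of
    batch_size, so that every batch has exactly batch_size rows.
    """
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    original_count = len(samples)
    remainder = original_count % batch_size
    missing = 0 if remainder == 0 else batch_size - remainder
    padded = list(samples) + [pad_sample] * missing
    return padded, original_count


# Hypothetical toy data: 5 sentences of max_len = 3 tokens, batch size 2.
max_len = 3
sentences = [["w%d" % i] * max_len for i in range(5)]
padded, n_real = pad_to_multiple(sentences, 2, ["__PAD__"] * max_len)
print(len(padded), n_real)  # prints: 6 5
```

The predictions for the padded rows could then be dropped with y_pred[:n_real]. It may also be worth passing batch_size=batch_size to model.predict explicitly, since it otherwise defaults to 32.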

And then I run the training like this:

tf.disable_eager_execution()
elmo_model = hub.Module("https://tfhub.dev/google/elmo/3", trainable=False)

sess = tf.Session()
K.set_session(sess)
sess.run([tf.global_variables_initializer(), tf.tables_initializer()])

model = build_model(max_len, n_words, n_tags)
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"])

history = model.fit([np.array(X1_train), np.array(X2_train).reshape((len(X2_train), max_len, 40))],
                    y_train,
                    validation_data=([np.array(X1_valid), np.array(X2_valid).reshape((len(X2_valid), max_len, 40))], y_valid),
                    batch_size=batch_size, epochs=5, verbose=1)

model.save_weights("model_weights.h5")

If I try to load the weights in another session like the following, I get zero accuracy:

tf.disable_eager_execution()
elmo_model = hub.Module("https://tfhub.dev/google/elmo/3", trainable=False)

sess = tf.Session()
K.set_session(sess)
sess.run([tf.global_variables_initializer(), tf.tables_initializer()])

model = build_model(max_len, n_words, n_tags)
model.load_weights("model_weights.h5")
y_pred = model.predict([X1_test, np.array(X2_test).reshape((len(X2_test), max_len, 40))])
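
One way to narrow this down might be to check whether the weights really differ between the two sessions. The sketch below is plain Python and only an illustration: weights_fingerprint is a hypothetical helper, and the toy lists stand in for what [w.tolist() for w in model.get_weights()] would return.

```python
import hashlib
import struct


def _flatten(values):
    """Yield every number in an arbitrarily nested list or tuple as a float."""
    for v in values:
        if isinstance(v, (list, tuple)):
            yield from _flatten(v)
        else:
            yield float(v)


def weights_fingerprint(weights):
    """Return a SHA-256 hex digest over all values in a list of nested lists."""
    digest = hashlib.sha256()
    for value in _flatten(weights):
        # Pack each value as a little-endian double so the digest is stable.
        digest.update(struct.pack("<d", value))
    return digest.hexdigest()


# Toy stand-ins for the weights in the training and the loading session.
trained = [[[0.1, 0.2], [0.3, 0.4]], [0.5, 0.6]]
reloaded = [[[0.1, 0.2], [0.3, 0.4]], [0.5, 0.6]]
fresh = [[[0.0, 0.0], [0.0, 0.0]], [0.0, 0.0]]

print(weights_fingerprint(trained) == weights_fingerprint(reloaded))  # True
print(weights_fingerprint(trained) == weights_fingerprint(fresh))     # False
```

Printing the fingerprint right after model.save_weights(...) in the training script and again right after model.load_weights(...) in the second script would show whether the Keras layer weights were restored. If the digests match, the cause is probably somewhere other than the weight file, for example in how the inputs are prepared in the second session.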

0 Answers
