I am fine-tuning the im2txt model with my own small data set (2000 images) using a pretrained checkpoint (5M steps on COCO data).
The loss is currently around 0.2 after ~8800 steps, and training is still running.
I have noticed that the captions for my data, generated with the inference function, are getting worse: the same word is repeated over and over again.
INFO:tensorflow:Successfully loaded checkpoint: model.ckpt-5008890
Captions for image disc_111097.jpg:
0) stopped stopped stopped stopped stopped at a red light . (p=0.022266)
1) stopped stopped stopped stopped at a red light . (p=0.019730)
2) stopped stopped stopped stopped at a red light (p=0.010466)
The further I continue training, the worse it gets. The output is nearly the same for every image, including images from the COCO data set.
The only explanation I can think of is that it has something to do with the word counts. Since my data set is small, I merged the existing word counts file for the COCO data with a new word counts file for my data in the following way (see the sketch after this list):
- words that appear only in the COCO word counts are added to the new word counts
- words that appear only in my data (min_word_count=1) are added to the new word counts
- words that appear in both word counts files are added to the new word counts with their counts summed
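For illustration, here is a minimal sketch of that merge, assuming the im2txt word-counts format of one `word count` pair per line; the file names are placeholders:

```python
from collections import Counter

def load_word_counts(path):
    # Each line in an im2txt word-counts file is "<word> <count>".
    counts = Counter()
    with open(path) as f:
        for line in f:
            word, count = line.split()
            counts[word] += int(count)
    return counts

# Counter addition keeps words unique to either file and sums the
# counts of words that appear in both.
merged = load_word_counts("word_counts_coco.txt") + load_word_counts("word_counts_mydata.txt")

with open("word_counts_merged.txt", "w") as f:
    for word, count in merged.most_common():
        f.write("%s %d\n" % (word, count))
```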
What is the reason that beam search predicts just one word, repeated over and over again?
Additional info: The same problem occurs when I use my small data set for initial training, except that the extreme case of one repeated word already appears with the first checkpoint file, after 317 steps.
No matter how long I continue the initial training, the weights don't seem to get updated anymore and the output stays exactly the same; all checkpoint files from initial training look exactly the same.
System information
Have I written custom code (as opposed to using a stock example script provided in TensorFlow): no, I am just using the provided code with a pre-trained model
OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Linux Ubuntu 16.04.2
TensorFlow installed from (source or binary): installed via Anaconda (Python 2)
TensorFlow version (use command below): 1.13.1
Bazel version (if compiling from source): 0.22.0
CUDA/cuDNN version: 6.0.21
GPU model and memory: NVIDIA Corporation Device 119e, 256M
Exact command to reproduce: bazel-bin/im2txt/run_inference --checkpoint_path=${CHECKPOINT_PATH} --vocab_file=${VOCAB_FILE} --input_files=${IMAGE_FILE}
Here is how they get the most likely words:
most_likely_words = np.argsort(word_probabilities)[:-self.beam_size][::-1]
But this is wrong: np.argsort sorts in ascending order, so the slice [:-self.beam_size] discards the beam_size most likely words and returns the remaining ones instead. I believe the correct version should be:
most_likely_words = np.argsort(word_probabilities)[-self.beam_size:][::-1]
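A quick check with NumPy and a toy probability vector (the numbers below are made up for illustration) shows the difference between the two slices:

```python
import numpy as np

beam_size = 3
# A made-up probability vector over a 5-word vocabulary.
word_probabilities = np.array([0.05, 0.40, 0.10, 0.30, 0.15])

# np.argsort returns indices in ascending order of probability, so
# [:-beam_size] drops the top-3 indices; with this tiny vocabulary
# only the two least likely words remain.
buggy = np.argsort(word_probabilities)[:-beam_size][::-1]
print(buggy)   # [2 0] -> probabilities 0.10, 0.05

# [-beam_size:] keeps the top-3 indices, and [::-1] orders them
# from most to least likely.
fixed = np.argsort(word_probabilities)[-beam_size:][::-1]
print(fixed)   # [1 3 4] -> probabilities 0.40, 0.30, 0.15
```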