Question regarding TAPAS #1

Open
NielsRogge opened this issue Jul 14, 2022 · 1 comment
NielsRogge commented Jul 14, 2022

Hi,

Thanks for this interesting work. I contributed TAPAS some time ago to HuggingFace Transformers and was curious to see how well it performed on this dataset. Fun to see that you used it in your experiments :D

However, when checking out the experiments notebook, I saw that you instantiate the model as follows:

elif model_name == "tapas":
    tokenizer = TapasTokenizer.from_pretrained(model_path)
    config = TapasConfig.from_pretrained(model_path)
    config.num_labels = 3
    model = TapasForSequenceClassification(config)

This creates a randomly initialized TAPAS model: the base of the model is not loaded with the pre-trained weights. The reason is that the model is instantiated from a configuration rather than through the from_pretrained method.

Hence, fine-tuning this particular model will yield essentially random predictions (as illustrated by your paper).
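A minimal sketch of what loading with from_pretrained could look like, assuming model_path points to a TAPAS checkpoint as in the notebook. The helper name load_tapas_classifier is hypothetical and not part of the notebook or the library; the import is placed inside the function only so the snippet can be defined without transformers installed.

```python
def load_tapas_classifier(model_path: str, num_labels: int = 3):
    """Load TAPAS with pre-trained weights instead of a random initialization.

    Hypothetical helper for illustration; `model_path` is assumed to be a
    TAPAS checkpoint name or local directory, as in the notebook.
    """
    # Lazy import: lets this function be defined without transformers installed.
    from transformers import TapasForSequenceClassification, TapasTokenizer

    tokenizer = TapasTokenizer.from_pretrained(model_path)
    # from_pretrained loads the checkpoint weights. The num_labels keyword is
    # forwarded to the config, so the classification head gets 3 outputs.
    # Only weights missing from the checkpoint (typically the classification
    # head of a base checkpoint) are newly initialized.
    model = TapasForSequenceClassification.from_pretrained(
        model_path, num_labels=num_labels
    )
    return tokenizer, model
```

It would be called as tokenizer, model = load_tapas_classifier(model_path). If the checkpoint already contains a classification head with a different number of labels, from_pretrained will likely complain about mismatched shapes; passing ignore_mismatched_sizes=True is one way around that, at the cost of re-initializing the head.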

TAPAS seems to be the only one where you don't instantiate the model using from_pretrained, so curious to hear your reply!

And btw, we recently merged a new table-based model by Microsoft into the library, called TAPEX. It claims to outperform TAPAS on several benchmarks, including TabFact, so it would be interesting to see how well it performs on PubHealthTab.

Kind regards,

Niels
ML engineer @ HuggingFace

NielsRogge (Author) commented

Btw, it would be great to add this dataset to the HuggingFace hub as well.

TabFact for instance is accessible there: https://huggingface.co/datasets/tab_fact
