Dependency Issue #44

Open
amine759 opened this issue Apr 24, 2024 · 2 comments
@amine759

  • ferret version: 0.4.1
  • Python version: 3.10.12
  • Running on: Google Colab

Description

I'm trying to run:

bench.show_table(explanations)

and I get the following error:

AttributeError                            Traceback (most recent call last)
<ipython-input-9-39a3816ba913> in <cell line: 1>()
----> 1 bench.show_table(explanation2)

/usr/local/lib/python3.10/dist-packages/ferret/benchmark.py in show_table(self, explanations, apply_style, remove_first_last)
    396             table.columns = pd.io.parsers.base_parser.ParserBase(
    397                 {"names": table.columns, "usecols": None}
--> 398             )._maybe_dedup_names(table.columns)
    399 
    400         return (

AttributeError: 'ParserBase' object has no attribute '_maybe_dedup_names'

This only happens when the text I want to explain contains duplicate tokens. Apparently, in the latest version of pandas (2.2.2), _maybe_dedup_names is no longer available (deprecated), probably in favor of _maybe_make_multi_index_columns.
I tried downgrading pandas, but that leads to new errors. Can you tell me which pandas version is used here?

Thanks.

@g8a9 g8a9 self-assigned this Apr 26, 2024
@g8a9
Owner

g8a9 commented Apr 26, 2024

Hey, thank you for reaching out. Yes, it does seem related to the deprecation of that pandas method. I believe the best approach here is to rename duplicated columns (tokens) ourselves (i.e., without relying on pandas for that), but I would include the change in a new library release.

Before doing that, can you share the google colab or snippet of code that crashes on your side?
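
As a rough illustration of the "rename duplicated columns ourselves" idea above, here is a minimal sketch that makes column names unique by appending numeric suffixes, without touching any private pandas API. The helper name `dedup_names` and the `.1`, `.2` suffix scheme are assumptions made for this example; this is not necessarily how ferret implements (or will implement) the fix.

```python
def dedup_names(names):
    """Return a copy of `names` in which repeated entries get ".1", ".2", ... suffixes.

    Hypothetical helper for illustration only; not part of ferret or pandas.
    """
    counts = {}   # how many times each (possibly suffixed) name has been used
    result = []
    for name in names:
        new_name = name
        seen = counts.get(new_name, 0)
        # Keep adding suffixes until the candidate name has not been used yet.
        while seen > 0:
            counts[new_name] = seen + 1
            new_name = f"{new_name}.{seen}"
            seen = counts.get(new_name, 0)
        counts[new_name] = seen + 1
        result.append(new_name)
    return result


# Illustrative token list with a repeated token (real tokenizer output may differ).
tokens = ["You", "look", "stunning", "stunning", "!"]
unique_tokens = dedup_names(tokens)
print(unique_tokens)
```

In a workaround, the result could be assigned back to the table with something like `table.columns = dedup_names(list(table.columns))` before the table is styled, assuming the column labels are plain strings.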

@amine759
Author

Hi @g8a9, thanks for your reply.
This code is taken straight from the docs, except that the text example contains duplicate tokens:

from transformers import AutoModelForSequenceClassification, AutoTokenizer
from ferret import Benchmark

name = "cardiffnlp/twitter-xlm-roberta-base-sentiment"
model = AutoModelForSequenceClassification.from_pretrained(name)
tokenizer = AutoTokenizer.from_pretrained(name)

bench = Benchmark(model, tokenizer)

explanations = bench.explain("You look stunning stunning !", target=1)
evaluations = bench.evaluate_explanations(explanations, target=1)

bench.show_table(explanations)
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-3-1ca62c9897f7> in <cell line: 1>()
----> 1 bench.show_table(explanations)

/usr/local/lib/python3.10/dist-packages/ferret/benchmark.py in show_table(self, explanations, apply_style, remove_first_last)
    396             table.columns = pd.io.parsers.base_parser.ParserBase(
    397                 {"names": table.columns, "usecols": None}
--> 398             )._maybe_dedup_names(table.columns)
    399 
    400         return (

AttributeError: 'ParserBase' object has no attribute '_maybe_dedup_names'
