[BUG] Daniel Ricciardo incorrectly present in 2024 Japanese Grand Prix Practice 1 #680

Open
borolepratik opened this issue Jan 24, 2025 · 6 comments · May be fixed by #699

Comments

@borolepratik

borolepratik commented Jan 24, 2025

Describe the issue:

Daniel Ricciardo did not take part in the 2024 Japanese Grand Prix Practice 1 session, but he is listed when fetching the session's results.
https://www.formula1.com/en/results/2024/races/1232/japan/practice/1

Reproduce the code example:

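import fastf1

# Parameters
year = 2024
grand_prix = 4  # round number of the Japanese Grand Prix
session_number = 1  # Practice 1

# Load session data
session = fastf1.get_session(year, grand_prix, session_number)
session.load()

# Daniel Ricciardo appears here even though he did not take part
print(session.results)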

Error message:

@Casper-Guo
Contributor

Casper-Guo commented Mar 2, 2025

Looks to me like this bug occurs because Ergast is returning incorrect data. Daniel Ricciardo is not in the F1 API drivers list but is in the Ergast drivers list. Correspondingly, Iwasa is missing from the data Ergast returns.

And then we have this logic for handling when the two sources disagree:

else:
    missing_drivers = list(set(driver_info_ergast['DriverNumber'])
                           .difference(driver_info_f1['DriverNumber']))
    # drivers are missing if DNSed (did not start)
    # in that case, pull more information from Ergast for these drivers

But Ergast does not actually provide the correct drivers for FP1, since it only has data for the actual Grand Prix.

    def _drivers_results_from_ergast(
            self, *, load_drivers=False, load_results=False
    ) -> Optional[pd.DataFrame]:
        if self.name in self._RACE_LIKE_SESSIONS + self._QUALI_LIKE_SESSIONS:
            session_name = self.name
        else:
            # this is a practice session, use drivers from race session but
            # don't load results
            session_name = 'Race'
            load_results = False

So this is actually a much wider issue with reserve drivers taking part in FP1. If you try doing this with most Abu Dhabi GPs' FP1s, your results will contain more than 20 drivers.

We probably need to be a lot more careful about pulling in drivers from Ergast.
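
To make the failure mode concrete, here is a minimal sketch of the set difference quoted above, run on toy data. The driver numbers are illustrative stand-ins (a regular driver in car 3 who sat out FP1, a reserve in car 40 who drove instead), not values fetched from either API.

```python
# Toy driver-number lists; illustrative values, not fetched data.
f1_api_fp1 = ["1", "11", "22", "40"]  # F1 API: drivers who actually ran in FP1
ergast_race = ["1", "11", "22", "3"]  # Ergast: drivers from the race endpoint

# Same set difference as in the snippet quoted above: anything Ergast
# lists that the F1 API does not is treated as a driver who did not start.
missing_drivers = sorted(set(ergast_race).difference(f1_api_fp1))
print(missing_drivers)

# The reverse difference shows the reserve driver, who is only known
# to the F1 API and therefore never counted as "missing".
only_in_f1_api = sorted(set(f1_api_fp1).difference(ergast_race))
print(only_in_f1_api)
```

For a practice session, the regular driver ends up in `missing_drivers` and is added to the results alongside the reserve, which is how the driver count can exceed 20.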

@theOehrly
Owner

theOehrly commented Mar 2, 2025

This might be fairly straightforward to fix. Right now, we pull driver information for practice sessions from the race endpoint on Ergast, because there is no practice data on Ergast.
Also, to include drivers that did not start (DNS), drivers are added based on the Ergast data to have the full classification result.
We probably only need to change the data merging for practice sessions so that the additional values are filled in for drivers already present, but no additional drivers are added. So maybe just skip the "else missing drivers" part above.
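
A rough sketch of that idea in pandas, assuming both sources are available as DataFrames keyed by `DriverNumber`. The helper name `merge_practice_driver_info`, the column names other than `DriverNumber`, and the toy rows are hypothetical; this is not the FastF1 implementation or the change in the linked pull request.

```python
import pandas as pd


def merge_practice_driver_info(driver_info_f1, driver_info_ergast):
    """Hypothetical helper: enrich F1 API drivers with Ergast columns.

    A left join keeps exactly the drivers from the F1 API frame, so
    drivers that only appear in the Ergast race data are never added.
    """
    extra_cols = [
        col for col in driver_info_ergast.columns
        if col not in driver_info_f1.columns
    ]
    return driver_info_f1.merge(
        driver_info_ergast[["DriverNumber", *extra_cols]],
        on="DriverNumber",
        how="left",
    )


# Toy data: car 40 ran in FP1, car 3 only appears in the race data.
driver_info_f1 = pd.DataFrame(
    {"DriverNumber": ["22", "40"], "Abbreviation": ["TSU", "IWA"]}
)
driver_info_ergast = pd.DataFrame(
    {"DriverNumber": ["22", "3"], "Nationality": ["Japanese", "Australian"]}
)

merged = merge_practice_driver_info(driver_info_f1, driver_info_ergast)
print(merged)
```

The reserve driver keeps a row with a missing `Nationality`, and the race-only driver never enters the practice results.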

@Casper-Guo
Contributor

I am prepping a PR for this and would prefer to include tests. I tried to find recent examples of drivers who did not start (DNS), but in those cases the drivers are in the F1 API data as well. Do you remember any example that exercises the logic for pulling additional information from Ergast?

Casper-Guo added a commit to Casper-Guo/Fast-F1 that referenced this issue Mar 3, 2025
Casper-Guo linked a pull request Mar 3, 2025 that will close this issue
@theOehrly
Owner

I'll check later for an example.

@borolepratik
Author

> Do you remember any example that exercises the logic for pulling additional information from Ergast?

Oliver Bearman in 2024 Saudi Arabian GP Practice 1 and 2.

import fastf1

# Parameters
year = 2024
grand_prix = 2  # round number of the Saudi Arabian Grand Prix
session_number = 1  # or 2

# Load session data
session = fastf1.get_session(year, grand_prix, session_number)
session.load()

Best to look for situations where another driver took over driving duties mid-weekend.

@Casper-Guo
Contributor

This isn't what I was originally looking for but it helped me catch another regression haha. Thanks for flagging it!

Casper-Guo added a commit to Casper-Guo/Fast-F1 that referenced this issue Mar 4, 2025
Casper-Guo added a commit to Casper-Guo/Fast-F1 that referenced this issue Mar 4, 2025
Casper-Guo added a commit to Casper-Guo/Fast-F1 that referenced this issue Mar 4, 2025
Casper-Guo added a commit to Casper-Guo/Fast-F1 that referenced this issue Mar 6, 2025