🎉 Admin for related charts #3833

Draft: wants to merge 7 commits into base: master

Conversation

@Marigold (Collaborator) commented Jan 13, 2025

Progress (2025-01-17)

  • Enabled user reviews and added a table at the top to display recommendations from various systems.
  • Added a related_charts table to MySQL for storing reviews and related charts to be used later during baking.
  • Incorporated co-views into scoring and introduced them as a separate recommendation system.

Co-views

Co-views turned out to be much better than expected. I reviewed various charts, and the top 5 charts with the most co-views were almost always strong recommendations. This raises the question of whether we should address all issues in the project (such as agreement reviews and performance evaluations) or whether a simpler solution based on co-views might suffice. I believe it could, with a few minor modifications.

Some additional observations:

  • Surprisingly, we often have sufficient data, even for charts with low traffic.
  • Ranking based on co-views naturally favors pages with high pageviews, but the balance is reasonable. High-pageview charts don’t dominate the top 5, and the results are usually well-distributed.
  • For charts with the highest pageviews, co-views tend to recommend other high-pageview charts, which might not always be relevant. For example, Life Expectancy recommends charts about GDP. Is this behavior desirable? It could be, as users visiting such charts might not have specific goals, and recommending other high-viewed charts might make sense. If this behavior is undesirable, we could penalize highly viewed charts (that's possible from Advanced options).
    • There are various ways to "penalize" charts we don't want to show. First, we should decide whether we prefer showing highly viewed charts or charts with fewer views that are more relevant to the source chart. Either way, this would still be a simple formula using only pageviews and co-views.
  • Co-views in production might be self-reinforcing. Is this beneficial or problematic? We should analyze how many co-views result from clicks on related charts versus natural browsing and consider excluding the former if necessary.
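To make the "simple formula" idea concrete, here is a minimal sketch of one possible penalty. It is not what the PR implements: the function name `rank_related`, the `penalty` exponent, and the example slugs and counts are all hypothetical. The score divides co-views by pageviews raised to a tunable power, so `penalty=0` gives the raw co-view ranking and `penalty=1` gives co-views per pageview.

```python
from typing import Dict, List, Tuple


def rank_related(
    coviews: Dict[str, int],
    pageviews: Dict[str, int],
    penalty: float = 0.5,
    top_n: int = 5,
) -> List[Tuple[str, float]]:
    """Rank candidate charts for one source chart (hypothetical scoring).

    coviews:   candidate slug -> co-views with the source chart
    pageviews: candidate slug -> total pageviews of the candidate
    penalty:   0 keeps the raw co-view ranking; larger values push
               highly viewed charts further down
    """
    scored = []
    for slug, cv in coviews.items():
        # Guard against missing or zero pageviews to avoid division by zero.
        pv = max(pageviews.get(slug, 1), 1)
        scored.append((slug, cv / pv ** penalty))
    # Highest score first; ties are broken alphabetically for stable output.
    scored.sort(key=lambda item: (-item[1], item[0]))
    return scored[:top_n]


# Made-up numbers, for illustration only.
example_coviews = {"gdp-per-capita": 100, "child-mortality": 60}
example_pageviews = {"gdp-per-capita": 10000, "child-mortality": 400}

raw = rank_related(example_coviews, example_pageviews, penalty=0.0)
penalized = rank_related(example_coviews, example_pageviews, penalty=0.5)
```

With these made-up numbers the raw ranking puts the high-traffic chart first, while the penalized ranking puts the lower-traffic chart first. Which behavior we want is exactly the open question above.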

Problematic Charts

@owidbot (Contributor) commented Jan 13, 2025

Quick links (staging server):

Site Dev Site Preview Admin Wizard Docs

Login: ssh owid@staging-site-similar-charts-admin

chart-diff: ✅ No charts for review.
data-diff: ✅ No differences found
Legend: +New  ~Modified  -Removed  =Identical  Details
Hint: Run this locally with etl diff REMOTE data/ --include yourdataset --verbose --snippet

Automatically updated datasets matching weekly_wildfires|excess_mortality|covid|fluid|flunet|country_profile|garden/ihme_gbd/2019/gbd_risk are not included

Edited: 2025-02-03 12:11:21 UTC
Execution time: 18.45 seconds

@Marigold Marigold force-pushed the similar-charts-admin branch 3 times, most recently from 9907952 to bd1e926 Compare January 17, 2025 10:43
@Marigold Marigold force-pushed the similar-charts-admin branch from c8bccec to 4606e95 Compare January 17, 2025 15:25
@Marigold Marigold changed the title 🎉 Enable good/bad labeling of similar charts by users 🎉 Admin for related charts Jan 20, 2025
@larsyencken (Contributor) commented

Thanks for this update Mojmir!

What you're seeing is that major markers of development often are co-viewed with other major markers of development. That's bad if you want "more data like this", but could be good if you want "things I might be interested in".

It sounds like we should do a quality check with Joe around co-views, then consider moving to an initial implementation based on that. Since we like it so much, perhaps we should push co-view data from BigQuery into analytics/DuckDB?

Currently, co-view data strips the query string from the URL, so a story based on co-views that includes explorer views and Mdim views might need some collaboration with @bnjmacdonald to get a version of the co-view data that retains some query parameters.
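As a rough illustration of what "includes some query parameters" could mean, here is a sketch that keeps an allow-list of parameters and drops everything else. This is not the existing co-view pipeline. The allow-list `KEEP_PARAMS`, the function name `normalize_url`, and the example URLs are assumptions; which parameters actually identify an explorer or Mdim view would need to be decided.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical allow-list; the real set of identifying parameters is undecided.
KEEP_PARAMS = ("country", "tab")


def normalize_url(url: str, keep=KEEP_PARAMS) -> str:
    """Drop the fragment and every query parameter not in `keep`.

    Kept parameters are sorted so that the same view always maps to
    the same normalized URL regardless of parameter order.
    """
    parts = urlsplit(url)
    kept = sorted(
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key in keep
    )
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path, urlencode(kept), "")
    )


explorer_view = normalize_url(
    "https://example.org/explorers/demo?tab=map&utm_source=x&country=USA#top"
)
plain_view = normalize_url("https://example.org/grapher/life-expectancy?utm_source=x")
```

Co-views would then be counted per normalized URL instead of per path.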

@Marigold Marigold force-pushed the similar-charts-admin branch 3 times, most recently from 500237c to 78c84cb Compare January 22, 2025 13:42
@Marigold Marigold force-pushed the similar-charts-admin branch from d58dc8e to e50c284 Compare February 3, 2025 11:50
3 participants