Web: don't allow sites without feed or webmentions #1458

Closed
snarfed opened this issue Nov 8, 2024 · 10 comments
@snarfed
Owner

snarfed commented Nov 8, 2024

We get a number of sites now that have no RSS or Atom feed or mf2, and are unlikely to post via webmention. It's awkward to bridge those if we'll never bridge any posts for them.

We can easily tell if there's an RSS or Atom feed right now, and we know if a site has already sent us a webmention or not, but we don't distinguish where profile info comes from, ie mf2 vs mf1 vs metaformats.

Maybe the most likely thing to do here would be to try to detect mf2 at fetch time and store it in a new Web.has_mf2_hcard property, and either backfill or only block a site if we fetched their homepage after we launch that.
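
To make the fetch-time detection idea concrete, here is a minimal sketch that checks whether a fetched homepage contains an element with the mf2 `h-card` class. It uses only the Python standard library rather than a full microformats parser, and the names `HCardDetector` and `has_mf2_hcard` (as a function) are hypothetical, not Bridgy Fed's actual code. Because it only matches the mf2 class name, mf1 `vcard` markup and metaformats are deliberately not counted.

```python
from html.parser import HTMLParser


class HCardDetector(HTMLParser):
    """Sets found=True if any element's class attribute includes 'h-card'."""

    def __init__(self):
        super().__init__()
        self.found = False

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            # Class names are whitespace-separated; match the whole token only.
            if name == 'class' and value and 'h-card' in value.split():
                self.found = True


def has_mf2_hcard(html):
    """Hypothetical helper: True if the HTML has an mf2 h-card root class."""
    detector = HCardDetector()
    detector.feed(html)
    return detector.found
```

The result could then be stored alongside the fetched profile, so that later checks don't need to refetch the homepage.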

@snarfed
Owner Author

snarfed commented Nov 8, 2024

This is needed to do #1403 for web users.

@snarfed
Owner Author

snarfed commented Nov 8, 2024

I have a start on this in a stash, don't bridge Web users without feed or mf2 h-card, #1458.

@ZipMartini

Does a site's RSS feed always have to be at the site's index? Obviously the fetch can't know whether an RSS feed exists on a different page, e.g. blog.xml...

@Tamschi
Collaborator

Tamschi commented Nov 18, 2024

I don't know how Bridgy Fed does this, but generally speaking <link rel="alternate" type="application/rss+xml" title="…" href="…"> tags are used for RSS feed auto-discovery. I'd assume that or a similar tag would have to be present on the homepage, and that the RSS feed itself can be anywhere. The system wouldn't work with many sites otherwise.
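
As an illustration of the auto-discovery convention described above, here is a small sketch that scans a homepage for `<link rel="alternate">` tags with an RSS or Atom type and resolves each `href` against the page URL. It is not how Bridgy Fed does it; `FeedLinkFinder` and `discover_feeds` are hypothetical names, and only the standard library is used.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

FEED_TYPES = {'application/rss+xml', 'application/atom+xml'}


class FeedLinkFinder(HTMLParser):
    """Collects absolute URLs of feeds advertised via <link rel="alternate">."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.feeds = []

    def handle_starttag(self, tag, attrs):
        if tag != 'link':
            return
        attr = {name: (value or '') for name, value in attrs}
        rels = attr.get('rel', '').lower().split()
        feed_type = attr.get('type', '').strip().lower()
        if 'alternate' in rels and feed_type in FEED_TYPES and attr.get('href'):
            # The feed can live anywhere; resolve relative hrefs against the page.
            self.feeds.append(urljoin(self.base_url, attr['href']))


def discover_feeds(html, base_url):
    """Hypothetical helper: returns feed URLs advertised on the given page."""
    finder = FeedLinkFinder(base_url)
    finder.feed(html)
    return finder.feeds
```

So a homepage that links to `/blog.xml` this way would be discoverable, while a feed that exists but isn't advertised on the homepage would not.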

@snarfed
Owner Author

snarfed commented Nov 18, 2024

Right! More broadly, Bridgy Fed only supports whole web sites right now, ie not specific pages or paths.

@D-Melhede

Why are direct RSS feed URLs not supported?
dr.dk doesn't produce any content on Bluesky, but it actually has multiple different RSS feeds available, e.g.
https://www.dr.dk/nyheder/service/feeds/senestenyt

@snarfed
Owner Author

snarfed commented Nov 20, 2024

(^ answered in #1527)

@snarfed snarfed added the now label Nov 24, 2024
@snarfed snarfed changed the title from "Web: don't allow sites without feed or mf2 h-card" to "Web: don't allow sites without feed or webmentions" Nov 24, 2024
@snarfed
Owner Author

snarfed commented Nov 24, 2024

Once this is done, I'd also like to backfill it, ie go through all existing bridged web sites and disable the ones that don't have a feed or webmentions.
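
The backfill rule described here boils down to a simple predicate. A rough sketch follows, using a made-up `Site` record with hypothetical `has_feed` and `has_sent_webmention` fields instead of Bridgy Fed's real `Web` model:

```python
from dataclasses import dataclass


@dataclass
class Site:
    # Hypothetical stand-in for a bridged web site; not the real Web model.
    domain: str
    has_feed: bool
    has_sent_webmention: bool


def sites_to_disable(sites):
    """Returns domains that have neither a feed nor any webmention sent."""
    return [s.domain for s in sites
            if not (s.has_feed or s.has_sent_webmention)]
```

A site with either signal stays bridged; only sites with neither would be disabled.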

snarfed added a commit that referenced this issue Nov 27, 2024
ie for web sites that don't have RSS or Atom feed or webmentions

for #1458
@snarfed
Owner Author

snarfed commented Nov 27, 2024

Currently backfilling this, ie deactivating (and removing DNS! #1268) for matching web sites, with:

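# Assumes Web, ATProto, arroba.server, and logging are already imported
# in the shell session this was run from.
ids = []
for w in Web.query(Web.copies.protocol == 'atproto'):
  # Only Web users that have a status set and didn't manually opt out
  if w.status and not w.manual_opt_out:
    ids.append(w.key.id())
    print(len(ids), w.key.id())
    try:
      did = w.get_copy(ATProto)
      repo = arroba.server.storage.load_repo(did)
      # Skip repos that already have a status, ie are already deactivated
      if not repo.status:
        arroba.server.storage.deactivate_repo(repo)
      ATProto.remove_dns(w.handle_as('atproto'))
    except Exception:
      # Log and keep going so that one failure doesn't stop the backfill
      logging.exception('failed to deactivate %s', w.key.id())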

@snarfed
Owner Author

snarfed commented Nov 27, 2024

Done, removed 3362 DNS entries. Now doing ActivityPub, will follow up in #1268


4 participants