Typo in organization name (“Istitute” of Electrical and Electronics Engineers) #5

Closed
strogonoff opened this issue Feb 16, 2022 · 11 comments
Labels
bug Something isn't working

Comments

@strogonoff

strogonoff commented Feb 16, 2022

https://github.com/ietf-ribose/relaton-data-misc/blob/1f9465d443de097c8be6dac02f1db2b74eddace0/data/reference.IEEE.802.1Qcc-2018.yaml#L29-L40

As with #2, “publisher” is specified as IETF although the document is an IEEE document. If Relaton data is meant to record who published the document itself, rather than who published a resource containing a bibliographic item about that document, then listing IETF as the publisher of an IEEE document is an error.

@ronaldtse

Thanks @strogonoff, there are several problems in this file (and in this dataset in general):

  1. The publisher is not IETF.
  2. There is a typo in the organization name anyway.
  3. The IEEE documents are available in the relaton-data-ieee data set.
  4. This misc dataset should now be considered static; it no longer mirrors the legacy bibxml copy. We should edit this dataset manually to correct errors.

@ronaldtse

My earlier comment wasn't actionable...

I can resolve 4 by removing the GHA workflow from this repository.

Can 3 be resolved using the new mapping system? If that is done then 1/2 are no longer issues.

@strogonoff
Author

strogonoff commented Feb 17, 2022

@ronaldtse Yes, https://dev.bibxml.org/management/xml2rfc-compat/ allows manually mapping xml2rfc paths and verifying XML responses. (It might not work reliably for bibxml3 on our demo infrastructure, where I sometimes get error 500, but it should work well for bibxml2.)

If we map some paths, we should export the mapped paths as JSON in case the DB is wiped at some point. See also docs. We could pre-map some paths before delivering to IETF and provide an importable JSON file as part of the deliverable.

Who’s the best person to do the mapping? As was established in the issue about IEEE.802.11 2012, I may lack the domain knowledge to confirm that the resulting XML represents the correct documents.

@strogonoff
Author

@ronaldtse Alternatively, instead of mapping manually, we can just remove the documents from bibxml-misc and adjust the automatic logic for resolving bibxml-misc XML paths so that it parses filenames more intelligently and has a higher chance of automatically finding IEEE references we may have indexed from other sources.

The results should be confirmed by hand, though: some of those IEEE XML filenames are irregular and may not resolve automatically, in which case a manual map is still required.
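
As a rough illustration of the filename-parsing idea, the sketch below guesses an IEEE document identifier from a bibxml-misc filename such as `reference.IEEE.802.1Qcc-2018.yaml` (the file linked at the top of this issue). The function name, the regular expression, and the output identifier format are all assumptions made for illustration, not the service's actual resolution logic. Filenames that do not fit the pattern return `None` and would still need a manual map.

```python
import re
from typing import Optional

# Hypothetical pattern: "reference.IEEE.<number><separator><4-digit year>",
# optionally followed by a .xml or .yaml suffix.
_IEEE_NAME = re.compile(
    r"^reference\.IEEE\."
    r"(?P<number>[0-9A-Za-z.]+?)"   # e.g. "802.1Qcc" (lazy, so the year is not swallowed)
    r"[-_.]"                        # separator between number and year
    r"(?P<year>\d{4})"
    r"(?:\.(?:xml|yaml))?$"
)


def guess_ieee_docid(filename: str) -> Optional[str]:
    """Return a guessed identifier like 'IEEE 802.1Qcc-2018', or None.

    The identifier format is an assumption; any guess should be checked
    by hand against the ieee data set before it is relied on.
    """
    match = _IEEE_NAME.match(filename)
    if match is None:
        return None  # irregular filename: leave it for a manual map
    return f"IEEE {match.group('number')}-{match.group('year')}"
```

For example, `guess_ieee_docid("reference.IEEE.802.1Qcc-2018.yaml")` returns `"IEEE 802.1Qcc-2018"`, while a name without a recognizable year yields `None`.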

This was referenced Feb 17, 2022
@ronaldtse

@strogonoff Right, but there are only 94 IEEE entries, so a manual map is easier than tweaking a translation pattern 😉

So for IEEE documents, we will:

  1. Remove the IEEE documents from the misc data set
  2. Create manual mappings for those documents to the ieee data set

OK?
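
The two steps above could be prepared with a small script along these lines. This is only a sketch: the function names, the example target identifiers, and the JSON layout are placeholders made up for illustration, and the real export format used by the mapping tool may differ. Only the `reference.IEEE.` prefix comes from the file linked in this issue.

```python
import json
from typing import Dict, Iterable, List, Tuple

# Prefix taken from the file linked in this issue; everything else is illustrative.
IEEE_PREFIX = "reference.IEEE."


def split_ieee(filenames: Iterable[str]) -> Tuple[List[str], List[str]]:
    """Step 1: separate IEEE entries (to be removed from misc) from the rest."""
    ieee: List[str] = []
    keep: List[str] = []
    for name in filenames:
        (ieee if name.startswith(IEEE_PREFIX) else keep).append(name)
    return ieee, keep


def build_manual_map(
    ieee_files: Iterable[str], confirmed: Dict[str, str]
) -> Tuple[Dict[str, str], List[str]]:
    """Step 2: pair each removed file with a hand-confirmed target identifier.

    Returns the mapping plus the files that nobody has confirmed yet.
    """
    mapping: Dict[str, str] = {}
    unmapped: List[str] = []
    for name in ieee_files:
        if name in confirmed:
            mapping[name] = confirmed[name]
        else:
            unmapped.append(name)
    return mapping, unmapped


def export_map(mapping: Dict[str, str]) -> str:
    """Serialize with sorted keys so the exported file diffs cleanly."""
    return json.dumps(mapping, indent=2, sort_keys=True)
```

Keeping the unconfirmed files in a separate list makes it easy to see how many of the 94 entries still need someone to check them by hand.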

@ronaldtse

@strogonoff can I also remove all the files that are prefixed with _ now in the misc data set?

@strogonoff
Author

> @strogonoff can I also remove all the files that are prefixed with _ now in the misc data set?

I don’t see any such paths here: https://github.com/ietf-ribose/relaton-data-misc/tree/main/data

@strogonoff
Author

strogonoff commented Feb 17, 2022

If you mean bibxml-data-archive: removing the prefix variations would break the fallback behavior (used when a path doesn’t resolve) and the xml2rfc path resolution checker tool (if IETF staff want to confirm that _-prefixed paths resolve, those paths would not be shown). So no, do not remove them. The purpose of that archive is to hold all paths for fallback and for path resolution validation. We are actually currently missing some paths in the bibxml3 directory there, and we should definitely not remove any.

@ronaldtse

> I don’t see any such paths here: https://github.com/ietf-ribose/relaton-data-misc/tree/main/data

Ah okay.

strogonoff changed the title from “Typo in organization name” to “Typo in organization name (“Istitute” of Electrical and Electronics Engineers)” on May 23, 2022
strogonoff added the bug label on May 23, 2022

@strogonoff
Author

strogonoff commented May 23, 2022

I checked the XML file that seems to be the source for this, and there is no typo there (the abbreviation is not expanded at all).

Ping @ronaldtse @andrew2net

@andrew2net
Contributor

@strogonoff I'll fix it in the next relaton-bib release


3 participants