Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

rustdoc-json: Improve Item::attrs to not use debug representation. #137645

Open
aDotInTheVoid opened this issue Feb 26, 2025 · 18 comments
Open

rustdoc-json: Improve Item::attrs to not use debug representation. #137645

aDotInTheVoid opened this issue Feb 26, 2025 · 18 comments
Labels
A-attributes Area: Attributes (`#[…]`, `#![…]`) A-rustdoc-json Area: Rustdoc JSON backend T-rustdoc Relevant to the rustdoc team, which will review and decide on the PR/issue.

Comments

@aDotInTheVoid
Copy link
Member

Currently, Item::attrs uses a debug-printing

Stringified versions of parsed attributes on this item.
Essentially debug printed (e.g. #[inline] becomes something similar to #[attr="Inline(Hint)"]).
Equivalent to the hir pretty-printing of attributes.

https://github.com/rust-lang/rust/blame/85abb276361c424d64743c0965242dd0e7b866d1/src/rustdoc-json-types/lib.rs#L123-L126

This isn't particularly friendly to users, who need to deal with an internal (and undocumented 😱) attr representation.

This was changed in #135726, as part of the a larger attribute rework (#131229)

MCVE:

#[repr(C)]
pub struct Foo(i32);

which produces (abridged):

    "1": {
      "attrs": ["#[attr=\"Repr([ReprC])\")]\n"],
      "inner": {
        "struct": {
          "generics": {"params": [], "where_predicates": []},
          "impls": [2, 4, 6, 8, 10, 12, 15, 19, 23, 26, 31, 36, 39],
          "kind": {"tuple": [null]}
        }
      },
      "name": "Foo",
    }

CC @jdonszelmann @obi1kenobi

@rustbot rustbot added the needs-triage This issue may need triage. Remove it if it has been sufficiently triaged. label Feb 26, 2025
@aDotInTheVoid aDotInTheVoid added T-rustdoc Relevant to the rustdoc team, which will review and decide on the PR/issue. A-attributes Area: Attributes (`#[…]`, `#![…]`) A-rustdoc-json Area: Rustdoc JSON backend and removed needs-triage This issue may need triage. Remove it if it has been sufficiently triaged. labels Feb 26, 2025
@aDotInTheVoid
Copy link
Member Author

So there's basically two options here:

  1. Go back to how things used to be, and return attrs as a Vec<String> of how they're represented in the surface syntax.

    This would change

    #[repr(C)]
    #[repc(align(8)]
     pub struct Foo {}

    to

    #[repr(C, align(8))]
     pub struct Foo {}

    but that's fine

  2. Expose attributes in a more structured way in rustdoc-json-types. Currently clean::Attributes wraps a list of hir::Attribute (modulo some stuff for doc-comments). This exposes both parsed and unparsed attributes. We could do something similar.

Doing option 2 would require a lot more design work, but is probably worthwhile (even if we do option 1 initially). Some questions:

  • How do we want to deal with #[cfg], which for rustc are gone by HIR, but rustdoc has a kinda seperate treatment?
  • Do we care to persist stuff like #[inline] at all?
  • How do we treat stuff that exists in the interface between std and rustc that isn't exposed to normal crates? Should we include #[unstable] attrs? How about #[rustc_promotable]?
  • How do we deal with tool attributes that rustc knows nothing of?

@jdonszelmann
Copy link
Contributor

One advantage of option 1 is that it also improves hir pretty printing. I personally think having them available in a nice and structured way is pretty neat, and hir is not required to look nice. Furthermore, it's also a lot of code (I initially tried). But it in an advantage at least

@jdonszelmann
Copy link
Contributor

Tools attrs are still accessible as tokens. If you try to hir pretty print they even still look like they used to. So that would fit in with option 1 I guess

@jdonszelmann
Copy link
Contributor

@obi1kenobi are there any other attrs except repr you would care about with semver checks? Although I dont think thats the only reason to expose things like this (I'd be in favor of exposing things like inline) it'd be interesting to know if that's literally the only one right now.

@obi1kenobi
Copy link
Member

I'm in favor of Option 1 initially, and Option 2 in the long run as a nice-to-have. We already have code that does attribute parsing, so while I'd love to eliminate it eventually, that isn't really a priority. Option 1 sounds like it can be done with less work, and speed is good :)

@obi1kenobi are there any other attrs except repr you would care about with semver checks? Although I dont think thats the only reason to expose things like this (I'd be in favor of exposing things like inline) it'd be interesting to know if that's literally the only one right now.

No, unfortunately we need many more. At minimum (probably not an exhaustive list): #[non_exhaustive], #[automatically_derived], #[must_use], #[doc(hidden)], #[deprecated], #[no_mangle] and #[export_name] (plus their new unsafe variants like #[unsafe(no_mangle)]) , #[repr(packed)], #[repr(C)], #[repr(transparent)], enum integer reprs like #[repr(i8)].

  • How do we want to deal with #[cfg], which for rustc are gone by HIR, but rustdoc has a kinda seperate treatment?

If we can keep #[cfg] somehow, that would enable new lints like "this API is no longer available on this feature/OS/architecture etc." Currently, users have to build rustdoc JSON for each feature combo they want to check, on each platform they want to check, which is fine but could obviously be improved.

  • Do we care to persist stuff like #[inline] at all?

If we can keep it, it'd be great! There are some cases where in the future a warning may be desirable as a lint if e.g. #[inline] was removed from some public API where it might affect optimization. I don't recall the #[inline] implementation details off the top of my head enough to flesh this out further.

Not a priority today though.

  • How do we treat stuff that exists in the interface between std and rustc that isn't exposed to normal crates? Should we include #[unstable] attrs? How about #[rustc_promotable]?

I have no opinion on this, since none of cargo-semver-checks relies on it today. If we can present the attributes as-is (i.e. as pretty-printed Rust code, the status quo ante of other attributes) without a ton of work, I guess that'd be nice.

  • How do we deal with tool attributes that rustc knows nothing of?

It's possible cargo-semver-checks might get its own tool attribute in the near future, so preserving tool attributes in as-is form (i.e. as pretty-printed Rust code, the status quo ante of other attributes) would be nice if possible. Users have expressed a desire to opt out of specific lints for specific types or modules, and tool attributes in rustdoc JSON would make that much easier to implement.

@obi1kenobi
Copy link
Member

Thank you both for looking into this issue and helping address it! ❤

@jdonszelmann
Copy link
Contributor

for the record, although I played devil's advocate a little and posted some arguments for option, I think option 2 gives a lot more flexibility and although I agree you already have parsing for option 1, with option 2 you potentially won't need any parsing at all nor any other tool which I think is very valuable

@obi1kenobi
Copy link
Member

I agree option 2 would be nice! But my top priority is having whichever option we can ship faster :)

@obi1kenobi
Copy link
Member

I've been triaging the impact to cargo-semver-checks and taking the opportunity to improve our test coverage. As far as I can tell, the only thing that's broken for us right now is #[repr(..)] attributes. The other attributes we use seem to be unaffected by the recent change, AFAICT.

I noticed that rustdoc already has some machinery for pretty-printing #[repr(..)] attributes that appears to not be currently used by the JSON path (see Item::attributes()). We might be able to reuse that code path on the JSON side, fixing the #[repr(..)] part of the issue in a reasonably non-invasive manner. This won't address the representation of other attributes (so it won't fully resolve this issue), but would be sufficient to unblock cargo-semver-checks.

Vibe check on switching to using the existing #[repr(..)] pretty-printing machinery on the JSON side? 👀

@aDotInTheVoid
Copy link
Member Author

aDotInTheVoid commented Mar 4, 2025

rustdoc already has some machinery for pretty-printing #[repr(..)] attributes that appears to not be currently used by the JSON path (see Item::attributes())

We go through clean::Item::attributes, but i don't think we enable keap_as_is so don't hit that branch.

let attrs = item.attributes(self.tcx, self.cache(), true);

Vibe check on switching to using the existing #[repr(..)] pretty-printing machinery on the JSON side? 👀

Seems like a good approach for now, even if eventually we move to something more structured. It'd require careful explaining in the docs, but that's fine, they're not good here at the moment and I'm happy to accept incremental improvements that aren't perfect.

While doing this, it's probably a good idea to add tests for the non-#[repr] attributes we emit.

@obi1kenobi
Copy link
Member

rustdoc already has some machinery for pretty-printing #[repr(..)] attributes that appears to not be currently used by the JSON path (see Item::attributes())

We go through clean::Item::attributes, but i don't think we enable keap_as_is

Apologies if I'm very mistaken here: is the true at the call site you linked not the keep_as_is value in Item::attributes()?

While doing this, it's probably a good idea to add tests for the non-#[repr] attributes we emit.

Based on my read of rustc_hir_pretty, the other publicly-available ("not rustc-internal, could be used by any crates.io library") attributes that may have been impacted here are #[deprecated] and attributes in the diagnostic namespace, like #[diagnostic::on_unimplemented].

What might the tests for such attributes test? Just pin down the existing behavior ("Debug-like output"), or something else?

Also, is it necessary to add those tests as part of the same work unit that changes #[repr(..)], or can that be separate? 👀

@aDotInTheVoid
Copy link
Member Author

true at the call site you linked not the keep_as_is value in Item::attributes()?

D'oh, I got the negitives the wrong way round. rustdoc-json currently enables keep_as_is, so doesn't hit the branch listed.

What might the tests for such attributes test? Just pin down the existing behavior ("Debug-like output"), or something else.

Yeah. Also the handling of when we do & don't show #[repr(tranparent)]. I know the HTML has some heuristics as to weather this is "public" and should be shown. Honestly can't remember what we do in JSON.

Probably also worth having tests for attibutes we don't pass through. (e.g. #[inline], #[no_mangle], #[link_section], etc)

Also, is it necessary to add those tests as part of the same work unit that changes #[repr(..)], or can that be separate? 👀

It'd be ok for them to be separate. The """best""" way would be to land the tests first, then change repr in a latter pr (that adjusts previous tests). But requiring that would be making contributors clean up our incomplete test-suite before they can land the fixes they actually care about. That's not right, so it's fine to submit in whatever order makes sense for you.

@obi1kenobi
Copy link
Member

I very quickly threw together two draft PRs:

These are my first substantive attempts at code contributions in rustc, so while I've read the dev guide, I'd like to apologize in advance for my unfamiliarity with the codebase. There may be better ways of doing things that I've missed!

If at a glance the changes seem to be in an acceptable direction, I'm happy to do more work to get them into a merge-ready state. I'd love some pointers as to the testing norms for rustc_hir_pretty in particular: I'm not sure whether/where/how that code is currently tested.

@obi1kenobi
Copy link
Member

obi1kenobi commented Mar 4, 2025

Yeah. Also the handling of when we do & don't show #[repr(tranparent)]. I know the HTML has some heuristics as to weather this is "public" and should be shown. Honestly can't remember what we do in JSON.

My read of the code as it currently stands is that rustdoc JSON always shows #[repr(transparent)] when set. Personally, I find this less useful than the HTML behavior of only showing it when the transparency is public API, and I'd be in favor of changing to that behavior. It's currently impossible to use rustdoc JSON to decide if #[repr(transparent)] is public or not (1-ZST info isn't part of rustdoc JSON) so cargo-semver-checks has to use a heuristic here instead. Adopting the HTML behavior for JSON as well would let us rely on an authoritative answer of whether it's public or not, eliminating the need for the heuristic.

Probably also worth having tests for attibutes we don't pass through. (e.g. #[inline], #[no_mangle], #[link_section], etc)

#[no_mangle] and #[export_name] are passed through, together with their edition 2024 unsafe() variants. They are tested in the cargo-semver-checks / trustfall-rustdoc-adapter test suites, since they are relevant to SemVer over the library ABI.

AFAICT, they do not appear to have suffered any changes as a result of the recent attributes change.

I'm not sure about whether #[link_section] or #[inline] are passed through. I don't actually know what #[link_section] does, but if it's SemVer-relevant to the library ABI, it should probably be passed through.

It'd be ok for them to be separate. The """best""" way would be to land the tests first, then change repr in a latter pr (that adjusts previous tests). But requiring that would be making contributors clean up our incomplete test-suite before they can land the fixes they actually care about. That's not right, so it's fine to submit in whatever order makes sense for you.

I'd love to unbreak #[repr(..)] first. Time permitting I'm happy to contribute more test coverage in separate PRs.

Also, if you're okay with it: there's a cargo-semver-checks contributor who I believe would enjoy the opportunity to contribute and who I believe would do an excellent job. I'd like to point them in the direction of the test coverage question here 👀

workingjubilee added a commit to workingjubilee/rustc that referenced this issue Mar 5, 2025
…trs, r=compiler-errors

Pretty-print `#[deprecated]` attribute in HIR.

Pretty-print `#[deprecated]` attribute in a form closer to how it might appear in Rust source code, rather than using a `Debug`-like representation.

Consider the following Rust code:
```rust
#[deprecated]
pub struct PlainDeprecated;

#[deprecated = "here's why this is deprecated"]
pub struct DirectNote;

#[deprecated(since = "1.2.3", note = "here's why this is deprecated")]
pub struct SinceAndNote;
```

Here's the previous output:
```
#[attr="Deprecation{deprecation: Deprecation{since: Unspecifiednote:
suggestion: }span: }")]
struct PlainDeprecated;

#[attr="Deprecation{deprecation: Deprecation{since: Unspecifiednote:
here's why this is deprecatedsuggestion: }span: }")]
struct DirectNote;

#[attr="Deprecation{deprecation: Deprecation{since: NonStandard(1.2.3)note:
here's why this is deprecatedsuggestion: }span: }")]
struct SinceAndNote;
```

Here's the new output:
```rust
#[deprecated]
struct PlainDeprecated;

#[deprecated = "here's why this is deprecated"]
struct DirectNote;

#[deprecated(since = "1.2.3", note = "here's why this is deprecated"]
struct SinceAndNote;
```

Also includes a test for `#[diagnostic::(..)]` attributes, though their behavior is not changed here. I already wrote the test, so I figured it probably won't hurt to have it.

Related to discussion in rust-lang#137645.

r? `@jdonszelmann`
@obi1kenobi
Copy link
Member

For any folks following along:

  • rustdoc: Add attribute-related tests for rustdoc JSON. #138033 adds test coverage without changing any implementation details.
  • Pretty-print #[deprecated] attribute in HIR. #138019 causes #[deprecated] to be pretty-printed instead of Debug-printed. When it merges, I'll add some rustdoc JSON tests for it as well.
  • rustdoc: Use own logic to print #[repr(..)] attributes in JSON output. #138018 causes #[repr(..)] to be pretty-printed in rustdoc JSON and bumps the JSON format version. It switches to using rustdoc's existing code path for repr(..) handling, which is already used outside of the JSON backend. I added test coverage for all the combinations of #[repr(..)] attributes I could think of and know how to use. (I don't know what #[repr(simd)] is nor how to use it properly.)
  • While attempting to write a test for #[automatically_derived], I hit some bugs in the JSON query library that rustdoc JSON tests use. I spent several hours trying to craft a valid workaround without success. I will likely need some help to move past that, I'm definitely stuck at the moment.

#[automatically_derived] isn't currently Debug-printed, so assuming the other PRs merge without issues, cargo-semver-checks's use cases should be all set again.

rust-timer added a commit to rust-lang-ci/rust that referenced this issue Mar 5, 2025
Rollup merge of rust-lang#138019 - obi1kenobi:pg/pretty-print-more-attrs, r=compiler-errors

Pretty-print `#[deprecated]` attribute in HIR.

Pretty-print `#[deprecated]` attribute in a form closer to how it might appear in Rust source code, rather than using a `Debug`-like representation.

Consider the following Rust code:
```rust
#[deprecated]
pub struct PlainDeprecated;

#[deprecated = "here's why this is deprecated"]
pub struct DirectNote;

#[deprecated(since = "1.2.3", note = "here's why this is deprecated")]
pub struct SinceAndNote;
```

Here's the previous output:
```
#[attr="Deprecation{deprecation: Deprecation{since: Unspecifiednote:
suggestion: }span: }")]
struct PlainDeprecated;

#[attr="Deprecation{deprecation: Deprecation{since: Unspecifiednote:
here's why this is deprecatedsuggestion: }span: }")]
struct DirectNote;

#[attr="Deprecation{deprecation: Deprecation{since: NonStandard(1.2.3)note:
here's why this is deprecatedsuggestion: }span: }")]
struct SinceAndNote;
```

Here's the new output:
```rust
#[deprecated]
struct PlainDeprecated;

#[deprecated = "here's why this is deprecated"]
struct DirectNote;

#[deprecated(since = "1.2.3", note = "here's why this is deprecated"]
struct SinceAndNote;
```

Also includes a test for `#[diagnostic::(..)]` attributes, though their behavior is not changed here. I already wrote the test, so I figured it probably won't hurt to have it.

Related to discussion in rust-lang#137645.

r? `@jdonszelmann`
@obi1kenobi
Copy link
Member

Heads up that after Monday of next week, I'll be travelling for a little while and won't be able to drive this work forward. I'm happy to promptly make any changes required to get #138033 and #138018 over the finish line before then, if that's feasible. #138019 was reverted, but isn't load-bearing for cargo-semver-checks, merely nice-to-have, so I'm not that worried about it.

If we're not able to get this over the finish line before then, then for the avoidance of any doubt, let me explicitly opt into allowing @aDotInTheVoid and @GuillaumeGomez (or anyone they designate) to take over my open PRs, alter them as needed, and move them forward. I don't care whether my name is on the commit that fixes the problem, merely that the fix gets shipped :)

@aDotInTheVoid
Copy link
Member Author

While attempting to write a test for #[automatically_derived]

Oof. See #110406. I think this is blocked on improving the test harness. Sorry!

. Adopting the HTML behavior for JSON as well would let us rely on an authoritative answer of whether it's public or not, eliminating the need for the heuristic.

I'm not sure rustdoc HTML's behavior is truly authoritative here, it also seems more of a heuristic to capture "authorial intent". I think we should always pass it through, and try to make sure consumers have enough information to make those decisions for themselves.

Also, if you're okay with it: there's a cargo-semver-checks contributor who I believe would enjoy the opportunity to contribute and who I believe would do an excellent job. I'd like to point them in the direction of the test coverage question here 👀

More than happy for you do to this. Tell them I apologize in advance for the state of things 😅

jhpratt added a commit to jhpratt/rust that referenced this issue Mar 8, 2025
…aDotInTheVoid

rustdoc: Add attribute-related tests for rustdoc JSON.

Add rustdoc JSON tests covering the use of the following attributes:
- `#[non_exhaustive]` applied to enums, variants, and structs
- `#[must_use]`, both with and without a message
- `#[no_mangle]`, in both edition 2021 and 2024 (`#[unsafe(no_mangle)]`) flavors
- `#[export_name]`, also in both edition 2021 and 2024 flavors

Related to rust-lang#137645; this is a subset of the attributes that `cargo-semver-checks` relies on and tests in its own test suite or in the test suites of its components such as `trustfall-rustdoc-adapter`.

Helps with rust-lang#81359

r? `@aDotInTheVoid`
@obi1kenobi
Copy link
Member

This is what the nomicon says about repr(transparent):

This repr is only considered part of the public ABI of a type if either the single field is pub, or if its layout is documented in prose. Otherwise, the layout should not be relied upon by other crates.

AFAICT this is the same rule that the HTML behavior implements. cargo-semver-checks attempts to approximate this rule, but since we don't know in general which field(s) are 1-ZST (no info on that in the JSON), we currently miss some breakage — the lint is written to have false-negatives instead of false-positives.

Having rustdoc JSON list #[repr(transparent)] only if it's public API would also be consistent with how #[doc(hidden)] pub items are not shown in rustdoc JSON. It would also make cargo-semver-checks better at its job, with less work :)

I don't feel strongly about this. If you think it's best to always include #[repr(transparent)], that's fine by me — I'd much prefer this issue closed in any form, than any specific resolution of it. I just wanted to point out that we have a relatively strong and explicit rule to lean on here, not just a heuristic about authorial intent.

rust-timer added a commit to rust-lang-ci/rust that referenced this issue Mar 8, 2025
Rollup merge of rust-lang#138033 - obi1kenobi:pg/json-attrs-tests, r=aDotInTheVoid

rustdoc: Add attribute-related tests for rustdoc JSON.

Add rustdoc JSON tests covering the use of the following attributes:
- `#[non_exhaustive]` applied to enums, variants, and structs
- `#[must_use]`, both with and without a message
- `#[no_mangle]`, in both edition 2021 and 2024 (`#[unsafe(no_mangle)]`) flavors
- `#[export_name]`, also in both edition 2021 and 2024 flavors

Related to rust-lang#137645; this is a subset of the attributes that `cargo-semver-checks` relies on and tests in its own test suite or in the test suites of its components such as `trustfall-rustdoc-adapter`.

Helps with rust-lang#81359

r? `@aDotInTheVoid`
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-attributes Area: Attributes (`#[…]`, `#![…]`) A-rustdoc-json Area: Rustdoc JSON backend T-rustdoc Relevant to the rustdoc team, which will review and decide on the PR/issue.
Projects
None yet
Development

No branches or pull requests

4 participants