Improve forward compatibility with a notion of minimal modifier set #51

Open
MDLC01 opened this issue Feb 19, 2025 · 5 comments
Labels
meta (Discussion about the structure of this repo), proposal (This may still need discussion)

Comments

@MDLC01
Collaborator

MDLC01 commented Feb 19, 2025

Motivation

Currently, when selecting a variant from a set of modifiers, the first variant in the list that contains all the requested modifiers and the fewest additional modifiers is chosen.[1][2] This means that referring to a symbol by a non-fully qualified name might cause breakage when Codex is updated. For example, consider the following symbol:

arrow
  .l ←
  .r →
  .r.bar ↦

arrow.bar resolves to arrow.r.bar, which is ↦. Now, suppose a new version of Codex changes the symbol to the following:

arrow
  .l ←
  .l.bar ↤
  .r →
  .r.bar ↦

Now, arrow.bar will resolve to arrow.l.bar, which is ↤.

Essentially, this means adding new variants in the middle of the variant list can cause unexpected breakage. As of now, there is no policy regarding what constitutes a breaking change when it comes to fallbacks.
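The breakage can be reproduced with a small model of the selection rule. This is only a sketch of the rule as described in this issue, not the actual Typst code: the function name `resolve` and the representation of variants as `(modifiers, glyph)` pairs in declaration order are assumptions made for illustration.

```python
# Hypothetical model: each variant is (modifiers, glyph), listed in declaration order.
def resolve(variants, requested):
    requested = frozenset(requested)
    best = None
    for modifiers, glyph in variants:
        modifiers = frozenset(modifiers)
        # The variant must contain every requested modifier.
        if not requested <= modifiers:
            continue
        extra = len(modifiers - requested)
        # Strict '<' keeps the earliest declared variant among ties.
        if best is None or extra < best[0]:
            best = (extra, glyph)
    return None if best is None else best[1]

old = [({"l"}, "←"), ({"r"}, "→"), ({"r", "bar"}, "↦")]
new = [({"l"}, "←"), ({"l", "bar"}, "↤"), ({"r"}, "→"), ({"r", "bar"}, "↦")]

before = resolve(old, {"bar"})  # arrow.bar with the old definition
after = resolve(new, {"bar"})   # arrow.bar after arrow.l.bar is inserted
print(before, after)
```

Both `arrow.l.bar` and `arrow.r.bar` need exactly one additional modifier, so declaration order alone decides the result, and the inserted variant wins.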

Proposed solution

The intuitive idea of this solution is to make explicit what parts of the fully qualified form (i.e., which modifiers) can be omitted.

As before, each variant has a set of modifiers. Hereafter, we refer to this set of modifiers as the fully qualified form, denoted $M_\text{full}$. Additionally, a variant can define a minimal modifier set, denoted $M_\text{min}$, which is a subset of the fully qualified form (i.e., $M_\text{min} \subseteq M_\text{full}$). When selecting a variant for a set of modifiers $M$, the same process as before is applied, with the additional constraint that the selected variant's minimal modifier set is included in $M$ (i.e., $M_\text{min} \subseteq M$).

The current behavior corresponds to having $M_\text{min} = \emptyset$ for all variants. With this proposal, the default would become $M_\text{min} = M_\text{full}$, with lots of manual overrides to allow for backward compatibility and more leniency.
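As a rough sketch of how the extra constraint could behave (the function name `resolve_with_min` and the `(m_full, m_min, glyph)` tuple layout are hypothetical, not part of Codex), the selection below is the same as before except that a variant is skipped unless its minimal modifier set is contained in the requested set:

```python
# Hypothetical model: each variant is (m_full, m_min, glyph), in declaration order.
def resolve_with_min(variants, requested):
    requested = frozenset(requested)
    best = None
    for m_full, m_min, glyph in variants:
        m_full, m_min = frozenset(m_full), frozenset(m_min)
        assert m_min <= m_full, "M_min must be a subset of M_full"
        # Candidate only if M_min <= M <= M_full.
        if not (m_min <= requested <= m_full):
            continue
        extra = len(m_full - requested)
        if best is None or extra < best[0]:
            best = (extra, glyph)
    return None if best is None else best[1]

# arrow.r.bar carries a manual override M_min = {bar}; the others use M_min = M_full.
arrow_v1 = [({"l"}, {"l"}, "←"), ({"r"}, {"r"}, "→"), ({"r", "bar"}, {"bar"}, "↦")]
# A later version adds arrow.l.bar with the default M_min = M_full.
arrow_v2 = [({"l"}, {"l"}, "←"), ({"l", "bar"}, {"l", "bar"}, "↤"),
            ({"r"}, {"r"}, "→"), ({"r", "bar"}, {"bar"}, "↦")]
# Without the override, arrow.bar would not resolve at all.
arrow_strict = [({"l"}, {"l"}, "←"), ({"r"}, {"r"}, "→"), ({"r", "bar"}, {"r", "bar"}, "↦")]

print(resolve_with_min(arrow_v1, {"bar"}), resolve_with_min(arrow_v2, {"bar"}))
```

In this model, `arrow.bar` keeps resolving to `arrow.r.bar` after `arrow.l.bar` is added, because the new variant requires `l` to be spelled out.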

This improves forward compatibility by explicitly specifying which fallbacks can be relied on, and which can't. Ideally, there could even be an automated way of detecting breakages. This is currently not feasible, because most fallbacks are not explicitly guaranteed to be future-proof.[3]

Other benefits

In addition to improving forward compatibility, this proposal can lead to documentation improvements. The current documentation[4][5] only presents fully qualified names, which can be problematic for some common symbols. For example, sym.errorbar.square.stroked can be accessed simply as sym.errorbar, but the documentation does not reflect that. With this proposal, the documentation can present $M_\text{min}$ as the main symbol name and $M_\text{full}$ as the fully qualified name when $M_\text{min} \neq M_\text{full}$.

As mentioned previously, this proposal would make it possible to detect breakages automatically, because it clarifies which non-fully qualified variants are legal and clearly defined, and which are not.

Footnotes

  1. This is actually not defined in Codex, but in Typst. Making the variant selection part of Codex is the topic of Resolve modifiers #30.

  2. https://github.com/typst/typst/blob/d199546f9fe92b2d380dc337298fdca3e6fca8c8/crates/typst-library/src/foundations/symbol.rs#L387-L420

  3. For example, sym.angle.top currently resolves to sym.angle.spheric.top (⦡), but this is more a side effect of the fact that there is no bare sym.angle.top symbol than a conscious decision, and shouldn't be relied upon.

  4. https://typst.app/docs/reference/symbols/sym/

  5. https://typst.app/docs/reference/symbols/emoji/

MDLC01 added the meta and proposal labels Feb 19, 2025
@T0mstone
Collaborator

Syntax idea: .modifier? for a non-required modifier.

@knuesel

knuesel commented Feb 21, 2025

I think there are two problems with the current behavior:

  1. The backward-compatibility issue described above

  2. Bad readability of source code: currently it's hard to interpret Typst code such as $ arrow.bar $ without executing the compiler. You basically have to run a whole algorithm in your head:

    1. Consider all variants of arrow that include bar
    2. Keep only those with the minimal number of other modifiers
    3. Pick the first one according to the order in which they are declared in Typst

The above proposal improves on problem 1 (by putting some restrictions on the valid ways of inputting a variant, we make breakage less likely), but the fundamental issue remains...

To really fix the issue, we should require that variants cannot have conflicting definitions: this means that a given set of modifiers cannot match two variants. For example (using @T0mstone's notation), if we have arrow.r?.bar, we could later add arrow.l.bar but not arrow.l?.bar.

Intuitive specification

The whole behavior can be specified intuitively with "aliases":

  1. When defining a variant, modifiers marked with ? are optional, so defining s.x?.y?.z corresponds to four aliases: s.x.y.z, s.x.z, s.y.z and s.z.
  2. The order of modifiers doesn't matter so s.x.z is the same as s.z.x.
  3. Different variants cannot share an alias.

(These aliases are only used for resolving a variant. It's still a single variant, displayed as a single entry on the symbol page, but the entry would show s.x?.y?.z to document which modifiers can be omitted.)
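The three rules above can be sketched in a few lines. This is only an illustration: the helper name `aliases` and the use of frozensets to model unordered modifier sets are assumptions, not an existing API.

```python
from itertools import combinations

def aliases(required, optional):
    """Return every modifier set that refers to a variant (rule 1)."""
    required = frozenset(required)
    optional = sorted(optional)
    result = set()
    # Add the required modifiers plus every subset of the optional ones.
    for k in range(len(optional) + 1):
        for chosen in combinations(optional, k):
            result.add(required | frozenset(chosen))
    return result

# s.x?.y?.z: z is required, x and y are optional.
s_aliases = aliases({"z"}, {"x", "y"})

# Rule 2: frozensets ignore order, so s.x.z and s.z.x are the same alias.
same = frozenset(["x", "z"]) == frozenset(["z", "x"])

# Rule 3: arrow.r?.bar can coexist with arrow.l.bar, but not with arrow.l?.bar,
# because both arrow.r?.bar and arrow.l?.bar would claim the alias arrow.bar.
r_bar = aliases({"bar"}, {"r"})
ok = r_bar & aliases({"l", "bar"}, set())
clash = r_bar & aliases({"bar"}, {"l"})
print(len(s_aliases), same, ok, clash)
```

Enumerating aliases grows exponentially with the number of optional modifiers, which is presumably fine for the small modifier sets discussed here.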

I think this solves both problems:

  1. backward-compatibility: users can refer to variants only through valid aliases, and when we define a new variant it cannot share an alias with an existing variant.

  2. Readability: if the code says s.x.y and I know a variant that matches this set of modifiers, I know it's the right one. No need to check what other variants exist in case there would be another match.

It also preserves nice properties: modifiers are commutative, users can "build" their symbol by trying modifiers, and they can leave out optional modifiers.

Formal specification

  1. A variant $V$ specifies a set of required modifiers $V_\mathrm{req}$ and a set of optional modifiers $V_\mathrm{opt}$. The set of all valid ways of referring to $V$ is

$$R_V = \{M \ :\ V_\mathrm{req} \subseteq M \subseteq V_\mathrm{req} \cup V_\mathrm{opt}\}$$

  2. It is not allowed for two variants $V_1$ and $V_2$ to share a valid reference:

$$ V_1 \neq V_2 \implies R_{V_1} \cap R_{V_2} = \emptyset. $$

To resolve a set of modifiers $M$, we take the first and only $V$ such that $M \in R_V$.
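One consequence of this definition, offered as a sketch rather than as part of the proposal: the disjointness condition can be checked without enumerating $R_V$. Writing $V_\mathrm{full} = V_\mathrm{req} \cup V_\mathrm{opt}$ (a shorthand introduced here), a set $M$ lies in both $R_{V_1}$ and $R_{V_2}$ exactly when it contains both required sets and is contained in both full sets. Such an $M$ exists if and only if $M = V_{1,\mathrm{req}} \cup V_{2,\mathrm{req}}$ itself qualifies, so

$$ R_{V_1} \cap R_{V_2} \neq \emptyset \iff V_{1,\mathrm{req}} \cup V_{2,\mathrm{req}} \subseteq V_{1,\mathrm{full}} \cap V_{2,\mathrm{full}}. $$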

@MDLC01
Collaborator Author

MDLC01 commented Feb 21, 2025

As I originally wrote on Discord, from the user's perspective, your idea is really just a rephrasing of mine, with $M_\text{min} = V_\text{req}$, and $M_\text{full} = V_\text{req} \cup V_\text{opt}$. Nothing prevents us from adding your second constraint to my proposal. In fact, I think we should if we end up implementing it.

Moreover, I think this can be expressed in a simplified way to the user: each variant has required and optional modifiers, which makes it possible to allow using non-fully qualified names when it makes sense; however, no two variants can share the same set of required modifiers in order to prevent ambiguity. There might be some approximations, but this is what most users need to know to understand the variant selection system.

In the end, I think what we are discussing here is essentially an implementation detail which would not be observable to the end user.[1]

Footnotes

  1. The only observable difference that was noted was in the symbol list, but in your proposal, aliases are "still a single variant, displayed as a single entry on the symbol page", so the end result would be the same.

@knuesel

knuesel commented Feb 21, 2025

Yes, I'm just adding this constraint and proposing another formulation. But the constraint is a bit tricky: "no two variants can share the same set of required modifiers" is not sufficient. For example, s.x?.y and s.x.y.z? have different sets of required modifiers, but they still should not be allowed together since they would both match s.x.y.

(I think the alias formulation expresses the constraint correctly and in a way that's more concrete for users, but it's just one possible formulation.)
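To make the example concrete, here is a minimal sketch of a conflict check. The function name `conflict` is hypothetical, and the criterion is derived from the definition of valid references rather than taken from any existing implementation: two variants share a reference exactly when the union of their required sets fits inside both of their full modifier sets.

```python
def conflict(req1, opt1, req2, opt2):
    """True if some modifier set is a valid reference to both variants."""
    req1, req2 = frozenset(req1), frozenset(req2)
    full1 = req1 | frozenset(opt1)
    full2 = req2 | frozenset(opt2)
    # A shared reference M must satisfy req1 | req2 <= M <= full1 & full2.
    return (req1 | req2) <= (full1 & full2)

# s.x?.y has required {y}; s.x.y.z? has required {x, y}.
different_required = frozenset({"y"}) != frozenset({"x", "y"})
clash = conflict({"y"}, {"x"}, {"x", "y"}, {"z"})

# By contrast, s.x?.y and s.y.z never match the same modifier set.
no_clash = conflict({"y"}, {"x"}, {"y", "z"}, set())
print(different_required, clash, no_clash)
```

The required sets differ, yet the check reports a clash because both variants accept s.x.y.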

@MDLC01
Collaborator Author

MDLC01 commented Feb 21, 2025

"no two variants can share the same set of required modifiers" is not sufficient.

This is what I meant by "There might be some approximations, but this is what most users need to know to understand the variant selection system." Even if the phrasing is not complete (as in, correct, but missing some information), we just need the users to understand the general idea, and they can try the rest by themselves.
