
Does it really need index.json? #336

Closed
igorshubovych opened this issue Nov 24, 2015 · 12 comments
Labels
architecture Organization of the pages per language, platform, etc. decision A (possibly breaking) decision regarding tldr-pages content, structure, infrastructure, etc.

Comments

@igorshubovych
Collaborator

pages/index.json is a binary file containing an index of the pages.

There are several issues with it:

  • It makes the contribution process more complicated. Everybody needs to install extra software and rebuild the index on every commit. Even though this is done automatically, the committed binary file causes merge conflicts from time to time.
  • It dictates how clients have to implement the index. Currently this is not a problem, but imagine that one day there is a web client which stores all the information in a database and reuses the database's own index; in that case index.json is not needed. Or some clients may need a more sophisticated index, e.g. a full-text search index.
  • Not all clients use it anyway, e.g. the Python client, Go client, C++ client, Elixir client, and Ruby client.

There are several possible solutions:

  • Leave it as is.
  • Set up a service that functions as CI: it fetches the updated pages, builds the index, packages everything into a zip, and puts the file somewhere; clients then fetch the zip from this service instead of using the GitHub API. Technically we could use GitHub Pages and Travis for this.
  • Remove index.json from GitHub, and modify the few clients that use it to build the index themselves after updating the pages. That should be pretty simple, based on this code.
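The third option — rebuilding the index from the pages tree — could be sketched roughly as below. This assumes pages are laid out as `pages/<platform>/<command>.md` and an index schema of command names mapped to platforms; the real index.json schema may differ:

```python
import json
import os

def build_index(pages_dir):
    """Walk pages/<platform>/<command>.md and build an index mapping
    each command to the platforms that document it."""
    commands = {}
    for platform in sorted(os.listdir(pages_dir)):
        platform_dir = os.path.join(pages_dir, platform)
        if not os.path.isdir(platform_dir):
            continue
        for page in sorted(os.listdir(platform_dir)):
            if page.endswith(".md"):
                name = page[:-3]  # strip the ".md" extension
                commands.setdefault(name, []).append(platform)
    return {"commands": [{"name": name, "platform": platforms}
                         for name, platforms in sorted(commands.items())]}

# A client would serialize this next to the pages after each update:
# with open("index.json", "w") as f:
#     json.dump(build_index("pages"), f)
```

A client that runs this after syncing the pages never needs a committed index.json at all.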

What do you think?

@lord63
Contributor

lord63 commented Nov 25, 2015

Not all clients use it anyway

That's because those clients came out first, and index.json only later; some clients simply haven't been updated to use it. The new Python client and the Node.js client do use index.json.

When I developed the Python client, I thought index.json was a good idea. It helps the program find out whether a command is supported and where it lives.

It makes contribution process more complicated

Hmmm, that's true, it can be annoying sometimes. That's because index.json needs to be updated every time a new command is added; merely modifying existing pages makes no difference. In my opinion, that should be OK.

Or can we come up with a better solution?

@rprieto
Contributor

rprieto commented Nov 25, 2015

That's a very good question. I agree with @lord63: there's definite value in having it, because it simplifies some clients and could even allow new features (e.g. search, autocomplete...). Unfortunately, some clients can't easily generate the metadata themselves because they don't download all the pages, either for simplicity or because it's not practical. For example, @Ostera's http://www.ostera.io/tldr.js/ probably wouldn't want to download and process all tldr pages on page load :)

That being said, I agree with @igorshubovych that the current index.json is quite specific to certain requirements. It would be a good idea to move it to a new repo, e.g. https://github.com/tldr-pages/metadata. That repo could hold all the scripts required to build index.json, and could actually contain index.json too. We could run the build on a cron (every day) or whenever there's a commit on https://github.com/tldr-pages/tldr.

Note that we also currently run a linter on each commit (which requires Ruby, etc.). I'm very keen to keep the linter as part of this repo. However, since it doesn't create any files and simply reports success/failure, we could remove the pre-commit hook and just have Travis report the linting status.

This would mean contributors no longer need Ruby to contribute, but we'd still get nice linting, with the index being rebuilt behind the scenes. Thoughts?
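The split described above (lint everywhere, rebuild only on master) could be expressed in a .travis.yml along these lines. This is a hedged sketch, not the repo's actual configuration; the `scripts/build-index.sh` name is an assumption, though `mdl` is the linter the thread mentions:

```yaml
language: ruby          # the linter (mdl) is a Ruby gem
script:
  # Lint on every build; this creates no files, so PRs just get a
  # pass/fail status instead of needing a committed index.
  - mdl pages/
after_success:
  # Rebuild the index only on pushes to master, never on pull requests,
  # so contributors do not need Ruby or the index tooling locally.
  - test "$TRAVIS_BRANCH" = "master" -a "$TRAVIS_PULL_REQUEST" = "false" && ./scripts/build-index.sh || true
```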

@waldyrious
Member

Doesn't Travis or some other GitHub hook/integration support this kind of automatic processing? (I'm genuinely asking; I don't know.) If we could have some sort of bot or post-commit hook run every time someone makes a commit, we wouldn't need a separate metadata repository.

Alternatively (as I've suggested elsewhere), we could simply set up a Travis test script with informative error messages when something's missing, like the index update. That way contributors would be told the necessary steps to fix their PR (in an automated manner, rather than as a comment from another human being, which can feel like nitpicking), and the repo could even be configured to not allow merging of PRs unless the Travis tests pass :)

@rprieto
Contributor

rprieto commented Nov 25, 2015

Good question, I've never tried creating files / committing to the same repo with Travis. Maybe it works well; hopefully it doesn't re-trigger the build in an infinite loop :) I'm also not sure what it means for access, i.e. Travis would need write access to everyone's branches. Definitely worth investigating though.

The nice thing about getting Travis (or something else) to build index.json is that contributors wouldn't need Ruby installed (we've had a few issues recently where people struggled to set up the gem path, etc.). They wouldn't even need to clone the repo, since you can make changes and open PRs from the GitHub UI if the change is small enough.

@leostera
Contributor

👍 to @waldyrious; worst case, we can use a separate index branch so as not to pollute the master branch history.

If Travis really only checks PRs to master, then once they're merged we'd need a GitHub post-receive webhook to trigger an index build, commit, and push to the index branch, effectively updating the index for every merged PR. It might even be possible to do this in one of Travis' build lifecycle hooks (https://docs.travis-ci.com/user/customizing-the-build/#The-Build-Lifecycle).

I'll try to have a look at this over the weekend, but ping me if you know of anything.

@rprieto
Contributor

rprieto commented Nov 25, 2015

I haven't really used Git this way before; is it common practice to have a branch with completely different data (i.e. not intended to be merged)? I know that's how GitHub does gh-pages, but it has always struck me as odd.

@igorshubovych
Collaborator Author

Travis CI integration for my fork:
https://travis-ci.org/igorshubovych/tldr

Commit:
igorshubovych@31c2e19

@igorshubovych
Collaborator Author

P.S.
I just noticed that my commit contains index.json, and this is exactly why we do not need it in the repo.

@igorshubovych
Collaborator Author

Some updates on my Travis investigation:

  • The Travis configuration is created.
  • It always lints Markdown files using mdl.
  • On the master branch, it also rebuilds the index and creates a zip package from the pages and index.
  • Still need to create a static site for TLDR.
  • Still need to implement uploading of the created zip package.
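The "zip package from the pages and index" step could look roughly like this; a minimal sketch, assuming the archive should contain index.json at the root plus the whole pages/ tree (file names are guesses based on the eventual assets/tldr.zip URL, not the actual build script):

```python
import os
import zipfile

def package_pages(pages_dir, index_path, out_path="tldr.zip"):
    """Bundle every page plus the generated index into one archive
    that clients can download instead of hitting the GitHub API."""
    with zipfile.ZipFile(out_path, "w", zipfile.ZIP_DEFLATED) as zf:
        # Store the index at the archive root.
        zf.write(index_path, arcname="index.json")
        # Store every page under a pages/ prefix, preserving layout.
        for root, _dirs, files in os.walk(pages_dir):
            for name in files:
                full = os.path.join(root, name)
                arcname = os.path.join("pages", os.path.relpath(full, pages_dir))
                zf.write(full, arcname=arcname)
```

Serving one such archive from a static site means clients make a single download rather than many GitHub API calls.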

You can see the latest build here:
https://travis-ci.org/igorshubovych/tldr

The code is in travis branch:
https://github.com/igorshubovych/tldr/tree/travis

@igorshubovych
Collaborator Author

I have some progress on this index.json:

The results can be seen here:
https://github.com/tldr-pages/tldr-pages.github.io/commits/master

And the pages zip-archive is available here:
http://tldr-pages.github.io/assets/tldr.zip

TODO:

  • Refactor the build script.
  • Merge the Travis code into the main repo, possibly reintegrating Travis and tldr.
  • It is probably better to start using Git LFS for the zip package, otherwise the repo history of tldr-pages.github.io will quickly get bloated.
  • Point the Node.js client to the new location.

@leostera
Contributor

leostera commented Dec 2, 2015

You, sir, raise some good points. I like this.

@leostera
Contributor

leostera commented Dec 5, 2015

The last two items on that TODO list can be turned into discrete issues for any of us to act on. I'll close this, since the conversation appears to have reached its end.

We already have #343 and #344.

Ultimately, from the tldr-pages.github.io website we can download either index.json, for clients like mine, or the actual set of pages, for clients that implement proper caching. Which reminds me: I could start a web worker to download the tarball, parse it, and put it into localStorage, then just re-pull the index to invalidate the cache... sorry, not the issue under discussion.

Closing this! Great work @igorshubovych

@leostera leostera closed this as completed Dec 5, 2015
@waldyrious waldyrious added architecture Organization of the pages per language, platform, etc. decision A (possibly breaking) decision regarding tldr-pages content, structure, infrastructure, etc. labels Sep 16, 2016