
Does it really need index.json? #336

Closed
igorshubovych opened this issue Nov 24, 2015 · 12 comments
Labels
architecture Organization of the pages per language, platform, etc. decision A (possibly breaking) decision regarding tldr-pages content, structure, infrastructure, etc.

Comments

@igorshubovych
Collaborator

pages/index.json is a binary file containing an index of the pages.

There are several issues with it:

  • It makes the contribution process more complicated. Everybody needs to install extra software and rebuild the index on every commit. Even though this is done automatically, the committed binary file causes merge conflicts from time to time.
  • It dictates how clients have to implement the index. Currently this is not a problem, but imagine that one day there is a web client which stores all the information in a database and reuses the database's own index; in that case index.json is not needed. Or some clients may need a more sophisticated index, e.g. a full-text search index.
  • Not all clients use it anyway, e.g. the Python client, Go client, C++ client, Elixir client, and Ruby client.

There are several possible solutions:

  • Leave it as is.
  • Set up a service that functions as CI: it fetches the updated pages, builds the index, packages everything into a zip, and puts the file somewhere; clients then fetch the zip from this service instead of using the GitHub API. Technically we could use GitHub Pages and Travis for this.
  • Remove index.json from GitHub, and modify the few clients that use it to build the index themselves after updating the pages. That should be pretty simple, based on this code.
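The third option — rebuilding the index from the pages tree — could be sketched roughly as below. This assumes pages are laid out as `pages/<platform>/<command>.md` and an index schema of command names mapped to platforms; the real index.json schema may differ:

```python
import json
import os

def build_index(pages_dir):
    """Walk pages/<platform>/<command>.md and build an index mapping
    each command to the platforms that document it."""
    commands = {}
    for platform in sorted(os.listdir(pages_dir)):
        platform_dir = os.path.join(pages_dir, platform)
        if not os.path.isdir(platform_dir):
            continue
        for page in sorted(os.listdir(platform_dir)):
            if page.endswith(".md"):
                name = page[:-3]  # strip the ".md" extension
                commands.setdefault(name, []).append(platform)
    return {"commands": [{"name": name, "platform": platforms}
                         for name, platforms in sorted(commands.items())]}

# A client would serialize this next to the pages after each update:
# with open("index.json", "w") as f:
#     json.dump(build_index("pages"), f)
```

A client that runs this after syncing the pages never needs a committed index.json at all.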

What do you think?

@lord63
Contributor

lord63 commented Nov 25, 2015

Not all clients use it anyway

That's because those clients came out first, and index.json only later; some clients simply haven't been updated to use it. The new Python client and the Node.js client do use index.json.

When I developed the Python client, I thought index.json was a good idea. It helps the program find out whether a command is supported and where it lives.

It makes contribution process more complicated

Hmmm, that's true, it can be annoying sometimes. That's because index.json needs to be updated every time a new command is added; merely modifying existing pages makes no difference. In my opinion, that should be OK.

Or can we come up with a better solution?

@rprieto
Contributor

rprieto commented Nov 25, 2015

That's a very good question. I agree with @lord63: there's definite value in having it, because it simplifies some clients and could even allow new features (e.g. search, autocomplete...). Unfortunately, some clients can't easily generate the metadata themselves because they don't download all the pages, either for simplicity or because it's not practical. For example, @Ostera's http://www.ostera.io/tldr.js/ probably wouldn't want to download and process all tldr pages on page load :)

That being said, I agree with @igorshubovych that the current index.json is quite specific to certain requirements. It would be a good idea to move it to a new repo, e.g. https://github.com/tldr-pages/metadata. That repo could hold all the scripts required to build index.json, and could actually contain index.json too. We could run the build on a cron (every day) or whenever there's a commit on https://github.com/tldr-pages/tldr.

Note that we also currently run a linter on each commit (which requires Ruby, etc.). I'm very keen to keep the linter as part of this repo. However, since it doesn't create any files and simply reports success/failure, we could remove the pre-commit hook and just have Travis report the linting status.

This would mean contributors no longer need Ruby to contribute, but we'd still get nice linting, with the index being rebuilt behind the scenes. Thoughts?
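The split described above (lint everywhere, rebuild only on master) could be expressed in a .travis.yml along these lines. This is a hedged sketch, not the repo's actual configuration; the `scripts/build-index.sh` name is an assumption, though `mdl` is the linter the thread mentions:

```yaml
language: ruby          # the linter (mdl) is a Ruby gem
script:
  # Lint on every build; this creates no files, so PRs just get a
  # pass/fail status instead of needing a committed index.
  - mdl pages/
after_success:
  # Rebuild the index only on pushes to master, never on pull requests,
  # so contributors do not need Ruby or the index tooling locally.
  - test "$TRAVIS_BRANCH" = "master" -a "$TRAVIS_PULL_REQUEST" = "false" && ./scripts/build-index.sh || true
```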

@waldyrious
Member

Doesn't Travis or some other GitHub hook/integration support this kind of automatic processing? (I'm genuinely asking; I don't know.) If we could have some sort of bot or post-commit hook run every time someone makes a commit, we wouldn't need a separate metadata repository.

Alternatively (as I've suggested elsewhere), we could simply set up a Travis test script with informative error messages when something's missing, like the index update. That way contributors would be told the necessary steps to fix their PR (in an automated manner, rather than as a comment from another human being, which can feel like nitpicking), and the repo could even be configured to not allow merging of PRs unless the Travis tests pass :)

@rprieto
Contributor

rprieto commented Nov 25, 2015

Good question, I've never tried creating files / committing to the same repo with Travis. Maybe it works well; hopefully it doesn't re-trigger the build in an infinite loop :) I'm also not sure what it means for access, i.e. Travis would need write access to everyone's branches. Definitely worth investigating though.

The nice thing about getting Travis (or something else) to build index.json is that contributors wouldn't need Ruby installed (we've had a few issues recently where people struggled to set up the gem path, etc.). They wouldn't even need to clone the repo, since you can make changes and open PRs from the GitHub UI if the change is small enough.

@leostera
Contributor

👍 to @waldyrious; worst case, we can use a separate index branch so as not to pollute the master branch history.

If Travis really only checks PRs to master, then once they're merged we'd need a GitHub post-receive webhook to trigger an index build, commit, and push to the index branch, effectively updating the index for every merged PR. It might even be possible to do this in one of Travis' build lifecycle hooks (https://docs.travis-ci.com/user/customizing-the-build/#The-Build-Lifecycle).

I'll try to have a look at this over the weekend, but ping me if you know of anything.

@rprieto
Contributor

rprieto commented Nov 25, 2015

I haven't really used Git this way before; is it common practice to have a branch with completely different data (i.e. not intended to be merged)? I know that's how GitHub does gh-pages, but it has always struck me as odd.

@igorshubovych
Collaborator Author

Travis CI integration for my fork:
https://travis-ci.org/igorshubovych/tldr

Commit:
igorshubovych@31c2e19

@igorshubovych
Collaborator Author

P.S.
I just noticed that my commit contains index.json, and this is exactly why we do not need it in the repo.

@igorshubovych
Collaborator Author

Some updates on my Travis investigation:

  • The Travis configuration is created.
  • It always lints Markdown files using mdl.
  • On the master branch, it also rebuilds the index and creates a zip package from the pages and index.
  • Still need to create a static site for TLDR.
  • Still need to implement uploading of the created zip package.
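The "zip package from the pages and index" step could look roughly like this; a minimal sketch, assuming the archive should contain index.json at the root plus the whole pages/ tree (file names are guesses based on the eventual assets/tldr.zip URL, not the actual build script):

```python
import os
import zipfile

def package_pages(pages_dir, index_path, out_path="tldr.zip"):
    """Bundle every page plus the generated index into one archive
    that clients can download instead of hitting the GitHub API."""
    with zipfile.ZipFile(out_path, "w", zipfile.ZIP_DEFLATED) as zf:
        # Store the index at the archive root.
        zf.write(index_path, arcname="index.json")
        # Store every page under a pages/ prefix, preserving layout.
        for root, _dirs, files in os.walk(pages_dir):
            for name in files:
                full = os.path.join(root, name)
                arcname = os.path.join("pages", os.path.relpath(full, pages_dir))
                zf.write(full, arcname=arcname)
```

Serving one such archive from a static site means clients make a single download rather than many GitHub API calls.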

You can see the latest build here:
https://travis-ci.org/igorshubovych/tldr

The code is in travis branch:
https://github.com/igorshubovych/tldr/tree/travis

@igorshubovych
Collaborator Author

I have some progress on this index.json:

The results can be seen here:
https://github.com/tldr-pages/tldr-pages.github.io/commits/master

And the pages zip-archive is available here:
http://tldr-pages.github.io/assets/tldr.zip

TODO:

  • Refactor the build script.
  • Merge the Travis code into the main repo, possibly reintegrating Travis and tldr.
  • It is probably better to start using Git LFS for the zip package, otherwise the repo history of tldr-pages.github.io will quickly get bloated.
  • Point the Node.js client to the new location.

@leostera
Contributor

leostera commented Dec 2, 2015

You, sir, raise some good points. I like this.

@leostera
Contributor

leostera commented Dec 5, 2015

The last two items on that TODO list can be turned into discrete issues for any of us to act on. I'll close this, since the conversation appears to have reached its end.

We already have #343 and #344.

Ultimately, from the tldr-pages.github.io website we can download either index.json, for clients like mine, or the actual set of pages, for clients that implement proper caching. Which reminds me: I could start a web worker to download the tarball, parse it, and put it into localStorage, then just re-pull the index to invalidate the cache... sorry, not the issue under discussion.

Closing this! Great work @igorshubovych

@leostera leostera closed this as completed Dec 5, 2015
@waldyrious waldyrious added architecture Organization of the pages per language, platform, etc. decision A (possibly breaking) decision regarding tldr-pages content, structure, infrastructure, etc. labels Sep 16, 2016