Add reserved word scraper framework. #4

Open
jgornick opened this issue Nov 12, 2013 · 1 comment

Comments

@jgornick
Owner

For now, the only way to add new words to reservedwordsearch.com is manual: visit each source, copy the words, clean them up, and create a platform JSON document by hand.

Most of the sources are HTML pages that could easily be parsed. It would be nice to have a framework that lets contributors write scrapers that build the platform documents, instead of assembling each one by hand.

This could be inspired by the scraper framework used for devdocs:

See issue #1 for a wrapper around running scrapers.
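
To make the idea concrete, here is a rough sketch of what such a framework might look like, written in Python using only the standard library. Everything in it is an assumption for illustration: the `Scraper` base class, the `TagTextCollector` helper, the example URL, and especially the `platform` / `source` / `words` fields, which are hypothetical and not the actual platform document schema used by the site.

```python
import json
from html.parser import HTMLParser


class TagTextCollector(HTMLParser):
    """Collects the text content of every element with a given tag name."""

    def __init__(self, tag):
        super().__init__()
        self.tag = tag
        self.depth = 0    # greater than 0 while inside a matching element
        self.buffer = []  # text fragments of the element being read
        self.texts = []   # finished, stripped text of each matching element

    def handle_starttag(self, tag, attrs):
        if tag == self.tag:
            self.depth += 1

    def handle_endtag(self, tag):
        if tag == self.tag and self.depth > 0:
            self.depth -= 1
            if self.depth == 0:
                text = "".join(self.buffer).strip()
                if text:
                    self.texts.append(text)
                self.buffer = []

    def handle_data(self, data):
        if self.depth > 0:
            self.buffer.append(data)


class Scraper:
    """Base class: a new scraper sets the metadata and implements extract_words()."""

    platform = None    # hypothetical metadata fields, not the site's real schema
    source_url = None

    def extract_words(self, html):
        raise NotImplementedError

    def build_document(self, html):
        # Deduplicate and sort, but keep the original casing of each word.
        words = sorted(set(self.extract_words(html)))
        return {"platform": self.platform, "source": self.source_url, "words": words}


class ExampleCodeTagScraper(Scraper):
    """Hypothetical scraper for a page that lists each word inside a <code> tag."""

    platform = "example-lang"
    source_url = "https://example.com/reserved-words"

    def extract_words(self, html):
        collector = TagTextCollector("code")
        collector.feed(html)
        collector.close()
        return collector.texts


# Inline sample instead of a network fetch, so the sketch runs offline.
SAMPLE_HTML = (
    "<ul><li><code>if</code></li><li><code>else</code></li>"
    "<li><code>if</code></li></ul>"
)
doc = ExampleCodeTagScraper().build_document(SAMPLE_HTML)
print(json.dumps(doc, indent=2))
```

With this shape, adding a new source would mean writing one small subclass per page, while fetching, deduplication, and JSON output stay in shared code. A real implementation would likely use a proper selector library rather than matching a single tag name.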

@jgornick
Owner Author

For JavaScript/Node, it may be worth checking out:

https://github.com/rchipka/node-osmosis
https://github.com/lapwinglabs/x-ray
