Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove unnecessary fields from needs info #996

Closed
chrisjsewell opened this issue Aug 26, 2023 · 9 comments
Closed

Remove unnecessary fields from needs info #996

chrisjsewell opened this issue Aug 26, 2023 · 9 comments
Assignees
Labels

Comments

@chrisjsewell
Copy link
Member

chrisjsewell commented Aug 26, 2023

There are fields that are are stored, for each need, in the env.needs_all_needs that are maybe not really necessary for further processing. These add to the size of the needs.json.

For example, jinja_content, template, pre_template and post_template are used to generate content, pre_content, and post_content in

# Template builds

content, pre_content, and post_content are then used to generate AST.

After this none of these fields are required, so I feel they should not be stored on env.needs_all_needs`

@chrisjsewell chrisjsewell self-assigned this Aug 26, 2023
@chrisjsewell
Copy link
Member Author

Thoughts @danwos?

@PhilipPartsch
Copy link
Contributor

I use always the templates. So i even check the templates for debugging or if the templates differ by a supplied needs.json.

I belive it is easier for a user to reduce the data fieldes of a json file in a post processing (reduce step in database Methodology) than we offer many configuration parameter to configure the output.

@chrisjsewell
Copy link
Member Author

So i even check the templates for debugging

Heya, I can understand you may want to use this for debugging, but then I feel this can be provided by a config flag (to include such "debug" data)

The problem is, particularly for commercial use cases, there can be many 1000s of needs, and (currently) all this data is loaded in memory, making sphinx-needs have a very high memory overhead. This is magnified if you try to run sphinx in multi-processing mode, since this data is replicated across all processes.

I belive it is easier for a user to reduce the data fieldes of a json file in a post processing

Also to note, there is already a lot of data fields removed from the internal representation before it is output to the needs.json.

@PhilipPartsch
Copy link
Contributor

I know this use case, i‘m working in really big projects. As the templates are normal options, it is necassary to get them out of sohinx-needs. So if a user wants to remove options from needs.json, i would suggest to use a post processing script for this.

@chrisjsewell
Copy link
Member Author

chrisjsewell commented Aug 27, 2023

So if a user wants to remove options from needs.json

Just to note, I'm not a user, I'm a new developer at useblocks (although with a lot of experience in the space 😄). So expect to see me doing a lot more development and maintenance of the code base.

This is definitely helpful feedback thanks; that the template field is being used, and I would be happy to meet, to understand further how you use sphinx-needs.

@danwos
Copy link
Member

danwos commented Aug 28, 2023

It's not so easy to remove fields from needs-data, as you never know the use cases and if the data may be used for filtering needs.

But I also have an eye on memory consumption, so totally understand the motivation.
Maybe it would be good to know the benefit of such a removal in terms of memory consumption.
So we should get some states about how big our needs_all_needs dict is, and maybe make a detailed analysis of a single need. Sounds like a task for Sphinx-Performance :)

@chrisjsewell
Copy link
Member Author

chrisjsewell commented Aug 28, 2023

Another thing I've just realised: links_back is ommited from the needs.json:

However, for any user defined links (from needs_extra_links), there back links are included in needs.json.
This is definitely a discontinuity, and so I would suggest these should be omitted also
(this also relates to #997)

@arwedus
Copy link
Contributor

arwedus commented Dec 6, 2023

@danwos : I upvote the suggestion in comment #996 (comment), as every little bit that gets us a smaller needs.json is good.

@chrisjsewell
Copy link
Member Author

@arwedus actually this was already done in #1053 😄

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

4 participants