
Data Inaccuracies in PCF #4237

Closed
4 of 9 tasks
azizabah opened this issue Apr 24, 2020 · 11 comments · Fixed by #4241
Labels
bug community Community Raised Issue

@azizabah

Frontend Deployment type

  • Cloud Foundry Application (cf push)
  • Kubernetes, using a helm chart
  • Docker, single container deploying all components
  • npm run start
  • Other (please specify below)

Backend (Jet Stream) Deployment type

  • Cloud Foundry Application (cf push)
  • Kubernetes, using a helm chart
  • Docker, single container deploying all components
  • Other (please specify below)

Deployed the all-in-one container to PCF using cf push.

Expected behaviour

I expected the metrics for things like running applications and memory usage to match what I see in Pivotal's Apps Manager product.

Actual behaviour

The metrics are drastically different. For example in one of our org-space combos, we have 219 Apps deployed, 117 of which are running, using up ~243 GB out of a 300 GB quota.

When I look at the same org-space combo in Stratos, I see only 26 applications using ~29 GB out of a 300 GB quota.

Steps to reproduce the behavior

Deploy container into a larger PCF foundation and examine irregularities.

Context

This level of inaccuracy would be a deal breaker on trying to replace Apps Manager with Stratos for interacting / observing PCF foundations.

Possible Implementation

@richard-cox
Contributor

@azizabah Thanks for raising this issue! To confirm: is this the Memory stat in the Organisation's Space List / the Memory Usage card on the Space summary page? For app memory consumption we show a cumulative total of every running app's memory consumption multiplied by its instance count, as reported by CF. If CF reports an app as stopped it doesn't appear in our total. Could this be the discrepancy?
I'm not sure how Apps Manager calculates memory; do you have any more information on how they arrive at their figure?
Was it just memory that was affected, or were other app-based stats different too?
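Stratos's actual aggregation code isn't shown in this thread, but the cumulative total described above (running apps only, per-instance memory times instance count) can be sketched against sample `cf apps`-style data. The rows and column layout below are illustrative assumptions, not real foundation data:

```shell
# Sketch only, not Stratos's implementation: sum per-instance memory for
# started apps. Assumed columns: name, state, instances, memory-in-MB.
printf '%s\n' \
  'api    started 2 512' \
  'worker stopped 1 256' \
  'ui     started 3 128' \
| awk '$2 == "started" { total += $3 * $4 } END { print total " MB" }'
# Stopped apps contribute nothing: 2*512 + 3*128 = 1408 MB
```

If an app the user can see is missing from the fetched list entirely (rather than merely stopped), it silently drops out of this sum, which is consistent with the undercounting reported here.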

@richard-cox richard-cox added the community Community Raised Issue label Apr 27, 2020
@azizabah
Author

@richard-cox - I don't believe there's an issue with Apps Manager, since it aligns with what I see when I execute queries from the CF CLI. These two screenshots show the discrepancies I'm seeing for the exact same org-space-foundation combos.
[Screenshot: Stratos]
[Screenshot: Pivotal Apps Manager]

@richard-cox
Contributor

Thanks for the screenshots; it looks like Stratos has failed to fetch all apps. Are there any errors reported by Stratos or in the Developer Console? Could I also confirm which version of Stratos you're running (User Icon --> About)? I'll look into it.
If possible, could you also provide the CLI queries you're running to validate the totals?

@azizabah
Author

The only error I see in the logs is the initial sign-in 401.

```
{"time":"2020-04-27T15:25:22.981900928Z","level":"ERROR","prefix":"echo","file":"main.go","line":"1071","message":"code=401, message={\"error\":\"User session could not be found\"}"}
```

When I look at the Developer Console in Chrome while viewing the Cloud Foundry -> Summary page and then the specific ci-org pages, I don't see any errors, and all requests come back with 200s.

I'm running Stratos 3.1.0.

This is what I ran in the CF CLI, and then I just looked at line counts. It aligned with what I saw in the Pivotal Apps Manager UI.

```shell
cf apps | grep started > results.text
```
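As an aside, `grep -c` (or a trailing `wc -l`) gives the count directly without a temp file. Shown here against illustrative sample output rather than a real foundation:

```shell
# grep -c counts matching lines; swap the printf for `cf apps` on a real foundation.
printf '%s\n' \
  'app-one   started  2/2  512M' \
  'app-two   stopped  0/1  256M' \
  'app-three started  1/1  128M' \
| grep -c started
# Prints 2: only the started rows match
```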

@richard-cox
Contributor

Ok, I think I understand the issue now. Is it possible that your user can see over 600 apps across all orgs? In 3.1 we've brought in a better way to handle CFs with a lot of entities (apps, orgs, spaces, etc.). This process should only apply to a limited set of lists; however, I think it's also being applied when we fetch apps to display stats such as memory. If this is the case, raising the threshold at which this process kicks in should work. If possible, could you set the Stratos cf app env var UI_LIST_MAX_SIZE to something higher than the number of applications visible to you (and restart)?
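For a cf push deployment, the env var can be set with the standard cf CLI. The app name `console` below is an assumption; substitute whatever the Stratos app is actually named in your space:

```shell
# App name "console" is assumed; run `cf apps` to find the real one.
cf set-env console UI_LIST_MAX_SIZE 2000
cf restage console   # restage so the env var change takes effect
```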

@azizabah
Author

@richard-cox I updated the UI_LIST_MAX_SIZE to 2000 and this resolved the issue so your suspicion is correct.

@richard-cox
Contributor

Wonderful! I'll get a fix for this into the next release.

@azizabah
Author

Great. Thanks @richard-cox !

@richard-cox
Contributor

FYI: this issue will be closed via #4241. For more info on the max list process, including configuration, please see https://github.com/cloudfoundry/stratos/pull/4226/files?short_path=c54e3c9#diff-c54e3c9bb73391d7bccf190d0e59f626

@azizabah
Author

@richard-cox Thanks for the link to the docs and the quick turn-around. From a UX point of view, wouldn't defaulting UI_LIST_ALLOW_LOAD_MAXED to true provide a better initial experience for users, and then let operators/admins turn it off later if needed?

I can add that as a separate issue if you agree and want that for tracking.

@richard-cox
Contributor

Previously, before this feature was introduced, we heard of cases where a CF was severely impacted when Stratos fetched all entities (on very big instances). Bringing this ability back, even gated, is something we feel the administrator should make a conscious, informed and tested decision on. That does come at the cost of 'hiding' the feature; we could do a better job of promoting it.
If there's enough feedback from the community we may flip the default, as suggested. For the moment, though, we've erred on the side of caution.


3 participants