-
Notifications
You must be signed in to change notification settings - Fork 310
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
DAOS-10625 control: Create the tool to collect the logs/config for support purpose #11094
Conversation
Bug-tracker data: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. No errors found by checkpatch.
Test stage Functional Hardware Medium Verbs Provider completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-11094/1/execution/node/1092/log |
…pport purpose. *** WORK IN PROGRESS *** 1> Adding dmg support collectlog option collect the logs/configs from servers. 2> Adding daos_server support collectlog option to get the system and server side information incase dmg Management layer is not working for some reason. 3> Adding daos_admin support collectlog option to get the client side logs and configs. --- To BE DONE --- 1> Unit testing still in progress. 2> All command which collect the system related information still needs to be added. 3> Functional testing to be done with this PR. Required-githooks: true Signed-off-by: Samir Raval <[email protected]>
727f0fd
to
8353d18
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. No errors found by checkpatch.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. No errors found by checkpatch.
Test stage Functional Hardware Medium completed with status UNSTABLE. https://build.hpdd.intel.com/job/daos-stack/job/daos//view/change-requests/job/PR-11094/3/testReport/(root)/ |
Required-githooks: true Signed-off-by: Samir Raval <[email protected]>
Required-githooks: true Signed-off-by: Samir Raval <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
some initial observations (most points refer to multiple occurrences in the PR):
if b == true can be shortened to if b
initialising to an empty value i.e. LogCollection["..."] = []string{""} it is rarely necessary as zero values are usually good enough un-initialised (one of the useful things of go)
If using string constants as keys, define them e.g. const CopyServerConfigCmd = "CopyServerConfig", not convinced this is the best use of the map though, you might be able to define enum for the keys
progress.Total = progress.Total + 1 can be shortened to progress.Total++
The code for the daos_server and dmg command variants should be consolidated into shared helper functions to reduce duplication.
src/control/cmd/dmg/support.go
Outdated
} | ||
if len(resp.GetHostErrors()) > 0 { | ||
var bld strings.Builder | ||
_ = pretty.PrintResponseErrors(resp, &bld) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
errors should always be handled
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
src/control/cmd/dmg/support.go
Outdated
} | ||
|
||
// Rsync the logs from servers | ||
hostName, _ := support.GetHostName() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
errors should always be handled
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
Required-githooks: true Signed-off-by: Samir Raval <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A lot of code! Most of my comments are Go language tips and/or naming suggestions, and can be applied in multiple places.
I haven't tried running it so can't speak to any oddities there.
type collectLogCmd struct { | ||
configCmd | ||
cmdutil.LogCmd | ||
Stop bool `short:"s" long:"Stop" description:"Stop the collectlog command on very first error"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
suggestion - Maybe "stop-on-error" for the long name? As it is, this option feels a bit confusing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
configCmd | ||
cmdutil.LogCmd | ||
Stop bool `short:"s" long:"Stop" description:"Stop the collectlog command on very first error"` | ||
TargetFolder string `short:"t" long:"loglocation" description:"Folder location where log is going to be copied"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
suggestion - "dest" or "target" for the long name
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
Stop bool `short:"s" long:"Stop" description:"Stop the collectlog command on very first error"` | ||
TargetFolder string `short:"t" long:"loglocation" description:"Folder location where log is going to be copied"` | ||
Archive bool `short:"z" long:"archive" description:"Archive the log/config files"` | ||
CustomLogs string `short:"c" long:"custom-logs" description:"Collect the Logs from given directory"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure I fully understand this option. Is this an extra source dir that the logs will be copied from? "extra-logs-dir" may be a clearer name in that case
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
} | ||
|
||
// Default 3 steps of log/conf collection. | ||
progress := support.ProgressBar{1, 3, 0, false} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A more readable model with Go is to use the struct member names inline.
progress := support.ProgressBar{1, 3, 0, false} | |
progress := support.ProgressBar{ | |
Start: 1, | |
Total: 3, | |
} |
No need to initialize the zeroes. All the members are automatically initialized to 0 unless otherwise specified.
An aside, I am not sure what the 1 means here. Shouldn't we always start at 0?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done and remove the zeros.
src/control/lib/support/log.go
Outdated
Start int // start int number | ||
Total int // end int number | ||
Steps int // Int number be increased per steps | ||
JsonOutput bool // Json option to skip progress bar if it's enabled |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
name suggestion - "NoDisplay"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
src/control/lib/support/log.go
Outdated
JsonOutput bool // Json option to skip progress bar if it's enabled | ||
} | ||
|
||
type Params struct { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
name suggestion - CollectLogsParams
Considering we may someday add other support commands.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
src/control/lib/support/log.go
Outdated
LogCmd string | ||
} | ||
|
||
type copy struct { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
copy of what? I'd suggest a different name.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
src/control/lib/support/log.go
Outdated
} | ||
|
||
// Print the progress bar during log collect command | ||
func PrintProgress(progBar *ProgressBar) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
suggestion - could make these member functions of ProgressBar.A function signature like:
func (p *ProgressBar) Display(out io.Writer)
It would be easy enough to decide within the function whether to print progress or end status.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
src/control/lib/support/log.go
Outdated
func CollectSupportLog(log logging.Logger, opts ...Params) error { | ||
switch opts[0].LogFunction { | ||
case "CopyServerConfig": | ||
return CopyServerConfig(log, opts...) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do all of these functions need to be exported? Starting with a capital letter in Go means the symbol is exported. If they are solely called from this function I'd suggest unexporting them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
Thank you and appreciated for the review comments. I will update the PR. |
Required-githooks: true Signed-off-by: Samir Raval <[email protected]>
Required-githooks: true Signed-off-by: Samir Raval <[email protected]>
Required-githooks: true Signed-off-by: Samir Raval <[email protected]>
Required-githooks: true
Required-githooks: true Signed-off-by: Samir Raval <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Style warning(s) for job https://build.hpdd.intel.com/job/daos-stack/job/daos/job/PR-11094/5/
Please review https://wiki.hpdd.intel.com/display/DC/Coding+Rules
:avocado: tags=hw,medium | ||
:avocado: tags=basic,control,dmg | ||
:avocado: tags=test_dmg_support_collect_log | ||
""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(style) trailing whitespace
Test stage checkpatch completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-11094/5/execution/node/145/log |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. No errors found by checkpatch.
Test stage Functional on EL 8 completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-11094/43/execution/node/1034/log |
Test-tag: pr control Required-githooks: true
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. No errors found by checkpatch.
Test stage Functional Hardware Medium Verbs Provider completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-11094/45/execution/node/1236/log |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While this is a huge PR, it seems low risk in that the changes are adding new functionality rather than changing existing code. I'm in support of getting it landed so that we can continue to iterate on it.
} | ||
|
||
// Get the system hostname | ||
func GetHostName() (string, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not blocking on this, but there is a Go function os.Hostname()
that can fetch the hostname without calling out to the shell.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for review. I will test and updated when will work on second iteration.
If I recall, I was trying to use that os.Hostname() but it was giving the full name including the domain name. So another option is to use the function and remove the domain from it.
Test-tag: pr control Required-githooks: true Signed-off-by: Samir Raval <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. No errors found by checkpatch.
Test stage Functional on EL 8 completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-11094/47/execution/node/1114/log |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I spent a while trying to work out whether the changes requested were addressed but tbh it's quite hard to tell, some of them definitely have been so I will approve.
Although this port copies the code from master to 2.4, there are issues related to permissions making it non-functional. This is a partial cherry-pick of the following PR's on master: DAOS-10625 control: Create the tool to collect the logs/config for support purpose (#11094) DAOS-13759 control: Update support collect-log tool. (#12906) DAOS-13763 control: Fix daos_metrics collection for support collect-log. (#12555) DAOS-13936 support: Collect the specific logs and Time range log for support (#13325) Change-Id: I168c14e177a5003c4e315595b1bf154e84cef473
Although this port copies the code from master to 2.4, there are issues related to permissions making it non-functional. This is a partial cherry-pick of the following PR's on master: DAOS-10625 control: Create the tool to collect the logs/config for support purpose (#11094) DAOS-13759 control: Update support collect-log tool. (#12906) DAOS-13763 control: Fix daos_metrics collection for support collect-log. (#12555) DAOS-13936 support: Collect the specific logs and Time range log for support (#13325) Change-Id: I168c14e177a5003c4e315595b1bf154e84cef473
Although this port copies the code from master to 2.4, there are issues related to permissions making it non-functional. This is a partial cherry-pick of the following PR's on master: DAOS-10625 control: Create the tool to collect the logs/config for support purpose (#11094) DAOS-13759 control: Update support collect-log tool. (#12906) DAOS-13763 control: Fix daos_metrics collection for support collect-log. (#12555) DAOS-13936 support: Collect the specific logs and Time range log for support (#13325) Change-Id: I168c14e177a5003c4e315595b1bf154e84cef473
Although this port copies the code from master to 2.4, there are issues related to permissions making it non-functional. This is a partial cherry-pick of the following PR's on master: DAOS-10625 control: Create the tool to collect the logs/config for support purpose (#11094) DAOS-13759 control: Update support collect-log tool. (#12906) DAOS-13763 control: Fix daos_metrics collection for support collect-log. (#12555) DAOS-13936 support: Collect the specific logs and Time range log for support (#13325) Signed-off-by: Chris Davis <[email protected]>
1> Adding dmg support collectlog option collect the logs/configs from servers.
2> Adding daos_server support collectlog option to get the system and server side information in case dmg Management layer is not working for some reason.
3> Adding daos_admin support collectlog option to get the client side logs and configs.
4> Added unit test for support lib
5> Added functional tests for dmg,daos_agent and daos_server support options
Signed-off-by: Samir Raval [email protected]
Before requesting gatekeeper:
Features:
(orTest-tag*
) commit pragma was used or there is a reason documented that there are no appropriate tags for this PR.Gatekeeper: