*[Serviceability and DEBUG](#serviceability-and-debug)
*[Warm Boot Support](#warm-boot-support)
*[Unit Test](#unit-test)
@@ -34,7 +39,8 @@
Rev | Date | Author | Change Description
:---: | :-----: | :------: | :---------
1.0 | 05/07/19 | Kalimuthu | Initial version
-2.0 | 03/08/19 | Rajendra | Review Comments
+2.0 | 08/03/19 | Rajendra | Review Comments
+2.1 | 05/17/20 | Rajendra | Defined an openconfig yang model for systemd-coredump. Added KLISH CLI commands for coredump configuration


# About this Manual
@@ -88,14 +94,14 @@ This document describes new mechanisms to manage the core files that are generat

To configure the core dump and tech-support data, export to an external server and to view the core details the following config and show commands shall be supported. It is to be noted that the tech-support data always includes the core dumps generated on the system.

-### Config commands
+### Config commands requirements

>1. Config command to enable/disable the coredump generation of processes.
>2. Config command to store the details of exporting tech-support data to an external server which includes remote server name, path, transfer protocol type and the user credentials.
>2. Config command to enable/disable the tech-support export
>3. Config command to configure the tech-support export periodic interval.

-### Show commands
+### Show commands requirements
> 1. Show commands to display the core file information
> 2. show commands to display the tech-support export information.
@@ -111,15 +117,15 @@ There should be a limit on the size of the core file generated and the space occ

The corefile management functionality is divided into two main services.

-1. Core-dump generation service.
-2. Tech-support data export service.
+1. Core-Dump Generation Service.
+2. Tech-support data export service.


-## Core-dump generation service
+## Core-Dump Generation Service

-1.Core files are usually generated when process terminates unexpectedly. Typical conditions are access violations, termination signals (except SIGKILL), etc.,
-2.ulimit configuration might prevent generation of core due to size configurations. We need to ensure this is not the case.
-3.Service restart functions - will not generate the core dump as it handle the graceful stop and start. This includes docker service restart as well.
+1. Core files are usually generated when a process terminates unexpectedly. Typical conditions are access violations, termination signals (except SIGKILL), etc.
+2. The ulimit configuration might prevent core generation because of its size settings. We need to ensure this is not the case.
+3. Service restart functions will not generate a core dump, as they handle a graceful stop and start. This includes docker service restart as well.
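
As a rough illustration of item 2, a process can inspect its own core size limit before relying on a dump being produced. This is a minimal sketch using Python's standard `resource` module. The helper names `core_limit_allows_dump` and `current_core_limit` are hypothetical and not part of SONiC, and the sketch does not cover how systemd-coredump itself treats the limit.

```python
import resource


def core_limit_allows_dump(soft_limit: int) -> bool:
    """Return True when the soft RLIMIT_CORE value does not forbid core dumps.

    A soft limit of 0 disables core files; RLIM_INFINITY means no size cap.
    """
    return soft_limit == resource.RLIM_INFINITY or soft_limit > 0


def current_core_limit() -> tuple:
    """Read the (soft, hard) core size limits of the calling process."""
    return resource.getrlimit(resource.RLIMIT_CORE)


if __name__ == "__main__":
    soft, hard = current_core_limit()
    print("soft:", soft, "hard:", hard, "allows dump:", core_limit_allows_dump(soft))
```

A check of this kind only reports the limit of the process that runs it; each service inherits its own limits from whatever launched it.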

## systemd-coredump

@@ -154,25 +160,7 @@ Current SONiC code has some basic support for generation and compression of core
>- Setting of “kernel/core_pattern” in “build_debian.sh” is removed as systemd-coredump sets this parameter.
>- A symlink /var/core is created to point to the systemd-coredump standard core file destination “var/lib/systemd/coredump”
>- “show techsupport” command is modified to capture the core files from the symlink “/var/core”. It is also modified to consider that core files are lz4 compressed instead of gz files.
-
-## Configuration commands:
-
-For SONiC switches following CLI commands will be provided to manage core files
-
-#### show core [ config | info | list ]
-
->###### **\<config>** Show coredump configuration
->###### **\<info>** Show information about one or more coredumps
->###### **\<list>** List available coredumps
-
-Display list of current core files available and their information. This is a wrapper command for the coredumpctl utility provided by systemd-coredump package.

-#### config core <enable|disable>
-
-Enable or disable coredump functionality. This configuration entry will be part of Config DB and thus can be stored as part of startup-configuration.
-
-When disabled, this command will set ProcessSizeMax=0 in the /etc/systemd/coredump.conf file. The configuration variable ProcessSizeMax specifies maximum size in bytes of a core which will be processed. By setting it to 0 core dump generation can be disabled. When enabled this command will set ProcessSizeMax to be the same value as ExternalSizeMax. The configuration variable ExternalSizeMax indicates the maximum (uncompressed) size in bytes of a core to be saved.
-
## Core Dump Event Logging

Report of available core files can be obtained using the coredumpctl utility.
@@ -224,8 +212,15 @@ When core file is generated for the same process multiple times, the framework s

The archived core file is generated in a pre-defined format by the systemd-coredump tool.
@@ -251,9 +247,23 @@ The export service is configured to monitors the coredump path for any new core

### Config DB Schema

+#### Coredump Configuration
+
+The coredump administrative mode can be stored in the Config DB as defined below. By default, coredump is enabled; if the COREDUMP Config DB table entry is missing, coredump is
+assumed to be administratively enabled.
+
+```
+"COREDUMP": {
+    "config": {
+        "enabled": "true"
+    }
+}
+```
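
The default described above (a missing COREDUMP entry means enabled) can be sketched as follows. This is only an illustration under stated assumptions: a plain Python `dict` stands in for the Config DB contents, and the function name `coredump_enabled` is hypothetical; no SONiC API is used.

```python
def coredump_enabled(config_db: dict) -> bool:
    """Return the coredump admin mode, defaulting to enabled when absent."""
    entry = config_db.get("COREDUMP", {}).get("config", {})
    # Values are stored as strings; a missing table or field means enabled.
    return entry.get("enabled", "true").lower() == "true"


if __name__ == "__main__":
    print(coredump_enabled({}))
    print(coredump_enabled({"COREDUMP": {"config": {"enabled": "false"}}}))
```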
+
+#### Tech Support Services
In order to export the tech support data, remote server details have to be configured on the device. Through CLI interface, external storage server can be configured which includes server IP, path and access information like user credentials and transport protocol. This information is stored as part of config DB.
,
->>
+```
"EXPORT": {
    "export": {
        "config": "<enable/disable>",
@@ -266,10 +276,193 @@ In order to export the tech support data, remote server details have to be confi

    }
},
+```

While configuring the export service, the remote server password is encrypted with device universally unique identifier (UUID) and stored into the config DB, so that the password can be decrypted only on the device. The protocol fields specifies the one of the file transfer protocol either SCP or SFTP. The interval field specifies the duration in which it captures the tech-support data and export it.

-## CLI commands
+## User Interface
+### Data Models
+
+Coredump configuration and status parameters are defined in the openconfig-systemd-coredump yang model. The openconfig-systemd-coredump yang model is included as an extension to the openconfig-system yang model.
+
+```
++--rw oc-sys-ext:systemd-coredump
+   +--rw oc-sys-ext:config
+   |  +--rw oc-sys-ext:enable?   boolean
+   +--ro oc-sys-ext:state
+   |  +--ro oc-sys-ext:enable?   boolean
+   +--ro oc-sys-ext:core-file-records
+      +--ro oc-sys-ext:core-file-record* [timestamp]
+         +--ro oc-sys-ext:timestamp    -> ../state/timestamp
+         +--ro oc-sys-ext:state
+            +--ro oc-sys-ext:timestamp?            oc-types:timeticks64
+            +--ro oc-sys-ext:executable?           string
+            +--ro oc-sys-ext:core-file?            string
+            +--ro oc-sys-ext:pid?                  uint64
+            +--ro oc-sys-ext:uid?                  uint32
+            +--ro oc-sys-ext:gid?                  uint32
+            +--ro oc-sys-ext:signal?               uint32
+            +--ro oc-sys-ext:command-line?         string
+            +--ro oc-sys-ext:boot-identifier?      string
+            +--ro oc-sys-ext:machine-identifier?   string
+            +--ro oc-sys-ext:crash-message?        string
+            +--ro oc-sys-ext:core-file-present?    boolean
+```
+
+
+### Show Commands
+
+The following CLI commands provide the ability to view the core files generated on the SONiC switch.
+
+#### show core config
+**Description**
+
+Display the coredump configuration. Use this command to display whether the coredump feature
+is administratively enabled or disabled.
+
+**Usage**
+
+```
+show core config
+```
+
+**Example**
+
+```
+sonic# show core config
+Coredump : Enabled
+```
+
+#### show core list
+**Description**
+
+Use this command to list a summary of the core files generated by the kernel. The following information
+about each core file is also displayed:
+- TIME: The time of the crash, as reported by the kernel in UTC
+- PID: The identifier of the process that crashed
+- SIG: The signal that caused the process to crash, when applicable
+- COREFILE: Indicates whether the captured core file exists on local disk or has been removed
+- EXE: The application executable that has crashed
+
+**Usage**
+
+```
+show core list
+```
+
+**Example**
+
+```
+sonic# show core list
+TIME                  PID SIG COREFILE EXE
+2020-05-16 11:54:33 26480  11 present  clish
+2020-05-15 01:25:16  6195  11 present  crashme
+2020-05-15 00:45:28 13604  11 present  crashme
+2020-05-14 02:11:11  3197  11 present  crashme
+2020-05-13 01:10:56 17844  11 missing  crashme
+2020-05-13 01:10:55 17728  11 present  crashme
+```
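
For readers who want to post-process this listing, the sketch below parses text shaped like the example output into records. It is an assumption-laden illustration rather than part of the proposed CLI: the names `CoreSummary` and `parse_core_list` are hypothetical, and it assumes the EXE column contains no spaces and that every data row has exactly the five columns shown above.

```python
from typing import NamedTuple


class CoreSummary(NamedTuple):
    time: str      # "YYYY-MM-DD HH:MM:SS"
    pid: int
    sig: int
    corefile: str  # "present" or "missing"
    exe: str


def parse_core_list(output: str) -> list:
    """Parse 'show core list' style text into CoreSummary records."""
    records = []
    for line in output.splitlines():
        fields = line.split()
        # Skip the prompt line, the header, and anything malformed.
        if len(fields) != 6 or not fields[2].isdigit():
            continue
        date, clock, pid, sig, corefile, exe = fields
        records.append(CoreSummary(f"{date} {clock}", int(pid), int(sig), corefile, exe))
    return records


SAMPLE = """sonic# show core list
TIME PID SIG COREFILE EXE
2020-05-16 11:54:33 26480 11 present clish
2020-05-13 01:10:56 17844 11 missing crashme
"""

if __name__ == "__main__":
    for rec in parse_core_list(SAMPLE):
        print(rec)
```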
+
+#### show core info
+**Description**
+
+Use this command to display detailed information about a crash that has occurred in the system. This command
+takes a process ID or executable name as input to search and display the corresponding crash information. If multiple
+core files are found which satisfy the match condition, information of all core files is displayed.
+
+The following information about matching core files is displayed:
+- Time: The time of the crash, as reported by the kernel in UTC
+- Executable: The full path to the application executable that has crashed
+- Core File: The file name of the application core dump of the executable that has crashed
+- PID: The identifier of the process that crashed
+- User ID: The user identifier of the process that crashed
+- Group ID: The group identifier of the process that crashed
+- Signal: The signal that caused the process to crash, when applicable
+- Command Line: The command line arguments of the process that crashed
+- Boot ID: The unique identifier of the local system that is generated and set on each system boot up event
+- Machine ID: The unique machine identifier of the local system that is set during installation
+- Core File Found: Indicates whether the captured core file exists on local disk or has been removed
+- Crash Message: A copy of the application stack trace information of the process that crashed
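
The matching behaviour described for `show core info` (search by process ID or executable name, return every match) might be sketched as below. This is not the command's implementation. The `CoreRecord` type, the `match_cores` function, the exact-match rule, and the `/usr/bin/...` paths in the usage lines are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class CoreRecord:
    pid: int
    executable: str          # full path to the crashed executable
    signal: int
    core_file_present: bool


def match_cores(records, key: str):
    """Return every record whose PID or executable name equals key."""
    matches = []
    for rec in records:
        exe_name = rec.executable.rsplit("/", 1)[-1]
        if key == str(rec.pid) or key in (rec.executable, exe_name):
            matches.append(rec)
    return matches


if __name__ == "__main__":
    # Hypothetical records; the paths are illustrative only.
    cores = [
        CoreRecord(26480, "/usr/bin/clish", 11, True),
        CoreRecord(6195, "/usr/bin/crashme", 11, True),
        CoreRecord(17844, "/usr/bin/crashme", 11, False),
    ]
    print(match_cores(cores, "crashme"))
    print(match_cores(cores, "26480"))
```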