feat(scan): consider .gitignore to automatically exclude paths by default #5506

Merged
merged 13 commits into from
Jul 29, 2022
4 changes: 4 additions & 0 deletions docs/commands.md
@@ -52,6 +52,7 @@ Flags:
cannot be provided with query inclusion flags
can be provided multiple times or as a comma separated string
example: 'Access control,Best practices'
--exclude-gitignore disables the exclusion of paths specified within .gitignore file
-e, --exclude-paths strings exclude paths from scan
supports glob and can be provided multiple times or as a quoted comma separated string
example: './shouldNotScan/*,somefile.txt'
@@ -107,6 +108,9 @@ Global Flags:

The other commands have no further options.

## Exclude Paths
By default, KICS excludes the paths listed in the `.gitignore` file at the root of the repository. To disable this behavior, use the `--exclude-gitignore` flag.

## Library Flag Usage

As mentioned above, the library flag (`-b` or `--libraries-path`) refers to the directory with libraries. The functions need to be grouped by platform, and the library file name should follow the format `<platform>.rego` to be loaded by KICS. The directory structure does not matter. For example, if you want to indicate a directory that contains a library for your Terraform queries, group the functions used in those queries in a file named `terraform.rego`, placed wherever you want.
1 change: 1 addition & 0 deletions docs/dockerhub.md
@@ -62,6 +62,7 @@ Flags:
cannot be provided with query inclusion flags
can be provided multiple times or as a comma separated string
example: 'Access control,Best practices'
--exclude-gitignore disables the exclusion of paths specified within .gitignore file
-e, --exclude-paths strings exclude paths from scan
supports glob and can be provided multiple times or as a quoted comma separated string
example: './shouldNotScan/*,somefile.txt'
1 change: 1 addition & 0 deletions e2e/fixtures/assets/scan_help
@@ -11,6 +11,7 @@ Flags:
cannot be provided with query inclusion flags
can be provided multiple times or as a comma separated string
example: 'Access control,Best practices'
--exclude-gitignore disables the exclusion of paths specified within .gitignore file
-e, --exclude-paths strings exclude paths from scan
supports glob and can be provided multiple times or as a quoted comma separated string
example: './shouldNotScan/*,somefile.txt'
1 change: 1 addition & 0 deletions go.mod
@@ -355,6 +355,7 @@ require (
github.com/russross/blackfriday v1.5.2 // indirect
github.com/ruudk/golang-pdf417 v0.0.0-20181029194003-1af4ab5afa58 // indirect
github.com/ryanuber/go-glob v1.0.0 // indirect
github.com/sabhiram/go-gitignore v0.0.0-20210923224102-525f6e181f06
github.com/shopspring/decimal v1.2.0 // indirect
github.com/sirupsen/logrus v1.8.1 // indirect
github.com/sourcegraph/jsonrpc2 v0.0.0-20210201082850-366fbb520750 // indirect
2 changes: 2 additions & 0 deletions go.sum
@@ -1547,6 +1547,8 @@ github.com/ruudk/golang-pdf417 v0.0.0-20181029194003-1af4ab5afa58/go.mod h1:6lfF
github.com/ryanuber/columnize v0.0.0-20160712163229-9b3edd62028f/go.mod h1:sm1tb6uqfes/u+d4ooFouqFdy9/2g9QGwK3SQygK0Ts=
github.com/ryanuber/go-glob v1.0.0 h1:iQh3xXAumdQ+4Ufa5b25cRpC5TYKlno6hsv6Cb3pkBk=
github.com/ryanuber/go-glob v1.0.0/go.mod h1:807d1WSdnB0XRJzKNil9Om6lcp/3a0v4qIHxIXzX/Yc=
github.com/sabhiram/go-gitignore v0.0.0-20210923224102-525f6e181f06 h1:OkMGxebDjyw0ULyrTYWeN0UNCCkmCWfjPnIA2W6oviI=
github.com/sabhiram/go-gitignore v0.0.0-20210923224102-525f6e181f06/go.mod h1:+ePHsJ1keEjQtpvf9HHw0f4ZeJ0TLRsxhunSI2hYJSs=
github.com/safchain/ethtool v0.0.0-20210803160452-9aa261dae9b1/go.mod h1:Z0q5wiBQGYcxhMZ6gUqHn6pYNLypFAvaL3UvgZLR0U4=
github.com/satori/go.uuid v1.2.0/go.mod h1:dA0hQrYB0VpLJoorglMZABFdXlWrHn1NEOzdhQKdks0=
github.com/sclevine/agouti v3.0.0+incompatible/go.mod h1:b4WX9W9L1sfQKXeJf1mUTLZKJ48R1S7H23Ji7oFO5Bw=
6 changes: 6 additions & 0 deletions internal/console/assets/scan-flags.json
@@ -185,5 +185,11 @@
"defaultValue": "",
"usage": "case insensitive list of platform types to scan\n(${supportedPlatforms})",
"validation": "validateMultiStrEnum"
},
"exclude-gitignore": {
"flagType": "bool",
"shorthandFlag": "",
"defaultValue": "false",
"usage": "disables the exclusion of paths specified within .gitignore file"
}
}
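The flag name can be read two ways, so a small sketch of its semantics may help: the default `false` means the `.gitignore` file is honored, and passing `--exclude-gitignore` turns that off. The sketch below uses Go's standard `flag` package as a stand-in for the flag machinery KICS actually uses, and `parseExcludeGitIgnore` is a hypothetical helper, not a function from this PR.

```go
package main

import (
	"flag"
	"fmt"
)

// parseExcludeGitIgnore is a hypothetical helper (not part of KICS).
// It registers a bool flag with the same name, default, and usage text
// as the scan-flags.json entry and returns the parsed value.
func parseExcludeGitIgnore(args []string) (bool, error) {
	fs := flag.NewFlagSet("scan", flag.ContinueOnError)
	disabled := fs.Bool(
		"exclude-gitignore",
		false, // default: the .gitignore file is honored
		"disables the exclusion of paths specified within .gitignore file",
	)
	if err := fs.Parse(args); err != nil {
		return false, err
	}
	return *disabled, nil
}

func main() {
	for _, args := range [][]string{{}, {"--exclude-gitignore"}} {
		disabled, err := parseExcludeGitIgnore(args)
		if err != nil {
			fmt.Println("error:", err)
			return
		}
		// The scan honors .gitignore only when the flag is NOT set.
		fmt.Printf("args=%v honors .gitignore: %t\n", args, !disabled)
	}
}
```

Read this way, the flag is an opt-out switch: omitting it keeps the new default behavior introduced by this PR.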
1 change: 1 addition & 0 deletions internal/console/flags/scan_flags.go
@@ -31,4 +31,5 @@ const (
LineInfoPayloadFlag = "payload-lines"
DisableSecretsFlag = "disable-secrets"
SecretsRegexesPathFlag = "secrets-regexes-path" //nolint:gosec
ExcludeGitIgnore = "exclude-gitignore"
)
1 change: 1 addition & 0 deletions internal/console/scan.go
@@ -136,6 +136,7 @@ func getScanParameters(changedDefaultQueryPath, changedDefaultLibrariesPath bool
ChangedDefaultLibrariesPath: changedDefaultLibrariesPath,
ChangedDefaultQueryPath: changedDefaultQueryPath,
BillOfMaterials: flags.GetBoolFlag(flags.BomFlag),
ExcludeGitIgnore: flags.GetBoolFlag(flags.ExcludeGitIgnore),
}

return &scanParams
45 changes: 39 additions & 6 deletions pkg/analyzer/analyzer.go
@@ -14,6 +14,7 @@ import (
"github.com/Checkmarx/kics/pkg/utils"
"github.com/pkg/errors"
"github.com/rs/zerolog/log"
ignore "github.com/sabhiram/go-gitignore"

yamlParser "gopkg.in/yaml.v3"
)
@@ -115,6 +116,15 @@ type analyzerInfo struct {
filePath string
}

// Analyzer keeps all the relevant info for the function Analyze
type Analyzer struct {
Paths []string
Types []string
Exc []string
GitIgnoreFileName string
ExcludeGitIgnore bool
}

// types is a map that contains the regex by type
var types = map[string]regexSlice{
"openapi": {
@@ -213,7 +223,7 @@ var types = map[string]regexSlice{

// Analyze will go through the slice paths given and determine what type of queries
// should be loaded based on the extension of the file and the content
func Analyze(paths, types, exc []string) (model.AnalyzedPaths, error) {
func Analyze(a *Analyzer) (model.AnalyzedPaths, error) {
// start metrics for file analyzer
metrics.Metric.Start("file_type_analyzer")
returnAnalyzedPaths := model.AnalyzedPaths{
@@ -225,9 +235,11 @@ func Analyze(paths, types, exc []string) (model.AnalyzedPaths, error) {
var wg sync.WaitGroup
// results is the channel shared by the workers that contains the types found
results := make(chan string)
ignoreFiles := make([]string, 0)
hasGitIgnoreFile, gitIgnore := shouldConsiderGitIgnoreFile(a.Paths[0], a.GitIgnoreFileName, a.ExcludeGitIgnore)

// get all the files inside the given paths
for _, path := range paths {
for _, path := range a.Paths {
if _, err := os.Stat(path); err != nil {
return returnAnalyzedPaths, errors.Wrap(err, "failed to analyze path")
}
@@ -238,7 +250,12 @@ func Analyze(paths, types, exc []string) (model.AnalyzedPaths, error) {

ext := utils.GetExtension(path)

if _, ok := possibleFileTypes[ext]; ok && !isExcludedFile(path, exc) {
if hasGitIgnoreFile && gitIgnore.MatchesPath(path) {
ignoreFiles = append(ignoreFiles, path)
a.Exc = append(a.Exc, path)
}

if _, ok := possibleFileTypes[ext]; ok && !isExcludedFile(path, a.Exc) {
files = append(files, path)
}

@@ -251,15 +268,15 @@ func Analyze(paths, types, exc []string) (model.AnalyzedPaths, error) {
// unwanted is the channel shared by the workers that contains the unwanted files that the parser will ignore
unwanted := make(chan string, len(files))

for i := range types {
types[i] = strings.ToLower(types[i])
for i := range a.Types {
a.Types[i] = strings.ToLower(a.Types[i])
}

for _, file := range files {
wg.Add(1)
// analyze the files concurrently
a := &analyzerInfo{
typesFlag: types,
typesFlag: a.Types,
filePath: file,
}
go a.worker(results, unwanted, &wg)
@@ -276,6 +293,7 @@ func Analyze(paths, types, exc []string) (model.AnalyzedPaths, error) {

availableTypes := createSlice(results)
unwantedPaths := createSlice(unwanted)
unwantedPaths = append(unwantedPaths, ignoreFiles...)
returnAnalyzedPaths.Types = availableTypes
returnAnalyzedPaths.Exc = unwantedPaths
// stop metrics for file analyzer
@@ -500,3 +518,18 @@ func isExcludedFile(path string, exc []string) bool {
}
return false
}

// shouldConsiderGitIgnoreFile verifies if the scan should exclude the files according to the .gitignore file
func shouldConsiderGitIgnoreFile(path, gitIgnore string, excludeGitIgnoreFile bool) (bool, *ignore.GitIgnore) {
gitIgnorePath := filepath.ToSlash(filepath.Join(path, gitIgnore))
_, err := os.Stat(gitIgnorePath)

if !excludeGitIgnoreFile && err == nil {
gitIgnore, _ := ignore.CompileIgnoreFile(gitIgnorePath)
if gitIgnore != nil {
log.Info().Msgf(".gitignore file was found in '%s' and it will be used to automatically exclude paths", path)
return true, gitIgnore
}
}
return false, nil
}
101 changes: 75 additions & 26 deletions pkg/analyzer/analyzer_test.go
@@ -10,77 +10,126 @@ import (

func TestAnalyzer_Analyze(t *testing.T) {
tests := []struct {
name string
paths []string
wantTypes []string
wantExclude []string
wantErr bool
name string
paths []string
wantTypes []string
wantExclude []string
wantErr bool
gitIgnoreFileName string
excludeGitIgnore bool
}{
{
name: "analyze_test_dir_single_path",
paths: []string{filepath.FromSlash("../../test/fixtures/analyzer_test")},
wantTypes: []string{"dockerfile", "googledeploymentmanager", "cloudformation", "crossplane", "knative", "kubernetes", "openapi", "terraform", "ansible", "azureresourcemanager", "dockercompose"},
wantExclude: []string{},
wantErr: false,
gitIgnoreFileName: "",
excludeGitIgnore: false,
},
{
name: "analyze_test_helm_single_path",
paths: []string{filepath.FromSlash("../../test/fixtures/analyzer_test/helm")},
wantTypes: []string{"kubernetes"},
wantExclude: []string{},
wantErr: false,
name: "analyze_test_helm_single_path",
paths: []string{filepath.FromSlash("../../test/fixtures/analyzer_test/helm")},
wantTypes: []string{"kubernetes"},
wantExclude: []string{},
wantErr: false,
gitIgnoreFileName: "",
excludeGitIgnore: false,
},
{
name: "analyze_test_multiple_path",
paths: []string{
filepath.FromSlash("../../test/fixtures/analyzer_test/Dockerfile"),
filepath.FromSlash("../../test/fixtures/analyzer_test/terraform.tf")},
wantTypes: []string{"dockerfile", "terraform"},
wantExclude: []string{},
wantErr: false,
wantTypes: []string{"dockerfile", "terraform"},
wantExclude: []string{},
wantErr: false,
gitIgnoreFileName: "",
excludeGitIgnore: false,
},
{
name: "analyze_test_multi_checks_path",
paths: []string{
filepath.FromSlash("../../test/fixtures/analyzer_test/openAPI_test")},
wantTypes: []string{"openapi"},
wantExclude: []string{},
wantErr: false,
wantTypes: []string{"openapi"},
wantExclude: []string{},
wantErr: false,
gitIgnoreFileName: "",
excludeGitIgnore: false,
},
{
name: "analyze_test_error_path",
paths: []string{
filepath.FromSlash("../../test/fixtures/analyzer_test/Dockserfile"),
filepath.FromSlash("../../test/fixtures/analyzer_test/terraform.tf")},
wantTypes: []string{},
wantExclude: []string{},
wantErr: true,
wantTypes: []string{},
wantExclude: []string{},
wantErr: true,
gitIgnoreFileName: "",
excludeGitIgnore: false,
},
{
name: "analyze_test_unwanted_path",
paths: []string{
filepath.FromSlash("../../test/fixtures/type-test01/template01/metadata.json"),
},
wantTypes: []string{},
wantExclude: []string{filepath.FromSlash("../../test/fixtures/type-test01/template01/metadata.json")},
wantErr: false,
wantTypes: []string{},
wantExclude: []string{filepath.FromSlash("../../test/fixtures/type-test01/template01/metadata.json")},
wantErr: false,
gitIgnoreFileName: "",
excludeGitIgnore: false,
},
{
name: "analyze_test_tfplan",
paths: []string{
filepath.FromSlash("../../test/fixtures/tfplan"),
},
wantTypes: []string{"terraform"},
wantExclude: []string{},
wantErr: false,
wantTypes: []string{"terraform"},
wantExclude: []string{},
wantErr: false,
gitIgnoreFileName: "",
excludeGitIgnore: false,
},
{
name: "analyze_test_considering_ignore_file",
paths: []string{
filepath.FromSlash("../../test/fixtures/gitignore"),
},
wantTypes: []string{"kubernetes"},
wantExclude: []string{filepath.FromSlash("../../test/fixtures/gitignore/positive.dockerfile"),
filepath.FromSlash("../../test/fixtures/gitignore/secrets.tf"),
filepath.FromSlash("../../test/fixtures/gitignore/gitignore")},
wantErr: false,
gitIgnoreFileName: "gitignore",
excludeGitIgnore: false,
},
{
name: "analyze_test_not_considering_ignore_file",
paths: []string{
filepath.FromSlash("../../test/fixtures/gitignore"),
},
wantTypes: []string{"dockerfile", "kubernetes", "terraform"},
wantExclude: []string{filepath.FromSlash("../../test/fixtures/gitignore/gitignore")},
wantErr: false,
gitIgnoreFileName: "gitignore",
excludeGitIgnore: true,
},
}

for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
types := []string{""}
exc := []string{""}
got, err := Analyze(tt.paths, types, exc)

analyzer := &Analyzer{
Paths: tt.paths,
Types: types,
Exc: exc,
ExcludeGitIgnore: tt.excludeGitIgnore,
GitIgnoreFileName: tt.gitIgnoreFileName,
}

got, err := Analyze(analyzer)
if (err != nil) != tt.wantErr {
t.Errorf("Analyze = %v, wantErr = %v", err, tt.wantErr)
}
1 change: 1 addition & 0 deletions pkg/scan/client.go
@@ -41,6 +41,7 @@ type Parameters struct {
ChangedDefaultLibrariesPath bool
ScanID string
BillOfMaterials bool
ExcludeGitIgnore bool
}

// Client represents a scan client
22 changes: 13 additions & 9 deletions pkg/scan/utils.go
@@ -41,12 +41,16 @@ func (c *Client) prepareAndAnalyzePaths() (provider.ExtractedPath, error) {

log.Info().Msgf("Total files in the project: %d", getTotalFiles(allPaths.Path))

pathTypes, errAnalyze :=
analyzePaths(
allPaths.Path,
c.ScanParams.Platform,
c.ScanParams.ExcludePaths,
)
a := &analyzer.Analyzer{
Paths: allPaths.Path,
Types: c.ScanParams.Platform,
Exc: c.ScanParams.ExcludePaths,
GitIgnoreFileName: ".gitignore",
ExcludeGitIgnore: c.ScanParams.ExcludeGitIgnore,
}

pathTypes, errAnalyze := analyzePaths(a)

if errAnalyze != nil {
return provider.ExtractedPath{}, errAnalyze
}
@@ -142,20 +146,20 @@ func resolvePath(flagContent, flagName string) (string, error) {
// analyzePaths will analyze the paths to scan to determine which type of queries to load
// and which files should be ignored, it then updates the types and exclude flags variables
// with the results found
func analyzePaths(paths, types, exclude []string) (model.AnalyzedPaths, error) {
func analyzePaths(a *analyzer.Analyzer) (model.AnalyzedPaths, error) {
var err error
var pathsFlag model.AnalyzedPaths
excluded := make([]string, 0)

pathsFlag, err = analyzer.Analyze(paths, types, exclude)
pathsFlag, err = analyzer.Analyze(a)
if err != nil {
log.Err(err)
return model.AnalyzedPaths{}, err
}

logLoadingQueriesType(pathsFlag.Types)

excluded = append(excluded, exclude...)
excluded = append(excluded, a.Exc...)
excluded = append(excluded, pathsFlag.Exc...)
pathsFlag.Exc = excluded
return pathsFlag, nil
3 changes: 3 additions & 0 deletions test/fixtures/gitignore/gitignore
@@ -0,0 +1,3 @@
*.dockerfile

*.tf