Add support for checking for zeroed files #88

Merged 1 commit on Oct 31, 2020
22 changes: 12 additions & 10 deletions README.md
@@ -14,11 +14,11 @@ Other tools are usually written in C/C++ for high performance but still need to
But the most important thing for me was to learn Rust and create a program useful for the open source community.

## Features
- Written in fast and memory safe Rust
- Written in memory safe Rust
- Amazingly fast - due using more or less advanced algorithms
- CLI frontend, very fast and powerful with rich help
- GUI GTK frontend - uses modern GTK 3 and looks similar to FSlint
- Light/Dark theme match the appearance of the system
- GUI Orbtk frontend(Very early WIP) - alternative GUI with reduced functionality
- Saving results to a file - allows reading entries found by the tool easily
- Rich search option - allows setting absolute included and excluded directories, set of allowed file extensions or excluded items with * wildcard
- Clean Glade file in which UI can be easily modernized
@@ -29,6 +29,7 @@ But the most important thing for me was to learn Rust and create a program usefu
- Empty Files - Looks for empty files across disk
- Temporary Files - Allows finding temporary files
- Similar Files - Finds files which are not exactly the same
- Zeroed Files - Find files which are filled with zeros(usually corrupted)

## Usage and requirements

@@ -49,7 +50,7 @@ cargo install czkawka_gui
```
You can update package by typing same command.

### Snap, Flatpak
### Snap, Flatpak
Maybe someday

### Debian/Ubuntu repository and PPA
@@ -58,7 +59,7 @@ Tried to setup it, but for now I have problems described in this issue
https://salsa.debian.org/rust-team/debcargo-conf/-/issues/21


### AUR - Arch Linux Package(unofficial)
### AUR - Arch Linux Package (unofficial)
Czkawka is also available in Arch Linux's AUR from which it can be easily downloaded and installed on the system.
```
yay -Syu czkawka-git
@@ -106,7 +107,7 @@ cargo run --bin czkawka_cli
```
![CLI](https://user-images.githubusercontent.com/41945903/93716816-0bbcfd80-fb72-11ea-8d31-4c87cc2abe6d.png)

## Speed
## Benchmarks
Since Czkawka is written in Rust and aims to be a faster alternative to FSlint (written in Python), we need to compare the speed of these tools.

I prepared a directory and performed a test without any folder exceptions(I removed all directories from FSlint and Czkawka from other tabs than Include Directory) which contained 320004 files and 36902 folders and 108844 duplicated files in 34475 groups which took 4.53 GB.
@@ -121,8 +122,8 @@ DupeGuru after selecting files, froze at 45% for ~15 minutes, so I just kill it.
|:----------:|:-------------:|
| FSlint 2.4.7 (First Run)| 255s |
| FSlint 2.4.7 (Second Run)| 126s |
| Czkawka 1.2.2 (First Run) | 150s |
| Czkawka 1.2.2 (Second Run) | 107s |
| Czkawka 1.3.0 (First Run) | 150s |
| Czkawka 1.3.0 (Second Run) | 107s |
| DupeGuru 4.0.4 (First Run) | - |
| DupeGuru 4.0.4 (Second Run) | - |

@@ -133,21 +134,21 @@ To not get Dupeguru crash I checked smaller directory with 217986 files and 4188
| App| Idle Ram | Max Operational Ram Usage | Stabilized after search |
|:----------:|:-------------:|:-------------:|:-------------:|
| FSlint 2.4.7 | 54 MB | 120 MB | 117 MB |
| Czkawka 1.2.2 | 8 MB | 42 MB | 41 MB |
| Czkawka 1.3.0 | 8 MB | 42 MB | 41 MB |
| DupeGuru 4.0.4 | 110 MB | 637 MB | 602 MB |

Similar Images which check 386 files which takes 1,9GB

| App| Scan time |
|:----------:|:-------------:|
| Czkawka 1.2.2 | 267s |
| Czkawka 1.3.0 | 267s |
| DupeGuru 4.0.4 | 75s |

Similar Images which check 5018 files which takes 389MB

| App| Scan time |
|:----------:|:-------------:|
| Czkawka 1.2.2 | 45s |
| Czkawka 1.3.0 | 45s |
| DupeGuru 4.0.4 | 87s |

So still is a big room for improvements.
@@ -166,6 +167,7 @@ So still is a big room for improvements.
| Temporary files | X | X | |
| Big files | X | | |
| Similar images | X | | X |
| Zeroed Files| X | | |
| Checking files EXIF| | | X |
| Installed packages | | X | |
| Invalid names | | X | |
22 changes: 21 additions & 1 deletion czkawka_cli/src/commands.rs
@@ -98,6 +98,25 @@ pub enum Commands {
        #[structopt(flatten)]
        not_recursive: NotRecursive,
    },
    #[structopt(name = "zeroed", about = "Finds zeroed files", help_message = HELP_MESSAGE, after_help = "EXAMPLE:\n czkawka zeroed -d /home/rafal -e /home/rafal/Pulpit -f results.txt")]
    ZeroedFiles {
        #[structopt(flatten)]
        directories: Directories,
        #[structopt(flatten)]
        excluded_directories: ExcludedDirectories,
        #[structopt(flatten)]
        excluded_items: ExcludedItems,
        #[structopt(flatten)]
        allowed_extensions: AllowedExtensions,
        #[structopt(short = "D", long, help = "Delete found files")]
        delete_files: bool,
        #[structopt(flatten)]
        file_to_save: FileToSave,
        #[structopt(flatten)]
        not_recursive: NotRecursive,
        #[structopt(short, long, parse(try_from_str = parse_minimal_file_size), default_value = "1024", help = "Minimum size in bytes", long_help = "Minimum size of checked files in bytes, assigning bigger value may speed up searching")]
        minimal_file_size: u64,
    },
}

#[derive(Debug, StructOpt)]
@@ -207,4 +226,5 @@ EXAMPLES:
{bin} big -d /home/rafal/ /home/piszczal -e /home/rafal/Roman -n 25 -x VIDEO -f results.txt
{bin} empty-files -d /home/rafal /home/szczekacz -e /home/rafal/Pulpit -R -f results.txt
{bin} temp -d /home/rafal/ -E */.git */tmp* *Pulpit -f results.txt -D
{bin} image -d /home/rafal -e /home/rafal/Pulpit -f results.txt"#;
{bin} image -d /home/rafal -e /home/rafal/Pulpit -f results.txt
{bin} zeroed -d /home/rafal -e /home/rafal/Pulpit -f results.txt"#;
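
The new `minimal_file_size` option is parsed with `parse(try_from_str = parse_minimal_file_size)`, but that helper is not part of this diff. structopt accepts any function of the form `fn(&str) -> Result<T, E>` here, so a minimal sketch of such a validator could look like the following. The body, the error messages, and the decision to reject zero are assumptions made for illustration, not the project's actual code.

```rust
/// Hypothetical validator for the `--minimal-file-size` argument.
/// Assumption: a size of 0 is rejected so that empty files are never scanned.
fn parse_minimal_file_size(src: &str) -> Result<u64, String> {
    match src.trim().parse::<u64>() {
        // Zero would let empty files through, so treat it as invalid input.
        Ok(0) => Err(String::from("Minimum file size must be positive")),
        Ok(size) => Ok(size),
        // Negative numbers and non-numeric text both fail u64 parsing.
        Err(e) => Err(format!("Invalid minimum file size '{}': {}", src, e)),
    }
}
```

With a validator of this shape, the default value `"1024"` parses to `1024`, while inputs such as `0`, `-5`, or `abc` are rejected before any scan starts.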
38 changes: 38 additions & 0 deletions czkawka_cli/src/main.rs
@@ -12,6 +12,7 @@ use czkawka_core::{
    empty_folder::EmptyFolder,
    similar_files::SimilarImages,
    temporary::{self, Temporary},
    zeroed::{self, ZeroedFiles},
};
use std::{path::PathBuf, process};
use structopt::StructOpt;
@@ -208,5 +209,42 @@ fn main() {
            sf.print_results();
            sf.get_text_messages().print_messages();
        }

        Commands::ZeroedFiles {
            directories,
            excluded_directories,
            excluded_items,
            allowed_extensions,
            delete_files,
            file_to_save,
            not_recursive,
            minimal_file_size,
        } => {
            let mut zf = ZeroedFiles::new();

            zf.set_included_directory(path_list_to_str(directories.directories));
            zf.set_excluded_directory(path_list_to_str(excluded_directories.excluded_directories));
            zf.set_excluded_items(path_list_to_str(excluded_items.excluded_items));
            zf.set_allowed_extensions(allowed_extensions.allowed_extensions.join(","));
            zf.set_minimal_file_size(minimal_file_size);
            zf.set_recursive_search(!not_recursive.not_recursive);

            if delete_files {
                zf.set_delete_method(zeroed::DeleteMethod::Delete);
            }

            zf.find_zeroed_files(None);

            if let Some(file_name) = file_to_save.file_name() {
                if !zf.save_results_to_file(file_name) {
                    zf.get_text_messages().print_messages();
                    process::exit(1);
                }
            }

            #[cfg(not(debug_assertions))] // This will show too much probably unnecessary data to debug, comment line only if needed
            zf.print_results();
            zf.get_text_messages().print_messages();
        }
    }
}
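
The body of `find_zeroed_files` lives in `czkawka_core/src/zeroed.rs`, which is not shown in this part of the diff. As a rough illustration of what "filled with zeros" means in practice, the sketch below streams a file in fixed-size chunks and stops at the first non-zero byte. The function names `is_all_zeros` and `file_is_zeroed`, the 64 KiB buffer, and the way the size threshold is applied are all assumptions made for this example, not the project's implementation.

```rust
use std::fs::File;
use std::io::{self, Read};
use std::path::Path;

/// Returns Ok(true) when every byte produced by `reader` is zero.
/// Reads in fixed-size chunks and stops at the first non-zero byte.
/// Note: an empty input also yields Ok(true), since no non-zero byte is seen.
pub fn is_all_zeros<R: Read>(mut reader: R) -> io::Result<bool> {
    let mut buffer = [0u8; 64 * 1024]; // chunk size is an arbitrary choice
    loop {
        let read = match reader.read(&mut buffer) {
            Ok(0) => return Ok(true), // end of input reached
            Ok(n) => n,
            // A read interrupted by a signal is safe to retry.
            Err(e) if e.kind() == io::ErrorKind::Interrupted => continue,
            Err(e) => return Err(e),
        };
        if buffer[..read].iter().any(|&byte| byte != 0) {
            return Ok(false);
        }
    }
}

/// Hypothetical wrapper: files smaller than `minimal_file_size` are skipped
/// (reported as not zeroed) without reading their contents.
pub fn file_is_zeroed(path: &Path, minimal_file_size: u64) -> io::Result<bool> {
    let file = File::open(path)?;
    if file.metadata()?.len() < minimal_file_size {
        return Ok(false);
    }
    is_all_zeros(file)
}
```

Checking the size first is cheap because it only needs file metadata, which fits the help text's remark that a larger minimum size may speed up searching. Returning early on the first non-zero byte also means that ordinary files are usually rejected after a single chunk.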
1 change: 1 addition & 0 deletions czkawka_core/src/lib.rs
@@ -11,5 +11,6 @@ pub mod common_items;
pub mod common_messages;
pub mod common_traits;
pub mod similar_files;
pub mod zeroed;

pub const CZKAWKA_VERSION: &str = env!("CARGO_PKG_VERSION");