Commit 777ac50

Add support for checking for zeroed files (#88)
1 parent 7112ff6 commit 777ac50

10 files changed, +769 -15 lines changed

README.md

+12-10
@@ -14,11 +14,11 @@ Other tools are usually written in C/C++ for high performance but still need to
 But the most important thing for me was to learn Rust and create a program useful for the open source community.

 ## Features
-- Written in fast and memory safe Rust
+- Written in memory safe Rust
+- Amazingly fast - due using more or less advanced algorithms
 - CLI frontend, very fast and powerful with rich help
 - GUI GTK frontend - uses modern GTK 3 and looks similar to FSlint
 - Light/Dark theme match the appearance of the system
-- GUI Orbtk frontend(Very early WIP) - alternative GUI with reduced functionality
 - Saving results to a file - allows reading entries found by the tool easily
 - Rich search option - allows setting absolute included and excluded directories, set of allowed file extensions or excluded items with * wildcard
 - Clean Glade file in which UI can be easily modernized
@@ -29,6 +29,7 @@ But the most important thing for me was to learn Rust and create a program usefu
 - Empty Files - Looks for empty files across disk
 - Temporary Files - Allows finding temporary files
 - Similar Files - Finds files which are not exactly the same
+- Zeroed Files - Find files which are filled with zeros(usually corrupted)

 ## Usage and requirements

@@ -49,7 +50,7 @@ cargo install czkawka_gui
 ```
 You can update package by typing same command.

-### Snap, Flatpak
+### Snap, Flatpak
 Maybe someday

 ### Debian/Ubuntu repository and PPA
@@ -58,7 +59,7 @@ Tried to setup it, but for now I have problems described in this issue
 https://salsa.debian.org/rust-team/debcargo-conf/-/issues/21


-### AUR - Arch Linux Package(unofficial)
+### AUR - Arch Linux Package (unofficial)
 Czkawka is also available in Arch Linux's AUR from which it can be easily downloaded and installed on the system.
 ```
 yay -Syu czkawka-git
@@ -106,7 +107,7 @@ cargo run --bin czkawka_cli
 ```
 ![CLI](https://user-images.githubusercontent.com/41945903/93716816-0bbcfd80-fb72-11ea-8d31-4c87cc2abe6d.png)

-## Speed
+## Benchmarks
 Since Czkawka is written in Rust and aims to be a faster alternative to FSlint (written in Python), we need to compare the speed of these tools.

 I prepared a directory and performed a test without any folder exceptions(I removed all directories from FSlint and Czkawka from other tabs than Include Directory) which contained 320004 files and 36902 folders and 108844 duplicated files in 34475 groups which took 4.53 GB.
@@ -121,8 +122,8 @@ DupeGuru after selecting files, froze at 45% for ~15 minutes, so I just kill it.
 |:----------:|:-------------:|
 | FSlint 2.4.7 (First Run)| 255s |
 | FSlint 2.4.7 (Second Run)| 126s |
-| Czkawka 1.2.2 (First Run) | 150s |
-| Czkawka 1.2.2 (Second Run) | 107s |
+| Czkawka 1.3.0 (First Run) | 150s |
+| Czkawka 1.3.0 (Second Run) | 107s |
 | DupeGuru 4.0.4 (First Run) | - |
 | DupeGuru 4.0.4 (Second Run) | - |

@@ -133,21 +134,21 @@ To not get Dupeguru crash I checked smaller directory with 217986 files and 4188
 | App| Idle Ram | Max Operational Ram Usage | Stabilized after search |
 |:----------:|:-------------:|:-------------:|:-------------:|
 | FSlint 2.4.7 | 54 MB | 120 MB | 117 MB |
-| Czkawka 1.2.2 | 8 MB | 42 MB | 41 MB |
+| Czkawka 1.3.0 | 8 MB | 42 MB | 41 MB |
 | DupeGuru 4.0.4 | 110 MB | 637 MB | 602 MB |

 Similar Images which check 386 files which takes 1,9GB

 | App| Scan time |
 |:----------:|:-------------:|
-| Czkawka 1.2.2 | 267s |
+| Czkawka 1.3.0 | 267s |
 | DupeGuru 4.0.4 | 75s |

 Similar Images which check 5018 files which takes 389MB

 | App| Scan time |
 |:----------:|:-------------:|
-| Czkawka 1.2.2 | 45s |
+| Czkawka 1.3.0 | 45s |
 | DupeGuru 4.0.4 | 87s |

 So still is a big room for improvements.
@@ -166,6 +167,7 @@ So still is a big room for improvements.
 | Temporary files | X | X | |
 | Big files | X | | |
 | Similar images | X | | X |
+| Zeroed Files| X | | |
 | Checking files EXIF| | | X |
 | Installed packages | | X | |
 | Invalid names | | X | |

czkawka_cli/src/commands.rs

+21-1
@@ -98,6 +98,25 @@ pub enum Commands {
         #[structopt(flatten)]
         not_recursive: NotRecursive,
     },
+    #[structopt(name = "zeroed", about = "Finds zeroed files", help_message = HELP_MESSAGE, after_help = "EXAMPLE:\n czkawka zeroed -d /home/rafal -e /home/rafal/Pulpit -f results.txt")]
+    ZeroedFiles {
+        #[structopt(flatten)]
+        directories: Directories,
+        #[structopt(flatten)]
+        excluded_directories: ExcludedDirectories,
+        #[structopt(flatten)]
+        excluded_items: ExcludedItems,
+        #[structopt(flatten)]
+        allowed_extensions: AllowedExtensions,
+        #[structopt(short = "D", long, help = "Delete found files")]
+        delete_files: bool,
+        #[structopt(flatten)]
+        file_to_save: FileToSave,
+        #[structopt(flatten)]
+        not_recursive: NotRecursive,
+        #[structopt(short, long, parse(try_from_str = parse_minimal_file_size), default_value = "1024", help = "Minimum size in bytes", long_help = "Minimum size of checked files in bytes, assigning bigger value may speed up searching")]
+        minimal_file_size: u64,
+    },
 }

 #[derive(Debug, StructOpt)]
@@ -207,4 +226,5 @@ EXAMPLES:
 {bin} big -d /home/rafal/ /home/piszczal -e /home/rafal/Roman -n 25 -x VIDEO -f results.txt
 {bin} empty-files -d /home/rafal /home/szczekacz -e /home/rafal/Pulpit -R -f results.txt
 {bin} temp -d /home/rafal/ -E */.git */tmp* *Pulpit -f results.txt -D
-{bin} image -d /home/rafal -e /home/rafal/Pulpit -f results.txt"#;
+{bin} image -d /home/rafal -e /home/rafal/Pulpit -f results.txt
+{bin} zeroed -d /home/rafal -e /home/rafal/Pulpit -f results.txt"#;

czkawka_cli/src/main.rs

+38
@@ -12,6 +12,7 @@ use czkawka_core::{
     empty_folder::EmptyFolder,
     similar_files::SimilarImages,
     temporary::{self, Temporary},
+    zeroed::{self, ZeroedFiles},
 };
 use std::{path::PathBuf, process};
 use structopt::StructOpt;
@@ -208,5 +209,42 @@ fn main() {
             sf.print_results();
             sf.get_text_messages().print_messages();
         }
+
+        Commands::ZeroedFiles {
+            directories,
+            excluded_directories,
+            excluded_items,
+            allowed_extensions,
+            delete_files,
+            file_to_save,
+            not_recursive,
+            minimal_file_size,
+        } => {
+            let mut zf = ZeroedFiles::new();
+
+            zf.set_included_directory(path_list_to_str(directories.directories));
+            zf.set_excluded_directory(path_list_to_str(excluded_directories.excluded_directories));
+            zf.set_excluded_items(path_list_to_str(excluded_items.excluded_items));
+            zf.set_allowed_extensions(allowed_extensions.allowed_extensions.join(","));
+            zf.set_minimal_file_size(minimal_file_size);
+            zf.set_recursive_search(!not_recursive.not_recursive);
+
+            if delete_files {
+                zf.set_delete_method(zeroed::DeleteMethod::Delete);
+            }
+
+            zf.find_zeroed_files(None);
+
+            if let Some(file_name) = file_to_save.file_name() {
+                if !zf.save_results_to_file(file_name) {
+                    zf.get_text_messages().print_messages();
+                    process::exit(1);
+                }
+            }
+
+            #[cfg(not(debug_assertions))] // This will show too much probably unnecessary data to debug, comment line only if needed
+            zf.print_results();
+            zf.get_text_messages().print_messages();
+        }
     }
 }

czkawka_core/src/lib.rs

+1
@@ -11,5 +11,6 @@ pub mod common_items;
 pub mod common_messages;
 pub mod common_traits;
 pub mod similar_files;
+pub mod zeroed;

 pub const CZKAWKA_VERSION: &str = env!("CARGO_PKG_VERSION");
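
The body of the new `zeroed` module is not part of the files shown above, so the check itself is not visible here. As a rough illustration only, the sketch below shows one way a "filled with zeros" test with a minimum size could be written using just the standard library. The function name `is_zeroed_file`, the 64 KiB buffer, and the choice to read the whole file are assumptions made for this example, not the commit's actual implementation.

```rust
use std::fs::File;
use std::io::{self, Read};
use std::path::Path;

/// Hypothetical helper (not taken from the commit): returns `Ok(true)` when the
/// file is at least `minimal_size` bytes long and every byte in it is zero.
fn is_zeroed_file(path: &Path, minimal_size: u64) -> io::Result<bool> {
    let mut file = File::open(path)?;

    // Files below the size threshold are skipped, mirroring the idea behind
    // the CLI's `minimal_file_size` option.
    if file.metadata()?.len() < minimal_size {
        return Ok(false);
    }

    // Read in fixed-size chunks so large files are never loaded into memory at once.
    let mut buffer = [0u8; 64 * 1024];
    loop {
        let read = file.read(&mut buffer)?;
        if read == 0 {
            // End of file reached without seeing a non-zero byte.
            return Ok(true);
        }
        if buffer[..read].iter().any(|&byte| byte != 0) {
            return Ok(false);
        }
    }
}
```

Calling `is_zeroed_file(Path::new("some/file"), 1024)` would use the same threshold as the CLI's default. Stopping at the first non-zero byte keeps the common case cheap, since most ordinary files fail the check within the first chunk. Note that with a threshold of 0 this sketch reports an empty file as zeroed.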
