WeeDuplicateDetective is a helpful tool if you wish to get rid of duplicate files. It is simple to use, and the duplicates found are organized efficiently, giving you full control over the clean-up process.
Using this software is easy: the interface is organized in reading order (left to right and top to bottom).
Besides, a basic tutorial is displayed inside the application.
Look for the string beginning with "1." in the list of folders to scan. Follow the instructions: the next step will appear as soon as the current one is performed.
- Add one or more folders to the list by
- either drag-and-dropping them in the list of folders to scan
- or clicking the button [BROWSE], selecting the folder you want to clean and then clicking on the button [ADD TO THE LIST]
- Select one or more comparison criteria and click on the button [FIND DUPLICATES]
- Once all duplicates that match your comparison criteria have been found, mark all those you want removed and click on the button [CLEAN THE SELECTED DUPLICATES]
- Choose the cleaning method that best suits your needs
Basically, a duplicate is a copy of a file. That means that the content of the two files is the same.
In WeeDuplicateDetective, you can define what you want to consider as a duplicate, among specific criteria. For instance you may want to consider as duplicates all files in a specific folder sharing the same name, regardless of their content.
To compare the content of files, WeeDuplicateDetective uses a hash sum (also called a "checksum"). This is a method that calculates a practically unique identifier from the content of a file. Once computed, this identifier uses only a few bytes of memory, and the file does not have to be read again (this is not the same as compression though, since the content of the file cannot be retrieved from its hash sum). See it as an indexing system, where each file has its content indexed. If two files have the same content, then their hash sums are the same; and, for all practical purposes, the reverse holds too: if two hash sums are identical, then so are the contents of their files.
The uniqueness of this identifier is not absolutely guaranteed when you use the hash sum method called MD5. It is the best compromise in terms of performance, though.
There are indeed some rare cases where two files with different content produce the same hash sum; this is called a "collision". To reduce the risk of collisions even further, to a negligible level, WeeDuplicateDetective provides alternative, albeit slightly slower, hash sum calculation methods: SHA-1 and SHA-256.
Typically, using MD5 will be more than enough though.
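As an illustration only (this is not WeeDuplicateDetective's actual code), the idea of a hash sum can be sketched in a few lines of Python with the standard hashlib module. The function names file_hash and same_content are made up for this example, and reading the file in 1 MB chunks is an arbitrary choice that keeps memory use low for big files.

```python
import hashlib


def file_hash(path, algorithm="md5", chunk_size=1 << 20):
    """Return the hex digest of a file's content, read chunk by chunk."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        # Read until an empty chunk signals the end of the file.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_content(path_a, path_b, algorithm="md5"):
    """Treat two files as duplicates when their hash sums are equal."""
    return file_hash(path_a, algorithm) == file_hash(path_b, algorithm)
```

Passing "sha1" or "sha256" as the algorithm trades a bit of speed for a lower risk of collisions, as described above.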
- General note about the user interface
- Folders to scan
- Comparison Criteria
- Define a file filter
- Finding duplicates
- File tree of the duplicates
- Search bar
- Side bar
- Cleaning the duplicates
Whether WeeDuplicateDetective looks good or not is a matter of taste, but there are some ground rules that its user interface respects:
- very few graphical elements - almost all elements are textual. Leaving out abstract icons allows for a better comprehension of what does what. Besides, I am not (really not) a graphic designer
- western reading order - the main tasks of WeeDuplicateDetective (selecting the folders, selecting the comparison criteria, scanning for duplicates, presenting the duplicates, and cleaning them) are presented left to right and top to bottom
- Very descriptive text - everybody can read but not everybody
can guess a functionality or its result by reading one word.
Besides, this helps a lot if you are using a screen reader.
- Clearly separated areas for the different tasks
- Clearly identify what the user has to do next
All in all, this may make the UI a bit "full" and a bit devoid of color, but general usability is my main goal here.
WeeDuplicateDetective needs to know where to search for duplicate files. This information is provided at the top of the application. You can add several folders,
- either by dragging and dropping them on the list (you can drag and drop several folders at once)
- or by browsing your computer: button [BROWSE] and then button [ADD TO THE LIST]
You can remove a folder from the list by selecting it first, then clicking the button [REMOVE FROM LIST] (this operation will not delete the folder on the computer). The order of the list items matters, since WeeDuplicateDetective scans the folders one after another, starting with the top-most one. You can therefore change this order: select a folder and click on either [MOVE UP] or [MOVE DOWN] to move the item in the list.
Directly below the folder selection area you will need to select what criteria files must fulfill to be considered duplicates.
- if you select "Same size", then all files in the given folders that have the EXACT same size will be considered as duplicates of one another
- if you select "Same size" and "Same name", then all files in the given folders that have the EXACT same size AND EXACTLY the same name will be considered as duplicates of one another
If you select "Same content" the scan will last much longer because WeeDuplicateDetective has to read the full content of each file (something it does not do otherwise).
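To make the combination of criteria more concrete, here is a small Python sketch (an illustration, not the application's real code). It assumes that files are grouped by a key built only from the selected criteria, so that each additional criterion can only split groups further. The function name find_duplicate_groups and its parameters are hypothetical, and names are compared exactly, which is also an assumption.

```python
import os
from collections import defaultdict


def find_duplicate_groups(paths, same_size=True, same_name=False):
    """Group files that agree on every selected criterion."""
    if not (same_size or same_name):
        raise ValueError("select at least one comparison criterion")
    groups = defaultdict(list)
    for path in paths:
        key = []
        if same_size:
            key.append(os.path.getsize(path))
        if same_name:
            key.append(os.path.basename(path))
        groups[tuple(key)].append(path)
    # Only groups with two or more files contain duplicates.
    return [members for members in groups.values() if len(members) > 1]
```

Neither of these two criteria needs to open the files, which is why such a scan is much faster than one that uses "Same content".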
In "advanced Criteria" you can find less used duplicate criteria:
- the date when the file was created
- the date when the file was last modified
- the hash algorithm used to compare the content of the
scanned files. If you do not know what a hash algorithm is,
then leave this option as it is. Actually changing it will
have very little effect on the result: the accuracy of finding
duplicates will only increase in very specific cases. Changing
this option may also result in increasing the time needed for
WeeDuplicateDetective to complete its scan task.
You can filter the file types you want scanned (everything else is ignored). The filter is based on the file extension. Some common file types are already defined. In case you need more, click on [EDIT...]
You will get this dialog if you clicked on [EDIT...] in the comparison criteria
The predefined file type filters cannot be modified here, but you can add new ones, or copy a pre-defined one and edit the copy.
- Select a filter and click on [COPY] to make a copy, on [DELETE] to delete it from the list
- Fill in the fields "file type" and "extensions" and click either [ADD] to add a new file type filter or [EDIT] (if available) to modify the selected filter.
Make sure the file type extensions are properly entered. To describe an extension, use "*." followed by the file extension. E.g. *.jpg
To describe a list of extensions, separate them with a comma (the comma can be followed by a space, but this is not mandatory). E.g. *.jpg, *.bmp, *.jpeg
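For the curious, here is how such an extension list could be interpreted, sketched in Python with the standard fnmatch module. This is only an illustration under assumptions: the function names are invented, and comparing extensions without regard to upper or lower case is my guess, not something the application guarantees.

```python
import fnmatch


def parse_extensions(text):
    """Split a list such as "*.jpg, *.bmp" into separate patterns."""
    return [item.strip() for item in text.split(",") if item.strip()]


def matches_filter(file_name, extensions):
    """True when the file name matches at least one extension pattern."""
    name = file_name.lower()  # assumption: case is ignored
    return any(fnmatch.fnmatchcase(name, pattern.lower()) for pattern in extensions)
```

The space after each comma is stripped, so "*.jpg,*.bmp" and "*.jpg, *.bmp" give the same result.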
Clicking on [CLOSE] will save your changes and close this dialog.
Once you have selected at least one comparison
criterion, the button [FIND THE DUPLICATE FILES] will be
enabled. It is located on the right of the area where you choose
the comparison criteria.
Click on it to start finding the duplicate files in the
folders you have chosen. A dialog window will appear, showing
- the progress of the scan process - depending on how many
files you have in the folders, and on whether you want to
compare the actual content of the files as well (comparison
criterion "Same content") this can last from a few
seconds to hours.
- the number of files to analyze - in parallel to the scan, WeeDuplicateDetective determines how many files there are to analyze. This can take a while as well (though not as long as the scan itself), so you may see the text "(computing)" instead of a number.
- a big, red [STOP] button - this allows you to interrupt the
scan process. The duplicates found so far will be shown. In
order to scan the remaining files, you will need to start a new
scan process (click on the button [FIND THE DUPLICATE FILES]
again). This is because files to scan may have been created
or modified in the meantime, and WeeDuplicateDetective
would miss those changes.
Once the scan process is done, the file tree shows several things:
- the duplicates are grouped together. The first item of each
group is considered the original, regardless of file creation
time: it is simply the first file of the group that was found
by WeeDuplicateDetective. You can change this later by sorting
the files by creation time
- Each item has a check-box. This is one of the three methods with which you select the files you want cleaned: you can select files manually, by directly clicking this check-box, or use the auto-selection feature or the right-click pop-up menu (see below). All selection methods can work together to achieve fast but powerful results.
- Files deemed unsafe for deletion are:
- Files with the system attribute set
- Files located in the WINDOWS folder and its sub-folders
- Columns in the file list:
- File name, this should be obvious
- Files unsafe for deletion are shown in
bold, and a big red "No" can be seen in the 2nd column, titled
"Safe?". If WeeDuplicateDetective thinks the file is
safe for deletion, a "Yes" is shown instead.
- the number of duplicates (not including the first found file) can be seen in the 3rd column, titled "#" (this symbol is commonly used to mean "amount" or "number")
- The path where the file is stored. This is the full path, including the drive letter. If you sort the list by path, the drive letter is taken into account by the sorting mechanism.
- the date the file was created
- the last time the file's content was modified
- If you right-click on a list item (a "list item" is a line in the file list with all the information mentioned above about the corresponding file) you will get a pop-up menu.
- "Open this file" will try to open the file listed, as if you double-clicked on it
- "Locate this file" will open the Windows File Explorer at
the path where the file is stored (also, the file will
automatically be selected there)
- the menu items in the section "Select for cleanup" will
give you several possibilities to select files in the
current duplicate group. The current duplicate group is the
group of duplicates to which the item you clicked on belongs
- the menu items in the section "Unselect" offer similar actions, but with the opposite goal: deselecting files
You can search the list for files whose information contains the text you enter in the big text box (on the screenshot below, this text area contains the text "res").
It's simple to use:
- select the type of information you want to search (the name of the file, its path, its size, its date, or even all of the information)
- type the text you want to search for (wildcards like '*' or '?', or regular expressions are accepted)
- click on [Find First] to find the first occurrence in the list where the text is matched.
- click on [Find Next] to find the next occurrences
Using regular expressions
When the first character of the search pattern is \ the rest is evaluated as a regular expression. This means that the first \ is not part of the regular expression, but is just there to let WeeDuplicateDetective know that you are using one.
Actually, search patterns without \ as the first character, but containing the wildcards * and ? will be internally transformed into the corresponding regular expression, with a few specificities:
- the rest of the pattern is taken as is
- ^ and $ are automatically added
- . is replaced by \. and $ by \$
- * is replaced by .*, and ? by .?
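The translation rules above can be sketched in Python as follows. This is an illustration under assumptions, not the application's real code: the function names are invented, I assume that ^ and $ enclose the whole translated pattern, that a pattern without wildcards is searched as plain text anywhere in the field, and that upper and lower case are not distinguished.

```python
import re


def to_regex(pattern):
    """Translate a search pattern into a regular expression string."""
    if pattern.startswith("\\"):
        # Leading backslash: the rest is already a regular expression.
        return pattern[1:]
    if "*" not in pattern and "?" not in pattern:
        # Assumption: plain text is searched literally, anywhere in the field.
        return re.escape(pattern)
    replacements = {".": r"\.", "$": r"\$", "*": ".*", "?": ".?"}
    # Characters without a replacement rule are taken as is.
    body = "".join(replacements.get(char, char) for char in pattern)
    return "^" + body + "$"


def matches(pattern, text):
    """True when the text matches the pattern (assumption: case is ignored)."""
    return re.search(to_regex(pattern), text, re.IGNORECASE) is not None
```

With these rules, the pattern *.bak becomes ^.*\.bak$ and therefore matches OLD.BAK but not old.bak.txt.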
A syntax error in the regular expression will be reported to
you. Even without a syntax error, the search may not return exactly
what you want; this is usually because the regular expression you
gave does not describe what you meant.
There are several good regular expression checkers on the web.
If you do not know what a regular expression is but are
curious about it, then google it :)
At the bottom you can see some statistics about WeeDuplicateDetective's findings.
The most interesting are
- how much space you could free up if you deleted all duplicates, keeping only one instance in each duplicate group.
- how many files are currently selected and how much disk
space you would save if you cleaned them
This panel offers you a lot of helpful options to quickly mark a bunch of files for deletion.
The panel contains three groups:
- the first five elements (radio buttons) are the selection mode. In other words, here you choose one main rule with which to select files in the list
- "all but each original" - for each group, all files will be selected, except the first one
- "each original only" - for each group, only the first file will be selected
- "all but original and 1st copy" - for each group, all files will be selected, except the first and the second one
- "each last copy" - for each group, only the last file will be selected
- "all files" - for each group, all files will be selected
- the next two elements allow you to fine-tune the selection mode
- "match the search pattern" - this check-box is available only if the text area in the search bar is not empty. If you check this check-box, the files to be selected according to the selection mode will actually be selected only if they match the search pattern. See the example below.
- "Exclude unsafe files" - this ensures that files deemed unsafe for deletion will not be selected
- The next three elements are the buttons that will launch the selection process
- [Select files] / [Unselect files] - select or unselect files according to your selection criteria
- "Unselect all" - self explanatory
Files deemed unsafe for deletion by WeeDuplicateDetective are:
- Files with the system attribute set on
- Files located in the WINDOWS folder and its sub-folders
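These two rules can be expressed as a small Python check. It is only a sketch: the function name is_unsafe is hypothetical, the Windows folder is assumed to be C:\WINDOWS (the real location depends on the system), and the attribute value is passed in by the caller so that the example can be read and run on any system.

```python
import stat
from pathlib import PureWindowsPath


def is_unsafe(path, attributes, windows_dir=r"C:\WINDOWS"):
    """Return True when a file is deemed unsafe for deletion."""
    # Rule 1: the system attribute is set.
    if attributes & stat.FILE_ATTRIBUTE_SYSTEM:
        return True
    # Rule 2: the file lies in the Windows folder or one of its sub-folders.
    # Windows paths compare without regard to upper or lower case.
    return PureWindowsPath(windows_dir) in PureWindowsPath(path).parents
```

On Windows, the attributes of a real file can be read with os.stat(path).st_file_attributes.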
How to select all *.BAK files found by WeeDuplicateDetective:
- in the search bar: select "Name" and enter ".bak" in the text area
- in the selection area: click on "all files"
- click on "match the search pattern"
- click the button [Select items]. Done.
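The five selection modes can be pictured as simple slices of each duplicate group. The Python sketch below is an illustration, not the application's code: a group is assumed to be a list with the original first, and the function name select and its keep parameter (standing in for "match the search pattern" and "Exclude unsafe files") are invented for the example.

```python
def select(groups, mode, keep=lambda path: True):
    """Return the files a selection mode would check, group by group."""
    pick = {
        "all but each original": lambda group: group[1:],
        "each original only": lambda group: group[:1],
        "all but original and 1st copy": lambda group: group[2:],
        "each last copy": lambda group: group[-1:],
        "all files": lambda group: group[:],
    }[mode]
    # The optional filter is applied after the selection mode.
    return [path for group in groups for path in pick(group) if keep(path)]


groups = [["a.txt", "b.txt", "c.bak"], ["x.bak", "y.bak"]]
bak_files = select(groups, "all files", keep=lambda path: ".bak" in path.lower())
```

Here bak_files contains c.bak, x.bak and y.bak, which mirrors the *.BAK example above.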
Three areas here again.
1. Expand / Collapse tree nodes
Expand or collapse the branches in the tree of files. In other
words this will hide or show the copies for all groups of
duplicates.
2. Sort the list items
Change the order in which the files are shown. Initially the files are shown in the list in the order in which they are found on the disk. You can easily change this order:
Sort - What to sort
- "originals only" - the copies will not be sorted; only the originals will be, with respect to each other
- "duplicates only" - inside each group and skipping the original file, all the copies will be sorted with respect to each other
- "All files" - first, all the copies in each group are sorted, including the original; then the originals are sorted with respect to each other
By - by which criteria to sort
- When 'what to sort' is set to "originals only" then you get
an additional choice: sort by "amount of duplicates".
- hopefully self-explanatory
3. Remove items from the list
You can remove items from the list. This can be useful to exclude a specific type of file (therefore making sure it will not be cleaned) or, on the other hand, to focus on a specific file type.
Once an item is removed from the list, it cannot be displayed again. The only way to get all the items back in the list is to launch the scan process again.
This operation does not actually remove any stored
file; it only removes its reference from the file list in
WeeDuplicateDetective.
The options are:
- "have no duplicates anymore" - this will remove from the
list all duplicate groups that contain only one item
- "match the search pattern" - this will remove from the list
all items that match the search pattern
- "do NOT match the search pattern" - opposite of the above
- "are unsafe for cleaning" - this will remove from the list
all items that are deemed unsafe for deletion by
WeeDuplicateDetective
- "are checked for deletion" - this will remove from the list
all items that are checked. Again, this will remove only the
list item and not the real file. You can use this to clean up
the list by first selecting all
the files you know you want to keep and then using this option
How to remove .BAK files from the list to make sure they will not be cleaned:
- in the search bar: select "Name" and enter ".bak" in the text area
- in the "organize" area: select "match the search pattern"
- click on the button [Remove]. Done.
This area is available once a file is selected in the list. You
do not need to check the file for clean-up to select it; a simple
left click on any line in the list area is enough.
"Manage" consists of two buttons:
[Open it] - the selected file will be opened as if you double clicked on it in a file explorer, for instance. Which program/application is used to open the file depends on the file associations defined in your system.
[Locate it] - The folder containing the selected file will be opened and the file will be selected in the file explorer.
Those two operations are also available in the right-click pop-up menu of the file list.
"Preview" is a very primitive media player/viewer.
Once a file is selected, you can check "Preview the selected
file" and WeeDuplicateDetective will try to display or
play the file if it is supported.
Pictures are simply displayed, with no additional controls.
Supported picture formats: JPG, GIF (also animated), PNG, BMP and TIFF
Videos and music files get the usual controls:
- 3 buttons: [play], [pause] and [stop]
- a timeline slider
- a volume control
Click on [play] to start playing the video or the sound file.
If you get an error message or if nothing happens, it means that
the file format is not supported. Note that such an error message comes
from the codecs themselves, not from WeeDuplicateDetective. If
you do not know what a codec is, ask Google :)
Supported video / sound formats: AVI (DivX/XviD), MPG, WMV, WMA, MID, MP3. Depending on the codecs installed on your system as well as on the version of .NET available, you may be able to play back more file formats (OGG, MP4, etc.).
This media viewer is not intended as a "real" media viewer
(click on [Open it] if you want real playback), it's more
intended as a quick view to help decide if a file should be
cleaned or not.
Once you have chosen which files you want cleaned, you can start the actual cleaning process.
- Click on the button [Delete selected duplicates], located on the bottom right
- Choose the cleaning method.
- If you choose to move the files, you will need to indicate to which folder they should be moved
- Click on [START] to, well... start the cleaning process. You will see a progress bar, and the number of files yet to be cleaned will decrease. You can interrupt the clean-up by clicking on [CLEAN]
- Once the process is finished or interrupted, you can click on [CLOSE] to go back to the main window.
About the cleaning method "Move the files to a folder"
When choosing "Move files", the original folder structure in which the files are stored will not be kept. Files with the same name will nevertheless not overwrite each other: if a file with the same name already exists in the target folder, the file being moved is renamed using the following pattern: <file name>_<number>.<file extension>, where <number> starts at 000001 and can go up to 999999.
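The renaming pattern can be sketched in Python like this. It is an illustration under assumptions rather than the application's real code: the function name unique_target is invented, and I assume the first free number is used.

```python
import os


def unique_target(target_dir, file_name):
    """Pick a destination path that does not overwrite an existing file."""
    candidate = os.path.join(target_dir, file_name)
    if not os.path.exists(candidate):
        return candidate
    stem, extension = os.path.splitext(file_name)  # extension keeps its dot
    for number in range(1, 1_000_000):
        # Six digits, zero padded: 000001 up to 999999.
        candidate = os.path.join(target_dir, f"{stem}_{number:06d}{extension}")
        if not os.path.exists(candidate):
            return candidate
    raise FileExistsError(f"no free name left for {file_name}")
```

For example, if report.txt already exists in the target folder, the next file called report.txt would be stored as report_000001.txt.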
In addition to the usual title bar buttons (Minimize, Maximize
and Close) you can see a 4th button on the right end. Click on
it to display this help file.
Press the key combination Ctrl+F1 to see the
about box. Nothing spectacular there, though.
Side bar grip
Click on this vertical bar and move the mouse while
holding the click.
This will resize the list area and the side bar.
This area allows you to resize the main window. It's located on
the bottom right corner. Click on it and move the mouse while
holding the click.
126.96.36.199 - 2012/01
- remember the search patterns (in the combobox)
- lighter UI
- less borders, less color gradients and less animations
(speeds up the UI)
- the process of finding duplicates has its own dialog
- layout made a bit less "full" for some options, mostly in the right side bar
- removed a couple of unneeded text fields (ironically, I
had duplicate information...)
- files unsafe for deletion
- show those files (files in the Windows directory and system and hidden files)
- exclude those files from being selected or even remove them from the list
- all fields at once
- using wildcards (* and ?) or a regular expression
- allow filtering of search files per file extension
- new dialog to edit the type filters
- add one criterion to remove files from the list
- bug fixes
- display of amount of files cleaned & of files scanned
- crash when cleaning all files in a duplicate group
- cursor when finding duplicates
- remove files from list: stats not updated
- help file (FINALLY!) - html file, with style reminiscent of
188.8.131.52 - 2011/08/14
- allow cancel during search for duplicates
- support for animated gifs
- bug fix: some minor issues
184.108.40.206 - 2011/07
- drop shadows on borders
- add remove from list criterion: files that do not match the search pattern
- "unselect items" button
- add '?' button in main clean window title
- change: 'Close' to 'Cancel' during clean up, and 'Close' when it's done
- parallelize sort
- add progress for sort
- bug fixes
- bug fix: volume slider
- bug fix: duplicate count
- bug fix: sparable space count update during clean up
- bug fix: disable Find duplicates button while cleaning
- bug fix: done counter in the deletion dialog
- bug fix: amount of duplicates (converter)
0.9.8.0 - 2011/06/26
- different sortings of the tree
- order by: file name / path / size / # of copies / date
- sort: +button : remove singles from list
- bug fixes
- bug fix: use search Pattern: if already checked, do not change
- bug fix: file stats (total #files, total space, total space sparable, etc.) not updated after clean up
- bug fix: context menu over file tree list items is now displayed
- bug fix: selected counter when clicking on find duplicates
- bug fix: scroll behavior of the tree list view