67.2 Photo Duplicates

20220122

Duplicate photos accumulate easily as we copy files between storage locations and try to manage large collections under different file naming schemes.

Duplicate photos can be found using fdupes, as introduced in Section 18.5.

An efficient approach is to run fdupes with the --delete and --recurse options. This starts an interactive session that lists each group of duplicated files and prompts for which files to keep.

$ fdupes --delete --recurse .

With no options, fdupes lists groups of duplicated files in the specified directory:

$ fdupes .

./20180323_122434_02.jpg
./20180323_122434_01.jpg
./20180323_122434_00.jpg

./20030102_092312_03.jpg
./20031012_092312_00.jpg

With -r (--recurse) sub-directories are included. A summary of duplicates is obtained with -m (--summarize):

$ fdupes --summarize .
13567 duplicate files (in 6407 sets), occupying 16996.0 megabytes

fdupes can be asked to delete duplicates, retaining the first file listed within each group. Combining --order=name with --reverse can be useful for filenames that differ only by a numeric suffix, so that the lowest numbered file is listed first and therefore kept. Experiment with --order (time, which is the default, ctime, or name) to find what works best for you.

$ fdupes --order='name' --reverse .

./20180323_122434_00.jpg
./20180323_122434_01.jpg
./20180323_122434_02.jpg

./20031012_092312_00.jpg
./20030102_092312_03.jpg
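
Since the first file listed in each group is the one retained, it can help to preview the outcome before deleting anything. The following is a minimal sketch, not part of fdupes itself: the function name preview_keep_delete is hypothetical, and it assumes the default listing format shown above, with one path per line and groups separated by blank lines.

```shell
# Read an fdupes listing on standard input and mark the first file of each
# group with [+] (would be kept) and the others with [-] (would be deleted).
# Assumes groups are separated by blank lines, as in the default listing.
preview_keep_delete() {
    awk '
        BEGIN   { first = 1 }
        NF == 0 { first = 1; print ""; next }         # blank line: next group
        first   { print "[+] " $0; first = 0; next }  # first file in the group
                { print "[-] " $0 }                   # remaining files
    '
}
```

It might be used as follows, with the same ordering options intended for the deletion itself:

$ fdupes --order='name' --reverse . | preview_keep_delete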

The following command deletes duplicates (--delete) without asking (--noprompt), keeping the first file listed in each group under the ordering we have specified:

$ fdupes --delete --noprompt --order='name' --reverse .

   [+] ./2020/20200926_063024.jpg
   [-] ./camera/20200926_063024.jpg
   [-] ./todo/20200926_063024.jpg

   [+] ./2020/20201114_061818.jpg
   [-] ./camera/20201114_061818.jpg
   [-] ./todo/20201114_061818.jpg
...
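
A more cautious alternative to deleting without a prompt is to move the redundant copies into a holding directory for review. The sketch below is one possible approach, not the author's method: the function name quarantine and the default directory name are hypothetical. It relies on the --omitfirst option of fdupes, which omits the first file of each group from the listing, and it assumes that this is the same file --noprompt would keep under the same ordering options.

```shell
# Move every path read from standard input into a holding directory,
# recreating each file's relative directory beneath it.
# The directory name is a placeholder chosen for this sketch.
quarantine() {
    quarantine_dir="${1:-../duplicates-to-review}"
    while IFS= read -r path; do
        [ -n "$path" ] || continue              # skip blank group separators
        target="$quarantine_dir/${path#./}"     # drop any leading ./
        mkdir -p "$(dirname "$target")"
        mv -- "$path" "$target"
    done
}
```

It might be used as follows, with the holding directory kept outside the tree being scanned so that a later run of fdupes does not find the moved files again:

$ fdupes --recurse --omitfirst --order='name' --reverse . | quarantine ../duplicates-to-review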


Copyright © 1995-2022 Graham.Williams@togaware.com Creative Commons Attribution-ShareAlike 4.0