bogus | Screwed Up Documents | ||
Grand Imperial Pooh-Bah
|
A friend has one of these Seagate USB storage drives and uses it as a backup. The problem is that the backup has turned into a mess: copies from a prior computer, folders buried within folders, many levels deep. Finding everything unique was getting downright impossible.
As fate would have it, I have been working on a product to clean up duplicate files in network folders. This is essentially the same thing. What the new program does is this:
1. Scans through a folder, getting a list of file names.
2. Gets each file's MD5 checksum value, essentially its fingerprint.
3. Stores the stats of the file in a SQL database.
4. Finds another file and looks for a matching MD5. If there is one, nothing happens.
5. Otherwise, looks for a name match. If there is one, creates a version.
6. Saves the version as (1) or (2) or whatever.
7. Repeats accordingly.
It took a little while to get through a 20 GB folder, but it was faster than I ever would have been. I plan on making this available via download in a few weeks.
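For anyone curious how the steps in this post might look in code, here is a minimal sketch in Python. It is not the actual program. The function names (`md5_of_file`, `index_folder`), the table layout, and the use of SQLite as the SQL database are all assumptions made for illustration.

```python
import hashlib
import os
import sqlite3


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def index_folder(root, conn):
    """Walk root and record each distinct file (by MD5) in a 'files' table."""
    # Hypothetical schema: one row per unique fingerprint.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS files ("
        "md5 TEXT PRIMARY KEY, name TEXT, version INTEGER, path TEXT, size INTEGER)"
    )
    for folder, subdirs, names in os.walk(root):
        subdirs.sort()  # visit subfolders in a predictable order
        for name in sorted(names):
            path = os.path.join(folder, name)
            checksum = md5_of_file(path)
            # Step 4: fingerprint already stored, so nothing happens.
            known = conn.execute(
                "SELECT 1 FROM files WHERE md5 = ?", (checksum,)
            ).fetchone()
            if known:
                continue
            # Steps 5-6: same name, different content, so store it as the next version.
            (same_name,) = conn.execute(
                "SELECT COUNT(*) FROM files WHERE name = ?", (name,)
            ).fetchone()
            conn.execute(
                "INSERT INTO files VALUES (?, ?, ?, ?, ?)",
                (checksum, name, same_name, path, os.path.getsize(path)),
            )
    conn.commit()
```

In this sketch version 0 is the first copy found, and versions 1, 2, and so on are later files with the same name but different contents. Calling `index_folder("/path/to/backup", sqlite3.connect("files.db"))` would build the index without touching any file on the drive.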
||
Posted on: 2012/3/9 17:05
|
|||
_________________
The single biggest problem with communication is the illusion that it has taken place. - George Bernard Shaw Education is the best tool to overcome irrational fear. - me |
pianoguy | Re: Screwed Up Documents | ||
Guru Emeritus
|
Slick!
|
||
Posted on: 2012/3/9 17:10
|
|||
_________________
1996 LT4 "Before you criticize someone, you should walk a mile in their shoes. That way when you criticize them, you are a mile away from them and you have their shoes." - Jack Handey |
BillH | Re: Screwed Up Documents | ||
The Stig Moderator
|
Excellent thing to do, Andy.
I understood the first 3 sentences. |
||
Posted on: 2012/3/9 22:42
|
|||
_________________
Every man dies but not every man lives. |
Matatk | Re: Screwed Up Documents | ||
Webmaster
|
That should speed things up and clear up some space on the HDD!
|
||
Posted on: 2012/3/10 2:10
|
|||
_________________
2002 EBM convertible, Magnusson supercharger, cam, headers, etc. 1989 Corvette...RIP |
bogus | Re: Screwed Up Documents | ||
Grand Imperial Pooh-Bah
|
What it will do is allow the data owner to know what the hell they have!
There were folders embedded in folders within folders. Who knows what was going on. At least 100 GB of wasted space. |
||
Posted on: 2012/3/10 19:13
|
|||
warnerbob18 | Re: Screwed Up Documents | ||
Guru Newb
|
Will it help reclaim wasted space?
Data on the HDD is not always written in sequence, and some space between cells is always left blank. Will it help recover that leftover space? |
||
Posted on: 2012/3/16 9:50
|
bogus | Re: Screwed Up Documents | ||
Grand Imperial Pooh-Bah
|
What you're asking about is a disk compression tool. Windows already has such a thing.
What this does is find duplicates, copies, you name it, and then sort things down to one concise list of files. It comes in really handy if you haven't kept a logical document folder structure, or have stacked the data from old computer on top of old computer on top of old computer. Once you get several copies of the same thing spanning multiple folders and going several folders deep, good luck sorting it out. This sorts that out. |
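To give a feel for the "find duplicates" part, here is a rough sketch in Python that groups files by MD5 and estimates how much space the extra copies take up. It is an illustration only, not the program described in this thread, and the names `find_duplicates` and `wasted_bytes` are made up for the example.

```python
import hashlib
import os
from collections import defaultdict


def find_duplicates(root):
    """Map each MD5 digest to the sorted list of paths under root that share it."""
    groups = defaultdict(list)
    for folder, _subdirs, names in os.walk(root):
        for name in names:
            path = os.path.join(folder, name)
            with open(path, "rb") as handle:
                # Reads the whole file at once: fine for a sketch, not for huge files.
                digest = hashlib.md5(handle.read()).hexdigest()
            groups[digest].append(path)
    # Keep only fingerprints that appear more than once.
    return {d: sorted(paths) for d, paths in groups.items() if len(paths) > 1}


def wasted_bytes(duplicates):
    """Bytes used by every copy beyond the first in each duplicate group."""
    return sum(
        os.path.getsize(paths[0]) * (len(paths) - 1)
        for paths in duplicates.values()
    )
```

Running `wasted_bytes(find_duplicates("/path/to/backup"))` would give a rough number for how much of the drive is taken up by redundant copies. It only reads files and never deletes or moves anything.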
||
Posted on: 2012/3/16 13:40
|
|||
JrRifleCoach | Re: Screwed Up Documents | ||
Elite Guru
|
Sounds reasonable. So this will locate and DB all the files and then delete the dupes? Or just create a dupe index in case you're looking for a particular file? I have roughly 250 GB of music files stored in two different locations. I need to combine these into a single location and identify and remove the dupes. How can your prog help? And will it be capable of running without intervention? (Tommy, I said intervention, not the Spanish Inquisition.) |
||
Posted on: 2012/3/17 1:01
|
bogus | Re: Screwed Up Documents | ||
Grand Imperial Pooh-Bah
|
It would fix your problem.
Here is what happens:
1. Initiate a loop to read the folders.
2. Get the file's fingerprint.
3. Compare the fingerprint to look for a match.
3a. If there is no match, store the file in the db. If the file name is the same but the fingerprints don't match, save the file with a version number, filename(2).txt for example.
3b. Copy the file to a new folder.
3c. If there is a match, skip the file and continue the loop.
In short, this is NOT destructive. When you process the folders, it populates a new folder with the selected unique files. All original files stay where they started. |
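The non-destructive copy described in this post could look roughly like the Python sketch below. It is a guess at the general shape, not the real tool. The function name `collect_unique` is invented, an in-memory set stands in for the database, and it assumes the target folder sits outside the folder being scanned.

```python
import hashlib
import os
import shutil


def collect_unique(source_root, target_dir):
    """Copy one instance of each distinct file from source_root into target_dir.

    Originals are only read, never moved or deleted.
    """
    os.makedirs(target_dir, exist_ok=True)
    seen_hashes = set()  # stands in for the database lookup
    name_counts = {}     # distinct versions seen so far for each file name
    copied = []
    for folder, subdirs, names in os.walk(source_root):
        subdirs.sort()  # visit subfolders in a predictable order
        for name in sorted(names):
            path = os.path.join(folder, name)
            with open(path, "rb") as handle:
                digest = hashlib.md5(handle.read()).hexdigest()
            if digest in seen_hashes:
                continue  # 3c: fingerprint matches, move on to the next file
            seen_hashes.add(digest)
            # 3a: same name with new content gets a version number.
            count = name_counts.get(name, 0) + 1
            name_counts[name] = count
            stem, ext = os.path.splitext(name)
            new_name = name if count == 1 else f"{stem}({count}){ext}"
            # 3b: copy (not move) into the new folder.
            shutil.copy2(path, os.path.join(target_dir, new_name))
            copied.append(new_name)
    return copied
```

For the two-music-folders case, one option under these assumptions would be to put both locations under a common parent and call `collect_unique(parent, new_folder)`. A real tool would also have to handle a file whose own name already looks like `song(2).mp3`, which this sketch does not.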
||
Posted on: 2012/3/17 2:52
|
|||
JrRifleCoach | Re: Screwed Up Documents | ||
Elite Guru
|
Very nice.
Need a guinea pig? Errr tester? |
||
Posted on: 2012/3/18 0:17
|
|||
_________________
As democracy is perfected, the POTUS represents more closely the inner soul of the people. On some great and glorious day, the folks of the land will reach their heart's desire at last and the White House will be occupied by a fool and narcissistic moron. |
bogus | Re: Screwed Up Documents | ||
Grand Imperial Pooh-Bah
|
Sure! Once I get it more finalized, I will make a copy available!
|
||
Posted on: 2012/3/18 3:49
|
|||