Why the command line is important

Some interesting BASHing I did today:

find . -type f -print | cut -d'/' -f 2 | sort -u > content
for files in *; do echo "$files"; done | sort | grep -v -x -F content | grep -v -x -F -f content | xargs -n 1 rm -rf

The first line finds every subdirectory of the current directory that contains at least one file, at any depth, and writes the list to the file content (files sitting directly in the current directory land on the list too, so they are left alone). The second line deletes every subdirectory that is not listed in content.
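Since the second line ends in rm -rf, a dry run is worth having. The following is a minimal sketch of the same two steps that only prints what would be deleted. The function name preview_orphans and its optional directory argument are hypothetical, and the keep-list goes to a temporary file from mktemp instead of content, so nothing is written into the tree being scanned.

```shell
# Dry-run sketch: print what the cleanup would delete, without deleting it.
# preview_orphans and its argument are hypothetical names, not from the post.
preview_orphans() (
    cd "${1:-.}" || exit 1
    keep=$(mktemp)    # keep-list lives outside the scanned tree
    # Step 1: top-level names that have at least one file somewhere below them
    # (a file sitting directly in the top level lists itself, so it is kept).
    find . -type f -print | cut -d'/' -f 2 | sort -u > "$keep"
    # Step 2: every top-level entry that is not on the keep-list.
    # -x -F compares whole names literally, so batch1 does not shield batch10.
    for entry in *; do printf '%s\n' "$entry"; done | grep -v -x -F -f "$keep"
    rm -f "$keep"
)
```

Running preview_orphans with no argument inspects the current directory. Once the printed list looks right, the names can be handed to the rm -rf step.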

I had to do this to clean up a huge number of orphaned batch directories left by our document imaging system (Windows, but I'm running Cygwin and pointing to a drive mapped to the DI fileserver). The vendor provides a GUI app to do this, but it takes the larger part of eternity to map out all the directories, and then you have to click on each directory and click delete. Pretty useless for the thousands of directories I had to deal with. My command-line solution took about 5 minutes to work out and around 15 seconds to run. Several hours of menial mouse clicking saved.

I had to shut down a release process in order to make sure I didn't catch any legitimate directories that had been created but not yet populated. The next iteration will probably include a bit more logic so that it only touches directories that are more than a day or so old. An -mtime test in a find over the candidate directories should do it. That will enable me to run it while the entire system is live.
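One way the age check might look, as a sketch rather than a finished script: the function name list_stale_empty_dirs and its directory argument are hypothetical, and -mindepth/-maxdepth assume GNU find (which is what Cygwin ships). With -mtime +0, find matches directories last modified at least 24 hours ago. A directory's modification time changes when entries are added to or removed from it, so a batch directory created a moment ago is skipped even though it is still empty. The sketch only prints candidates.

```shell
# Sketch: list top-level directories that hold no files at any depth and
# were last modified at least a day ago. Prints candidates; deletes nothing.
# list_stale_empty_dirs and its argument are hypothetical names.
list_stale_empty_dirs() {
    root=${1:-.}
    # -mtime +0: age in whole days is greater than 0, i.e. at least 24 hours.
    find "$root" -mindepth 1 -maxdepth 1 -type d -mtime +0 |
    while IFS= read -r dir; do
        # Keep the directory only if find sees no regular file beneath it.
        if [ -z "$(find "$dir" -type f -print | head -n 1)" ]; then
            printf '%s\n' "$dir"
        fi
    done
}
```

This runs one find per candidate directory, so it will be slower than the two-line version on a mapped network drive, but it does not need the intermediate content file.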

