Andrew Que

March 22, 2022

Limiting Find Command to Only New Files

Earlier this month I wrote about a bash script to auto-rotate images based on Exif data. It is completely functional, but slower than I'd like. The reason for the slowness is the number of pictures it must check. I set the script up to only check the last 30 days, but currently that is over 2000 files.

In reality, I only need to check the files that have not already been checked. To accomplish this we can use the find command's -newer test, which matches only files modified more recently than the target file. Afterward we just need to touch the target file to set its modification time to the current time. The result is that only files modified since the last rotate check will be checked.

#!/bin/bash -
#------------------------------------------------------------------------------
# Uses: Search through recent images and rotate any not in correct orientation.
# Date: 2022-03-06
# Author: Andrew Que <https://www.DrQue.net/>
# Revisions:
#   2022-03-06 - Creation.
#   2022-03-20 - Rotation checks only files since last time run.
#------------------------------------------------------------------------------

# Get the current year.
currentYear=$(date +"%Y") || exit

# Path to rotate.
sourcePath="/<path>/$currentYear/"

# File to mark the last time run.
lastRunFile=$(realpath ~/.var/exifRotate.lastrun) || exit

# Number of threads to run.  Use the number of CPU cores available.
threads=$(grep -c ^processor /proc/cpuinfo) || exit

# Search pattern to select desired images.
searchPattern="IMG_*.JPG"

# Command to check image orientation and rotate it if necessary.
rotateCommand=$(cat <<-END
      # Check EXIF orientation for "Top-left".
      exif "{}" | grep Top-left > /dev/null;

      # If the orientation isn't "Top-left", do rotation.
      # Note: "mogrify" always reads and writes the files even if no 
      # changes have been made.  So we skip doing this if we can.
      if [ \$? -eq 1 ]; then
        echo "Rotating {}"
        mogrify -auto-orient "{}" 2> /dev/null
      fi
END
)

# Search for all images that have the correct name and age...
# then, in parallel, run the rotate command on each image.
find "$sourcePath" -type f -newer "$lastRunFile" -name "$searchPattern" \
  | xargs -I {} -P "$threads" sh -c "$rotateCommand"

# Update the last time script was run.
touch "$lastRunFile"

The new lines are the ones that define lastRunFile, add the -newer test to the find command, and touch the file at the end.

For the last run file, I am using a local directory called .var in my home directory. This is similar to the system /var directory, which holds transient data. However, the system directory isn't writable by users. The ~/.var directory does not exist on all distributions, so one could instead use a hidden file directly in the home directory.
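One gap worth noting: on a machine where the last run file does not exist yet, find reports an error for -newer, and realpath fails if ~/.var itself is missing. Below is a minimal sketch of a guard for that first run. The function name ensureLastRunFile is hypothetical, the 30-day backdating is an assumption borrowed from the earlier version of the script, and touch -d with a relative date assumes GNU coreutils.

```shell
#!/bin/bash
# Hypothetical helper (not part of the original script): create the
# last run file, and its directory, if they do not exist yet.
ensureLastRunFile() {
  local file="$1"

  # Create the containing directory (e.g. ~/.var) when it is missing.
  mkdir -p "$(dirname "$file")" || return

  if [ ! -e "$file" ]; then
    # Assumption: backdate the marker 30 days so the first run still
    # covers roughly what the earlier version of the script checked.
    # Relative dates with -d require GNU touch.
    touch -d "30 days ago" "$file" || return
  fi
}
```

It would be called as ensureLastRunFile "$HOME/.var/exifRotate.lastrun" before lastRunFile is resolved and before the find command runs.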

The result is that the rotation check is now significantly faster. Rotating the images still takes time, but the script no longer needs to check so many files.
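The -newer and touch behavior the script relies on can be tried in isolation. The following is a small sketch in a scratch directory; the file names are made up for illustration, and touch -d with a relative date assumes GNU coreutils.

```shell
#!/bin/bash
# Scratch directory so the experiment touches nothing real.
workDir=$(mktemp -d) || exit
marker="$workDir/lastrun"

# An "old" image, then the marker, then a "new" image.
touch -d "2 hours ago" "$workDir/old.jpg"
touch -d "1 hour ago" "$marker"
touch "$workDir/new.jpg"

# Only files modified after the marker are listed.
newFiles=$(find "$workDir" -type f -name "*.jpg" -newer "$marker")
echo "Before touch: $newFiles"

# Move the marker forward.  Nothing is newer than it any more, so the
# next search comes back empty.
touch "$marker"
afterTouch=$(find "$workDir" -type f -name "*.jpg" -newer "$marker")
echo "After touch: $afterTouch"
```

Note that -newer compares modification times, so a file copied into the tree with its old timestamp preserved would not be picked up.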

March 13, 2022

Red Dragon drive issues

The Red Dragon

The Red Dragon has been a faithful backup machine for many years. It sometimes has issues with not establishing a network connection on boot, but that is rare. Most of the time if I have a problem, it is because there are permission errors on the Snow Dragon preventing backups from completing. Then in January the Red Dragon temporarily changed the MAC address on the network card. That was bizarre but went away after being unplugged.

On Friday I got an e-mail saying the backups had failed. The 8 TB backup drive didn't appear to mount. When I inspected the machine I found the partition table was corrupt. I ended up reformatting the drive and running a fresh backup. No idea what caused the problem.

I put the drive back into the computer today, and the partition disappeared on the first boot. Now I know there is something wrong, but I'm not sure what. It turns out I was able to restore the partition using fdisk. I decided to run a software update, changed a drive cable, and did several tests to see if I could recreate the problem. For now the system looks functional. I'm not sure what could cause a problem like this, but the ability for the partition table on one's backup drive to simply disappear doesn't inspire confidence.

March 12, 2022

Removing temporary directories with find and xargs

My backup scripts remove temporary/cache files before backup operations begin. Sometimes I need to remove temporary directories nested in a set of directories. It is easy to identify these using the find command. The find command can remove files using the -delete option, but it cannot remove non-empty directories this way. For that one can use -exec with the rm -R command like this:

find /path/to/start -name "temporary" -type d -exec rm -Rf {} \;

That will remove all directories called temporary located under /path/to/start, regardless of how deeply they are nested. The problem is that find still wants to traverse each matched directory, and the rm command has just removed it. This results in find printing an error.

find /path/to/start -name "temporary" -type d -exec rm -Rf {} \;
find: ‘/path/to/start/qrjs8kb3/temporary’: No such file or directory

This isn't an error in the strictest sense because we purposely removed the directory. The workaround is to pipe the output of find to xargs.

find /path/to/start -name "temporary" -type d | xargs rm -Rf

As long as the list is short enough to fit on one command line, xargs reads every name before it runs rm, so find finishes searching before the directories are removed.
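Another way to avoid the error, without a pipe, is find's -prune action, which tells find not to descend into a directory it has just matched. The sketch below runs against a scratch directory; the directory names are made up for illustration, and this is an alternative to the xargs approach rather than what my backup scripts use.

```shell
#!/bin/bash
# Scratch tree with two nested "temporary" directories and one to keep.
startDir=$(mktemp -d) || exit
mkdir -p "$startDir/a/temporary/deep" \
         "$startDir/b/c/temporary" \
         "$startDir/keep"

# -prune stops find from descending into each matched directory, so it
# never tries to traverse something rm has already removed.  Ending
# -exec with "+" passes the matches to rm in batches.
errors=$(find "$startDir" -type d -name "temporary" -prune \
           -exec rm -Rf {} + 2>&1)

# Confirm nothing called "temporary" is left behind.
remaining=$(find "$startDir" -type d -name "temporary")
echo "errors: '$errors' remaining: '$remaining'"
```

Unlike the plain pipe to xargs, this also copes with directory names that contain spaces.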

March 11, 2022

Anonymous named pipes with tee for logging

Found a new method for capturing log and error output that I really like. Typically when I have a script set up, I pipe stdout to tee and a log file, and stderr is just redirected to a log file. A command would look something like this:

command 2>> error.txt | tee -a log.txt

Note that the -a option on tee is so the data is appended to the log file. Otherwise the file would be truncated each time, and only the output of the last command would be in it.

The only downside of this redirection is that you do not get the error output on the screen, only in the error file. Then I found this clever trick.

command 1> >(tee -a log.txt) 2> >(tee -a error.txt 1>&2)

Here we employ anonymous named pipes (Bash calls this process substitution) to both print and save stdout and stderr. In addition, the tee output of stderr is redirected back to stderr. Thus if the script is nested, the errors still appear in the correct standard stream.
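To see the effect without a real command, here is a minimal sketch. The noisy function is a hypothetical stand-in that writes one line to each stream, and the log files go into a scratch directory. It needs bash, since plain sh does not support process substitution.

```shell
#!/bin/bash
# Scratch directory for the two log files.
logDir=$(mktemp -d) || exit

# Hypothetical stand-in for a real command: one line on stdout and one
# line on stderr.
noisy() {
  echo "normal output"
  echo "error output" >&2
}

# stdout goes to the screen and log.txt; stderr goes to the screen (still
# on stderr, thanks to 1>&2) and error.txt.
noisy 1> >(tee -a "$logDir/log.txt") 2> >(tee -a "$logDir/error.txt" 1>&2)
```

One thing to be aware of is that the tee processes run asynchronously, so their output can reach the screen and the files slightly after the command itself has returned.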