Some Green

In Proxmox I noticed the Syslog being filled with the following lines:

Apr 19 11:14:35 data-dragon sudo[23682]: que : TTY=pts/1 ; PWD=/home/que ; USER=root ; COMMAND=/usr/local/bin/dataDumpStatus
Apr 19 11:14:35 data-dragon sudo[23682]: pam_unix(sudo:session): session opened for user root by que(uid=0)
Apr 19 11:14:35 data-dragon sudo[23682]: pam_unix(sudo:session): session closed for user root

The command /usr/local/bin/dataDumpStatus prints the ZFS status of our primary data storage. It must run as root, so I have a sudo grant for the user que. On my main computer I have a terminal window in which this command is invoked using watch so I get the status every 2 seconds. Problem is, I don't want a log entry every time this command is successfully executed. In fact, I don't need a log of any successful sudo command at all, only unsuccessful ones.
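
For context, the setup looks roughly like this. The actual sudoers grant isn't shown above, so the NOPASSWD entry below is an assumption; the watch interval matches the 2 seconds mentioned:

# Hypothetical /etc/sudoers entry letting que run the status script as root:
que ALL=(root) NOPASSWD: /usr/local/bin/dataDumpStatus

# Terminal window on the main computer, refreshing the status every 2 seconds:
watch -n 2 sudo /usr/local/bin/dataDumpStatus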

There are two modifications needed for this to work, both to /etc/sudoers:

Defaults        syslog_goodpri=none
Defaults        !pam_session

The first disables logging of successful sudo commands, which removes the first line of the log entry above. The second disables PAM session messages, which removes the second and third lines.
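
A safe way to apply edits like these (a sketch, not necessarily what was done here) is through visudo, which refuses to save a sudoers file that doesn't parse:

# Edit /etc/sudoers with syntax checking on save:
sudo visudo

# Check an already-edited sudoers file for syntax errors:
sudo visudo -c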

The third and final drive test on the Data-Dragon finished shortly before noon this morning. All three tests found zero errors. I still have reports from the kernel about issues, but they are not manifesting as actual errors. So I'm calling the setup good enough to move forward.

Around noon I set up the ZFS pool and the main volume. Now the long process of copying data from the backup computer begins. I estimate the transfer will take about a week. Most of this time comes from the Backup Dragon's drive setup: 4x USB drives connected to a Raspberry Pi 4 B configured in RAID-5. The bottleneck is the USB bus, and I'm getting about a 35 MiB/s transfer rate.
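
The pool setup itself only takes a couple of commands. A minimal sketch, with hypothetical pool, dataset, and device names since the actual layout isn't listed here (ashift=12 being the usual choice for 4K-sector drives):

# Hypothetical names and devices; the real vdev layout differs.
zpool create -o ashift=12 tank raidz1 \
    /dev/disk/by-id/ata-DRIVE1 /dev/disk/by-id/ata-DRIVE2 /dev/disk/by-id/ata-DRIVE3
zfs create tank/data

# Copying from the backup machine, e.g. with rsync over SSH (hostname and
# paths are placeholders).  At ~35 MiB/s that works out to roughly 3 TiB/day.
rsync -aH --progress backup-dragon:/data/ /tank/data/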

I recall when MS-DOS 6.22 came out, the feature I liked the most was the addition of the thousands separator in the directory (DIR) command. I found that having a visual separator made it much easier to read file sizes. Now, 26 years later, I still prefer to have thousands separators when getting a directory listing. In Linux this can be accomplished using the parameter --block-size="'1" on the ls command. Debian typically sets up the alias ll to be ls -alF. I've modified this alias to include the block-size parameter for thousands separators with a quick change to ~/.bashrc:

#alias ll='ls -alF'
alias ll="ls -alF --block-size=\"'1\""

Note that the alias string had to change from single quotes to double quotes (with escapes) because a single quote is used inside the block-size parameter.
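
An alternative that sidesteps the quote escaping entirely, if I'm reading the GNU coreutils documentation right, is to set the block size through an environment variable instead of inside the alias:

# In ~/.bashrc: BLOCK_SIZE applies to ls, du, and df; the leading apostrophe
# again requests thousands separators.
export BLOCK_SIZE="'1"
alias ll='ls -alF'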

Picture from my not-so-fun adventure on the east side of Glacier National Park. I got stuck in a snowdrift and the drift kept getting worse as I tried to shovel my way out of it. Pictured is a ranger truck helping to pull me out of the mess. As much as the situation sucked, it could have been worse if I hadn't received any help. Hats off again to the rangers of Glacier National.

I have given up on my hard drive controller card and sent it back. Despite all my attempts I could not get it to work with the drives I am using. I ordered a similar but older card and will try my tests with this card. I really hate hardware problems.

April 12, 2020

Western Digital Raptor

The Red-Dragon was commissioned in 2003 and booted from one of the fastest hard drives then on the market, a 36 GB 10,000 RPM SATA Western Digital Raptor. The drive still serves the Red-Dragon, which has since become a backup computer. Today I wondered if that drive had SMART functionality, and indeed it does. From the report it quickly became clear that this hard drive has more runtime than any drive I own: 40,394 hours. That's 4 years and 222 days out of more than 17 years. Most of that time came at the beginning, from 2003 to 2008, when the computer was never turned off. After 2008 the computer ran periodically, mostly for backups.
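
Pulling that report is a one-liner with smartctl from the smartmontools package; the device name below is just a placeholder for wherever the Raptor appears:

# Full SMART report: identity, health, and the attribute table
# (power-on hours, reallocated sectors, CRC errors, and so on).
sudo smartctl -a /dev/sdb

# Just the vendor attribute table:
sudo smartctl -A /dev/sdb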

For a 17-year-old drive with over 40,000 hours of run time, it is in great shape. In its lifetime it has reported no raw-read errors, no reallocated sectors, no seek errors, no spin retries, no multi-zone errors, and only 8 UDMA CRC errors. Compare this to the 3 TB drive used for backups on this machine, a Seagate Barracuda 7200.14: it has 1,413 hours of run time (58 days), and in that time it has logged 107,349,736 raw read errors and 137,481,516,108 seek errors.

With all my recent hard drive trouble I will simply say: they sure don't build them like they used to.

The drive test without moving cables passed without issues a second time. I started a test without the new controller to see if I still get the kernel errors. I do. The same set of errors:

[  544.414978] ata2.00: NCQ disabled due to excessive errors
[  544.414980] ata2.00: exception Emask 0x0 SAct 0x4 SErr 0x0 action 0x6 frozen
[  544.414984] ata2.00: failed command: READ FPDMA QUEUED
[  544.414989] ata2.00: cmd 60/00:10:00:00:00/01:00:00:00:00/40 tag 2 ncq dma 131072 in
                        res 40/00:01:00:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
[  544.414993] ata2.00: status: { DRDY }
[  544.414996] ata2: hard resetting link
Distant Devil's Tower

Tried switching software on the controller card. The initial software was version 14. I tried 20, and then 19, as others had said was necessary. The I/O errors were unaffected by this change. This is either a controller issue or a cabling issue. Since we have one drive not reporting any errors, I suspect it is a controller issue. And since I have seen badblocks fail for drives on this controller, I suspect the problems are real.
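
For reference, the badblocks runs mentioned above look something like the following. This is only a sketch, the device name is a placeholder, and the write-mode variant destroys everything on the drive:

# Read-only scan, verbose with a progress indicator:
sudo badblocks -sv /dev/sdX

# Destructive write-mode test (wipes the drive), the more thorough option:
sudo badblocks -wsv -b 4096 /dev/sdX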