Is Linux overusing hard drives?

On Monday, May 18, 2015, when I tried to synchronize files between my main computer and my HTPC, I got error messages from Unison telling me it was failing for some files. Tired of repeated unexpected failures, I tried copying the files manually. The manual copy failed too. I quickly noticed that the whole partition I was copying the files to had turned read-only. I tried to re-enable read/write by remounting the partition; that worked, but it reverted to read-only after a few minutes. A few Google searches later, I was thinking my hard drive was failing. I started a self-test using sudo smartctl -t short /dev/sdb and, after a few minutes, checked the result with sudo smartctl -a /dev/sdb. The drive was now failing the short self-test. Damn, again? Yep, another dead hard drive!

Short term fix

The first step was to stop using that beast and move everything I could off it before things got even worse. A failing drive can become totally unusable at any time, corrupting ALL the data on it.

The transfer of my music files from the failing drive (the 1.5 TB one) to another one (the 3 TB one) started to output multiple I/O errors. No, it would fail like this for half of the files and take forever! I aborted that and instead copied my music files from my main PC. At least I would get all the files, not half of them. Fortunately, I was able to transfer most of my ripped DVDs and Blu-ray discs from my 1.5 TB drive to my 3 TB drive. I finished the transfer the next day; it ran in the background while I was working from home. I could then unmount the 1.5 TB drive, taking it out of Ubuntu’s way.

I got a new 3 TB drive on Friday, May 22, 2015 and installed it the next morning. This time, I was able to pick the right drive to remove, because the 1.5 TB was colored green while the 3 TB was red. The new 3 TB drive is also a red WD model. Unfortunately, things are never smooth when modifying that machine: I had to remove my SSD and bracket assembly in order to unscrew the 1.5 TB hard drive. However, after that, I was able to install the new 3 TB drive, put back the SSD assembly and reconnect everything. This went surprisingly well compared to last time, when I had trouble finding a way to connect power to the four drives.

The new 3 TB drive seemed to work correctly. I was able to partition it as GPT, create a 3 TB ext4 partition and mount that partition. I then started a long self-test to make sure it would not fail on me just a couple of days later. The self-test completed during the evening and showed no errors.

Tentative prevention

Problem solved? Partly. The thing is, that machine runs 24/7 to host a Minecraft server. This makes both hard drives spin non-stop. I would like Ubuntu to stop the drives when they are unused. I moved the Minecraft files to my SSD and will use the hard drives only for media and backup.

No, Ubuntu never ever spins down any hard drive! I tried to set this up with sudo hdparm -S 241 /dev/sdb, with no result. The only thing that worked was manually spinning down the drive with sudo hdparm -Y /dev/sdb (or /dev/sdg for the other drive). I recently found that gnome-disks has an option to set the drive spindown timeout. The spindown setting from gnome-disks was honored once on my main computer, but I need to check whether it is reliable.
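The value passed to hdparm -S is not a number of minutes; it is an encoded timeout. Here is a minimal sketch of the decoding, based on my reading of the hdparm man page (the function name is made up for illustration). It shows that 241 asks for a 30-minute timeout, so the drive would not be expected to stop any sooner than that even when the setting is honored.

```python
def hdparm_standby_seconds(value):
    """Decode an `hdparm -S` value into a timeout in seconds.

    Returns None when there is no fixed timeout (disabled,
    vendor-defined or reserved values).
    """
    if not 0 <= value <= 255:
        raise ValueError("hdparm -S expects a value between 0 and 255")
    if value == 0:
        return None                      # timeout disabled
    if value <= 240:
        return value * 5                 # 1..240: multiples of 5 seconds
    if value <= 251:
        return (value - 240) * 30 * 60   # 241..251: 1 to 11 units of 30 minutes
    if value == 252:
        return 21 * 60                   # special case: 21 minutes
    if value == 255:
        return 21 * 60 + 15              # special case: 21 minutes 15 seconds
    return None                          # 253: vendor-defined, 254: reserved


if __name__ == "__main__":
    print(hdparm_standby_seconds(241) // 60, "minutes")
```

For a quicker experiment, a value such as 120 (10 minutes) makes it easier to see whether the drive obeys the timeout at all.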

What if it happens again and again?

If the gnome-disks setting turns out to be unreliable, it seems I need Windows just to get my hard drives to spin down automatically when unused, which is quite a shame! I don’t want to reinstall this HTPC as a Windows machine, because the Minecraft server won’t run smoothly on Windows. I would be stuck with an always-open command prompt window with the server running, unless I search forever to figure out a way to run this as a system service, assuming it is possible.

A colleague at my workplace suggested using an Ubuntu virtual machine, but my HTPC doesn’t have enough memory to reliably run a VM and I cannot bump it up beyond 4 GB because of a motherboard limitation. Well, I could try fitting four 4 GB DDR2 modules and see, but I’m not sure the board would accept this at all, even though they would fit physically! If that fails, I would be stuck with useless DDR2 while newer systems use DDR3. What a pain!

I also investigated the possibility of using RAID to improve the reliability of storage. If I added a third 3 TB hard drive, I could configure a RAID 5 array with a total of 6 TB, and even increase it to 9 TB with a fourth 3 TB hard drive! RAID 5 stripes the data across the drives and stores parity information alongside it, so several drives are involved when accessing files, which can increase read performance. It also makes sure that if one drive fails, ALL the data can be recovered and the array can be rebuilt by simply removing the failed drive and adding a replacement drive.
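To make the arithmetic and the recovery idea concrete, here is a small sketch. It is only an illustration of the parity principle with two data blocks and one parity block, not how a real RAID implementation lays out data on disk; the function names are mine.

```python
from functools import reduce


def raid5_usable_tb(drive_count, drive_size_tb):
    """Usable capacity of a RAID 5 array: one drive's worth goes to parity."""
    if drive_count < 3:
        raise ValueError("RAID 5 needs at least three drives")
    return (drive_count - 1) * drive_size_tb


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


if __name__ == "__main__":
    print(raid5_usable_tb(3, 3), "TB with three drives")
    print(raid5_usable_tb(4, 3), "TB with four drives")

    # Two data blocks live on two drives, the parity block on a third one.
    block_a = b"music"
    block_b = b"video"
    parity = xor_blocks([block_a, block_b])

    # Pretend the drive holding block_b died: rebuild it from the survivors.
    rebuilt_b = xor_blocks([block_a, parity])
    print(rebuilt_b == block_b)
```

The same XOR trick works with any number of data blocks, which is why a single failed drive can always be reconstructed from the remaining ones.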

I was tempted by this, but that would have forced me to purchase two hard drives and another PSU to have more than four SATA power connectors. I wasn’t sure I wanted to spend more than $300 just to get this up and running. Moreover, creating the RAID array would have forced me to move all the files away from my current 3 TB drive to combine it with the two new drives, unless I jumped directly to the 4-drive array.

I will instead wait for that machine to die, and the next system could be a smaller SSD-only HTPC combined with a NAS offering easier drive installation and replacement. I could purchase a dedicated NAS, or build a generic computer and configure it as a NAS. Fortunately, Ubuntu has facilities to configure software RAID; I checked that recently. I’m not sure about the fake RAID provided by the motherboard: it may or may not work, and it may or may not be better than software RAID.

The downsides of SSDs

What’s the point of having an SSD if both Windows 8 and Ubuntu 15.04 introduce artificial timeouts that increase the boot time, making this equivalent to having a standard hard drive? Well, I’m there, I reached that point.

Windows 8 often boots fast from EFI to the login screen, but after I enter my password, it sometimes reaches the desktop in five seconds and sometimes hangs for 30 to 45 seconds. There is no obvious reason why, no way to track this down and no obvious solution other than deleting my user account and creating a new one. I cannot spend my weekends doing, redoing, redoing and redoing that. This is just pointless and inefficient! I could try to reinstall, but then I would have trouble reactivating Windows and reauthorizing Ableton Live, and would have to spend hours waiting for the manual installation of countless drivers and software tools. Ninite can help with programs, not with drivers.

Some time later, I found out that uninstalling and reinstalling the driver for my M-Audio interface fixed the slow boot. There seems to be a conflict between the M-Audio Fast Track Pro and Novation UltraNova drivers. Windows 10 also seemed to stabilize things a bit.

Ubuntu, most of the time, boots quickly. However, starting from 15.04, it was taking almost a minute from the splash screen to the login screen. I had to spend more than half an hour looking at syslog to figure out that the swap partition had changed UUID but the upgrade script didn’t reflect that in /etc/fstab. Several people repeat that we shouldn’t do dist-upgrades and should rather reinstall, but then, why is there a dist-upgrade option in the first place? Fortunately, fixing the partition UUID in /etc/fstab restored my boot time.
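This kind of mismatch can be spotted without digging through syslog. The sketch below is a rough idea, assuming the usual whitespace-separated fstab layout with unquoted UUID= entries and the /dev/disk/by-uuid directory found on typical Linux systems; the function names are mine.

```python
import os


def fstab_uuids(fstab_text):
    """Return (uuid, mount_point) pairs for fstab entries that use UUID=."""
    entries = []
    for line in fstab_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split()
        if len(fields) >= 2 and fields[0].startswith("UUID="):
            entries.append((fields[0][len("UUID="):], fields[1]))
    return entries


def stale_entries(fstab_text, known_uuids):
    """Return the fstab entries whose UUID matches no known partition."""
    return [(uuid, mount) for uuid, mount in fstab_uuids(fstab_text)
            if uuid not in known_uuids]


def system_uuids(by_uuid_dir="/dev/disk/by-uuid"):
    """List the partition UUIDs currently known to the system, if available."""
    if os.path.isdir(by_uuid_dir):
        return set(os.listdir(by_uuid_dir))
    return set()
```

On the affected machine, something like stale_entries(open("/etc/fstab").read(), system_uuids()) should list the swap entry with the outdated UUID; swap entries show up with "none" as their mount point.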

These are not SSD-specific issues, but they make the SSD less useful. Another factor reducing the usefulness of an SSD is the never-ending size increase of the OS and applications, especially when dealing with virtual machines. This ultimately fills any SSD, requiring a time-consuming reorganization of the layout (partition resizing, copying to a larger drive, etc.).

I don’t want to go backward, switching from an SSD to a hard drive, but practice seems to tell me I should. This is disappointing and quite frustrating.

Minor problems stacking up

It seems to happen to me too often. I end up with many small, minor problems, none of them a big deal taken alone. But when they add up, this becomes unbearable, resulting in a bad and exhausting day. Sometimes, the solutions are simple, sometimes not. Here is the most recent stack of such issues.

Spending my time importing modules in Python

For the moment, the only way I have to edit my Python code running on a remote virtual machine is to use Emacs running on that machine. I’m investigating local solutions, but they either turn into a nonsensical chain of complications or require software free for commercial use that I cannot adopt at work.

One operation that ends up being frequent is importing a module. You are writing a piece of code and then need to call a function defined in another module or in the standard Python library. When this happens, I need to add a statement to import the module if it is not yet imported in my current module. Import statements can appear anywhere in the code, but convention puts them at the beginning of the file. This seems better for code organization and ensures that all imports happen at the start of the program rather than at any time during the execution. If a module imported at the top of the file is missing, the code will fail fast, as opposed to failing only when a function importing a module is called.
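The fail-fast difference is easy to demonstrate. The sketch below uses importlib to mimic both styles; the missing module name is made up and assumed not to exist on the machine, and the helper names are mine.

```python
import importlib

# Hypothetical module name, assumed to be absent from the system.
MISSING = "module_that_does_not_exist_xyz"


def load_at_top(names):
    """Mimic imports at the top of a file: every module is resolved right away."""
    return [importlib.import_module(name) for name in names]


def make_lazy_function(name):
    """Mimic an import placed inside a function: resolved only when called."""
    def use_module():
        return importlib.import_module(name)
    return use_module


def raises_import_error(func):
    """Return True if calling func raises ImportError."""
    try:
        func()
    except ImportError:
        return True
    return False


if __name__ == "__main__":
    # Top-level style: the problem shows up immediately.
    print(raises_import_error(lambda: load_at_top(["json", MISSING])))

    # Function-level style: defining the function is silent...
    lazy = make_lazy_function(MISSING)
    # ...and the failure only appears once the function is finally called.
    print(raises_import_error(lazy))
```

Both checks report a failure, but in the second case nothing goes wrong until the function runs, which could be hours into a long job.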

As a result, I am sometimes editing code and need to jump to the start of the file to add an import, then find my place again and continue. This small interruption in the flow is not a big deal in itself, but it becomes more and more painful when it repeats itself tens of times a day, sometimes at each and every successive line of code!

Maybe I went too granular and split my code into too many modules. Am I in a situation where it would be better to have one single huge file with a lot of stuff, rather than splitting my code into multiple files? Well, last Monday, I was at the point of wondering that.

How did I deal with this before? When I was programming in Java, I was using Eclipse as my IDE. This program is able to guess the import statements from the class names referred to and automatically add them at the beginning of the file, without making me go there, lose my position and come back. A Python IDE such as PyDev or PyCharm would probably do it, but I cannot use them for the moment.

Fortunately, I came up with a very simple trick. In GNU Emacs, you can split the window in two by using C-x 2. Both windows first show the same part of the current buffer, but it is perfectly possible to move the cursor up. This changes the position in the current window, while leaving the cursor unchanged in the second window. I can then add my import statements, switch to the window with the original cursor using C-x o, then make it the only window with C-x 1. Something similar can be done using bookmarks, but this requires other keyboard shortcuts that are harder to remember, and each bookmark needs to be named even though these are often one-off save/restore scenarios.

Keyboard shortcut confusion

Since I gave up on VirtuaWin because its behavior was too inconsistent on Windows 8, I ended up using Desktops from Sysinternals, which uses different keyboard shortcuts than Ubuntu for switching desktops. I thought CTRL-ALT-F1 to F4 would be nice keys, but this ended up being a nightmare. As soon as I came back to Ubuntu after one week working with Desktops, I was messed up, always making the same mistake of pressing CTRL-ALT-F1, which switched to the console. I then had to press CTRL-ALT-F7 to go back to X. This also happens in VirtualBox virtual machines.

There is no perfect solution for this. My current workaround is to use CTRL-F1 through F4 instead. At least, if I press that in Ubuntu, nothing happens, as opposed to wiping my screen away and forcing me to press CTRL-ALT-F7.

Windows 8 becoming increasingly disturbing

Even Desktops is starting to be a nuisance because of Windows 8. As soon as I am on desktop 2, 3 or 4, I need to be careful not to press the Windows key. If I do, which I am used to, this switches to Metro, which of course replaces my screen with the home screen. Then I am back on desktop 1. Metro is my most common way of starting applications on Windows 8: press the Windows key, type a name, press Enter. This also works great on Windows 7 and Ubuntu’s Unity. But this fails with Desktops on Windows 8. There is no way out, other than putting EVERY application on the desktop and spending more than 30 seconds each time I want to start something! I just cannot do that; this is too inefficient, especially knowing that a sighted person would need half the time I do, with my visual impairment, to find the icons. I could work around this by grouping the icons in some clever way or creating folders, but this remains clunky.

Windows 8 is also causing issues with Lync, the instant messaging software my company is using. Lync works correctly for text chat, but it intermittently fails for voice chat and screen sharing. When this happens, it consistently displays a network error, no matter how hard we try to establish the communication. We have to fall back on alternatives, like the traditional phone. The only solution so far has been to reinstall Lync.

IT people at my company were of no help. They would have liked me to try out Lync on Windows 8 in the office rather than finding out what is going on. However, the Windows 8 ultrabook my company lent me is almost unusable for me unless I hook it to an external monitor and mouse. At the office, I don’t have an HDMI input I could plug my mini DisplayPort to HDMI adapter into, the one I use at home, and the mini DisplayPort to VGA adapter that could have worked went away a couple of months ago. Somebody borrowed it and never brought it back. Even if I had the adapter, performing this test would be tedious. I would have to wait for a voice chat and, when it happens, quickly switch from my regular laptop to the ultrabook and try. This is just pointless stress that may well give no result: it will probably work correctly, showing that the problem comes from the VPN, my router, etc.

There is no way out, other than running away from Windows 8. I thus stopped using the ultrabook, at least for the moment, and used the official laptop instead. That machine runs the good old Windows 7 OS. If Lync starts failing with that as well, then I will have to switch routers, try on somebody else’s Internet connection, etc.

SSH connection timing out for no reason

We recently switched from Cisco’s outdated IPsec VPN client to the newer SSL-based AnyConnect. This seemed to go smoothly, after I uninstalled and reinstalled VirtualBox (it seems VirtualBox interferes with Cisco’s VPN). However, I noticed that SSH started to time out on me when I wasn’t interacting with the shell for a moment. This forced me to reconnect and restart what I was working on, sometimes switching back to a directory with a long path; sometimes Bash history was working, sometimes not.

My first attempt at working around unreliable connections was to make use of EmacsClient. For this, I started Emacs with the --daemon option, then used emacsclient instead of emacs. That works surprisingly well. With that, Emacs keeps running on the server and doesn’t lose track of unsaved files if the connection drops. Of course, I keep the good habit of saving often, which is an additional elementary safety net against unreliable connections.

Yesterday, I may have found a way to get rid of these timeouts. My current hypothesis is that the new VPN client monitors the TCP connections going through the tunnel it establishes and shuts them off if they are inactive. Fortunately, SSH provides a way to keep connections alive: the ServerAliveInterval configuration option. By putting ServerAliveInterval 60 in my .ssh/config for my virtual machine, I am forcing the SSH client to send a keepalive message to the server whenever the connection has been idle for 60 seconds, so neither the SSH server nor the VPN client has a reason to kill the connection. This seems to help, but this is not fully tested yet.
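For reference, the setting normally sits inside a Host block of the SSH client configuration. The small sketch below just renders such a block as text; the alias dev-vm and the host name vm.example.com are placeholders I made up, not my actual machine, and the helper name is mine.

```python
def render_host_block(alias, hostname, alive_interval=60):
    """Render an ~/.ssh/config block enabling client-side keepalives."""
    lines = [
        f"Host {alias}",
        f"    HostName {hostname}",
        # Ask the client to probe the server after this many idle seconds.
        f"    ServerAliveInterval {alive_interval}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Placeholder alias and host name, for illustration only.
    print(render_host_block("dev-vm", "vm.example.com"), end="")
```

Putting the option in a Host block rather than at the top of the file limits the keepalives to the one machine reached through the VPN.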

The mouse making me mad!

Why was my Razer mouse so jumpy? Why do I feel I have fewer problems with the mouse at the office than at home? Will I really need to get myself a Dell mouse like the one I have at the office? Is it because I am going crazy? Maybe not! I found out this week that my mouse pad is pretty suspect: it has multiple scratches on it that can well screw up the optical mouse tracking, and the surface has some patterns. I remember having a hard time with the mouse when I worked at my parents’ home. That ended up being because of the desk! Putting the mouse on top of a piece of white paper, yes, just a plain old piece of paper, nothing more sophisticated, cleared the issue! So I removed that mouse pad and that seemed to help! It is unbelievable how sometimes simple solutions, almost nonsensically stupid ones, can lead us far!